logdetective - Man Page
Analyze and summarize log files using an LLM and Drain templates
Synopsis
logdetective [Options] file
Description
logdetective is a tool that analyzes log files with an LLM using the Drain log template miner. It can consume logs from a local path or a URL, summarize them, and cluster them for easier inspection.
Positional Arguments
- file
The URL or path to the log file to be analyzed.
Options
- -h, --help
Show usage description and exit.
- -M MODEL, --model MODEL
The path to the language model used for analysis (if stored locally). You can also specify the model by the name of its Hugging Face repository (see Examples). The repository ID must be in the form 'namespace/repo_name'. Because logdetective uses llama.cpp, the model must be in the GGUF format. If the model is already present on your machine, the download is skipped. (optional, default: "fedora-copr/Mistral-7B-Instruct-v0.3-GGUF")
- -F FILENAME_SUFFIX, --filename_suffix FILENAME_SUFFIX
Define the suffix of the model file name to retrieve from Hugging Face. This option applies only when the model is specified by its Hugging Face repository name rather than by its path. (default: Q4_K.gguf)
- -n, --no-stream
Disable streaming output of analysis results.
- -C N_CLUSTERS, --n_clusters N_CLUSTERS
Number of clusters to use with the Drain summarizer. Ignored if LLM summarizer is selected. (optional, default 8)
- -v, --verbose
Enable verbose output during processing (use -vv or -vvv for higher levels of verbosity).
- -q, --quiet
Suppress non-essential output.
- --prompts PROMPTS_FILE
Path to a prompt configuration file where you can customize (override the default) prompts sent to the LLM. See https://github.com/fedora-copr/logdetective/blob/main/logdetective/prompts.yml for reference.
Prompts must be compatible with Python format string syntax (see https://docs.python.org/3/library/string.html#format-string-syntax), with replacement fields marked by curly braces, {}, left in place for insertion of log snippets. New prompts must contain the same number of replacement fields as the originals, although their positions may differ.
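The replacement-field constraint above can be checked with Python's string.Formatter, which parses format-string syntax. A minimal sketch (the prompt texts here are illustrative, not the defaults shipped with logdetective):

```python
from string import Formatter

def count_fields(prompt: str) -> int:
    """Count replacement fields ({} or {name}) in a Python format string."""
    return sum(1 for _, field, _, _ in Formatter().parse(prompt) if field is not None)

# Hypothetical original prompt and a custom override; each has one
# slot where logdetective would insert a log snippet.
original = "Explain the following log snippet:\n{}"
override = "You are a build-log expert. Diagnose this snippet:\n{}"

# A valid override keeps the same number of replacement fields.
assert count_fields(override) == count_fields(original)
```

A check like this catches an override that drops a field, which would otherwise leave a snippet with nowhere to go.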
- --temperature TEMPERATURE
Temperature for inference. Higher temperature leads to more creative and variable outputs. (default 0.8)
- --csgrep
Use csgrep to process the log before analysis. When working with logs containing messages from GCC, it can be beneficial to employ an additional extractor based on the csgrep tool to ensure that the messages are kept intact. Since csgrep is not available as a Python package, it must be installed separately, either with a package manager or from https://github.com/csutils/csdiff.
- --skip_snippets SNIPPETS_FILE
Path to a file with patterns for skipping snippets. You can specify regular expressions matching log chunks that are unlikely to contribute to the analysis of the problem, each with a short description. The patterns must be defined in a YAML file as a dictionary, where each key is a description and each value is a regex.
Examples:
contains_capital_a: "^.*A.*"
starts_with_numeric: "^[0-9].*"
child_exit_code_zero: "Child return code was: 0"
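A skip list like the one above amounts to filtering snippets against a set of regexes. The sketch below mirrors the YAML examples in a plain Python dictionary; the filtering function is a hypothetical illustration of the idea, not logdetective's actual implementation:

```python
import re

# Patterns mirroring the YAML examples above: description -> regex.
SKIP_PATTERNS = {
    "contains_capital_a": r"^.*A.*",
    "starts_with_numeric": r"^[0-9].*",
    "child_exit_code_zero": r"Child return code was: 0",
}

def keep_snippet(snippet: str) -> bool:
    """Return False if any skip pattern matches the snippet."""
    return not any(re.search(p, snippet) for p in SKIP_PATTERNS.values())

snippets = [
    "error: linker failed",
    "1234 retrying",
    "Child return code was: 0",
]
kept = [s for s in snippets if keep_snippet(s)]
# Only "error: linker failed" survives these example patterns.
```

Snippets matching any pattern are dropped before analysis, keeping noise like clean child exit codes out of the LLM's context.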
Examples
Example usage:
$ logdetective https://example.com/logs.txt
Or if the log file is stored locally:
$ logdetective ./data/logs.txt
Analyze a local log file using an LLM found locally:
$ logdetective -M /path/to/llm /var/log/syslog
With a specific model from Hugging Face (namespace/repo_name; note that --filename_suffix is also needed):
$ logdetective https://example.com/logs.txt --model QuantFactory/Meta-Llama-3-8B-Instruct-GGUF --filename_suffix Q5_K_S.gguf
Cluster logs from a URL (using Drain):
$ logdetective -C 10 https://example.com/logs.txt
Additional Notes
Note that logdetective works as intended only with instruction-tuned text generation models in the GGUF format.