Analog API¶

`analog` Command¶

The primary way to invoke analog is via the analog command which calls analog.main.main().

analog.main.main(argv=None)[source]¶

analog - Log Analysis Utility.

Name the logfile to analyze (positional argument) or leave it out to read from stdin. This can be handy for piping in filtered logfiles (e.g. with grep).

Select the logfile format subcommand that suits your needs or define a custom log format using analog custom --pattern-regex <...> --time-format <...>.

To analyze the logfile for specific paths, provide them via --path arguments (multiple times). Monitoring specific HTTP verbs (request methods) via --verb and specific response status codes via --status argument(s) is also possible.

Paths and status codes all match the start of the actual log entry values. Thus, specifying a path /foo will group all paths beginning with that value.
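The prefix matching described above can be pictured with a minimal sketch. The helper name group_by_prefix is hypothetical and not part of analog; it only illustrates how a logged value falls into the group of the first configured prefix it starts with.

```python
def group_by_prefix(value, prefixes):
    """Return the first configured prefix that value starts with, or None."""
    for prefix in prefixes:
        # Compare as strings so numeric status codes work like paths.
        if str(value).startswith(str(prefix)):
            return prefix
    return None


paths = ["/foo", "/baz"]
print(group_by_prefix("/foo/bar", paths))     # /foo
print(group_by_prefix("/other", paths))       # None
print(group_by_prefix(404, [1, 2, 3, 4, 5]))  # 4
```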

Arguments can be listed in a file by specifying @argument_file.txt as parameter.

Analyzer¶

The Analyzer is the main logfile parser class. It uses an analog.formats.LogFormat instance to parse the log entries and passes them on to an analog.report.Report instance for statistical analysis. The report itself can be passed through an analog.renderers.Renderer subclass for different report output formats.

class analog.analyzer.Analyzer(log, format, pattern=None, time_format=None, verbs=['DELETE', 'GET', 'PATCH', 'POST', 'PUT'], status_codes=[1, 2, 3, 4, 5], paths=[], max_age=None, path_stats=False)[source]¶

Log analysis utility.

Scan a logfile for logged requests and calculate statistical metrics in an analog.report.Report.

__call__()[source]¶

Analyze defined logfile.

Returns:	log analysis report object.
Return type:	`analog.report.Report`

__init__(log, format, pattern=None, time_format=None, verbs=['DELETE', 'GET', 'PATCH', 'POST', 'PUT'], status_codes=[1, 2, 3, 4, 5], paths=[], max_age=None, path_stats=False)[source]¶

Configure log analyzer.

Parameters:

log (io.TextIOWrapper) – handle on logfile to read and analyze.
format (str) – log format identifier or ‘custom’.
pattern (str) – custom log format pattern expression.
time_format (str) – log entry timestamp format (strftime compatible).
verbs (list) – HTTP verbs to be tracked. Defaults to analog.analyzer.DEFAULT_VERBS.
status_codes (list) – status_codes to be tracked. May be prefixes, e.g. [“100”, “2”, “3”, “4”, “404” ]. Defaults to analog.analyzer.DEFAULT_STATUS_CODES.
paths (list of str) – Paths to explicitly analyze. If not defined, paths are detected automatically. Defaults to analog.analyzer.DEFAULT_PATHS.
max_age (int) – Max. age of log entries to analyze in minutes. Unlimited by default.
path_stats (bool) – Print per-path analysis report. Default off.

Raises:

analog.exceptions.MissingFormatError if no format is specified.
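As an illustration of the max_age parameter, the following sketch filters entries by age in minutes. It is not analog's implementation; the function name is_recent and the explicit now argument are assumptions made for the example.

```python
from datetime import datetime, timedelta


def is_recent(timestamp, max_age, now=None):
    """Return True if timestamp is at most max_age minutes old.

    max_age=None means no limit, mirroring the documented default.
    """
    if max_age is None:
        return True
    now = now or datetime.now()
    return now - timestamp <= timedelta(minutes=max_age)


now = datetime(2024, 1, 1, 12, 0)
print(is_recent(datetime(2024, 1, 1, 11, 55), 10, now))   # True
print(is_recent(datetime(2024, 1, 1, 11, 30), 10, now))   # False
print(is_recent(datetime(2020, 1, 1, 0, 0), None, now))   # True
```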

analyze is a convenience wrapper around analog.analyzer.Analyzer and can act as the main and only required entry point when using analog from code.

analog.analyzer.analyze(log, format, pattern=None, time_format=None, verbs=['DELETE', 'GET', 'PATCH', 'POST', 'PUT'], status_codes=[1, 2, 3, 4, 5], paths=[], max_age=None, path_stats=False, timing=False, output_format=None)[source]¶

Convenience wrapper around analog.analyzer.Analyzer.

Parameters:

log (`io.TextIOWrapper`) – handle on logfile to read and analyze.
format (`str`) – log format identifier or ‘custom’.
pattern (`str`) – custom log format pattern expression.
time_format (`str`) – log entry timestamp format (strftime compatible).
verbs (`list`) – HTTP verbs to be tracked. Defaults to `analog.analyzer.DEFAULT_VERBS`.
status_codes (`list`) – status_codes to be tracked. May be prefixes, e.g. [“100”, “2”, “3”, “4”, “404” ]. Defaults to `analog.analyzer.DEFAULT_STATUS_CODES`.
paths (`list` of `str`) – Paths to explicitly analyze. If not defined, paths are detected automatically. Defaults to `analog.analyzer.DEFAULT_PATHS`.
max_age (`int`) – Max. age of log entries to analyze in minutes. Unlimited by default.
path_stats (`bool`) – Print per-path analysis report. Default off.
timing (`bool`) – print analysis timing information?
output_format (`str`) – report output format.

Returns:	log analysis report object.
Return type:	`analog.report.Report`
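A possible way to call the wrapper from code is sketched below. It assumes the analog package is installed and that the logfile uses the predefined nginx format; the helper name summarize and the path /foo are illustrative, and the format and renderer names ("nginx", "csv") are taken from the headings in this reference rather than verified against the source.

```python
def summarize(log_path, output_format="csv"):
    """Analyze an nginx logfile and return the rendered report as a string."""
    # Imported lazily so this sketch can be defined without analog installed.
    from analog.analyzer import analyze

    with open(log_path) as log:
        # Only requests whose path starts with /foo are analyzed.
        report = analyze(log, "nginx", paths=["/foo"])
        return report.render(path_stats=False, output_format=output_format)
```

Calling summarize("access.log") would then return the report text for a hypothetical access.log file.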

analog.analyzer.DEFAULT_VERBS = ['DELETE', 'GET', 'PATCH', 'POST', 'PUT']¶: Default verbs to monitor if unconfigured.

analog.analyzer.DEFAULT_STATUS_CODES = [1, 2, 3, 4, 5]¶: Default status codes to monitor if unconfigured.

analog.analyzer.DEFAULT_PATHS = []¶: Default paths (all) to monitor if unconfigured.

Log Format¶

A LogFormat defines how log entries are represented in and can be parsed from a log file.

class analog.formats.LogFormat(name, pattern, time_format)[source]¶

Log format definition.

Represents log format recognition patterns by name.

A name:format mapping of all defined log format patterns can be retrieved using analog.formats.LogFormat.all_formats().

Each log format should at least define the following match groups:

timestamp: Local time.
verb: HTTP verb (GET, POST, PUT, ...).
path: Request path.
status: Response status code.
body_bytes_sent: Body size in bytes.
request_time: Request time.
upstream_response_time: Upstream response time.

__init__(name, pattern, time_format)[source]¶

Describe log format.

The format pattern is a (verbose) regex pattern string specifying the log entry attributes as named groups that is compiled into a re.Pattern object.

All pattern group names are available as attributes of log entries created with analog.formats.LogFormat.entry().

Parameters:

name (`str`) – log format name.
pattern (raw `str`) – regular expression pattern string.
time_format (`str`) – timestamp parsing pattern.

Raises:	`analog.exceptions.InvalidFormatExpressionError` if missing required format pattern groups or the pattern is not a valid regular expression.
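The check for required match groups can be sketched with the standard re module. This is an assumption about how such validation might look, not analog's code; REQUIRED_GROUPS and missing_groups are hypothetical names built from the group list above.

```python
import re

# Group names listed as required in this reference.
REQUIRED_GROUPS = (
    "timestamp", "verb", "path", "status",
    "body_bytes_sent", "request_time", "upstream_response_time",
)


def missing_groups(pattern):
    """Compile pattern in verbose mode and list required groups it lacks."""
    compiled = re.compile(pattern, re.VERBOSE)  # raises re.error if invalid
    return [name for name in REQUIRED_GROUPS if name not in compiled.groupindex]


print(missing_groups(r"(?P<verb>[A-Z]+) \s (?P<path>\S+)"))
```

A format class could raise its own error type whenever this list is non-empty or re.compile fails.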

classmethod all_formats()[source]¶

Mapping of all defined log format patterns.

Returns:	dictionary of name:`LogFormat` instances.
Return type:	`dict`

entry(match)[source]¶

Convert regex match object to log entry object.

Parameters:	match (`re.MatchObject`) – regex match object from `pattern` match.
Returns:	log entry object with all pattern keys as attributes.
Return type:	`collections.namedtuple`
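Turning a match object into an entry with attribute access can be sketched as follows. The pattern and the helper to_entry are hypothetical and much simpler than a real log format; they only show the named-groups-to-namedtuple idea.

```python
import re
from collections import namedtuple

# Simplified illustrative pattern with three named groups.
pattern = re.compile(r"(?P<verb>[A-Z]+)\s+(?P<path>\S+)\s+(?P<status>\d{3})")


def to_entry(match):
    """Build a namedtuple whose fields are the pattern's named groups."""
    fields = match.groupdict()
    Entry = namedtuple("Entry", list(fields))
    return Entry(**fields)


entry = to_entry(pattern.match("GET /foo/bar 200"))
print(entry.verb, entry.path, entry.status)  # GET /foo/bar 200
```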

Predefined Formats¶

nginx

analog.formats.NGINX = <analog.formats.LogFormat object>¶

Nginx combined_timed format:

'$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'$request_time $upstream_response_time $pipe';
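To make the format concrete, here is a hedged sketch of a verbose regular expression that matches a line written in this log format. The pattern, the sample line, and the strptime format string are assumptions for illustration and are not copied from analog.formats.NGINX.

```python
import re
from datetime import datetime

NGINX_PATTERN = re.compile(r'''
    ^\S+\s-\s\S+\s                                  # remote_addr - remote_user
    \[(?P<timestamp>[^\]]+)\]\s                     # time_local in brackets
    "(?P<verb>[A-Z]+)\s(?P<path>\S+)\s[^"]*"\s      # request line
    (?P<status>\d{3})\s(?P<body_bytes_sent>\d+)\s   # status and body size
    "[^"]*"\s"[^"]*"\s"[^"]*"\s                     # referer, agent, forwarded
    (?P<request_time>[\d.]+)\s
    (?P<upstream_response_time>[\d.]+)\s
    \S+$                                            # pipe
''', re.VERBOSE)

# Hypothetical log line in the format shown above.
line = ('127.0.0.1 - - [10/Oct/2023:13:55:36 +0200] "GET /foo/bar HTTP/1.1" '
        '200 512 "-" "curl/8.0" "-" 0.012 0.010 .')

match = NGINX_PATTERN.match(line)
# %b is locale dependent; this assumes English month abbreviations.
when = datetime.strptime(match.group("timestamp"), "%d/%b/%Y:%H:%M:%S %z")
print(match.group("verb"), match.group("path"), match.group("status"), when.hour)
```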

Reports¶

A Report collects log entry information and computes the statistical analysis.

class analog.report.Report(verbs, status_codes)[source]¶

Log analysis report object.

Provides these statistical metrics:

Number of requests.
Request method (HTTP verb) distribution.
Response status code distribution.
Requests per path.
Response time statistics (mean, median).
Response upstream time statistics (mean, median).
Response body size in bytes statistics (mean, median).
Per path request method (HTTP verb) distribution.
Per path response status code distribution.
Per path response time statistics (mean, median).
Per path response upstream time statistics (mean, median).
Per path response body size in bytes statistics (mean, median).
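A few of the metrics listed above can be reproduced with the standard library, which may help when interpreting a report. The entries are made-up sample data and the variable names are illustrative; this is not how analog.report.Report is implemented.

```python
from collections import Counter
from statistics import median

# Hypothetical entries: (path, verb, status, response time in seconds).
entries = [
    ("/foo", "GET", 200, 0.10),
    ("/foo", "POST", 201, 0.30),
    ("/bar", "GET", 404, 0.20),
]

requests = len(entries)
# most_common() yields (value, count) tuples ordered by frequency.
verbs = Counter(verb for _, verb, _, _ in entries).most_common()
paths = Counter(path for path, _, _, _ in entries).most_common()
times = [time for _, _, _, time in entries]

print(requests)       # 3
print(verbs)          # [('GET', 2), ('POST', 1)]
print(paths)          # [('/foo', 2), ('/bar', 1)]
print(median(times))  # 0.2
```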

__init__(verbs, status_codes)[source]¶

Create new log report object.

Use add() method to add log entries to be analyzed.

Parameters:

verbs (`list`) – HTTP verbs to be tracked.
status_codes (`list`) – status_codes to be tracked. May be prefixes, e.g. [“100”, “2”, “3”, “4”, “404” ]

Returns:	Report analysis object
Return type:	`analog.report.Report`

add(path, verb, status, time, upstream_time, body_bytes)[source]¶

Add a log entry to the report.

Any request with verb not matching any of self._verbs or status not matching any of self._status is ignored.

Parameters:

path (`str`) – monitored request path.
verb (`str`) – HTTP method (GET, POST, ...).
status (`int`) – response status code.
time (`float`) – response time in seconds.
upstream_time (`float`) – upstream response time in seconds.
body_bytes (`float`) – response body size in bytes.

body_bytes¶

Response body size in bytes of all matched requests.

Returns:	response body size statistics.
Return type:	`analog.report.ListStats`

finish()[source]¶: Stop execution timer.

path_body_bytes¶

Response body size in bytes of all matched requests per path.

Returns:	path mapping of body size statistics.
Return type:	`dict` of `analog.report.ListStats`

path_requests¶

List paths of all matched requests, ordered by frequency.

Returns:	tuples of path and occurrence count.
Return type:	`list` of `tuple`

path_status¶

List status codes of all matched requests per path.

Status codes are grouped by path and ordered by frequency.

Returns:	path mapping of tuples of status code and occurrence count.
Return type:	`dict` of `list` of `tuple`
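The shape of this return value, a dict of lists of (status, count) tuples, can be illustrated with a short sketch. The observed data is invented and the approach (a Counter per path) is an assumption, not the library's code.

```python
from collections import Counter, defaultdict

# Hypothetical (path, status) pairs taken from matched requests.
observed = [("/foo", 200), ("/foo", 200), ("/foo", 404), ("/bar", 500)]

per_path = defaultdict(Counter)
for path, status in observed:
    per_path[path][status] += 1

# Group by path, order each path's status codes by frequency.
path_status = {path: counts.most_common() for path, counts in per_path.items()}
print(path_status)
# {'/foo': [(200, 2), (404, 1)], '/bar': [(500, 1)]}
```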

path_times¶

Response time statistics of all matched requests per path.

Returns:	path mapping of response time statistics.
Return type:	`dict` of `analog.report.ListStats`

path_upstream_times¶

Response upstream time statistics of all matched requests per path.

Returns:	path mapping of response upstream time statistics.
Return type:	`dict` of `analog.report.ListStats`

path_verbs¶

List request methods (HTTP verbs) of all matched requests per path.

Verbs are grouped by path and ordered by frequency.

Returns:	path mapping of tuples of verb and occurrence count.
Return type:	`dict` of `list` of `tuple`

render(path_stats, output_format)[source]¶

Render report data into output_format.

Parameters:

path_stats (`bool`) – include per path statistics in output.
output_format (`str`) – name of report renderer.

Raises:	`analog.exceptions.UnknownRendererError` for unknown `output_format` identifiers.
Returns:	rendered report data.
Return type:	`str`

status¶

List status codes of all matched requests, ordered by frequency.

Returns:	tuples of status code and occurrence count.
Return type:	`list` of `tuple`

times¶

Response time statistics of all matched requests.

Returns:	response time statistics.
Return type:	`analog.report.ListStats`

upstream_times¶

Response upstream time statistics of all matched requests.

Returns:	response upstream time statistics.
Return type:	`analog.report.ListStats`

verbs¶

List request methods of all matched requests, ordered by frequency.

Returns:	tuples of HTTP verb and occurrence count.
Return type:	`list` of `tuple`

class analog.report.ListStats(elements)[source]¶

Statistic analysis of a list of values.

Provides the mean, median and 90th, 75th and 25th percentiles.

__init__(elements)[source]¶

Calculate some stats from list of values.

Parameters:	elements (`list`) – list of values.
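The statistics named above can be computed with the standard library as in the sketch below. The percentile method used by analog is not stated here, so the nearest-rank definition in the hypothetical percentile helper is an assumption; other definitions give slightly different values.

```python
import math
from statistics import mean, median


def percentile(values, percent):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


values = list(range(1, 11))  # 1..10
print(mean(values), median(values))  # 5.5 5.5
print(percentile(values, 90), percentile(values, 75), percentile(values, 25))  # 9 8 3
```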

Renderers¶

Reports are rendered using one of the available renderers. These all implement the basic analog.renderers.Renderer interface.

class analog.renderers.Renderer[source]¶

Base report renderer interface.

classmethod all_renderers()[source]¶

Get a mapping of all defined report renderer names.

Returns:	dictionary of name to renderer class.
Return type:	`dict`

classmethod by_name(name)[source]¶

Select specific Renderer subclass by name.

Parameters:	name (`str`) – name of subclass.
Returns:	`Renderer` subclass instance.
Return type:	`analog.renderers.Renderer`
Raises:	`analog.exceptions.UnknownRendererError` for unknown subclass names.
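One common way to implement a name-based lookup like this is to walk the subclass tree. The sketch below uses stand-in classes with the same names as in this reference, but the name attribute and the use of __subclasses__() are assumptions, not a description of analog's internals.

```python
class UnknownRendererError(Exception):
    """Stand-in for analog.exceptions.UnknownRendererError."""


class Renderer:
    name = None  # concrete renderers set a lookup name

    @classmethod
    def all_renderers(cls):
        """Map names to renderer classes, walking subclasses recursively."""
        found = {}
        for sub in cls.__subclasses__():
            if sub.name:
                found[sub.name] = sub
            found.update(sub.all_renderers())
        return found

    @classmethod
    def by_name(cls, name):
        """Return an instance of the renderer registered under name."""
        try:
            return cls.all_renderers()[name]()
        except KeyError:
            raise UnknownRendererError(name) from None


class PlainTextRenderer(Renderer):
    name = "default"


class TabularDataRenderer(Renderer):
    pass  # abstract base without a lookup name


class CSVRenderer(TabularDataRenderer):
    name = "csv"


print(sorted(Renderer.all_renderers()))     # ['csv', 'default']
print(type(Renderer.by_name("csv")).__name__)  # CSVRenderer
```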

render(report, path_stats=False)[source]¶

Render report statistics.

Parameters:

report (`analog.report.Report`) – log analysis report object.
path_stats (`bool`) – include per path statistics in output.

Returns:	output string
Return type:	str

Available Renderers¶

default

class analog.renderers.PlainTextRenderer[source]¶

Default renderer for plain text output in list format.

Tabular Data¶

class analog.renderers.TabularDataRenderer[source]¶

Base renderer for report output in any tabular form.

_list_stats(list_stats)[source]¶

Get list of (key,value) tuples for each attribute of list_stats.

Parameters:	list_stats (`analog.report.ListStats`) – list statistics object.
Returns:	(key, value) tuples for each `ListStats` attribute.
Return type:	`list` of `tuple`

_tabular_data(report, path_stats)[source]¶

Prepare tabular data for output.

Generate a list of header fields, a list of total values for each field and a list of the same values per path.

Parameters:

report (`analog.report.Report`) – log analysis report object.
path_stats (`bool`) – include per path statistics in output.

Returns:	tuple of table (headers, rows).
Return type:	`tuple`

Visual Tables¶

class analog.renderers.ASCIITableRenderer[source]¶: Base renderer for report output in ascii-table format.

table

class analog.renderers.SimpleTableRenderer[source]¶

Renderer for tabular report output in simple reST table format.

grid

class analog.renderers.GridTableRenderer[source]¶

Renderer for tabular report output in grid table format.

Separated Values¶

class analog.renderers.SeparatedValuesRenderer[source]¶: Base renderer for report output in delimiter-separated values format.

csv

class analog.renderers.CSVRenderer[source]¶

Renderer for report output in comma separated values format.

tsv

class analog.renderers.TSVRenderer[source]¶

Renderer for report output in tab separated values format.
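The relationship between the comma and tab variants can be sketched with the csv module: the same headers and rows are written with a different delimiter. The function render_separated and the sample table are hypothetical and do not reflect the actual column layout produced by analog.

```python
import csv
import io


def render_separated(headers, rows, delimiter=","):
    """Write headers and rows as delimiter-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


headers = ["path", "requests", "mean"]
rows = [["total", 3, 0.2], ["/foo", 2, 0.2]]
print(render_separated(headers, rows))                  # comma separated
print(render_separated(headers, rows, delimiter="\t"))  # tab separated
```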

Utils¶

class analog.utils.AnalogArgumentParser(prog=None, usage=None, description=None, epilog=None, parents=[], formatter_class=<class 'argparse.HelpFormatter'>, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True, allow_abbrev=True)[source]¶

ArgumentParser that reads multiple values per argument from files.

Arguments read from files may contain comma or whitespace separated values.

To read arguments from files create a parser with fromfile_prefix_chars set:

parser = AnalogArgumentParser(fromfile_prefix_chars='@')

Then this parser can be called with argument files:

parser.parse_args(['--arg1', '@args_file', 'more-args'])

The argument files contain one argument per line. Arguments can be comma or whitespace separated on a line. For example all of this works:

nginx
-o       table
--verb   GET, POST, PUT
--verb   PATCH
--status 404, 500
--path   /foo/bar
--path   /baz
--path-stats
-t
positional
arg

convert_arg_line_to_args(arg_line)[source]¶

Comma/whitespace-split arg_line and yield separate attributes.

Argument names defined at the beginning of a line (-a, --arg) are repeated for each argument value in arg_line.

Parameters:	arg_line (`str`) – one line of argument(s) read from a file
Returns:	argument generator
Return type:	generator
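The splitting behaviour described for convert_arg_line_to_args can be approximated with a small argparse.ArgumentParser subclass. FileArgumentParser is a hypothetical stand-in written for this example; the real AnalogArgumentParser may differ in details such as quoting or negative numbers.

```python
import argparse
import re


class FileArgumentParser(argparse.ArgumentParser):
    """Hypothetical stand-in showing the documented splitting behaviour."""

    def convert_arg_line_to_args(self, arg_line):
        # Split on commas and whitespace, dropping empty fragments.
        values = [v for v in re.split(r"[,\s]+", arg_line.strip()) if v]
        if values and values[0].startswith("-"):
            name, values = values[0], values[1:]
            if not values:
                yield name      # bare flag such as --path-stats
            for value in values:
                yield name      # repeat the option name per value
                yield value
        else:
            yield from values   # positional values pass through


parser = FileArgumentParser(fromfile_prefix_chars="@")
print(list(parser.convert_arg_line_to_args("--verb   GET, POST, PUT")))
# ['--verb', 'GET', '--verb', 'POST', '--verb', 'PUT']
```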

class analog.utils.PrefixMatchingCounter(*args, **kwds)[source]¶

Counter-like object that increments a field if it has a common prefix.

Example: “400”, “401”, “404” all increment a field named “4”.
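The example above can be reproduced with a small Counter subclass. PrefixCounter and its inc method are hypothetical names for this sketch, and the assumption that prefixes are registered up front with a zero count is mine rather than taken from the analog source.

```python
from collections import Counter


class PrefixCounter(Counter):
    """Hypothetical Counter that counts values under pre-registered prefixes."""

    def inc(self, value, amount=1):
        # Every registered prefix that value starts with is incremented.
        for prefix in self:
            if str(value).startswith(str(prefix)):
                self[prefix] += amount


counter = PrefixCounter({"4": 0, "404": 0, "5": 0})
for status in ("400", "401", "404", "500"):
    counter.inc(status)
print(dict(counter))  # {'4': 3, '404': 1, '5': 1}
```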

Exceptions¶

Analog exceptions.

exception analog.exceptions.AnalogError[source]¶: Exception base class for all Analog errors.

exception analog.exceptions.InvalidFormatExpressionError[source]¶: Error raised for invalid format regex patterns.

exception analog.exceptions.MissingFormatError[source]¶: Error raised when Analyzer is called without format.

exception analog.exceptions.UnknownRendererError[source]¶: Error raised for unknown output format names (to select renderer).