Analog API

analog Command

The primary way to invoke analog is via the analog command which calls analog.main.main().

analog.main.main(argv=None)[source]

analog - Log Analysis Utility.

Name the logfile to analyze (positional argument) or leave it out to read from stdin. This can be handy for piping in filtered logfiles (e.g. with grep).

Select the logfile format subcommand that suits your needs or define a custom log format using analog custom --pattern-regex <...> --time-format <...>.

To analyze the logfile for specific paths, provide them via --path arguments (multiple times). Monitoring specific HTTP verbs (request methods) via --verb and specific response status codes via --status argument(s) is also possible.

Paths and status codes all match the start of the actual log entry values. Thus, specifying a path /foo will group all paths beginning with that value.

Arguments can be listed in a file by specifying @argument_file.txt as parameter.

Analyzer

The Analyzer is the main logfile parser class. It uses an analog.formats.LogFormat instance to parse the log entries and passes them on to an analog.report.Report instance for statistical analysis. The report itself can be passed through an analog.renderers.Renderer subclass for different report output formats.

class analog.analyzer.Analyzer(log, format, pattern=None, time_format=None, verbs=['DELETE', 'GET', 'PATCH', 'POST', 'PUT'], status_codes=[1, 2, 3, 4, 5], paths=[], max_age=None, path_stats=False)[source]

Log analysis utility.

Scan a logfile for logged requests and calculate statistical metrics in an analog.report.Report.

__call__()[source]

Analyze defined logfile.

Returns:log analysis report object.
Return type:analog.report.Report
__init__(log, format, pattern=None, time_format=None, verbs=['DELETE', 'GET', 'PATCH', 'POST', 'PUT'], status_codes=[1, 2, 3, 4, 5], paths=[], max_age=None, path_stats=False)[source]

Configure log analyzer.

Parameters:
  • log (io.TextIOWrapper) – handle on logfile to read and analyze.
  • format (str) – log format identifier or ‘custom’.
  • pattern (str) – custom log format pattern expression.
  • time_format (str) – log entry timestamp format (strftime compatible).
  • verbs (list) – HTTP verbs to be tracked. Defaults to analog.analyzer.DEFAULT_VERBS.
  • status_codes (list) – status codes to be tracked. May be prefixes, e.g. ["100", "2", "3", "4", "404"]. Defaults to analog.analyzer.DEFAULT_STATUS_CODES.
  • paths (list of str) – Paths to explicitly analyze. If not defined, paths are detected automatically. Defaults to analog.analyzer.DEFAULT_PATHS.
  • max_age (int) – Max. age of log entries to analyze in minutes. Unlimited by default.
Raises:

analog.exceptions.MissingFormatError if no format is specified.

analyze is a convenience wrapper around analog.analyzer.Analyzer and can act as the main and only required entry point when using analog from code.

analog.analyzer.analyze(log, format, pattern=None, time_format=None, verbs=['DELETE', 'GET', 'PATCH', 'POST', 'PUT'], status_codes=[1, 2, 3, 4, 5], paths=[], max_age=None, path_stats=False, timing=False, output_format=None)[source]

Convenience wrapper around analog.analyzer.Analyzer.

Parameters:
  • log (io.TextIOWrapper) – handle on logfile to read and analyze.
  • format (str) – log format identifier or ‘custom’.
  • pattern (str) – custom log format pattern expression.
  • time_format (str) – log entry timestamp format (strftime compatible).
  • verbs (list) – HTTP verbs to be tracked. Defaults to analog.analyzer.DEFAULT_VERBS.
  • status_codes (list) – status codes to be tracked. May be prefixes, e.g. ["100", "2", "3", "4", "404"]. Defaults to analog.analyzer.DEFAULT_STATUS_CODES.
  • paths (list of str) – Paths to explicitly analyze. If not defined, paths are detected automatically. Defaults to analog.analyzer.DEFAULT_PATHS.
  • max_age (int) – Max. age of log entries to analyze in minutes. Unlimited by default.
  • path_stats (bool) – Print per-path analysis report. Default off.
  • timing (bool) – print analysis timing information?
  • output_format (str) – report output format.
Returns:

log analysis report object.

Return type:

analog.report.Report

analog.analyzer.DEFAULT_VERBS = ['DELETE', 'GET', 'PATCH', 'POST', 'PUT']

Default verbs to monitor if unconfigured.

analog.analyzer.DEFAULT_STATUS_CODES = [1, 2, 3, 4, 5]

Default status codes to monitor if unconfigured.

analog.analyzer.DEFAULT_PATHS = []

Default paths (all) to monitor if unconfigured.

Log Format

A LogFormat defines how log entries are represented in and can be parsed from a log file.

class analog.formats.LogFormat(name, pattern, time_format)[source]

Log format definition.

Represents log format recognition patterns by name.

A name:format mapping of all defined log format patterns can be retrieved using analog.formats.LogFormat.all_formats().

Each log format should at least define the following match groups:

  • timestamp: Local time.
  • verb: HTTP verb (GET, POST, PUT, ...).
  • path: Request path.
  • status: Response status code.
  • body_bytes_sent: Body size in bytes.
  • request_time: Request time.
  • upstream_response_time: Upstream response time.
__init__(name, pattern, time_format)[source]

Describe log format.

The format pattern is a (verbose) regex pattern string specifying the log entry attributes as named groups that is compiled into a re.Pattern object.

All pattern group names are available as attributes of the log entries created by analog.formats.LogFormat.entry().

Parameters:
  • name (str) – log format name.
  • pattern (raw str) – regular expression pattern string.
  • time_format (str) – timestamp parsing pattern.
Raises:

analog.exceptions.InvalidFormatExpressionError if missing required format pattern groups or the pattern is not a valid regular expression.

classmethod all_formats()[source]

Mapping of all defined log format patterns.

Returns:dictionary of name:LogFormat instances.
Return type:dict
entry(match)[source]

Convert regex match object to log entry object.

Parameters:match (re.MatchObject) – regex match object from pattern match.
Returns:log entry object with all pattern keys as attributes.
Return type:collections.namedtuple

Predefined Formats

nginx

analog.formats.NGINX = <analog.formats.LogFormat object>

Nginx combined_timed format:

'$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'$request_time $upstream_response_time $pipe';
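As a rough illustration of how a verbose regex with named groups can describe this layout, the sketch below parses one made-up log line. The pattern, the sample line and the time format string are assumptions written for this example and are not analog's own NGINX definition; only the group names follow the match groups listed for LogFormat.

```python
import re
from collections import namedtuple
from datetime import datetime

# Hypothetical pattern for the combined_timed layout (not analog's own).
PATTERN = r'''
    ^\S+\s-\s\S+\s                                # remote address and user
    \[(?P<timestamp>[^\]]+)\]\s                   # local time
    "(?P<verb>[A-Z]+)\s(?P<path>\S+)\s[^"]*"\s    # request line
    (?P<status>\d{3})\s
    (?P<body_bytes_sent>\d+)\s
    "[^"]*"\s"[^"]*"\s"[^"]*"\s                   # referer, agent, forwarded for
    (?P<request_time>[\d.]+)\s
    (?P<upstream_response_time>[\d.]+)\s
    \S+$                                          # pipe flag
'''
# Assumed strptime format for nginx's $time_local.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Made-up sample line in the layout above.
LINE = ('127.0.0.1 - - [10/Jan/2014:12:00:00 +0000] "GET /foo/bar HTTP/1.1" '
        '200 512 "-" "curl/7.30" "-" 0.125 0.120 .')

regex = re.compile(PATTERN, re.VERBOSE)

# One namedtuple field per named group, similar in spirit to entry().
LogEntry = namedtuple("LogEntry", sorted(regex.groupindex))
match = regex.match(LINE)
entry = LogEntry(**match.groupdict())
logged_at = datetime.strptime(entry.timestamp, TIME_FORMAT)

print(entry.verb, entry.path, entry.status, logged_at.isoformat())
```

All captured values are strings at this point; converting status, sizes and times to numbers is left out of the sketch. Real nginx logs may also contain "-" as upstream response time, which this simplified pattern does not accept.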

Reports

A Report collects log entry information and computes the statistical analysis.

class analog.report.Report(verbs, status_codes)[source]

Log analysis report object.

Provides these statistical metrics:

  • Number of requests.
  • Request method (HTTP verb) distribution.
  • Response status code distribution.
  • Requests per path.
  • Response time statistics (mean, median).
  • Response upstream time statistics (mean, median).
  • Response body size in bytes statistics (mean, median).
  • Per path request method (HTTP verb) distribution.
  • Per path response status code distribution.
  • Per path response time statistics (mean, median).
  • Per path response upstream time statistics (mean, median).
  • Per path response body size in bytes statistics (mean, median).
__init__(verbs, status_codes)[source]

Create new log report object.

Use add() method to add log entries to be analyzed.

Parameters:
  • verbs (list) – HTTP verbs to be tracked.
  • status_codes (list) – status codes to be tracked. May be prefixes, e.g. ["100", "2", "3", "4", "404"].
Returns:

Report analysis object

Return type:

analog.report.Report

add(path, verb, status, time, upstream_time, body_bytes)[source]

Add a log entry to the report.

Any request with verb not matching any of self._verbs or status not matching any of self._status is ignored.

Parameters:
  • path (str) – monitored request path.
  • verb (str) – HTTP method (GET, POST, ...)
  • status (int) – response status code.
  • time (float) – response time in seconds.
  • upstream_time (float) – upstream response time in seconds.
  • body_bytes (float) – response body size in bytes.
body_bytes

Response body size in bytes of all matched requests.

Returns:response body size statistics.
Return type:analog.report.ListStats
finish()[source]

Stop execution timer.

path_body_bytes

Response body size in bytes of all matched requests per path.

Returns:path mapping of body size statistics.
Return type:dict of analog.report.ListStats
path_requests

List paths of all matched requests, ordered by frequency.

Returns:tuples of path and occurrence count.
Return type:list of tuple
path_status

List status codes of all matched requests per path.

Status codes are grouped by path and ordered by frequency.

Returns:path mapping of tuples of status code and occurrence count.
Return type:dict of list of tuple
path_times

Response time statistics of all matched requests per path.

Returns:path mapping of response time statistics.
Return type:dict of analog.report.ListStats
path_upstream_times

Response upstream time statistics of all matched requests per path.

Returns:path mapping of response upstream time statistics.
Return type:dict of analog.report.ListStats
path_verbs

List request methods (HTTP verbs) of all matched requests per path.

Verbs are grouped by path and ordered by frequency.

Returns:path mapping of tuples of verb and occurrence count.
Return type:dict of list of tuple
render(path_stats, output_format)[source]

Render report data into output_format.

Parameters:
  • path_stats (bool) – include per path statistics in output.
  • output_format (str) – name of report renderer.
Raises:

analog.exceptions.UnknownRendererError for unknown output_format identifiers.

Returns:

rendered report data.

Return type:

str

status

List status codes of all matched requests, ordered by frequency.

Returns:tuples of status code and occurrence count.
Return type:list of tuple
times

Response time statistics of all matched requests.

Returns:response time statistics.
Return type:analog.report.ListStats
upstream_times

Response upstream time statistics of all matched requests.

Returns:response upstream time statistics.
Return type:analog.report.ListStats
verbs

List request methods of all matched requests, ordered by frequency.

Returns:tuples of HTTP verb and occurrence count.
Return type:list of tuple
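The "list of tuple, ordered by frequency" shape returned by properties such as verbs and status can be pictured with collections.Counter. This is only a sketch of the data shape with made-up input; how Report stores its counts internally is not shown in this document, and TRACKED_VERBS is a local stand-in for the configured verbs.

```python
from collections import Counter

# Stand-in for the verbs a report was configured to track.
TRACKED_VERBS = ["DELETE", "GET", "PATCH", "POST", "PUT"]

# Made-up request methods as they might appear in a log.
logged_verbs = ["GET", "POST", "GET", "OPTIONS", "GET", "PUT", "POST"]

verb_counts = Counter()
for verb in logged_verbs:
    # Requests with an untracked verb are ignored, as described for add().
    if verb in TRACKED_VERBS:
        verb_counts[verb] += 1

# (verb, count) tuples, most frequent first.
by_frequency = verb_counts.most_common()
print(by_frequency)
```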
class analog.report.ListStats(elements)[source]

Statistic analysis of a list of values.

Provides the mean, median and 90th, 75th and 25th percentiles.

__init__(elements)[source]

Calculate some stats from list of values.

Parameters:elements (list) – list of values.
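The statistics named above can be sketched with the standard library. The percentile rule used here (nearest rank) and the result keys are assumptions made for this example; the document does not state which percentile definition or attribute names ListStats uses.

```python
import math
import statistics


def percentile(sorted_values, percent):
    """Nearest-rank percentile of a sorted, non-empty list (assumed rule)."""
    rank = max(1, math.ceil(percent * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def list_stats(elements):
    """Return mean, median and three percentiles under hypothetical keys."""
    values = sorted(elements)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "perc90": percentile(values, 90),
        "perc75": percentile(values, 75),
        "perc25": percentile(values, 25),
    }


# Made-up response body sizes in bytes.
stats = list_stats([100, 10, 40, 20, 30])
print(stats)
```

Other percentile definitions (for example linear interpolation) give different values for small samples, so results should only be compared when the same rule is used.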

Renderers

Reports are rendered using one of the available renderers. These all implement the basic analog.renderers.Renderer interface.

class analog.renderers.Renderer[source]

Base report renderer interface.

classmethod all_renderers()[source]

Get a mapping of all defined report renderer names.

Returns:dictionary of name to renderer class.
Return type:dict
classmethod by_name(name)[source]

Select specific Renderer subclass by name.

Parameters:name (str) – name of subclass.
Returns:Renderer subclass instance.
Return type:analog.renderers.Renderer
Raises:analog.exceptions.UnknownRendererError for unknown subclass names.
render(report, path_stats=False)[source]

Render report statistics.

Parameters:
  • report (analog.report.Report) – log analysis report object.
  • path_stats (bool) – include per path statistics in output.
Returns:

output string

Return type:

str
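One possible way to implement a name-based lookup like all_renderers() and by_name() is to walk the subclass tree and index classes by a name attribute. The sketch below uses stand-in classes (BaseRenderer, PlainText, Tabular, CSV), a hypothetical name attribute and a plain dict in place of a report; it is not analog's implementation.

```python
class UnknownRendererError(Exception):
    """Stand-in for analog.exceptions.UnknownRendererError."""


class BaseRenderer:
    name = None  # hypothetical lookup key; None marks abstract bases

    @classmethod
    def all_renderers(cls):
        """Map names to all named subclasses, however deeply nested."""
        found = {}
        pending = list(cls.__subclasses__())
        while pending:
            sub = pending.pop()
            pending.extend(sub.__subclasses__())
            if sub.name is not None:
                found[sub.name] = sub
        return found

    @classmethod
    def by_name(cls, name):
        """Return an instance of the subclass registered under name."""
        try:
            return cls.all_renderers()[name]()
        except KeyError:
            raise UnknownRendererError(name) from None

    def render(self, report, path_stats=False):
        raise NotImplementedError


class PlainText(BaseRenderer):
    name = "default"

    def render(self, report, path_stats=False):
        return "requests: {}".format(report["requests"])


class Tabular(BaseRenderer):
    pass  # abstract base without a name, so it is not selectable


class CSV(Tabular):
    name = "csv"


renderer = BaseRenderer.by_name("default")
print(renderer.render({"requests": 3}))
```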

Available Renderers

default

class analog.renderers.PlainTextRenderer[source]

Default renderer for plain text output in list format.

Tabular Data

class analog.renderers.TabularDataRenderer[source]

Base renderer for report output in any tabular form.

_list_stats(list_stats)[source]

Get list of (key,value) tuples for each attribute of list_stats.

Parameters:list_stats (analog.report.ListStats) – list statistics object.
Returns:(key, value) tuples for each ListStats attribute.
Return type:list of tuple
_tabular_data(report, path_stats)[source]

Prepare tabular data for output.

Generate a list of header fields, a list of total values for each field and a list of the same values per path.

Parameters:
  • report (analog.report.Report) – log analysis report object.
  • path_stats (bool) – include per path statistics in output.
Returns:

tuple of table (headers, rows).

Return type:

tuple

Visual Tables
class analog.renderers.ASCIITableRenderer[source]

Base renderer for report output in ascii-table format.

table

class analog.renderers.SimpleTableRenderer[source]

Renderer for tabular report output in simple reST table format.

grid

class analog.renderers.GridTableRenderer[source]

Renderer for tabular report output in grid table format.

Separated Values
class analog.renderers.SeparatedValuesRenderer[source]

Base renderer for report output in delimiter-separated values format.

csv

class analog.renderers.CSVRenderer[source]

Renderer for report output in comma separated values format.

tsv

class analog.renderers.TSVRenderer[source]

Renderer for report output in tab separated values format.
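The difference between the csv and tsv outputs comes down to the delimiter. A minimal sketch with the standard csv module, assuming a (headers, rows) table like the one _tabular_data is documented to return; the function name render_separated and the sample columns are made up for this example.

```python
import csv
import io


def render_separated(headers, rows, delimiter=","):
    """Write headers and rows as delimiter-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up table: one row of totals and one per-path row.
headers = ["path", "requests", "time_mean"]
rows = [["total", 3, 0.25], ["/foo", 2, 0.125]]

csv_text = render_separated(headers, rows)
tsv_text = render_separated(headers, rows, delimiter="\t")
print(csv_text)
print(tsv_text)
```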

Utils

class analog.utils.AnalogArgumentParser(prog=None, usage=None, description=None, epilog=None, parents=[], formatter_class=<class 'argparse.HelpFormatter'>, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True, allow_abbrev=True)[source]

ArgumentParser that reads multiple values per argument from files.

Arguments read from files may contain comma or whitespace separated values.

To read arguments from files create a parser with fromfile_prefix_chars set:

parser = AnalogArgumentParser(fromfile_prefix_chars='@')

Then this parser can be called with argument files:

parser.parse_args(['--arg1', '@args_file', 'more-args'])

The argument files contain one argument per line. Arguments can be comma or whitespace separated on a line. For example all of this works:

nginx
-o       table
--verb   GET, POST, PUT
--verb   PATCH
--status 404, 500
--path   /foo/bar
--path   /baz
--path-stats
-t
positional
arg
convert_arg_line_to_args(arg_line)[source]

Comma/whitespace-split arg_line and yield separate attributes.

Argument names defined at the beginning of a line (-a, --arg) are repeated for each argument value in arg_line.

Parameters:arg_line (str) – one line of argument(s) read from a file
Returns:argument generator
Return type:generator
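The behaviour described for convert_arg_line_to_args can be approximated by overriding the argparse hook of the same name. The class SplittingArgumentParser below is a hypothetical stand-in written for this example, not the AnalogArgumentParser source.

```python
import argparse
import re


class SplittingArgumentParser(argparse.ArgumentParser):
    """Hypothetical parser that splits argument file lines into values."""

    def convert_arg_line_to_args(self, arg_line):
        # Split on any run of commas and whitespace, dropping empty parts.
        values = [v for v in re.split(r"[,\s]+", arg_line.strip()) if v]
        if values and values[0].startswith("-"):
            option, values = values[0], values[1:]
            if not values:
                yield option  # bare flag such as --path-stats
            for value in values:
                # Repeat the option name for every value on the line.
                yield option
                yield value
        else:
            yield from values


parser = SplittingArgumentParser(fromfile_prefix_chars="@")
print(list(parser.convert_arg_line_to_args("--verb   GET, POST, PUT")))
print(list(parser.convert_arg_line_to_args("--path-stats")))
```

Repeating the option name lets arguments declared with action="append" collect each value separately, which matches the "multiple times" usage of --path, --verb and --status.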
class analog.utils.PrefixMatchingCounter(*args, **kwds)[source]

Counter-like object that increments a field if it has a common prefix.

Example: “400”, “401”, “404” all increment a field named “4”.
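A minimal sketch of the prefix idea, assuming the tracked prefixes are registered up front as keys. The class PrefixCounter and its inc method are hypothetical names for this example and may differ from the PrefixMatchingCounter interface.

```python
from collections import Counter


class PrefixCounter(Counter):
    """Hypothetical counter that increments every key prefixing a value."""

    def inc(self, value, count=1):
        value = str(value)
        # Iterate over a copy of the keys while updating counts.
        for key in list(self):
            if value.startswith(str(key)):
                self[key] += count


# Track all 2xx and 4xx responses, plus 404 on its own.
status = PrefixCounter({"2": 0, "4": 0, "404": 0})
for code in ["404", "400", "200", "500"]:
    status.inc(code)

print(dict(status))
```

A value can increment several keys at once ("404" counts for both "4" and "404"), and values without a registered prefix ("500" here) are not counted.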

Exceptions

Analog exceptions.

exception analog.exceptions.AnalogError[source]

Exception base class for all Analog errors.

exception analog.exceptions.InvalidFormatExpressionError[source]

Error raised for invalid format regex patterns.

exception analog.exceptions.MissingFormatError[source]

Error raised when Analyzer is called without format.

exception analog.exceptions.UnknownRendererError[source]

Error raised for unknown output format names (to select renderer).