LogsQL #

LogsQL is a simple yet powerful query language for VictoriaLogs. See examples, LogsQL tutorial, how to convert Loki queries to VictoriaLogs queries and SQL to LogsQL conversion guide.

LogsQL provides the following features:

Full-text search in any log field (defaults to _msg). See word filter, phrase filter and prefix filter.
Ability to combine filters into arbitrarily complex logical filters.
Ability to extract structured fields from unstructured logs at query time. See these docs.
Ability to calculate various stats over the selected log entries. See these docs.

LogsQL tutorial #

If you aren’t familiar with VictoriaLogs, then start with key concepts docs.

The simplest LogsQL query is just a word, which must be found in the log message. For example, the following query finds all the logs with the error word:

      error

It is recommended to use vlogscli for querying VictoriaLogs.

If the queried word clashes with LogsQL keywords, then just wrap it into quotes according to these docs. For example, the following query finds all the log messages with the and word:

      "and"

It is OK to wrap any word into quotes. For example:

      "error"

Moreover, it is possible to wrap phrases containing multiple words in quotes. For example, the following query finds log messages with the error: cannot find file phrase:

      "error: cannot find file"

Queries above match logs with any timestamp unless they are additionally limited by query-time filters or server-side limits, i.e. they may return logs with timestamps from the previous year alongside recently received logs.

Usually logs from the previous year aren’t as interesting as the recently ingested logs. So it is recommended to add a time filter to the query. For example, the following query returns logs with the error word whose _time falls within the last 5 minutes:

      error AND _time:5m

This query consists of two filters joined with the AND operator:

The filter on the error word.
The filter on the _time field.

The AND operator means that the log entry must match both filters in order to be selected.

A typical LogsQL query consists of multiple filters joined with the AND operator. It may be tiresome to type and then read all these AND words, so LogsQL allows omitting them. For example, the following query is equivalent to the query above:

      _time:5m error

The query returns logs in arbitrary order because sorting a large number of logs may require non-trivial amounts of CPU and RAM. If the number of logs with the error word over the last 5 minutes isn’t too big (e.g., less than a few million), then it is OK to sort them with the sort pipe. The following query sorts the selected logs by the _time field:

      _time:5m error | sort by (_time)

It is often convenient to limit the number of returned logs with the limit pipe. The following query returns the 10 most recent log entries (by _time) with the error word over the last 5 minutes:

      _time:5m error | sort by (_time) desc | limit 10

By default VictoriaLogs returns all the log fields. If you need only the given set of fields, then add the fields pipe to the end of the query. For example, the following query returns only the _time, _stream and _msg fields:

      error _time:5m | fields _time, _stream, _msg
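
The filter and pipes introduced so far can be pictured as ordinary list operations. The following Python sketch only illustrates the semantics over an in-memory list of dicts; it is not how VictoriaLogs executes queries. The run_query helper and the sample entries are hypothetical, and the sketch chains the sort, limit and fields pipes shown above into a single pipeline.

```python
from datetime import datetime, timedelta, timezone
import re


def run_query(logs, now, word="error", window=timedelta(minutes=5), limit=10):
    """Emulate `_time:5m error | sort by (_time) desc | limit 10 | fields _time, _stream, _msg`."""
    # Filter step: _time within (now - window, now] AND the word present in _msg.
    selected = [
        entry for entry in logs
        if now - window < entry["_time"] <= now
        and word in re.findall(r"\w+", entry.get("_msg", ""))
    ]
    # `sort by (_time) desc`: newest entries first.
    selected.sort(key=lambda entry: entry["_time"], reverse=True)
    # `limit N`, then `fields ...`: keep only the requested fields.
    wanted = ("_time", "_stream", "_msg")
    return [{k: e[k] for k in wanted if k in e} for e in selected[:limit]]


# Hypothetical sample data.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    {"_time": now - timedelta(minutes=1), "_msg": "error: a", "_stream": "{}", "host": "h1"},
    {"_time": now - timedelta(minutes=3), "_msg": "all good", "_stream": "{}"},
    {"_time": now - timedelta(minutes=4), "_msg": "an error happened", "_stream": "{}"},
    {"_time": now - timedelta(minutes=10), "_msg": "error: old", "_stream": "{}"},
]
result = run_query(sample, now)
```

Here result holds the two recent error entries, newest first, with the host field dropped.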

Suppose the query above selects too many rows because some buggy app pushes invalid error logs to VictoriaLogs. Suppose the app adds the buggy_app word to every log line. Then the following query removes all the logs from the buggy app, allowing us to pay attention to the real errors:

      _time:5m error NOT buggy_app

This query uses NOT operator for removing log lines from the buggy app. The NOT operator is used frequently, so it can be substituted with - or ! char. Also, = and ~ filters have shortcut negated forms: != and !~. The following query is equivalent to the previous one:

      _time:5m error -buggy_app

Suppose another buggy app starts pushing invalid error logs to VictoriaLogs - it adds foobar word to every emitted log line. No problems - just add -foobar to the query in order to remove these buggy logs:

      _time:5m error -buggy_app -foobar

This query can be rewritten to a clearer query with the OR operator inside parentheses:

      _time:5m error -(buggy_app OR foobar)

The parentheses are required here, since otherwise the query won’t return the expected results. The query error -buggy_app OR foobar is interpreted as (error AND NOT buggy_app) OR foobar according to the priorities of the AND, OR and NOT operators. This query returns logs with the foobar word, even if they do not contain the error word or contain the buggy_app word. So it is recommended to wrap the needed query parts into explicit parentheses if you are unsure about operator precedence rules. As an additional bonus, explicit parentheses make queries easier to read and maintain.
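
The precedence pitfall can be checked with plain booleans. Python's not, and and or operators happen to follow the same priority order as described above (NOT first, then AND, then OR), so this small sketch evaluates both query variants for a set of words found in a log message. The matches helper is hypothetical.

```python
def matches(words):
    """Return (without_parens, with_parens) for a set of words in a log message."""
    error = "error" in words
    buggy_app = "buggy_app" in words
    foobar = "foobar" in words
    # error -buggy_app OR foobar  is read as  (error AND NOT buggy_app) OR foobar
    without_parens = error and not buggy_app or foobar
    # error -(buggy_app OR foobar)
    with_parens = error and not (buggy_app or foobar)
    return without_parens, with_parens
```

For a message containing only the foobar word, the query without parentheses matches while the parenthesized one does not.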

Queries above assume that the error word is stored in the log message. If this word is stored in another field such as log.level, then add the log.level: prefix in front of the error word:

      _time:5m log.level:error -(buggy_app OR foobar)

The field name can be wrapped into quotes if it contains special chars or keywords, which may clash with LogsQL syntax. Any word can also be wrapped into quotes according to these docs. So the following query is equivalent to the previous one:

      _time:5m "log.level":"error" -("buggy_app" OR "foobar")

What if the application identifier - such as buggy_app and foobar - is stored in the app field? Correct - just add app: prefix in front of buggy_app and foobar:

      _time:5m log.level:error -(app:buggy_app OR app:foobar)

The query can be simplified by moving the app: prefix outside the parentheses:

      _time:5m log.level:error -app:(buggy_app OR foobar)

The app field uniquely identifies the application instance if a single instance runs for each unique app. In this case it is recommended to associate the app field with log stream fields during data ingestion. This usually improves both compression rate and query performance when querying the needed streams via the _stream filter. If the app field is associated with the log stream, then the query above can be rewritten to a more performant one:

      _time:5m log.level:error {app!~"buggy_app|foobar"}

This query skips scanning for log messages from buggy_app and foobar apps. It inspects only log.level and _stream labels. This significantly reduces disk read IO and CPU time needed for performing the query.

LogsQL also provides functions for statistics calculation over the selected logs. For example, the following query returns the number of logs with the error word for the last 5 minutes:

      _time:5m error | stats count() logs_with_error

Finally, it is recommended to read the performance tips.

Now you are familiar with LogsQL basics. See LogsQL examples and query syntax if you want to continue learning LogsQL.

Key concepts #

Word #

LogsQL treats log field values as sequences of words delimited by non-word chars such as whitespace, parens, punctuation chars, etc. For example, the foo: (bar,"тест")! string is split into foo, bar and тест words. Words can contain UTF-8-encoded Unicode letters and digits. These words are taken into account by full-text search filters such as word filter, phrase filter and prefix filter.
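
The splitting rule can be approximated with a Unicode-aware regular expression. The sketch below is an illustration only, not the VictoriaLogs tokenizer. It assumes that the underscore counts as a word char, which is consistent with the tutorial treating buggy_app as a single word but is not stated explicitly here. The split_words and has_word helpers are hypothetical.

```python
import re


def split_words(value):
    """Split a field value into words made of Unicode letters, digits and underscores."""
    # In Python 3, \w on str patterns matches Unicode letters and digits plus "_".
    return re.findall(r"\w+", value)


def has_word(value, word):
    """Case-sensitive whole-word check, similar in spirit to a word filter."""
    return word in split_words(value)
```

With these helpers, has_word("an error happened", "error") holds, while the errors word in "multiple errors occurred" does not count as the error word.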

Query syntax #

A LogsQL query must contain at least one filter for selecting the matching logs. For example, the following query selects all the logs for the last 5 minutes by using the _time filter:

      _time:5m

Tip: try the * filter, which selects all the logs stored in VictoriaLogs. Do not worry - this doesn’t crash VictoriaLogs, even if the query selects trillions of logs. See these docs if you are curious why.

In addition to filters, a LogsQL query may contain an arbitrary mix of optional actions for processing the selected logs. These actions are delimited by | and are known as pipes. For example, the following query uses the stats pipe for returning the number of log messages with the error word for the last 5 minutes:

      _time:5m error | stats count() errors

See the list of supported pipes in LogsQL.

Filters #

LogsQL supports various filters for searching for log messages (see below). They can be combined into arbitrarily complex queries via logical filters.

Filters are applied to the _msg field by default. If the filter must be applied to another log field, then put the field name followed by a colon in front of the filter. For example, if the error word filter must be applied to the log.level field, then use the log.level:error query.

Field names and filter args can be put into quotes if they contain special chars, which may clash with LogsQL syntax. LogsQL supports quoting via double quotes ", single quotes ' and backticks ` according to these docs:

      "some 'field':123":i('some("value")') AND `other"value'`

If in doubt, it is recommended to quote field names and filter args.

The list of LogsQL filters:

Time filter - matches logs with _time field in the given time range
Day range filter - matches logs with _time field in the given per-day time range
Week range filter - matches logs with _time field in the given per-week day range
Stream filter - matches logs, which belong to the given streams
_stream_id filter - matches logs, which belong to the given stream id(s)
Word filter - matches logs with the given word
Phrase filter - matches logs with the given phrase
Prefix filter - matches logs with the given word prefix or phrase prefix
Substring filter - matches logs with the given substring
Pattern match filter - matches logs by the given pattern
Range comparison filter - matches logs with field values in the provided range
Empty value filter - matches logs without the given log field
Any value filter - matches logs with the given non-empty log field
Exact filter - matches logs with the exact value for the given log field
Exact prefix filter - matches logs starting with the given prefix for the given log field
Multi-exact filter - matches logs with one of the specified exact values for the given log field
Subquery filter - matches logs with log field values matching the results of another query
contains_all filter - matches logs with log field containing all the provided words / phrases
contains_any filter - matches logs with log field containing at least one of the provided words / phrases
Case-insensitive filter - matches logs with the given case-insensitive word, phrase or prefix
contains_common_case filter - matches logs with log fields containing the given words and phrases with cases according to the given pattern
equals_common_case filter - matches logs with log fields equal to the given words and phrases with cases according to the given pattern
Sequence filter - matches logs with the given sequence of words or phrases
Regexp filter - matches logs for the given regexp
Range filter - matches logs with numeric field values in the given range
IPv4 range filter - matches logs with IP address field values in the given range
String range filter - matches logs with field values in the given string range
Length range filter - matches logs with field values of the given length range
Value type filter - matches logs with fields stored under the given value type
Fields’ equality filter - matches logs, which contain identical values in the given fields
Less than filter - matches logs where the given field value is smaller than the other field value
Less than or equal filter - matches logs where the given field value doesn’t exceed the other field value
Logical filter - allows combining other filters

Time filter #

VictoriaLogs scans all the logs for a query if it doesn’t contain a filter on the _time field. It uses various optimizations in order to accelerate queries without the _time filter, but such queries can be slow if the storage contains a large number of logs over a long time range. The easiest way to optimize queries is to narrow down the search with a filter on the _time field.

For example, the following query returns logs from the last hour, which contain the error word in the _msg field:

      _time:1h AND error

The following formats are supported for _time filter:

_time:duration - matches logs with timestamps on the time range (now-duration, now], where duration can have these values. Examples:
- _time:5m - returns logs for the last 5 minutes
- _time:2.5d15m42.345s - returns logs for the last 2.5 days, 15 minutes and 42.345 seconds
- _time:1y - returns logs for the last year
_time:>duration - matches logs with timestamps older than now-duration.
_time:<duration - matches logs with timestamps newer than now-duration. It is equivalent to _time:duration.
_time:YYYY-MM-DDZ - matches all the logs for the particular day by UTC. For example, _time:2023-04-25Z matches logs on April 25, 2023 by UTC.
_time:YYYY-MMZ - matches all the logs for the particular month by UTC. For example, _time:2023-02Z matches logs for February 2023 by UTC.
_time:YYYYZ - matches all the logs for the particular year by UTC. For example, _time:2023Z matches logs for 2023 by UTC.
_time:YYYY-MM-DDTHHZ - matches all the logs for the particular hour by UTC. For example, _time:2023-04-25T22Z matches logs for April 25, 2023 during hour 22 by UTC.
_time:YYYY-MM-DDTHH:MMZ - matches all the logs for the particular minute by UTC. For example, _time:2023-04-25T22:45Z matches logs on April 25, 2023 at 22:45 by UTC.
_time:YYYY-MM-DDTHH:MM:SSZ - matches all the logs for the particular second by UTC. For example, _time:2023-04-25T22:45:59Z matches logs on April 25, 2023 at 22:45:59 by UTC.
_time:>min_time - matches logs with timestamps greater than min_time.
_time:>=min_time - matches logs with timestamps greater than or equal to min_time.
_time:<max_time - matches logs with timestamps smaller than max_time.
_time:<=max_time - matches logs with timestamps smaller than or equal to max_time.
_time:[min_time, max_time] - matches logs on the time range [min_time, max_time], including both min_time and max_time. The min_time and max_time can contain any format specified here. For example, _time:[2023-04-01Z, 2023-04-30Z] matches logs for the whole of April 2023 by UTC, i.e. it is equivalent to _time:2023-04Z.
_time:[min_time, max_time) - matches logs on the time range [min_time, max_time), not including max_time. The min_time and max_time can contain any format specified here. For example, _time:[2023-02-01Z, 2023-03-01Z) matches logs for the whole of February 2023 by UTC, i.e. it is equivalent to _time:2023-02Z.
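
The range arithmetic behind the relative and calendar formats can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the VictoriaLogs parser: only the duration units that appear in the examples above are handled, treating y as 365 days is an assumption, and the parse_duration, in_last and month_range_utc helpers are hypothetical.

```python
from datetime import datetime, timedelta, timezone
import re

# Units limited to those used in the examples above; `y` as 365 days is an assumption.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 7 * 86400, "y": 365 * 86400}


def parse_duration(text):
    """Parse strings such as '5m' or '2.5d15m42.345s' into a timedelta."""
    parts = re.findall(r"(\d+(?:\.\d+)?)([smhdwy])", text)
    # Reject input that is not fully covered by number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(float(num) * UNIT_SECONDS[unit] for num, unit in parts))


def in_last(ts, now, duration):
    """_time:duration covers the half-open range (now - duration, now]."""
    return now - parse_duration(duration) < ts <= now


def month_range_utc(year, month):
    """_time:YYYY-MMZ covers [first day of month, first day of next month) by UTC."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    end = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
    return start, end
```

For instance, month_range_utc(2023, 2) yields the same bounds as the [2023-02-01Z, 2023-03-01Z) range mentioned above.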

It is possible to specify a time zone offset for all the absolute time formats by appending a +hh:mm or -hh:mm suffix. For example, _time:2023-04-25+05:30 matches all the logs for April 25, 2023 in the India time zone, while _time:2023-02-07:00 matches all the logs for February 2023 in the California time zone.

If the timezone offset information is missing, then the local time zone of the host where VictoriaLogs runs is used. See how to control the timezone at VictoriaLogs server. For example, _time:2023-10-20 matches all the logs for the 2023-10-20 day according to the local time zone of the host where VictoriaLogs runs.

It is possible to specify a generic offset for the selected time range by appending offset after the _time filter. Examples:

_time:offset 1h matches logs until now-1h.
_time:5m offset 1h matches logs on the time range (now-1h5m, now-1h].
_time:2023-07Z offset 5h30m matches logs on July, 2023 by UTC with offset 5h30m.
_time:[2023-02-01Z, 2023-03-01Z) offset 1w matches logs the week before the time range [2023-02-01Z, 2023-03-01Z) by UTC.
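
The effect of offset on a relative range is a plain shift of both bounds into the past. A minimal Python sketch of the `_time:5m offset 1h` example follows; the shifted_range helper is hypothetical and the returned pair is meant to be read as the half-open range (start, end].

```python
from datetime import datetime, timedelta, timezone


def shifted_range(now, duration, offset):
    """Return (start, end) for `_time:duration offset <offset>`, read as (start, end]."""
    end = now - offset
    return end - duration, end


# Hypothetical "now" used for illustration.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
start, end = shifted_range(now, timedelta(minutes=5), timedelta(hours=1))
```

With now at 12:00, the range becomes (10:55, 11:00], i.e. (now-1h5m, now-1h].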

See also the time_offset option, which allows applying the given offset to all the filters on the _time field without the need to modify the query.

See also the time_add pipe, which allows adding the given duration to the given log field.

Performance tips:

It is recommended to specify the smallest possible time range during the search, since it reduces the number of log entries that need to be scanned during the query. For example, _time:1h is usually faster than _time:5h.
While LogsQL supports an arbitrary number of _time:... filters at any level of logical filters, it is recommended to specify a single _time filter at the top level of the query.
See other performance tips.

Day range filter #

_time:day_range[start, end] filter allows returning logs in the particular start ... end time for each day, where start and end have the format hh:mm. The range is evaluated in the VictoriaLogs server-side timezone. See how to change the server-side timezone. For example, the following query matches logs between 08:00 and 18:00 each day:

      _time:day_range[08:00, 18:00)

This query includes 08:00, while 18:00 is excluded, i.e. the last matching time is 17:59:59.999999999. Replace [ with ( in order to exclude the starting time. Replace ) with ] in order to include the ending time. For example, the following query matches logs between 08:00 and 18:00, excluding 08:00:00.000000000 and including 18:00:

      _time:day_range(08:00, 18:00]

If the time range must be applied to a time zone other than the local one, then add offset <duration>, where <duration> can have any supported duration value. For example, the following query selects logs between 08:00 and 18:00 in the +0200 time zone:

      _time:day_range[08:00, 18:00) offset 2h

Use offset 0h for filtering logs on the given time range in the UTC time zone:

      _time:day_range[08:00, 18:00) offset 0h
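
The bracket and offset rules can be summarized with a small Python sketch. It is an illustration only and rests on two assumptions: timestamps are in UTC, and offset is the UTC offset of the target time zone, which is added to the timestamp before the time of day is compared. The in_day_range helper is hypothetical.

```python
from datetime import datetime, time, timedelta, timezone


def in_day_range(ts_utc, start, end, include_start=True, include_end=False,
                 offset=timedelta(0)):
    """Check whether the time of day of ts_utc, shifted by offset, falls into the range."""
    # Assumption: the offset moves the UTC timestamp into the target time zone.
    local = (ts_utc + offset).time()
    after_start = local >= start if include_start else local > start
    before_end = local <= end if include_end else local < end
    return after_start and before_end
```

Under these assumptions a log written at 06:00 UTC matches `_time:day_range[08:00, 18:00) offset 2h`, since it is 08:00 in the +0200 time zone, but it does not match the same range with `offset 0h`.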

If logs outside the given per-day time range must be selected, then put NOT or - in front of the day_range filter. For example, the following query selects logs outside working hours [08:00 - 17:00]:

      -_time:day_range[08:00, 17:00]

Performance tip: it is recommended to specify a regular time filter in addition to the day_range filter. For example, the following query selects logs between 08:00 and 18:00 every day for the last week:

      _time:1w _time:day_range[08:00, 18:00)

Week range filter #

_time:week_range[start, end] filter allows returning logs on the particular start ... end days for each week, in the local timezone of the VictoriaLogs server. See how to control the local timezone for VictoriaLogs.

The following values are supported for the start and end:

Sun or Sunday
Mon or Monday
Tue or Tuesday
Wed or Wednesday
Thu or Thursday
Fri or Friday
Sat or Saturday

For example, the following query matches logs between Monday and Friday by local timezone at VictoriaLogs server:

      _time:week_range[Mon, Fri]

This query includes Monday and Friday. Replace [ with ( in order to exclude the starting day. Replace ] with ) in order to exclude the ending day. For example, the following query matches logs between Sunday and Saturday, excluding Sunday and Saturday (i.e. it is equivalent to the previous query):

      _time:week_range(Sun, Sat)
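
The equivalence of `[Mon, Fri]` and `(Sun, Sat)` can be verified with a short Python sketch. It assumes that days are ordered from Sunday to Saturday, as in the list of supported values above, and that start does not come after end. The in_week_range helper is hypothetical and ignores time zones.

```python
from datetime import date, timedelta

DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def in_week_range(day, start, end, include_start=True, include_end=True):
    """Check whether the weekday of `day` falls into the given weekday range."""
    # date.weekday() returns Monday=0 ... Sunday=6; remap to Sunday=0 ... Saturday=6.
    idx = (day.weekday() + 1) % 7
    lo = DAYS.index(start) + (0 if include_start else 1)
    hi = DAYS.index(end) - (0 if include_end else 1)
    return lo <= idx <= hi
```

Checking every day of one week with both variants gives identical results, which is what the equivalence claim above states.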

If the week range must be applied to a time zone other than the local one at the VictoriaLogs server, then add offset <duration>, where <duration> can have any supported duration value. For example, the following query selects logs between Monday and Friday in the +0200 time zone:

      _time:week_range[Mon, Fri] offset 2h

If logs must be selected on the given week days according to UTC, then use offset 0h:

      _time:week_range[Mon, Fri] offset 0h

The week_range filter can be combined with the day_range filter using logical filters. For example, the following query selects logs between 08:00 and 18:00 every workday of the week:

      _time:week_range[Mon, Fri] _time:day_range[08:00, 18:00)

If logs outside the given weekday range must be selected, then put NOT or - in front of the week_range filter. For example, the following query selects logs outside work days (i.e. logs for Saturday and Sunday):

      -_time:week_range[Mon, Fri]

Performance tip: it is recommended to specify a regular time filter in addition to the week_range filter. For example, the following query selects logs between Monday and Friday for each week over the last 4 weeks:

      _time:4w _time:week_range[Mon, Fri]

Stream filter #

VictoriaLogs provides an optimized way to select logs, which belong to particular log streams. This can be done via the {...} filter, which may contain an arbitrary Prometheus-compatible label selector over fields associated with log streams. Assuming that the app field is a stream field, the following query selects log entries with the app field equal to nginx:

      {app="nginx"}

This query is equivalent to the following exact filter query, but the stream filter query usually works much faster:

      app:="nginx"

The stream filter supports {label in (v1,...,vN)} and {label not_in (v1,...,vN)} syntax. It is equivalent to {label=~"v1|...|vN"} and {label!~"v1|...|vN"} respectively. The v1, …, vN are properly escaped inside the regexp. For example, {app in ("nginx", "foo.bar")} is equivalent to {app=~"nginx|foo\\.bar"} - note that the . char is properly escaped.
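
The rewrite of in (...) into a regexp can be sketched with Python's re.escape. This is only an illustration of the escaping idea, not the VictoriaLogs implementation; the in_to_regex helper is hypothetical, and full-string matching via re.fullmatch is used here on the assumption that the selector regexp is anchored, as in Prometheus label selectors.

```python
import re


def in_to_regex(values):
    """Build an alternation regexp from literal values, escaping regexp metachars."""
    return "|".join(re.escape(value) for value in values)


# The dot in "foo.bar" is escaped, so it no longer matches an arbitrary char.
pattern = in_to_regex(["nginx", "foo.bar"])
```

The resulting pattern matches nginx and foo.bar as whole values, while a value such as fooxbar is rejected because the dot is escaped.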

It is allowed to add the _stream: prefix in front of the {...} filter in order to make clear that the filtering is performed on the _stream log field. The following filter is equivalent to {app="nginx"}:

      _stream:{app="nginx"}

Performance tips:

It is recommended to use the most specific {...} filter matching the smallest number of log streams, which need to be scanned by the rest of the filters in the query.
While LogsQL supports an arbitrary number of {...} filters at any level of logical filters, it is recommended to specify a single {...} filter at the top level of the query.
See other performance tips.

_stream_id filter #

Every log stream in VictoriaLogs is uniquely identified by _stream_id field. The _stream_id:... filter allows quickly selecting all the logs belonging to the particular stream.

For example, the following query selects all the logs, which belong to the log stream with _stream_id equal to 0000007b000001c850d9950ea6196b1a4812081265faa1c7:

      _stream_id:0000007b000001c850d9950ea6196b1a4812081265faa1c7

If the log stream contains too many logs, then it is a good idea to limit the number of returned logs with a time filter. For example, the following query selects logs for the given stream for the last hour:

      _time:1h _stream_id:0000007b000001c850d9950ea6196b1a4812081265faa1c7

The _stream_id filter supports specifying multiple _stream_id values via _stream_id:in(...) syntax. For example:

      _stream_id:in(0000007b000001c850d9950ea6196b1a4812081265faa1c7, 1230007b456701c850d9950ea6196b1a4812081265fff2a9)

It is also possible to specify a subquery inside in(...) (the subquery must end with fields _stream_id or uniq by (_stream_id)), which selects the needed _stream_id values. For example, the following query returns logs for log streams containing error word in the _msg field during the last 5 minutes:

      _stream_id:in(_time:5m error | fields _stream_id)

Word filter #

The simplest LogsQL query consists of a single word to search in log messages. For example, the following query matches log messages with error word inside them:

      error

This query matches the following log messages:

error
an error happened
error: cannot open file

This query doesn’t match the following log messages:

ERROR, since the filter is case-sensitive by default. Use i(error) for this case. See these docs for details.
multiple errors occurred, since the errors word doesn’t match error word. Use error* for this case. See these docs for details.

By default the given word is searched in the _msg field. Specify the field name in front of the word and put a colon after it if it must be searched in the given field. For example, the following query returns log entries containing the error word in the log.level field:

      log.level:error

Both the field name and the word in the query can contain UTF-8-encoded Unicode letters and digits. For example:

      სფერო:τιμή

Put the field name and/or the word in the query inside quotes if they contain other chars (e.g. :), which may clash with the query syntax. For example, the following query searches for the ip 1.2.3.45 in the field ip:remote:

      "ip:remote":"1.2.3.45"

Phrase filter #

If you need to search for log messages with the specific phrase inside them, then just wrap the phrase into quotes according to these docs. The phrase can contain any chars, including whitespace, punctuation, parens, etc. They are taken into account during the search. For example, the following query matches log messages with the ssh: login fail phrase inside them:

      "ssh: login fail"

This query matches the following log messages:

ERROR: ssh: login fail for user "foobar"
ssh: login fail!

This query doesn’t match the following log messages:

ssh login fail, since the message lacks the : char just after ssh. Use the seq("ssh", "login", "fail") query if log messages with the sequence of these words must be found. See these docs for details.
login fail: ssh error, since the message doesn’t contain the full phrase requested in the query. If you need to match a message with all the words listed in the query, then use the ssh AND login AND fail query. See these docs for details.
ssh: login failed, since the message ends with the failed word instead of the fail word. Use the "ssh: login fail"* query for this case. See these docs for details.
SSH: login fail, since the SSH word is in capital letters. Use i("ssh: login fail") for case-insensitive search. See these docs for details.
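
The matching and non-matching cases above can be reproduced with a regular expression that requires word boundaries around the phrase. The Python sketch below approximates the described behavior; it is not the VictoriaLogs matcher, and the phrase_matches helper is hypothetical.

```python
import re


def phrase_matches(message, phrase):
    """Case-sensitive phrase search that must start and end on word boundaries."""
    # (?<!\w) and (?!\w) forbid a word char right before and right after the phrase.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, message) is not None
```

For example, phrase_matches("ssh: login failed", "ssh: login fail") is false because the phrase is followed by more word chars.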

If the phrase contains double quotes, then either put \ in front of double quotes or put the phrase inside single quotes. For example, the following filter searches logs with "foo":"bar" phrase:

      '"foo":"bar"'

By default the given phrase is searched in the _msg field. Specify the field name in front of the phrase and put a colon after it if it must be searched in the given field. For example, the following query returns log entries containing the cannot open file phrase in the event.original field:

      event.original:"cannot open file"

Both the field name and the phrase can contain arbitrary UTF-8-encoded chars. For example:

      შეტყობინება:"Το αρχείο δεν μπορεί να ανοίξει"

The field name can be put inside quotes if it contains special chars, which may clash with the query syntax. For example, the following query searches for the cannot open file phrase in the field some:message:

      "some:message":"cannot open file"

Prefix filter #

If you need to search for log messages with words / phrases containing some prefix, then just add the * char to the end of the word / phrase in the query. For example, the following query returns log messages, which contain words with the err prefix:

      err*

This query matches the following log messages:

err: foobar
cannot open file: error occurred

This query doesn’t match the following log messages:

Error: foobar, since the Error word starts with capital letter. Use i(err*) for this case. See these docs for details.
fooerror, since the fooerror word doesn’t start with err. Use *err* for this case. See these docs for details.

The prefix filter can be applied to phrases put inside quotes according to these docs. For example, the following query matches log messages containing phrases with the unexpected fail prefix:

      "unexpected fail"*

This query matches the following log messages:

unexpected fail: IO error
error:unexpected failure

This query doesn’t match the following log messages:

unexpectedly failed, since the unexpectedly word doesn’t match the unexpected word. Use unexpected* AND fail* for this case. See these docs for details.
failed to open file: unexpected EOF, since the failed word occurs before the unexpected word. Use unexpected AND fail* for this case. See these docs for details.
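
The prefix cases above differ from phrase matching only at the right edge: the prefix must start at a word boundary, but anything may follow it. A Python sketch that approximates this behavior is shown below; it is not the VictoriaLogs matcher, and the prefix_matches helper is hypothetical.

```python
import re


def prefix_matches(message, prefix):
    """Case-sensitive search for a word or phrase prefix starting on a word boundary."""
    # (?<!\w) forbids a word char right before the prefix; the right edge is open.
    return re.search(r"(?<!\w)" + re.escape(prefix), message) is not None
```

For example, prefix_matches("fooerror", "err") is false because err does not start a word there.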

If the prefix contains double quotes, then either put \ in front of double quotes or put the prefix inside single quotes. For example, the following filter searches logs with "foo":"bar prefix:

      '"foo":"bar'*

By default the prefix filter is applied to the _msg field. Specify the needed field name in front of the prefix filter in order to apply it to the given field. For example, the following query matches the log.level field containing any word with the err prefix:

      log.level:err*

If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes according to these docs. For example, the following query matches the log:level field containing any word with the err prefix:

      "log:level":err*

Performance tips:

Prefer using word filters and phrase filters combined via logical filters instead of the prefix filter.
Prefer moving word filters and phrase filters in front of the prefix filter when using logical filters.
See other performance tips.

Pattern match filter #

VictoriaLogs supports filtering logs by patterns with the following filters:

pattern_match("pattern") - matches the given pattern for any part of the _msg
pattern_match_full("pattern") - matches the given pattern for the whole _msg
pattern_match_prefix("pattern") - matches the given pattern at the beginning of the _msg
pattern_match_suffix("pattern") - matches the given pattern at the end of the _msg

These filters can be applied to any given log field with the log_field:pattern_match("pattern") syntax.

The "pattern" must contain the text to match, plus an arbitrary number of the following placeholders:

<N> - matches an integer number. It also matches hexadecimal numbers with the length of 4 chars and longer and with even length. For example, it matches 123 and 12abcdEF. To match floating point numbers as a whole (e.g. 123.456), use <N>.<N> pattern.
<UUID> - matches any UUID such as 2edfed59-3e98-4073-bbb2-28d321ca71a7.
<IP4> - matches IPv4 such as 123.45.67.89. Use <IP4>/<N> for matching IPv4 masks.
<TIME> - matches time strings such as 10:20:30. It also captures fractional seconds such as 10:20:30.123 and 10:20:30,123.
<DATE> - matches date strings such as 2025-10-20 and 2025/10/20.
<DATETIME> - matches datetime strings such as 2025-10-20T08:09:11 and 2025-10-20 08:09:11. It also captures fractional seconds and timezones.
<W> - matches any word or any quoted string in single quotes, double quotes and backticks.

Such patterns are generated by the collapse_nums pipe.

For example, the following filter matches _msg field with the <arbitrary_prefix>user_id=123, ip=45.67.89.12, time=2025-10-20T23:32:12Z<arbitrary_suffix> contents:

      pattern_match("user_id=<N>, ip=<IP4>, time=<DATETIME>")
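
One way to build intuition for placeholders is to translate a pattern into a regular expression. The Python sketch below does this for three placeholders only, using rough stand-in regexps that are assumptions of this example; the real matching rules described above are richer (for instance, <N> also accepts some hexadecimal numbers). The PLACEHOLDERS table and both helpers are hypothetical.

```python
import re

# Rough regexp stand-ins for a few placeholders; these are simplifications.
PLACEHOLDERS = {
    "<N>": r"\d+",
    "<IP4>": r"\d{1,3}(?:\.\d{1,3}){3}",
    "<DATETIME>": r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?(?:Z|[+-]\d{2}:\d{2})?",
}


def pattern_to_regex(pattern):
    """Replace known placeholders with regexps and escape everything else literally."""
    parts = re.split(r"(<[A-Z0-9]+>)", pattern)
    # Placeholders missing from the table are treated as literal text in this sketch.
    return "".join(PLACEHOLDERS.get(part, re.escape(part)) for part in parts)


def pattern_match(message, pattern):
    """Match the pattern against any part of the message, like pattern_match(...)."""
    return re.search(pattern_to_regex(pattern), message) is not None
```

With this sketch, the pattern from the example above matches a message containing user_id=123, ip=45.67.89.12, time=2025-10-20T23:32:12Z surrounded by arbitrary text.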

Substring filter #

To find logs containing some substring, use the *substring* filter. The substring can be put in quotes according to these docs if needed. For example, the following query matches log entries, which contain the ampl text in the _msg field:

      *ampl*

It matches the following messages:

Example message
This is a sample

It doesn’t match EXAMPLE message, since the AMPL substring here is in uppercase. Use the ~"(?i)ampl" filter instead. Note that a case-insensitive filter may be much slower than a case-sensitive one.

Performance tip: prefer using the word filter and phrase filter, since the substring filter may be quite slow.

Range comparison filter #

LogsQL supports field:>X, field:>=X, field:<X and field:<=X filters, where field is the name of a log field and X is a numeric value, an IPv4 address or a string. For example, the following query returns logs containing numeric values for the response_size field bigger than 10*1024:

      response_size:>10KiB

The following query returns logs with username field containing string values smaller than John:

      username:<"John"
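
The two examples above can be sketched in Python as follows. This is an illustration only: it handles plain numbers and the KiB suffix from the example, treats non-numeric or missing values as non-matching, and compares strings by code point. All of these choices are assumptions of the sketch rather than documented VictoriaLogs behavior, and the helper names are hypothetical.

```python
def parse_number(text):
    """Parse a plain number or a number with the KiB suffix (1KiB = 1024)."""
    if text.endswith("KiB"):
        return float(text[:-3]) * 1024
    return float(text)


def numeric_greater_than(entry, field, bound):
    """Emulate field:>bound for numeric values; non-numeric values do not match."""
    try:
        return parse_number(entry.get(field, "")) > parse_number(bound)
    except ValueError:
        return False


def string_less_than(entry, field, bound):
    """Emulate field:<"bound" for strings; a missing field does not match here."""
    value = entry.get(field, "")
    return value != "" and value < bound
```

For instance, an entry with response_size equal to 20480 passes the `response_size:>10KiB` check, since 10KiB equals 10240.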

Empty value filter #

Sometimes you need to find log entries without the given log field. This can be done with the log_field:"" syntax. For example, the following query matches log entries without the host.hostname field (VictoriaLogs treats empty values as non-existing):

      host.hostname:""

Any value filter #

Sometimes it is needed to find log entries containing any non-empty value for the given log field . This can be performed with log_field:* syntax. For example, the following query matches log entries with non-empty host.hostname field:

      host.hostname:*

No-op filter #

Sometimes it is needed to apply a no-op filter to the given log field. Such a filter does nothing, i.e. it matches any logs, even if they do not contain the given log field.

The following options are supported for no-op filter:

field_name:in(*) - a special case for the in() filter
field_name:contains_any(*) - a special case for the contains_any() filter
field_name:contains_all(*) - a special case for the contains_all() filter

Exact filter #

The word filter and phrase filter return log messages , which contain the given word or phrase inside them. The message may contain additional text other than the requested word or phrase. If you need to search for log messages or log fields with the exact value, then use the exact filter. For example, the following query returns log messages with the exact value fatal error: cannot find /foo/bar:

      ="fatal error: cannot find /foo/bar"

The query doesn’t match the following log messages:

fatal error: cannot find /foo/bar/baz or some-text fatal error: cannot find /foo/bar, since they contain an additional text other than the specified in the exact filter. Use "fatal error: cannot find /foo/bar" query in this case. See these docs for details.
FATAL ERROR: cannot find /foo/bar, since the exact filter is case-sensitive. Use i("fatal error: cannot find /foo/bar") in this case. See these docs for details.
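
The difference between the exact filter and the phrase filter can be illustrated with a small Python sketch. It only models the matching rules described above and is not VictoriaLogs code: the helper names exact_match and phrase_match are hypothetical, and the phrase check is simplified to a plain substring test that ignores word boundaries.

```python
def exact_match(value, expected):
    # Models ="...": the whole value must equal the given string, case-sensitively.
    return value == expected


def phrase_match(value, phrase):
    # Simplified model of "...": the phrase may be surrounded by other text.
    # Assumption: word boundaries are ignored here for brevity.
    return phrase in value


expected = "fatal error: cannot find /foo/bar"
messages = [
    "fatal error: cannot find /foo/bar",
    "fatal error: cannot find /foo/bar/baz",
    "some-text fatal error: cannot find /foo/bar",
    "FATAL ERROR: cannot find /foo/bar",
]

exact_hits = [m for m in messages if exact_match(m, expected)]
phrase_hits = [m for m in messages if phrase_match(m, expected)]
```

In this sketch only the first message passes the exact check, the first three pass the phrase check, and the uppercase message passes neither, which is why the docs suggest i(...) for it.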

By default the exact filter is applied to the _msg field . Specify the field name in front of the exact filter and put a colon after it if it must be searched in the given field. For example, the following query returns log entries with the exact error value at log.level field:

      log.level:="error"

Both the field name and the phrase can contain arbitrary utf-8 -encoded chars. For example:

      log.დონე:="შეცდომა"

The field name can be put inside quotes if it contains special chars, which may clash with the query syntax. For example, the following query matches the error value in the field log:level:

      "log:level":="error"

Exact prefix filter #

Sometimes it is needed to find log messages starting with some prefix. This can be done with the ="prefix"* filter. For example, the following query matches log messages, which start with the Processing request prefix:

      ="Processing request"*

This filter matches the following log messages :

Processing request foobar
Processing requests from ...

It doesn’t match the following log messages:

processing request foobar, since the log message starts with lowercase p. Use ="processing request"* OR ="Processing request"* query in this case. See these docs for details.
start: Processing request, since the log message doesn’t start with Processing request. Use "Processing request" query in this case. See these docs for details.

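By default the exact prefix filter is applied to the _msg field. Specify the field name in front of the filter and put a colon after it if the filter must be applied to the given field. For example, the following query matches log entries where the log.level field starts with the err prefix:

      log.level:="err"*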

Both the field name and the phrase can contain arbitrary utf-8 -encoded chars. For example:

      log.დონე:="შეცდომა"*

The field name can be put inside quotes if it contains special chars, which may clash with the query syntax. For example, the following query matches log:level values starting with err prefix:

      "log:level":="err"*

Multi-exact filter #

Sometimes it is needed to locate log messages with a field containing one of the given values. This can be done with multiple exact filters combined into a single logical filter . For example, the following query matches log messages with log.level field containing either error or fatal exact values:

      log.level:(="error" OR ="fatal")

While this solution works OK, LogsQL provides a simpler and faster solution for this case - the in() filter:

      log.level:in("error", "fatal")

It works very fast for long lists passed to in().

There is a special case - in(*) - this filter matches all the logs. See no-op filter docs for details.

It is possible to pass arbitrary query inside in(...) filter in order to match against the results of this query. See these docs for details.

contains_all filter #

If it is needed to find logs, which contain all the given words / phrases, then v1 AND v2 ... AND vN logical filter can be used. VictoriaLogs provides an alternative approach with the contains_all(v1, v2, ..., vN) filter. For example, the following query matches logs, which contain both foo word and "bar baz" phrase in the _msg field :

      contains_all(foo, "bar baz")

This is equivalent to the following query:

      foo AND "bar baz"

There is a special case - contains_all(*) - this filter matches all the logs. See no-op filter docs for details.

It is possible to pass arbitrary query inside contains_all(...) filter in order to match against the results of this query. See these docs for details.

contains_any filter #

Sometimes it is needed to find logs, which contain at least one word or phrase out of many words / phrases. This can be done with v1 OR v2 OR ... OR vN logical filter . VictoriaLogs provides an alternative approach with the contains_any(v1, v2, ..., vN) filter. For example, the following query matches logs, which contain foo word or "bar baz" phrase in the _msg field :

      contains_any(foo, "bar baz")

This is equivalent to the following query:

      foo OR "bar baz"

There is a special case - contains_any(*) - this filter matches all the logs. See no-op filter docs for details.

It is possible to pass arbitrary query inside contains_any(...) filter in order to match against the results of this query. See these docs for details.
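
The equivalence of contains_all with AND and of contains_any with OR can be sketched in Python. This is an illustration under stated assumptions, not the VictoriaLogs implementation: has_phrase is a hypothetical helper, and a word is approximated as a run of letters, digits and underscores.

```python
import re


def has_phrase(message, phrase):
    # Assumption: the phrase must not be glued to adjacent word characters.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, message) is not None


def contains_all(message, *values):
    # Behaves like v1 AND v2 AND ... AND vN.
    return all(has_phrase(message, v) for v in values)


def contains_any(message, *values):
    # Behaves like v1 OR v2 OR ... OR vN.
    return any(has_phrase(message, v) for v in values)


both = contains_all("foo and then bar baz", "foo", "bar baz")
only_one = contains_all("foo only", "foo", "bar baz")
at_least_one = contains_any("foo only", "foo", "bar baz")
glued = contains_any("foobar", "foo", "bar baz")
```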

Subquery filter #

Sometimes it is needed to select logs with fields matching values selected by another query (aka subquery). LogsQL provides such an ability with the following filters:

field:in(<subquery>) - it returns logs with field values matching the unique values returned by the <subquery>. For example, the following query selects all the logs for the last 5 minutes for users, who visited pages with admin word in the path field during the last day:
```
_time:5m AND user_id:in(_time:1d AND path:admin | fields user_id)
```
field:contains_all(<subquery>) - it returns logs with field values containing all the words and phrases returned by the <subquery> (duplicate values from the <subquery> are ignored). For example, the following query selects all the logs for the last 5 minutes, which contain all the user_id values from admin logs over the last day in the _msg field:
```
_time:5m _msg:contains_all(_time:1d is_admin:true | fields user_id)
```
field:contains_any(<subquery>) - it returns logs with the field values containing at least one word or phrase returned by the <subquery> (duplicate values from the <subquery> are ignored). For example, the following query selects all the logs for the last 5 minutes, which contain at least one user_id value from admin logs over the last day in the _msg field:
```
_time:5m _msg:contains_any(_time:1d is_admin:true | fields user_id)
```

The <subquery> must end with either fields pipe or uniq pipe containing a single field name, so VictoriaLogs could use values of this field for matching the given filter.
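
Conceptually, a field:in(<subquery>) filter can be thought of as a two-step evaluation: collect the unique values produced by the subquery, then keep the outer logs whose field value is in that set. The Python sketch below models this for the user_id:in(...) example. The in-memory logs list, the age_minutes key and the has_word helper are made up for illustration, and the actual execution strategy of VictoriaLogs may differ.

```python
import re

# Hypothetical log entries; age_minutes stands in for the _time field.
logs = [
    {"age_minutes": 600, "user_id": "u1", "path": "/admin/users"},
    {"age_minutes": 2, "user_id": "u1", "path": "/home"},
    {"age_minutes": 3, "user_id": "u2", "path": "/home"},
    {"age_minutes": 4000, "user_id": "u2", "path": "/admin"},
]


def has_word(text, word):
    # Assumption: a word is delimited by non-word characters.
    return re.search(r"(?<!\w)" + re.escape(word) + r"(?!\w)", text) is not None


# Step 1: the subquery `_time:1d AND path:admin | fields user_id`.
admin_users = {
    e["user_id"]
    for e in logs
    if e["age_minutes"] <= 24 * 60 and has_word(e["path"], "admin")
}

# Step 2: the outer query `_time:5m AND user_id:in(...)`.
selected = [
    e for e in logs if e["age_minutes"] <= 5 and e["user_id"] in admin_users
]
```

Here u2 visited an admin page more than a day ago, so only the recent log entry of u1 is selected.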

Case-insensitive filter #

Case-insensitive filter can be applied to any word, phrase or prefix by wrapping the corresponding word filter , phrase filter or prefix filter into i(). For example, the following query returns log messages with error word in any case:

      i(error)

The query matches the following log messages :

unknown error happened
ERROR: cannot read file
Error: unknown arg
An ErRoR occurred

The query doesn’t match the following log messages:

FooError, since the FooError word has superfluous prefix Foo. Use ~"(?i)error" for this case. See these docs for details.
too many Errors, since the Errors word has superfluous suffix s. Use i(error*) for this case.
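
The matching rules above can be approximated in Python with case-insensitive regular expressions. This is a rough model rather than the VictoriaLogs implementation: i_word and i_prefix are hypothetical helpers, and a word is assumed to be a run of letters, digits and underscores.

```python
import re


def i_word(message, word):
    # Models i(word): the whole word must match, in any letter case.
    pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
    return re.search(pattern, message, re.IGNORECASE) is not None


def i_prefix(message, prefix):
    # Models i(prefix*): a word must start with the prefix, in any letter case.
    pattern = r"(?<!\w)" + re.escape(prefix)
    return re.search(pattern, message, re.IGNORECASE) is not None


matching = [
    "unknown error happened",
    "ERROR: cannot read file",
    "Error: unknown arg",
    "An ErRoR occurred",
]
not_matching = ["FooError", "too many Errors"]

word_hits = [m for m in matching + not_matching if i_word(m, "error")]
prefix_hits = [m for m in not_matching if i_prefix(m, "error")]
```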

By default the i() filter is applied to the _msg field . Specify the needed field name in front of the filter in order to apply it to the given field. For example, the following query matches log.level field containing error word in any case:

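      log.level:i(error)

If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes according to these docs. For example, the following query matches the log:level field containing the error word in any case:

      "log:level":i("error")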

Performance tips:

Prefer using contains_common_case filter over i(...), since contains_common_case(...) usually works much faster.
Prefer using case-sensitive filters such as word filter and phrase filter over case-insensitive filter.
Prefer moving word filter , phrase filter and prefix filter in front of the case-insensitive filter when using logical filter with AND.
See other performance tips .

equals_common_case filter #

The field_name:equals_common_case(phrase1, ..., phraseN) filter searches for logs where the field_name log field equals one of the following variants of the given phrases and words:

the given phrases themselves - phrase1, …, phraseN
the fully uppercase and fully lowercase forms of the given phrases
the forms of the given phrases where every uppercase letter is independently replaced with the corresponding lowercase letter

For example, _msg:equals_common_case("VictoriaMetrics") finds logs where the _msg field equals one of the following words :

VictoriaMetrics
VICTORIAMETRICS
victoriametrics
Victoriametrics
victoriaMetrics
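
The list above can be reproduced with a short Python sketch that enumerates the variants according to the three rules. It is one reading of those rules, not the VictoriaLogs implementation, and the function name common_case_variants is hypothetical.

```python
from itertools import product


def common_case_variants(phrase):
    # Rules 1 and 2: the phrase itself plus its upper- and lowercase forms.
    variants = {phrase, phrase.upper(), phrase.lower()}
    # Rule 3: every uppercase letter is independently kept or lowercased.
    upper_positions = [i for i, ch in enumerate(phrase) if ch.isupper()]
    for choice in product([False, True], repeat=len(upper_positions)):
        chars = list(phrase)
        for pos, make_lower in zip(upper_positions, choice):
            if make_lower:
                chars[pos] = chars[pos].lower()
        variants.add("".join(chars))
    return variants


variants = common_case_variants("VictoriaMetrics")
```

Note that this naive enumeration grows exponentially with the number of uppercase letters in the phrase, so it is suitable only as an illustration.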

The equals_common_case(...) usually works much faster than the i(...) .

If you need to find logs with log fields containing the common case words or phrases, then use contains_common_case filter .

contains_common_case filter #

The field_name:contains_common_case(phrase1, ..., phraseN) filter searches for logs where the field_name log field contains at least one of the following variants of the given phrases and words:

the given phrases themselves - phrase1, …, phraseN
the fully uppercase and fully lowercase forms of the given phrases
the forms of the given phrases where every uppercase letter is independently replaced with the corresponding lowercase letter

For example, _msg:contains_common_case("VictoriaMetrics") finds logs where the _msg field contains at least one of the following words :

VictoriaMetrics
VICTORIAMETRICS
victoriametrics
Victoriametrics
victoriaMetrics

The contains_common_case(...) usually works much faster than the i(...) .

If you need to find logs with log fields equal to the common case words or phrases, then use equals_common_case filter .

Sequence filter #

Sometimes it is needed to find log messages with words or phrases in a particular order. For example, if log messages with error word followed by open file phrase must be found, then the following LogsQL query can be used (every word / phrase can be quoted according to these docs ):

      seq("error", "open file")

This query matches some error: cannot open file /foo/bar message, since the open file phrase goes after the error word . The query doesn’t match the cannot open file: error message, since the open file phrase is located in front of the error word . If you need to match log messages with both the error word and the open file phrase, then use the error AND "open file" query. See these docs for details.
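
The ordering requirement of seq() can be sketched in Python as a left-to-right search where each phrase must be found after the end of the previous one. The seq_match helper is hypothetical and uses plain substring search, so it ignores the word-boundary rules of real word and phrase filters.

```python
def seq_match(message, *parts):
    # Each part must occur after the end of the previously found part.
    pos = 0
    for part in parts:
        idx = message.find(part, pos)
        if idx < 0:
            return False
        pos = idx + len(part)
    return True


in_order = seq_match("some error: cannot open file /foo/bar", "error", "open file")
out_of_order = seq_match("cannot open file: error", "error", "open file")
```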

By default the seq() filter is applied to the _msg field . Specify the needed field name in front of the filter in order to apply it to the given field. For example, the following query matches event.original field containing (error, "open file") sequence:

      event.original:seq(error, "open file")

If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes according to these docs . For example, the following query matches event:original field containing (error, "open file") sequence:

      "event:original":seq(error, "open file")

Regexp filter #

LogsQL supports regular expression filter with RE2 syntax via ~"regex" syntax. The regex can be put in one of the supported quotes according to these docs . For example, the following query returns all the log messages containing err or warn substrings:

      ~"err|warn"

The query matches the following log messages , which contain either err or warn substrings:

error: cannot read data
2 warnings have been raised
data transferring finished

The query doesn’t match the following log messages:

ERROR: cannot open file, since the ERROR word is in uppercase letters. Use ~"(?i)(err|warn)" query for case-insensitive regexp search. See these docs for details. See also case-insensitive filter docs .
it is warmer than usual, since it doesn’t contain either err or warn substrings.
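
The examples above can be checked with Python's re module. Python regular expressions are not RE2, but for a simple alternation like err|warn the behavior is assumed to be the same. Note that data transferring finished matches because transferring contains the err substring.

```python
import re

pattern = re.compile(r"err|warn")
pattern_ci = re.compile(r"(?i)(err|warn)")

messages = [
    "error: cannot read data",
    "2 warnings have been raised",
    "data transferring finished",
    "ERROR: cannot open file",
    "it is warmer than usual",
]

# An unanchored search, like the regexp filter, matches anywhere in the message.
hits = [m for m in messages if pattern.search(m)]
hits_ci = [m for m in messages if pattern_ci.search(m)]
```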

If the regexp contains double quotes, then either put \ in front of double quotes or put the regexp inside single quotes. For example, the following regexp searches logs matching "foo":"(bar|baz)" regexp:

      ~'"foo":"(bar|baz)"'

If the regexp is put inside double quotes or single quotes, then the \ char inside the regexp must be encoded as \\. For example, the following query searches for logs with a.b substring inside them:

      ~"a\\.b"

It is recommended to use the substring filter when a substring search is needed.

By default the regexp filter is applied to the _msg field . Specify the needed field name in front of the filter in order to apply it to the given field. For example, the following query matches event.original field containing either err or warn substrings:

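      event.original:~"err|warn"

If the field name contains special chars, which may clash with the query syntax, then it may be put into quotes according to these docs. For example, the following query matches the event:original field containing either err or warn substrings:

      "event:original":~"err|warn"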

Performance tips:

Prefer combining simple word filter with logical filter instead of using regexp filter. For example, the ~"error|warning" query can be substituted with error OR warning query, which usually works much faster. Note that the ~"error|warning" matches errors as well as warnings words , while error OR warning matches only the specified words . See also multi-exact filter .
Prefer moving the regexp filter to the end of the logical filter , so lighter filters are executed first.
Prefer using ="some prefix"* instead of ~"^some prefix", since the exact prefix filter works much faster than the regexp filter.
See other performance tips .

Range filter #

If you need to filter log messages by some field containing only numeric values, then the range() filter can be used. For example, if the request.duration field contains the request duration in seconds, then the following LogsQL query can be used for searching for log entries with request durations exceeding 4.2 seconds:

      request.duration:range(4.2, Inf)

This query can be shortened by using the range comparison filter :

      request.duration:>4.2

The lower and the upper bounds of the range(lower, upper) are excluded by default. If they must be included, then substitute the corresponding parentheses with square brackets. For example:

range[1, 10) includes 1 in the matching range
range(1, 10] includes 10 in the matching range
range[1, 10] includes 1 and 10 in the matching range
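
The bracket notation maps directly to inclusive and exclusive comparisons. A minimal Python sketch of this rule follows; the in_range helper and its parameter names are hypothetical.

```python
import math


def in_range(value, lower, upper, include_lower=False, include_upper=False):
    # "(" and ")" exclude the bound, "[" and "]" include it.
    above = value >= lower if include_lower else value > lower
    below = value <= upper if include_upper else value < upper
    return above and below


open_both = in_range(1, 1, 10)                              # range(1, 10)
closed_lower = in_range(1, 1, 10, include_lower=True)       # range[1, 10)
closed_upper = in_range(10, 1, 10, include_upper=True)      # range(1, 10]
slow_request = in_range(5.0, 4.2, math.inf)                 # range(4.2, Inf)
```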

The range boundaries can contain any supported numeric values .

Note that the range() filter doesn’t match values, which contain non-numeric chars. For example, range(1, 10) doesn’t match the request took 4.2 seconds log message , since the 4.2 number is surrounded by other text. Extract the numeric value from the message with extract pipe and then apply the range() filter pipe to the extracted field.

Performance tips:

It is better to query a pure numeric field than to extract a numeric value from a text field via transformations at query time.
See other performance tips .

IPv4 range filter #

If you need to filter log message by some field containing only IPv4 addresses such as 1.2.3.4, then the ipv4_range() filter can be used. For example, the following query matches log entries with user.ip address in the range [127.0.0.0 - 127.255.255.255]:

      user.ip:ipv4_range('127.0.0.0', '127.255.255.255')

The ipv4_range() accepts also IPv4 subnetworks in CIDR notation . For example, the following query is equivalent to the query above:

      user.ip:ipv4_range("127.0.0.0/8")

If you need to match a single IPv4 address, then just put it inside ipv4_range(). For example, the following query matches 1.2.3.4 IP at user.ip field:

      user.ip:ipv4_range("1.2.3.4")

Note that the ipv4_range() doesn’t match a string with IPv4 address if this string contains other text. For example, ipv4_range("127.0.0.0/24") doesn’t match request from 127.0.0.1: done log message , since the 127.0.0.1 ip is surrounded by other text. Extract the IP from the message with extract pipe and then apply the ipv4_range() filter pipe to the extracted field.
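
The equivalence between the CIDR form and the explicit range, as well as the requirement that the field holds nothing but an IPv4 address, can be checked with Python's standard ipaddress module. The ipv4_in_range helper is hypothetical and only models the behavior described above.

```python
import ipaddress


def ipv4_in_range(value, start, end):
    try:
        ip = ipaddress.IPv4Address(value)
    except ValueError:
        # The value is not a bare IPv4 address, e.g. it has surrounding text.
        return False
    return ipaddress.IPv4Address(start) <= ip <= ipaddress.IPv4Address(end)


net = ipaddress.ip_network("127.0.0.0/8")
first, last = str(net[0]), str(net[-1])

bare = ipv4_in_range("127.0.0.1", first, last)
embedded = ipv4_in_range("request from 127.0.0.1: done", first, last)
```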

Hints:

If you need to search for log messages containing the given X.Y.Z.Q IPv4 address, then "X.Y.Z.Q" query can be used. See these docs for details.
If you need to search for log messages containing at least a single IPv4 address out of the given list, then "ip1" OR "ip2" ... OR "ipN" query can be used. See these docs for details.
If you need to find log entries with the ip field in multiple ranges, then use ip:(ipv4_range(range1) OR ipv4_range(range2) ... OR ipv4_range(rangeN)) query. See these docs for details.

Performance tips:

It is better to query a pure IPv4 field than to extract an IPv4 address from a text field via transformations at query time.
See other performance tips .

IPv6 range filter #

The ipv6_range() filter behaves similarly to the ipv4_range() filter , but it works with IPv6 addresses.

For example, the following query matches log entries with user.ipv6 address in the range [2001:db8:: - 2001:db8::ffff]:

      user.ipv6:ipv6_range('2001:db8::', '2001:db8::ffff')

The ipv6_range() accepts also IPv6 subnetworks in CIDR notation . For example, the following query is equivalent to the query above:

      user.ipv6:ipv6_range("2001:db8::/112")

If you need to match a single IPv6 address, then just put it inside ipv6_range(). For example, the following query matches 2001:db8::1 at the user.ipv6 field:

      user.ipv6:ipv6_range("2001:db8::1")

Same as for ipv4_range(), the ipv6_range() filter doesn’t match an IPv6 address that is embedded into a larger string (for example, with port or surrounding text). In such cases, first extract the IPv6 address into a dedicated field (for example via extract pipe ) and then apply ipv6_range() to that field.
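
The /112 prefix in the example leaves 128 - 112 = 16 host bits, which is why it covers exactly the addresses from 2001:db8:: to 2001:db8::ffff. This arithmetic can be verified with Python's ipaddress module. The ipv6_in_range helper below is hypothetical and only models the documented behavior.

```python
import ipaddress

net = ipaddress.ip_network("2001:db8::/112")
first, last = str(net[0]), str(net[-1])
host_bits = 128 - net.prefixlen


def ipv6_in_range(value, network):
    try:
        return ipaddress.IPv6Address(value) in network
    except ValueError:
        # Not a bare IPv6 address, e.g. "[addr]:port" or surrounding text.
        return False


bare = ipv6_in_range("2001:db8::1", net)
with_port = ipv6_in_range("[2001:db8::1]:8080", net)
```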

Hints:

If you need to search for log messages containing a given IPv6 address anywhere in the message text, use a phrase filter, e.g. "2001:db8::1".
If you need to search for log messages containing at least one IPv6 address from a given list, use "ip1" OR "ip2" ... OR "ipN" query.
If you need to find log entries with the ip field in multiple IPv6 ranges, then use ip:(ipv6_range(range1) OR ipv6_range(range2) ... OR ipv6_range(rangeN)) query.

Performance tips:

Prefer writing IPv6 addresses into a dedicated field (e.g. user.ipv6) instead of embedding them into free-form text.
See other performance tips .

String range filter #

If you need to filter log messages by some field with string values in some range, then the string_range() filter can be used. For example, the following LogsQL query matches log entries where the user.name field starts with the A or B char:

      user.name:string_range(A, C)

The string_range() includes the lower bound, while excluding the upper bound. This simplifies querying distinct sets of logs. For example, the user.name:string_range(C, E) would match user.name fields, which start from C and D chars.
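
The half-open interval can be modeled in Python with a chained string comparison. This is a sketch: Python compares strings by Unicode code points, and the exact comparison rules of VictoriaLogs are not stated here, so only plain ASCII names are used.

```python
def string_range(value, lower, upper):
    # The lower bound is included, the upper bound is excluded.
    return lower <= value < upper


names = ["A", "Alice", "Bob", "C", "Carol", "Dave", "Eve"]

a_to_c = [n for n in names if string_range(n, "A", "C")]
c_to_e = [n for n in names if string_range(n, "C", "E")]
```

Because the upper bound is excluded, the two result sets do not overlap: the C value falls only into the second range.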

Length range filter #

If you need to filter log message by its length, then len_range() filter can be used. For example, the following LogsQL query matches log messages with lengths in the range [5, 10] chars:

      len_range(5, 10)

This query matches the following log messages, since their length is in the requested range:

foobar
foo bar

This query doesn’t match the following log messages:

foo, since it is too short
foo bar baz abc, since it is too long

It is possible to use inf as the upper bound. For example, the following query matches log messages with a length of at least 5 chars:

      len_range(5, inf)

The range boundaries can be expressed in the following forms:

Hexadecimal form. For example, len_range(0xff, 0xABCD).
Binary form. For example, len_range(0b100110, 0b11111101)
Integer form with _ delimiters for better readability. For example, len_range(1_000, 2_345_678).
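
The boundary forms listed above happen to coincide with Python integer literal syntax, so they can be sketched with int(text, 0). The parse_bound and len_range helpers are hypothetical, and len() counts Unicode characters, which is an assumption that only matters for non-ASCII values.

```python
def parse_bound(text):
    # Accepts "inf" plus decimal, 0x..., 0b... and 1_000-style integers.
    if text.lower() == "inf":
        return float("inf")
    return int(text, 0)


def len_range(value, lower, upper):
    # Both bounds are inclusive, as in the [5, 10] example above.
    return parse_bound(lower) <= len(value) <= parse_bound(upper)


bounds = [parse_bound(b) for b in ("0xff", "0xABCD", "0b100110", "1_000")]
matches = [m for m in ("foobar", "foo bar", "foo", "foo bar baz abc")
           if len_range(m, "5", "10")]
```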

By default the len_range() is applied to the _msg field . Put the field name in front of the len_range() in order to apply the filter to the needed field. For example, the following query matches log entries with the foo field length in the range [10, 20] chars:

      foo:len_range(10, 20)

value_type filter #

VictoriaLogs automatically detects types for the ingested log fields and stores log field values according to the detected type (such as dict, string, uint64, int64, float64, ipv4, iso8601, etc.). Value types for stored fields can be obtained via block_stats pipe .

Sometimes it is needed to select logs with fields of a particular value type. Then value_type(type) filter can be used. For example, the following filter selects logs where user_id field values are stored as uint64 type:

      user_id:value_type(uint64)

eq_field filter #

Sometimes it is needed to find logs, which contain identical values in the given fields . This can be done with field1:eq_field(field2) filter.

For example, the following query matches logs with identical values at user_id and customer_id fields:

      user_id:eq_field(customer_id)

Quick tip: use NOT user_id:eq_field(customer_id) for finding logs where user_id isn’t equal to customer_id. It uses NOT logical operator .

le_field filter #

Sometimes it is needed to find logs where one field value doesn’t exceed the other field value. This can be done with field1:le_field(field2) filter.

For example, the following query matches logs where duration field doesn’t exceed the max_duration field:

      duration:le_field(max_duration)

Quick tip: use NOT duration:le_field(max_duration) for finding logs where duration exceeds the max_duration.

lt_field filter #

Sometimes it is needed to find logs where one field value is smaller than the other field value. This can be done with field1:lt_field(field2) filter.

For example, the following query matches logs where duration field is smaller than the max_duration field:

      duration:lt_field(max_duration)

Quick tip: use NOT duration:lt_field(max_duration) for finding logs where duration is bigger or equal to the max_duration.

Logical filter #

Basic LogsQL filters can be combined into more complex filters with the following logical operations:

q1 AND q2 - matches common log entries returned by both q1 and q2. Arbitrary number of filters can be combined with AND operation. For example, error AND file AND app matches log messages , which simultaneously contain error, file and app words . The AND operation is frequently used in LogsQL queries, so it is allowed to skip the AND word. For example, error file app is equivalent to error AND file AND app. See also contains_all filter .
q1 OR q2 - merges log entries returned by both q1 and q2. Arbitrary number of filters can be combined with OR operation. For example, error OR warning OR info matches log messages , which contain at least one of error, warning or info words . See also contains_any filter .
NOT q - returns all the log entries except those that match q. For example, NOT info returns all the log messages, which do not contain the info word. The NOT operation is frequently used in LogsQL queries, so it is allowed to substitute NOT with - or ! in queries. For example, -info and !info are equivalent to NOT info. The ! must be used instead of - in front of = and ~ filters, as in != and !~.

The NOT operation has the highest priority, AND has the middle priority and OR has the lowest priority. The priority order can be changed with parentheses. For example, NOT info OR debug is interpreted as (NOT info) OR debug, so it matches log messages , which do not contain info word , while it also matches messages with debug word (which may contain the info word). This is not what most users expect. In this case the query can be rewritten to NOT (info OR debug), which correctly returns log messages without info and debug words .
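
Python's not, and and or operators follow the same priority order, so the difference between the two queries can be reproduced directly. The sample messages and the has_word helper are made up for illustration.

```python
import re


def has_word(message, word):
    # Assumption: a word is delimited by non-word characters.
    return re.search(r"(?<!\w)" + re.escape(word) + r"(?!\w)", message) is not None


messages = ["info: started", "debug: info cache warmed", "error: disk full"]

# NOT info OR debug  ==  (NOT info) OR debug
loose = [m for m in messages
         if not has_word(m, "info") or has_word(m, "debug")]

# NOT (info OR debug)
strict = [m for m in messages
          if not (has_word(m, "info") or has_word(m, "debug"))]
```

The first form still returns the debug message even though it contains the info word, while the second form returns only the message without either word.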

LogsQL supports arbitrary complex logical queries with arbitrary mix of AND, OR and NOT operations and parentheses.

By default logical filters apply to the _msg field unless the inner filters explicitly specify the needed log field via field_name:filter syntax. For example, (error OR warn) AND host.hostname:host123 is interpreted as (_msg:error OR _msg:warn) AND host.hostname:host123.

It is possible to specify a single log field for multiple filters with the following syntax:

      field_name:(q1 OR q2 OR ... OR qN)

For example, log.level:error OR log.level:warning OR log.level:info can be substituted with the shorter query: log.level:(error OR warning OR info).

Performance tips:

VictoriaLogs executes logical operations from left to right, so it is recommended to move the most specific and the fastest filters (such as word filter and phrase filter) to the left, while moving less specific and the slowest filters (such as regexp filter and case-insensitive filter) to the right. For example, if you need to find log messages with the error word, which match some /foo/(bar|baz) regexp, it is better from a performance point of view to use the query error ~"/foo/(bar|baz)" instead of ~"/foo/(bar|baz)" error.
The most specific filter is the one that matches the lowest number of log entries compared to the other filters.
See other performance tips .

Pipes #

In addition to filters, a LogsQL query may contain an arbitrary mix of ‘|’-delimited actions known as pipes. For example, the following query uses the stats, sort and limit pipes to return the top 10 log streams with the biggest number of logs during the last 5 minutes:

      _time:5m | stats by (_stream) count() per_stream_logs | sort by (per_stream_logs desc) | limit 10

LogsQL supports the following pipes:

block_stats returns various stats for the selected blocks with logs.
blocks_count counts the number of blocks with logs processed by the query.
collapse_nums replaces all the decimal and hexadecimal numbers with <N> in the given log field .
copy copies log fields (alias: cp).
decolorize drops ANSI color codes from the given log field .
delete deletes log fields (aliases: del, drop, rm).
drop_empty_fields drops log fields with empty values.
extract extracts the specified text into the given log fields.
extract_regexp extracts the specified text into the given log fields via RE2 regular expressions .
facets returns the most frequently seen log fields across the selected logs.
field_names returns all the names of log fields .
field_values returns all the values for the given log field .
fields selects the given set of log fields (alias: keep).
filter applies additional filters to results (alias: where).
first returns the first N logs after sorting them by the given log fields .
format formats output field from input log fields .
generate_sequence generates output logs with messages containing integer sequence.
join joins query results by the given log fields .
json_array_len returns the length of JSON array stored at the given log field .
hash returns the hash over the given log field value.
last returns the last N logs after sorting them by the given log fields .
len returns byte length of the given log field value.
limit limits the number of selected logs (alias: head).
math performs mathematical calculations over log fields (alias: eval).
offset skips the given number of selected logs (alias: skip).
pack_json packs log fields into JSON object.
pack_logfmt packs log fields into logfmt message.
query_stats returns query execution statistics.
rename renames log fields (alias: mv).
replace replaces substrings in the specified log fields .
replace_regexp updates log fields with regular expressions.
running_stats performs running stats calculations over the given log fields .
sample returns a sample of the matching logs according to the provided sample value.
set_stream_fields sets the given log fields as _stream fields .
sort sorts logs by the given fields (alias: order).
split splits the given log field into tokens by the given separator.
stats calculates various stats over the selected logs.
stream_context allows selecting surrounding logs before and after the matching logs for each log stream .
time_add adds the given duration to the given field containing RFC3339 time .
top returns top N field sets with the maximum number of matching logs.
total_stats performs total (global) stats calculations over the given log fields .
union returns results from multiple LogsQL queries.
uniq returns unique log entries.
unpack_json unpacks JSON messages from log fields .
unpack_logfmt unpacks logfmt messages from log fields .
unpack_syslog unpacks syslog messages from log fields .
unpack_words unpacks words from the given log field .
unroll unrolls JSON arrays from log fields into separate rows.

block_stats pipe #

<q> | block_stats pipe returns the following stats for each field in every data block processed by <q> query :

field - field name
rows - the number of rows in the data block
type - internal storage type for the given field
dict_items - the number of unique values in the dictionary for the given field
dict_bytes - on-disk size of the dictionary data for the given field
values_bytes - on-disk size of the data for the given field
bloom_bytes - on-disk size of bloom filter data for the given field
_stream - the log stream for the given field
part_path - the path to the data part where the field data is stored

The block_stats pipe is needed mostly for debugging purposes. See, for example, how to detect which log field occupies the most of the disk space , or how to detect which log stream occupies the most of the disk space .

blocks_count pipe #

<q> | blocks_count pipe counts the number of blocks with logs processed by <q>. This pipe is needed mostly for debugging.

collapse_nums pipe #

<q> | collapse_nums at <field> pipe replaces all the decimal and hexadecimal numbers at the given <field> returned by the <q> query with <N> placeholder. For example, if the _msg field contains 2024-10-20T12:34:56Z request duration 1.34s, then it is replaced with <N>-<N>-<N>T<N>:<N>:<N>Z request duration <N>.<N>s by the following query:

      _time:5m | collapse_nums at _msg

The at ... suffix can be omitted if collapse_nums is applied to _msg field. The following query is equivalent to the previous one:

      _time:5m | collapse_nums

This functionality is useful for locating the most frequently seen log patterns across log messages with various decimal and hexadecimal numbers. This includes the following entities: timestamps, IPv4 addresses, request durations, response sizes, UUIDs, trace IDs, user IDs, etc. Log messages with such entities become identical after applying the collapse_nums pipe to them, so the top pipe can be applied to them in order to get the most frequently seen patterns across log messages. For example, the following query returns the top 5 most frequently seen log patterns across log messages for the last hour:

      _time:1h | collapse_nums | top 5 by (_msg)
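
The idea behind combining collapse_nums with top can be sketched in Python. This is a deliberately simplified model, not the VictoriaLogs algorithm: it collapses only runs of decimal digits (hexadecimal numbers are not handled), and the sample messages are made up.

```python
import re
from collections import Counter


def collapse_nums(text):
    # Simplification: only runs of decimal digits are replaced with <N>.
    return re.sub(r"\d+", "<N>", text)


messages = [
    "2024-10-20T12:34:56Z request duration 1.34s",
    "2024-10-20T12:35:01Z request duration 0.02s",
    "user 42 logged in",
]

# Messages that differ only in numbers collapse into the same pattern,
# so counting the patterns reveals the most frequent ones.
patterns = Counter(collapse_nums(m) for m in messages)
top = patterns.most_common(1)
```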

collapse_nums can detect certain patterns in the collapsed numbers and replace them with the corresponding placeholders if the prettify suffix is added to the collapse_nums pipe:

<N>-<N>-<N>-<N>-<N> is replaced with <UUID> placeholder.
<N>.<N>.<N>.<N> is replaced with <IP4> placeholder.
<N>:<N>:<N> is replaced with <TIME> placeholder. Optional fractional seconds after the time are treated as a part of <TIME>.
<N>-<N>-<N> and <N>/<N>/<N> are replaced with <DATE> placeholder.
<N>-<N>-<N>T<N>:<N>:<N> and <N>-<N>-<N> <N>:<N>:<N> are replaced with <DATETIME> placeholder. Optional timezone after the datetime is treated as a part of <DATETIME>.

For example, the log message 2edfed59-3e98-4073-bbb2-28d321ca71a7 - [2024/12/08 15:21:02] 10.71.20.32 GET /foo 200 is replaced with <UUID> - [<DATETIME>] <IP4> GET /foo <N> when the following query is executed:

      _time:1h | collapse_nums prettify

The patterns returned by collapse_nums prettify pipe can be used in pattern match filter .

collapse_nums can miss some numbers or can collapse unexpected numbers. In this case conditional collapse_nums can be used for skipping such values and pre-processing them separately with replace_regexp .

Conditional collapse_nums #

If the collapse_nums pipe must be applied only to some log entries, then add if (<filters>) after collapse_nums. The <filters> can contain arbitrary filters. For example, the following query collapses numbers in the foo field only if the user_type field equals admin:

      _time:5m | collapse_nums if (user_type:=admin) at foo

copy pipe #

If some log fields must be copied, then | copy src1 as dst1, ..., srcN as dstN pipe can be used. For example, the following query copies host field to server for logs over the last 5 minutes, so the output contains both host and server fields:

      _time:5m | copy host as server

Multiple fields can be copied with a single | copy ... pipe. For example, the following query copies _time field to timestamp, while _msg field is copied to message:

      _time:5m | copy _time as timestamp, _msg as message

The as keyword is optional.

cp keyword can be used instead of copy for convenience. For example, _time:5m | cp foo bar is equivalent to _time:5m | copy foo as bar.

It is possible to copy multiple fields with identical prefix to fields with another prefix. For example, the following query copies all the fields with the prefix foo to fields with the prefix bar:

      _time:5m | copy foo* as bar*
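
The prefix form can be pictured with a small Python sketch operating on a log entry represented as a dict. The helper name copy_fields is hypothetical, and ignoring a missing source field is an assumption, since the text above doesn't specify that case.

```python
def copy_fields(entry, src, dst):
    """Return a copy of entry with src copied to dst.

    A trailing * on both names is treated as a prefix match.
    """
    out = dict(entry)
    if src.endswith("*") and dst.endswith("*"):
        src_prefix, dst_prefix = src[:-1], dst[:-1]
        for name, value in entry.items():
            if name.startswith(src_prefix):
                # Keep the part of the name after the prefix unchanged.
                out[dst_prefix + name[len(src_prefix):]] = value
    elif src in entry:
        out[dst] = entry[src]
    return out


entry = {"foo.a": "1", "foo.b": "2", "host": "h1"}
print(copy_fields(entry, "foo*", "bar*"))
```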

decolorize pipe #

<q> | decolorize <field> pipe drops ANSI color codes from the given <field> across all the logs returned by <q> query .

The <field> may be omitted if ANSI color codes must be dropped from the _msg field . For example, the following query drops ANSI color codes from all the _msg fields over the logs for the last 5 minutes:

      _time:5m | decolorize

This query is equivalent to the following query:

      _time:5m | decolorize _msg

It is recommended to drop ANSI color codes at the data ingestion stage according to these docs . This simplifies further querying of the logs without the need to apply | decolorize pipe to them.

delete pipe #

If some log fields must be deleted, then | delete field1, ..., fieldN pipe can be used. For example, the following query deletes host and app fields from the logs over the last 5 minutes:

      _time:5m | delete host, app

drop, del and rm keywords can be used instead of delete for convenience. For example, _time:5m | drop host is equivalent to _time:5m | delete host.

It is possible to delete fields with common prefix. For example, the following query deletes all the fields with foo prefix:

      _time:5m | delete foo*

drop_empty_fields pipe #

<q> | drop_empty_fields pipe drops fields with empty values from results returned by <q> query . It also skips log entries with zero non-empty fields.

For example, the following query drops the possibly empty email field generated by extract pipe if the foo field doesn’t contain email:

      _time:5m | extract 'email: <email>,' from foo | drop_empty_fields
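
The two effects described above (dropping empty fields and skipping entries left without any fields) can be sketched in Python as follows. Log entries are modeled as dicts of strings, and the function is an illustration rather than the actual implementation.

```python
def drop_empty_fields(entries):
    """Yield entries without empty-valued fields, skipping fully empty entries."""
    for entry in entries:
        kept = {name: value for name, value in entry.items() if value != ""}
        if kept:  # entries with zero non-empty fields are skipped
            yield kept


rows = [
    {"foo": "email: a@example.com,", "email": "a@example.com"},
    {"foo": "no address here", "email": ""},
    {"email": ""},
]
for row in drop_empty_fields(rows):
    print(row)
```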

extract pipe #

<q> | extract "pattern" from field_name pipe extracts text into output fields according to the pattern from the given field_name returned by <q> query . Existing log fields remain unchanged after the | extract ... pipe.

extract pipe can be useful for extracting additional fields needed for further data processing with other pipes such as stats pipe or sort pipe .

For example, the following query selects logs with the error word for the last day, extracts IP address from _msg field into ip field and then calculates top 10 IP addresses with the biggest number of logs using top pipe :

      _time:1d error | extract "ip=<ip> " from _msg | top 10 (ip)

It is expected that _msg field contains ip=... substring ending with space. For example, error ip=1.2.3.4 from user_id=42. If there is no such substring in the current _msg field, then the ip output field will be empty.

If the extract pipe is applied to _msg field , then the from _msg part can be omitted. For example, the following query is equivalent to the previous one:

      _time:1d error | extract "ip=<ip> " | top 10 (ip)

If the pattern contains double quotes, then either put \ in front of double quotes or put the pattern inside single quotes. For example, the following query extracts ip from the corresponding JSON field:

      _time:5m | extract '"ip":"<ip>"'

Add keep_original_fields to the end of extract ... when the original non-empty values of the fields mentioned in the pattern must be preserved instead of overwriting them with the extracted values. For example, the following query extracts <ip> only if the original value for ip field is missing or is empty:

      _time:5m | extract 'ip=<ip> ' keep_original_fields

By default extract writes empty matching fields to the output, which may overwrite existing values. Add skip_empty_results to the end of extract ... in order to prevent overwriting the existing values for the corresponding fields with empty values. For example, the following query preserves the original ip field value if the foo field doesn’t contain the matching IP:

      _time:5m | extract 'ip=<ip> ' from foo skip_empty_results

Performance tip: it is recommended to use more specific log filters in order to reduce the number of log entries passed to extract. See general performance tips for details.

Format for extract pipe pattern #

The pattern part from extract pipe has the following format:

      text1<field1>text2<field2>...textN<fieldN>textN+1

Where text1, … textN+1 is arbitrary text, which matches as is to the input text. The delimiter text between placeholders (text2, …, textN) must be non-empty, while text1 and textN+1 may be empty.

The field1, … fieldN are placeholders, which match a substring of any length (including zero length) in the input text until the next textX. Placeholders can be anonymous and named. Anonymous placeholders are written as <>, <_> or <*>. They are used for convenience when some input text must be skipped until the next textX. Named placeholders are written as <some_name>, where some_name is the name of the log field to store the corresponding matching substring to.

Matching starts from the first occurrence of the text1 in the input text. If the pattern starts with <field1> and doesn’t contain text1, then the matching starts from the beginning of the input text. Matching is performed sequentially according to the pattern. If some textX isn’t found in the remaining input text, then the remaining named placeholders receive empty string values and the matching finishes prematurely. The empty string values can be dropped with drop_empty_fields pipe .

Matching finishes successfully when textN+1 is found in the input text. If the pattern ends with <fieldN> and doesn’t contain textN+1, then the <fieldN> matches the remaining input text.

For example, if _msg field contains the following text:

      1.2.3.4 GET /foo/bar?baz 404 "Mozilla  foo bar baz" some tail here

Then the following pattern can be used for extracting ip, path and user_agent fields from it:

      <ip> <_> <path> <_> "<user_agent>"
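
The matching rules described above can be sketched in Python. This is a simplified illustration under stated assumptions: the function name extract and its dict-based result are hypothetical, and automatic unquoting of quoted strings, the plain: prefix and HTML-escaped characters are not modeled.

```python
import re

_PLACEHOLDER_RE = re.compile(r"<([^<>]*)>")
_ANONYMOUS = {"", "_", "*"}


def extract(pattern, text):
    """Apply an extract-style pattern to text and return the named fields."""
    # split() yields [text1, field1, text2, field2, ..., textN+1]
    parts = _PLACEHOLDER_RE.split(pattern)
    prefix = parts[0]
    steps = list(zip(parts[1::2], parts[2::2]))  # (field, text that follows it)
    result = {name: "" for name, _ in steps if name not in _ANONYMOUS}

    pos = text.find(prefix)  # an empty text1 matches at the beginning
    if pos < 0:
        return result
    pos += len(prefix)

    for i, (name, delimiter) in enumerate(steps):
        if delimiter == "":
            if i != len(steps) - 1:
                raise ValueError("text between placeholders must be non-empty")
            end = len(text)  # a trailing placeholder takes the remaining text
        else:
            end = text.find(delimiter, pos)
            if end < 0:
                break  # remaining named placeholders keep empty values
        if name not in _ANONYMOUS:
            result[name] = text[pos:end]
        pos = end + len(delimiter)
    return result


msg = '1.2.3.4 GET /foo/bar?baz 404 "Mozilla  foo bar baz" some tail here'
print(extract('<ip> <_> <path> <_> "<user_agent>"', msg))
```

For the sample message the sketch stores 1.2.3.4 in ip, /foo/bar?baz in path and Mozilla  foo bar baz in user_agent.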

Note that the user-agent part of the log message is in double quotes. This means that it may contain special chars, including escaped double quote, e.g. \". This may break proper matching of the string in double quotes.

VictoriaLogs automatically detects quoted strings and automatically unquotes them if the first matching char in the placeholder is a single quote, double quote or a backtick. So it is better to use the following pattern for proper matching of quoted user_agent string:

      <ip> <_> <path> <_> <user_agent>

This is useful for extracting JSON strings. For example, the following pattern properly extracts the message JSON string into msg field, even if it contains special chars:

      "message":<msg>

The automatic string unquoting can be disabled if needed by adding plain: prefix in front of the field name. For example, if some JSON array of string values must be captured into json_array field, then the following pattern can be used:

      some json string array: [<plain:json_array>]

If some special chars such as < must be matched by the pattern, then they can be html-escaped . For example, the following pattern properly matches a < b text by extracting a into left field and b into right field:

      <left> &lt; <right>

Conditional extract #

If some log entries must be skipped from extract pipe , then add if (<filters>) filter after the extract word. The <filters> can contain arbitrary filters . For example, the following query extracts ip field from _msg field only if the input log entry doesn’t contain ip field or this field is empty:

      _time:5m | extract if (ip:"") "ip=<ip> "

An alternative approach is to add keep_original_fields to the end of extract, in order to keep the original non-empty values for the extracted fields. For example, the following query is equivalent to the previous one:

      _time:5m | extract "ip=<ip> " keep_original_fields

extract_regexp pipe #

<q> | extract_regexp "pattern" from field_name pipe extracts substrings from the field_name field returned from <q> query according to the provided pattern, and stores them into field names according to the named fields inside the pattern. The pattern must contain RE2 regular expression with named fields (aka capturing groups) in the form (?P<capture_field_name>...). Matching substrings are stored to the given capture_field_name log fields. For example, the following query extracts substrings that look like IPv4 addresses from _msg field and puts them into ip field for logs over the last 5 minutes:

      _time:5m | extract_regexp "(?P<ip>([0-9]+[.]){3}[0-9]+)" from _msg

The from _msg part can be omitted if the data extraction is performed from the _msg field . So the following query is equivalent to the previous one:

      _time:5m | extract_regexp "(?P<ip>([0-9]+[.]){3}[0-9]+)"

Add keep_original_fields to the end of extract_regexp ... when the original non-empty values of the fields mentioned in the pattern must be preserved instead of overwriting them with the extracted values. For example, the following query extracts <ip> only if the original value for ip field is missing or is empty:

      _time:5m | extract_regexp 'ip=(?P<ip>([0-9]+[.]){3}[0-9]+)' keep_original_fields

By default extract_regexp writes empty matching fields to the output, which may overwrite existing values. Add skip_empty_results to the end of extract_regexp ... in order to prevent overwriting the existing values for the corresponding fields with empty values. For example, the following query preserves the original ip field value if the foo field doesn’t contain the matching IP:

      _time:5m | extract_regexp 'ip=(?P<ip>([0-9]+[.]){3}[0-9]+)' from foo skip_empty_results
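
The interaction between the default behavior, keep_original_fields and skip_empty_results can be sketched in Python. The function below is a hypothetical model, not the VictoriaLogs implementation. It uses Python's re module, which accepts the same (?P<name>...) syntax for the patterns shown here but is not fully compatible with RE2.

```python
import re


def extract_regexp(entry, pattern, src="_msg",
                   keep_original_fields=False, skip_empty_results=False):
    """Return a copy of entry with the named groups of pattern stored as fields."""
    rx = re.compile(pattern)
    match = rx.search(entry.get(src, ""))
    out = dict(entry)
    for name in rx.groupindex:  # named groups only
        value = (match.group(name) or "") if match else ""
        if keep_original_fields and out.get(name, "") != "":
            continue  # preserve the original non-empty value
        if skip_empty_results and value == "":
            continue  # do not overwrite with an empty result
        out[name] = value
    return out


PATTERN = r"ip=(?P<ip>([0-9]+[.]){3}[0-9]+)"
entry = {"foo": "no address here", "ip": "9.9.9.9"}
print(extract_regexp(entry, PATTERN, src="foo"))
print(extract_regexp(entry, PATTERN, src="foo", skip_empty_results=True))
```

In this model the first call overwrites ip with an empty string, while the second call keeps 9.9.9.9.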

Performance tip: it is recommended to use extract pipe instead of extract_regexp for achieving higher query performance.

Conditional extract_regexp #

If some log entries must be skipped from extract_regexp pipe , then add if (<filters>) after the extract_regexp word. The <filters> can contain arbitrary filters . For example, the following query extracts ip from _msg field only if the input log entry doesn’t contain ip field or this field is empty:

      _time:5m | extract_regexp if (ip:"") "ip=(?P<ip>([0-9]+[.]){3}[0-9]+)"

An alternative approach is to add keep_original_fields to the end of extract_regexp, in order to keep the original non-empty values for the extracted fields. For example, the following query is equivalent to the previous one:

      _time:5m | extract_regexp "ip=(?P<ip>([0-9]+[.]){3}[0-9]+)" keep_original_fields

facets pipe #

<q> | facets pipe returns the most frequent non-empty values for every seen log field returned by <q> query. It also returns an estimated number of hits for every returned field=value pair.

For example, the following query returns the most frequent values for every seen log field across logs with the error word over the last hour:

      _time:1h error | facets

It is possible to specify the number of most frequently seen values to return for each log field by using the facets N syntax. For example, the following query returns up to 3 most frequently seen values for each field across logs with the error word over the last hour:

      _time:1h error | facets 3

By default facets pipe doesn’t return log fields with too many unique values, since this may require a lot of additional memory to track. The limit can be changed during the query via max_values_per_field M suffix. For example, the following query returns up to 15 most frequently seen field values across fields with up to 100000 unique values:

      _time:1h error | facets 15 max_values_per_field 100000

By default facets pipe doesn’t return log fields with too long values. The limit can be changed during query via max_value_len K suffix. For example, the following query returns the most frequent values for all the log fields containing values no longer than 100 bytes:

      _time:1h error | facets max_value_len 100

By default facets pipe doesn’t return log fields, which contain a single constant value across all the selected logs, since such facets aren’t interesting in most cases. Add keep_const_fields suffix to the facets pipe in order to get such fields:

      _time:1h error | facets keep_const_fields
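
The options above can be pictured with a rough Python model. The function name, the default values for n and max_values_per_field, and the use of exact counts instead of estimates are all assumptions made for this sketch. The max_value_len limit is left out.

```python
from collections import Counter, defaultdict


def facets(entries, n=10, max_values_per_field=1000, keep_const_fields=False):
    """Return {field: [(value, hits), ...]} with the n most frequent values."""
    counters = defaultdict(Counter)
    for entry in entries:
        for name, value in entry.items():
            if value != "":  # only non-empty values are counted
                counters[name][value] += 1

    result = {}
    for name, counter in counters.items():
        if len(counter) > max_values_per_field:
            continue  # too many unique values to track
        if len(counter) == 1 and not keep_const_fields:
            continue  # a single constant value is skipped by default
        result[name] = counter.most_common(n)
    return result


logs = [
    {"level": "error", "host": "a", "app": "x"},
    {"level": "error", "host": "b", "app": "x"},
    {"level": "error", "host": "a", "app": "x"},
]
print(facets(logs, n=3))
```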
    

field_names pipe #

<q> | field_names pipe returns all the names of log fields with an estimated number of logs for each field name returned from <q> query .

For example, the following query returns all the field names with the number of matching logs over the last 5 minutes:

      _time:5m | field_names

Field names are returned in arbitrary order. Use sort pipe in order to sort them if needed.

field_values pipe #

<q> | field_values field_name pipe returns all the values for the given field_name field with the number of logs for each value returned from <q> query . For example, the following query returns all the values with the number of matching logs for the field level over logs for the last 5 minutes:

      _time:5m | field_values level

It is possible to limit the number of returned values by adding limit N to the end of field_values .... For example, the following query returns up to 10 values for the field user_id over logs for the last 5 minutes:

      _time:5m | field_values user_id limit 10

If the limit is reached, then the set of returned values is arbitrary. In this case hits for the returned values are set to 0, since they may be misleading if not all the values are returned.
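
The limit behavior can be illustrated with a small Python sketch. It assumes that "the limit is reached" means there are more unique values than the limit allows, and that logs without the field are ignored. Neither detail is specified above, and the function name is hypothetical.

```python
from collections import Counter


def field_values(entries, field, limit=None):
    """Return (value, hits) pairs for field; hits are zeroed when truncated."""
    hits = Counter(e[field] for e in entries if field in e)
    if limit is not None and len(hits) > limit:
        # Only a subset of values is returned, so per-value hits
        # could be misleading and are reported as 0.
        return [(value, 0) for value in list(hits)[:limit]]
    return list(hits.items())


logs = [{"level": "error"}, {"level": "info"}, {"level": "error"}, {"other": "x"}]
print(field_values(logs, "level"))
print(field_values(logs, "level", limit=1))
```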

fields pipe #

By default all the log fields are returned in the response. It is possible to select the given set of log fields with | fields field1, ..., fieldN pipe . For example, the following query selects only host and _msg fields from logs for the last 5 minutes:

      _time:5m | fields host, _msg

keep can be used instead of fields for convenience. For example, the following query is equivalent to the previous one:

      _time:5m | keep host, _msg

It is possible to use wildcard prefixes in the list of fields to keep. For example, the following query keeps all the fields with names starting with foo prefix, while dropping the rest of the fields:

      _time:5m | fields foo*

filter pipe #

The <q> | filter ... pipe filters logs returned by <q> query with the given filter .

For example, the following query returns host field values if the number of log messages with the error word for them over the last hour exceeds 1_000:

      _time:1h error | stats by (host) count() logs_count | filter logs_count:> 1_000

It is allowed to use where prefix instead of filter prefix for convenience. For example, the following query is equivalent to the previous one:

      _time:1h error | stats by (host) count() logs_count | where logs_count:> 1_000

It is allowed to omit filter prefix if the used filters do not clash with pipe names . So the following query is equivalent to the previous one:

      _time:1h error | stats by (host) count() logs_count | logs_count:> 1_000

first pipe #

<q> | first N by (fields) pipe returns the first N logs from <q> query after sorting them by the given fields .

The first N pipe is a shortcut for sorting and taking top N rows in a single step (equivalent to sort ... | limit N). It always performs its own “top N” selection.

If first is used with by (...) (or just (...)), then it returns the top N rows according to these sort keys.
If first is used without sort keys (no by (...) / no (...)), then it returns the top N rows after sorting by all the fields in the row.

For example, the following query returns the first 10 logs with the smallest value of request_duration field over the last 5 minutes:

      _time:5m | first 10 by (request_duration)

It is possible to return up to N logs individually for each group of logs with the same set of fields , by enumerating the set of these fields in partition by (...). For example, the following query returns up to 3 logs with the smallest request_duration for each host over the last hour:

      _time:1h | first 3 by (request_duration) partition by (host)
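
The per-group selection performed by partition by can be sketched in Python. The helper below is hypothetical. It uses numeric Python values for sorting, whereas log fields are strings in VictoriaLogs, so the sketch only illustrates the grouping and top-N logic.

```python
def first(entries, n, by, partition_by=()):
    """Return up to n entries with the smallest `by` values per partition."""
    groups = {}
    for entry in entries:
        key = tuple(entry.get(f) for f in partition_by)
        groups.setdefault(key, []).append(entry)

    out = []
    for rows in groups.values():
        rows = sorted(rows, key=lambda e: tuple(e[f] for f in by))
        out.extend(rows[:n])  # top n rows for this partition
    return out


logs = [
    {"host": "a", "request_duration": 3},
    {"host": "a", "request_duration": 1},
    {"host": "b", "request_duration": 2},
    {"host": "a", "request_duration": 2},
]
print(first(logs, 2, ["request_duration"], ["host"]))
```

Without partition_by all the entries fall into a single group, which corresponds to the plain first N by (...) form.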

format pipe #

<q> | format "pattern" as result_field pipe combines log fields from <q> query results according to the pattern and stores it into result_field.

For example, the following query stores request from <ip>:<port> text into _msg field , by substituting <ip> and <port> with the corresponding log field values:

      _time:5m | format "request from <ip>:<port>" as _msg

If the result of the format pattern is stored into _msg field , then as _msg part can be omitted. The following query is equivalent to the previous one:

      _time:5m | format "request from <ip>:<port>"

String fields can be formatted with the following additional formatting rules:

The number of seconds in the duration value - add duration_seconds: in front of the corresponding field name. The formatted number is fractional if the duration value contains non-zero milliseconds, microseconds or nanoseconds.
JSON-compatible quoted string - add q: in front of the corresponding field name. For example, the following query generates properly encoded JSON object from _msg and stacktrace log fields and stores it into my_json output field:
```
      _time:5m | format '{"_msg":<q:_msg>,"stacktrace":<q:stacktrace>}' as my_json
    
```
Uppercase and lowercase strings - add uc: or lc: in front of the corresponding field name. For example, the following query stores uppercase value of foo field and lowercase value of bar field in the result field:
```
      _time:5m | format 'uppercase foo: <uc:foo>, lowercase bar: <lc:bar>' as result
    
```
URL encoding and decoding (application/x-www-form-urlencoded) - add urlencode: or urldecode: in front of the corresponding field name (urlencode: replaces spaces with +). For example, the following query properly encodes user field in the url query arg:
```
      _time:5m | format 'url: http://foo.com/?user=<urlencode:user>'
    
```
Hex encoding and decoding - add hexencode: or hexdecode: in front of the corresponding field name. For example, the following query hex-encodes password field:
```
      _time:5m | format 'hex-encoded password: <hexencode:password>'
    
```
Base64 encoding and decoding - add base64encode: or base64decode: in front of the corresponding field name. For example, the following query base64-encodes password field:
```
      _time:5m | format 'base64-encoded password: <base64encode:password>'
    
```
Converting of hexadecimal number to decimal number - add hexnumdecode: in front of the corresponding field name. For example, format "num=<hexnumdecode:some_hex_field>".

Numeric fields can be transformed into the following string representation at format pipe:

RFC3339 time - by adding time: in front of the corresponding field name containing Unix timestamp . The numeric timestamp can be in seconds, milliseconds, microseconds, or nanoseconds — the precision is automatically detected based on the value. Both integer and floating-point values are supported. For example, format "time=<time:timestamp>".
Human-readable duration - by adding duration: in front of the corresponding numeric field name containing duration in nanoseconds. For example, format "duration=<duration:duration_nsecs>". The duration can be converted into nanoseconds with the math pipe .
IPv4 - by adding ipv4: in front of the corresponding field name containing uint32 representation of the IPv4 address. For example, format "ip=<ipv4:ip_num>".
Zero-padded 64-bit hex number - by adding hexnumencode: in front of the corresponding field name. For example, format "hex_num=<hexnumencode:some_field>".
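
Several of the formatting rules listed above can be approximated with Python standard library calls. The sketch below is an illustration only: the function name format_entry and the modifier table are hypothetical, only a subset of the rules is covered, and details such as the letter case of hex output or the exact JSON escaping may differ from VictoriaLogs.

```python
import base64
import ipaddress
import json
import re
from urllib.parse import quote_plus, unquote_plus

# Assumed Python equivalents for a subset of the prefixes described above.
_MODIFIERS = {
    "q": json.dumps,
    "uc": str.upper,
    "lc": str.lower,
    "urlencode": quote_plus,  # replaces spaces with +
    "urldecode": unquote_plus,
    "hexencode": lambda s: s.encode().hex(),
    "hexdecode": lambda s: bytes.fromhex(s).decode(),
    "base64encode": lambda s: base64.b64encode(s.encode()).decode(),
    "base64decode": lambda s: base64.b64decode(s).decode(),
    "ipv4": lambda s: str(ipaddress.IPv4Address(int(s))),
}

_FIELD_RE = re.compile(r"<(?:(\w+):)?([^<>:]+)>")


def format_entry(pattern, entry):
    """Substitute <field> and <modifier:field> placeholders in pattern."""
    def substitute(match):
        modifier, name = match.group(1), match.group(2)
        value = entry.get(name, "")
        return _MODIFIERS[modifier](value) if modifier else value

    return _FIELD_RE.sub(substitute, pattern)


entry = {"ip": "1.2.3.4", "port": "80", "user": "John Doe"}
print(format_entry("request from <ip>:<port>", entry))
print(format_entry("url: http://foo.com/?user=<urlencode:user>", entry))
```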

Add keep_original_fields to the end of format ... as result_field when the original non-empty value of the result_field must be preserved instead of overwriting it with the format results. For example, the following query adds formatted result to foo field only if it was missing or empty:

      _time:5m | format 'some_text' as foo keep_original_fields

Add skip_empty_results to the end of format ... if empty results shouldn’t be written to the output. For example, the following query writes formatted result to foo field when at least one of field1 and field2 isn’t empty, otherwise it preserves the original foo value:

      _time:5m | format "<field1><field2>" as foo skip_empty_results

Performance tip: it is recommended to use more specific log filters in order to reduce the number of log entries passed to format. See general performance tips for details.

Conditional format #

If the format pipe must be applied only to some log entries , then add if (<filters>) just after the format word. The <filters> can contain arbitrary filters . For example, the following query stores the formatted result to message field only if ip and host fields aren’t empty, otherwise the original message field isn’t modified:

      _time:5m | format if (ip:* and host:*) "request from <ip>:<host>" as message

generate_sequence pipe #

The <q> | generate_sequence <N> pipe skips all the <q> results and generates <N> output logs with the _msg field containing integer sequence starting from 0 and ending at N-1.

This pipe is useful for testing and debugging of the LogsQL pipes. For example, the following query generates 1000 random integers in the range [0..10] and collects the statistics on the number of hits for each random number:

      * | generate_sequence 1000
          | math round(rand()*10) as rand_num
          | stats by (rand_num) count() hits
          | sort by (rand_num)

join pipe #

The <q1> | join by (<fields>) (<q2>) pipe joins <q1> query results with the <q2> results by the given set of comma-separated <fields>. This pipe works in the following way:

It executes the <q2> query and remembers its results.
For each input row from <q1> it searches for matching rows in the <q2> results by the given <fields>.
If the <q2> results have no matching rows, then the input row is sent to the output as is.
If the <q2> results have matching rows, then for each matching row the input row is extended with new fields seen at the matching row, and the result is sent to the output.
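
The steps above can be sketched in Python with rows modeled as dicts. The function name is hypothetical, and keeping the input row's value when both rows contain the same field name is an assumption, since the steps only mention adding new fields.

```python
def join(q1_rows, q2_rows, by):
    """Join q1 rows with q2 rows on the `by` fields, keeping unmatched q1 rows."""
    # Step 1: execute <q2> and remember its results, indexed by the join key.
    index = {}
    for row in q2_rows:
        index.setdefault(tuple(row.get(f, "") for f in by), []).append(row)

    out = []
    for row in q1_rows:
        # Step 2: look up matching <q2> rows for the input row.
        matches = index.get(tuple(row.get(f, "") for f in by))
        if not matches:
            out.append(dict(row))  # Step 3: no match, pass the row as is
            continue
        for match in matches:  # Step 4: one output row per matching row
            merged = dict(row)
            for name, value in match.items():
                merged.setdefault(name, value)  # add only new fields
            out.append(merged)
    return out


app1 = [{"user": "alice", "app1_hits": "3"}, {"user": "bob", "app1_hits": "1"}]
app2 = [{"user": "alice", "app2_hits": "7"}]
print(join(app1, app2, ["user"]))
```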

This logic is similar to LEFT JOIN in SQL. For example, the following query returns the number of per-user logs across two applications — app1 and app2 (see stream filters for details on the {...} filter):

      _time:1d {app="app1"} | stats by (user) count() app1_hits
        | join by (user) (
          _time:1d {app="app2"} | stats by (user) count() app2_hits
        )

If you need results similar to INNER JOIN in SQL, then add inner suffix after the join pipe. For example, the following query returns stats only for users, which exist in both applications app1 and app2:

      _time:1d {app="app1"} | stats by (user) count() app1_hits
        | join by (user) (
          _time:1d {app="app2"} | stats by (user) count() app2_hits
        ) inner

It is possible to add a prefix to all the field names returned by the <q2> query by specifying the needed prefix after the <q2>. For example, the following query adds app2. prefix to all the <q2> log fields:

      _time:1d {app="app1"} | stats by (user) count() app1_hits
        | join by (user) (
          _time:1d {app="app2"} | stats by (user) count() app2_hits
        ) prefix "app2."

Performance tips:

Make sure that the <q2> query in the join pipe returns relatively small number of results, since they are kept in RAM during execution of join pipe.
Conditional stats are usually faster to execute and usually require less RAM than the equivalent join pipe.

json_array_len pipe #

<q> | json_array_len(field) as result_field calculates the length of JSON array at the given field and stores it into the result_field, for every log entry returned by <q> query .

For example, the following query returns top 5 logs that contain log messages with the biggest number of words across all the logs for the last 5 minutes:

      _time:5m | unpack_words _msg as words | json_array_len (words) as words_count | first 5 (words_count desc)

hash pipe #

<q> | hash(field) as result_field calculates an integer hash value for the given field and stores it into the result_field, for every log entry returned by <q> query .

For example, the following query calculates the hash value over user_id field and stores it into user_id_hash field, across logs for the last 5 minutes:

      _time:5m | hash(user_id) as user_id_hash

last pipe #

<q> | last N by (fields) pipe returns the last N logs from <q> query after sorting them by the given fields .

For example, the following query returns the last 10 logs with the biggest value of request_duration field over the last 5 minutes:

      _time:5m | last 10 by (request_duration)

It is possible to return up to N logs individually for each group of logs with the same set of fields , by enumerating the set of these fields in partition by (...). For example, the following query returns up to 3 logs with the biggest request_duration for each host over the last hour:

      _time:1h | last 3 by (request_duration) partition by (host)

len pipe #

<q> | len(field) as result pipe stores byte length of the given field value into the result field across all the logs returned by <q> query .

For example, the following query shows top 5 log entries with the maximum byte length of _msg field across logs for the last 5 minutes:

      _time:5m | len(_msg) as msg_len | sort by (msg_len desc) | limit 5
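
Byte length differs from character count for non-ASCII text. The Python sketch below illustrates the difference and mirrors the query above. It assumes field values are UTF-8 encoded, and the helper name is hypothetical.

```python
def byte_len(value):
    """Return the length of value in bytes, assuming UTF-8 encoding."""
    return len(value.encode("utf-8"))


logs = [{"_msg": "héllo"}, {"_msg": "abc"}, {"_msg": "a longer message"}]

# Equivalent of: len(_msg) as msg_len | sort by (msg_len desc) | limit 5
for entry in logs:
    entry["msg_len"] = byte_len(entry["_msg"])
top = sorted(logs, key=lambda e: e["msg_len"], reverse=True)[:5]
print(top)
```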

limit pipe #

If only a subset of selected logs must be processed, then | limit N pipe can be used, where N can contain any supported integer numeric value . For example, the following query returns up to 100 logs over the last 5 minutes:

      _time:5m | limit 100

head keyword can be used instead of limit for convenience. For example, _time:5m | head 100 is equivalent to _time:5m | limit 100.

The N in limit N / head N can be omitted - in this case up to 10 matching logs are returned:

      error | head

By default rows are selected in arbitrary order for performance reasons, so the query above can return different sets of logs every time it is executed. sort pipe can be used for making sure the logs are in the same order before applying limit ... to them.

math pipe #

<q> | math ... pipe performs mathematical calculations over numeric values of log fields returned by <q> query . It has the following format:

      | math
        expr1 as resultName1,
        ...
        exprN as resultNameN

Where exprX is one of the supported math expressions mentioned below, while resultNameX is the name of the field to store the calculated result to. The as keyword is optional. The result name can be omitted. In this case the result is stored to a field with the name equal to string representation of the corresponding math expression.

exprX may reference resultNameY calculated before the given exprX.

For example, the following query divides duration_msecs field value by 1000, then rounds it to integer and stores the result in the duration_secs field:

      _time:5m | math round(duration_msecs / 1000) as duration_secs

The following mathematical operations are supported by math pipe:

arg1 + arg2 - returns the sum of arg1 and arg2
arg1 - arg2 - returns the difference between arg1 and arg2
arg1 * arg2 - multiplies arg1 by arg2
arg1 / arg2 - divides arg1 by arg2
arg1 % arg2 - returns the remainder of the division of arg1 by arg2
arg1 ^ arg2 - raises arg1 to the power of arg2
arg1 & arg2 - returns bitwise and for arg1 and arg2. It is expected that arg1 and arg2 are in the range [0 .. 2^53-1]
arg1 or arg2 - returns bitwise or for arg1 and arg2. It is expected that arg1 and arg2 are in the range [0 .. 2^53-1]
arg1 xor arg2 - returns bitwise xor for arg1 and arg2. It is expected that arg1 and arg2 are in the range [0 .. 2^53-1]
arg1 default arg2 - returns arg2 if arg1 is non-numeric or equals NaN
abs(arg) - returns an absolute value for the given arg
ceil(arg) - returns the least integer value greater than or equal to arg
exp(arg) - raises e to the power of arg
floor(arg) - returns the greatest integer value less than or equal to arg
ln(arg) - returns natural logarithm for the given arg
max(arg1, ..., argN) - returns the maximum value among the given arg1, …, argN
min(arg1, ..., argN) - returns the minimum value among the given arg1, …, argN
now() - returns the current Unix timestamp in nanoseconds.
rand() - returns pseudo-random number in the range [0...1).
round(arg) - returns the given arg rounded to the nearest integer. The round() accepts optional nearest arg, which allows rounding the number to the given nearest multiple. For example, round(temperature, 0.1) rounds temperature field to one decimal digit after the point.

Every argX argument in every mathematical operation can contain one of the following values:

The name of log field . For example, errors_total / requests_total. The log field is parsed into numeric value if it contains supported numeric value . The log field is parsed into Unix timestamp in nanoseconds if it contains rfc3339 time . The log field is parsed into uint32 number if it contains IPv4 address. The log field is parsed into NaN in other cases.
Any supported numeric value , rfc3339 time or IPv4 address. For example, 1MiB, "2024-05-15T10:20:30.934324Z" or "12.34.56.78".
Another mathematical expression, which can be put inside (...). For example, (a + b) * c.
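
The conversion of a log field value into a number, as described in the first item above, can be sketched in Python. This is a simplified model with a hypothetical function name: only plain numbers accepted by Python's float() are treated as numeric (suffixes such as MiB or durations are not modeled), and timestamps are parsed with microsecond precision.

```python
import ipaddress
import math
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_math_arg(value):
    """Convert a field value to a number, returning NaN if it cannot be parsed."""
    try:
        return float(value)  # plain numeric value
    except ValueError:
        pass
    try:
        return int(ipaddress.IPv4Address(value))  # IPv4 address as uint32
    except ValueError:
        pass
    try:
        ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return math.nan  # everything else is NaN
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assume UTC when no zone is given
    # Unix timestamp in nanoseconds, computed with integer arithmetic.
    return ((ts - _EPOCH) // timedelta(microseconds=1)) * 1000


for raw in ["42", "12.34.56.78", "1970-01-01T00:00:01Z", "not a number"]:
    print(raw, "->", parse_math_arg(raw))
```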

The parsed time, duration and IPv4 address can be converted back to string representation after math transformations with the help of format pipe . For example, the following query rounds the request_duration field to seconds before converting it back to string representation:

      _time:5m | math round(request_duration, 1e9) as request_duration_nsecs | format '<duration:request_duration_nsecs>' as request_duration

The eval keyword can be used instead of math for convenience. For example, the following query calculates duration_msecs field by multiplying duration_secs field by 1000:

      _time:5m | eval (duration_secs * 1000) as duration_msecs

offset pipe #

If some selected logs must be skipped after sort , then | offset N pipe can be used, where N can contain any supported integer numeric value . For example, the following query skips the first 100 logs over the last 5 minutes after sorting them by _time :

      _time:5m | sort by (_time) | offset 100

skip keyword can be used instead of offset keyword for convenience. For example, _time:5m | skip 10 is equivalent to _time:5m | offset 10.

Note that skipping rows without sorting makes little sense, since they can be returned in arbitrary order for performance reasons. Rows can be sorted with sort pipe .

pack_json pipe #

<q> | pack_json as field_name pipe packs all the fields of every log entry returned by <q> query into JSON object and stores it as a string in the given field_name.

For example, the following query packs all the fields into JSON object and stores it into _msg field for logs over the last 5 minutes:

      _time:5m | pack_json as _msg

The as _msg part can be omitted if packed JSON object is stored into _msg field . The following query is equivalent to the previous one:

      _time:5m | pack_json

If only a subset of fields must be packed into JSON, then it must be listed inside fields (...) after pack_json. For example, the following query builds JSON with foo and bar fields only and stores the result in baz field:

      _time:5m | pack_json fields (foo, bar) as baz

It is possible to pass field prefixes into fields (...) in order to pack only the fields, which start with the given prefixes. For example, the following query builds JSON with all the fields, which start with either foo. or bar.:

      _time:5m | pack_json fields (foo.*, bar.*) as baz

The pack_json doesn’t modify or delete other fields. If you do not need them, then add | fields ... after the pack_json pipe. For example, the following query leaves only the foo field with the original log fields packed into JSON:

      _time:5m | pack_json as foo | fields foo
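
The field selection rules for pack_json can be sketched in Python as follows. This is a minimal model under stated assumptions: the pack_json function and its parameters are hypothetical, a log entry is represented as a dict, and the exact JSON formatting produced by VictoriaLogs may differ.

```python
import json

def _matches(name, patterns):
    """Return True if name equals a pattern or starts with a 'prefix*' pattern."""
    for p in patterns:
        if p.endswith("*"):
            if name.startswith(p[:-1]):
                return True
        elif name == p:
            return True
    return False

def pack_json(entry, dst="_msg", fields=None):
    """Pack the selected fields into a JSON string stored at dst; keep other fields."""
    packed = {k: v for k, v in entry.items() if fields is None or _matches(k, fields)}
    result = dict(entry)  # the original fields are neither modified nor deleted
    result[dst] = json.dumps(packed, separators=(",", ":"))
    return result
```

Calling pack_json(entry, dst="baz", fields=["foo.*", "bar.*"]) mirrors the prefix example above.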

pack_logfmt pipe #

<q> | pack_logfmt as field_name pipe packs all the fields for every log entry returned by <q> query into logfmt message and stores it as a string in the given field_name.

For example, the following query packs all the fields into logfmt message and stores it into _msg field for logs over the last 5 minutes:

      _time:5m | pack_logfmt as _msg

The as _msg part can be omitted if packed message is stored into _msg field . The following query is equivalent to the previous one:

      _time:5m | pack_logfmt

If only a subset of fields must be packed into logfmt , then it must be listed inside fields (...) after pack_logfmt. For example, the following query builds logfmt message with foo and bar fields only and stores the result in baz field:

      _time:5m | pack_logfmt fields (foo, bar) as baz

It is possible to pass field prefixes into fields (...) in order to pack only the fields, which start with the given prefixes. For example, the following query builds logfmt message with all the fields, which start with either foo. or bar.:

      _time:5m | pack_logfmt fields (foo.*, bar.*) as baz

The pack_logfmt doesn’t modify or delete other fields. If you do not need them, then add | fields ... after the pack_logfmt pipe. For example, the following query leaves only the foo field with the original log fields packed into logfmt :

      _time:5m | pack_logfmt as foo | fields foo

query_stats pipe #

The <q> | query_stats pipe returns the following execution statistics for the given query <q> :

BytesReadColumnsHeaders - the number of bytes read from disk for column headers. Use fields pipe for reducing the number of bytes read for column headers.
BytesReadColumnsHeaderIndexes - the number of bytes read from disk for column header indexes.
BytesReadBloomFilters - the number of bytes read from disk for bloom filters.
BytesReadValues - the number of bytes read from disk for log fields . Use fields pipe for reducing the number of bytes read for log field values.
BytesReadTimestamps - the number of bytes read from disk for _time field .
BytesReadBlockHeaders - the number of bytes read from disk for block headers.
BytesReadTotal - the total number of bytes read from disk.
BlocksProcessed - the number of data blocks processed during query execution. Use more narrow time filter and log stream filter for reducing the number of data blocks processed.
RowsProcessed - the number of log entries processed during query execution. Use more narrow time filter and log stream filter for reducing the number of processed log entries.
RowsFound - the number of log entries found by the query.
ValuesRead - the number of log field values read during query processing. Use fields pipe for reducing the number of field values read.
TimestampsRead - the number of _time fields read during query processing.
BytesProcessedUncompressedValues - the number of uncompressed bytes for log field values , which are processed during query execution.
QueryDurationNsecs - the duration of the query in nanoseconds. It can be used for calculating various rates over the query stats with the math pipe .

This pipe is useful for investigating and optimizing slow queries.

rename pipe #

If some log fields must be renamed, then | rename src1 as dst1, ..., srcN as dstN pipe can be used. For example, the following query renames host field to server for logs over the last 5 minutes, so the output contains server field instead of host field:

      _time:5m | rename host as server

Multiple fields can be renamed with a single | rename ... pipe. For example, the following query renames host to instance and app to job:

      _time:5m | rename host as instance, app as job

The as keyword is optional.

mv keyword can be used instead of rename keyword for convenience. For example, _time:5m | mv foo bar is equivalent to _time:5m | rename foo as bar.

It is possible to rename multiple fields with the given prefix to fields with another prefix. For example, the following query renames all the fields starting with foo prefix to fields starting with bar prefix:

      _time:5m | rename foo* as bar*

It is also possible to remove the common prefix from some fields. For example, the following query removes the foo prefix from all the fields that start with foo:

      _time:5m | rename foo* as *

It is also possible to add a common prefix to all the fields. For example, the following query adds the foo prefix to all the fields:

      _time:5m | rename * as foo*
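
The three prefix forms above (replace a prefix, remove a prefix, add a prefix) follow one rule, which the Python sketch below models. The rename_prefix helper is hypothetical, a log entry is represented as a dict, and the behavior on name collisions is not covered.

```python
def rename_prefix(entry, src, dst):
    """Rename fields matching the 'src*' prefix pattern to the 'dst*' prefix."""
    src_prefix = src.rstrip("*")
    dst_prefix = dst.rstrip("*")
    renamed = {}
    for name, value in entry.items():
        if name.startswith(src_prefix):
            # Swap the matched prefix and keep the rest of the field name.
            name = dst_prefix + name[len(src_prefix):]
        renamed[name] = value
    return renamed
```

With src="foo*" and dst="*" the prefix is removed, while src="*" and dst="foo*" adds the prefix to every field.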

replace pipe #

<q> | replace ("old", "new") at field pipe replaces all the occurrences of the old substring with the new substring in the given field over all the logs returned by <q> query .

For example, the following query replaces all the secret-password substrings with *** in the _msg field for logs over the last 5 minutes:

      _time:5m | replace ("secret-password", "***") at _msg

The at _msg part can be omitted if the replacement occurs in the _msg field . The following query is equivalent to the previous one:

      _time:5m | replace ("secret-password", "***")

The number of replacements can be limited with limit N at the end of replace. For example, the following query replaces only the first foo substring with bar at the log field baz:

      _time:5m | replace ('foo', 'bar') at baz limit 1

Performance tip: it is recommended to use more specific log filters in order to reduce the number of log entries that are passed to replace. See general performance tips for details.

Conditional replace #

If the replace pipe must be applied only to some log entries , then add if (<filters>) after replace. The <filters> can contain arbitrary filters . For example, the following query replaces secret with *** in the password field only if the user_type field equals admin:

      _time:5m | replace if (user_type:=admin) ("secret", "***") at password
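
The combined effect of limit N and if (...) can be modeled with a small Python sketch. The replace function below is hypothetical and only mirrors the semantics described above: the condition is an ordinary Python predicate standing in for LogsQL filters, and a log entry is represented as a dict.

```python
def replace(entry, old, new, field="_msg", limit=None, condition=None):
    """Replace old with new in the given field, optionally limited and conditional."""
    if field not in entry:
        return dict(entry)
    if condition is not None and not condition(entry):
        return dict(entry)  # entries that do not match the condition stay unchanged
    count = -1 if limit is None else limit  # -1 means "replace all occurrences"
    return {**entry, field: entry[field].replace(old, new, count)}
```

For example, passing condition=lambda e: e.get("user_type") == "admin" corresponds to the if (user_type:=admin) clause in the query above.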

replace_regexp pipe #

<q> | replace_regexp ("regexp", "replacement") at field pipe replaces all the substrings matching the given regexp with the given replacement in the given field over all the logs returned by <q> query .

The regexp must contain regular expression with RE2 syntax . The replacement may contain $N or ${N} placeholders, which are substituted with the N-th capturing group in the regexp.

For example, the following query replaces all the substrings starting with host- and ending with -foo with the contents between host- and -foo in the _msg field for logs over the last 5 minutes:

      _time:5m | replace_regexp ("host-(.+?)-foo", "$1") at _msg

The at _msg part can be omitted if the replacement occurs in the _msg field . The following query is equivalent to the previous one:

      _time:5m | replace_regexp ("host-(.+?)-foo", "$1")

The number of replacements can be limited with limit N at the end of replace_regexp. For example, the following query removes only the first password: ... substring (up to the next space) from the baz log field:

      _time:5m | replace_regexp ('password: [^ ]+', '') at baz limit 1
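
The placeholder substitution and the limit can be illustrated with Python's re module. This is only a sketch: Python regular expressions are close to, but not identical to, RE2 syntax, and the replace_regexp helper with its $N conversion is a hypothetical stand-in.

```python
import re

def replace_regexp(value, pattern, replacement, limit=None):
    """Replace regexp matches in value; $N and ${N} refer to capturing groups."""
    # Convert $1 / ${1} placeholders into Python's \g<1> group references.
    py_replacement = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    # count=0 tells re.sub to replace every match.
    return re.sub(pattern, py_replacement, value, count=limit or 0)
```

The non-greedy (.+?) group in the host- example keeps each match as short as possible, so every host-...-foo occurrence is rewritten separately.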

Performance tips:

It is recommended to use the replace pipe instead of replace_regexp if possible, since it works faster.
It is recommended to use more specific log filters in order to reduce the number of log entries that are passed to replace_regexp. See general performance tips for details.

Conditional replace_regexp #

If the replace_regexp pipe must be applied only to some log entries , then add if (<filters>) after replace_regexp. The <filters> can contain arbitrary filters . For example, the following query replaces password: ... substrings ending with whitespace with *** in the foo field only if the user_type field equals admin:

      _time:5m | replace_regexp if (user_type:=admin) ("password: [^ ]+", "***") at foo

running_stats pipe #

The <q> | running_stats ... pipe calculates running stats (such as running count or running sum ) over the specified log fields returned by <q> query and stores the stats in the specified log fields for each input log entry.

The running stats are calculated over the logs sorted by time, so <q> must return the _time field in order to properly calculate the running stats.

The running_stats pipe puts all the logs returned by <q> in memory, so make sure that <q> returns a limited number of logs in order to avoid high memory usage.

For example, the following LogsQL query calculates running sum for the hits field over the logs for the last 5 minutes:

      _time:5m | running_stats sum(hits) as running_hits

The | running_stats ... pipe has the following basic format:

      ... | running_stats
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

Where stats_func* is any of the supported running stats function , while the result_name* is the name of the log field to store the result of the corresponding stats function. The as keyword is optional.

For example, the following query calculates the following running stats for logs over the last 5 minutes:

the number of logs with the help of count function ;
the sum of hits field with the help of sum function :

      _time:5m
          | running_stats
              count() as running_logs,
              sum(hits) as running_hits

The result name can be omitted. In this case the result name equals the string representation of the used running stats function. For example, the following query returns the same stats as the previous one, but gives count(*) and sum(hits) names for the returned fields:

      _time:5m | running_stats count(), sum(hits)

It is useful to combine running_stats with stats by time buckets. For example, the following query returns the per-hour number of logs over the last day, plus the running number of logs:

      _time:1d
          | stats by (_time:hour) count() as hits
          | running_stats sum(hits) as running_hits

running_stats by fields #

The following LogsQL syntax can be used for calculating independent running stats per group of log fields:

      <q> | running_stats by (field1, ..., fieldM)
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

This calculates stats_func* for each (field1, ..., fieldM) group of log fields seen in the logs returned by <q> query .

For example, the following query calculates running number of logs and running number of hits over the last 5 minutes, grouped by (host, path) fields:

      _time:5m
          | running_stats by (host, path)
              count() running_logs,
              sum(hits) running_hits

The by keyword can be skipped in running_stats ... pipe. For example, the following query is equivalent to the previous one:

      _time:5m | running_stats (host, path) count() running_logs, sum(hits) running_hits
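
The per-group running stats can be modeled with the following Python sketch. It is an illustration rather than the actual implementation: the running_stats function is hypothetical, log entries are dicts with a numeric _time, and a missing hits field is assumed to count as 0.

```python
from collections import defaultdict

def running_stats(logs, by=()):
    """Add running_logs and running_hits to every entry, per group, in _time order."""
    counts = defaultdict(int)
    sums = defaultdict(float)
    out = []
    for entry in sorted(logs, key=lambda e: e["_time"]):
        key = tuple(entry.get(f, "") for f in by)  # one accumulator per group
        counts[key] += 1
        sums[key] += float(entry.get("hits", 0))  # assumption: missing hits is 0
        out.append({**entry, "running_logs": counts[key], "running_hits": sums[key]})
    return out
```

Every input entry is kept and extended with the running values, which is why the whole input must fit in memory.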

sample pipe #

The <q> | sample N pipe returns 1/Nth random sample of logs for the <q> query . For example, the following query returns ~1% (1/100th random sample) of logs over the last hour with the error word in the _msg field :

      _time:1h error | sample 100
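
One simple way to think about 1/Nth sampling is to keep each log independently with probability 1/N. The Python sketch below uses this model. It is an assumption made for illustration; the document does not state how VictoriaLogs selects the sample, and the sample function is hypothetical.

```python
import random

def sample(logs, n, rng=None):
    """Keep each log with probability 1/n, so roughly len(logs)/n logs remain."""
    rng = rng or random.Random()
    return [entry for entry in logs if rng.random() < 1.0 / n]
```

With this model the result size is only approximately 1/Nth of the input, which matches the ~1% wording above.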

See also:

limit pipe

set_stream_fields pipe #

The | set_stream_fields field1, ..., fieldN pipe sets the given log fields as _stream fields .

For example, if the logs returned by _time:5m filter have host="foo" and path="/bar" fields, then the following query sets _stream field to {host="foo", path="/bar"}:

      _time:5m | set_stream_fields host, path

See also:

Conditional set_stream_fields #

The set_stream_fields pipe can be applied to a subset of input logs which match the given filters , by using if (...) after the set_stream_fields. For example, the following query updates _stream field only for logs with the host="foobar" field, while leaving the original _stream value for the rest of the logs:

      _time:5m | set_stream_fields if (host:="foobar") host, app

sort pipe #

By default logs are selected in arbitrary order for performance reasons. If logs must be sorted, then <q> | sort by (field1, ..., fieldN) pipe can be used for sorting logs returned by <q> query by the given fields using natural sorting .

For example, the following query returns logs for the last 5 minutes sorted by _stream and then by _time :

      _time:5m | sort by (_stream, _time)

Add desc after the given log field in order to sort in reverse order of this field. For example, the following query sorts logs in reverse order of request_duration_seconds field:

      _time:5m | sort by (request_duration_seconds desc)

The reverse order can be applied globally via desc keyword after by(...) clause:

      _time:5m | sort by (foo, bar) desc

The by keyword can be skipped in sort ... pipe. For example, the following query is equivalent to the previous one:

      _time:5m | sort (foo, bar) desc

The order alias can be used instead of sort, so the following query is equivalent to the previous one:

      _time:5m | order by (foo, bar) desc

Sorting a large number of logs can consume a lot of CPU time and memory. Sometimes it is enough to return the first N entries with the biggest or the smallest values. This can be done by adding limit N to the end of sort ... pipe. Such a query consumes less memory when sorting a large number of logs, since it keeps in memory only N log entries. For example, the following query returns top 10 log entries with the biggest values for the request_duration field during the last hour:

      _time:1h | sort by (request_duration desc) limit 10

This query is equivalent to the following one, which uses last pipe :

      _time:1h | last 10 by (request_duration)

If the first N sorted results must be skipped, then offset N can be added to the sort pipe. For example, the following query skips the first 10 logs with the biggest request_duration field, and then returns the next 20 sorted logs for the last hour:

      _time:1h | sort by (request_duration desc) offset 10 limit 20

It is possible to sort the logs and apply the limit individually for each group of logs with the same set of fields , by enumerating the set of these fields in partition by (...). For example, the following query returns up to 3 logs with the biggest request_duration for each host over the last hour:

      _time:1h | sort by (request_duration desc) partition by (host) limit 3
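
The partition by (...) limit N semantics can be sketched in Python as a sort followed by a per-group cut. The top_per_partition helper is hypothetical, it assumes numeric sort values, and the order of groups in real query results may differ.

```python
from itertools import groupby

def top_per_partition(logs, sort_field, partition_field, limit):
    """Return up to `limit` entries with the biggest sort_field per partition value."""
    ordered = sorted(logs, key=lambda e: (e[partition_field], -e[sort_field]))
    out = []
    for _, group in groupby(ordered, key=lambda e: e[partition_field]):
        out.extend(list(group)[:limit])  # the limit applies to each group separately
    return out
```

Without a partition, the offset 10 limit 20 form shown earlier corresponds to slicing the sorted list as sorted_logs[10:30].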

It is possible to return a rank (sort order number) for every sorted log by adding rank as <fieldName> to the end of the | sort ... pipe. For example, the following query stores rank for sorted by _time logs into position field :

      _time:5m | sort by (_time) rank as position

Note that sorting a large number of logs can be slow and can consume a lot of additional memory. It is recommended to limit the number of logs before sorting with the following approaches:

Adding limit N to the end of sort ... pipe.
Reducing the selected time range with time filter .
Using more specific filters, so they select fewer logs.
Limiting the number of selected fields via fields pipe .

split pipe #

The <q> | split <separator> from <src_field> as <dst_field> pipe splits <src_field> log field obtained from <q> query results into <dst_field> as a JSON array, by using the given <separator>.

For example, the following query splits log messages by , and stores the results into items field:

      _time:5m | split "," from _msg as items

The as <dst_field> part is optional. If it is missing, then the result is stored in the <src_field> specified in from <src_field>. For example, the following query stores the split result into _msg field:

      _time:5m | split "," from _msg

The from <src_field> part is optional. If it is missing, then the _msg field is used as a source field. The following query is equivalent to the previous one:

      _time:5m | split ","

It is convenient to use unroll pipe for unrolling the JSON array with the split results. For example, the following query returns top 5 most frequently seen comma-separated items across log messages for the last 5 minutes:

      _time:5m | split "," as items | unroll items | top 5 (items)
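
The split, unroll and top chain above can be modeled in a few lines of Python. This sketch assumes log messages are plain strings; the top_items function is hypothetical and does not trim whitespace around items.

```python
from collections import Counter

def top_items(messages, separator=",", n=5):
    """Split every message, unroll the items and return the n most frequent ones."""
    counter = Counter()
    for message in messages:
        counter.update(message.split(separator))  # split + unroll in one step
    return counter.most_common(n)
```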

stats pipe #

The <q> | stats ... pipe calculates various stats over the logs returned by <q> query . For example, the following LogsQL query uses count stats function for calculating the number of logs for the last 5 minutes:

      _time:5m | stats count() as logs_total

The | stats ... pipe has the following basic format:

      ... | stats
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

Where stats_func* is any of the supported stats function , while the result_name* is the name of the log field to store the result of the corresponding stats function. The as keyword is optional.

For example, the following query calculates the following stats for logs over the last 5 minutes:

the number of logs with the help of count stats function ;
the number of unique log streams with the help of count_uniq stats function :

      _time:5m | stats count() logs_total, count_uniq(_stream) streams_total

The stats prefix can be omitted for convenience, so the following query is equivalent to the previous one:

      _time:5m | count() logs_total, count_uniq(_stream) streams_total

The result name can be omitted. In this case the result name equals the string representation of the used stats function. For example, the following query returns the same stats as the previous one, but gives count(*) and count_uniq(_stream) names for the returned fields:

      _time:5m | count(), count_uniq(_stream)

Stats by fields #

The following LogsQL syntax can be used for calculating independent stats per group of log fields:

      <q> | stats by (field1, ..., fieldM)
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

This calculates stats_func* for each (field1, ..., fieldM) group of log fields seen in the logs returned by <q> query .

For example, the following query calculates the number of logs and unique IP addresses over the last 5 minutes, grouped by (host, path) fields:

      _time:5m | stats by (host, path) count() logs_total, count_uniq(ip) ips_total

The by keyword can be skipped in stats ... pipe. For example, the following query is equivalent to the previous one:

      _time:5m | stats (host, path) count() logs_total, count_uniq(ip) ips_total

Stats by time buckets #

The following syntax can be used for calculating stats grouped by time buckets:

      <q> | stats by (_time:step)
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

This calculates stats_func* for each step of the _time field across logs returned by <q> query . The step can have any duration value . For example, the following LogsQL query returns per-minute number of logs and unique IP addresses over the last 5 minutes:

      _time:5m | stats by (_time:1m) count() logs_total, count_uniq(ip) ips_total
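
Bucketing by time step amounts to aligning every timestamp down to a multiple of the step and aggregating per aligned value. The Python sketch below models this for the query above. It assumes _time is an integer number of nanoseconds and every entry has an ip field; stats_by_time is a hypothetical name.

```python
from collections import defaultdict

MINUTE_NS = 60 * 10**9  # one minute in nanoseconds

def stats_by_time(logs, step_ns=MINUTE_NS):
    """Count logs and unique ip values per time bucket of step_ns nanoseconds."""
    buckets = defaultdict(lambda: {"logs_total": 0, "ips": set()})
    for entry in logs:
        bucket_start = entry["_time"] - entry["_time"] % step_ns  # align to the step
        buckets[bucket_start]["logs_total"] += 1
        buckets[bucket_start]["ips"].add(entry["ip"])
    return {
        start: {"logs_total": b["logs_total"], "ips_total": len(b["ips"])}
        for start, b in sorted(buckets.items())
    }
```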

It might be useful to calculate running stats over the calculated per-time bucket stats, with the help of running_stats pipe . For example, the following query adds the running number of logs to the running_hits field for the query above:

      _time:5m
          | stats by (_time:1m) count() as hits
          | running_stats sum(hits) as running_hits

It might be useful to calculate total stats over the calculated per-time bucket stats, with the help of total_stats pipe . For example, the following query adds the total number of logs into total_hits field and then uses this field for calculating the per-minute percentage of hits with the math pipe :

      _time:5m
          | stats by (_time:1m) count() as hits
          | total_stats sum(hits) as total_hits
          | math round((hits / total_hits)*100) as hits_percent

Additionally, the following step values are supported:

nanosecond - equals 1ns duration .
microsecond - equals 1µs duration .
millisecond - equals 1ms duration .
second - equals 1s duration .
minute - equals 1m duration .
hour - equals 1h duration .
day - equals 1d duration .
week - equals 1w duration .
month - equals one month. It properly takes into account the number of days in each month.
year - equals one year. It properly takes into account the number of days in each year.

Stats by time buckets with timezone offset #

VictoriaLogs stores _time values as Unix time in nanoseconds. This time corresponds to the UTC time zone. Sometimes it is necessary to calculate stats grouped by days or weeks in a non-UTC time zone. This is possible with the following syntax:

      <q> | stats by (_time:step offset timezone_offset) ...

For example, the following query calculates per-day number of logs over the last week, in UTC+02:00 time zone :

      _time:1w | stats by (_time:1d offset 2h) count() logs_total

Stats by field buckets #

Every log field inside <q> | stats by (...) can be bucketed in the same way as _time field in this example . Any numeric value can be used as step value for the bucket. For example, the following query calculates the number of requests for the last hour, bucketed by 10KB of request_size_bytes field :

      _time:1h | stats by (request_size_bytes:10KB) count() requests

Stats by IPv4 buckets #

Stats can be bucketed by log field containing IPv4 addresses via the ip_field_name:/network_mask syntax inside by(...) clause. For example, the following query returns the number of log entries per /24 subnetwork extracted from the ip log field during the last 5 minutes:

      _time:5m | stats by (ip:/24) count() requests_per_subnet
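
Grouping by /24 means masking every address down to its network address and counting per network. The Python sketch below models this with the standard ipaddress module. The count_per_subnet name is hypothetical, and the exact label VictoriaLogs prints for each bucket is an assumption.

```python
import ipaddress
from collections import Counter

def count_per_subnet(ips, prefix_len=24):
    """Count addresses per subnet identified by its network address."""
    counter = Counter()
    for ip in ips:
        # strict=False allows host bits to be set; they are masked away.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counter[str(network.network_address)] += 1
    return dict(counter)
```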

Stats with additional filters #

Sometimes it is necessary to calculate stats on different subsets of matching logs. This can be done by inserting an if (<any_filters>) condition between the stats function and result_name, where any_filters can contain arbitrary filters. For example, the following query individually calculates the number of log messages with the GET, POST and PUT words, in addition to the total number of logs over the last 5 minutes:

      _time:5m | stats
        count() if (GET) gets,
        count() if (POST) posts,
        count() if (PUT) puts,
        count() total

If zero input rows match the given if (...) filter, then the corresponding stats function is calculated over an empty set of rows. The returned value depends on the function. For example, count(*) returns 0, while sum(...) and avg(...) return NaN.
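
The conditional counters and the empty-set behavior can be modeled in Python as follows. This is a rough sketch: has_word uses regular expression word boundaries as a stand-in for LogsQL word matching, and all function names are hypothetical.

```python
import math
import re

def has_word(message, word):
    """Simplified, case-sensitive word match using regexp word boundaries."""
    return re.search(rf"\b{re.escape(word)}\b", message) is not None

def method_counts(messages):
    """Model of: stats count() if (GET) gets, ..., count() total."""
    return {
        "gets": sum(has_word(m, "GET") for m in messages),
        "posts": sum(has_word(m, "POST") for m in messages),
        "puts": sum(has_word(m, "PUT") for m in messages),
        "total": len(messages),
    }

def sum_or_nan(values):
    """Model of sum(...) over a possibly empty set of rows: empty input gives NaN."""
    values = list(values)
    return sum(values) if values else math.nan
```

A counter whose filter matches nothing is reported as 0, while a sum over no rows is NaN, as described above.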

stream_context pipe #

<q> | stream_context ... pipe allows selecting surrounding logs in a log stream across the logs returned by <q> query in the way similar to grep -A / grep -B. The returned log chunks are delimited with --- log message for easier investigation.

For example, the following query returns up to 10 additional logs after every log message with the panic word across all the logs for the last 5 minutes:

      _time:5m panic | stream_context after 10

The following query returns up to 5 additional logs in front of every log message with the stacktrace word across all the logs for the last 5 minutes:

      _time:5m stacktrace | stream_context before 5

The following query returns up to 2 logs in front of the log message with the error word and up to 5 logs after this log message across all the logs for the last 5 minutes:

      _time:5m error | stream_context before 2 after 5

By default stream_context pipe looks for surrounding logs in a one-hour window. This window can be changed via the time_window option at query time. For example, the following query searches for surrounding logs in a one-week window:

      _time:5m error | stream_context before 10 time_window 1w
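
The before / after selection can be modeled on a single log stream with the Python sketch below. It is a simplified illustration: the stream_context function is hypothetical, the stream is a time-ordered list of messages, and neither the time_window option nor the --- delimiters are modeled.

```python
def stream_context(stream_logs, matches, before=0, after=0):
    """Return matching messages plus up to `before`/`after` neighbors around each."""
    keep = set()
    for i, message in enumerate(stream_logs):
        if matches(message):
            start = max(0, i - before)
            end = min(len(stream_logs), i + after + 1)
            keep.update(range(start, end))  # overlapping contexts are merged
    return [stream_logs[i] for i in sorted(keep)]
```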

The | stream_context pipe must go first just after the filters .

See also:

stream filter

time_add pipe #

<q> | time_add <duration> adds the given <duration> to the _time field . The <duration> can be in any format described here .

For example, the following query adds one hour to _time field in the selected logs:

      _time:5m | time_add 1h

Specify negative duration for subtracting it from the _time field:

      _time:5m | time_add -1h

Add at <field_name> to the end of the time_add pipe in order to add the given <duration> to the log field with the given <field_name>. For example, the following query adds one week to the field transaction_time:

      _time:5m | time_add 1w at transaction_time

top pipe #

<q> | top N by (field1, ..., fieldM) pipe returns the top N sets of (field1, ..., fieldM) log field values with the maximum number of matching log entries across logs returned by the <q> query.

For example, the following query returns top 7 log streams with the maximum number of log entries over the last 5 minutes. The number of entries is returned in the hits field:

      _time:5m | top 7 by (_stream)

The N is optional. If it is skipped, then top 10 entries are returned. For example, the following query returns top 10 values for ip field seen in logs for the last 5 minutes:

      _time:5m | top by (ip)

It is possible to give another name for the hits field via hits as <new_name> syntax. For example, the following query returns top per-path hits in the visits field:

      _time:5m | top by (path) hits as visits

It is possible to set a rank field for each returned entry for the top pipe by adding rank. For example, the following query sets the rank field for each returned ip:

      _time:5m | top 10 by (ip) rank

The rank field can have another name. For example, the following query uses the position field name instead of the rank field name in the output:

      _time:5m | top 10 by (ip) rank as position
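
The hits and rank fields produced by top can be modeled with a counter. The Python sketch below is an illustration for a single field; the top function and its parameters are hypothetical, and the ordering of values with equal hits is not specified here.

```python
from collections import Counter

def top(logs, field, n=10, hits_name="hits", rank_name=None):
    """Return the n most frequent values of field with their hits and optional rank."""
    counter = Counter(entry[field] for entry in logs if field in entry)
    rows = []
    for position, (value, hits) in enumerate(counter.most_common(n), start=1):
        row = {field: value, hits_name: hits}
        if rank_name:
            row[rank_name] = position  # rank starts at 1 for the biggest hits
        rows.append(row)
    return rows
```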

total_stats pipe #

The <q> | total_stats ... pipe calculates total (global) stats (such as global count or global sum ) over the specified log fields returned by <q> query and stores these stats in the specified log fields for each input log entry.

The total_stats pipe sorts the logs returned by <q> by the _time field inside each by(...) group before calculating stats and returning results. If <q> doesn’t return _time, then the output order is undefined.

The total_stats pipe puts all the logs returned by <q> in memory (up to a certain limit) before it can return results, so make sure that <q> returns a limited number of logs in order to avoid high memory usage.

For example, the following LogsQL query calculates total sum for the hits field over the logs for the last 5 minutes:

      _time:5m | total_stats sum(hits) as total_hits

The | total_stats ... pipe has the following basic format:

      ... | total_stats
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

Where stats_func* is any of the supported total stats function , while the result_name* is the name of the log field to store the result of the corresponding stats function. The as keyword is optional.

For example, the following query calculates the following total stats for logs over the last 5 minutes:

the number of logs with the help of count function ;
the sum of hits field with the help of sum function :

      _time:5m
          | total_stats
              count() as total_logs,
              sum(hits) as total_hits

The result name can be omitted. In this case the result name equals the string representation of the used total stats function. For example, the following query returns the same stats as the previous one, but gives count(*) and sum(hits) names for the returned fields:

      _time:5m | total_stats count(), sum(hits)

It is useful to combine total_stats with stats by time buckets . For example, the following query returns per-hour number of logs over the last day, plus the total number of logs, and then calculates the per-hour percent of hits over the daily hits.

      _time:1d
          | stats by (_time:hour) count() as hits
          | total_stats sum(hits) as total_hits
          | math round((hits / total_hits)*100) as hits_percent
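
The last two steps of this query (attach the global total to every row, then compute the percentage) can be modeled in Python as follows. The hits_percent function is hypothetical, the input is assumed to be the per-bucket rows with a non-zero total, and Python's round() may treat exact .5 ties differently from the math pipe.

```python
def hits_percent(buckets):
    """buckets: list of dicts with a numeric 'hits' field, e.g. per-hour counts."""
    total_hits = sum(b["hits"] for b in buckets)  # total_stats sum(hits)
    return [
        {**b, "total_hits": total_hits, "hits_percent": round(b["hits"] / total_hits * 100)}
        for b in buckets
    ]
```

Every row receives the same total_hits value, which is what makes the per-row division possible.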
    

total_stats by fields #

The following LogsQL syntax can be used for calculating independent total stats per group of log fields:

      <q> | total_stats by (field1, ..., fieldM)
        stats_func1(...) as result_name1,
        ...
        stats_funcN(...) as result_nameN

This calculates stats_func* for each (field1, ..., fieldM) group of log fields seen in the logs returned by <q> query .

For example, the following query calculates total number of logs and total number of hits over the last 5 minutes, grouped by (host, path) fields:

      _time:5m
          | total_stats by (host, path)
              count() total_logs,
              sum(hits) total_hits

The by keyword can be skipped in total_stats ... pipe. For example, the following query is equivalent to the previous one:

      _time:5m | total_stats (host, path) count() total_logs, sum(hits) total_hits

union pipe #

<q1> | union (<q2>) pipe returns results of the <q1> query followed by results of the <q2> query. It works similarly to UNION ALL in SQL. <q1> and <q2> may contain arbitrary LogsQL queries. For example, the following query returns logs with the error word for the last 5 minutes, plus logs with the panic word for the last hour:

      _time:5m error | union (_time:1h panic)

uniq pipe #

<q> | uniq by (field1, ..., fieldN) pipe returns unique values for the given log fields over the logs returned by <q> query . For example, the following LogsQL query returns unique values for ip log field over logs for the last 5 minutes:

      _time:5m | uniq by (ip)

It is possible to specify multiple fields inside by(...) clause. In this case all the unique sets for the given fields are returned. For example, the following query returns all the unique (host, path) pairs for the logs over the last 5 minutes:

      _time:5m | uniq by (host, path)

The unique entries are returned in arbitrary order. Use sort pipe in order to sort them if needed.

Add with hits after uniq by (...) in order to return the number of matching logs for each field value:

      _time:5m | uniq by (host) with hits

Unique entries are stored in memory during query execution. A large number of unique selected entries may require a lot of memory. Sometimes it is enough to return up to N unique entries. This can be done by adding limit N after the by (...) clause, which limits memory usage. For example, the following query returns up to 100 unique (host, path) pairs for the logs over the last 5 minutes:

      _time:5m | uniq by (host, path) limit 100

If the limit is reached, then an arbitrary subset of unique values can be returned. If with hits is used, then all the returned hits are set to 0 when the limit is reached.

The by keyword can be skipped in uniq ... pipe. For example, the following query is equivalent to the previous one:

      _time:5m | uniq (host, path) limit 100

unpack_json pipe #

<q> | unpack_json from field_name pipe unpacks {"k1":"v1", ..., "kN":"vN"} JSON from the given field_name of <q> query results into k1, …, kN output field names with the corresponding v1, …, vN values. It overrides existing fields with names from the k1, …, kN list. Other fields remain untouched.

Nested JSON is unpacked according to the rules defined here . If values for some JSON keys must be preserved from flattening, then enumerate the needed keys in preserve_keys (...). For example, the following query flattens all the JSON fields except foo and bar from the _msg field across logs for the last 5 minutes:

      _time:5m | unpack_json from _msg preserve_keys (foo, bar)

The from _msg part can be omitted when JSON fields are unpacked from the _msg field :

      _time:5m | unpack_json

If only some fields must be extracted from JSON, then they can be enumerated inside fields (...). For example, the following query unpacks only foo and bar fields from JSON value stored in my_json log field :

      _time:5m | unpack_json from my_json fields (foo, bar)

If it is needed to extract all the fields with some common prefix, then this can be done via fields(prefix*) syntax.

If it is needed to preserve the original non-empty field values, then add keep_original_fields to the end of unpack_json .... For example, the following query preserves the original non-empty values for ip and host fields instead of overwriting them with the unpacked values:

      _time:5m | unpack_json from foo fields (ip, host) keep_original_fields

Add skip_empty_results to the end of unpack_json ... if the original field values must be preserved when the corresponding unpacked values are empty. For example, the following query preserves the original ip and host field values for empty unpacked values:

      _time:5m | unpack_json fields (ip, host) skip_empty_results

Performance tip: if you need to extract a single field from long JSON, it is faster to use the extract pipe . For example, the following query extracts the "ip" field from JSON stored in _msg field faster than parsing the full JSON object:

      _time:5m | extract '"ip":<ip>'

If you want to reduce the chance that the unpacked JSON fields clash with the existing fields, then specify common prefix for all the fields extracted from JSON, by adding result_prefix "prefix_name" to unpack_json. For example, the following query adds the foo_ prefix for all the unpacked fields from foo:

      _time:5m | unpack_json from foo result_prefix "foo_"
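
The interaction between the default overriding behavior, keep_original_fields, skip_empty_results and result_prefix can be sketched in Python. This is an illustration of the rules described above, not the VictoriaLogs implementation. It assumes flat JSON objects (nested JSON flattening isn't modeled), and the unpack_json function with its parameter names is hypothetical:

```python
import json

def unpack_json(log, src="_msg", result_prefix="",
                keep_original_fields=False, skip_empty_results=False):
    """Merge top-level keys of the JSON object stored in log[src] into a copy of log."""
    out = dict(log)
    try:
        obj = json.loads(log.get(src, ""))
    except json.JSONDecodeError:
        return out  # nothing to unpack
    if not isinstance(obj, dict):
        return out
    for key, value in obj.items():
        name = result_prefix + key
        value = value if isinstance(value, str) else json.dumps(value)
        if keep_original_fields and out.get(name, "") != "":
            continue  # keep the original non-empty value
        if skip_empty_results and value == "":
            continue  # keep the original value when the unpacked one is empty
        out[name] = value
    return out

log = {"ip": "1.2.3.4", "host": "", "foo": '{"ip":"5.6.7.8","host":"h1","user":""}'}
overridden = unpack_json(log, src="foo")
kept = unpack_json(log, src="foo", keep_original_fields=True)
prefixed = unpack_json(log, src="foo", result_prefix="foo_")
```

In this sketch overridden has ip set to 5.6.7.8, kept preserves the original 1.2.3.4 while still filling the empty host field, and prefixed leaves ip untouched and adds foo_ip, foo_host and foo_user fields.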

Performance tips:

From a performance and resource usage point of view, it is better to ingest parsed JSON logs into VictoriaLogs according to the supported data model than to ingest unparsed JSON lines and then parse them at query time with the unpack_json pipe.
It is recommended to use more specific log filters in order to reduce the number of log entries that are passed to unpack_json. See general performance tips for details.

Conditional unpack_json #

If the unpack_json pipe must be applied only to some log entries , then add if (<filters>) after unpack_json. The <filters> can contain arbitrary filters . For example, the following query unpacks JSON fields from foo field only if ip field in the current log entry isn’t set or is empty:

      _time:5m | unpack_json if (ip:"") from foo

unpack_logfmt pipe #

<q> | unpack_logfmt from field_name pipe unpacks k1=v1 ... kN=vN logfmt fields from the given field_name of <q> query results into k1, …, kN field names with the corresponding v1, …, vN values. It overrides existing fields with names from the k1, …, kN list. Other fields remain untouched.

For example, the following query unpacks logfmt fields from the _msg field across logs for the last 5 minutes:

      _time:5m | unpack_logfmt from _msg

The from _msg part can be omitted when logfmt fields are unpacked from the _msg field . The following query is equivalent to the previous one:

      _time:5m | unpack_logfmt
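
The k1=v1 ... kN=vN format can be illustrated with a small Python sketch. It relies on shlex for quoted values, which is an assumption made for the example; the exact quoting and escaping rules used by VictoriaLogs may differ. The parse_logfmt helper is hypothetical:

```python
import shlex

def parse_logfmt(line):
    """Parse k1=v1 ... kN=vN pairs; quoted values may contain spaces."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # tokens without '=' are ignored in this sketch
            fields[key] = value
    return fields

fields = parse_logfmt('level=info ip=1.2.3.4 msg="cannot open file"')
```

Every parsed key becomes a field name and the corresponding value becomes the field value, e.g. the msg field contains cannot open file.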

If only some fields must be unpacked from logfmt, then they can be enumerated inside fields (...). For example, the following query extracts only foo and bar fields from logfmt stored in the my_logfmt field:

      _time:5m | unpack_logfmt from my_logfmt fields (foo, bar)

If it is needed to extract all the fields with some common prefix, then this can be done via fields(prefix*) syntax.

If it is needed to preserve the original non-empty field values, then add keep_original_fields to the end of unpack_logfmt .... For example, the following query preserves the original non-empty values for ip and host fields instead of overwriting them with the unpacked values:

      _time:5m | unpack_logfmt from foo fields (ip, host) keep_original_fields

Add skip_empty_results to the end of unpack_logfmt ... if the original field values must be preserved when the corresponding unpacked values are empty. For example, the following query preserves the original ip and host field values for empty unpacked values:

      _time:5m | unpack_logfmt fields (ip, host) skip_empty_results

Performance tip: if you need to extract a single field from a long logfmt line, it is faster to use the extract pipe . For example, the following query extracts "ip" field from logfmt line stored in _msg field :

      _time:5m | extract ' ip=<ip>'

If you want to make sure that the unpacked logfmt fields do not clash with the existing fields, then specify common prefix for all the fields extracted from logfmt, by adding result_prefix "prefix_name" to unpack_logfmt. For example, the following query adds foo_ prefix for all the unpacked fields from foo field:

      _time:5m | unpack_logfmt from foo result_prefix "foo_"

Performance tips:

From a performance and resource usage point of view, it is better to ingest parsed logfmt logs into VictoriaLogs according to the supported data model than to ingest unparsed logfmt lines and then parse them at query time with the unpack_logfmt pipe.
It is recommended to use more specific log filters in order to reduce the number of log entries that are passed to unpack_logfmt. See general performance tips for details.

Conditional unpack_logfmt #

If the unpack_logfmt pipe must be applied only to some log entries , then add if (<filters>) after unpack_logfmt. The <filters> can contain arbitrary filters . For example, the following query unpacks logfmt fields from foo field only if ip field in the current log entry isn’t set or is empty:

      _time:5m | unpack_logfmt if (ip:"") from foo

unpack_syslog pipe #

<q> | unpack_syslog from field_name pipe unpacks syslog message from the given field_name of <q> query results. It understands the following Syslog formats:

RFC3164 aka <PRI>MMM DD hh:mm:ss HOSTNAME APP-NAME[PROCID]: MESSAGE
RFC5424 aka <PRI>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID [STRUCTURED-DATA] MESSAGE

The following fields are unpacked:

level - obtained from PRI.
priority - obtained from PRI.
facility - calculated as PRI / 8.
facility_keyword - string representation of the facility field according to these docs .
severity - calculated as PRI % 8.
format - either rfc3164 or rfc5424 depending on which Syslog format is unpacked.
timestamp - timestamp in ISO8601 format . The MMM DD hh:mm:ss timestamp in RFC3164 is automatically converted into ISO8601 format by assuming the current year and adjusting the year to the previous one if the parsed timestamp would be more than a day in the future.
hostname
app_name
proc_id
msg_id
message

The <PRI> part is optional. If it is missing, then level, priority, facility, facility_keyword and severity fields aren’t set.
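
The facility and severity calculation from PRI can be sketched in Python. The decode_pri helper is hypothetical and covers only the numeric fields; it doesn't produce the level or facility_keyword strings:

```python
import re

def decode_pri(line):
    """Return priority, facility and severity for a leading <PRI>, or {} if it is missing."""
    m = re.match(r"<(\d{1,3})>", line)
    if not m:
        return {}  # PRI is optional, so no PRI-derived fields are set
    pri = int(m.group(1))
    return {"priority": pri, "facility": pri // 8, "severity": pri % 8}

with_pri = decode_pri("<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 - msg")
without_pri = decode_pri("Oct 11 22:14:15 mymachine su[123]: msg")
```

For PRI equal to 165 the facility is 20 and the severity is 5, since 165 = 20 * 8 + 5.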

The [STRUCTURED-DATA] is parsed into fields with the SD-ID.param1, SD-ID.param2, …, SD-ID.paramN names and the corresponding values according to the specification .

If the app_name equals CEF and the message contains Common Event Format data for SIEM aka CEF for Syslog , then it is automatically parsed into the following fields:

cef.version - the CEF version
cef.device_vendor - the device vendor field
cef.device_product - the device product field
cef.device_version - the device version field
cef.device_event_class_id - the device event class id
cef.name - the CEF name
cef.severity - the severity field

Optional extension fields are parsed into cef.extension.<key> fields with the corresponding values.

For example, the following query unpacks syslog message from the _msg field across logs for the last 5 minutes:

      _time:5m | unpack_syslog from _msg

The from _msg part can be omitted when syslog message is unpacked from the _msg field . The following query is equivalent to the previous one:

      _time:5m | unpack_syslog

By default timestamps in RFC3164 format are interpreted in the local timezone. It is possible to change the timezone offset via offset option. For example, the following query interprets rfc3164 timestamps as if they were generated in the +05:30 timezone:

      _time:5m | unpack_syslog offset 5h30m

If it is needed to preserve the original non-empty field values, then add keep_original_fields to the end of unpack_syslog ...:

      _time:5m | unpack_syslog keep_original_fields

If you want to reduce the chance that the unpacked syslog fields clash with the existing fields, then specify common prefix for all the fields extracted from syslog, by adding result_prefix "prefix_name" to unpack_syslog. For example, the following query adds foo_ prefix for all the unpacked fields from foo field:

      _time:5m | unpack_syslog from foo result_prefix "foo_"

Performance tips:

From a performance and resource usage point of view, it is better to ingest parsed syslog messages into VictoriaLogs according to the supported data model than to ingest unparsed syslog lines and then parse them at query time with the unpack_syslog pipe.
It is recommended to use more specific log filters in order to reduce the number of log entries that are passed to unpack_syslog. See general performance tips for details.

Conditional unpack_syslog #

If the unpack_syslog pipe must be applied only to some log entries , then add if (<filters>) after unpack_syslog. The <filters> can contain arbitrary filters . For example, the following query unpacks syslog message fields from foo field only if hostname field in the current log entry isn’t set or is empty:

      _time:5m | unpack_syslog if (hostname:"") from foo

unpack_words pipe #

The <q> | unpack_words from <src_field> as <dst_field> pipe unpacks words from the given <src_field> log field of <q> query results into <dst_field> as a JSON array.

For example, the following query unpacks words from log messages into words field:

      _time:5m | unpack_words from _msg as words

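The as <dst_field> part is optional. If it is missing, then the result is stored in the <src_field> specified in from <src_field>. For example, the following query stores the unpacked words in the _msg field:

      _time:5m | unpack_words from _msg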

The from <src_field> part is optional. If it is missing, then words are unpacked from the _msg field. The following query is equivalent to the previous one:

      _time:5m | unpack_words

By default unpack_words pipe unpacks all the words, including duplicates, from the <src_field>. It is possible to drop duplicate words by adding drop_duplicates suffix to the pipe. For example, the following query extracts only unique words from the text field of every log entry:

      _time:5m | unpack_words from text as words drop_duplicates

It is convenient to use unroll pipe for unrolling the JSON array with unpacked words from the destination field. For example, the following query returns top 5 most frequently seen words across log messages for the last 5 minutes:

      _time:5m | unpack_words as words | unroll words | top 5 (words)
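
The unpack_words, unroll and top steps of this query can be approximated with the following Python sketch. It assumes that a word is a run of letters, digits and underscores, which may differ from the word definition used by VictoriaLogs. The top_words helper is hypothetical:

```python
import re
from collections import Counter

def top_words(messages, n=5):
    """Count every word occurrence across messages and return the n most frequent."""
    counts = Counter()
    for msg in messages:
        # assumed word definition: letters, digits and underscores
        counts.update(re.findall(r"\w+", msg))
    return counts.most_common(n)

messages = ["error: cannot open file", "error: cannot read file", "file closed"]
top = top_words(messages, n=2)
```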

unroll pipe #

<q> | unroll [by] (field1, ..., fieldN) pipe can be used for unrolling JSON arrays from field1, …, fieldN log fields of <q> query results into separate rows.

For example, the following query unrolls timestamp and value log fields from logs for the last 5 minutes:

      _time:5m | unroll (timestamp, value)

The by keyword can be skipped in unroll ... pipe. If multiple fields are unrolled, then the longest JSON array among the fields defines the number of returned rows, while missing items for shorter arrays are substituted with empty strings.
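
The row expansion described above can be sketched in Python. The sketch assumes that every unrolled field holds a valid JSON array of strings; the unroll helper is hypothetical and isn't the VictoriaLogs implementation:

```python
import json
from itertools import zip_longest

def unroll(log, fields):
    """Expand JSON arrays stored in the given fields into one row per array position."""
    arrays = [json.loads(log[f]) for f in fields]
    rows = []
    # the longest array defines the number of rows; shorter arrays are padded with ""
    for items in zip_longest(*arrays, fillvalue=""):
        row = dict(log)
        row.update(zip(fields, items))
        rows.append(row)
    return rows

log = {"host": "a", "timestamp": '["t1","t2","t3"]', "value": '["10","20"]'}
rows = unroll(log, ("timestamp", "value"))
```

The sample log is expanded into three rows. The last row has an empty value field, since the value array is shorter than the timestamp array. Fields that aren't unrolled, such as host, are copied into every row.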

If the unrolled JSON array contains JSON objects, then it may be handy to use unpack_json for unpacking the unrolled array items into separate fields for further processing.

Conditional unroll #

If the unroll pipe must be applied only to some log entries , then add if (<filters>) after unroll. The <filters> can contain arbitrary filters . For example, the following query unrolls the value field only if the value_type field equals json_array:

      _time:5m | unroll if (value_type:="json_array") (value)

running_stats pipe functions #

LogsQL supports the following functions for running_stats pipe :

count returns the number of log entries.
max returns the maximum value over the given log fields .
min returns the minimum value over the given log fields .
sum returns the sum for the given numeric log fields .

count running_stats #

count() running_stats pipe function calculates running number of selected logs.

For example, the following query adds the running_logs field to the selected logs over the last 5 minutes:

      _time:5m | running_stats count() running_logs

It is possible to calculate the number of logs with non-empty values for some log field with the count(fieldName) syntax. For example, the following query returns running number of logs with non-empty username field over the last 5 minutes:

      _time:5m | running_stats count(username) running_logs_with_username

If multiple fields are enumerated inside count(), then it counts the number of logs with at least a single non-empty field mentioned inside count(). For example, the following query returns the running number of logs with non-empty username or password fields over the last 5 minutes:

      _time:5m | running_stats count(username, password) running_logs_with_username_or_password
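
The difference between count() and count(field1, ..., fieldN) can be sketched in Python. The sketch assumes that logs are processed in the listed order and that missing fields are treated as empty strings. The running_count helper is hypothetical:

```python
def running_count(logs, fields=()):
    """Running number of logs; with fields, count only logs where at least one is non-empty."""
    n = 0
    out = []
    for log in logs:
        if not fields or any(log.get(f, "") != "" for f in fields):
            n += 1
        out.append(n)
    return out

logs = [{"username": "bob"}, {"password": "x"}, {"username": ""}, {}]
all_logs = running_count(logs)
with_credentials = running_count(logs, ("username", "password"))
```

The running value grows with every log for count(), while count(username, password) stops growing on logs where both fields are empty or missing.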

It is possible to calculate the number of logs with at least a single non-empty field with common prefix with count(prefix*) syntax. For example, the following query returns the running number of logs with at least a single non-empty field with foo prefix over the last 5 minutes:

      _time:5m | running_stats count(foo*)

max running_stats #

max(field1, ..., fieldN) running_stats pipe function returns running maximum across all the mentioned log fields .

For example, the following query returns running maximum for the duration field over logs for the last 5 minutes:

      _time:5m | running_stats max(duration) running_max_duration

It is possible to calculate running maximum value across all the fields with common prefix via max(prefix*) syntax.

min running_stats #

min(field1, ..., fieldN) running_stats pipe function returns running minimum across all the mentioned log fields .

Missing fields are treated as empty strings. So min(field) may become an empty string if field is missing for at least a single processed log entry.

For example, the following query returns running minimum for the duration field over logs for the last 5 minutes:

      _time:5m | running_stats min(duration) running_min_duration

It is possible to find running minimum across all the fields with common prefix via min(prefix*) syntax.

sum running_stats #

sum(field1, ..., fieldN) running_stats pipe function calculates running sum of numeric values across all the mentioned log fields . Non-numeric values are skipped. If there were no numeric values across field1, …, fieldN in the processed logs so far, then NaN is returned.

For example, the following query returns running sum of numeric values for the duration field over logs for the last 5 minutes:

      _time:5m | running_stats sum(duration) running_sum_duration

It is possible to find running sum for all the fields with common prefix via sum(prefix*) syntax.
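
The running sum rules (non-numeric values are skipped, NaN is returned until the first numeric value is seen) can be sketched in Python. The sketch assumes logs are processed in the listed order and uses float() for number parsing, which is narrower than the numeric formats accepted by VictoriaLogs. The running_sum helper is hypothetical:

```python
import math

def running_sum(logs, field):
    """Return the running sum after every log; NaN until the first numeric value is seen."""
    total = math.nan
    out = []
    for log in logs:
        try:
            value = float(log.get(field, ""))
        except ValueError:
            value = None  # non-numeric or missing values are skipped
        if value is not None:
            total = value if math.isnan(total) else total + value
        out.append(total)
    return out

logs = [{"duration": "abc"}, {"duration": "1.5"}, {}, {"duration": "2"}]
sums = running_sum(logs, "duration")
```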

total_stats pipe functions #

LogsQL supports the following functions for total_stats pipe :

count returns the number of log entries.
max returns the maximum value over the given log fields .
min returns the minimum value over the given log fields .
sum returns the sum for the given numeric log fields .

count total_stats #

count() total_stats pipe function calculates the total number of selected logs.

For example, the following query adds the total_logs field to the selected logs over the last 5 minutes:

      _time:5m | total_stats count() total_logs

It is possible to calculate the number of logs with non-empty values for some log field with the count(fieldName) syntax. For example, the following query returns the total number of logs with non-empty username field over the last 5 minutes:

      _time:5m | total_stats count(username) total_logs_with_username

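If multiple fields are enumerated inside count(), then it counts the number of logs with at least a single non-empty field mentioned inside count(). For example, the following query returns the total number of logs with non-empty username or password fields over the last 5 minutes:

      _time:5m | total_stats count(username, password) total_logs_with_username_or_password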

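It is possible to calculate the number of logs with at least a single non-empty field with common prefix with count(prefix*) syntax. For example, the following query returns the total number of logs with at least a single non-empty field with foo prefix over the last 5 minutes:

      _time:5m | total_stats count(foo*)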

max total_stats #

max(field1, ..., fieldN) total_stats pipe function returns the total maximum across all the mentioned log fields .

For example, the following query returns the total maximum for the duration field over logs for the last 5 minutes:

      _time:5m | total_stats max(duration) total_max_duration

It is possible to calculate the total maximum value across all the fields with common prefix via max(prefix*) syntax.

min total_stats #

min(field1, ..., fieldN) total_stats pipe function returns the total minimum across all the mentioned log fields .

Missing fields are treated as empty strings. So min(field) may become an empty string if field is missing for at least a single selected log entry.

For example, the following query returns the total minimum for the duration field over logs for the last 5 minutes:

      _time:5m | total_stats min(duration) total_min_duration

It is possible to find the total minimum across all the fields with common prefix via min(prefix*) syntax.

sum total_stats #

sum(field1, ..., fieldN) total_stats pipe function calculates the total sum of numeric values across all the mentioned log fields . Non-numeric values are skipped. If there were no numeric values across field1, …, fieldN in the selected logs, then NaN is returned.

For example, the following query returns the total sum of numeric values for the duration field over logs for the last 5 minutes:

      _time:5m | total_stats sum(duration) total_sum_duration

It is possible to find the total sum for all the fields with common prefix via sum(prefix*) syntax.

stats pipe functions #

LogsQL supports the following functions for stats pipe :

avg returns the average value over the given numeric log fields .
count returns the number of log entries.
count_empty returns the number of logs with empty log fields .
count_uniq returns the number of unique non-empty values for the given log fields .
count_uniq_hash returns the number of unique hashes for non-empty values at the given log fields .
histogram returns VictoriaMetrics histogram for the given log field .
json_values returns JSON-encoded logs as JSON array.
max returns the maximum value over the given log fields .
median returns the median value over the given log fields .
min returns the minimum value over the given log fields .
quantile returns the given quantile for the given log fields .
rate returns the average per-second rate of matching logs on the selected time range.
rate_sum returns the average per-second rate of sum for the given log fields .
row_any returns a sample log entry for each selected stats group .
row_max returns the log entry with the maximum value at the given field.
row_min returns the log entry with the minimum value at the given field.
sum returns the sum for the given numeric log fields .
sum_len returns the sum of lengths for the given log fields .
uniq_values returns unique non-empty values for the given log fields .
values returns all the values for the given log fields .

avg stats #

avg(field1, ..., fieldN) stats pipe function calculates the average value across all the mentioned log fields . Non-numeric values are ignored. If all the values are non-numeric, then NaN is returned.

For example, the following query returns the average value for the duration field over logs for the last 5 minutes:

      _time:5m | stats avg(duration) avg_duration

It is possible to calculate the average over fields with common prefix via avg(prefix*) syntax. For example, the following query calculates the average over all the log fields with foo prefix:

      _time:5m | stats avg(foo*)
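
The handling of non-numeric values by avg can be sketched in Python. The sketch uses float() for number parsing, which is an assumption; VictoriaLogs may accept more numeric formats. The avg helper is hypothetical:

```python
import math

def avg(logs, fields):
    """Average of all numeric values found in the given fields; NaN if there are none."""
    values = []
    for log in logs:
        for f in fields:
            try:
                values.append(float(log.get(f, "")))
            except ValueError:
                pass  # non-numeric and missing values are ignored
    return sum(values) / len(values) if values else math.nan

logs = [{"duration": "1"}, {"duration": "3"}, {"duration": "n/a"}, {}]
avg_duration = avg(logs, ("duration",))
avg_missing = avg(logs, ("latency",))
```

Here avg_duration equals 2, since only the values 1 and 3 participate in the calculation, while avg_missing is NaN.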

count stats #

count() stats pipe function calculates the number of selected logs.

For example, the following query returns the number of logs over the last 5 minutes:

      _time:5m | stats count() logs

It is possible to calculate the number of logs with non-empty values for some log field with the count(fieldName) syntax. For example, the following query returns the number of logs with non-empty username field over the last 5 minutes:

      _time:5m | stats count(username) logs_with_username

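If multiple fields are enumerated inside count(), then it counts the number of logs with at least a single non-empty field mentioned inside count(). For example, the following query returns the number of logs with non-empty username or password fields over the last 5 minutes:

      _time:5m | stats count(username, password) logs_with_username_or_password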

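It is possible to calculate the number of logs with at least a single non-empty field with common prefix with count(prefix*) syntax. For example, the following query returns the number of logs with at least a single non-empty field with foo prefix over the last 5 minutes:

      _time:5m | stats count(foo*)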

count_empty stats #

count_empty(field1, ..., fieldN) stats pipe function calculates the number of logs with empty (field1, ..., fieldN) tuples. Missing fields are treated as empty.

For example, the following query calculates the number of logs with empty username field during the last 5 minutes:

      _time:5m | stats count_empty(username) logs_with_missing_username

It is possible to calculate the number of logs with empty fields with common prefix via count_empty(prefix*) syntax. For example, the following query calculates the number of logs with empty fields with foo prefix during the last 5 minutes:

      _time:5m | stats count_empty(foo*)

count_uniq stats #

count_uniq(field1, ..., fieldN) stats pipe function calculates the number of unique non-empty (field1, ..., fieldN) tuples.

For example, the following query returns the number of unique non-empty values for ip field over the last 5 minutes:

      _time:5m | stats count_uniq(ip) ips

The following query returns the number of unique (host, path) pairs for the corresponding fields over the last 5 minutes:

      _time:5m | stats count_uniq(host, path) unique_host_path_pairs

Every unique value is stored in memory during query execution. A large number of unique values may require a lot of memory. Sometimes it is necessary to know whether the number of unique values reaches a limit. In this case add limit N just after count_uniq(...) for capping the returned number of unique values at N, while limiting memory usage. For example, the following query counts up to 1_000_000 unique values for the ip field:

      _time:5m | stats count_uniq(ip) limit 1_000_000 as ips_1_000_000

If it is OK to count an estimated number of unique values, then count_uniq_hash can be used as faster alternative to count_uniq.
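
The effect of limit N on count_uniq can be sketched in Python. The sketch assumes that a tuple is skipped only when all its fields are empty, and that counting stops as soon as N unique tuples are collected. The count_uniq helper is hypothetical:

```python
def count_uniq(logs, fields, limit=0):
    """Count unique non-empty tuples of the given fields; stop at limit when it is set."""
    seen = set()
    for log in logs:
        key = tuple(log.get(f, "") for f in fields)
        if all(v == "" for v in key):
            continue  # assumed rule: tuples with only empty values aren't counted
        seen.add(key)
        if limit and len(seen) >= limit:
            break  # stop early, so memory usage stays bounded
    return len(seen)

logs = [{"ip": "1.1.1.1"}, {"ip": "2.2.2.2"}, {"ip": "2.2.2.2"}, {}, {"ip": "3.3.3.3"}]
ips = count_uniq(logs, ("ip",))
ips_capped = count_uniq(logs, ("ip",), limit=2)
```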

count_uniq_hash stats #

count_uniq_hash(field1, ..., fieldN) stats pipe function calculates the number of unique non-empty (field1, ..., fieldN) tuples by hashing them. The returned number may be underestimated because hash collisions are possible. This function usually works faster and uses less memory than count_uniq when counting a large number of unique values.

For example, the following query returns an estimated number of unique non-empty values for ip field over the last 5 minutes:

      _time:5m | stats count_uniq_hash(ip) unique_ips_count

The following query returns an estimated number of unique (host, path) pairs for the corresponding fields over the last 5 minutes:

      _time:5m | stats count_uniq_hash(host, path) unique_host_path_pairs
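
The reason for the possible underestimation can be shown with a Python sketch that stores hashes instead of values. The hash function used by VictoriaLogs isn't specified here, so the sketch uses CRC32 from zlib and a deliberately tiny 8-bit hash for making collisions visible. The count_uniq_hash helper and its bits parameter are hypothetical:

```python
import zlib

def count_uniq_hash(values, bits=32):
    """Count distinct hashes of non-empty values; collisions make the result an underestimate."""
    mask = (1 << bits) - 1
    hashes = {zlib.crc32(v.encode()) & mask for v in values if v != ""}
    return len(hashes)

ips = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
exact = len(set(ips))
estimated = count_uniq_hash(ips)
estimated_8bit = count_uniq_hash(ips, bits=8)
```

The estimate can never exceed the exact count, because distinct values may share a hash while equal values always do. With an 8-bit hash at most 256 distinct hashes exist, so estimated_8bit is far below the 1000 unique values in this sample.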

histogram stats #

histogram(field) stats pipe function returns VictoriaMetrics histogram buckets for the given field .

For example, the following query returns histogram buckets for the response_size field grouped by host field, across logs for the last 5 minutes:

      _time:5m | stats by (host) histogram(response_size)

If the field contains duration value , then histogram normalizes it to nanoseconds. For example, 1.25ms is normalized to 1_250_000.

If the field contains short numeric value , then histogram normalizes it to numeric value without any suffixes. For example, 1KiB is converted to 1024.

Histogram buckets are returned as the following JSON array:

      [{"vmrange":"...","hits":...},...,{"vmrange":"...","hits":...}]
    

Every vmrange value contains value range for the corresponding VictoriaMetrics histogram bucket , while hits contains the number of values, which hit the given bucket.
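
The returned JSON array can be decoded on the client side as well. The following Python sketch assumes the lower...upper notation for vmrange used by VictoriaMetrics histogram buckets. The payload and its values are made up for the example:

```python
import json

# Illustrative payload in the shape shown above; the values are made up.
buckets_json = (
    '[{"vmrange":"1.000e+02...1.136e+02","hits":3},'
    '{"vmrange":"1.136e+02...1.292e+02","hits":5}]'
)

rows = []
for bucket in json.loads(buckets_json):
    # split the range into its lower and upper bounds
    lower, upper = (float(x) for x in bucket["vmrange"].split("..."))
    rows.append({"lower": lower, "upper": upper, "hits": bucket["hits"]})

total_hits = sum(row["hits"] for row in rows)
```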

It may be handy to unroll the returned histogram buckets for further processing during the query. For example, the following query calculates a histogram over the response_size field and then unrolls it into distinct rows with vmrange and hits fields with the help of unroll and unpack_json pipes:

      _time:5m
        | stats histogram(response_size) as buckets
        | unroll (buckets)
        | unpack_json from buckets

json_values stats #

json_values(field1, ..., fieldN) stats pipe function packs the given fields into JSON for every log entry and returns a JSON array, which can be unrolled with unroll pipe .

For example, the following query returns per-app JSON arrays containing _time and _msg fields for the last 5 minutes:

      _time:5m | stats by (app) json_values(_time, _msg) as json_logs

If the list of fields is empty, then json_values selects all the log fields and packs every log entry into a JSON object inside the returned JSON array:

      _time:5m | stats json_values() as json_logs

It is possible to select fields with the given prefix via json_values(prefix*) syntax.

It is possible to set the upper limit on the number of JSON-encoded logs with the limit N suffix. For example, the following query returns up to 3 JSON-encoded logs for every host:

      _time:5m | stats by (host) json_values() limit 3 as json_logs

It is possible to sort the selected log entries by appending sort by (...) (or order by (...)). For example, the following query returns per-host logs over the last 5 minutes sorted by descending order of _time field :

      _time:5m | stats by (host) json_values() sort by (_time desc) as json_logs

The sort by (...) allows selecting top N logs for each group when combined with limit N. For example, the following query selects up to 3 of the most recent logs for every host over the last 5 minutes:

      _time:5m | stats by (host) json_values() sort by (_time desc) limit 3 as json_logs
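
The combination of grouping, sorting and limit can be sketched in Python. The sketch assumes that _time values share the same ISO8601 format, so they can be compared as strings. The json_values_top helper is hypothetical and isn't the VictoriaLogs implementation:

```python
import json
from collections import defaultdict

def json_values_top(logs, by, sort_field, limit):
    """Per group, keep up to limit logs with the biggest sort_field and encode them as JSON."""
    groups = defaultdict(list)
    for log in logs:
        groups[log.get(by, "")].append(log)
    out = {}
    for key, entries in groups.items():
        entries.sort(key=lambda e: e[sort_field], reverse=True)  # descending order
        out[key] = json.dumps(entries[:limit])
    return out

logs = [
    {"host": "a", "_time": "2024-01-01T00:00:01Z", "_msg": "m1"},
    {"host": "a", "_time": "2024-01-01T00:00:03Z", "_msg": "m3"},
    {"host": "a", "_time": "2024-01-01T00:00:02Z", "_msg": "m2"},
    {"host": "b", "_time": "2024-01-01T00:00:05Z", "_msg": "m5"},
]
json_logs = json_values_top(logs, by="host", sort_field="_time", limit=2)
```

For host a the result contains the m3 and m2 logs in this order, while the oldest m1 log is dropped by the limit.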

max stats #

max(field1, ..., fieldN) stats pipe function returns the maximum value across all the mentioned log fields .

For example, the following query returns the maximum value for the duration field over logs for the last 5 minutes:

      _time:5m | stats max(duration) max_duration

The max(some_field) function treats missing values as empty strings according to VictoriaLogs data model . Missing values do not change the result of max(some_field) if there is at least a single non-empty value for some_field. Otherwise max(some_field) returns an empty string. Use max(some_field) if (some_field:*) as max_value_without_empty_string syntax for filtering out empty string values. See conditional stats docs for more details.

It is possible to calculate the maximum value across all the fields with common prefix via max(prefix*) syntax.

row_max function can be used for obtaining other fields with the maximum duration.

median stats #

median(field1, ..., fieldN) stats pipe function calculates the estimated median value across the given log fields .

For example, the following query returns the median for the duration field over logs for the last 5 minutes:

      _time:5m | stats median(duration) median_duration

The median(some_field) function treats missing values as empty strings according to VictoriaLogs data model , so missing (or empty) values participate in the median calculation. Use median(some_field) if (some_field:*) as median_value_without_empty_string syntax for filtering out empty string values. See conditional stats docs for more details.

It is possible to calculate the median across all the fields with common prefix via median(prefix*) syntax.

min stats #

min(field1, ..., fieldN) stats pipe function returns the minimum value across all the mentioned log fields .

For example, the following query returns the minimum value for the duration field over logs for the last 5 minutes:

      _time:5m | stats min(duration) min_duration

The min(some_field) function treats missing values as empty strings according to VictoriaLogs data model . If some_field is missing (or empty) for at least a single processed log entry, then min(some_field) returns an empty string. Use min(some_field) if (some_field:*) as min_value_without_empty_string syntax for filtering out empty string values. See conditional stats docs for more details.

It is possible to find the minimum across all the fields with common prefix via min(prefix*) syntax.

row_min function can be used for obtaining other fields with the minimum duration.

quantile stats #

quantile(phi, field1, ..., fieldN) stats pipe function calculates an estimated phi percentile over values for the given log fields . The phi must be in the range 0 ... 1, where 0 means 0th percentile, while 1 means 100th percentile.

For example, the following query calculates 50th, 90th and 99th percentiles for the request_duration_seconds field over logs for the last 5 minutes:

      _time:5m | stats
        quantile(0.5, request_duration_seconds) p50,
        quantile(0.9, request_duration_seconds) p90,
        quantile(0.99, request_duration_seconds) p99

The quantile(phi, some_field) function treats missing values as empty strings according to VictoriaLogs data model , so missing (or empty) values participate in the quantile calculation. Use quantile(phi, some_field) if (some_field:*) as quantile_value_without_empty_string syntax for filtering out empty string values. See conditional stats docs for more details.

It is possible to calculate the quantile across all the fields with common prefix via quantile(phi, prefix*) syntax.
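
The relation between phi and percentiles can be illustrated with a nearest-rank quantile in Python. VictoriaLogs returns an estimated value and its estimation method isn't described here, so the nearest-rank method is an assumption made for the example. The quantile helper is hypothetical:

```python
import math

def quantile(phi, values):
    """Nearest-rank quantile over numeric values, with phi in the range 0 ... 1."""
    if not 0 <= phi <= 1:
        raise ValueError("phi must be in the range 0 ... 1")
    ordered = sorted(values)
    if not ordered:
        return math.nan
    idx = max(0, math.ceil(phi * len(ordered)) - 1)
    return ordered[idx]

durations = list(range(1, 101))  # 1, 2, ..., 100
p50, p90, p99 = (quantile(phi, durations) for phi in (0.5, 0.9, 0.99))
```

For the values 1 ... 100 this gives 50, 90 and 99 for the 50th, 90th and 99th percentiles.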

rate stats #

rate() stats pipe function returns the average per-second rate of matching logs on the selected time range. If the query doesn’t contain a global time filter, then rate() returns the total number of matching logs (same as count()).

For example, the following query returns the average per-second rate of logs with the error word over the last 5 minutes:

      _time:5m error | stats rate()
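
The arithmetic behind the average per-second rate can be sketched in Python: the number of matching logs is divided by the duration of the selected time range in seconds. The rate helper and the sample numbers are hypothetical:

```python
def rate(matching_logs, range_seconds):
    """Average per-second rate: the number of matching logs divided by the range duration."""
    return matching_logs / range_seconds

# 1500 logs with the error word over _time:5m, i.e. over 300 seconds
error_rate = rate(1500, 5 * 60)
```

In this example error_rate equals 5 logs per second.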

rate_sum stats #

rate_sum(field1, ..., fieldN) stats pipe function returns the average per-second rate of the sum over the given numeric fields . If the query doesn’t contain a global time filter, then rate_sum(...) returns the total sum (same as sum(...)). Non-numeric values are skipped. If all the values across field1, …, fieldN are non-numeric, then NaN is returned.

For example, the following query returns the average per-second rate of the sum of bytes_sent log field over the last 5 minutes:

      _time:5m | stats rate_sum(bytes_sent)

It is possible to calculate the average per-second rate of the sum over all the fields starting with a particular prefix by using rate_sum(prefix*) syntax.

row_any stats #

row_any() stats pipe function returns an arbitrary log entry (aka sample) for each selected stats group . The log entry is returned as a JSON-encoded dictionary with all the fields from the original log.

For example, the following query returns a sample log entry for each _stream across logs for the last 5 minutes:

      _time:5m | stats by (_stream) row_any() as sample_row

Fields from the returned values can be decoded with unpack_json or extract pipes.

If only the specific fields are needed, then they can be enumerated inside row_any(...). For example, the following query returns only _time and path fields from a sample log entry for logs over the last 5 minutes:

      _time:5m | stats row_any(_time, path) as time_and_path_sample

It is possible to return all the fields starting with a particular prefix by using row_any(prefix*) syntax.

row_max stats #

row_max(field) stats pipe function returns the log entry with the maximum value for the given field. The log entry is returned as a JSON-encoded dictionary with all the fields from the original log. Empty values for the given field are ignored. If all the values for field are empty, then {} is returned.

For example, the following query returns the log entry with the maximum value for the duration field across logs for the last 5 minutes:

      _time:5m | stats row_max(duration) as log_with_max_duration

Fields from the returned values can be decoded with unpack_json or extract pipes.

If only the specific fields are needed from the returned log entry, then they can be enumerated inside row_max(...). For example, the following query returns only _time, path and duration fields from the log entry with the maximum duration over the last 5 minutes. Note that duration must be explicitly listed in row_max(...) if it must be returned.

      _time:5m | stats row_max(duration, _time, path, duration) as time_and_path_with_max_duration

It is possible to return all the fields starting with a particular prefix by using row_max(field, prefix*) syntax.

row_min stats #

row_min(field) stats pipe function returns the log entry with the minimum value for the given field. The log entry is returned as a JSON-encoded dictionary with all the fields from the original log. Empty values for the given field are ignored. If all the values for field are empty, then {} is returned.

For example, the following query returns the log entry with the minimum value for the duration field across logs for the last 5 minutes:

      _time:5m | stats row_min(duration) as log_with_min_duration

Fields from the returned values can be decoded with unpack_json or extract pipes.

If only the specific fields are needed from the returned log entry, then they can be enumerated inside row_min(...). For example, the following query returns only _time, path and duration fields from the log entry with the minimum duration over the last 5 minutes. Note that duration must be explicitly listed in row_min(...) if it must be returned.

      _time:5m | stats row_min(duration, _time, path, duration) as time_and_path_with_min_duration

It is possible to return all the fields starting with a particular prefix by using row_min(field, prefix*) syntax.

sum stats #

sum(field1, ..., fieldN) stats pipe function calculates the sum of numeric values across all the mentioned log fields . Non-numeric values are skipped. If all the values across field1, …, fieldN are non-numeric, then NaN is returned.

For example, the following query returns the sum of numeric values for the duration field over logs for the last 5 minutes:

      _time:5m | stats sum(duration) sum_duration

It is possible to find the sum for all the fields with common prefix via sum(prefix*) syntax.
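
The rules for non-numeric values can be illustrated with a short sketch. It is an approximation under stated assumptions: Python's float() stands in for the VictoriaLogs numeric parser (which also accepts formats float() does not), and strings such as "nan" or "inf" are treated as non-numeric. The function name is hypothetical.

```python
import math


def sum_numeric(values):
    """Sum the values that parse as finite numbers; return NaN if none do."""
    total = 0.0
    seen_numeric = False
    for raw in values:
        try:
            number = float(raw)
        except (TypeError, ValueError):
            continue  # non-numeric values are skipped
        if not math.isfinite(number):
            continue  # assumption: "nan"/"inf" strings count as non-numeric
        total += number
        seen_numeric = True
    return total if seen_numeric else math.nan


print(sum_numeric(["1.5", "oops", "", "2"]))  # 3.5
print(sum_numeric(["n/a", ""]))               # nan
```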

sum_len stats #

sum_len(field1, ..., fieldN) stats pipe function calculates the sum of byte lengths of all the values for the given log fields .

For example, the following query returns the sum of byte lengths of _msg fields across all the logs for the last 5 minutes:

      _time:5m | stats sum_len(_msg) messages_len

It is possible to find the sum of byte lengths for all the fields with common prefix via sum_len(prefix*) syntax.

uniq_values stats #

uniq_values(field1, ..., fieldN) stats pipe function returns the unique non-empty values across the mentioned log fields. The returned values are encoded as a sorted JSON array.

For example, the following query returns unique non-empty values for the ip field over logs for the last 5 minutes:

      _time:5m | stats uniq_values(ip) unique_ips

The returned unique IP addresses can be unrolled into distinct log entries with unroll pipe .

Every unique value is stored in memory during query execution. A large number of unique values may require a lot of memory. Sometimes it is enough to return only a subset of unique values. In this case add limit N after uniq_values(...) in order to limit the number of returned unique values to N, which also limits the maximum memory usage. For example, the following query returns up to 100 unique values for the ip field over the logs for the last 5 minutes:

      _time:5m | stats uniq_values(ip) limit 100 as unique_ips_100

If the limit is reached, then a subset of unique values is returned. This subset isn’t guaranteed to be stable across repeated query executions.

It is possible to find unique values for all the fields with common prefix via uniq_values(prefix*) syntax.

values stats #

values(field1, ..., fieldN) stats pipe function returns all the values (including empty values) for the mentioned log fields. The returned values are encoded as a JSON array.

For example, the following query returns all the values for the ip field over logs for the last 5 minutes:

      _time:5m | stats values(ip) ips

The returned IP addresses can be unrolled into distinct log entries with unroll pipe .

It is possible to limit the number of returned values with limit N. If the limit is reached, then a subset of values is returned. This subset isn’t guaranteed to be stable across repeated query executions.

It is possible to get values for all the fields with common prefix via values(prefix*) syntax.

Subqueries #

LogsQL supports subqueries in the following places:

In filters - see subquery filters .
In the following pipes :
- join pipe
- union pipe

Stream context #

See stream_context pipe .

Transformations #

LogsQL supports various transformations on the log entries selected with filters , including:

Extracting arbitrary text from log fields according to the provided pattern. See these docs for details.
Unpacking JSON fields from log fields . See these docs .
Unpacking logfmt fields from log fields . See these docs .
Unpacking Syslog messages from log fields . See these docs .
Creating a new field from existing log fields according to the provided format. See format pipe .
Replacing substrings in the given log field . See replace pipe and replace_regexp pipe docs.
Creating a new field according to math calculations over existing log fields . See math pipe .

See also other pipes , which can be applied to the selected logs.

It is also possible to perform various transformations on the selected log entries at client side with Unix commands such as jq, awk and cut, according to these docs.

Post-filters #

Post-filtering of query results can be performed at any step by using filter pipe .

It is also possible to perform post-filtering of the selected log entries at client side with grep and similar Unix commands according to these docs .

Stats #

Stats over the selected logs can be calculated via stats pipe .

It is also possible to perform stats calculations on the selected log entries at client side with Unix commands such as sort and uniq, according to these docs.

Sorting #

By default VictoriaLogs doesn’t sort the returned results for performance reasons. Use sort pipe for sorting the results.

Limiters #

LogsQL provides the following pipes for limiting the returned results:

fields and delete pipes allow limiting the set of log fields to return.
limit pipe allows limiting the number of log entries to return.

Querying specific fields #

Specific log fields can be queried via fields pipe .

String literals #

LogsQL supports the following string literals:

"double quoted". Double quote and backslash inside such a string must be escaped with \: "escape\"doublequote and \\ backslash". Double-quoted strings may contain special sequences such as \n, \t, \f, \x8c, etc. They are decoded according to these docs .
'single quoted'. Single quote and backslash inside such a string must be escaped with \: 'escape\'singlequote and \\ backslash'.
`backtick quoted`. Strings with backslashes, double quotes and single quotes shouldn’t be escaped inside backtick-quoted strings. Backtick character cannot be used inside backtick-quoted strings.
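
When queries are built programmatically, the escaping rules above can be applied with a small helper. The following is a minimal sketch that covers only the double-quoted and backtick-quoted forms; the function names are hypothetical, and control characters such as newlines are passed through unchanged.

```python
def quote_double(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def quote_backtick(text: str) -> str:
    """Wrap text in backticks; a backtick inside the text is not representable."""
    if "`" in text:
        raise ValueError("backtick-quoted strings cannot contain a backtick")
    return f"`{text}`"


print(quote_double('escape"doublequote and \\ backslash'))
# "escape\"doublequote and \\ backslash"
print(quote_backtick('no "escaping" needed \\ here'))
# `no "escaping" needed \ here`
```

Backslashes are escaped before double quotes so that the backslash added in front of a quote is not escaped a second time.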

Comments #

A LogsQL query may contain comments at any place. A comment starts with # and continues until the end of the current line. Example query with comments:

      error                               # find logs with `error` word
        | stats by (_stream) count() logs # then count the number of logs per `_stream` label
        | sort by (logs) desc             # then sort by the found logs in descending order
        | limit 5                         # and show top 5 streams with the biggest number of logs

Numeric values #

LogsQL accepts numeric values in the following formats:

regular integers like 12345 or -12345
regular floating point numbers like 0.123 or -12.34
short numeric format
duration format

Short numeric values #

LogsQL accepts integer and floating point values with the following suffixes:

B - bytes.
K and KB - the value is multiplied by 10^3
M and MB - the value is multiplied by 10^6
G and GB - the value is multiplied by 10^9
T and TB - the value is multiplied by 10^12
Ki and KiB - the value is multiplied by 2^10
Mi and MiB - the value is multiplied by 2^20
Gi and GiB - the value is multiplied by 2^30
Ti and TiB - the value is multiplied by 2^40

Multiple values with suffixes can be combined. For example, 1MiB500KiB200B.

All the numbers may contain _ delimiters, which may improve readability of the query. For example, 1_234_567 is equivalent to 1234567, while 1.234_567 is equivalent to 1.234567.
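
The suffix rules above can be checked with a small parser. This is an illustrative sketch rather than the parser used by VictoriaLogs: the function name is hypothetical, exact arithmetic is done with Fraction, and only the suffixes listed above are recognized.

```python
import re
from fractions import Fraction

# Multipliers taken from the list above.
MULTIPLIERS = {
    "B": 1,
    "K": 10**3, "KB": 10**3,
    "M": 10**6, "MB": 10**6,
    "G": 10**9, "GB": 10**9,
    "T": 10**12, "TB": 10**12,
    "Ki": 2**10, "KiB": 2**10,
    "Mi": 2**20, "MiB": 2**20,
    "Gi": 2**30, "GiB": 2**30,
    "Ti": 2**40, "TiB": 2**40,
}

# A number followed by a suffix; longer suffixes are tried first.
_PART = re.compile(
    r"(\d+(?:\.\d+)?)(KiB|MiB|GiB|TiB|KB|MB|GB|TB|Ki|Mi|Gi|Ti|K|M|G|T|B)"
)
_PLAIN = re.compile(r"\d+(?:\.\d+)?")


def parse_short_number(text: str) -> Fraction:
    """Parse values such as 1MiB500KiB200B or 1_234_567 into an exact number."""
    cleaned = text.replace("_", "")  # `_` delimiters only improve readability
    sign = 1
    if cleaned.startswith("-"):
        sign, cleaned = -1, cleaned[1:]
    if not cleaned:
        raise ValueError("empty value")
    if _PLAIN.fullmatch(cleaned):
        return sign * Fraction(cleaned)
    total = Fraction(0)
    pos = 0
    while pos < len(cleaned):
        match = _PART.match(cleaned, pos)
        if match is None:
            raise ValueError(f"cannot parse {text!r}")
        total += Fraction(match.group(1)) * MULTIPLIERS[match.group(2)]
        pos = match.end()
    return sign * total


print(parse_short_number("1MiB500KiB200B"))  # 1560776
print(parse_short_number("1_234_567"))       # 1234567
```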

Duration values #

LogsQL accepts duration values with the following suffixes at places where the duration is allowed:

ns - nanoseconds. For example, 123ns.
µs - microseconds. For example, 1.23µs.
ms - milliseconds. For example, 1.23456ms
s - seconds. For example, 1.234s
m - minutes. For example, 1.5m
h - hours. For example, 1.5h
d - days. For example, 1.5d
w - weeks. For example, 1w
y - years as 365 days. For example, 1.5y

Multiple durations can be combined. For example, 1h33m55s.

Internally duration values are converted into nanoseconds.
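
The conversion into nanoseconds can be sketched as follows. This is an illustration under assumptions, not the VictoriaLogs implementation: the function name is hypothetical, and any remainder smaller than one nanosecond is truncated.

```python
import re
from fractions import Fraction

# Nanoseconds per unit, following the list above (a year is 365 days).
NANOSECONDS_PER_UNIT = {
    "ns": 1,
    "µs": 10**3,
    "ms": 10**6,
    "s": 10**9,
    "m": 60 * 10**9,
    "h": 3600 * 10**9,
    "d": 86400 * 10**9,
    "w": 7 * 86400 * 10**9,
    "y": 365 * 86400 * 10**9,
}

# Two-letter suffixes are tried before `s` and `m` to avoid partial matches.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|µs|ms|s|m|h|d|w|y)")


def parse_duration_ns(text: str) -> int:
    """Convert a duration such as 1h33m55s into nanoseconds."""
    total = Fraction(0)
    pos = 0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse duration {text!r} at offset {pos}")
        total += Fraction(match.group(1)) * NANOSECONDS_PER_UNIT[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return int(total)  # assumption: sub-nanosecond remainders are truncated


print(parse_duration_ns("1h33m55s"))  # 5635000000000
print(parse_duration_ns("1.5m"))      # 90000000000
```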

Performance tips #

It is highly recommended to specify a time filter in order to narrow down the search to a specific time range.
It is highly recommended to specify a stream filter in order to narrow down the search to specific log streams .
It is recommended to specify the log fields you need in query results with the fields pipe , if the selected log entries contain a large number of fields that aren’t interesting to you. This saves disk read IO and CPU time needed for reading and unpacking all the log fields from disk.
Move faster filters such as word filter and phrase filter to the beginning of the query. This rule doesn’t apply to time filter and stream filter , which can be put at any place of the query.
Move more specific filters, which match a lower number of log entries, to the beginning of the query. This rule doesn’t apply to time filter and stream filter, which can be put at any place of the query.
If the selected logs are passed to pipes for further transformations and statistics calculations, then it is recommended to reduce the number of selected logs by using more specific filters, which return a lower number of logs to process by pipes.
If the logs are stored at high-latency storage systems such as NFS or S3, then increasing the number of parallel readers can help improve query performance. See these docs for details.

Query options #

VictoriaLogs supports the following options, which can be passed at the beginning of a LogsQL query <q> via the options(opt1=v1, ..., optN=vN) <q> syntax:

`concurrency` query option #

VictoriaLogs executes each query on all the available CPU cores in parallel. This usually provides the best query performance. Sometimes it is necessary to reduce the number of used CPU cores in order to reduce RAM usage and/or CPU usage. This can be done by setting the concurrency option to a value smaller than the number of available CPU cores. For example, the following query executes on at most 2 CPU cores:

      options(concurrency=2) _time:1d | count_uniq(user_id)

The concurrency option is applied individually to every vlstorage node in VictoriaLogs cluster .

`parallel_readers` query option #

VictoriaLogs uses parallel data readers for query execution. The default number of parallel readers fits the majority of practical use cases. Sometimes it may be needed to configure it on a per-query basis (for example, to increase query performance by increasing the number of parallel readers when the logs are stored on the persistent storage with high read latency such as NFS or S3). This can be done via parallel_readers query option. For example, the following query uses 100 parallel readers:

      options(parallel_readers=100) _time:1d error | count()

If the parallel_readers option isn’t set, while the concurrency option is set, then the number of parallel readers equals the concurrency value (capped to 2000).

The default number of parallel readers can be configured via -defaultParallelReaders command-line flag.

The parallel_readers option is applied individually to every vlstorage node in VictoriaLogs cluster .

Note that too large a number of parallel readers may result in excessive usage of RAM and CPU.

`ignore_global_time_filter` query option #

When running via Web UI, Grafana, or HTTP querying API that may apply a global time range via start and end query args, VictoriaLogs injects a global _time filter into the query and every subquery . The injected time filter corresponds to the [start,end) time range passed via HTTP query args, where end is treated as exclusive (it is converted to end-1ns internally, since LogsQL time ranges are inclusive). Set ignore_global_time_filter=true to prevent injecting this global time filter into the given query and all its subqueries.

For example, the following query preserves the original time logic in the query body without adding a global _time filter:

      options(ignore_global_time_filter=true) _time:>1h | count()

This option can be used in subqueries for excluding the global time range for the given subquery. For example, the following query returns the number of logs with user_id values seen in logs during December 2024, on the [start,end) time range passed to /select/logsql/query:

      user_id:in(options(ignore_global_time_filter=true) _time:2024-12Z | keep user_id) | count()

The in(...) subquery without options(ignore_global_time_filter=true) takes into account only user_id values on the intersection of December 2024 and the [start,end) time range passed to /select/logsql/query:

      user_id:in(_time:2024-12Z | keep user_id) | count()
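
The time range arithmetic described in this section can be sketched as follows. This is a simplified model that works on integer nanosecond timestamps; the function names and the sample values are hypothetical.

```python
def injected_time_range(start_ns, end_ns):
    """Convert the end-exclusive [start, end) range into the inclusive
    [start, end - 1ns] range used for the injected _time filter."""
    if end_ns <= start_ns:
        raise ValueError("end must be greater than start")
    return start_ns, end_ns - 1


def intersect(a, b):
    """Intersection of two inclusive ranges, or None if they do not overlap."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# Hypothetical values: the global range is [1000, 2000) and the subquery
# has its own inclusive time filter of 1500..3000.
global_range = injected_time_range(1000, 2000)
subquery_range = (1500, 3000)

print(global_range)                             # (1000, 1999)
print(intersect(global_range, subquery_range))  # (1500, 1999)
# With ignore_global_time_filter=true the subquery keeps (1500, 3000) as is.
```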

`allow_partial_response` query option #

In VictoriaLogs cluster mode, some vlstorage nodes may be temporarily unavailable. Set allow_partial_response=true to return partial results from available nodes instead of failing the whole query.

For example:

      options(allow_partial_response=true) _time:1h error | stats count()

This may lead to incorrect results, so be careful when using this option. The query option overrides allow_partial_response HTTP query arg and -search.allowPartialResponse command-line flag, so it’s better to use the query option for more explicit control.

`time_offset` query option #

The time_offset query option subtracts the given offset from all the time filters in the query, and then adds the given offset to the selected _time field values before passing them to query pipes. This allows comparing query results for the same duration at different offsets. The option accepts duration values like 12h, 1d, 1y. For example, the following query returns the number of logs with the error word over the last hour 7 days ago:

      options(time_offset=7d) _time:1h error | stats count() as 'errors_7d_ago'
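
The two shifts performed by time_offset can be illustrated with plain datetime arithmetic. This is a sketch of the description above rather than of the actual implementation; the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def scanned_window(now, window, offset):
    """Time range scanned for `_time:<window>` when `time_offset=<offset>` is set:
    the offset is subtracted from the time filter."""
    end = now - offset
    return end - window, end


def reported_time(stored_time, offset):
    """`_time` value passed to the pipes: the stored timestamp plus the offset."""
    return stored_time + offset


now = datetime(2025, 1, 8, 12, 0, tzinfo=timezone.utc)
offset = timedelta(days=7)

start, end = scanned_window(now, timedelta(hours=1), offset)
print(start.isoformat(), end.isoformat())
# 2025-01-01T11:00:00+00:00 2025-01-01T12:00:00+00:00

stored = datetime(2025, 1, 1, 11, 30, tzinfo=timezone.utc)
print(reported_time(stored, offset).isoformat())
# 2025-01-08T11:30:00+00:00
```

Since the reported _time values are shifted forward, results for the earlier window line up with results for the current window.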

Troubleshooting #

LogsQL works well for most use cases when set up right. But sometimes you will see slow queries. The most common reason is querying too many logs without enough filtering. Always be specific when you build your queries.

Use these steps to help you understand your query and improve its speed.

Check how many logs your query matches #

You can do this by running multiple versions of the query, each time ending it with | count() after a filter or a pipe that may change the number of rows.

Suppose you have the following query, which executes slowly:

      _time:5m host:"api-" level:error "database" | stats by (app) count()

Substitute all the pipes in the query with | count() and run the updated query to see the total number of matching logs:

      _time:5m host:"api-" level:error "database" | count()

An example output (obtained via vlogscli , but you can use any supported querying method ):

      executing [_time:5m level:error database host:"api-" | stats count(*) as "count(*)"]...; duration: 0.474s
      {
        "count(*)": "19217008"
      }

So the given filters match 19,217,008 logs and the matching takes 0.474 seconds.

If the execution time is high, try reordering your filters. Put the most selective and cheapest conditions first. Filters run one after another, so an early filter that removes a lot of logs will make later filters faster to run. For more tips, see the performance tips.

If you are not sure which filter is the most selective or the most expensive, you can add | count() after each filter while removing the rest of filters. This helps you see how many logs each filter matches and gives you an idea about their performance:

      _time:5m level:error | count()

      _time:5m host:"api-" | count()

      _time:5m "database" | count()

The _time filter is the essential one - if it is missing, then VictoriaLogs may need to scan logs across the whole retention period instead of a specific time range. The _time filter allows reducing the amount of logs to scan to the given time range only. Note that Web UI for VictoriaLogs and Grafana plugin for VictoriaLogs automatically set the _time filter to the selected time range, so there is no need to specify it manually in the query.
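
The per-filter queries shown above can be generated with a small helper when there are many filters to check. This is a minimal sketch; the function name is hypothetical, and the filters are passed as ready-made LogsQL strings.

```python
def per_filter_count_queries(time_filter, filters):
    """Build one `<time_filter> <filter> | count()` query per filter."""
    return [f"{time_filter} {flt} | count()" for flt in filters]


queries = per_filter_count_queries(
    "_time:5m",
    ["level:error", 'host:"api-"', '"database"'],
)
for query in queries:
    print(query)
# _time:5m level:error | count()
# _time:5m host:"api-" | count()
# _time:5m "database" | count()
```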

Test stream filters in the query #

If the query doesn’t contain log stream filters , VictoriaLogs cannot preselect matching log streams and must search across all the log streams on the selected time range. If you add a log stream filter , like this:

      {app="nginx"}

Then VictoriaLogs searches only the log streams matching this stream filter and skips data blocks belonging to other streams. This is much faster. So, having a good log stream filter is important for query performance.

However, if your log stream has a stream field like app="nginx" but you write your filter as:

      app:=nginx

Then VictoriaLogs treats it as a regular “exact match” filter , so it will not be as fast as the corresponding stream filter. Make sure to use the correct stream filter syntax. See stream filters docs for details.

Check the number of unique log streams #

Log stream filters can help improve query performance, but they are not a magic fix for everything. Watch out for the following common problems:

If you have too many log streams, and each stream only covers a few logs, query performance can drop significantly.
If the log stream you are searching in covers a large number of logs (e.g., hundreds of millions and more), searching in that stream can be slow.

To check the number of log streams on the given time range, keep only the time filter and add | count_uniq(_stream_id) at the end of the query (see count_uniq docs ). For example, to see how many log streams you have in the last day:

      _time:1d | count_uniq(_stream_id)

The result could be:

      {
        "count_uniq(_stream_id)": "954"
      }

This means that the logs over the last day contain 954 unique log streams.

The following query returns top 10 log streams with the biggest number of log entries (it uses top pipe ):

      _time:1d | top 10 by (_stream)

The following query returns the number of unique log streams and the number of logs for the {app="nginx"} stream filter over the last day:

      _time:1d {app="nginx"}
        | stats
            count_uniq(_stream) as streams,
            count() as logs

It uses stats pipe .

Streams with a small number of logs usually appear when one or more stream fields have too many different values. In these cases it is better to remove those fields from the set of log stream fields - see these docs.

Identify the most costly parts of the query #

To see which parts of your logs take up the most space or slow down searches, you can use the block_stats pipe . It returns detailed per-block statistics for your data.

Start with your usual query. Then add the pipe | keep <field list> | block_stats:

      _time:1d | keep kubernetes.pod_name, kubernetes.pod_namespace | block_stats

The keep pipe keeps only the enumerated log fields and removes the others, so you get statistics just for the fields you care about. Include the fields you want to analyze with block_stats.

Sometimes, the raw numbers returned by block_stats pipe are still too detailed to be useful. You can add the stats pipe to summarize the numbers:

      _time:1d
        | keep kubernetes.pod_name, kubernetes.pod_namespace
        | block_stats
        | stats by (field)
            sum(values_bytes)  values_bytes_on_disk,
            sum(rows)          rows
        | sort by (values_bytes_on_disk) desc

Example output:

      values_bytes_on_disk: 561  field: kubernetes.pod_name       rows: 172
      values_bytes_on_disk: 101  field: kubernetes.pod_namespace  rows: 172

Summing up value bytes and rows lets you see, at a glance, which fields occupy the most disk space or force VictoriaLogs to scan more data.

When you know which fields are expensive, you can decide whether to drop the noisy field from the query, split it out, or change your filters to avoid reading extra data.

You can find more details here: How to determine which log fields occupy the most of disk space? .

It might be useful to add the query_stats pipe to the end of the query in order to understand how much data of different types the query reads and processes.

Profile pipes incrementally #

Suppose you need to profile and optimize the following query:

      _time:5m -"cannot open file" error
        | extract "user_id=(<uid>)"
        | top 5 by (uid)

Drop all the pipes from the query and leave only the time range filter like _time:5m. This query returns all the logs on the given time range. If the query is executed via the built-in web UI or via the Grafana plugin for VictoriaLogs , then just leave * in the query input field, since both the web UI and Grafana plugin for VictoriaLogs automatically filter logs on the selected time range. Add | count() at the end of the query and measure the time it takes to execute. This provides a baseline for the time spent on reading and scanning logs on the selected time range (additional filters and pipes will add extra work on top of this baseline). The query also returns the number of logs on the selected time range, which is an upper bound for the number of logs that may be processed by subsequent query steps:

      _time:5m | count()

Then add filters from the original query one by one and measure the resulting query performance. Try different filters from the original query, leaving the filter that executes faster for each step.

      _time:5m error | count()

      _time:5m error -"cannot open file" | count()

If you hit a slow filter, try replacing it with a faster and more specific filter. See the performance tips for details. For example, negative phrase filters such as -"cannot open file" may be slower than more specific positive filters. In some cases it may be better to explicitly select the needed logs with contains_any(phrase1, ..., phraseN), where phrase1, …, phraseN are phrases seen in the logs you want to select:

      _time:5m error contains_any("access denied", "unauthorized", "403") | count()

After all the needed filters are added to the query, look at the number of matching logs. If the number is too big (e.g. exceeds tens of millions), then, probably, more specific filters can be added to the query in order to reduce the number of logs to process by the pipes . For example, adding phrase filters on constant string parts from the extract pattern can significantly reduce the number of logs to process by the extract pipe:

      _time:5m error contains_any("access denied", "unauthorized", "403") "user_id=(" | count()

Then add pipes from the original query one by one and measure the query duration for each step:

      _time:5m error contains_any("access denied", "unauthorized", "403") "user_id=("
        | extract "user_id=(<uid>)"
        | count()

      _time:5m error contains_any("access denied", "unauthorized", "403") "user_id=("
        | extract "user_id=(<uid>)"
        | top 5 by (uid)
        | count()

If the query becomes slow or starts using a lot of RAM after adding the next filter or pipe, then you can quickly identify which part of the query needs optimization.
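
The incremental procedure above can be automated by generating the sequence of queries to run and time. This is a minimal sketch; the function name is hypothetical, and running the queries and measuring their duration is left to the querying method in use.

```python
def incremental_queries(time_filter, filters, pipes):
    """Return the queries to run step by step, each ending with `| count()`:
    the baseline, then one query per added filter, then one per added pipe."""
    queries = [f"{time_filter} | count()"]
    selected = time_filter
    for flt in filters:
        selected = f"{selected} {flt}"
        queries.append(f"{selected} | count()")
    piped = selected
    for pipe in pipes:
        piped = f"{piped} | {pipe}"
        queries.append(f"{piped} | count()")
    return queries


steps = incremental_queries(
    "_time:5m",
    ["error", '-"cannot open file"'],
    ['extract "user_id=(<uid>)"', "top 5 by (uid)"],
)
for step in steps:
    print(step)
# _time:5m | count()
# _time:5m error | count()
# _time:5m error -"cannot open file" | count()
# _time:5m error -"cannot open file" | extract "user_id=(<uid>)" | count()
# _time:5m error -"cannot open file" | extract "user_id=(<uid>)" | top 5 by (uid) | count()
```

The first step whose duration or memory usage jumps points at the filter or pipe that needs optimization.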

It might be useful to add the query_stats pipe to the end of the query in order to understand how much data of different types the query reads and processes.

If you find a slow filter or pipe, try these ideas:

Regex matching and JSON parsing are expensive. Use faster alternatives if you can. See performance tips .
Sorting without a limit with sort pipe stores all the sorted logs in memory (or may fail if it needs too much memory). Add a limit or reduce the input number of logs.
High-cardinality functions like count_uniq() track unique values in memory (up to the configured limit, if it is set). Think how to reduce the number of unique values to track.
Large group counts in stats by (...) can use a lot of memory. Filter or transform your data to reduce the number of groups.
