This is a tutorial for the migration from SQL to LogsQL. It is expected you are familiar with SQL and know how to execute queries at VictoriaLogs.
data model #
SQL is usually used for querying relational tables. Every such table contains a pre-defined set of columns with pre-defined types. LogsQL is used for querying logs. Logs are stored in log streams. So log streams is an analogue of tables in relational databases. Log streams and relational tables have the following major differences:
- Log streams are created automatically when the first log entry (row) is ingested into them.
- There is no pre-defined scheme in log streams - logs with arbitrary set of fields can be ingested into every log stream. Both names and values in every log entry have string type. They may contain arbitrary string data.
- Every log entry (row) can be represented as a flat JSON object:
{"f1":"v1",...,"fN":"vN"}
. See these docs. - By default VictoriaLogs selects log entries across all the log streams. The needed set of log streams can be specified via stream filters.
- By default VictoriaLogs returns all the fields across the selected logs. The set of returned fields
can be limited with
fields
pipe.
query structure #
SQL query structure is quite convoluted:
SELECT
<fields, aggregations, calculations, transformations>
FROM <table>
<optional JOINs>
<optional filters with optional subqueries>
<optional GROUP BY>
<optional HAVING>
<optional ORDER BY>
<optional LIMIT and OFFSET>
<optional UNION>
LogsQL query structure is much simpler:
<filters>
| <optional_pipe1>
| ...
| <optional_pipeN>
The <filters>
part selects the needed logs (rows) according to the provided filters.
Then the provided pipes are executed sequentlially.
Every such pipe receives all the rows from the previous stage, performs some calculations and/or transformations,
and then pushes the resulting rows to the next stage. This simplifies reading and understanding the query - just read it from the beginning
to the end in order to understand what does it do at every stage.
LogsQL pipes cover all the functionality from SQL: aggregations, calculations, transformations, subqueries, joins, post-filters, sorting, etc. See the conversion rules on how to convert SQL to LogsQL.
conversion rules #
The following rules must be used for converting SQL query into LogsQL query:
- If the SQL query contains
WHERE
, then convert it into LogsQL filters. Otherwise just start LogsQL query with*
. For example,SELECT * FROM table WHERE field1=value1 AND field2<>value2
is converted intofield1:=value1 field2:!=value2
, whileSELECT * FROM table
is converted into*
. IN
subqueries insideWHERE
must be converted intoin
filters. For example,SELECT * FROM table WHERE id IN (SELECT id2 FROM table)
is converted intoid:in(* | fields id2)
.- If the
SELECT
part isn’t equal to*
and there are noGROUP BY
/ aggregate functions in the SQL query, then enumerate the selected columns atfields
pipe. For example,SELECT field1, field2 FROM table
is converted into* | fields field1, field2
. - If the SQL query contains
JOIN
, then convert it intojoin
pipe. - If the SQL query contains
GROUP BY
/ aggregate functions, then convert them tostats
pipe. For example,SELECT count(*) FROM table
is converted into* | count()
, whileSELECT user_id, count(*) FROM table GROUP BY user_id
is converted to* | stats by (user_id) count()
. - If the SQL query contains additional calculations and/or transformations at the
SELECT
, which aren’t covered yet byGROUP BY
, then convert them into the corresponding LogsQL pipes. The most frequently used pipes aremath
andformat
. For example,SELECT field1 + 10 AS x, CONCAT("foo", field2) AS y FROM table
is converted into* | math field1 + 10 as x | format "foo<field2>" as y | fields x, y
. - If the SQL query contains
HAVING
, then convert it intofilter
pipe. For example,SELECT user_id, count(*) AS c FROM table GROUP BY user_id HAVING c > 100
is converted into* | stats by (user_id) count() c | filter c:>100
. - If the SQL query contains
ORDER BY
,LIMIT
andOFFSET
, then convert them intosort
pipe. For example,SELECT * FROM table ORDER BY field1, field2 LIMIT 10 OFFSET 20
is converted into* | sort by (field1, field2) limit 10 offset 20
. - If the SQL query contains
UNION
, then convert it intounion
pipe. For exampleSELECT * FROM table WHERE filters1 UNION ALL SELECT * FROM table WHERE filters2
is converted intofilters1 | union (filters2)
.
SQL queries are frequently used for obtaining top N column values, which are the most frequently seen in the selected rows.
For example, the query below returns top 5 user_id
values, which present in the biggest number of rows:
SELECT user_id, count(*) hits FROM table GROUP BY user_id ORDER BY hits DESC LIMIT 5
LogsQL provides a shortcut syntax with top
pipe for this case:
* | top 5 (user_id)
It is equivalent to the longer LogsQL query:
* | by (user_id) count() hits | sort by (hits desc) limit 5
LogsQL pipes support much wider functionality comparing to SQL, so spend your spare time by reading pipe docs and playing with them at VictoriaLogs playground.