TQL programs compose statements into complete, executable data processing workflows. Valid TQL programs adhere to the following rules:
- Adjacent operators must have matching types: the output type of an operator must equal the input type of its successor.
- A pipeline must be closed, i.e., begin with void input and end with void output (see the annotated sketch below).
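To make these rules concrete, here is a minimal sketch of a closed pipeline with each operator's input and output type annotated. It assumes the `write_json` and `save_file` operators for the printing and saving steps; the paths are placeholders.
```tql
from "/tmp/logs.json"          // void → events: load and parse a file
where port in [22, 443]        // events → events: adjacent types match
write_json                     // events → bytes: print events as JSON
save_file "/tmp/filtered.json" // bytes → void: the pipeline ends closed
```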
Statement chaining

You chain statements with either a newline (`\n`) or a pipe symbol (`|`). We purposefully offer this choice to cater to two primary styles:
- Vertical structuring with newlines for full-text editing
- Horizontal inline pipe composition for command-line usage
Prefer the vertical approach for readability in files and documentation. Throughout this documentation, we only use the vertical style for clarity and consistency.
Let’s juxtapose the two styles. Here’s a vertical TQL program:
```tql
let $ports = [22, 443]
from "/tmp/logs.json"
where port in $ports
select src_ip, dst_ip, bytes
summarize src_ip, total=sum(bytes)
```
And here's a horizontal one:
```tql
let $ports = [22, 443] | from "/tmp/logs.json" | where port in $ports | select src_ip, dst_ip, bytes | summarize src_ip, total=sum(bytes)
```
In theory, you can combine pipes and newlines to write programs that resemble Kusto and similar languages. However, we discourage this practice because it can make the code harder to read and maintain, especially when nested pipelines increase the level of indentation.
Diagnostics

TQL’s diagnostic system is designed to give you insights into what happens during data processing. There are two types of diagnostics:
- Errors: Stop pipeline execution immediately (critical failures)
- Warnings: Signal data quality issues but continue processing
When a pipeline emits an error, it stops executing. Unless you have configured the pipeline to restart on error, resolving the issue and resuming execution then requires human intervention.
Warnings do not cause a screeching halt of the pipeline. They are useful for identifying potential issues that may impact the quality of the processed data, such as missing or unexpected values.
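For example, referencing a field that an event does not carry is the kind of issue that typically surfaces as a warning rather than an error. The following is a hedged sketch; the exact diagnostic wording depends on your Tenzir version:
```tql
from {src_ip: 10.0.0.1, bytes: 512}
where port in [22, 443] // `port` is absent from the event: expect a warning, not a halt
```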
Pipeline nesting

Operators can contain entire subpipelines that execute based on the operator’s semantics. For example, the `every` operator executes its subpipeline at regular intervals:
```tql
every 1h {
  from_http "api.example.com"
  select domain, risk
  context::update "domains", key=domain, value=risk
}
```
You define subpipelines syntactically within a block of curly braces (`{}`).
Some operators require that you define a closed (void-to-void) pipeline, whereas others exhibit parsing (bytes-to-events) or printing (events-to-bytes) semantics.
Comments

Comments make implicit choices and assumptions explicit. They have no semantic effect, and the compiler ignores them during parsing.

TQL features C-style comments, both single-line and multi-line.
Single-line comments

Use a double slash (`//`) to comment until the end of the line.
Here’s an example where a comment spans a full line:
```tql
// the app only supports lower-case user names
let $user = "jane"
```
Here’s an example where a comment starts in the middle of a line:
let $users = [ "jane", // NB: also admin! "john", // Been here since day 1.]
Multi-line comments

Use a slash-star (`/*`) to start a multi-line comment and a star-slash (`*/`) to end it.
Here’s an example where a comment spans multiple lines:
```tql
/*
 * User validation logic
 * ---------------------
 * Validate user input against a set of rules.
 * If any rule fails, the user is rejected.
 * If all rules pass, the user is accepted.
 */
let $user = "jane"
```
Execution Model

TQL pipelines execute on a streaming engine that processes data incrementally. Understanding the execution model helps you write efficient pipelines and predict performance characteristics.
Key execution principles:
- Stream processing by default: Data flows through operators as it arrives (see the sketch after this list)
- Lazy evaluation: Operations execute only when data flows through them
- Back-pressure handling: Automatic flow control prevents memory exhaustion
- Network transparency: Pipelines can span multiple nodes seamlessly
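As a small illustration of the streaming and lazy-evaluation principles, consider the following sketch. Assuming the `export` operator streams events stored at a node and `head` limits its input to the first n events, the pipeline typically finishes as soon as `head` has received its ten events instead of materializing the node’s entire history:
```tql
export   // stream events stored at the node
head 10  // keep only the first 10 events; completion propagates upstream
```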
Streaming vs blocking

Understanding whether an operator streams or blocks helps you write efficient pipelines:
Streaming operators process events incrementally:
- `where`: Filters one event at a time
- `select`: Transforms fields immediately
- `drop`: Removes fields as events flow
Blocking operators need all input before producing output:
- `sort`: Must see all events to order them
- `summarize`: Aggregates across the entire stream
- `reverse`: Needs complete input to reverse order
Efficient: streaming operations first:
from "large_file.json"where severity == "critical" // Streaming: reduces data earlyselect relevant_fields // Streaming: drops unnecessary datasort timestamp // Blocking: but on reduced dataset
Less efficient: blocking operation on full data:
```tql
from "large_file.json"
sort timestamp               // Blocking: processes everything
where severity == "critical" // Then filters
```
Constant vs runtime evaluation

Expressions bound with `let` evaluate once when the pipeline starts, whereas expressions inside operators evaluate for every event. Understanding when expressions evaluate helps you write efficient pipelines:
```tql
let $threshold = 1Ki
let $start_time = 2024-01-15T09:00:00 // Would be now() - 1h in real usage
let $config = {
  ports: [80, 443, 8080],
  networks: [10.0.0.0/8, 192.168.0.0/16],
}

// Runtime: evaluated per event
from {bytes: 2Ki, timestamp: 2024-01-15T09:30:00},
     {bytes: 512, timestamp: 2024-01-15T09:45:00},
     {bytes: 3Ki, timestamp: 2024-01-15T10:00:00}
where bytes > $threshold      // Constant comparison
where timestamp > $start_time // Constant comparison
current_time = 2024-01-15T10:30:00 // Would be now() in real usage
age = current_time - timestamp     // Runtime calculation
```
Network transparency

TQL pipelines can span network boundaries seamlessly. For example, the `import` operator implicitly performs a network connection based on where it runs. If the `tenzir` binary executes the pipeline, the executor establishes a transparent network connection. If the pipeline runs within a node, the executor passes the data directly to the next operator in the same process.
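As a concrete sketch (the path is a placeholder), the same program works in both settings without any change to the pipeline itself:
```tql
from "/tmp/logs.json" // read and parse events locally
import                // hand events to the node: over the network with the `tenzir`
                      // binary, in-process when the pipeline runs inside a node
```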