
Overview

The Tenzir Query Language (TQL) is a dataflow language designed for processing unstructured byte streams and semi-structured events.

TQL is built on a powerful streaming execution engine, but it shields you from the complexity of low-level data processing. It provides a rich set of building blocks to create intricate pipelines that collect, transform, and route data. You can also embed your TQL programs into reusable packages to create one-click deployable use cases.

[Diagram: TQL architecture. Streaming Execution Engine: high-throughput columnar dataflow processing, a DAG executor on top of Apache Arrow. Actor Model Runtime: open-source, scalable C++ runtime for executing micro-service workloads. Tenzir Query Language: expressive, type-rich pipelines (operators, functions, contexts) built for AI to orchestrate. Integrations: native connectors to the industry ecosystem (Syslog, Kafka, S3, GCP, Azure, ...). Packages: one-click deployable use cases (data sources, threat intel, OCSF mappings) that provide instant ROI. Contextualization Engine.]

Security practitioners need to collect, transform, and analyze telemetry without the complexity of general-purpose data engineering tools. We created TQL to meet this need by synthesizing the best ideas from languages that security teams already use:

  • Splunk SPL’s familiarity: Operators that security analysts recognize
  • Kusto’s power: Rich aggregation and time-series capabilities
  • Unix pipes’ composability: Small, focused operators that chain together
  • jq’s flexibility: Powerful transformations on semi-structured data

TQL combines the power of a streaming execution engine with an intuitive pipeline syntax that mirrors how practitioners think about data processing. This approach is validated by nearly every modern SIEM query language (Splunk SPL, Elastic ES|QL, Microsoft KQL) and offers several advantages over traditional query languages like SQL.

TQL seamlessly handles both real-time and historical analysis. Unlike traditional tools that require separate codebases for streaming and batch workflows, TQL uses the same pipeline logic for both.

Process archived data from a data lake:

from_file "s3://bucket/logs/2024-01/*.parquet"
where timestamp > 2024-01-15T00:00:00

Or monitor a live stream from a message bus:

from "kafka://topic:9092"
where timestamp > now() - 1h

TQL draws inspiration from Unix pipes, where data flows through a sequence of transformations. But unlike shell pipelines that primarily work on text, TQL operates on both unstructured data (bytes) and structured data (events).
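
For instance, here is a minimal sketch of a pipeline that crosses from the bytes domain into the events domain (assuming Tenzir's load_file, read_syslog, and head operators; the file path is illustrative):

load_file "/var/log/auth.log" // bytes: load raw file contents
read_syslog                   // parse the byte stream into structured events
head 10                       // operate on events: keep the first ten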

Unlike traditional databases that require strict schemas, TQL embraces heterogeneous data as a first-class concept. Real-world data pipelines process multiple event types simultaneously—firewall logs, DNS queries, authentication events—each with different schemas.

TQL’s operators are polymorphic, adapting to different schemas at runtime:

// Single pipeline processing multiple event types
from "mixed_security_logs.json"
where timestamp > now() - 1h
where severity? == "high" or risk_score? > 0.8
select timestamp,
  event_type=@name, // Capture the schema name
  message,          // Common field
  src_ip?,          // Present in network events
  username?,        // Present in auth events
  dns_query?        // Present in DNS events

This philosophy enables powerful patterns:

  • Type-aware aggregation: Group and aggregate across different schemas (see the sketch after this list)
  • Unified processing: Apply common transformations to diverse data
  • Schema evolution: Handle changing schemas without pipeline updates
  • Mixed-source correlation: Join events from different systems
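
As a sketch of type-aware aggregation, the following counts events per schema (assuming a summarize operator with a count() aggregation, grouped by the schema metadata @name; the file name is illustrative):

from "mixed_security_logs.json"
summarize events=count() by @name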

The language documentation is organized into four main sections:

  • Types: TQL’s type system with domain-specific types for security and network data
  • Expressions: The computational core of TQL, from literals to complex evaluations
  • Statements: Control and structure with bindings, operators, and control flow
  • Programs: Complete data processing workflows and execution model
