This guide provides an overview of data collection in TQL. You’ll learn about the different approaches for ingesting data from various sources.
TQL provides two types of input operators:
- `from_*` operators like `from_file` and `from_http` read bytes and parse them using a subpipeline.
- Direct event operators like `from_kafka` and `from_udp` produce structured events directly, without an intermediate byte stream.
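In practice, the difference is whether a parsing step sits between the source and the resulting events. The sketch below contrasts the two styles; it assumes that `from_file` accepts an optional parsing subpipeline in braces, that a `read_csv` parser operator is available, and it uses illustrative paths and topic names:

```tql
// Byte-stream style: from_file reads raw bytes and the braced subpipeline
// parses them into events. Without the braces, the format is inferred from
// the file extension. (read_csv is an assumed parser operator.)
from_file "/data/logins.csv" { read_csv }

// Direct-event style: from_kafka yields structured events right away,
// so no parsing step is needed.
from_kafka "security-events"
```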
## Collection patterns

Different data sources require different collection approaches.
### Files and cloud storage

Read local files, watch directories for changes, or access cloud storage:

```tql
// Single file with automatic format detection
from_file "/var/log/app.json"

// Watch a directory for new files
from_file "/incoming/*.csv", watch=true

// Cloud storage with glob patterns
from_file "s3://bucket/data/**/*.parquet"
```

See the file reading guide for details.
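Collected files can feed directly into further operators in the same pipeline. A minimal sketch, assuming `where` and `select` operators for filtering and projection, with illustrative field names:

```tql
// Watch a drop directory and keep only error records.
// severity, timestamp, host, and message are illustrative field names.
from_file "/incoming/*.csv", watch=true
where severity == "error"
select timestamp, host, message
```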
### HTTP and APIs

Fetch data from web APIs with authentication, pagination, and retry handling:

```tql
from_http "https://api.example.com/events", headers={"Authorization": "Bearer " + secret("API_TOKEN")}
```

See the HTTP and API guide for pagination patterns and advanced configurations.
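Fetched results can also be fanned out to other pipelines. A sketch assuming a `publish` operator and an illustrative topic name:

```tql
// Fetch events from the API and publish them on an internal topic
// for other pipelines to consume ("api-events" is an illustrative topic).
from_http "https://api.example.com/events", headers={"Authorization": "Bearer " + secret("API_TOKEN")}
publish "api-events"
```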
### Message brokers

Subscribe to topics or queues from Apache Kafka, AMQP, Amazon SQS, and Google Cloud Pub/Sub:

```tql
from_kafka "security-events", offset="end"
```

See the message broker guide for broker-specific configurations.
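To persist what arrives from the broker, the subscription can be chained with storage. A sketch assuming an `import` operator that stores events at the node:

```tql
// Subscribe at the end of the topic and store incoming events at the node.
// (import is assumed to persist events in the node's storage engine.)
from_kafka "security-events", offset="end"
import
```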
### Network data

Receive data over TCP or UDP sockets, or capture packets from network interfaces:

```tql
// UDP syslog receiver
from_udp "0.0.0.0:514"

// TCP with TLS
from "tcp://0.0.0.0:8443", tls=true
```

See the network data guide for socket configurations and packet capture.
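Received network data can be filtered in the same pipeline before it goes anywhere else. A sketch assuming a `where` operator and an illustrative numeric severity field:

```tql
// Receive syslog over UDP and keep only high-severity messages.
// (severity is an illustrative field name; 0-3 covers emergency through error.)
from_udp "0.0.0.0:514"
where severity <= 3
```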
## Sending data to destinations

For routing data to outputs, see the Routing guides, which cover destination operators, file output, load balancing, and pipeline connections.