Retrieves metrics events from a Tenzir node.
    metrics [name:string, live=bool, retro=bool, shape=string]
Description
The metrics operator retrieves metrics events from a Tenzir node. Metrics
events are collected every second.
name: string (optional)
Show only metrics with the specified name. For example, metrics "cpu" only shows
CPU metrics.
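As a small illustration of the name filter, the following sketch restricts the output to disk metrics from the last hour. The field names come from the tenzir.metrics.disk schema in this document; the one-hour window is an arbitrary choice for the example.

```tql
// Only events with the schema tenzir.metrics.disk are returned.
metrics "disk"
// Keep the most recent hour (the window is an assumption for illustration).
where timestamp > now() - 1h
select timestamp, path, free_bytes
```

Omitting the name returns events of all metrics schemas, so filtering by name early keeps the rest of the pipeline simpler.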
live = bool (optional)
Work on all metrics events as they are generated in real-time instead of on metrics events persisted at a Tenzir node.
retro = bool (optional)
Work on persisted metrics events (first), even when live is given.
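The two flags can be combined. The sketch below, based on the parameter descriptions above, should first return the persisted CPU metrics and then continue with newly generated ones; treat the exact hand-over behavior between the two phases as an assumption.

```tql
// retro=true: start with metrics already persisted at the node.
// live=true: keep running and emit new metrics as they are generated.
metrics "cpu", live=true, retro=true
select timestamp, loadavg_1m
```

Because of live=true, this pipeline does not terminate on its own.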
shape = string (optional)
Controls the output shape. The default is "raw", which preserves the native
tenzir.metrics.* schemas listed below.
Set shape="prometheus" to transform metrics into canonical records that are
compatible with Prometheus-oriented pipelines. You can send this shape directly
to to_prometheus.
    {
      metric: string,
      value: double,
      timestamp: time,
      labels: record,
      type: string,
      unit: string,
    }
The Prometheus shape recursively flattens numeric metric fields into individual
records. Duration values are converted to seconds and emitted with a _seconds
metric suffix.
Only selected low-cardinality dimensions become labels:
- Disk metrics: path.
- Memory metrics: name (for the per-actor allocator statistics).
- CAF metrics: name.
CPU and process metrics carry no labels.
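To make the label handling more concrete, here is a minimal sketch that reads disk metrics in the Prometheus shape and filters on the path label. The fields metric, value, unit, and labels come from the record layout above; the path value "/" is a hypothetical example, and the concrete metric names that the flattening produces are not shown here because this document does not list them.

```tql
// Each numeric field of a disk metric becomes its own record.
metrics "disk", shape="prometheus"
// "/" is an assumed mount point; substitute a path that exists on your node.
where labels.path == "/"
select timestamp, metric, value, unit
```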
Schemas
Tenzir collects metrics with the following schemas.
tenzir.metrics.actor
Contains the current mailbox size of selected core actors (such as node,
importer, and index), emitted once per second per instrumented actor.
    {
      timestamp: time, // The time at which this metric was recorded.
      id: string, // The unique ID of the actor.
      name: string, // The name of the actor.
      inbox_size: uint64, // The number of messages currently in the actor's mailbox.
    }
tenzir.metrics.api
Contains information about all accessed API endpoints, emitted once per second.
    {
      timestamp: time, // The time at which the API request was received.
      request_id: string, // The unique request ID assigned by the Tenzir Platform.
      method: string, // The HTTP method used to access the API.
      path: string, // The path of the accessed API endpoint.
      response_time: duration, // The time the API endpoint took to respond.
      status_code: uint64, // The HTTP status code of the API response.
      params: record, // The parameters passed to the API endpoint.
    }
The schema of the record params depends on the API endpoint used. Refer to the
API documentation to see the available parameters per endpoint.
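As a sketch of how these fields can be used, the following pipeline lists the API endpoints accessed in the last hour together with their request count and slowest response. It relies only on the path, response_time, and timestamp fields from the schema above; the one-hour window and the output field names requests and slowest are choices made for this example.

```tql
metrics "api"
where timestamp > now() - 1h
// One output event per endpoint path.
summarize path, requests=count(), slowest=max(response_time)
sort -slowest
```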
tenzir.metrics.caf
Contains metrics about the CAF (C++ Actor Framework) runtime system.
    {
      system: { // Metrics about the CAF actor system.
        running_actors: int64, // Number of currently running actors.
        running_actors_by_name: [{ // Number of running actors, grouped by actor name.
          name: string, // Actor name.
          count: int64, // Number of actors with this name currently running.
        }],
        all_messages: { // Information about the total message metrics.
          processed: int64, // Number of processed messages.
          rejected: int64, // Number of rejected messages.
        },
        messages_by_actor: list[{ // List of metrics, grouped by actor.
          name: string, // Name of the receiving actor. This may be null for messages without an associated actor.
          processed: int64, // Number of processed messages.
          rejected: int64, // Number of rejected messages.
        }],
      },
      middleman: { // Metrics about CAF's network layer.
        inbound_messages_size: int64, // Size of received messages in bytes since last metric.
        outbound_messages_size: int64, // Size of sent messages in bytes since last metric.
        serialization_time: duration, // Time spent serializing messages since last metric.
        deserialization_time: duration, // Time spent deserializing messages since last metric.
      },
      actors: list[{ // Per-actor metrics for all running actors.
        name: string, // Name of the actor.
        processing_time: duration, // Time spent processing messages since last metric.
        mailbox_time: duration, // Time messages spent in mailbox since last metric.
        mailbox_size: int64, // Current number of messages in actor's mailbox.
      }],
    }
tenzir.metrics.cpu
Contains a measurement of CPU utilization.
    {
      timestamp: time, // The time at which this metric was recorded.
      loadavg_1m: double, // The load average over the last minute.
      loadavg_5m: double, // The load average over the last 5 minutes.
      loadavg_15m: double, // The load average over the last 15 minutes.
    }
tenzir.metrics.disk
Contains a measurement of disk space usage.
    {
      timestamp: time, // The time at which this metric was recorded.
      path: string, // The byte measurements below refer to the filesystem on which this path is located.
      total_bytes: uint64, // The total size of the volume, in bytes.
      used_bytes: uint64, // The number of bytes occupied on the volume.
      free_bytes: uint64, // The number of bytes still free on the volume.
    }
tenzir.metrics.enrich
Contains a measurement of the enrich operator, emitted once every second.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `enrich` operator in the pipeline.
      context: string, // The name of the context the associated operator is using.
      events: uint64, // The amount of input events that entered the `enrich` operator since the last metric.
      hits: uint64, // The amount of successfully enriched events since the last metric.
    }
tenzir.metrics.export
Contains a measurement of the export operator, emitted once every second per
schema. Note that internal events like metrics or diagnostics do not emit
metrics themselves.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `export` operator in the pipeline.
      schema: string, // The schema name of the batch.
      schema_id: string, // The schema ID of the batch.
      events: uint64, // The amount of events that were exported.
      queued_events: uint64, // The total amount of events that are enqueued in the export.
    }
tenzir.metrics.import
Contains a measurement of the import operator, emitted once every second per
schema. Note that internal events like metrics or diagnostics do not emit
metrics themselves.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `import` operator in the pipeline.
      schema: string, // The schema name of the batch.
      schema_id: string, // The schema ID of the batch.
      events: uint64, // The amount of events that were imported.
    }
tenzir.metrics.ingest
Contains a measurement of all data ingested into the database, emitted once per second per schema.
    {
      timestamp: time, // The time at which this metric was recorded.
      schema: string, // The schema name of the batch.
      schema_id: string, // The schema ID of the batch.
      events: uint64, // The amount of events that were ingested.
    }
tenzir.metrics.lookup
Contains a measurement of the lookup operator, emitted once every second.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `lookup` operator in the pipeline.
      context: string, // The name of the context the associated operator is using.
      live: { // Information about the live lookup.
        events: uint64, // The amount of input events used for the live lookup since the last metric.
        hits: uint64, // The amount of live lookup matches since the last metric.
      },
      retro: { // Information about the retroactive lookup.
        events: uint64, // The amount of input events used for the lookup since the last metric.
        hits: uint64, // The amount of lookup matches since the last metric.
        queued_events: uint64, // The total amount of events that were in the queue for the lookup.
      },
      context_updates: uint64, // The amount of times the underlying context has been updated while the associated lookup is active.
    }
tenzir.metrics.memory
Contains statistics about allocated memory.
    {
      timestamp: time, // The time at which this metric was recorded.
      system: { // Information about the system's memory state.
        total_bytes: uint64, // Total available memory in the system.
        used_bytes: uint64, // Amount of memory used on the system.
        free_bytes: uint64, // Amount of free memory on the system.
      },
      process: {
        peak_bytes: uint64, // Peak memory usage during the runtime of the process.
        current_bytes: uint64, // Current memory usage of the entire process.
        swap_bytes: uint64, // Swap space used by the process.
      },
      procfs: { // Memory statistics read from the Linux procfs. Only available on Linux.
        status: { // Values parsed from `/proc/self/status`.
          vm_rss_bytes: uint64, // Resident set size (physical memory in use).
          vm_data_bytes: uint64, // Size of the data segment.
          vm_swap_bytes: uint64, // Amount of swapped-out memory.
          rss_anon_bytes: uint64, // Resident anonymous memory.
          file_rss_bytes: uint64, // Resident file-backed memory.
          rss_shmem_bytes: uint64, // Resident shared memory.
        },
        smaps: { // Values aggregated from `/proc/self/smaps`.
          rss_bytes: uint64, // Resident set size.
          pss_bytes: uint64, // Proportional set size.
          private_clean_bytes: uint64, // Private clean pages.
          private_dirty_bytes: uint64, // Private dirty pages.
          anonymous_rss_bytes: uint64, // Resident anonymous memory.
          swap_bytes: uint64, // Swapped-out memory.
          hugetlb_bytes: uint64, // Memory backed by huge pages.
        },
        heap: {
          break_bytes: uint64, // Size of the heap as defined by the program break.
        },
      },
      arrow: { // Information about memory allocated by Arrow buffers.
        bytes: {
          current: int, // Currently allocated bytes
          peak: int, // Peak allocated bytes during this run
          cumulative: int, // Cumulative allocations during this run
        },
        allocations: {
          current: int, // Number of current allocations
          peak: int, // Peak number of allocations
          cumulative: int, // Cumulative allocations during this run
        },
      },
      cpp: { // Information about memory allocated by `operator new`
        bytes: {
          current: int, // Currently allocated bytes
          peak: int, // Peak allocated bytes during this run
          cumulative: int, // Cumulative allocations during this run
        },
        allocations: {
          current: int, // Number of current allocations
          peak: int, // Peak number of allocations
          cumulative: int, // Cumulative allocations during this run
        },
      },
      c: { // Information about memory allocated by `malloc` and other C/POSIX functions.
        bytes: {
          current: int, // Currently allocated bytes
          peak: int, // Peak allocated bytes during this run
          cumulative: int, // Cumulative allocations during this run
        },
        allocations: {
          current: int, // Number of current allocations
          peak: int, // Peak number of allocations
          cumulative: int, // Cumulative allocations during this run
        },
      },
      actor: [
        {
          name: string, // Name of the operator, actor or thread, whichever is more concrete.
          arrow: { // Information about memory allocated by Arrow buffers from this thread/actor.
            bytes: {
              current: int, // Currently allocated bytes
              peak: int, // Peak allocated bytes during this run
              cumulative: int, // Cumulative allocations during this run
            },
            allocations: {
              current: int, // Number of current allocations
              peak: int, // Peak number of allocations
              cumulative: int, // Cumulative allocations during this run
            },
          },
          cpp: { // Information about memory allocated by `operator new` from this thread/actor.
            bytes: {
              current: int, // Currently allocated bytes
              peak: int, // Peak allocated bytes during this run
              cumulative: int, // Cumulative allocations during this run
            },
            allocations: {
              current: int, // Number of current allocations
              peak: int, // Peak number of allocations
              cumulative: int, // Cumulative allocations during this run
            },
          },
          c: { // Information about memory allocated by `malloc` and other C/POSIX functions from this thread/actor.
            bytes: {
              current: int, // Currently allocated bytes
              peak: int, // Peak allocated bytes during this run
              cumulative: int, // Cumulative allocations during this run
            },
            allocations: {
              current: int, // Number of current allocations
              peak: int, // Peak number of allocations
              cumulative: int, // Cumulative allocations during this run
            },
          },
        }
      ],
      malloc: { // Statistics from glibc's `mallinfo2`. Only available with glibc.
        arena_bytes: uint64, // Total bytes allocated in the main arena.
        uordblks_bytes: uint64, // Total bytes used by in-use allocations.
        fordblks_bytes: uint64, // Total free bytes within the arena.
        keepcost_bytes: uint64, // Releasable free space at the top of the heap.
        hblkhd_bytes: uint64, // Total bytes allocated via `mmap`.
        ordblks_count: uint64, // Number of ordinary (non-fastbin) free blocks.
        smblks_count: uint64, // Number of fastbin free blocks.
      },
      table_slices: { // Memory used by table slices (the in-memory event batches).
        serialized_bytes: uint64, // Bytes used by serialized table slices.
        non_serialized_bytes: uint64, // Bytes used by non-serialized table slices.
        batch_count: uint64, // Number of table slice batches currently alive.
        event_count: uint64, // Number of events across all live table slices.
      },
      chunks: { // Memory used by reference-counted byte chunks.
        bytes: uint64, // Total bytes held by chunks.
        count: uint64, // Number of chunks currently alive.
      },
    }
tenzir.metrics.operator_buffers
Contains the aggregate size of the internal buffers between the operators of a pipeline. Emitted once per second per pipeline that currently has buffered data.
    {
      timestamp: time, // The time at which this metric was recorded.
      pipeline_id: string, // The ID of the pipeline these metrics represent.
      bytes: uint64, // Approximate size of all buffered data in bytes.
      events: uint64, // The number of events currently buffered.
    }
tenzir.metrics.operator_profile
Contains per-operator performance measurements, emitted once every second per
operator. This replaces the operator metrics that were removed in Tenzir v6.
    {
      timestamp: time, // The time at which this metric was recorded.
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      operator_id: string, // The ID of the operator within the pipeline.
      name: string, // The name of the operator.
      input_bytes: uint64, // Approximate number of bytes currently buffered at the operator's input.
      cpu: double, // CPU usage of the operator as a percentage of wall-clock time over the last second.
      task_count: uint64, // The number of scheduling tasks executed since the last metric.
      bytes_in: uint64, // The number of bytes that entered the operator since the last metric.
      bytes_out: uint64, // The number of bytes that left the operator since the last metric.
      batches_in: uint64, // The number of batches that entered the operator since the last metric.
      batches_out: uint64, // The number of batches that left the operator since the last metric.
      events_in: uint64, // The number of events that entered the operator since the last metric.
      events_out: uint64, // The number of events that left the operator since the last metric.
      signals_in: uint64, // The number of control signals that entered the operator since the last metric.
      signals_out: uint64, // The number of control signals that left the operator since the last metric.
    }
tenzir.metrics.pipeline
Contains measurements of data flowing through pipelines, emitted once every 10 seconds.
    {
      timestamp: time, // The time at which this metric was recorded.
      pipeline_id: string, // The ID of the pipeline these metrics represent.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      ingress: { // Measurement of data entering the pipeline.
        duration: duration, // The timespan over which this data was collected.
        events: uint64, // Number of events that passed through during this period.
        bytes: uint64, // Approximate number of bytes that passed through.
        batches: uint64, // Number of batches that passed through.
        internal: bool, // True if the data flow is considered internal to Tenzir.
      },
      egress: { // Measurement of data exiting the pipeline.
        duration: duration, // The timespan over which this data was collected.
        events: uint64, // Number of events that passed through during this period.
        bytes: uint64, // Approximate number of bytes that passed through.
        batches: uint64, // Number of batches that passed through.
        internal: bool, // True if the data flow is considered internal to Tenzir.
      },
    }
tenzir.metrics.platform
Signals whether the connection to the Tenzir Platform is working from the node’s perspective. Emitted once per second.
    {
      timestamp: time, // The time at which this metric was recorded.
      connected: bool, // The connection status.
    }
tenzir.metrics.process
Contains a measurement of the amount of memory used by the tenzir-node process.
    {
      timestamp: time, // The time at which this metric was recorded.
      current_memory_usage: uint64, // The memory currently used by this process.
      peak_memory_usage: uint64, // The peak amount of memory, in bytes.
      swap_space_usage: uint64, // The amount of swap space, in bytes. Only available on Linux systems.
      open_fds: uint64, // The number of file descriptors opened by the node. Only available on Linux systems.
    }
tenzir.metrics.publish
Contains a measurement of the publish operator, emitted once every second per
schema.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `publish` operator in the pipeline.
      topic: string, // The topic name.
      schema: string, // The schema name of the batch.
      schema_id: string, // The schema ID of the batch.
      events: uint64, // The amount of events that were published to the `topic`.
    }
tenzir.metrics.rebuild
Contains a measurement of the partition rebuild process, emitted once every second.
    {
      timestamp: time, // The time at which this metric was recorded.
      partitions: uint64, // The number of partitions currently being rebuilt.
      queued_partitions: uint64, // The number of partitions currently queued for rebuilding.
    }
tenzir.metrics.subscribe
Contains a measurement of the subscribe operator, emitted once every second
per schema.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `subscribe` operator in the pipeline.
      topic: string, // The topic name.
      schema: string, // The schema name of the batch.
      schema_id: string, // The schema ID of the batch.
      events: uint64, // The amount of events that were retrieved from the `topic`.
    }
tenzir.metrics.subscribe_buffer
Contains information about the subscribe operator’s internal buffer, emitted
once every second.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `subscribe` operator in the pipeline.
      bytes: uint64, // Approximate size of buffered data in bytes.
      batches: uint64, // The number of batches currently in the buffer.
      events: uint64, // The number of events currently in the buffer.
    }
tenzir.metrics.tcp
Contains measurements of the number of read and write calls and the bytes transferred per TCP connection.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the operator owning the TCP connection.
      handle: string, // An identifier for the connection (for example, the peer address).
      reads: uint64, // The number of attempted reads since the last metric.
      writes: uint64, // The number of attempted writes since the last metric.
      bytes_read: uint64, // The number of bytes received since the last metric.
      bytes_written: uint64, // The number of bytes written since the last metric.
    }
tenzir.metrics.throttle
Contains metrics for the throttle operator. Emitted once per second, but only
while the operator is dropping events, that is, when it is configured with
drop=true and the rate limit is being exceeded.
    {
      pipeline_id: string, // The ID of the pipeline where the associated operator is from.
      pipeline_name: string, // The name of the pipeline. Note that pipeline names are not unique.
      run: uint64, // The number of the run, starting at 1 for the first run.
      hidden: bool, // Indicates whether the corresponding pipeline is hidden from the list of managed pipelines.
      timestamp: time, // The time at which this metric was recorded.
      operator_id: uint64, // The ID of the `throttle` operator in the pipeline.
      dropped_events: int64, // The number of events dropped since the last metric.
    }
Examples
Sort pipelines by total ingress in bytes
    metrics "pipeline"
    summarize pipeline_id, ingress=sum(ingress.bytes if not ingress.internal)
    sort -ingress

    {pipeline_id: "demo-node/m57-suricata", ingress: 59327586}
    {pipeline_id: "demo-node/m57-zeek", ingress: 43291764}
Show the CPU usage over the last hour
    metrics "cpu"
    where timestamp > now() - 1h
    select timestamp, percent=loadavg_1m

    {timestamp: 2023-12-21T12:00:32.631102, percent: 0.40478515625}
    {timestamp: 2023-12-21T11:59:32.626043, percent: 0.357421875}
    {timestamp: 2023-12-21T11:58:32.620327, percent: 0.42578125}
    {timestamp: 2023-12-21T11:57:32.614810, percent: 0.50390625}
    {timestamp: 2023-12-21T11:56:32.609896, percent: 0.32080078125}
    {timestamp: 2023-12-21T11:55:32.605871, percent: 0.5458984375}
Get the current memory usage
    metrics "memory"
    sort -timestamp
    head 1
    select current_bytes=process.current_bytes

    {current_bytes: 1083031552}
Show the total pipeline ingress in bytes
Show the ingress for every day over the last week, excluding internal data flows:
    metrics "pipeline"
    where timestamp > now() - 1week
    timestamp = floor(timestamp, 1day)
    summarize timestamp, bytes=sum(ingress.bytes if not ingress.internal)

    {timestamp: 2023-11-08T00:00:00.000000, bytes: 79927223}
    {timestamp: 2023-11-09T00:00:00.000000, bytes: 51788928}
    {timestamp: 2023-11-10T00:00:00.000000, bytes: 80740352}
    {timestamp: 2023-11-11T00:00:00.000000, bytes: 75497472}
    {timestamp: 2023-11-12T00:00:00.000000, bytes: 55497472}
    {timestamp: 2023-11-13T00:00:00.000000, bytes: 76546048}
    {timestamp: 2023-11-14T00:00:00.000000, bytes: 68643200}
Show the operators that produced the most events
Show the three operator instantiations that produced the most events in total and their pipeline IDs:
    metrics "operator_profile"
    summarize pipeline_id, operator_id, events=sum(events_out)
    sort -events
    head 3

    {pipeline_id: "70a25089-b16c-448d-9492-af5566789b99", operator_id: "0/0", events: 391008694}
    {pipeline_id: "7842733c-06d6-4713-9b80-e20944927207", operator_id: "0/0", events: 246914949}
    {pipeline_id: "6df003be-0841-45ad-8be0-56ff4b7c19ef", operator_id: "1/0", events: 83013294}
Get the disk usage over time
    metrics "disk"
    sort timestamp
    select timestamp, used_bytes

    {timestamp: 2023-12-21T12:52:32.900086, used_bytes: 461834444800}
    {timestamp: 2023-12-21T12:53:32.905548, used_bytes: 461834584064}
    {timestamp: 2023-12-21T12:54:32.910918, used_bytes: 461840302080}
    {timestamp: 2023-12-21T12:55:32.916200, used_bytes: 461842751488}
Send metrics to Prometheus
    metrics live=true, shape="prometheus"
    to_prometheus "https://prometheus.example/api/v1/write"
Get the memory usage over time
    metrics "memory"
    sort timestamp
    select timestamp, used_bytes=system.used_bytes

    {timestamp: 2023-12-21T13:08:32.982083, used_bytes: 48572645376}
    {timestamp: 2023-12-21T13:09:32.986962, used_bytes: 48380682240}
    {timestamp: 2023-12-21T13:10:32.992494, used_bytes: 48438878208}
    {timestamp: 2023-12-21T13:11:32.997889, used_bytes: 48491839488}
    {timestamp: 2023-12-21T13:12:33.003323, used_bytes: 48529952768}
Get inbound TCP traffic over time
    metrics "tcp"
    sort timestamp
    select timestamp, handle, reads, writes, bytes_read, bytes_written

    {
      timestamp: 2024-09-04T15:43:38.011350,
      handle: "12",
      reads: 884,
      writes: 0,
      bytes_read: 10608,
      bytes_written: 0
    }
    {
      timestamp: 2024-09-04T15:43:39.013575,
      handle: "12",
      reads: 428,
      writes: 0,
      bytes_read: 5136,
      bytes_written: 0
    }
    {
      timestamp: 2024-09-04T15:43:40.015376,
      handle: "12",
      reads: 429,
      writes: 0,
      bytes_read: 5148,
      bytes_written: 0
    }
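The per-second TCP measurements can also be condensed into one total per connection. The following sketch reuses the summarize and sort pattern from the earlier examples with the handle and bytes_read fields of the tenzir.metrics.tcp schema; the one-hour window and the output field name bytes are assumptions made for illustration.

```tql
metrics "tcp"
where timestamp > now() - 1h
// bytes_read is reported per interval, so summing yields the total per connection.
summarize handle, bytes=sum(bytes_read)
sort -bytes
```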