Slice and sample data

When working with data streams, you often need to control which events flow through your pipeline. This guide shows you how to slice event streams, sample data, and control event ordering using TQL operators.

The operators in this guide work on entire event streams:

  • head and tail - Get events from the beginning or end
  • slice - Extract a specific range of events
  • taste - Sample events by schema
  • reverse - Invert the order of events
  • sample - Sample events at time intervals

These operators maintain state between events, unlike functions that work on individual values.

Use the head operator to get the first N events from a stream.

Get the first 3 events:

from {id: 1, value: "a"},
{id: 2, value: "b"},
{id: 3, value: "c"},
{id: 4, value: "d"},
{id: 5, value: "e"}
head 3

{id: 1, value: "a"}
{id: 2, value: "b"}
{id: 3, value: "c"}

Without an argument, head returns the first 10 events. This stream has only three, so all of them pass through:

from {id: 1, value: "first"},
{id: 2, value: "second"},
{id: 3, value: "third"}
head

{id: 1, value: "first"}
{id: 2, value: "second"}
{id: 3, value: "third"}
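
For readers who think in Python, the behavior of head can be approximated with `itertools.islice`. This is only a sketch of the semantics, not Tenzir's implementation; the `head` function below is a hypothetical helper, and the count is always passed explicitly.

```python
from itertools import islice

def head(events, n):
    """Hypothetical model of head: yield at most the first n events."""
    return islice(events, n)

events = [
    {"id": 1, "value": "first"},
    {"id": 2, "value": "second"},
    {"id": 3, "value": "third"},
]

first_two = list(head(events, 2))
# Asking for more events than the stream holds is not an error;
# the result is simply everything that was available.
up_to_ten = list(head(events, 10))

print(first_two)
print(up_to_ten)
```

The second call mirrors the example above: the limit is larger than the input, so every event passes through.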

Use the tail operator to get the last N events.

Get the last 2 events:

from {id: 1, value: "a"},
{id: 2, value: "b"},
{id: 3, value: "c"},
{id: 4, value: "d"}
tail 2

{id: 3, value: "c"}
{id: 4, value: "d"}
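
A rough Python model of tail, assuming a finite input: a bounded `collections.deque` keeps only the most recent n events while the whole stream is read. The `tail` helper is hypothetical and says nothing about how much Tenzir itself buffers internally.

```python
from collections import deque

def tail(events, n):
    """Hypothetical model of tail: read everything, keep the last n events."""
    # A bounded deque drops its oldest entry whenever a new one arrives.
    return list(deque(events, maxlen=n))

events = [{"id": i, "value": v} for i, v in enumerate("abcd", start=1)]

last_two = tail(events, 2)
print(last_two)
```

Nothing can be emitted until the input ends, because any event might still be pushed out of the last n by a later one.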

The slice operator provides fine-grained control over which events to extract.

Get the second through fourth events (begin is 0-indexed and inclusive, end is exclusive):

from {n: 1}, {n: 2}, {n: 3}, {n: 4}, {n: 5}
slice begin=1, end=4

{n: 2}
{n: 3}
{n: 4}

Get every other event starting from the second:

from {n: 1}, {n: 2}, {n: 3}, {n: 4}, {n: 5}, {n: 6}
slice begin=1, stride=2

{n: 2}
{n: 4}
{n: 6}

Skip the first 2 events, then take every 3rd event, stopping before index 10:

from {n: 1}, {n: 2}, {n: 3}, {n: 4}, {n: 5},
{n: 6}, {n: 7}, {n: 8}, {n: 9}, {n: 10}
slice begin=2, end=10, stride=3

{n: 3}
{n: 6}
{n: 9}
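
The three slice examples above follow the same index arithmetic as Python's `itertools.islice`: start at `begin`, stop before `end`, and step by `stride`. The sketch below reproduces them under that assumption. The `slice_events` and `numbered` helpers are hypothetical, and the model covers non-negative arguments only.

```python
from itertools import islice

def slice_events(events, begin=0, end=None, stride=1):
    """Hypothetical model of slice for non-negative begin, end, and stride."""
    return list(islice(events, begin, end, stride))

def numbered(count):
    """Build the test stream {n: 1}, {n: 2}, ..., {n: count}."""
    return [{"n": i} for i in range(1, count + 1)]

range_result = slice_events(numbered(5), begin=1, end=4)
stride_result = slice_events(numbered(6), begin=1, stride=2)
combined_result = slice_events(numbered(10), begin=2, end=10, stride=3)

print(range_result)
print(stride_result)
print(combined_result)
```

In the last call the selected indices are 2, 5, and 8, which correspond to the events with n equal to 3, 6, and 9.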

The taste operator samples events based on their structure, giving you examples of different data shapes in your stream.

See one example of each unique schema:

from {type: "user", name: "alice"},
{type: "user", name: "bob"},
{type: "event", id: 1},
{type: "event", id: 2},
{value: 42}
taste 1

{type: "user", name: "alice"}
{type: "event", id: 1}
{value: 42}

Get up to 2 examples of each schema:

from {x: 1, y: 1},
{x: 2, y: 2},
{x: 3},
{x: 4},
{z: "a"},
{z: "b"}
taste 2

{x: 1, y: 1}
{x: 2, y: 2}
{x: 3}
{x: 4}
{z: "a"}
{z: "b"}
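
One way to picture taste is as a per-schema counter that lets an event through only while its schema has been seen fewer than the requested number of times. The Python sketch below assumes a schema is determined by the field names and value types of an event. Both `schema_of` and `taste` are hypothetical helpers, not part of Tenzir.

```python
from collections import Counter

def schema_of(event):
    """Hypothetical schema key: field names paired with value type names."""
    return tuple((name, type(value).__name__) for name, value in event.items())

def taste(events, limit):
    """Hypothetical model of taste: pass at most `limit` events per schema."""
    seen = Counter()
    for event in events:
        key = schema_of(event)
        if seen[key] < limit:
            seen[key] += 1
            yield event

mixed = [
    {"type": "user", "name": "alice"},
    {"type": "user", "name": "bob"},
    {"type": "event", "id": 1},
    {"type": "event", "id": 2},
    {"value": 42},
]

one_each = list(taste(mixed, 1))
print(one_each)
```

Events keep their original order; only the surplus events of an already-sampled schema are dropped.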

Use the reverse operator to invert the order of events in a stream:

from {seq: 1, msg: "first"},
{seq: 2, msg: "second"},
{seq: 3, msg: "third"}
reverse

{seq: 3, msg: "third"}
{seq: 2, msg: "second"}
{seq: 1, msg: "first"}

Use the sample operator to sample events based on time intervals.

Sample events at regular time intervals:

from {id: 1}, {id: 2}, {id: 3}, {id: 4}, {id: 5},
{id: 6}, {id: 7}, {id: 8}, {id: 9}, {id: 10}
sample 1s

{id: 1}
{id: 2}
{id: 3}
{id: 4}
{id: 5}
{id: 6}
{id: 7}
{id: 8}
{id: 9}
{id: 10}

Note: The sample operator uses duration-based sampling, not random probability sampling.

Chain operators to create more complex sampling strategies:

from {user: "alice", action: "login", time: 1},
{user: "bob", action: "view", time: 2},
{user: "alice", action: "edit", time: 3},
{user: "charlie", action: "login", time: 4},
{user: "bob", action: "logout", time: 5}
where action == "login"
head 2

{user: "alice", action: "login", time: 1}
{user: "charlie", action: "login", time: 4}
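
The chained pipeline above behaves like a lazy filter followed by a bounded take. As a sketch in Python, with a generator expression standing in for where and `itertools.islice` standing in for head (both are stand-ins chosen for illustration, not Tenzir APIs):

```python
from itertools import islice

events = [
    {"user": "alice", "action": "login", "time": 1},
    {"user": "bob", "action": "view", "time": 2},
    {"user": "alice", "action": "edit", "time": 3},
    {"user": "charlie", "action": "login", "time": 4},
    {"user": "bob", "action": "logout", "time": 5},
]

# Stage 1: keep only login events (stand-in for `where action == "login"`).
logins = (event for event in events if event["action"] == "login")

# Stage 2: take the first two survivors (stand-in for `head 2`).
first_two_logins = list(islice(logins, 2))

print(first_two_logins)
```

Order matters: filtering first and limiting second yields the first two logins, whereas limiting first would keep only the logins among the first two events.
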

Best practices:

  1. Prefer head over tail: head stops processing once it has enough events, while tail must process everything.

  2. Use taste for exploration: When working with unfamiliar data, taste quickly shows you the different schemas present.

  3. Be mindful of memory: Operators like tail and reverse buffer all input, which can consume significant memory for large streams.

  4. Combine with filters: Use where before slicing operators to reduce the amount of data processed.
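
The cost difference behind the first practice can be made concrete with a small Python model that counts how many events each approach pulls from its source. `CountingSource` is a hypothetical helper, and the numbers describe this model only, not measurements of Tenzir.

```python
from collections import deque
from itertools import islice

class CountingSource:
    """Hypothetical event source that records how many events were pulled."""

    def __init__(self, total):
        self.total = total
        self.pulled = 0

    def __iter__(self):
        for n in range(self.total):
            self.pulled += 1
            yield {"n": n}

# head-like: stop reading as soon as three events have arrived.
head_source = CountingSource(1000)
head_result = list(islice(head_source, 3))

# tail-like: every event must be read before the last three are known.
tail_source = CountingSource(1000)
tail_result = list(deque(tail_source, maxlen=3))

print(head_source.pulled, tail_source.pulled)
```

In this model the head-like pipeline pulls 3 events and the tail-like pipeline pulls all 1000, even though both return three events.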