
Types

Tenzir’s type system is a superset of JSON. That is, every valid JSON object is a valid Tenzir value, but there are also additional types available, such as ip and subnet.
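To make the superset relationship concrete, the following Python sketch parses a JSON object and then checks which string values could also be read as the richer ip and subnet types, using the standard `ipaddress` module. The function name `classify` and the returned labels are illustrative assumptions, and the sketch does not claim to mirror how Tenzir itself infers types.

```python
import ipaddress
import json


def classify(value):
    """Return an illustrative type label for a parsed JSON value."""
    if value is None:
        return "null"
    # Check bool before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, list):
        return "list"
    if isinstance(value, dict):
        return "record"
    if isinstance(value, str):
        try:
            ipaddress.ip_address(value)
            return "ip"
        except ValueError:
            pass
        if "/" in value:
            try:
                ipaddress.ip_network(value, strict=False)
                return "subnet"
            except ValueError:
                pass
        return "string"
    raise TypeError(f"unsupported value: {value!r}")


event = json.loads('{"src": "10.0.0.1", "net": "10.0.0.0/8", "port": 443, "ok": true}')
types = {key: classify(val) for key, val in event.items()}
```

Plain JSON can only carry `src` and `net` as strings; a type system with dedicated network types can represent them as an address and a subnet instead.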

The diagram below illustrates the type system at a glance:

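[Diagram: the Tenzir type system. Category labels: Arithmetic, Temporal, Network, String, Containers. Types shown: bool, int64, uint64, double, time, duration, ip, subnet, string, blob, enum, list (with an element type), and record (with fields 1 to N, each with a type).]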

Basic types are stateless types with a static structure. The following basic types exist:

| Type | Description |
| --- | --- |
| none | Denotes an absent or invalid value |
| bool | A boolean value |
| int64 | A 64-bit signed integer |
| uint64 | A 64-bit unsigned integer |
| double | A 64-bit double (IEEE 754) |
| duration | A time span (nanosecond granularity) |
| time | A time point (nanosecond granularity) |
| string | A UTF-8 encoded string |
| blob | An arbitrary sequence of bytes |
| ip | An IPv4 or IPv6 address |
| subnet | An IPv4 or IPv6 subnet |
| secret | A secret value |

The secret type is a special type created by the secret function. Secrets can only be used as arguments for operators that accept them and only support a limited set of operations, such as concatenation.

See the explanation page for secrets for more details.
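As a rough illustration of the numeric and temporal basic types listed above, the Python sketch below spells out the value ranges of int64 and uint64 and encodes a time span as an integer count of nanoseconds. The helper names are hypothetical, and the integer-nanosecond encoding is an assumption made for the example, not a statement about Tenzir's internal representation.

```python
# Value ranges of the two 64-bit integer types.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
UINT64_MAX = 2**64 - 1

NS_PER_SECOND = 1_000_000_000


def fits_int64(n: int) -> bool:
    """Check whether n is representable as a 64-bit signed integer."""
    return INT64_MIN <= n <= INT64_MAX


def fits_uint64(n: int) -> bool:
    """Check whether n is representable as a 64-bit unsigned integer."""
    return 0 <= n <= UINT64_MAX


def to_nanoseconds(seconds: int, nanoseconds: int = 0) -> int:
    """Encode a time span as a whole number of nanoseconds (assumed encoding)."""
    return seconds * NS_PER_SECOND + nanoseconds
```

Integers are used here because Python's `datetime.timedelta` only resolves microseconds, which would lose the nanosecond granularity that duration and time provide.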

Complex types are stateful types that carry additional runtime information.

The enum type is a list of predefined string values. It comes in handy for low-cardinality values from a fixed set of options.

Tenzir implements an enum as an Arrow Dictionary.
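Dictionary encoding stores each value as a small integer index into a table of distinct values. The Python sketch below shows the idea for a fixed set of options; the function names and the example options are hypothetical, and it uses plain lists rather than the Arrow library.

```python
def encode_enum(values, options):
    """Replace each string by its index in the predefined list of options."""
    index_of = {name: i for i, name in enumerate(options)}
    indices = []
    for value in values:
        if value not in index_of:
            raise ValueError(f"{value!r} is not one of the predefined options")
        indices.append(index_of[value])
    return indices


def decode_enum(indices, options):
    """Map indices back to their string values."""
    return [options[i] for i in indices]


options = ["low", "medium", "high"]  # hypothetical set of predefined values
encoded = encode_enum(["high", "low", "high"], options)
```

With few distinct values, the index column is much more compact than repeating the strings, which is why this layout suits low-cardinality data.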

The list type is an ordered sequence of values with a fixed element type.

Lists have zero or more elements.
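The fixed element type can be pictured as a check that every element of a list belongs to one type. The following Python sketch is a minimal illustration with a hypothetical helper `conforms`; it treats `None` as an acceptable element on the assumption that a null value is allowed in place of any element.

```python
def conforms(values, element_type):
    """Return True if every element is None or exactly of element_type."""
    # Use an exact type comparison so that bool does not pass as int.
    return all(v is None or type(v) is element_type for v in values)
```

For example, `conforms([1, 2, None], int)` holds, whereas `conforms([1, "a"], int)` does not, because the second list mixes element types. An empty list conforms to any element type.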

The record type consists of an ordered sequence of fields, each of which has a name and a type. Records must have at least one field.

The field name is an arbitrary UTF-8 string.

The field type is any Tenzir type.
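The structure of a record type can be modeled in a few lines of Python. The classes `Field` and `RecordType` below are hypothetical names for this sketch and are not part of any Tenzir API; type names are kept as plain strings for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str  # arbitrary UTF-8 string
    type: str  # name of any type, e.g. "ip" or "uint64"


@dataclass(frozen=True)
class RecordType:
    fields: tuple  # ordered sequence of Field objects

    def __post_init__(self):
        # A record must have at least one field.
        if not self.fields:
            raise ValueError("a record type needs at least one field")

    def field_names(self):
        """Return the field names in declaration order."""
        return [f.name for f in self.fields]


conn = RecordType((Field("src", "ip"), Field("port", "uint64")))
```

A tuple preserves the field order, and the constructor rejects an empty field sequence.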

All types are optional in that there exists an additional null data point in every value domain. Consequently, Tenzir does not have a special type to indicate optionality.

Every type has zero or more attributes, which are free-form key-value pairs to enrich types with custom semantics.
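One way to picture attributes is as a small key-value map that travels with a type. The Python sketch below uses a hypothetical `AnnotatedType` class and a made-up `unit` attribute; neither is taken from Tenzir, and the sketch assumes that keys are unique within a type.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedType:
    name: str
    attributes: dict = field(default_factory=dict)  # free-form key-value pairs

    def with_attribute(self, key, value=""):
        """Return a copy of this type with one more attribute attached."""
        return AnnotatedType(self.name, {**self.attributes, key: value})


plain = AnnotatedType("duration")
annotated = plain.with_attribute("unit", "seconds")  # hypothetical attribute
```

Returning a copy keeps the original type unchanged, so the same base type can carry different attributes in different places.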

All Tenzir types have a lossless mapping to Arrow types. However, not all Arrow types have a Tenzir equivalent. As a result, it is not yet possible to import arbitrary Arrow data. In the future, we plan to extend our support for Arrow-native types and also offer conversion options for seamless data handover.

Tenzir has a few domain-specific types that map to Arrow extension types. These are currently enum, ip, and subnet. Tenzir and Arrow attach type metadata to different entities: Tenzir attaches metadata to a type instance, whereas Arrow attaches metadata to a schema or record field.
