Version: Tenzir v4.9

parquet

Reads events from a Parquet file. Writes events to a Parquet file.

Synopsis

parquet

Description

The parquet format provides both a parser and a printer for Parquet files.

Apache Parquet is a columnar storage format that a variety of data tools support.

MMAP Parsing

When using the parser with the file connector, we recommend passing the --mmap option to file. Memory-mapping the file gives the parser full control over the reads, which improves performance and reduces memory usage.

Tenzir writes Parquet files with Zstd compression enabled. Our blog has a post with an in-depth analysis of the effect of Zstd compression.

Examples

Read a Parquet file via the from operator:

from file --mmap /tmp/data.prq read parquet
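
The printer direction can be sketched the same way. The pipeline below is a minimal illustration, not a verified recipe: it assumes the to operator accepts the same connector and format pattern as from (to file <path> write parquet), and /tmp/copy.prq is a hypothetical output path. It reads a Parquet file that Tenzir wrote earlier and writes the events back out as a new Parquet file:

from file --mmap /tmp/data.prq read parquet | to file /tmp/copy.prq write parquet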

Limitation

The parquet parser currently supports only Parquet files written with Tenzir. We will remove this limitation in the future.