Version: Next

compress

Compresses a stream of bytes.

Synopsis

compress [--level=<level>] <codec>

Description

The compress operator incrementally compresses the bytes flowing through a pipeline using the specified codec.

The to operator invokes compress automatically when the extension of the resulting file indicates compression. To bypass this behavior, use save directly.

Streaming Compression

The operator uses Apache Arrow's compression utilities under the hood, and transparently supports all options that Apache Arrow supports for streaming compression.

Besides the supported brotli, bz2, gzip, lz4, and zstd codecs, Apache Arrow also ships with codecs for lzo, lz4_raw, lz4_hadoop, and snappy. These codecs only support one-shot compression, so the operator does not currently support them.
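To illustrate what incremental compression means, here is a minimal Python sketch that compresses a sequence of byte chunks one at a time. It is not the operator's implementation, which builds on Apache Arrow. It uses Python's standard zlib module only as a stand-in, and the helper name compress_stream is hypothetical.

```python
import zlib


def compress_stream(chunks, level=6):
    """Yield gzip-framed compressed bytes for an iterable of byte chunks.

    Hypothetical helper for illustration; not part of any pipeline tool.
    """
    # wbits=31 tells zlib to emit the gzip container format.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and emit nothing yet
            yield out
    # Emit any buffered data plus the gzip trailer.
    yield compressor.flush()


# Feed the compressor many small chunks instead of one large buffer.
chunks = [b'{"id": %d}\n' % i for i in range(1000)]
compressed = b"".join(compress_stream(chunks))

# Decompressing the concatenated output restores the original bytes.
restored = zlib.decompress(compressed, wbits=31)
```

A one-shot codec, by contrast, needs the complete input in memory before it can produce any output, which does not fit a pipeline that processes data chunk by chunk.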

--level=<level>

The compression level to use. The supported values depend on the codec used. If omitted, the default level for the codec is used.
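As a rough illustration of how a compression level trades speed for size, the following Python sketch compresses the same payload at several levels. It assumes Python's zlib module, whose levels range from 0 to 9. The valid range for the operator's codecs may differ, and the helper name gzip_size is hypothetical.

```python
import zlib

# Sample payload: repetitive NDJSON-like lines, which compress well.
data = b"".join(b'{"id": %d, "status": "ok"}\n' % i for i in range(5000))


def gzip_size(payload, level):
    """Return the gzip-compressed size of payload at the given zlib level."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    return len(compressor.compress(payload) + compressor.flush())


# Map each level to the resulting compressed size in bytes.
sizes = {level: gzip_size(data, level) for level in (1, 6, 9)}

for level, size in sizes.items():
    print(f"level {level}: {size} bytes (input: {len(data)} bytes)")
```

Higher levels typically produce smaller output at the cost of more CPU time, but the exact sizes depend on the data.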

<codec>

An identifier of the codec to use. Currently supported are brotli, bz2, gzip, lz4, and zstd.

Examples

Export all events to a Gzip-compressed NDJSON file:

export
| write json --compact-output
| compress gzip
| save file /tmp/backup.json.gz

Recompress a Zstd-compressed file at a higher compression level:

load file in.zst
| decompress zstd
| compress --level=18 zstd
| save file out.zst