Jul 18, 2022 · @rdettai · #2415

The VAST Cloud CLI can now authenticate to the Tenzir private registry and download the vast-pro image (including plugins such as Matcher). The deployment script can now be configured to use a specific image and can thus be set to use vast-pro.

Add percentage of total number of events to index status

Jun 21, 2022 · @dominiklohmann · #2351

The index statistics in vast status --detailed now show the event distribution per schema as a percentage of the total number of events in addition to the per-schema number, e.g., for suricata.flow events under the key index.statistics.layouts.suricata.flow.percentage.
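As a rough illustration of how such a percentage relates to the per-schema counts, here is a minimal sketch. The event counts and the `count` field name are made up for the example; only the `percentage` key name comes from the entry above.

```python
# Hypothetical per-schema event counts; the numbers are invented for illustration.
counts = {"suricata.flow": 750, "suricata.dns": 200, "suricata.ftp": 50}


def with_percentages(counts):
    """Map each schema to its event count and its share of all events in percent."""
    total = sum(counts.values())
    return {
        schema: {
            "count": n,  # assumed field name, not taken from VAST
            "percentage": 100.0 * n / total if total else 0.0,
        }
        for schema, n in counts.items()
    }


stats = with_percentages(counts)
print(stats["suricata.flow"]["percentage"])  # 75.0
```

The percentages across all schemas sum to 100 whenever at least one event exists.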

Jun 19, 2022 · @dominiklohmann · #2360

The output of vast status --detailed now shows metadata from all partitions under the key .catalog.partitions. Additionally, the catalog emits metrics under the keys catalog.num-events and catalog.num-partitions containing the number of events and partitions, respectively. The metrics contain the schema name in the field metadata_schema and the (internal) partition version in the field metadata_partition-version.

Jun 16, 2022 · @dominiklohmann · #2346

VAST now compresses on-disk indexes with Zstd, resulting in a 50-80% size reduction depending on the type of indexes used, and reducing the overall index size to below the raw data size. This improves retention spans significantly. For example, using the default configuration, the indexes for suricata.ftp events now use 75% less disk space, and suricata.flow 30% less.

Add index metric for created active partitions

Jun 13, 2022 · @lava · #2302

VAST emits the new metric partition.events-written when writing a partition to disk. The metric’s value is the number of events written, and the metadata_schema field contains the name of the partition’s schema.

Jun 10, 2022 · @dominiklohmann · #2336

The csv import gained a new --seperator='x' option that defaults to ','. Set it to '\t' to import tab-separated values, or ' ' to import space-separated values.

Jun 10, 2022 · @KaanSK · #2334

PyVAST now supports running client commands for VAST servers running in a container environment, if no local VAST binary is available. Specify the container keyword to customize this behavior. It defaults to {"runtime": "docker", "name": "vast"}.

Jun 9, 2022 · @dominiklohmann · #2321

The new rebuild command rebuilds old partitions to take advantage of improvements in newer VAST versions. Rebuilding takes place in the background in the VAST server. This process merges partitions up to the configured max-partition-size, turns VAST v1.x’s heterogeneous partitions into VAST v2.x’s homogenous partitions, migrates all data to the currently configured store-backend, and upgrades to the most recent internal batch encoding and indexes.

Report by schema metrics from the importer

Jun 3, 2022 · @tobim · #2274

VAST now produces additional metrics under the keys ingest.events, ingest.duration and ingest.rate. Each of those gets issued once for every schema that VAST ingested during the measurement period. Use the metadata_schema key to disambiguate the metrics.

May 31, 2022 · @lava · #2260

The lsvast tool can now print contents of individual .mdx files. It now has an option to print raw Bloom filter contents of string and IP address synopses.

The mdx-regenerate tool was renamed to vast-regenerate and can now also regenerate an index file from a list of partition UUIDs.

May 23, 2022 · @dominiklohmann · #2288

The status command now supports filtering by component name. For example, vast status importer index only shows the status of the importer and index components.

May 19, 2022 · @dominiklohmann · #2268

VAST now compresses data with Zstd. When persisting data to the segment store, the default configuration achieves over 2x space savings. When transferring data between client and server processes, compression reduces the amount of transferred data by up to 5x. This allowed us to increase the default partition size from 1,048,576 to 4,194,304 events, and the default number of events in a single batch from 1,024 to 65,536. The performance increase comes at the cost of a ~20% memory footprint increase at peak load. Use the option vast.max-partition-size to tune this space-time tradeoff.
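The relationship between the old and new defaults is easier to see as plain arithmetic. The following sketch only uses the four numbers quoted above; the variable names are illustrative.

```python
# Default sizes quoted in the entry above, in events.
old_partition, new_partition = 1_048_576, 4_194_304
old_batch, new_batch = 1_024, 65_536

partition_factor = new_partition // old_partition  # partitions hold 4x more events
batch_factor = new_batch // old_batch              # batches hold 64x more events

# Number of full batches needed to fill one partition, before and after.
batches_per_partition_old = old_partition // old_batch
batches_per_partition_new = new_partition // new_batch

print(partition_factor, batch_factor)                        # 4 64
print(batches_per_partition_old, batches_per_partition_new)  # 1024 64
```

With the new defaults, a partition fills up after far fewer, much larger batches.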

May 17, 2022 · @dispanser · #2284

A new parquet store plugin allows VAST to store its data as parquet files, increasing storage efficiency at the expense of higher deserialization costs. Storage requirements for the VAST database are reduced by another 15-20% compared to the existing segment store with Zstd compression enabled. CPU usage for suricata import is up ~10%, mostly related to the more expensive serialization. Deserialization (reading) of a partition is significantly more expensive, increasing CPU utilization by about 100%, and should be carefully considered and compared to the potential reduction in storage cost and I/O operations.

Always format time values with microsecond precision

Jun 25, 2022 · @tobim · #2380

VAST now always formats time and timestamp values with six decimal places (microsecond precision). The old behavior used a precision that depended on the actual value. This may require action for downstream tooling, such as metrics collectors, that expects nanosecond granularity.
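To make "six decimal places" concrete, the sketch below formats timestamps with a fixed microsecond fraction in Python. It is not VAST's formatter, and the exact layout VAST prints (separator, time zone suffix) is an assumption here; only the fixed six-digit fraction reflects the entry above.

```python
from datetime import datetime, timezone


def format_micros(ts: datetime) -> str:
    """Render an aware datetime in UTC with exactly six fractional digits."""
    # %f always emits a zero-padded six-digit microsecond field.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")


whole_second = datetime(2022, 6, 25, 12, 0, 0, tzinfo=timezone.utc)
with_fraction = datetime(2022, 6, 25, 12, 0, 0, 5, tzinfo=timezone.utc)

print(format_micros(whole_second))   # 2022-06-25T12:00:00.000000
print(format_micros(with_fraction))  # 2022-06-25T12:00:00.000005
```

Downstream tooling that previously saw a variable number of fractional digits can then rely on a constant-width fraction.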

Write homogenous partitions from the partition transformer

Jun 2, 2022 · @lava · #2277

Partition transforms now always emit homogenous partitions, i.e., one schema per partition. This makes compaction and aging more efficient.

May 31, 2022 · @lava · #2260

The mdx-regenerate tool is no longer part of VAST binary releases.

May 27, 2022 · @lava · #2312

The vast.use-legacy-query-scheduler option is now ignored because the legacy query scheduler has been removed.

May 20, 2022 · @dominiklohmann · #2290

The vast.store-backend configuration option no longer supports archive, and always uses the superior segment-store instead. Events stored in the archive will continue to be available in queries.

May 17, 2022 · @dispanser · #2284

VAST now requires Arrow >= v8.0.0.

Jul 4, 2022 · @tobim · #2394

We improved the mechanism to recover the database state after an unclean shutdown.

Support environment variables for plugin options

Jun 30, 2022 · @dominiklohmann · #2390

VAST no longer ignores environment variables for plugin-specific options. E.g., the environment variable VAST_PLUGINS__FOO__BAR now correctly refers to the bar option of the foo plugin, i.e., plugins.foo.bar.
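The mapping in this example can be sketched as a small function. The rule used here (strip the VAST_ prefix, split on double underscores, lowercase each part) is inferred solely from the single example above; VAST's actual translation rules may cover more cases, and the function name is hypothetical.

```python
def plugin_env_to_key(name: str) -> str:
    """Translate e.g. VAST_PLUGINS__FOO__BAR into plugins.foo.bar.

    Assumed rule, inferred from one example: drop the VAST_ prefix,
    treat double underscores as nesting separators, and lowercase.
    """
    prefix = "VAST_"
    if not name.startswith(prefix):
        raise ValueError(f"not a VAST environment variable: {name!r}")
    parts = name[len(prefix):].split("__")
    return ".".join(part.lower() for part in parts)


print(plugin_env_to_key("VAST_PLUGINS__FOO__BAR"))  # plugins.foo.bar
```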

Jun 24, 2022 · @tobim · #2376

VAST will no longer terminate when it can’t write any more data to disk. Incoming data will still be accepted but discarded. We encourage all users to enable the disk-monitor or compaction features as a proper solution to this problem.

Parse time from JSON strings containing numbers

Jun 10, 2022 · @dominiklohmann · #2340

The JSON import now treats time and duration fields correctly for JSON strings containing a number, i.e., the JSON string "1654735756" now behaves just like the JSON number 1654735756 and for a time field results in the value 2022-06-09T00:49:16.000Z.
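The equivalence of the two inputs can be checked outside of VAST with a few lines of Python. This is only a sketch of the described behavior, assuming the number denotes seconds since the Unix epoch; the function names and the `ts` field are hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_time(value) -> datetime:
    """Accept a JSON number or a JSON string containing a number (epoch seconds)."""
    seconds = float(value)  # float() handles ints, floats, and numeric strings
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def render(ts: datetime) -> str:
    """Render with millisecond precision and a trailing Z, as in the entry above."""
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"


for doc in ('{"ts": 1654735756}', '{"ts": "1654735756"}'):
    print(render(parse_time(json.loads(doc)["ts"])))  # 2022-06-09T00:49:16.000Z
```

Both documents yield the same timestamp, matching the value stated above.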

Jun 10, 2022 · @dominiklohmann · #2336

The csv import no longer crashes when the CSV file contains columns not present in the selected schema. Instead, it imports these columns as strings.

vast export csv now renders enum columns in their string representation instead of their internal numerical representation.

Jun 9, 2022 · @tobim · #2332

The parser for real values now understands scientific notation, e.g., 1.23e+42.

Jun 3, 2022 · @dominiklohmann · #2325

VAST now reads the default false-positive rate for sketches correctly. This broke accidentally with the v2.0 release. The option moved from vast.catalog-fp-rate to vast.index.default-fp-rate.

Jun 3, 2022 · @tobim · #2324

VAST no longer hangs when it is shut down while still importing events.

Fall back to string when parsing config options from environment

May 25, 2022 · @dispanser · #2305

Setting the environment variable VAST_ENDPOINT to a host:port pair no longer fails on startup with a parse error.
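The general idea of falling back to a string can be illustrated as follows. This is not VAST's parser; it is a minimal sketch that tries a few typed interpretations and keeps the raw string when none applies, with a hypothetical function name and a made-up endpoint value.

```python
def parse_env_value(raw: str):
    """Interpret an environment value as bool, int, or float, else keep the string."""
    lowered = raw.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for convert in (int, float):
        try:
            return convert(raw)
        except ValueError:
            pass  # not this type; try the next interpretation
    # Nothing matched, so fall back to the unmodified string instead of failing.
    return raw


print(parse_env_value("42"))               # 42
print(parse_env_value("localhost:42000"))  # localhost:42000
```

A value such as a host:port pair matches none of the typed interpretations and is therefore passed through as a plain string rather than rejected.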

Fix crash in query evaluation for new partitions

May 23, 2022 · @dominiklohmann · #2295

VAST no longer crashes when a query arrives at a newly created active partition in the time window between the partition creation and the first event arriving at the partition.

Allow missing value indices in partition flatbuffer

May 20, 2022 · @lava · #2286

VAST no longer crashes when importing map or pattern data annotated with the #skip attribute.

Prefer CLI over config file for vast.plugins

May 19, 2022 · @dominiklohmann · #2289

The command-line options --plugins, --plugin-dirs, and --schema-dirs now correctly overwrite their corresponding configuration options.