VAST v2.0.0

Download the release on GitHub.

Clean up transform steps (and native plugins generally)

The replace transform step now allows for setting values of complex types, e.g., lists or records.

By @dominiklohmann in #2228.

The lsvast tool now prints the whole store contents when given a store file as an argument.

By @lava in #2247.

VAST now creates one active partition per layout, rather than having a single active partition for all layouts.

The new option vast.active-partition-timeout controls the time after which an active partition is flushed to disk. The timeout may hit before the partition size reaches vast.max-partition-size, providing additional temporal control over data freshness. The active partition timeout defaults to 1 hour.

By @tobim in #2096.
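
The interplay between vast.active-partition-timeout and vast.max-partition-size can be sketched as a simple decision rule. This is a simplified model for illustration only, not VAST's implementation: the function name and its parameters are hypothetical, and only the 1 hour default comes from the entry above.

```python
from datetime import timedelta


def should_flush(
    num_events: int,
    age: timedelta,
    max_partition_size: int,
    active_partition_timeout: timedelta = timedelta(hours=1),
) -> bool:
    """Return True once either limit is reached (hypothetical helper).

    num_events and max_partition_size model vast.max-partition-size,
    age and active_partition_timeout model vast.active-partition-timeout.
    """
    # Whichever limit is hit first triggers the flush.
    return num_events >= max_partition_size or age >= active_partition_timeout
```

Under this model, a sparsely filled partition still reaches disk after at most one timeout interval.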

Allow fine-grained meta index configuration

The new vast.index section in the configuration supports adjusting the false-positive rate of first-stage lookups for individual fields, allowing users to optimize the time/space trade-off for expensive queries.

By @lava in #2065.
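
To get a feel for the time/space trade-off, the sketch below computes the space cost of a given false-positive rate. It assumes the first-stage lookup behaves like a standard, optimally sized Bloom filter, which this entry does not state. The function name is hypothetical.

```python
import math


def bloom_bits_per_element(fp_rate: float) -> float:
    """Bits per stored value for an optimally sized Bloom filter.

    Assumption: a textbook Bloom filter, where the required number of
    bits per element is -ln(p) / (ln 2)^2 for false-positive rate p.
    """
    if not 0.0 < fp_rate < 1.0:
        raise ValueError("fp_rate must be strictly between 0 and 1")
    return -math.log(fp_rate) / (math.log(2) ** 2)


# Lower false-positive rates cost more space per value.
for p in (0.1, 0.01, 0.001):
    print(f"fp_rate={p}: about {bloom_bits_per_element(p):.1f} bits per value")
```

Under that assumption, a lower rate for a frequently queried field costs more memory but avoids needless second-stage lookups, while a higher rate for rarely queried fields saves space.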

Add a grand total event counter to the status output

The output of vast status now displays the total number of events stored under the key index.statistics.events.total.

By @6yozo in #2133.

The disk monitor has new status entries blacklist and blacklist-size containing information about partitions that failed to be erased.

By @dominiklohmann in #2160.

Support environment variables as alternate config mechanism

VAST now has complete support for passing environment variables as an alternate path to configuration files. Environment variables have lower precedence than CLI arguments and higher precedence than config files. Variable names of the form VAST_FOO__BAR_BAZ map to vast.foo.bar-baz, i.e., __ is a record separator and _ translates to -. This does not apply to the prefix VAST_, which is considered the application identifier. Only variables with non-empty values are considered.

By @mavam in #2162.
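
The name mapping described above can be sketched in a few lines. This is an illustrative reimplementation, not VAST's code: the function name is hypothetical, and the lowercasing is inferred from the single example VAST_FOO__BAR_BAZ.

```python
from typing import Optional

PREFIX = "VAST_"  # application identifier, exempt from the "_" -> "-" rule


def env_to_config_key(name: str, value: str) -> Optional[str]:
    """Map an environment variable to a dotted config key, or None if ignored."""
    # Only prefixed variables with non-empty values are considered.
    if not name.startswith(PREFIX) or value == "":
        return None
    suffix = name[len(PREFIX):]
    if suffix == "":
        return None
    # "__" separates records; a single "_" translates to "-".
    # Assumption: names are lowercased, as the example suggests.
    parts = [part.replace("_", "-").lower() for part in suffix.split("__")]
    return ".".join(["vast", *parts])
```

For instance, env_to_config_key("VAST_FOO__BAR_BAZ", "1") yields vast.foo.bar-baz, while a variable with an empty value is skipped.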

Implement support for transforms that apply to every type and use compaction for aging

VAST v1.0 deprecated the experimental aging feature. Given popular demand we’ve decided to un-deprecate it, and to actually implement it on top of the same building blocks the compaction mechanism uses. This means that it is now fully working and no longer considered experimental.

By @lava in #2186.

We removed the experimental vast get command. It relied on an internal unique event ID that was only exposed to the user in debug messages. This removal is a preparatory step towards a simplification of some of the internal workings of VAST.

By @tobim in #2121.

VAST’s internal data model now completely preserves the nesting of the stored data when using the arrow encoding, and maps the pattern, address, subnet, and enumeration types onto Arrow extension types rather than using the underlying representation directly. This change enables use of the export arrow command without needing information about VAST’s type system.

Transform steps that add or modify columns now transform the columns in-place rather than at the end, preserving the nesting structure of the original data.

The deprecated msgpack encoding no longer exists. Data imported using the msgpack encoding can still be accessed, but new data will always use the arrow encoding.

By @dominiklohmann in #2159.

Minimize the threadpool for client commands

Client commands such as vast export or vast status now create fewer threads at runtime, reducing the risk of hitting system resource limits.

By @tobim in #2193.

VAST ships experimental Terraform scripts to deploy on AWS Lambda and Fargate.

By @rdettai in #2108.

The command line option --verbosity has the new name --console-verbosity. This synchronizes the CLI with the configuration file, which only understands the option vast.console-verbosity.

By @mavam in #2178.

Remove the “catalog” and “catalog-bytes” keys from the index status

The index section in the status output no longer contains the catalog and catalog-bytes keys. The information is already present in the top-level catalog section.

By @tobim in #2233.

The meta-index is now called the catalog. This affects multiple metrics and entries in the output of vast status, and the configuration option vast.meta-index-fp-rate, which is now called vast.catalog-fp-rate.

By @dominiklohmann in #2128.

Exploit synergies when evaluating many queries at the same time

We revised the query scheduling logic to exploit synergies when multiple queries run at the same time. In that vein, we updated the related metrics with more accurate names to reflect the new mechanism. The new keys scheduler.partition.materializations, scheduler.partition.scheduled, and scheduler.partition.lookups provide periodic counts of partitions loaded from disk, partitions scheduled for lookup, and the overall number of queries issued to partitions, respectively. The keys query.workers.idle and query.workers.busy were renamed to scheduler.partition.remaining-capacity and scheduler.partition.current-lookups, respectively. Finally, the key scheduler.partition.pending counts the number of currently pending partitions. It is still possible to opt out of the new scheduling algorithm with the (deprecated) option --use-legacy-query-scheduler.

By @tobim in #2117.

Bump the minimum version of Apache Arrow to 7.0

VAST now requires Apache Arrow >= v7.0.0.

By @tobim in #2122.

Clean up transform steps (and native plugins generally)

Multiple transform steps now have new names: select is now called where, delete is now called drop, project is now called put, and aggregate is now called summarize. This breaking change is in preparation for an upcoming feature that improves the capability of VAST’s query language.

The layout-names option of the rename transform step was renamed to schemas. The step now additionally supports renaming fields.

By @dominiklohmann in #2228.
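
When migrating existing transform definitions, the renames listed above amount to a simple lookup. The sketch below operates on a plain list of step names. The helper is hypothetical, and how step names are laid out in an actual configuration file is not covered here.

```python
from typing import Dict, List

# Old step name -> new step name, as listed in this entry.
RENAMED_STEPS: Dict[str, str] = {
    "select": "where",
    "delete": "drop",
    "project": "put",
    "aggregate": "summarize",
}


def migrate_step_names(names: List[str]) -> List[str]:
    """Replace pre-2.0 step names with their new names (hypothetical helper).

    Names without a rename, such as replace, pass through unchanged.
    """
    return [RENAMED_STEPS.get(name, name) for name in names]
```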

The vast(1) man-page is no longer empty for VAST distributions with static binaries.

By @dominiklohmann in #2190.

Treat list options in env variables consistently

Environment variables for options that specify lists now consistently use comma-separators and respect escaping with backslashes.

By @dominiklohmann in #2236.
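
A minimal sketch of comma-separated parsing with backslash escaping follows. The exact escaping rules VAST applies are not spelled out in this entry, so the details are assumptions: a backslash makes the next character literal, and a trailing backslash is kept as-is. The function name is hypothetical.

```python
from typing import List


def split_list_option(raw: str) -> List[str]:
    """Split a list-valued environment variable on unescaped commas."""
    if raw == "":
        return []
    items: List[str] = []
    current: List[str] = []
    escaped = False
    for ch in raw:
        if escaped:
            # The previous character was a backslash: take this one literally.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == ",":
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
    if escaped:
        # Assumption: a dangling backslash at the end is kept literally.
        current.append("\\")
    items.append("".join(current))
    return items
```

Under these assumptions, the value a,b\,c,d splits into the three items a, b,c, and d.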

Reduce the default log queue size for client commands

We optimized the queue size of the logger for commands other than vast start. Client commands now show a significant reduction in memory usage and startup time.

By @tobim in #2176.

Lift selector field requirements for JSON import

The JSON import no longer rejects non-string selector fields. Instead, it always uses the textual JSON representation as a selector. E.g., the JSON object {id:1,...} imported via vast import json --selector=id:mymodule now matches the schema named mymodule.1 rather than erroring because the id field is not a string.

By @dominiklohmann in #2255.
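
The selector behavior can be illustrated with a small sketch. This models only the naming rule from the example above and is not VAST's reader. Both function names are hypothetical, and using string values without surrounding quotes is an assumption.

```python
import json
from typing import Any, Dict, Tuple


def parse_selector(selector: str) -> Tuple[str, str]:
    """Split a selector such as "id:mymodule" into (field, prefix)."""
    field, _, prefix = selector.partition(":")
    return field, prefix


def schema_name(event: Dict[str, Any], selector: str) -> str:
    """Derive the schema name for an event from its selector field."""
    field, prefix = parse_selector(selector)
    value = event[field]
    # Assumption: strings are used verbatim; any other value contributes
    # its textual JSON representation, e.g. the number 1 becomes "1".
    text = value if isinstance(value, str) else json.dumps(value)
    return f"{prefix}.{text}"
```

With the selector id:mymodule, an event whose id field is the number 1 maps to the schema name mymodule.1.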

The explore command now properly terminates after the requested number of results are delivered.

By @tobim in #2120.

The count --estimate command erroneously materialized store files from disk, resulting in an unneeded performance penalty. VAST now answers approximate count queries by solely consulting the relevant index files.

By @dominiklohmann in #2146.

The CSV parser no longer fails when encountering integers when floating point values were expected.

By @dominiklohmann in #2184.

The query optimizer incorrectly transformed queries with conjunctions or disjunctions with several operands testing against the same string value, leading to missing results. This was rarely an issue in practice before the introduction of homogeneous partitions with the v2.0 release.

By @lava in #2264.

Don’t send null pointers when erasing whole partitions

VAST no longer occasionally crashes when aging or compaction erases whole partitions.

By @lava in #2227.

Ignore types unrelated to the configuration in the summarize plugin

Transform steps that remove all nested fields from a record, leaving only empty nested records, no longer cause VAST to crash.

By @dominiklohmann in #2258.

Some queries could get stuck when an importer would time out during the meta index lookup. This race condition no longer exists.

By @lava in #2167.

Stop accepting new queries after initiating shutdown

VAST servers no longer accept queries after initiating shutdown. This fixes a potential infinite hang if new queries were coming in faster than VAST was able to process them.

By @dominiklohmann in #2215.

Use the timestamp type for inferred event timestamp fields in the Zeek reader

The import zeek command now correctly marks the event timestamp using the timestamp type alias for all inferred schemas.

By @tobim in #2155.
