VAST v2.4.1 improves query performance when VAST is under high load, and significantly reduces the time to first result for queries with low selectivity.
## Reading Feather Files Incrementally
VAST's Feather store initially used the Feather reader from the Apache Arrow C++ library as-is. However, that API is rather limited: it does not support reading record batches incrementally. We've replaced it with a more efficient implementation that does.
This is best explained visually:
Within the scope of a single Feather store file, a single query takes the same amount of time overall, but this approach has two distinct advantages:
- The first result arrives much faster at the client.
- Stores do less work for cancelled queries.
One additional benefit that is not immediately obvious comes into play when a query arrives at multiple stores in parallel: disk reads are now spread out more evenly, making them less likely to overlap between stores. For deployments with slower I/O paths, this can lead to a significant query performance improvement.
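To see why spreading out reads helps time to first result, consider a toy, stdlib-only model (all names and numbers here are hypothetical, not VAST internals): two stores share one disk that serves one chunk per tick, and each store needs ten chunks.

```python
NUM_STORES = 2
CHUNKS_PER_STORE = 10

# Whole-file loading: each store reads its entire file before yielding
# anything, so store 1 cannot start until store 0 releases the disk.
def eager_first_results():
    first = {}
    t = 0
    for store in range(NUM_STORES):
        t += CHUNKS_PER_STORE  # the store occupies the disk in one block
        first[store] = t       # first result only after the full read
    return first

# Incremental loading: stores interleave chunk-sized reads, so every
# store delivers its first batch after a single chunk (plus queueing).
def incremental_first_results():
    first = {}
    t = 0
    for chunk in range(CHUNKS_PER_STORE):
        for store in range(NUM_STORES):
            t += 1
            if chunk == 0:
                first[store] = t
    return first

print(eager_first_results())        # {0: 10, 1: 20}
print(incremental_first_results())  # {0: 1, 1: 2}
```

Total disk time is identical in both schedules; only the arrival of the first results changes, which matches the behavior described above.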
To verify and test this, we created a VAST database with 300M Zeek events (33 GB on disk) from a Corelight sensor. All tests were performed on a cold start of VAST, i.e., we stopped and restarted VAST after every repetition of each test.
We performed three tests:
- Export a single event (20 times)
- Export all events (20 times)
- Rebuild the entire database (3 times)
The results are astonishingly good:
| Test | Metric | v2.4.0 | v2.4.1 | Speedup |
|---|---|---:|---:|---:|
| (1) Export a single event | Avg. store load time | 55.1 ms | 4.2 ms | 13.1x |
| (1) Export a single event | Time to first result / total time | 19.8 ms | 14.5 ms | 1.4x |
| (2) Export all events | Avg. store load time | 386.5 ms | 7.3 ms | 52.9x |
| (2) Export all events | Time to first result | 69.2 ms | 25.4 ms | 2.7x |
| (3) Rebuild the entire database | Avg. store load time | 480.3 ms | 9.1 ms | 52.7x |
If you're using the Feather store backend (the default as of v2.4.0), you will see an immediate improvement with VAST v2.4.1. There are no other changes between the two releases.
VAST also offers an experimental Parquet store backend, for which we plan to make a similar improvement in an upcoming release.