Staying ahead in the realm of cybersecurity means relentlessly navigating an endless sea of emerging threats and ever-increasing data volumes. The battle to stay one step ahead can often feel overwhelming, especially when your organization's data costs are skyrocketing.
Introducing Tenzir Security Data Pipelines
We're overjoyed to announce our highly anticipated security data pipeline platform at the Black Hat conference in Las Vegas. The launch marks a milestone in our journey to simplify data engineering for cybersecurity operations and to offer a cost-efficient way to tackle the increasingly complex data engineering challenges that security teams confront daily.
Tenzir for Splunk Users
Our Tenzir Query Language (TQL) is a pipeline language that works by chaining operators into data flows. When we designed TQL, we specifically studied Splunk's Search Processing Language (SPL), as it generally leaves a positive impression on security analysts who are not data engineers. Our goal was to keep everything good about SPL while providing a more powerful language without compromising simplicity. In this blog post, we explain how the two languages differ using concrete threat hunting examples.
Native Zeek Log Rotation & Shipping
Did you know that Zeek supports log rotation triggers, so that you can do anything you want with a newly rotated batch of logs?
Shell Yeah! Supercharging Zeek and Suricata with Tenzir
As an incident responder, threat hunter, or detection engineer, getting quickly to your analytics is key for productivity. For network-based visibility and detection, Zeek and Suricata are the bedrock for many security teams. But operationalizing these tools can take a good chunk of time.
So we asked ourselves: How can we make it super easy to work with Zeek and Suricata logs?
Zeek and Ye Shall Pipe
Zeek turns packets into structured logs. By default, Zeek generates one file per log type and per rotation timeframe. If you don't want to wrangle files and would rather process the output directly, this short blog post is for you.
Mobilizing Zeek Logs
Zeek offers many ways to produce and consume logs. In this blog post, we explain the various Zeek logging formats and show how you can get the most out of Zeek with Tenzir. We conclude with recommendations on which Zeek format to use for which use case.
Migrating from VAST to Tenzir
VAST is now Tenzir. This blog post describes what changed when we renamed the project.
Visibility Across Space and Time is now Tenzir
After 5 years of developing two identities, the VAST project and Tenzir the company, we decided to streamline our efforts and rename VAST to Tenzir.
VAST v3.1
VAST v3.1 is out. This is a small checkpointing release that brings a few new changes and fixes.
VAST v3.0
VAST v3.0 is out. This release brings some major updates to the VAST language, making it easy to write down dataflow pipelines that filter, reshape, aggregate, and enrich security event data. Think of VAST as security data pipelines plus an open storage engine.
From Slack to Discord
The New REST API
As of v2.4, VAST ships with a new web plugin that provides a REST API. The API documentation describes the available endpoints and also provides an OpenAPI spec for download. This blog post shows how we built the API and what you can do with it.
Parquet & Feather: Data Engineering Woes
Apache Arrow and Apache Parquet have become the de-facto columnar formats for in-memory and on-disk representations when it comes to structured data. Both are strong together, as they provide data interoperability and foster a diverse ecosystem of data tools. But how well do they actually work together from an engineering perspective?
VAST v2.4.1
VAST v2.4.1 improves the performance of queries when VAST is under high load, and significantly reduces the time to first result for queries with a low selectivity.
VAST v2.4
VAST v2.4 completes the switch to open storage formats, and includes an early peek at three upcoming features for VAST: A web plugin with a REST API and an integrated frontend user interface, Docker Compose configuration files for getting started with VAST faster and showing how to integrate VAST into your SOC, and new Python bindings that will make writing integrations easier and allow for using VAST with your data science libraries, like Pandas.
Parquet & Feather: Writing Security Telemetry
VAST v2.3.1
VAST v2.3.1 is now available. This small bugfix release addresses an issue where compaction would hang when encountering invalid partitions that were produced by older versions of VAST when a large max-partition-size was set in combination with poorly compressible input data.
Parquet & Feather: Enabling Open Investigations
Apache Parquet is the common denominator for structured data at rest. The data science ecosystem has long appreciated this. But infosec? Why should you care about Parquet when building a threat detection and investigation platform? In this blog post series we share our opinionated view on this question. In the next three blog posts, we
- describe how VAST uses Parquet and its little brother Feather
- benchmark the two formats against each other for typical workloads
- share our experience with all the engineering gotchas we encountered along the way
A Git Retrospective
The VAST project is roughly a decade old. But what happened over the last 10 years? This blog post looks back over time through the lens of the git merge commits.
Why merge commits? Because they represent a unit of completed contribution. Feature work takes place in dedicated branches, with the merge to the main branch sealing the deal. Some feature branches have just one commit, whereas others have dozens. The distribution is not uniform. As of 6f9c84198 on Sep 2, 2022, there are a total of 13,066 commits, with 2,334 being merges (17.9%). We’ll take a deeper look at the merge commits.
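For readers who want to reproduce these counts on their own checkout, here is a minimal Python sketch. It assumes git is on the PATH and relies on `git rev-list --count`, optionally with `--merges`. The helper names `count_commits` and `merge_share` are made up for illustration and are not part of any VAST tooling.

```python
import subprocess


def merge_share(total_commits: int, merge_commits: int) -> float:
    """Return the percentage of commits that are merges."""
    if total_commits <= 0:
        raise ValueError("total_commits must be positive")
    return 100.0 * merge_commits / total_commits


def count_commits(repo_path: str, rev: str = "HEAD", merges_only: bool = False) -> int:
    """Count commits reachable from `rev`; optionally only merge commits.

    Hypothetical helper: shells out to `git rev-list --count [--merges] <rev>`.
    """
    cmd = ["git", "-C", repo_path, "rev-list", "--count"]
    if merges_only:
        cmd.append("--merges")
    cmd.append(rev)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Numbers quoted in the post for commit 6f9c84198.
    print(f"{merge_share(13066, 2334):.1f}% of commits are merges")
```

To count against a real checkout, call `count_commits(path)` and `count_commits(path, merges_only=True)` and pass both results to `merge_share`.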