Skip to main content

· 2 min read
Oliver Rochford

Staying ahead in the realm of cybersecurity means relentlessly navigating an endless sea of emerging threats and ever-increasing data volumes. The battle to stay one step ahead can often feel overwhelming, especially when your organization's data costs are skyrocketing.

· 5 min read
Oliver Rochford

We're overjoyed to announce our highly-anticipated security data pipeline platform at the renowned BlackHat conference in Las Vegas. The launch marks a milestone in our journey to bring simplicity to data engineering for cybersecurity operations, and to bring a cost-efficient way to tackle the increasingly complex data engineering challenges that security teams confront daily.

· 9 min read
Matthias Vallentin

Our Tenzir Query Language (TQL) is a pipeline language that works by chaining operators into data flows. When we designed TQL, we specifically studied Splunk's Search Processing Language (SPL), as it generally leaves a positive impression for security analysts that are not data engineers. Our goal was to take all the good things of SPL, but provide a more powerful language without compromising simplicity. In this blog post, we explain how the two languages differ using concrete threat hunting examples.

· 5 min read
Matthias Vallentin

Did you know that Zeek supports log rotation triggers, so that you can do anything you want with a newly rotated batch of logs?

· 5 min read
Matthias Vallentin

As an incident responder, threat hunter, or detection engineer, getting quickly to your analytics is key for productivity. For network-based visibility and detection, Zeek and Suricata are the bedrock for many security teams. But operationalizing these tools can take a good chunk of time.

So we asked ourselves: How can we make it super easy to work with Zeek and Suricata logs?

· 3 min read
Matthias Vallentin

Zeek turns packets into structured logs. By default, Zeek generates one file per log type and per rotation timeframe. If you don't want to wrangle files and directly process the output, this short blog post is for you.

· 8 min read
Matthias Vallentin

Zeek offers many ways to produce and consume logs. In this blog, we explain the various Zeek logging formats and show how you can get the most out of Zeek with Tenzir. We conclude with recommendations for when to use what Zeek format based on your use case.

· 2 min read
Dominik Lohmann

VAST is now Tenzir. This blog post describes what changed when we renamed the project.

· 5 min read
Dominik Lohmann

VAST v2.4 completes the switch to open storage formats, and includes an early peek at three upcoming features for VAST: A web plugin with a REST API and an integrated frontend user interface, Docker Compose configuration files for getting started with VAST faster and showing how to integrate VAST into your SOC, and new Python bindings that will make writing integrations easier and allow for using VAST with your data science libraries, like Pandas.

· One min read
Benno Evers

VAST v2.3.1 is now available. This small bugfix release addresses an issue where compaction would hang if encountering invalid partitions that were produced by older versions of VAST when a large max-partition-size was set in combination with badly compressible input data.

· 6 min read
Matthias Vallentin
Thomas Peiselt

Apache Parquet is the common denominator for structured data at rest. The data science ecosystem has long appreciated this. But infosec? Why should you care about Parquet when building a threat detection and investigation platform? In this blog post series we share our opinionated view on this question. In the next three blog posts, we

  1. describe how VAST uses Parquet and its little brother Feather
  2. benchmark the two formats against each other for typical workloads
  3. share our experience with all the engineering gotchas we encountered along the way

· 5 min read
Matthias Vallentin

The VAST project is roughly a decade old. But what happened over the last 10 years? This blog post looks back over time through the lens of the git merge commits.

Why merge commits? Because they represent a unit of completed contribution. Feature work takes place in dedicated branches, with the merge to the main branch sealing the deal. Some feature branches have just one commit, whereas others dozens. The distribution is not uniform. As of 6f9c84198 on Sep 2, 2022, there are a total of 13,066 commits, with 2,334 being merges (17.9%). We’ll take a deeper look at the merge commits.