This guide provides an overview of data normalization in TQL. Normalization transforms raw, inconsistent data into a clean, standardized format that’s ready for analysis, storage, and sharing.
What is normalization?
Normalization involves several key transformations:
- Clean up values - Replace nulls, normalize sentinels, fix types
- Map to schemas - Translate fields to a standard schema like ASIM, CIM, ECS, OCSF, or UDM
- Package mappings - Create reusable, tested mapping operators
Each step builds on the previous. Start with clean data, then map to your target schema, and finally package your mappings for production use.
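The three steps can be pictured as small functions that are composed in order. The following is a minimal sketch in Python rather than TQL; the function names (`clean`, `map_to_schema`, `package`) and the target field names are hypothetical and only illustrate how each step builds on the previous one.

```python
from typing import Any, Callable

Event = dict[str, Any]
Stage = Callable[[Event], Event]


def clean(event: Event) -> Event:
    """Step 1: turn placeholder strings into real nulls (None)."""
    return {k: (None if v in ("N/A", "-") else v) for k, v in event.items()}


def map_to_schema(event: Event) -> Event:
    """Step 2: rename source fields to hypothetical target-schema names."""
    renames = {"src_ip": "source.ip", "dst_ip": "destination.ip"}
    return {renames.get(k, k): v for k, v in event.items()}


def package(*stages: Stage) -> Stage:
    """Step 3: bundle the stages into one reusable, testable function."""
    def run(event: Event) -> Event:
        for stage in stages:
            event = stage(event)
        return event
    return run


normalize = package(clean, map_to_schema)
result = normalize({"src_ip": "10.0.0.1", "dst_ip": "-", "user": "N/A"})
print(result)
```

Because the packaged function is a single callable, it can be exercised with sample events before it is used in production.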
Why normalize?
Raw data from different sources varies in:
- Field names: src_ip vs source_address vs client.ip
- Value formats: "true" vs true vs 1 vs "yes"
- Missing values: null vs "" vs "-" vs "N/A"
- Timestamps: Unix epochs vs ISO strings vs custom formats
Normalization solves these inconsistencies, enabling:
- Unified queries across data sources
- Reliable enrichment and correlation
- Consistent analytics and dashboards
- Interoperability with external tools
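To make the idea of unified queries concrete, here is a small Python sketch (not TQL) that reconciles the field-name, boolean, and timestamp differences listed above. The canonical name `source_ip`, the `active` and `time` fields, and the alias table are assumptions chosen for illustration.

```python
from datetime import datetime, timezone

# Hypothetical alias table: every variant collapses to one canonical name.
FIELD_ALIASES = {
    "src_ip": "source_ip",
    "source_address": "source_ip",
    "client.ip": "source_ip",
}
TRUTHY = {"true", "1", "yes"}
FALSY = {"false", "0", "no"}


def to_bool(value):
    """Accept "true", True, 1, "yes" and their negative counterparts."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in TRUTHY:
        return True
    if text in FALSY:
        return False
    return None  # unrecognized values become null instead of guessing


def to_timestamp(value):
    """Accept Unix epoch seconds or an ISO 8601 string; return UTC."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assume UTC if naive
    return parsed.astimezone(timezone.utc)


def unify(event):
    out = {FIELD_ALIASES.get(k, k): v for k, v in event.items()}
    if "active" in out:
        out["active"] = to_bool(out["active"])
    if "time" in out:
        out["time"] = to_timestamp(out["time"])
    return out


a = unify({"src_ip": "10.0.0.1", "active": "yes", "time": 1704067200})
b = unify({"client.ip": "10.0.0.1", "active": True,
           "time": "2024-01-01T00:00:00+00:00"})
print(a == b)
```

Once both records share one shape, a single filter on `source_ip` matches events from either source.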
The normalization pipeline
A typical normalization pipeline follows this structure:
// 1. Collect raw data
from_kafka "raw-events"
// 2. Parse into structured events
this = message.parse_json()
// 3. Clean up values
replace what="N/A", with=null
replace what="-", with=null
// 4. Map to target schema
my_source::ocsf::map
// 5. Output normalized events
publish "normalized-events"

Normalization guides
Start with cleanup, then choose the schema guide for your target platform. Schema guides are listed alphabetically by acronym.
Clean up values
Clean up values - Start by fixing data quality issues:
- Replace null placeholders ("None", "N/A", "-")
- Normalize sentinel values
- Fix types (strings to timestamps, IPs, numbers)
- Provide default values for missing fields
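The four cleanup tasks above can be sketched as one pass over an event. This is an illustrative Python example, not the TQL from the guide; the field names (`src_ip`, `port`, `bytes`, `time`, `user`) and the choice of -1 as a sentinel are assumptions.

```python
import ipaddress
from datetime import datetime

NULL_PLACEHOLDERS = {"None", "N/A", "-", ""}
SENTINELS = {"bytes": -1}  # assumption: -1 means "unknown" in this source


def clean(event, defaults):
    out = {}
    # 1. Replace null placeholders with real nulls.
    for key, value in event.items():
        if isinstance(value, str) and value.strip() in NULL_PLACEHOLDERS:
            value = None
        out[key] = value
    # 2. Normalize sentinel values to null.
    for field, sentinel in SENTINELS.items():
        if out.get(field) == sentinel:
            out[field] = None
    # 3. Fix types: strings become IPs, numbers, and timestamps.
    if out.get("src_ip") is not None:
        out["src_ip"] = ipaddress.ip_address(out["src_ip"])
    if out.get("port") is not None:
        out["port"] = int(out["port"])
    if out.get("time") is not None:
        out["time"] = datetime.fromisoformat(out["time"])
    # 4. Provide defaults for fields that are missing or null.
    for field, default in defaults.items():
        if out.get(field) is None:
            out[field] = default
    return out


raw = {
    "src_ip": "192.0.2.7",
    "port": "443",
    "user": "N/A",
    "dst_host": "-",
    "bytes": -1,
    "time": "2024-01-01T00:00:00+00:00",
}
cleaned = clean(raw, defaults={"user": "unknown"})
print(cleaned)
```

Replacing placeholders before applying defaults matters: otherwise "N/A" would count as a present value and the default would never be used.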
Map to ASIM
Map to ASIM - Learn how to map events to Microsoft Sentinel ASIM records:
- Choose the correct ASIM event or entity schema
- Populate schema, product, and event metadata
- Map role-prefixed source, destination, actor, target, and device fields
- Preserve unmapped fields in AdditionalFields
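As a rough illustration of role-prefixed mapping, the Python sketch below turns nested `src` and `dst` objects into `Src...` and `Dst...` fields and moves everything else into `AdditionalFields`. The raw event layout, the vendor and product values, and the choice of the NetworkSession schema are assumptions; check field names against the ASIM schema reference before relying on them.

```python
ROLE_PREFIX = {"src": "Src", "dst": "Dst"}
ATTRIBUTE_SUFFIX = {"ip": "IpAddr", "port": "PortNumber"}


def map_to_asim(raw):
    raw = dict(raw)  # work on a copy so the caller's event is untouched
    record = {
        "EventVendor": "ExampleVendor",    # hypothetical metadata values
        "EventProduct": "ExampleProduct",
        "EventSchema": "NetworkSession",
    }
    for role, prefix in ROLE_PREFIX.items():
        endpoint = dict(raw.pop(role, {}))
        for attr, suffix in ATTRIBUTE_SUFFIX.items():
            if attr in endpoint:
                record[prefix + suffix] = endpoint.pop(attr)
        if endpoint:
            raw[role] = endpoint  # keep endpoint attributes we did not map
    record["AdditionalFields"] = raw  # everything that has no ASIM field
    return record


asim = map_to_asim({
    "src": {"ip": "10.0.0.1", "port": 51234},
    "dst": {"ip": "192.0.2.7", "port": 443, "zone": "dmz"},
    "rule_id": 17,
})
print(asim)
```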
Map to CIM
Map to CIM - Learn how to map events to Splunk CIM fields:
- Choose the correct CIM data model and dataset
- Apply dataset tags and constraints
- Populate normalized fields for data model acceleration
- Send mapped events to Splunk HEC with metadata
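The last point, sending mapped events with metadata, can be sketched as wrapping the CIM-style event in an HEC JSON envelope. This Python example only builds the payload and does not send it; the raw field names, the `sourcetype` and `index` values, and the verdict-to-action rule are assumptions.

```python
import json


def to_cim_network_traffic(raw):
    """Map hypothetical raw fields to CIM-style network traffic fields."""
    return {
        "src": raw["src_ip"],
        "dest": raw["dst_ip"],
        "dest_port": int(raw["dst_port"]),
        "transport": raw["proto"].lower(),
        "action": "allowed" if raw["verdict"] == "accept" else "blocked",
    }


def to_hec_payload(raw):
    """Wrap the mapped event in an HEC envelope with event metadata."""
    return {
        "time": raw["ts"],                   # epoch seconds
        "host": raw["sensor"],
        "sourcetype": "my_source:traffic",   # hypothetical sourcetype
        "index": "main",                     # hypothetical index
        "event": to_cim_network_traffic(raw),
    }


payload = to_hec_payload({
    "ts": 1704067200,
    "sensor": "fw01",
    "src_ip": "10.0.0.1",
    "dst_ip": "192.0.2.7",
    "dst_port": "443",
    "proto": "TCP",
    "verdict": "accept",
})
body = json.dumps(payload)
print(body)
```

Keeping metadata such as `sourcetype` and `index` in the envelope, outside the event body, leaves the mapped fields free of transport concerns.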
Map to ECS
Map to ECS - Learn how to map events to Elastic Common Schema fields:
- Populate @timestamp and ecs.version
- Choose event.kind, event.category, and event.type
- Map source, destination, network, and observer fieldsets
- Preserve source-specific details in a custom namespace
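A minimal Python sketch of these steps for a network connection event might look as follows. The raw field names, the ECS version string, and the `my_source` namespace are assumptions for illustration; the categorization values should be checked against the ECS field reference.

```python
ECS_VERSION = "8.11.0"         # assumption: pin the version you target
CUSTOM_NAMESPACE = "my_source"  # hypothetical namespace for leftovers


def map_to_ecs(raw):
    raw = dict(raw)  # copy, because fields are popped as they are mapped
    ecs = {
        "@timestamp": raw.pop("ts"),
        "ecs": {"version": ECS_VERSION},
        # Categorization fields: category and type are arrays in ECS.
        "event": {
            "kind": "event",
            "category": ["network"],
            "type": ["connection"],
        },
        "source": {"ip": raw.pop("src_ip"), "port": raw.pop("src_port", None)},
        "destination": {"ip": raw.pop("dst_ip"), "port": raw.pop("dst_port", None)},
        "network": {"transport": raw.pop("proto").lower()},
        "observer": {"hostname": raw.pop("sensor", None)},
    }
    if raw:
        # Whatever was not mapped stays available under a custom namespace.
        ecs[CUSTOM_NAMESPACE] = raw
    return ecs


ecs_event = map_to_ecs({
    "ts": "2024-01-01T00:00:00Z",
    "src_ip": "10.0.0.1",
    "src_port": 51234,
    "dst_ip": "192.0.2.7",
    "dst_port": 443,
    "proto": "TCP",
    "sensor": "fw01",
    "rule_id": 17,
})
print(ecs_event)
```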
Map to OCSF
Map to OCSF - Learn the comprehensive approach to OCSF mapping:
- Identify the correct event class
- Map fields by attribute group
- Handle unmapped fields
- Validate with ocsf::cast
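The Python sketch below walks through the same steps for a Network Activity event. It is not a substitute for ocsf::cast: `missing_required` is a hypothetical stand-in that only checks a handful of required attributes. The class, category, and activity numbers and the schema version are stated from memory of the OCSF schema and should be verified against the schema browser.

```python
OCSF_VERSION = "1.3.0"  # assumption: the schema version being targeted
CLASS_UID = 4001        # Network Activity
CATEGORY_UID = 4        # Network Activity category
ACTIVITY_TRAFFIC = 6    # "Traffic" activity within this class

REQUIRED = ("activity_id", "category_uid", "class_uid",
            "metadata", "severity_id", "time", "type_uid")


def map_to_ocsf(raw):
    raw = dict(raw)
    return {
        # Classification attributes identify the event class.
        "activity_id": ACTIVITY_TRAFFIC,
        "category_uid": CATEGORY_UID,
        "class_uid": CLASS_UID,
        "type_uid": CLASS_UID * 100 + ACTIVITY_TRAFFIC,
        "severity_id": 1,  # Informational
        # Occurrence attributes.
        "time": raw.pop("ts_ms"),  # epoch milliseconds
        # Context attributes.
        "metadata": {
            "version": OCSF_VERSION,
            "product": {"name": "ExampleProduct",
                        "vendor_name": "ExampleVendor"},
        },
        # Primary attributes.
        "src_endpoint": {"ip": raw.pop("src_ip")},
        "dst_endpoint": {"ip": raw.pop("dst_ip")},
        # Everything without an OCSF home is kept, not dropped.
        "unmapped": raw,
    }


def missing_required(event):
    """Hypothetical check: list required attributes that are absent."""
    return [name for name in REQUIRED if event.get(name) is None]


ocsf_event = map_to_ocsf({
    "ts_ms": 1704067200000,
    "src_ip": "10.0.0.1",
    "dst_ip": "192.0.2.7",
    "rule_id": 17,
})
print(ocsf_event["type_uid"], missing_required(ocsf_event))
```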
Map to UDM
Map to UDM - Learn how to map events to Google SecOps UDM records:
- Choose the correct UDM event type
- Populate metadata and participant nouns
- Convert source values to UDM enums
- Preserve unmapped fields in additional
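Enum conversion is the step that most often needs a lookup table. The Python sketch below converts protocol names or numbers into UDM-style enum strings and builds a small event around them. The raw field names are hypothetical, and the enum and event type names are given as recalled from the UDM field list, so verify them before use.

```python
# Lookup from source protocol names and IANA numbers to UDM enum names.
IP_PROTOCOL_ENUM = {
    "tcp": "TCP", "udp": "UDP", "icmp": "ICMP",
    6: "TCP", 17: "UDP", 1: "ICMP",
}


def to_udm_ip_protocol(value):
    """Return a UDM-style enum string, or the unknown value as fallback."""
    if isinstance(value, str):
        text = value.strip().lower()
        key = int(text) if text.isdigit() else text
    else:
        key = value
    return IP_PROTOCOL_ENUM.get(key, "UNKNOWN_IP_PROTOCOL")


def map_to_udm(raw):
    raw = dict(raw)
    return {
        "metadata": {
            "event_type": "NETWORK_CONNECTION",
            "event_timestamp": raw.pop("ts"),
        },
        # Participant nouns: who initiated and who was contacted.
        "principal": {"ip": [raw.pop("src_ip")]},
        "target": {"ip": [raw.pop("dst_ip")], "port": int(raw.pop("dst_port"))},
        "network": {"ip_protocol": to_udm_ip_protocol(raw.pop("proto"))},
        "additional": raw,  # unmapped fields are preserved here
    }


udm_event = map_to_udm({
    "ts": "2024-01-01T00:00:00Z",
    "src_ip": "10.0.0.1",
    "dst_ip": "192.0.2.7",
    "dst_port": "443",
    "proto": "6",
    "rule_id": 17,
})
print(udm_event)
```

Falling back to an explicit unknown value keeps the record valid when a source emits a protocol the table does not cover.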
When to normalize
Normalize data at the ingestion point in your pipeline:
Collection → Parsing → Normalization → Storage/Forwarding
                             ↑
                       You are here

Normalizing early ensures all downstream consumers work with consistent data. Avoid normalizing the same data multiple times by storing normalized events.