SeerflowEvent Model
Concept
Every log event in the pipeline is represented as a SeerflowEvent — a single struct that carries fields from four log schema standards simultaneously:
- OpenTelemetry LogRecord — trace context, nanosecond timestamps, severity 1-24
- Elastic Common Schema (ECS) — event.kind/category/type/outcome hierarchy
- OCSF — numeric taxonomy with category_uid/class_uid/type_uid
- Sigma — logsource.category/product/service for detection rule matching
Why unify four schemas? Because downstream consumers speak different languages. The detection ensemble reads anomaly scores. Sigma rules query logsource fields. The correlation engine resolves entity references. The alerting layer formats ECS event types for human-readable notifications. One struct serves them all — no translation layer, no schema mapping at query time.
Design Choices
| Choice | Why |
|---|---|
| msgspec.Struct | 4x faster serialization than dataclasses, 17x faster than Pydantic v2 |
| frozen=True | Immutability prevents hidden side effects, so events are safe to pass between pipeline stages |
| gc=False | Disables cyclic garbage collection tracking: SeerflowEvent has no reference cycles, which saves GC overhead at 10K+ events/sec |
| tag=True | Adds a type discriminator to serialized output, enabling tagged-union decoding when multiple struct types share a wire channel |
How It Works
Field Reference
Identity
| Field | Type | Description | Example |
|---|---|---|---|
| event_id | UUID | Unique event identifier | 550e8400-e29b-41d4-a716-446655440000 |
| timestamp_ns | int | Event time (nanoseconds since epoch) | 1742007247000000000 |
| observed_ns | int | Pipeline receive time (nanoseconds since epoch) | 1742007247001000000 |
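The two nanosecond fields can be turned into a readable time and an ingest latency with a few lines of standard-library code. This is a minimal sketch; the helper names ns_to_datetime and ingest_latency_ms are hypothetical and not part of Seerflow.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def ns_to_datetime(ts_ns: int) -> datetime:
    """Convert epoch nanoseconds to an aware UTC datetime.

    datetime only stores microseconds, so the last three digits are dropped.
    """
    seconds, remainder_ns = divmod(ts_ns, NS_PER_SECOND)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=remainder_ns // 1_000
    )


def ingest_latency_ms(timestamp_ns: int, observed_ns: int) -> float:
    """Milliseconds between the event time and the pipeline receive time."""
    return (observed_ns - timestamp_ns) / 1_000_000


# Values taken from the Identity table above.
event_time = ns_to_datetime(1742007247000000000)
latency = ingest_latency_ms(1742007247000000000, 1742007247001000000)
```

For the example values, the event time is 2025-03-15 02:54:07 UTC and the pipeline observed it one millisecond later.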
Trace Context (OpenTelemetry)
| Field | Type | Description | Example |
|---|---|---|---|
| trace_id | str \| None | OTel trace ID for correlated spans | "4bf92f3577b34da6a3ce929d0e0e4736" |
| span_id | str \| None | OTel span ID | "00f067aa0ba902b7" |
Severity
| Field | Type | Description | Example |
|---|---|---|---|
| severity_id | SeverityLevel | Unified severity (0=TRACE to 6=FATAL) | SeverityLevel.WARNING (3) |
| otel_severity | int | OTel SeverityNumber (1-24) | 13 (WARN) |
The SeverityLevel enum maps to a 7-level scale: TRACE (0), INFORMATIONAL (1), NOTICE (2), WARNING (3), ERROR (4), CRITICAL (5), FATAL (6).
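The scale above could be written as an IntEnum, sketched below. The member names and values come from the text. The severity_from_otel helper is an assumption: it folds OpenTelemetry's SeverityNumber ranges (1-4 TRACE, 5-8 DEBUG, 9-12 INFO, 13-16 WARN, 17-20 ERROR, 21-24 FATAL) onto the unified scale, and how Seerflow actually maps DEBUG, NOTICE, and CRITICAL is not stated here.

```python
from enum import IntEnum


class SeverityLevel(IntEnum):
    TRACE = 0
    INFORMATIONAL = 1
    NOTICE = 2
    WARNING = 3
    ERROR = 4
    CRITICAL = 5
    FATAL = 6


def severity_from_otel(severity_number: int) -> SeverityLevel:
    """Hypothetical mapping from OTel SeverityNumber to the unified scale.

    Assumption: DEBUG (5-8) is folded into TRACE; NOTICE and CRITICAL have
    no OTel range of their own, so this helper never returns them.
    """
    if not 1 <= severity_number <= 24:
        raise ValueError(f"SeverityNumber out of range: {severity_number}")
    if severity_number <= 8:
        return SeverityLevel.TRACE
    if severity_number <= 12:
        return SeverityLevel.INFORMATIONAL
    if severity_number <= 16:
        return SeverityLevel.WARNING
    if severity_number <= 20:
        return SeverityLevel.ERROR
    return SeverityLevel.FATAL
```

With this mapping, otel_severity 13 lands on WARNING (3) and 17 on ERROR (4), matching the examples used on this page.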
Classification (ECS)
| Field | Type | Description | Example |
|---|---|---|---|
| event_kind | str | ECS event kind | "alert" |
| event_category | str | ECS category | "authentication" |
| event_type | str | ECS type | "start" |
| event_outcome | str | ECS outcome | "failure" |
| event_action | str | ECS action | "ssh_login" |
OCSF Taxonomy
| Field | Type | Description | Example |
|---|---|---|---|
| category_uid | int | OCSF category | 3 (Identity & Access) |
| class_uid | int | OCSF class | 3002 (Authentication) |
| type_uid | int | OCSF type | 300201 (Logon: Failed) |
| activity_id | int | OCSF activity | 1 (Logon) |
Invariant: type_uid = class_uid * 100 + activity_id. Callers setting any of these three must set all three consistently.
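The invariant is simple arithmetic, so a caller can derive type_uid rather than set it by hand. The two helpers below are a hypothetical sketch, not functions from the Seerflow codebase.

```python
def ocsf_type_uid(class_uid: int, activity_id: int) -> int:
    """Derive type_uid from the invariant type_uid = class_uid * 100 + activity_id."""
    return class_uid * 100 + activity_id


def ocsf_ids_consistent(class_uid: int, activity_id: int, type_uid: int) -> bool:
    """Return True when the three OCSF identifiers satisfy the invariant."""
    return type_uid == ocsf_type_uid(class_uid, activity_id)


# Authentication class (3002) with the Logon activity (1), as in the table above.
logon_type_uid = ocsf_type_uid(3002, 1)
```

Here 3002 * 100 + 1 gives 300201, the type_uid shown in the table.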
Content
| Field | Type | Description | Example |
|---|---|---|---|
| message | str | Human-readable log message | "Failed password for root from 198.51.100.23" |
| body | msgspec.Raw \| None | Deferred-decoding payload for arbitrary log bodies | Raw JSON bytes |
Source Tracking
| Field | Type | Description | Example |
|---|---|---|---|
| source_type | str | Receiver type that ingested this event | "syslog" |
| source_id | str | Unique source identifier | "syslog-udp" |
| log_source_category | str | Sigma logsource category | "process_creation" |
| log_source_product | str | Sigma logsource product | "linux" |
| log_source_service | str | Sigma logsource service | "sshd" |
Drain3 Metadata
| Field | Type | Description | Example |
|---|---|---|---|
| template_id | int | Drain3 cluster ID (-1 = no match) | 42 |
| template_str | str | Extracted template | "Failed password for <*> from <*>" |
| template_params | tuple[str, ...] | Wildcard values | ("root", "<IP>") |
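To make the relationship between template_str and template_params concrete, the sketch below recovers wildcard values from a message with a regular expression. This is an illustration only: it does not use Drain3's own API, the extract_params helper is hypothetical, and any masking Seerflow applies (such as replacing an address with "<IP>") is not modelled.

```python
import re

WILDCARD = "<*>"


def extract_params(template: str, message: str) -> tuple[str, ...]:
    """Return the text matched by each <*> wildcard, or () if nothing matches."""
    # Split while keeping the wildcards, escape the literal parts, and turn
    # each wildcard into a lazy capture group.
    parts = re.split(r"(<\*>)", template)
    pattern = "".join(
        "(.+?)" if part == WILDCARD else re.escape(part) for part in parts
    )
    match = re.fullmatch(pattern, message)
    return match.groups() if match else ()


params = extract_params(
    "Failed password for <*> from <*>",
    "Failed password for root from 198.51.100.23",
)
```

For the message from the Content table, the two wildcards capture the username and the source address in order.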
Entity References
| Field | Type | Description | Example |
|---|---|---|---|
| entity_refs | tuple[str, ...] | UUID5 entity IDs (resolved) | ("a1b2c3...",) |
| related_ips | tuple[str, ...] | Extracted IP addresses | ("198.51.100.23",) |
| related_users | tuple[str, ...] | Extracted usernames | ("root",) |
| related_hosts | tuple[str, ...] | Extracted hostnames | ("web-prod-01",) |
| related_files | tuple[str, ...] | Extracted file paths | ("/var/log/auth.log",) |
| related_domains | tuple[str, ...] | Extracted domains | ("api.example.com",) |
| related_processes | tuple[str, ...] | Extracted process names | ("sshd",) |
| related_hashes | tuple[str, ...] | File/process hashes | ("sha256:e3b0c4...",) |
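UUID5 identifiers are deterministic: the same namespace and name always produce the same ID, which is what lets separate events refer to one entity. The sketch below shows the mechanism with the standard library. The namespace value and the "kind:value" naming scheme are assumptions made for this example, not Seerflow's actual scheme.

```python
import uuid

# Hypothetical namespace; a real deployment would fix its own value.
ENTITY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "entities.example.invalid")


def entity_id(kind: str, value: str) -> str:
    """Derive a stable UUID5 string for an entity such as an IP or a user."""
    return str(uuid.uuid5(ENTITY_NAMESPACE, f"{kind}:{value}"))


# The same input always yields the same ID, so two events that mention
# 198.51.100.23 resolve to one entity.
ip_ref = entity_id("ip", "198.51.100.23")
user_ref = entity_id("user", "root")
entity_refs = (ip_ref, user_ref)
```

Prefixing the name with the entity kind keeps a user called "sshd" distinct from a process called "sshd".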
MITRE ATT&CK
| Field | Type | Description | Example |
|---|---|---|---|
| mitre_tactics | tuple[str, ...] | ATT&CK tactic IDs | ("TA0006",) (Credential Access) |
| mitre_techniques | tuple[str, ...] | ATT&CK technique IDs | ("T1110",) (Brute Force) |
Scores
| Field | Type | Description | Example |
|---|---|---|---|
| risk_score | float | Accumulated entity risk (0-100) | 72.0 |
| confidence | float | Detection confidence (0-1) | 0.95 |
| anomaly_score | float | Blended ML anomaly score (0-1) | 0.65 |
Metadata
| Field | Type | Description | Example |
|---|---|---|---|
| attributes | dict[str, AttrValue] | Arbitrary key-value pairs | {"facility": "auth"} |
| tags | tuple[str, ...] | Free-form tags | ("brute-force", "external") |
| raw_event | str | Original unmodified log line | Full raw text |
| resource_attrs | dict[str, str] | OTel resource attributes | {"service.name": "sshd"} |
Schema Unification
The security and observability industry has no single standard for log events. Four competing schemas dominate, each designed for a different purpose:
| Schema | Origin | Strength | Used by |
|---|---|---|---|
| OpenTelemetry | CNCF (2019) | Distributed tracing, nanosecond timestamps, resource attributes | Cloud-native apps, Kubernetes, OTel Collector |
| Elastic Common Schema | Elastic (2019) | Human-readable event classification (event.kind/category/type/outcome) | Elasticsearch, Kibana, Elastic SIEM |
| OCSF | AWS + Splunk (2022) | Numeric taxonomy for machine processing (category_uid/class_uid/type_uid) | AWS Security Lake, Splunk, CrowdStrike |
| Sigma | Open-source (2017) | Portable detection rules via logsource.category/product/service matching | 3,000+ SigmaHQ rules, every major SIEM |
Traditional tools pick one schema and translate everything into it, losing information in the process. Seerflow takes a different approach: carry all four schemas simultaneously in a single SeerflowEvent struct. No translation, no lossy mapping, no schema conversion at query time. The fields Seerflow borrows from each schema are listed below.
OpenTelemetry: distributed tracing schema from CNCF (2019). Strength: nanosecond timestamps, trace/span correlation, resource attributes.
| Field | Description |
|---|---|
| trace_id | Distributed trace correlation |
| span_id | Span within a trace |
| otel_severity | Severity 1–24 |
| body | Raw payload (deferred decode) |
| resource_attrs | service.name, host.name, … |
Elastic Common Schema (ECS): Elastic's human-readable event classification (2019). Strength: descriptive string taxonomy that operators can read directly.
| Field | Description |
|---|---|
| event_kind | alert · event · metric · state |
| event_category | authentication · process · network |
| event_type | start · end · info · error |
| event_outcome | success · failure · unknown |
| event_action | ssh_login · file_create · … |
OCSF: Open Cybersecurity Schema Framework from AWS + Splunk (2022). Strength: numeric taxonomy for fast machine processing.
| Field | Description |
|---|---|
| category_uid | 1=System · 3=Identity · 4=Network |
| class_uid | 3002=Authentication |
| type_uid | 300201=Logon Failed |
| activity_id | 1=Logon · 3=Terminate |
Sigma: portable detection rule matching (2017). Strength: 3,000+ SigmaHQ rules dispatch against these three fields.
| Field | Description |
|---|---|
| log_source_category | process_creation · firewall |
| log_source_product | linux · windows · aws |
| log_source_service | sshd · nginx · cloudtrail |
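A Sigma rule's logsource block may name any subset of category, product, and service, and a key the rule leaves out places no constraint on the event. The sketch below applies that rule to the three log_source_* fields. The logsource_matches helper and the plain-dict event are hypothetical simplifications; Seerflow's own Sigma engine may dispatch rules differently.

```python
from typing import Mapping

# Sigma logsource key -> SeerflowEvent field name (names from the tables above).
FIELD_FOR_KEY = {
    "category": "log_source_category",
    "product": "log_source_product",
    "service": "log_source_service",
}


def logsource_matches(
    rule_logsource: Mapping[str, str], event_fields: Mapping[str, str]
) -> bool:
    """Return True when every key the rule specifies equals the event's value."""
    for key, event_field in FIELD_FOR_KEY.items():
        wanted = rule_logsource.get(key)
        if wanted is not None and event_fields.get(event_field, "") != wanted:
            return False
    return True


ssh_event = {
    "log_source_category": "",
    "log_source_product": "linux",
    "log_source_service": "sshd",
}

# Rule constrains product and service only, so the empty category is ignored.
ssh_rule_applies = logsource_matches(
    {"product": "linux", "service": "sshd"}, ssh_event
)
# Rule requires a category the event does not carry.
process_rule_applies = logsource_matches(
    {"category": "process_creation", "product": "linux"}, ssh_event
)
```

Filtering rules by logsource first means only the rules that could apply to an event have their detection logic evaluated.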
Who Reads What
Each pipeline component queries only the fields it understands; no component needs to know about the other schemas:
| Component | Schema it reads | Fields it uses | Why |
|---|---|---|---|
| OTLP receivers | OpenTelemetry | trace_id, span_id, otel_severity, body, resource_attrs | Native format, zero conversion on ingest |
| Sigma engine | Sigma | log_source_category, log_source_product, log_source_service | Rule matching requires logsource fields to route rules to the right events |
| Detection ensemble | Seerflow-native | anomaly_score, risk_score, template_id, related_* | ML models and scoring don't care about classification schemas |
| Correlation engine | ECS + Seerflow | event_category, event_outcome, entity_refs, related_* | Entity graph needs both classification context and entity references |
| Dashboard / alerting | ECS + OCSF | event_kind, event_category, category_uid, class_uid | Human-readable labels (ECS) + machine-readable taxonomy (OCSF) for filtering |
| Export / SIEM forwarding | All four | Everything | Forward events to downstream systems in their native schema without translation |
Concrete Example: One Event, Four Lenses
A single SSH brute-force event carries all four schemas at once:

OpenTelemetry lens:

    trace_id: "4bf92f3577b34da6a3ce929d0e0e4736"
    otel_severity: 13 (WARN)
    resource_attrs: {"service.name": "sshd", "host.name": "web-prod-01"}

ECS lens:

    event_kind: "alert"
    event_category: "authentication"
    event_type: "start"
    event_outcome: "failure"
    event_action: "ssh_login"

OCSF lens:

    category_uid: 3 → Identity & Access Management
    class_uid: 3002 → Authentication
    type_uid: 300201 → Authentication: Logon (Failed)
    activity_id: 1 → Logon

Sigma lens:

    log_source_product: "linux"
    log_source_service: "sshd"
    → matches rule: "Sigma SSH Brute Force Detection"

No translation happened. The OTLP receiver populated the OTel fields at ingest. The normalizer set the ECS and OCSF fields based on message classification. The Sigma engine matched rules using the logsource fields. Every downstream consumer got exactly the fields it needed, in its native format.
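Because every lens is just a subset of fields on the same event, a consumer can obtain its view by selecting names rather than converting values. The sketch below shows this with a plain dict standing in for the event. The field groupings follow the schema lists on this page, while the LENSES table and the lens helper are hypothetical.

```python
# Field groupings per schema, as listed on this page.
LENSES = {
    "otel": ("trace_id", "span_id", "otel_severity", "body", "resource_attrs"),
    "ecs": ("event_kind", "event_category", "event_type",
            "event_outcome", "event_action"),
    "ocsf": ("category_uid", "class_uid", "type_uid", "activity_id"),
    "sigma": ("log_source_category", "log_source_product",
              "log_source_service"),
}


def lens(event_fields: dict, schema: str) -> dict:
    """Select the fields belonging to one schema; values are passed through as-is."""
    return {
        name: event_fields[name]
        for name in LENSES[schema]
        if name in event_fields
    }


# A few fields from the SSH brute-force example above.
ssh_event = {
    "otel_severity": 13,
    "event_kind": "alert",
    "event_outcome": "failure",
    "class_uid": 3002,
    "type_uid": 300201,
    "activity_id": 1,
    "log_source_product": "linux",
    "log_source_service": "sshd",
}

sigma_view = lens(ssh_event, "sigma")
ocsf_view = lens(ssh_event, "ocsf")
```

Each view is a projection of the same stored values, so no field is rewritten or mapped on the way to a consumer.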
Configuration
The SeerflowEvent itself has no runtime configuration — its structure is fixed. To update an event immutably (e.g., adding ATT&CK tactics after detection), use msgspec.structs.replace():

    import msgspec.structs

    # `event` is an existing SeerflowEvent produced by an earlier stage
    enriched = msgspec.structs.replace(
        event,
        mitre_tactics=("TA0006",),
        mitre_techniques=("T1110",),
        risk_score=72.0,
    )
    # `event` is unchanged; `enriched` is a new instance

This functional update pattern ensures immutability throughout the pipeline. No stage mutates events from a previous stage.
Dual-Lens Example
SSH brute-force event (key fields shown):

    SeerflowEvent(
        event_id=UUID("550e8400-e29b-41d4-a716-446655440000"),
        timestamp_ns=1742007247000000000,
        observed_ns=1742007247001000000,
        severity_id=SeverityLevel.WARNING,    # Unified
        otel_severity=13,                     # OTel WARN
        event_kind="alert",                   # ECS
        event_category="authentication",      # ECS
        event_outcome="failure",              # ECS
        category_uid=3,                       # OCSF: Identity & Access
        class_uid=3002,                       # OCSF: Authentication
        type_uid=300201,                      # OCSF: Logon Failed
        activity_id=1,                        # OCSF: Logon
        message="Failed password for root from 198.51.100.23 port 44123",
        source_type="syslog",
        log_source_category="",               # Sigma
        log_source_product="linux",           # Sigma
        log_source_service="sshd",            # Sigma
        template_id=42,
        template_str="Failed password for <*> from <*> port <*>",
        related_ips=("198.51.100.23",),
        related_users=("root",),
        mitre_tactics=("TA0006",),            # Credential Access
        mitre_techniques=("T1110",),          # Brute Force
        risk_score=72.0,
        anomaly_score=0.65,
    )

OOMKill event (key fields shown):

    SeerflowEvent(
        event_id=UUID("661f9511-f3ac-52e5-b827-557766551111"),
        timestamp_ns=1742007334000000000,
        observed_ns=1742007334002000000,
        severity_id=SeverityLevel.ERROR,      # Unified
        otel_severity=17,                     # OTel ERROR
        event_kind="event",                   # ECS
        event_category="process",             # ECS
        event_outcome="failure",              # ECS
        category_uid=1,                       # OCSF: System Activity
        class_uid=1001,                       # OCSF: Process Activity
        type_uid=100103,                      # OCSF: Terminate
        activity_id=3,                        # OCSF: Terminate
        message="Container nginx-canary-7f8b9 exceeded memory limit 512Mi, OOMKilled",
        source_type="webhook",
        source_id="k8s-events",
        template_id=87,
        template_str="Container <*> exceeded memory limit <*>, OOMKilled",
        related_processes=("nginx-canary-7f8b9",),
        mitre_tactics=(),                     # No ATT&CK mapping for ops events
        mitre_techniques=(),
        risk_score=0.0,                       # No security risk
        anomaly_score=0.85,                   # High ops anomaly
    )

How Seerflow Implements This
- Event struct: SeerflowEvent in models/event.py (frozen msgspec.Struct with 30+ fields)
- Severity enum: SeverityLevel in models/event.py (unified 0-6 scale)
- Immutable updates: msgspec.structs.replace() for functional event enrichment
Next: Entity Graph → How entities extracted from events form a connected graph for correlation.