Testing & Validation
Seerflow enforces a 95% coverage floor across the full source tree. Every change to src/seerflow/ must pass five sequential quality gates before it can be merged — linting, formatting, type checking, security scanning, and the test suite. This page explains each gate, how tests are organized, how to validate against real log data using the LANL dataset, and how to test custom Sigma and correlation rules.
Quick Start
Run all five quality gates in a single command:
uv run ruff check . && uv run ruff format --check . && uv run mypy src/ && uv run bandit -r src/ -c pyproject.toml && uv run pytest --cov=src/seerflow --cov-fail-under=95
If this passes, you are good to submit a PR. Each gate is described individually below, but this one-liner is the authoritative check: it chains the same gates that the pre-commit and pre-push hooks enforce between them and that the CI pipeline runs.
Quality Gates
| Tool | Command | What it checks | Pass criteria |
|---|---|---|---|
| Ruff lint | uv run ruff check . | Style violations, unused imports, undefined names, common bugs | Zero errors |
| Ruff format | uv run ruff format --check . | Code formatting consistency (line length, whitespace, string quotes) | Zero diffs |
| mypy | uv run mypy src/ | Static type correctness against all annotated functions and variables | Zero type errors |
| Bandit | uv run bandit -r src/ -c pyproject.toml | Security anti-patterns (hardcoded secrets, unsafe deserialization, shell injection) | Zero HIGH or CRITICAL issues |
| pytest | uv run pytest --cov=src/seerflow --cov-fail-under=95 | Unit and integration test suite with branch coverage measurement | All tests pass; coverage ≥ 95% |
You can run any gate individually while iterating on code. Run the full chain before committing.
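If you prefer a script to the shell one-liner, the chain can be sketched in Python. This is a minimal sketch, not part of Seerflow: the `GATES` list and `run_gates` helper are hypothetical names, and only the commands themselves come from the table above. Like `&&`, it stops at the first gate that exits non-zero.

```python
import subprocess

# The five gates, in the order listed in the table above.
GATES = [
    ("Ruff lint", ["uv", "run", "ruff", "check", "."]),
    ("Ruff format", ["uv", "run", "ruff", "format", "--check", "."]),
    ("mypy", ["uv", "run", "mypy", "src/"]),
    ("Bandit", ["uv", "run", "bandit", "-r", "src/", "-c", "pyproject.toml"]),
    ("pytest", ["uv", "run", "pytest", "--cov=src/seerflow", "--cov-fail-under=95"]),
]


def run_gates(gates=GATES, runner=subprocess.run):
    """Run each gate in order; return the first failing gate's name, or None."""
    for name, command in gates:
        print(f"==> {name}: {' '.join(command)}")
        result = runner(command)
        if result.returncode != 0:
            return name  # stop here, mirroring the behaviour of `&&`
    return None
```

Calling `run_gates()` with no arguments invokes the real tools from the repository root. The `runner` parameter exists only so the loop can be exercised without running them.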
Pre-commit hooks
Hooks enforce gates automatically so quality problems are caught before they reach CI.
Install hooks once after cloning:
uv run pre-commit install
uv run pre-commit install --hook-type pre-push
Pre-commit (runs on every git commit): Ruff lint, Ruff format, mypy. Fast checks that complete in under 5 seconds on a warm cache.
Pre-push (runs on every git push): Bandit security scan plus the full pytest suite with the 95% coverage gate. Slower — budget 30–60 seconds depending on hardware.
To run all hooks manually without committing:
uv run pre-commit run --all-files
uv run pre-commit run --all-files --hook-stage pre-push
Test Structure
Tests are split into two directories by scope. Unit tests verify individual functions and classes in isolation. Integration tests verify pipeline stages with real storage (SQLite in WAL mode), real Drain3 parsing, and real detector instances.
tests/
├── conftest.py # Shared fixtures (sample_event, sample_alert, sqlite_backend)
├── fixtures/ # Static test data (sample log files, pre-built rule YAML)
├── unit/ # Isolated function and class tests
│ ├── test_drain3_parser.py
│ ├── test_hst.py
│ ├── test_holt_winters.py
│ ├── test_cusum.py
│ ├── test_markov.py
│ ├── test_dspot.py
│ ├── test_sigma_engine.py
│ ├── test_entity_graph.py
│ ├── test_correlation_engine.py
│ └── test_scoring.py
└── integration/ # End-to-end pipeline tests with real storage
    ├── test_pipeline.py
    ├── test_storage_sqlite.py
    ├── test_correlation_engine.py
    └── test_alerting.py
Shared Fixtures
conftest.py provides three fixtures available in all test files:
sample_event() — returns a minimal but valid SeerflowEvent with all required fields populated. Use this as the base for event-specific tests; override fields with msgspec.structs.replace() to create variants.
sample_alert() — returns a minimal Alert linked to the sample_event. Used in alerting, dedup, and sink tests.
sqlite_backend(tmp_path) — creates a temporary SQLite database in tmp_path, initialises the schema, and returns a StorageBackend instance. The database is automatically deleted when the test finishes. Use tmp_path (pytest's built-in fixture) whenever a test needs a real file — never create databases in fixed paths.
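To make the temporary-database pattern concrete, here is a rough stdlib-only sketch of what a fixture like sqlite_backend might do internally. The `open_temp_backend` helper and the `events` table are assumptions made for illustration, not Seerflow's actual schema or StorageBackend API. `tempfile.TemporaryDirectory` stands in for pytest's tmp_path.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_temp_backend(db_dir: Path) -> sqlite3.Connection:
    """Create a fresh SQLite database under db_dir with a hypothetical schema."""
    conn = sqlite3.connect(db_dir / "seerflow-test.db")
    conn.execute("PRAGMA journal_mode=WAL")  # integration tests use WAL mode
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY,"
        " timestamp_ns INTEGER NOT NULL,"
        " message TEXT NOT NULL)"
    )
    conn.commit()
    return conn


# Stand-in for pytest's tmp_path: a directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_temp_backend(Path(tmp))
    conn.execute(
        "INSERT INTO events (timestamp_ns, message) VALUES (?, ?)",
        (1_712_664_000_000_000_000, "hello"),
    )
    conn.commit()
    stored = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    conn.close()  # close before the directory is deleted
```

Because the database lives in a per-test directory, no state leaks between tests and nothing is left behind on disk.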
Webhook and sink tests use aiohttp.test_utils.TestServer to stand up a real HTTP server within the test process. This avoids mocking HTTP at the transport level and catches serialization bugs that mock-based approaches miss.
Coverage Requirements
| Scope | Required coverage |
|---|---|
| Overall (src/seerflow/) | ≥ 95% |
| Critical paths: parsers, entity resolution, correlation, ML detectors | ≥ 95% |
Coverage is measured with branch coverage enabled (--cov-branch is set in pyproject.toml). A line that is executed but whose branches are not all tested counts as partially covered. To see which branches are missing, run:
uv run pytest --cov=src/seerflow --cov-report=html
open htmlcov/index.html
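A small example shows why branch coverage is stricter than line coverage. The `clamp_score` function below is a hypothetical helper, not taken from the Seerflow source.

```python
def clamp_score(score: float, ceiling: float = 1.0) -> float:
    """Cap an anomaly score at ceiling (hypothetical helper, for illustration)."""
    if score > ceiling:
        score = ceiling
    return score


# This call alone executes every line, so line coverage is 100%,
# but the `if` condition is only ever true: the false branch is never taken.
assert clamp_score(2.0) == 1.0

# Adding a case where the condition is false covers the second branch.
assert clamp_score(0.5) == 0.5
```

With only the first assertion, a branch-coverage report would mark the `if` line as partially covered. The second assertion closes the gap.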
LANL Dataset
The Los Alamos National Laboratory (LANL) Unified Host and Network Dataset is a 90-day capture of anonymized host event, authentication, process, network flow, and DNS records from a real enterprise environment. It is the standard public benchmark for host-based anomaly detection research and contains labelled red-team activity — making it suitable for end-to-end validation of Seerflow's detection pipeline.
How to Obtain
The dataset is available free of charge from the LANL Cyber Security Research data portal. Download the unified CSV files directly from csr.lanl.gov. No registration is required. Files are gzip-compressed CSV; the full dataset is approximately 12 GB compressed.
Converting to SeerflowEvent
Seerflow ships a converter script that maps LANL CSV columns to SeerflowEvent fields:
python scripts/lanl_converter.py \
--input /data/lanl/auth.csv.gz \
--output /data/lanl/converted/auth.jsonl
The converter handles:
- Timestamp normalisation: LANL uses integer seconds since epoch; the converter produces nanosecond timestamps in the timestamp_ns field.
- Entity extraction: source/destination user and host columns map to related_users and related_hosts.
- Severity inference: LANL auth events use a LogonType field; the converter maps each type to an OpenTelemetry severity level (e.g. interactive logon → INFO, service account → NOTICE, failed logon → WARNING).
- Template assignment: because LANL records are already structured (no free-form message), the converter synthesises a Drain3 template from the event type and sets template_id accordingly.
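The four steps above can be sketched for a single row as follows. This is an illustration under stated assumptions, not the code in scripts/lanl_converter.py: the column names (`time`, `src_user`, `dst_user`, `src_host`, `dst_host`, `event_type`), the logon type labels, and the `lanl:` template id format are all hypothetical. Only the target field names and the example severity mappings come from the text.

```python
# Hypothetical mapping from a logon type label to a severity name,
# following the examples given in the text.
SEVERITY_BY_LOGON_TYPE = {
    "interactive": "INFO",
    "service": "NOTICE",
    "failed": "WARNING",
}


def convert_row(row: dict) -> dict:
    """Map one LANL-style auth row (assumed column names) to event fields."""
    return {
        # Integer epoch seconds -> integer nanoseconds.
        "timestamp_ns": int(row["time"]) * 1_000_000_000,
        # Source/destination users and hosts become related entities.
        "related_users": [row["src_user"], row["dst_user"]],
        "related_hosts": [row["src_host"], row["dst_host"]],
        # Unknown logon types fall back to INFO in this sketch.
        "severity": SEVERITY_BY_LOGON_TYPE.get(row["LogonType"], "INFO"),
        # No free-form message exists, so derive a template id from the event type.
        "template_id": f"lanl:{row['event_type']}",
    }


row = {
    "time": "150885",
    "src_user": "U101",
    "dst_user": "U101",
    "src_host": "C1862",
    "dst_host": "C1862",
    "LogonType": "failed",
    "event_type": "LogOn",
}
event = convert_row(row)
```

Multiplying integers, rather than converting through floats, keeps the nanosecond timestamp exact.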
To convert all files in a directory:
for f in /data/lanl/*.csv.gz; do
python scripts/lanl_converter.py --input "$f" --output "/data/lanl/converted/$(basename "$f" .csv.gz).jsonl"
done
Importing and Running a Health Check
Once converted, import the JSONL files and verify the pipeline is processing events correctly:
seerflow import /data/lanl/converted/*.jsonl
seerflow query health
seerflow query health reports event count, detection rate, alert count, and storage size. On the LANL dataset, expect the HST detector to fire on the known red-team activity windows documented in the dataset's companion paper. Correlation alerts should appear for lateral movement sequences during days 2–4 of the capture.
Start small
The full 90-day dataset takes several hours to import. Start with auth.csv.gz (day 1) to verify the pipeline end-to-end before importing the full corpus.
Testing Custom Rules
Sigma Rules
1. Write the rule YAML and save it to rules/custom/:
title: My Custom Detection
status: experimental
level: medium
description: Detects a specific pattern in application logs
logsource:
  category: application
  product: my-service
detection:
  selection:
    message|contains: "suspicious pattern"
  condition: selection
tags:
  - attack.discovery
2. Create a test event that should trigger the rule. The cleanest approach is to add a test that derives the event from the sample_event fixture:
import msgspec.structs

# sigma_engine is assumed to be a fixture that loads the rules in rules/custom/.
def test_custom_sigma_rule(sample_event, sigma_engine):
    # Construct an event that matches the logsource and detection condition
    event = msgspec.structs.replace(
        sample_event,
        message="application encountered suspicious pattern in request",
        source_type="my-service",
    )
    alerts = sigma_engine.evaluate(event)
    assert any(a.rule_title == "My Custom Detection" for a in alerts)
3. Run only the custom rule tests to iterate quickly:
uv run pytest tests/unit/test_sigma_engine.py -k "test_custom" -v
The -k "test_custom" selector matches any test function containing test_custom in its name, avoiding a full suite run during rule development. Add -s to see stdout output, which includes the compiled pySigma condition for debugging.
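To build intuition for why the test event matches the rule, here is a toy evaluator for a selection that uses only the `contains` modifier. It is a simplified sketch, not how pySigma or Seerflow's engine compiles rules, and `matches_contains` is a hypothetical name. It does reflect one real Sigma convention: string matching is case-insensitive by default.

```python
def matches_contains(event: dict, selection: dict) -> bool:
    """Evaluate a selection whose keys all use the `|contains` modifier."""
    for key, needle in selection.items():
        field, _, modifier = key.partition("|")
        if modifier != "contains":
            raise ValueError(f"unsupported modifier in this sketch: {key!r}")
        value = str(event.get(field, ""))
        # Sigma string matching is case-insensitive by default.
        if needle.lower() not in value.lower():
            return False
    return True  # every field in the selection matched (implicit AND)


selection = {"message|contains": "suspicious pattern"}
hit = {"message": "application encountered suspicious pattern in request"}
miss = {"message": "request completed normally"}
```

Here `matches_contains(hit, selection)` is true and `matches_contains(miss, selection)` is false. If a test event unexpectedly fails to match, compare its field values against the selection in the same way.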
Validation errors at startup
If your rule YAML is malformed, Seerflow logs a warning and skips the rule — the rest of the engine continues. Check the Seerflow logs at startup (log_level: DEBUG) to surface validation errors before relying on the rule in production.
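Since a malformed rule is skipped rather than rejected loudly, a quick structural pre-check in your own test suite can catch the most common mistakes earlier. The sketch below is an assumption-laden illustration, not Seerflow's validator: `find_rule_problems` is a hypothetical helper that takes an already-parsed rule (a dict) and checks only for the top-level keys Sigma requires and for a condition inside the detection block.

```python
REQUIRED_KEYS = ("title", "logsource", "detection")


def find_rule_problems(rule: dict) -> list[str]:
    """Return human-readable problems found in a parsed Sigma rule dict."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if key not in rule
    ]
    detection = rule.get("detection")
    if isinstance(detection, dict) and "condition" not in detection:
        problems.append("detection block has no condition")
    return problems


good_rule = {
    "title": "My Custom Detection",
    "logsource": {"category": "application", "product": "my-service"},
    "detection": {
        "selection": {"message|contains": "suspicious pattern"},
        "condition": "selection",
    },
}
bad_rule = {"title": "Broken", "detection": {"selection": {}}}
```

An empty list means the rule passed these basic checks. It does not guarantee that the engine will accept the rule.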
Correlation Rules
Correlation rules are YAML files in rules/correlation/ that define multi-event patterns — typically a sequence of event templates or Sigma matches within a time window that together signal a higher-confidence incident.
1. Write the correlation rule in rules/correlation/:
id: custom-lateral-movement-ssh
name: Custom Lateral Movement via SSH
description: Detects a failed login followed by a successful login from the same source
window_seconds: 300
min_events: 2
sequence:
- sigma_rule: ssh-brute-force-attempt
- sigma_rule: successful-ssh-login
entity_join: related_ips
severity: high
2. Simulate the event sequence in your integration test. Inject events that match each step of the sequence in the correct order and within the time window:
import msgspec.structs

def test_custom_correlation_rule(sample_event, sqlite_backend, correlation_engine):
    base_ns = 1_712_664_000_000_000_000  # 2024-04-09 12:00:00 UTC
    # Each event must also match its Sigma rule and carry the same related_ips
    # entity so that entity_join can link the two steps.
    failed_login = msgspec.structs.replace(
        sample_event,
        timestamp_ns=base_ns,
        message="Failed password for root from 10.0.0.99",
    )
    success_login = msgspec.structs.replace(
        sample_event,
        timestamp_ns=base_ns + 120_000_000_000,  # +120 seconds
        message="Accepted publickey for deploy from 10.0.0.99",
    )
    correlation_engine.process(failed_login)
    alerts = correlation_engine.process(success_login)
    assert any(a.rule_id == "custom-lateral-movement-ssh" for a in alerts)
3. Run the custom correlation tests:
uv run pytest tests/integration/test_correlation_engine.py -k "test_custom" -v
Integration tests use the sqlite_backend fixture (real SQLite, not a mock) because the correlation engine persists entity state and partial sequence matches between events. Using an in-memory store would hide bugs in the persistence layer.
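To reason about which event sequences a rule like the one above should and should not match, it can help to model the semantics in a few lines. The following is a toy in-memory matcher written for illustration only. `ToySequenceMatcher` is a hypothetical name, and it assumes the window is measured from the first matched step. The real engine persists its state through the storage backend and may differ in both respects.

```python
from dataclasses import dataclass, field


@dataclass
class ToySequenceMatcher:
    """In-memory matcher for an ordered sequence of Sigma rule ids per entity."""

    sequence: list
    window_ns: int
    # entity -> (index of next expected step, timestamp of first matched step)
    _progress: dict = field(default_factory=dict)

    def process(self, rule_id: str, entity: str, timestamp_ns: int) -> bool:
        """Return True when this event completes the sequence for entity."""
        step, started_ns = self._progress.get(entity, (0, timestamp_ns))
        # Discard partial matches that have fallen outside the time window.
        if step > 0 and timestamp_ns - started_ns > self.window_ns:
            step, started_ns = 0, timestamp_ns
        if rule_id != self.sequence[step]:
            return False
        if step + 1 == len(self.sequence):
            self._progress.pop(entity, None)  # sequence complete; reset
            return True
        self._progress[entity] = (step + 1, started_ns)
        return False


matcher = ToySequenceMatcher(
    sequence=["ssh-brute-force-attempt", "successful-ssh-login"],
    window_ns=300 * 1_000_000_000,  # window_seconds: 300
)
base_ns = 1_712_664_000_000_000_000
first = matcher.process("ssh-brute-force-attempt", "10.0.0.99", base_ns)
second = matcher.process("successful-ssh-login", "10.0.0.99", base_ns + 120 * 10**9)
```

In this run `first` is false (the sequence is only half complete) and `second` is true. Moving the second event past the 300-second window, or attributing it to a different IP, would leave the sequence unmatched. Both are useful negative cases to add alongside the positive test.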
See Also
- Configuration Reference: sigma, correlation, and storage settings
- Sigma Rules: full Sigma engine documentation
- Correlation Engine: sequence matching and entity joins
- Tuning: adjusting detection sensitivity and threshold parameters
Next: Tuning (adjusting detector thresholds, score weights, and alert suppression).