# Operational Intelligence Primer
No prior DevOps knowledge required. This chapter introduces the core concepts you need to understand how Seerflow detects operational failures. Each section builds on the last — read them in order.
## Why This Chapter?
Most log intelligence documentation focuses on security — detecting attackers, matching threat signatures, mapping kill chains. That matters, but it's not the whole picture. For most engineering teams, operational failures cause more day-to-day pain than security incidents. A misconfigured deployment, a slow memory leak, a database connection pool running dry — these are the problems that wake people up at 3 AM.
Seerflow detects both. It treats infrastructure issues, application errors, and deployment regressions as first-class detection targets, not afterthoughts. This primer teaches the operational concepts you need to understand how that works: what failure patterns look like in logs, why deployments are high-risk windows, and how correlating signals across multiple sources turns scattered symptoms into actionable root-cause alerts.
**Seerflow's Dual-Mode Architecture**
Seerflow ships with two detection families that can run independently or together, controlled by a single config switch:

```yaml
detection:
  mode: operational   # or: security | both
```
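If you want to sanity-check this setting yourself (in a CI lint step, for example), the three valid values are exactly the ones shown above. Here is a minimal sketch using PyYAML; the `detection_mode` helper is hypothetical and not part of Seerflow's API.

```python
# Minimal sketch: reading the detection mode from a config snippet with PyYAML.
# `detection_mode` is an illustrative helper, not a Seerflow API.
import yaml

ALLOWED_MODES = {"operational", "security", "both"}

EXAMPLE_CONFIG = """
detection:
  mode: operational   # or: security | both
"""

def detection_mode(raw_yaml: str) -> str:
    """Return the configured mode, rejecting anything outside the three valid values."""
    cfg = yaml.safe_load(raw_yaml)
    mode = cfg.get("detection", {}).get("mode")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"detection.mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    return mode

print(detection_mode(EXAMPLE_CONFIG))  # -> operational
```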
Operational mode activates 4 detectors tuned for infrastructure and application health:
- HST content — spots unusual log message patterns (e.g., error messages that have never appeared before)
- Holt-Winters volume — detects abnormal log volume spikes or drops (e.g., a service suddenly going silent)
- Markov sequence — catches out-of-order event sequences (e.g., a startup routine that skips steps)
- CUSUM change-point — catches gradual drifts and sustained mean shifts (e.g., memory slowly climbing before an OOM kill); a toy sketch of the idea appears below
Security mode activates 5 threat-focused detectors, and `both` runs all 9 in parallel. Most production deployments run in `both` mode.
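To make the CUSUM bullet concrete, here is a toy change-point detector over a drifting memory series. The baseline, slack, and threshold values are made up for illustration; Seerflow's actual parameters and tuning are not covered in this primer.

```python
# Toy CUSUM change-point detector: accumulate small deviations above a
# baseline mean until the running sum crosses a threshold. No single
# reading has to look alarming for the drift to be flagged.

def cusum_alarm(samples, baseline_mean, slack=0.5, threshold=3.0):
    """Return the index where a sustained upward shift is flagged, or None."""
    s = 0.0
    for i, x in enumerate(samples):
        # Add how far this sample sits above (baseline + slack);
        # the sum resets to zero while values stay near the baseline.
        s = max(0.0, s + (x - baseline_mean - slack))
        if s > threshold:
            return i
    return None

# Memory usage (GB) slowly climbing before an OOM kill.
memory_gb = [2.1, 2.0, 2.2, 2.4, 2.6, 2.9, 3.1, 3.4, 3.8, 4.1]
print(cusum_alarm(memory_gb, baseline_mean=2.1))  # -> 9, flagged during the sustained climb
```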
## By the End of This Chapter
You'll understand:
- What common failure patterns look like in log data — and why some failures are obvious while others hide in plain sight
- Why deployments create risk windows where baselines shift and normal detection thresholds stop working
- How cross-source correlation connects individually ambiguous signals into a single root-cause alert
- How Seerflow's operational detectors work together to catch failures that no single detector would flag
## Reading Order
These sections build on each other. Start at the top and work down:
| # | Section | What You'll Learn |
|---|---|---|
| 1 | Failure Patterns | Common infrastructure and application failure signatures in logs |
| 2 | Deployment Risk | How deployments change baselines and what canary signals look like |
| 3 | Ops Correlation | Cross-source correlation that turns scattered symptoms into root-cause alerts |
**The Running Example**
A single operational scenario threads through every section: a team deploys v2.3.1 of their API service, and within 30 minutes, four log sources show escalating problems. Application error rates climb from 1% to 8%. The database reports connection pool exhaustion. The reverse proxy shows latency spiking from 200ms to 2 seconds. And finally, the OS kernel logs an OOM kill.
Each event is individually ambiguous — error rates fluctuate, connection pools hiccup, latency has bad days. But Seerflow correlates all four into a single "deployment degradation" alert, pinpointing v2.3.1 as the cause. By the end of this chapter, you'll understand exactly how.
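As a preview of the correlation idea, here is a deliberately simplified sketch: anomalies from different sources are grouped into one incident when they land inside a window after a deployment marker. The event shapes, the 30-minute window, the `min_sources` rule, and the alert wording are assumptions for illustration, not Seerflow's actual correlation logic.

```python
# Simplified cross-source correlation: four individually ambiguous anomalies
# become one "deployment degradation" alert because they all fall inside the
# window following the v2.3.1 deployment.
from datetime import datetime, timedelta

deployment = {"service": "api", "version": "v2.3.1", "at": datetime(2024, 5, 1, 14, 0)}

anomalies = [
    {"source": "app",    "at": datetime(2024, 5, 1, 14, 6),  "signal": "error rate 1% -> 8%"},
    {"source": "db",     "at": datetime(2024, 5, 1, 14, 12), "signal": "connection pool exhausted"},
    {"source": "proxy",  "at": datetime(2024, 5, 1, 14, 18), "signal": "latency 200ms -> 2s"},
    {"source": "kernel", "at": datetime(2024, 5, 1, 14, 27), "signal": "OOM kill"},
]

def correlate(deploy, events, window=timedelta(minutes=30), min_sources=3):
    """Group anomalies that land inside the post-deployment window into one alert."""
    in_window = [e for e in events if deploy["at"] <= e["at"] <= deploy["at"] + window]
    if len({e["source"] for e in in_window}) >= min_sources:
        return {"alert": "deployment degradation",
                "suspect": deploy["version"],
                "evidence": in_window}
    return None

alert = correlate(deployment, anomalies)
print(alert["alert"], "caused by", alert["suspect"])  # deployment degradation caused by v2.3.1
```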
Start reading: Failure Patterns in Logs →