Architecture
This page explains how Mermin works, its architecture, and the flow of data from network packets to Flow Traces in your observability backend.
What are Flow Traces?
Flow Traces are OpenTelemetry trace spans that represent network flows with NetFlow-like semantics. Unlike traditional NetFlow or IPFIX:
OpenTelemetry Native: Flow Traces are OTLP trace spans, not proprietary flow protocols
Bidirectional: A single span represents both directions of a flow
Rich Metadata: Includes Kubernetes context (pods, services, deployments, labels)
Standardized Format: Works with any OTLP-compatible observability platform
Mermin generates Flow Traces by capturing network packets, aggregating them into flows, decorating the flows with Kubernetes metadata, and exporting them as OpenTelemetry spans.
High-Level Architecture
Mermin is deployed as a DaemonSet in Kubernetes, with one agent instance running on each node in your cluster. Each agent independently captures and processes network traffic from its host node.
┌─────────────────────────────────────────────┐
│             Kubernetes Cluster              │
│                                             │
│  ┌──────────────┐         ┌──────────────┐  │
│  │    Node 1    │         │    Node 2    │  │
│  │              │         │              │  │
│  │  ┌────────┐  │         │  ┌────────┐  │  │
│  │  │ Mermin │  │         │  │ Mermin │  │  │
│  │  │ Agent  │  │         │  │ Agent  │  │  │
│  │  └───┬────┘  │         │  └───┬────┘  │  │
│  │      │ eBPF  │         │      │ eBPF  │  │
│  │      ↓       │         │      ↓       │  │
│  │  [Network]   │         │  [Network]   │  │
│  │  [Packets]   │         │  [Packets]   │  │
│  └──────────────┘         └──────────────┘  │
│         │                        │          │
│         └────────────┬───────────┘          │
│                      │ OTLP                 │
└──────────────────────┼──────────────────────┘
                       ↓
              ┌─────────────────┐
              │  OpenTelemetry  │
              │    Collector    │
              └────────┬────────┘
                       │
        ┌──────────────┼──────────────┐
        ↓              ↓              ↓
    ┌────────┐    ┌─────────┐    ┌────────┐
    │Elastic │    │ Grafana │    │ Jaeger │
    │ Stack  │    │  Tempo  │    │        │
    └────────┘    └─────────┘    └────────┘
Components
eBPF Programs
Mermin uses eBPF (extended Berkeley Packet Filter) programs loaded into the Linux kernel to capture network packets with minimal overhead. These programs:
Attach to network interfaces specified in your configuration
Capture packets at the TC (Traffic Control) layer
Perform initial packet parsing for protocol headers
Send packet data to userspace via eBPF ring buffers
eBPF provides several advantages:
High Performance: Executes directly in the kernel, avoiding context switches
Low Overhead: Processes only necessary packet headers, not full payloads
Safety: Verified by the kernel to ensure it cannot crash or hang the system
No Kernel Modules: No need to compile or load custom kernel modules
Flow Generation Engine
The userspace Mermin agent receives packets from eBPF and aggregates them into network flows:
Bidirectional Flows: Groups packets by 5-tuple (source IP/port, dest IP/port, protocol)
State Tracking: Maintains connection state for TCP (SYN, FIN, RST flags)
Timeout Management: Expires inactive flows based on configurable timeouts
Protocol Parsing: Deep packet inspection for tunneling protocols (VXLAN, Geneve, WireGuard)
Community ID: Generates standard Community ID hashes for flow correlation
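To make the bidirectional grouping and the Community ID hash concrete, here is a minimal Python sketch of Community ID v1 for TCP and UDP flows. It follows the public Community ID specification as best understood here (seed, ordered endpoints, protocol, one padding byte, ports, SHA-1, base64). It is not taken from Mermin's source: the function name is hypothetical, and ICMP type/code handling is left out.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id_v1(src_ip, src_port, dst_ip, dst_port, proto, seed=0):
    """Community ID v1 for a TCP/UDP 5-tuple (ICMP port mapping omitted)."""
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    # Order the endpoints so both directions of a flow hash identically:
    # the smaller (address, port) pair always goes first.
    if (src, src_port) > (dst, dst_port):
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port
    data = (
        struct.pack("!H", seed)  # 16-bit seed, network byte order
        + src
        + dst
        + struct.pack("!BBHH", proto, 0, src_port, dst_port)  # proto, pad, ports
    )
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


forward = community_id_v1("10.244.1.5", 43210, "10.96.0.1", 443, 6)
reverse = community_id_v1("10.96.0.1", 443, "10.244.1.5", 43210, 6)
print(forward == reverse)
```

Because the endpoints are ordered before hashing, a packet and its reply map to the same identifier. The same ordering trick can serve as the lookup key for a bidirectional flow table.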
A Flow Trace includes:
Source and destination IP addresses and ports
Network protocol (TCP, UDP, ICMP, etc.)
Packet and byte counters (bidirectional)
TCP flags and connection state
Flow start and end timestamps
Community ID hash
State Persistence
Mermin preserves flow state across pod restarts through eBPF map pinning, ensuring continuous visibility without data loss:
Map Pinning: FLOW_STATS_MAP and FLOW_EVENTS are pinned to /sys/fs/bpf/ when writable (requires /sys/fs/bpf hostPath mount)
Schema Versioning: Maps use versioned paths (e.g., mermin_flow_stats_map_v1) to prevent incompatible format reuse across upgrades
State Continuity: Flow statistics persist across Mermin restarts, eliminating visibility gaps during rolling updates
Format Validation: Pinned maps are reused only if schema version and format match current version
Graceful Degradation: If pinning fails, Mermin continues with unpinned maps (logged as warning)
Upgrade Safety: When struct layouts change, increment EBPF_MAP_SCHEMA_VERSION to create new versioned maps
This ensures:
No flow data loss during pod restarts or rolling updates
Existing flows continue to accumulate statistics across restarts
Safe upgrades without corrupt data reuse
Easy rollbacks (old map versions remain available)
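The versioned-path idea can be illustrated with a short Python sketch. It only models the decision logic. The path layout is inferred from the mermin_flow_stats_map_v1 example above, the helper names pin_path and choose_map are hypothetical, and the format validation Mermin performs beyond the version check is not shown.

```python
from pathlib import PurePosixPath

BPF_FS = PurePosixPath("/sys/fs/bpf")
EBPF_MAP_SCHEMA_VERSION = 1  # bump when a pinned struct layout changes


def pin_path(map_name, version=EBPF_MAP_SCHEMA_VERSION):
    """Build a versioned pin path (layout assumed from the example above)."""
    return BPF_FS / f"mermin_{map_name}_v{version}"


def choose_map(existing_pins, map_name, bpffs_writable=True):
    """Return (action, path): reuse a matching pin, create one, or fall back."""
    if not bpffs_writable:
        return ("unpinned", None)  # graceful degradation: run without pinning
    path = pin_path(map_name)
    if str(path) in existing_pins:
        return ("reuse", path)  # same schema version: keep existing state
    return ("create", path)  # no pin for this version: start a fresh map


print(choose_map({"/sys/fs/bpf/mermin_flow_stats_map_v1"}, "flow_stats_map"))
```

Because the version is part of the path, a new schema version never opens an old map by accident, and the old pin stays on disk for a rollback.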
Kubernetes Integration
Mermin deeply integrates with Kubernetes to decorate flows with contextual metadata:
Informers
Mermin uses Kubernetes informers (watch APIs) to maintain an in-memory cache of cluster resources:
Pods, Services, Deployments, ReplicaSets, StatefulSets, DaemonSets
Jobs, CronJobs, NetworkPolicies
Endpoints, EndpointSlices, Ingresses, Gateways
This cache is continuously updated as resources change, so flow metadata tracks the current cluster state without querying the API server for each flow.
Flow Attribution
For each network flow, Mermin:
Identifies Pods: Matches source/destination IPs to pod IPs
Extracts Metadata: Retrieves pod name, namespace, labels, annotations
Walks Owner References: Follows ownerReferences from Pod → ReplicaSet → Deployment
Selector Matching: Finds Services and NetworkPolicies that select the pod
Decorates Traces: Attaches all relevant metadata to the Flow Trace
This provides full context for each network flow, enabling powerful filtering and analysis.
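The attribution steps above can be sketched in a few lines of Python. This is an illustration over plain dictionaries, not Mermin's implementation: the attribute function and the shape of the pods, owners, and services tables are assumptions made for the example, and namespace scoping of selectors is omitted.

```python
def attribute(ip, pods, owners, services):
    """Resolve an IP to a pod, its owner chain, and the Services selecting it."""
    pod = pods.get(ip)
    if pod is None:
        return None  # not a pod IP (for example an external or Service address)
    chain = []
    ref = pod.get("owner")
    # Walk ownerReferences upward; the membership check guards against cycles.
    while ref is not None and ref not in chain:
        chain.append(ref)
        ref = owners.get(ref, {}).get("owner")
    # A Service selects the pod when every selector label matches.
    selected_by = [
        name
        for name, selector in services.items()
        if selector and all(pod["labels"].get(k) == v for k, v in selector.items())
    ]
    return {
        "pod": pod["name"],
        "namespace": pod["namespace"],
        "labels": pod["labels"],
        "owners": chain,
        "services": selected_by,
    }


pods = {
    "10.244.1.5": {
        "name": "nginx-abc123",
        "namespace": "default",
        "labels": {"app": "nginx"},
        "owner": ("ReplicaSet", "nginx-xyz"),
    }
}
owners = {
    ("ReplicaSet", "nginx-xyz"): {"owner": ("Deployment", "nginx")},
    ("Deployment", "nginx"): {"owner": None},
}
services = {"nginx": {"app": "nginx"}, "api": {"app": "api"}}

print(attribute("10.244.1.5", pods, owners, services))
```

In a real agent these tables would be backed by the informer cache, so lookups stay local to the node.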
OTLP Exporter
Mermin exports flows as Flow Traces using the OpenTelemetry Protocol (OTLP):
Flow Traces as Spans: Each network flow becomes an OpenTelemetry trace span
Standard Protocol: OTLP is an industry-standard telemetry protocol
Flexible Transport: Supports both gRPC and HTTP protocols
Batching: Aggregates multiple Flow Traces before sending to reduce network overhead
Backpressure Handling: Queues Flow Traces if the backend is unavailable
Authentication: Supports Basic Auth and TLS client certificates
Secure Transport: TLS encryption with custom CA certificate support
Flow Traces are exported as OTLP trace spans, allowing them to be processed by any OTLP-compatible backend without requiring NetFlow collectors.
Data Flow
Let's trace a network packet through Mermin's pipeline:
1. Packet Capture (eBPF)
Network Interface (eth0)
↓
TC Hook (eBPF)
↓
Parse Headers (IP, TCP/UDP, Tunnels)
↓
Ring Buffer
eBPF program attached to eth0 captures incoming and outgoing packets
Parses Ethernet, IP, TCP/UDP, and tunnel protocol headers
Extracts 5-tuple and other flow identifiers
Sends packet metadata to userspace via ring buffer (not full payload)
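As a rough illustration of the header parsing step, the following Python sketch extracts a 5-tuple from a raw Ethernet frame. The real parsing runs in the kernel as an eBPF program. This userspace version only handles untagged IPv4 carrying TCP or UDP, skips tunnels entirely, and uses a hypothetical function name.

```python
import socket
import struct

ETH_LEN = 14  # untagged Ethernet header length in bytes


def parse_five_tuple(frame):
    """Return (src_ip, src_port, dst_ip, dst_port, proto), or None if unsupported."""
    if len(frame) < ETH_LEN + 20:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != 0x0800:  # only IPv4 in this sketch
        return None
    ihl = (frame[ETH_LEN] & 0x0F) * 4  # IPv4 header length in bytes
    proto = frame[ETH_LEN + 9]
    if ihl < 20 or proto not in (6, 17) or len(frame) < ETH_LEN + ihl + 4:
        return None
    src_ip = socket.inet_ntoa(frame[ETH_LEN + 12:ETH_LEN + 16])
    dst_ip = socket.inet_ntoa(frame[ETH_LEN + 16:ETH_LEN + 20])
    # TCP and UDP both start with source port, then destination port.
    src_port, dst_port = struct.unpack_from("!HH", frame, ETH_LEN + ihl)
    return (src_ip, src_port, dst_ip, dst_port, proto)
```

Only the fixed header fields are read, which matches the point above that packet metadata, not the full payload, crosses into userspace.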
2. Flow Aggregation (Userspace)
Ring Buffer Reader
↓
Flow Table Lookup
↓
Update Flow State
↓
Check Timeout/Completion
Mermin reads packet metadata from ring buffer
Looks up existing flow in flow table by 5-tuple
Updates packet/byte counters, flags, timestamps
Checks if flow should be exported (timeout, connection close, max duration)
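A simplified flow table shows how these steps fit together. This Python sketch is an assumption-laden illustration: the class names and timeout values are placeholders rather than Mermin defaults, the key is assumed to be already direction-normalized, counters are kept as single totals instead of per direction, and a single FIN or RST is treated as connection close.

```python
from dataclasses import dataclass

FIN, RST = 0x01, 0x04  # TCP flag bits


@dataclass
class Flow:
    first_seen: float
    last_seen: float
    packets: int = 0
    byte_count: int = 0
    tcp_flags: int = 0


class FlowTable:
    def __init__(self, idle_timeout=30.0, max_duration=300.0):
        # Placeholder timeouts for the example, not Mermin's defaults.
        self.idle_timeout = idle_timeout
        self.max_duration = max_duration
        self.flows = {}

    def update(self, key, length, now, tcp_flags=0):
        """Look up (or create) the flow and update counters, flags, timestamps."""
        flow = self.flows.setdefault(key, Flow(first_seen=now, last_seen=now))
        flow.packets += 1
        flow.byte_count += length
        flow.tcp_flags |= tcp_flags
        flow.last_seen = now

    def export_ready(self, now):
        """Remove and return flows that are closed, idle, or too long-lived."""
        done = {
            key: flow
            for key, flow in self.flows.items()
            if flow.tcp_flags & (FIN | RST)
            or now - flow.last_seen >= self.idle_timeout
            or now - flow.first_seen >= self.max_duration
        }
        for key in done:
            del self.flows[key]
        return done
```

Shorter timeouts export flows sooner and keep the table small, while longer timeouts merge more packets into each exported record at the cost of memory.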
3. Kubernetes Decoration
Flow Ready for Export
↓
IP to Pod Lookup
↓
Extract Pod Metadata
↓
Walk Owner References
↓
Match Selectors
↓
Decorate Trace Record
Source IP: 10.244.1.5 → Pod: nginx-abc123 → ReplicaSet: nginx-xyz → Deployment: nginx
Destination IP: 10.96.0.1 → Service: kubernetes
Attaches labels, annotations, namespace, and other metadata
4. OTLP Export
Trace Flow
↓
Batch Accumulator
↓
OTLP Trace Span
↓
gRPC/HTTP Transport
↓
OpenTelemetry Collector
Flow is converted to an OTLP trace span (Flow Trace)
Batched with other Flow Traces to reduce network overhead
Sent to configured OTLP endpoint
Collector receives and processes the Flow Traces
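The batching step can be pictured with a small size-based accumulator. This Python sketch is illustrative only: BatchAccumulator and its send callback are hypothetical names, and a production exporter would normally also flush on a timer so that a partially filled batch is not held indefinitely.

```python
class BatchAccumulator:
    """Collect spans and hand them to `send` in groups of max_batch_size."""

    def __init__(self, send, max_batch_size=3):
        self.send = send
        self.max_batch_size = max_batch_size
        self.pending = []

    def add(self, span):
        self.pending.append(span)
        if len(self.pending) >= self.max_batch_size:
            self.flush()

    def flush(self):
        """Send whatever is pending as one batch (no-op when empty)."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.send(batch)


sent = []
acc = BatchAccumulator(sent.append, max_batch_size=3)
for span_id in range(7):
    acc.add({"span": span_id})
acc.flush()  # flush the final partial batch
print([len(batch) for batch in sent])
```

Each call to send corresponds to one export request, so a larger batch size means fewer requests for the same number of Flow Traces.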
Performance Characteristics
Resource Usage
Mermin is designed to be efficient in production environments:
CPU: Typically 0.1-0.5 cores per agent, varies with traffic volume
Memory: Base usage ~100-200 MB, grows with flow table size
Network: Outbound OTLP traffic depends on flow rate and batching settings
Kernel: eBPF programs have minimal impact (< 1% CPU overhead)
Scalability
Flow Rate: Can handle 10,000+ flows/second per agent on modern hardware
Packet Rate: Processes 100,000+ packets/second with minimal packet loss
Cluster Size: Scales linearly – each node runs its own independent agent
Flow Table Size: Configurable, defaults support ~100,000 concurrent flows
Tunability
Mermin exposes performance tuning options, mostly under the pipeline block, with related settings under span and export.otlp:
pipeline.ring_buffer_capacity: eBPF ring buffer size between kernel and userspace
pipeline.worker_count: Number of parallel flow worker threads
pipeline.k8s_decorator_threads: Dedicated threads for Kubernetes metadata decoration
span.*_timeout: Flow expiration times affect memory usage
export.otlp.max_batch_size: Larger batches reduce network overhead
export.otlp.max_queue_size: Backpressure buffer for slow backends
See Configuration Reference for details.
Failure Modes and Resilience
Agent Failure
If a Mermin agent crashes or is terminated:
Local Impact Only: Only flows from that node are affected
Kubernetes Restart: DaemonSet controller automatically restarts the pod
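State Recovery: With eBPF map pinning in place (see State Persistence), flow statistics carry over to the restarted pod; if pinning is unavailable, flow state is ephemeral and only new flows are captured after restart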
No Cluster Impact: Other nodes continue operating normally
Backend Unavailability
If the OTLP backend is unavailable:
Queuing: Flows are queued up to max_queue_size
Backpressure: If queue fills, oldest flows are dropped (not newest)
Automatic Retry: Mermin retries failed exports with exponential backoff
Graceful Degradation: Agent continues capturing flows
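The drop-oldest queue and the retry schedule described above can be sketched as follows. This is a Python illustration under stated assumptions: the ExportQueue class, the backoff_delays helper, and the base, factor, and cap values are all hypothetical and are not Mermin's actual retry parameters.

```python
from collections import deque


def backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=6):
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * factor ** n) for n in range(attempts)]


class ExportQueue:
    """Bounded queue that discards the oldest entry when full."""

    def __init__(self, max_queue_size):
        # A deque with maxlen evicts from the left (oldest) on append.
        self.items = deque(maxlen=max_queue_size)
        self.dropped = 0

    def push(self, item):
        if len(self.items) == self.items.maxlen:
            self.dropped += 1  # count the entry about to be evicted
        self.items.append(item)


queue = ExportQueue(max_queue_size=3)
for flow_id in range(5):
    queue.push(flow_id)
print(list(queue.items), queue.dropped)
print(backoff_delays())
```

Dropping the oldest entries favors recent traffic during an outage, and the dropped counter gives an operator a signal that the queue size or backend capacity needs attention.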
Network Issues
Interface Unavailable: Mermin logs a warning and continues monitoring other interfaces
eBPF Load Failure: Agent fails to start; check kernel version and eBPF support
High Packet Loss: Increase pipeline.ring_buffer_capacity or reduce monitored interfaces
Comparison with Alternatives
vs. eBPF Observability Tools (Cilium Hubble, Pixie)
Mermin provides:
Flow-level granularity: Every individual network flow exported as a Flow Trace with full metadata
CNI Agnostic: Not tied to a specific CNI implementation (works with Cilium, Calico, Flannel, etc.)
Pure OTLP export to any OpenTelemetry-compatible backend
Lightweight, focused solely on network flow observability
No vendor lock-in or platform dependencies
Flexible backend choice (Elastic, Grafana, Jaeger, cloud providers)
Historical flow analysis and long-term storage in your observability backend
Cilium Hubble provides:
Aggregated network metrics (connection rates, error rates, latencies)
Deep integration with Cilium CNI and network policies
Service map visualization with Hubble UI
Layer 7 protocol visibility (HTTP, gRPC, Kafka, DNS)
Requires Cilium as the CNI
Limited historical data retention (ephemeral, in-memory)
Pixie provides:
Aggregated network metrics with short-term retention
Full application observability (traces, logs, metrics, profiling)
Auto-instrumentation for multiple languages
In-cluster data processing and querying
Requires Pixie platform deployment
Limited long-term storage (auto-deletes data after hours/days)
Key Insight: Of the tools compared here, Mermin is the one built around flow-level granularity: each individual network flow becomes a Flow Trace with complete metadata (source/dest pods, services, deployments, labels, packet/byte counts, TCP flags, etc.). Hubble and Pixie provide aggregated network metrics (requests/sec, error rates), which are useful for dashboards but don't give you the raw flow data needed for deep investigation, compliance, or security forensics.
Trade-off: Hubble and Pixie offer broader observability features (L7 protocols, application tracing) but with platform coupling and metric aggregation. Mermin prioritizes CNI/backend flexibility and flow-level detail, enabling long-term storage and granular analysis of every network connection.
vs. NetFlow/IPFIX Exporters
Mermin Flow Traces provide:
OpenTelemetry-native format (OTLP trace spans)
Kubernetes metadata: pods, services, deployments, labels, owner references
Modern observability backend integration (Tempo, Jaeger, Elastic, OpenSearch)
No specialized NetFlow collectors required
CNI Agnostic: Captures flows regardless of CNI implementation
Cloud-native architecture (DaemonSet, Helm charts)
Traditional NetFlow/IPFIX provides:
Established protocol with decades of tooling
Hardware switch/router support
Legacy network monitoring platform compatibility
SNMP integration for traditional network management
Trade-off: NetFlow/IPFIX is ideal for traditional network infrastructure. Mermin is purpose-built for cloud-native Kubernetes environments with modern observability stacks.
vs. Packet Capture Tools (tcpdump, Wireshark)
Mermin provides:
Continuous, automated flow capture without manual intervention
Bidirectional flow aggregation with packet/byte counters
Kubernetes metadata enrichment (pods, services, deployments)
Efficient OTLP export to any observability backend
Production-ready with minimal performance overhead
tcpdump/Wireshark provide:
Full packet payload capture for deep inspection
Interactive analysis and filtering (Wireshark GUI)
Protocol dissection for debugging specific issues
Manual, on-demand troubleshooting
Trade-off: Use Mermin for continuous observability; use packet capture tools for deep troubleshooting of specific issues.
vs. Service Mesh (Istio, Linkerd)
Note: These are fundamentally different tools for different jobs. Service meshes are for traffic management and security. Mermin is for network observability. They are complementary, not alternatives.
Mermin provides (Observability):
Network flow visibility across your entire cluster
Zero application changes or sidecar injection required
Captures all traffic: pod-to-pod, pod-to-external, host network, non-mesh workloads
CNI Agnostic: Works with any CNI (Cilium, Calico, Flannel, cloud-native CNIs)
Lower resource overhead (no sidecar per pod)
Network-layer (L3/L4) flow telemetry
Service Mesh provides (Traffic Management & Security):
Layer 7 (HTTP, gRPC) traffic control and policy enforcement
Traffic management (retries, timeouts, circuit breaking, canary deployments)
Mutual TLS encryption between services
Service-to-service authorization and authentication
Request routing and load balancing strategies
(Also includes L7 observability metrics as a side benefit)
Key Insight: You can run Mermin alongside a service mesh. Mermin observes network flows (L3/L4) across all workloads, while the service mesh manages application traffic (L7) for enrolled services. Many organizations use both together.
Next Steps
Now that you understand how Mermin generates Flow Traces:
Deploy to Production: Choose your deployment model
Configure Mermin: Customize for your environment
Choose Your Backend: Send Flow Traces to your observability platform
Troubleshoot Issues: Diagnose and resolve problems