Global Options
Global options are top-level configuration settings that control Mermin's overall behavior. These are the only options that can be configured via CLI flags or environment variables in addition to the configuration file.
Configuration Methods
Configuration File (HCL)
# config.hcl
log_level = "info"
auto_reload = true
shutdown_timeout = "10s"Command-Line Flags
mermin \
--config=/etc/mermin/config.hcl \
--log-level=debug \
--auto-reload
Environment Variables
export MERMIN_CONFIG_PATH=/etc/mermin/config.hcl
export MERMIN_LOG_LEVEL=debug
export MERMIN_CONFIG_AUTO_RELOAD=true
mermin
Configuration Options
config / MERMIN_CONFIG_PATH
Type: String (file path) Default: None (required) CLI Flag: --config Environment: MERMIN_CONFIG_PATH
Path to the HCL or YAML configuration file.
Example:
mermin --config=/etc/mermin/config.hcl
# or
export MERMIN_CONFIG_PATH=/etc/mermin/config.hcl
auto_reload / MERMIN_CONFIG_AUTO_RELOAD
Type: Boolean Default: false CLI Flag: --auto-reload Environment: MERMIN_CONFIG_AUTO_RELOAD
Automatically reload configuration when the file changes. When enabled, Mermin watches the config file and reloads it without requiring a restart.
HCL:
auto_reload = true
CLI:
mermin --auto-reload --config=config.hcl
Environment:
export MERMIN_CONFIG_AUTO_RELOAD=true
Behavior:
File is monitored for changes using filesystem watches
Configuration is reloaded atomically
Brief pause in flow capture during reload (~100ms)
Invalid configuration prevents reload (old config remains active)
Logs indicate successful/failed reload attempts
Use Cases:
Development and testing: Iterate quickly without restarts
Production: Update configuration without downtime
Debugging: Temporarily change log levels or filters
Some configuration changes may require a full restart, such as changing monitored network interfaces or modifying RBAC permissions.
log_level / MERMIN_LOG_LEVEL
Type: String (enum) Default: info CLI Flag: --log-level Environment: MERMIN_LOG_LEVEL
Sets the logging verbosity level.
Valid Values:
trace: Most verbose, includes all debug information
debug: Detailed debugging information
info: General informational messages (default)
warn: Warning messages only
error: Error messages only
HCL:
log_level = "debug"CLI:
mermin --log-level=trace --config=config.hcl
Environment:
export MERMIN_LOG_LEVEL=warn
Recommendations:
Production: info or warn to reduce log volume
Debugging: debug for detailed troubleshooting
Development: trace for comprehensive visibility
shutdown_timeout
Type: Duration Default: 5s CLI Flag: Not available Environment: Not available
Maximum time to wait for graceful shutdown before forcing termination.
HCL:
shutdown_timeout = "10s"Behavior: During shutdown, Mermin:
Stops accepting new packets
Waits for in-flight flows to export (up to shutdown_timeout)
Closes OTLP connections gracefully
Forces shutdown if timeout is exceeded
Recommendations:
Production: 10s to ensure flows are exported
Development: 5s (default) is usually sufficient
High-throughput: Increase to 30s or more
Related Settings:
export.otlp.max_export_timeout: Should be less than shutdown_timeout
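A minimal sketch of keeping the two timeouts consistent. The nested export block layout below is inferred from the dotted option name and may not match the actual export configuration syntax; check the export documentation before using it.
# Illustrative only: the export block structure is assumed from the
# option path export.otlp.max_export_timeout
shutdown_timeout = "10s"
export {
  otlp {
    # Keep this below shutdown_timeout so the final flush can finish
    # before shutdown is forced
    max_export_timeout = "5s"
  }
}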
Monitoring Shutdown Behavior
Mermin provides metrics to monitor shutdown behavior:
Shutdown Metrics
shutdown_duration_seconds: Histogram of actual shutdown durations
shutdown_timeouts_total: Count of shutdowns that exceeded timeout
flows_preserved_shutdown_total: Flows successfully exported during shutdown
flows_lost_shutdown_total: Flows lost due to shutdown timeout
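For example, assuming these counters and the histogram are scraped by Prometheus (the _bucket suffix and any mermin_ prefix on the exported names are assumptions, not confirmed here), shutdown health can be checked with queries along these lines:
# Shutdowns that exceeded the timeout over the last day
increase(shutdown_timeouts_total[1d])
# p95 shutdown duration
histogram_quantile(0.95, rate(shutdown_duration_seconds_bucket[1d]))
# Flows lost vs. preserved during shutdowns
increase(flows_lost_shutdown_total[1d])
increase(flows_preserved_shutdown_total[1d])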
Pipeline Tuning Options
The pipeline block provides advanced tuning for the flow processing pipeline, including base channel sizing, worker threading, Kubernetes decoration, backpressure management, and buffer multipliers.
Overview
HCL:
pipeline {
  # Base ring buffer capacity (default: 8192, good for typical enterprise)
  ring_buffer_capacity = 8192

  # Number of parallel flow workers (default: 4, suitable for most deployments)
  worker_count = 4

  # Worker polling interval (default: 5s)
  worker_poll_interval = "5s"

  # Kubernetes decorator threading (default: 4 for typical enterprise)
  k8s_decorator_threads = 4

  # Channel multipliers
  flow_span_channel_multiplier = 2.0
  decorated_span_channel_multiplier = 4.0

  # Adaptive sampling under load (not yet implemented)
  sampling_enabled = true
  sampling_min_rate = 0.1
  backpressure_warning_threshold = 0.01
}
ring_buffer_capacity
Type: Integer Default: 8192
Base capacity for the eBPF ring buffer between kernel and userspace. This is the foundation for all pipeline channel sizes.
Behavior:
Acts as a buffer for packets captured by eBPF before userspace processing
Used directly for worker channels and as the base for multipliers (flow_span_channel_multiplier, decorated_span_channel_multiplier)
Higher values provide more buffering for burst traffic
Lower values reduce memory usage
If channel fills, packets are dropped (visible in metrics)
Tuning Guidelines:
Low (< 10K flows/s): 2048-4096
Medium (10K-50K flows/s): 4096-8192
High (50K-100K flows/s): 8192 (default)
Very High (> 100K flows/s): 16384+
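As a sketch, a node in the very high traffic bracket might raise the base capacity and let the downstream channels scale with it (values follow the guidelines above; memory usage grows with the buffer):
pipeline {
  # Larger base buffer for > 100K flows/s
  ring_buffer_capacity = 16384

  # Downstream channels scale from the base with the default multipliers:
  # 16384 × 2.0 = 32,768 flow span slots, 16384 × 4.0 = 65,536 export slots
  flow_span_channel_multiplier = 2.0
  decorated_span_channel_multiplier = 4.0
}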
Signs You Need to Increase:
Metrics show packet drops (mermin_packets_dropped_total)
Gaps in Flow Trace exports
Warning logs about channel capacity or backpressure
Signs You Can Decrease:
Low CPU usage
Minimal traffic volume
Memory constraints
worker_count
Type: Integer Default: 4
Number of parallel worker threads processing packets and generating flow spans. Each worker processes eBPF events independently from a dedicated worker queue.
Behavior:
Each worker processes packets independently
More workers = more parallelism = higher throughput
More workers = more CPU usage
Workers share the flow table (synchronized)
Worker queue capacity = ring_buffer_capacity / worker_count
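With the defaults this works out to:
worker queue capacity = ring_buffer_capacity / worker_count
                      = 8192 / 4
                      = 2048 events per worker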
Tuning Guidelines:
Low (< 10K flows/s): 1-2 workers, 0.5-1 cores
Medium (10K-50K flows/s): 2-4 workers, 1-2 cores
High (50K-100K flows/s): 4 workers (default), 2-4 cores
Very High (> 100K flows/s): 8-16 workers, 4-8 cores
Optimal Worker Count:
Start with CPU count / 2
Monitor CPU usage with metrics
Increase if CPU is underutilized and packet drops occur
Decrease if CPU is overutilized
Relationship with CPU Resources:
# Kubernetes resources should match worker count
resources:
  requests:
    cpu: 2  # For worker_count = 4
  limits:
    cpu: 4  # For worker_count = 4
worker_poll_interval
Type: Duration Default: 5s
Interval at which flow pollers check for flow records and timeouts. Pollers iterate through active flows to:
Generate periodic flow records (based on max_record_interval in the span config)
Detect and remove idle flows (based on protocol-specific timeouts in the span config)
Behavior:
Lower values = more responsive timeout detection and flow recording
Higher values = less CPU overhead
At typical enterprise scale (10K flows/sec with 100K active flows and 32 pollers): ~600 flow checks/sec per poller
Modern CPUs handle flow checking very efficiently (microseconds per check)
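The ~600 figure follows directly from those numbers:
flows per poller  = 100,000 active flows / 32 pollers ≈ 3,125
checks per second ≈ 3,125 flows / 5s poll interval ≈ 625 per poller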
Tuning Guidelines:
Short-lived flows (ICMP): 3-5s (fast timeout detection)
Mixed traffic: 5s (default) (balance responsiveness and overhead)
Long-lived flows (TCP): 10s (lower overhead, slower timeouts)
Memory constrained: 3-5s (more frequent cleanup)
Trade-offs:
3s interval: Most responsive, slightly higher CPU (~10K checks/sec per poller)
5s interval (default): Best balance for most workloads
10s interval: Lowest CPU, flows may linger longer before timeout
Signs You Should Decrease:
Flows lingering past their intended timeout
Memory usage growing steadily
Short-lived flow protocols (ICMP with 10s timeout)
Signs You Can Increase:
CPU constrained
Primarily long-lived TCP flows
Flow timeout accuracy not critical
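For example, a CPU-constrained node carrying mostly long-lived TCP flows might relax the polling interval (an illustrative setting, following the guidelines above):
pipeline {
  # Fewer poller wake-ups; idle flows take longer to time out
  worker_poll_interval = "10s"
}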
k8s_decorator_threads
Type: Integer Default: 4
Number of dedicated threads for Kubernetes metadata decoration. Running decoration on separate threads prevents K8s API lookups from blocking flow processing. Each thread handles ~8K flows/sec (~100-150μs per flow), so 4 threads provide 32K flows/sec capacity.
Recommendations based on typical FPS (flows per second):
General/Mixed: 50-200 FPS, 2-4 threads (default: 4)
Service Mesh: 100-300 FPS, 4 threads (default)
Public Ingress: 1K-5K FPS, 4-8 threads
High-Traffic Ingress: 5K-25K FPS, 8-12 threads
Extreme Scale (Edge/CDN): >25K FPS, 12-24 threads
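As a sizing sketch for a high-traffic ingress node expecting around 10K flows/sec (the thread count comes from the table above; the ~8K flows/sec per-thread estimate is from earlier in this section):
pipeline {
  # 8 threads × ~8K flows/sec ≈ 64K flows/sec of decoration capacity
  k8s_decorator_threads = 8
}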
sampling_enabled
Type: Boolean Default: true
Enable adaptive sampling when worker channels are full. When enabled, Mermin intelligently drops events to prevent complete pipeline stalls while preserving critical flow information (TCP FIN/RST, new flows).
Behavior:
Sampling activates only under backpressure
Preserves flow control packets (FIN, RST)
Maintains minimum sampling rate (see sampling_min_rate)
sampling_min_rate
Type: Float (0.0-1.0) Default: 0.1 (10%)
Minimum fraction of events to keep during maximum backpressure. A value of 0.1 ensures at least 10% of events are processed even under extreme load.
backpressure_warning_threshold
Type: Float (0.0-1.0) Default: 0.01 (1%)
Drop rate threshold for logging backpressure warnings. Warnings are logged when the fraction of dropped events exceeds this value.
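A minimal sketch showing the three backpressure settings together, using their default values:
pipeline {
  # Drop events only when worker channels are full
  sampling_enabled = true

  # Never process less than 10% of events, even under extreme load
  sampling_min_rate = 0.1

  # Log a warning once more than 1% of events are being dropped
  backpressure_warning_threshold = 0.01
}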
Channel Capacity Tuning
These options control the buffer sizes between pipeline stages to optimize for your workload.
flow_span_channel_multiplier
Type: Float Default: 2.0
Multiplier for flow span channel capacity. Provides buffering between workers and the K8s decorator. Channel size = ring_buffer_capacity * flow_span_channel_multiplier. With defaults, 8192 × 2.0 = 16,384 slots, roughly 160ms of buffering at 100K flows/s.
Recommendations:
Steady traffic: 2.0 (default)
Bursty traffic: 3.0-4.0
Low latency priority: 1.5
decorated_span_channel_multiplier
Type: Float Default: 4.0
Multiplier for decorated span (export) channel capacity. Provides buffering between the K8s decorator and the OTLP exporter. This should be the largest buffer since network export is the slowest stage. Channel size = ring_buffer_capacity * decorated_span_channel_multiplier. With defaults, 8192 × 4.0 = 32,768 slots, roughly 320ms of buffering at 100K flows/s.
Recommendations:
Reliable network: 4.0 (default)
Unreliable network: 6.0-8.0
Very high throughput: 8.0-12.0
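As a sketch of how the multiplier translates into buffering time, using the unreliable-network setting and the 100K flows/s reference rate from above:
channel size  = ring_buffer_capacity × decorated_span_channel_multiplier
              = 8192 × 8.0 = 65,536 slots
buffer window ≈ 65,536 slots / 100,000 flows/s ≈ 0.66s of export backlog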
Monitoring Performance Configuration
After tuning performance settings, monitor these metrics:
# Cache effectiveness
mermin_k8s_cache_hits_total / (mermin_k8s_cache_hits_total + mermin_k8s_cache_misses_total)
# Backpressure and sampling
rate(mermin_flow_events_dropped_backpressure_total[5m])
rate(mermin_flow_events_sampled_total[5m])
mermin_flow_events_sampling_rate
# Channel utilization
mermin_channel_capacity_used_ratio
# Pipeline latency
histogram_quantile(0.95, rate(mermin_processing_latency_seconds_bucket[5m]))
Healthy indicators:
Sampling rate = 0 (no backpressure)
Channel utilization < 80%
p95 processing latency < 10ms
IP index updates < 100ms
Pipeline Tuning Example
For a large cluster with high throughput:
pipeline {
  # High-throughput base capacity
  ring_buffer_capacity = 16384
  worker_count = 8

  # Increase decorator parallelism
  k8s_decorator_threads = 16

  # Channel multipliers
  flow_span_channel_multiplier = 3.0
  decorated_span_channel_multiplier = 8.0

  # Adaptive sampling enabled
  sampling_enabled = true
  sampling_min_rate = 0.15
}
Complete Example
# Global configuration options
# Required: Path to this config file (set via CLI or ENV)
# mermin --config=/etc/mermin/config.hcl
# Logging verbosity
log_level = "info"
# Auto-reload config on file changes
auto_reload = true
# Graceful shutdown timeout
shutdown_timeout = "10s"
# Pipeline tuning for flow processing
pipeline {
  ring_buffer_capacity = 8192
  worker_count = 4
  k8s_decorator_threads = 4
}
Precedence Example
When the same option is set in multiple places:
# config.hcl
log_level = "info"# Environment variable
export MERMIN_LOG_LEVEL=warn
# CLI flag (highest precedence)
mermin --log-level=debug --config=config.hcl
Result: log_level will be debug (CLI flag wins).
Precedence Order (highest to lowest):
Command-line flags
Environment variables
Configuration file
Built-in defaults
Validation
Invalid values are rejected on startup:
log_level = "invalid"Error: invalid log level "invalid", must be one of: trace, debug, info, warn, errorpipeline {
ring_buffer_capacity = -1
}
Error: pipeline.ring_buffer_capacity must be a positive integer
Monitoring Configuration Effectiveness
After changing global options, monitor these metrics:
# Packet processing
rate(mermin_packets_total[5m])
rate(mermin_packets_dropped_total[5m])
# CPU usage
rate(container_cpu_usage_seconds_total[5m])
# Memory usage
container_memory_working_set_bytes
Healthy indicators:
Zero or minimal packet drops
CPU usage 50-80% of allocated
Stable memory usage
Best Practices
Start conservative: Use default values initially
Monitor before tuning: Collect metrics for at least 24 hours
Change one at a time: Isolate the impact of each change
Document changes: Note why specific values were chosen
Test in non-production: Validate tuning before production rollout
Troubleshooting
High Packet Drop Rate
Symptoms: mermin_packets_dropped_total increasing
Solutions:
Increase pipeline.ring_buffer_capacity
Increase pipeline.worker_count
Allocate more CPU resources
Reduce monitored interfaces
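For example, a first tuning pass might double both the base buffer and the worker count (illustrative starting points, not prescriptions; match CPU requests and limits to the new worker count):
pipeline {
  ring_buffer_capacity = 16384  # doubled from the 8192 default
  worker_count = 8              # doubled from the 4 default
}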
High CPU Usage
Symptoms: CPU utilization near limits
Solutions:
Decrease pipeline.worker_count
Increase flow timeouts (see Span Options)
Add flow filters (see Filtering)
Allocate more CPU resources
High Memory Usage
Symptoms: Memory usage growing unbounded
Solutions:
Decrease pipeline.ring_buffer_capacity
Decrease flow timeouts (see Span Options)
Add flow filters to reduce active flows
Allocate more memory resources
Configuration Not Reloading
Symptoms: Changes to config file not applied
Solutions:
Verify auto_reload = true is set
Check logs for reload errors
Validate configuration syntax
Ensure file permissions allow reading
Some changes require full restart
Next Steps
API and Metrics: Configure health checks and monitoring
Network Interface Discovery: Select which interfaces to monitor
Flow Span Options: Configure flow generation and timeouts
Configuration Examples: See complete configurations