Deployment Issues
This guide will help you diagnose and resolve pod startup failures, eBPF loading errors, permission issues, and network interface configuration problems.
Pod Not Starting
Mermin pods that fail to start typically show one of these states: Pending, CrashLoopBackOff, or Error.
Check Pod Status
Gather information about the pod:
kubectl get pods -l app.kubernetes.io/name=mermin -n ${MERMIN_NAMESPACE}
kubectl describe pod mermin-xxxxx -n ${MERMIN_NAMESPACE}
kubectl get events -n ${MERMIN_NAMESPACE} --field-selector involvedObject.name=mermin-xxxxx
Common Causes and Solutions
1. Insufficient Node Resources
Insufficient cpu or Insufficient memory in the events indicates nodes lack available resources.
Fix it by adjusting resource requests in your Helm values:
# In values.yaml
resources:
  requests:
    cpu: 200m
    memory: 220Mi
  limits:
    cpu: 1
    memory: 512Mi
Note: The Helm chart sets default limits to prevent Mermin pods from disrupting existing workloads. See the chart's default values for details.
2. Pod Security Policy Restrictions
Error: container has runAsNonRoot and image will run as root indicates cluster security policies block the privileged access Mermin needs for eBPF programs.
Solution: Configure your Pod Security Policy (PSP) or Pod Security Standards (PSS) to allow privileged containers in the Mermin namespace. Mermin uses these privileges exclusively for eBPF operations and network monitoring.
The default Helm chart includes the necessary security context settings:
If your cluster uses Pod Security Standards (PSS), you may need to label the namespace appropriately:
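As a minimal sketch of such a label, assuming the privileged Pod Security Standards level is acceptable for the Mermin namespace: the label key and value below are the standard Kubernetes Pod Security Admission ones, while the helper name allow_privileged_namespace is hypothetical and not part of Mermin or its Helm chart.

```shell
# Hypothetical helper: label a namespace so Pod Security Admission
# permits privileged pods in it. The label key and the "privileged"
# level are standard Kubernetes Pod Security Standards values.
allow_privileged_namespace() (
  ns="${1:?usage: allow_privileged_namespace <namespace>}"
  kubectl label --overwrite namespace "$ns" \
    pod-security.kubernetes.io/enforce=privileged
)

# Usage (requires permission to label namespaces):
#   allow_privileged_namespace "${MERMIN_NAMESPACE}"
```

Relaxing the enforcement level applies to every pod in the namespace, so a dedicated namespace for Mermin keeps the exception narrow.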
3. Image Pull Failures
ImagePullBackOff or ErrImagePull in the pod status indicates image pull failures.
Troubleshoot with these commands:
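One possible starting point, sketched under the assumption that the failure comes from a wrong image reference or a missing pull secret, is to print what the pod actually references. The helper name show_pod_images is hypothetical; the kubectl jsonpath fields are standard pod spec fields.

```shell
# Hypothetical helper: print the images a pod references and any
# imagePullSecrets it declares, so both can be checked against the registry.
show_pod_images() (
  pod="${1:?usage: show_pod_images <pod> <namespace>}"
  ns="${2:?usage: show_pod_images <pod> <namespace>}"
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.containers[*].image}{"\n"}{.spec.imagePullSecrets[*].name}{"\n"}'
)

# Usage:
#   show_pod_images mermin-xxxxx "${MERMIN_NAMESPACE}"
```

The first output line lists container images and the second lists pull secret names; an empty second line means the pod declares no pull secrets.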
eBPF Program Loading Failures
eBPF requires specific kernel features and permissions. If Mermin can't load its eBPF programs, you'll see errors like:
Check the Logs
Search the logs for eBPF-related errors:
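A small filter can narrow the log output to likely eBPF problems. This is a sketch: the function name ebpf_errors and the pattern list are assumptions based on the error strings mentioned in this guide, not an exhaustive list of what Mermin logs.

```shell
# Hypothetical filter: keep log lines that mention BPF, BTF, the verifier,
# or the error strings this guide associates with eBPF loading failures.
# Matching is case-insensitive, so "eBPF" and "bpf" both match.
ebpf_errors() {
  grep -iE 'bpf|btf|verifier|operation not permitted|function not implemented'
}

# Usage: pipe recent pod logs through the filter.
#   kubectl logs -l app.kubernetes.io/name=mermin \
#     -n "${MERMIN_NAMESPACE}" --tail=500 | ebpf_errors
```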
Test eBPF Attach/Detach Operations
You can use the diagnose bpf subcommand to validate eBPF capabilities in a deployed Mermin cluster:
In a deployed Kubernetes cluster:
Before deploying (using a debug pod):
What the test validates:
Required Linux capabilities (BPF, NET_ADMIN, etc.)
eBPF program loading and verification
Attach/detach operations on network interfaces
BPF filesystem writeability (for TCX link pinning)
Kernel version and TCX vs netlink mode detection
Interpreting results:
All tests pass: Your environment is ready for Mermin
Attach failures: Check capabilities, kernel version, or interface availability
BPF FS not writable: Mount /sys/fs/bpf or configure volume mounts (see eBPF File System Not Mounted)
Capability errors: Verify security context configuration (see Missing Linux Capabilities)
The subcommand provides structured logging with clear success/failure indicators, making it easy to identify specific issues.
Finding Available Interfaces
List interfaces in the pod:
Debug Logging
Enable debug logging for detailed output:
What's Going Wrong?
1. Missing Linux Capabilities
Operation not permitted indicates missing Linux capabilities, which is the most common issue.
The Helm chart sets privileged: true by default, which grants all necessary capabilities. This is the simplest and most reliable approach:
If you can't use privileged mode (due to security policies), you can grant specific capabilities instead. Refer to the security considerations documentation for more information.
Note: Using specific capabilities requires kernel 5.8+ for the BPF and PERFMON capabilities. On older kernels, privileged: true is required.
Also required: hostPID: true to access the host network namespace:
Without hostPID: true, Mermin can't attach eBPF programs to host network interfaces.
2. Kernel Version Too Old
Invalid argument or Function not implemented indicates a kernel that is too old for eBPF support.
Check your kernel version:
Requirements: Mermin requires Linux kernel 5.14 or newer (6.6+ recommended). Upgrade nodes running older kernels.
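The version thresholds can be checked with a short script. The sketch below classifies a kernel release string using the 5.14 minimum stated above and the 6.6 TCX threshold described later in this guide; the function name classify_kernel and its output labels are hypothetical.

```shell
# Classify a kernel release string against the thresholds in this guide:
# below 5.14 is unsupported, 5.14 through 6.5 uses netlink TC, 6.6+ uses TCX.
# With no argument, the running kernel (uname -r) is classified.
classify_kernel() (
  release="${1:-$(uname -r)}"
  major="${release%%.*}"        # text before the first dot
  rest="${release#*.}"          # text after the first dot
  minor="${rest%%[!0-9]*}"      # leading digits of the remainder
  if [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 14 ]; }; then
    echo "unsupported"
  elif [ "$major" -eq 5 ] || { [ "$major" -eq 6 ] && [ "$minor" -lt 6 ]; }; then
    echo "netlink"
  else
    echo "tcx"
  fi
)

# Usage on a node:
#   classify_kernel
```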
3. BTF (BPF Type Format) Not Available
BTF provides type information for eBPF programs. BTF is not supported indicates the kernel was compiled without BTF enabled.
Check if BTF is available:
If the file does not exist, enable BTF in your kernel configuration or switch to a distribution with BTF support (most modern kernels include it).
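A minimal check, assuming the kernel exposes BTF at the standard /sys/kernel/btf/vmlinux location. The function name btf_available is hypothetical, and the optional path argument exists only so the check can be pointed at another file.

```shell
# Succeed if kernel BTF data is readable at the standard location
# (or at the path given as the first argument).
btf_available() {
  [ -r "${1:-/sys/kernel/btf/vmlinux}" ]
}

# Usage on a node:
#   if btf_available; then echo "BTF present"; else echo "BTF missing"; fi
```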
4. eBPF File System Not Mounted
Mermin pins eBPF maps to /sys/fs/bpf for state persistence. No such file or directory: /sys/fs/bpf indicates the BPF filesystem is not mounted.
Quick fix on the host node:
To make this permanent across reboots, add it to /etc/fstab:
Better yet, configure it in Kubernetes:
Without writable /sys/fs/bpf, Mermin runs in best-effort mode (unpinned maps). Flow state will not persist across pod restarts.
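To confirm whether the BPF filesystem is mounted on a node, a check along these lines may help. It is a sketch that parses the standard /proc/mounts format; the function name bpffs_mounted and its optional arguments are hypothetical.

```shell
# Succeed if a filesystem of type "bpf" is mounted at the given mount point.
# /proc/mounts columns: device, mount point, filesystem type, options, ...
bpffs_mounted() (
  mountpoint="${1:-/sys/fs/bpf}"
  mounts_file="${2:-/proc/mounts}"
  awk -v mp="$mountpoint" \
    '$2 == mp && $3 == "bpf" { found = 1 } END { exit !found }' "$mounts_file"
)

# Usage on a node (mounting requires root):
#   bpffs_mounted || mount -t bpf bpf /sys/fs/bpf
```

This only reports whether the mount exists; it does not verify that the Mermin container can write to it.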
Test BPF filesystem writeability:
Use the diagnose bpf subcommand to verify the BPF filesystem is writable in a deployed cluster:
On bare metal or in a debug pod:
The subcommand will report whether /sys/fs/bpf is writable. On kernels >= 6.6.0 (TCX mode), this is required for link pinning. If the test fails, ensure the BPF filesystem is properly mounted and the container has write permissions.
5. eBPF Verifier Rejection (Program Too Large)
The eBPF verifier enforces program complexity limits. Verifier instruction limit exceeded indicates the program exceeds these limits.
For more detailed guidance on verifier errors, see Common eBPF Errors.
Permission Errors
RBAC permission errors appear when Mermin lacks access to Kubernetes resources:
The service account lacks necessary permissions.
Check Your RBAC Configuration
Make sure your ClusterRole has the required permissions, which can be found in the Helm Chart template:
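Effective permissions can also be probed from the command line with kubectl auth can-i and impersonation. The sketch below assumes you know the service account name Mermin runs under; the helper name sa_can is hypothetical, and the resources to check should come from the Helm chart template rather than from this example.

```shell
# Hypothetical helper: ask the API server whether a service account may
# perform a verb on a resource in all namespaces.
sa_can() (
  ns="${1:?namespace}"; sa="${2:?service account}"
  verb="${3:?verb}"; resource="${4:?resource}"
  kubectl auth can-i "$verb" "$resource" \
    --as="system:serviceaccount:${ns}:${sa}" --all-namespaces
)

# Usage (service account name "mermin" is an assumption):
#   sa_can "${MERMIN_NAMESPACE}" mermin list pods
```

The command prints yes or no. Impersonation itself requires that your own user is allowed to impersonate service accounts.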
CNI and Interface Configuration
Missing expected traffic often indicates Mermin is not monitoring the correct network interfaces for your CNI plugin.
Configure Interfaces for Your CNI
Each CNI plugin creates different interface types. Here's what to use:
Calico:
interfaces = ["veth*", "cali*", "tunl*"]
Cilium:
interfaces = ["veth*", "cilium_*", "lxc*"]
Flannel:
interfaces = ["veth*", "flannel*"]
GKE Dataplane V2:
interfaces = ["gke*", "cilium_*", "lxc*"]
Different interface types show different traffic - veth interfaces capture pod-to-pod traffic, while tunnel interfaces capture encapsulated traffic.
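Before changing the configuration, it can help to preview which interfaces on a node a pattern list would select. The sketch below assumes Mermin's patterns behave like shell globs, which this guide does not confirm; the function name match_interfaces is hypothetical.

```shell
# Print each interface name read from stdin that matches at least one of
# the glob patterns given as arguments.
match_interfaces() {
  while IFS= read -r iface; do
    for pattern in "$@"; do
      # Unquoted $pattern makes case treat it as a glob, not a literal.
      case "$iface" in
        $pattern) printf '%s\n' "$iface"; break ;;
      esac
    done
  done
}

# Usage on a node:
#   ls /sys/class/net | match_interfaces 'veth*' 'cali*' 'tunl*'
```

An empty result suggests the pattern list does not fit the CNI in use on that node.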
Want to learn more? Check out these guides:
Interface Visibility and Traffic Decapsulation - Understand what traffic each interface type captures
Advanced Scenarios: Custom CNI Configurations - Complex CNI setups
Understanding TC Priority
TC (Traffic Control) priority determines the order in which eBPF programs execute in the networking stack. On older kernels (< 6.6), this is managed through netlink-based TC with numeric priorities. On newer kernels (>= 6.6), TCX mode uses explicit ordering.
Check What Priority Mermin is Using
You should see output like this:
How Priority Works
Think of priority as a queue - lower numbers cut to the front of the line:
Lower number = Higher priority = Runs earlier in the TC chain
Higher number = Lower priority = Runs later in the TC chain
Mermin's default: Priority 1 - Mermin runs first to capture an unfiltered, unprocessed view of network packets.
The Priority Conflict:
Most CNI programs (Cilium, Calico) also default to priority 1 for early packet processing. This creates a conflict - only one program can use each priority value.
Resolving the Conflict:
Since Mermin uses TC_ACT_UNSPEC (pass-through), it observes packets without modifying or blocking them. Running Mermin at priority 1 provides the most accurate observability data.
If your CNI also uses priority 1, you need to choose:
Recommended: Keep Mermin at priority 1, adjust your CNI to priority 2+ (e.g., Cilium priority 2)
Alternative: Move Mermin to a higher priority number (so it runs later) if you prefer the CNI to run first. Mermin then loses its unfiltered view.
Test any priority changes thoroughly! Adjusting either Mermin's or your CNI's priority can affect network behavior differently depending on your CNI plugin. Validate in a non-production environment that flows are captured correctly and network connectivity works as expected.
Why priority 1 matters for Mermin:
Prevents flow gaps from orphaned programs after restarts
Provides the most complete and accurate network observability
Troubleshooting Priority Conflicts
Priority conflicts are rare, but they can happen. You'll typically notice network connectivity issues if Mermin interferes with your CNI.
Common causes:
Mermin running before critical CNI programs that need to see traffic first
Multiple programs using the same priority value
Non-standard CNI priority configurations
Debug it step by step:
First, check what priorities are in use:
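In netlink mode, one way to inspect attached programs is tc filter show, where the pref field in the output is the priority. The sketch below wraps it for both directions; the helper name show_tc_filters is hypothetical, and the interface to inspect depends on your CNI.

```shell
# Hypothetical helper: list TC filters attached to an interface in both
# directions. Intended for netlink mode; programs attached through TCX
# may not appear in this output.
show_tc_filters() (
  dev="${1:?usage: show_tc_filters <interface>}"
  for direction in ingress egress; do
    echo "== $dev $direction =="
    tc filter show dev "$dev" "$direction"
  done
)

# Usage on a node:
#   show_tc_filters eth0
```

On kernels using TCX, bpftool net show dev eth0 is an alternative if bpftool is installed on the node.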
Then adjust based on your kernel version:
For older kernels (< 6.6) - netlink mode:
For newer kernels (>= 6.6) - TCX mode:
Important: Changing from the default priority/order settings can cause issues with some CNI plugins, including missing flows or network connectivity problems. Test thoroughly in a non-production environment first and verify that flows are being captured correctly for your specific CNI.
Not sure which kernel you're running?
If it's >= 6.6.0, you're using TCX mode (you'll also see this in the logs). In TCX mode, tc_priority is ignored in favor of tcx_order.
Quick reference:
TCX mode (kernel >= 6.6): Programs are ordered explicitly using tcx_order (first/last)
Netlink mode (kernel < 6.6): Programs are ordered by numeric priority (lower = earlier)
Priority only affects execution order, not performance
Running first helps prevent flow gaps after restarts
Configuration Syntax Errors
HCL syntax errors can be tricky to debug. If Mermin won't start and you see something like:
Your configuration file has a syntax error.
Validate Your Configuration
Use Terraform's formatter to check for syntax errors:
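One way to do this, sketched under the assumption that the Mermin configuration file does not use a .tf extension: terraform fmt reads from standard input when given a single dash, which sidesteps its file extension checks. The helper name check_hcl_syntax is hypothetical.

```shell
# Hypothetical helper: run an HCL file through "terraform fmt" via stdin.
# The formatted output is discarded; only the exit status and any error
# messages on stderr matter. A non-zero status indicates a parse failure.
check_hcl_syntax() (
  file="${1:?usage: check_hcl_syntax <file>}"
  terraform fmt - < "$file" > /dev/null
)

# Usage (file name is a placeholder):
#   check_hcl_syntax config.hcl && echo "syntax OK"
```

This validates HCL syntax only. It does not know Mermin's schema, so unknown or misspelled keys still pass.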
Common Mistakes to Watch For
Missing closing braces - Every { needs a matching }
Mismatched quotes - Use "quotes" consistently
Invalid key names - Use underscores (tc_priority), not hyphens (tc-priority)
Next Steps
Configure Network Interfaces: Optimize for your CNI
Set Up OTLP Export: Send flows to your backend
Diagnose eBPF Verifier Errors: Detailed solutions for verifier failures
Understand Interface Visibility: Why traffic might not appear
Search Existing Issues: Check if someone else had the same problem
GitHub Discussions: Ask for community help
Related Documentation
Configuration Reference: Complete configuration options
Security Considerations: Understand required privileges