FAQ
The Edge Regulator app filters noisy events from log forwarders, reducing storage and log analytics costs by enforcing budget caps.
Overview
What is Edge Regulator and how does it control costs
Edge Regulator applies cost-based sampling to your log pipeline, using automatic message enrichment to identify event types by their symbol identity and calculate actual ingestion cost per event (bytes x $/GB).
When an event type exceeds its share of the budget, Edge Regulator samples it down -- in real-time at the edge, before expensive data reaches your SIEM. A severity boost ensures ERROR and WARN events are more likely to be retained than DEBUG noise.
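The per-event cost math and severity boost described above can be sketched as follows. This is a minimal illustration, not Edge Regulator's actual implementation; the `COST_PER_GB` rate and the boost table are assumed example values.

```python
import random

COST_PER_GB = 2.50                      # $/GB ingested (illustrative rate)
SEVERITY_BOOST = {"DEBUG": 0.5, "INFO": 1.0, "WARN": 1.5, "ERROR": 10.0}

def event_cost_dollars(raw_event: bytes) -> float:
    """Actual ingestion cost per event: bytes x $/GB."""
    return len(raw_event) / 1e9 * COST_PER_GB

def retain(severity: str, base_threshold: float) -> bool:
    """Severity boost raises retention probability for ERROR/WARN over DEBUG."""
    p = min(1.0, base_threshold * SEVERITY_BOOST.get(severity, 1.0))
    return random.random() < p
```

With a base threshold of 0.1, an ERROR event's boosted probability reaches 1.0 (always retained) while DEBUG events survive only ~5% of the time.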
Works in local (per-node) or global (cluster-wide via GitOps lookup) mode.
How do budget policies work

Budget policies are configured via a YAML file or the Console UI. The rate regulator tracks per-event-type spend using automatic symbol-identity enrichment. You configure:
- Budget per hour -- target ingestion cost rate (e.g., $1.50/hour)
- Max share per event type -- prevents any single event type from dominating the budget (e.g., 20%)
- Severity boost -- ERROR events get higher retention probability than DEBUG noise
Plain English: "Cap total ingestion at $500/hour, prevent any single event type from consuming more than 20% of the budget, and give ERROR events higher priority over INFO events."
When an event type exceeds its max share, the regulator samples it down proportionally. For multi-app environments (Kubernetes), regulate per-app budgets using the container name field -- scaling replicas doesn't bypass limits.
What happens to logs when budget limits are reached
The rate regulator applies cost-based sampling. Each event is either retained or dropped based on how much its event type has spent relative to the budget:
- Under budget -- all events flow through normally
- Event type over its max share -- that type gets sampled down proportionally (e.g., an event type consuming 60% of the budget with a 20% cap gets sampled to ~33%)
- Severity boost -- ERROR events receive a higher retention multiplier, making them far more likely to survive sampling than DEBUG noise
The result: noisy event types are automatically throttled while critical events are preserved. Edge Regulator exports metrics tracking exactly what was regulated, so you can tune policies over time.
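The proportional down-sampling above can be sketched as a single ratio. This is an assumed formula, chosen to be consistent with the 60%-over-a-20%-cap example in the list:

```python
def sample_rate(actual_share: float, max_share: float) -> float:
    """Down-sample an event type proportionally once it exceeds its cap.

    An event type consuming 60% of the budget with a 20% cap is
    sampled to max_share / actual_share = 0.20 / 0.60, i.e. ~33%.
    """
    if actual_share <= max_share:
        return 1.0          # under its share: all events flow through
    return max_share / actual_share
```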
Configuration
Can I set different budgets for different event types
Yes. The rate regulator tracks spend per event type using configurable field sets. You can regulate by:
- Message pattern -- each distinct event type gets its own budget share
- K8s container name -- cap total spend per application, regardless of pod replica count
- Both combined -- cap per event type per application for fine-grained control
The severity boost ensures higher-severity events (ERROR, WARN) are more likely to be retained when sampling kicks in, without requiring separate budget tiers.
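A hypothetical sketch of per-field-set spend tracking, showing why keying by (message pattern, container name) means scaling replicas cannot bypass the limit; the function names and windowing are assumptions for illustration:

```python
from collections import defaultdict

# Spend tracker keyed by a configurable field set. All replicas of one
# container share a key, so their spend accumulates against one budget.
spend: defaultdict = defaultdict(float)   # dollars spent this window per key

def record(message_pattern: str, container: str, cost: float) -> None:
    spend[(message_pattern, container)] += cost

def share(message_pattern: str, container: str, budget: float) -> float:
    """Fraction of the window budget consumed by this (type, app) pair."""
    return spend[(message_pattern, container)] / budget
```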
How does Edge Regulator protect critical events
The rate regulator uses two mechanisms that naturally protect critical events:
- Severity boost -- ERROR and WARN events receive a higher retention multiplier during sampling. Even when their event type is over budget, critical-severity events are far more likely to be retained than DEBUG noise
- Max share targeting -- regulation only kicks in when a specific event type exceeds its configured share of the budget (e.g., 20%). Low-volume event types -- which security and authentication logs typically are -- stay well under their share and pass through unaffected
The regulator targets noisy, high-volume event types that dominate your budget -- not broad categories. Security events that don't spike beyond their budget share flow through without regulation.
How do I monitor budget consumption
Edge Regulator publishes cost metrics per event type -- volume, spend rate, and regulated counts. Three ways to consume them:
- ROI Analytics dashboards -- Grafana dashboards showing budget consumption, regulated volumes, and cost trends per event type. Available as managed SaaS or self-hosted with open-source JSON definitions
- Prometheus Metrics API -- standard REST endpoints for querying all cost and regulation metrics programmatically via /query, /query_range, and /series
- Native metric outputs -- export to Prometheus, Datadog, CloudWatch, SignalFx, or any compatible platform for custom dashboards and alerts
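As a sketch of querying the Prometheus-compatible Metrics API, the snippet below builds a /query request URL. The metric name `regulator_spend_dollars_total` and the `event_type` label are hypothetical examples; check the exported metric catalog for the real names.

```python
from urllib.parse import urlencode

def spend_query_url(base: str, event_type: str) -> str:
    """Build a /query URL for the spend rate of one event type.

    `regulator_spend_dollars_total` is an assumed metric name.
    """
    params = urlencode({
        "query": f'rate(regulator_spend_dollars_total{{event_type="{event_type}"}}[5m])'
    })
    return f"{base}/query?{params}"
```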
How do I configure priority tiers to always forward critical events
Use the levelBoost configuration to force minimum retention floors per severity level. This creates a priority tier system where critical events are guaranteed to pass through even when their event type is over budget.
How it works: The regulator calculates a retention threshold based on budget spend. With levelBoost, each event's threshold is multiplied by its severity level's boost, and the boosted value acts as a minimum retention floor. For example, ERROR: 100 multiplies the base threshold by 100, so ERROR events are always retained whenever the base threshold is 0.01 or higher (0.01 * 100 = 1.0, i.e., certain retention).
Helm configuration example:
```yaml
# values.yaml for Edge Regulator Helm chart
regulator:
  config:
    rateRegulatorBudgetPerHour: "$1.50"
    rateRegulatorMaxSharePerFieldSet: "0.20"
    # Severity boost: guarantee higher priority for critical events
    rateRegulatorLevelBoost:
      TRACE: 0.25
      DEBUG: 0.5
      INFO: 1.0
      WARN: 1.5
      ERROR: 10.0      # High boost ensures ERRORs pass through
      CRITICAL: 50.0   # Highest priority for security/outage alerts
```
YAML config example (non-Helm):
```yaml
# config.yaml for Edge Regulator sidecar
tenx-regulator:
  mode: local
  threshold:
    budgetPerHour: "$1.50"
    maxSharePerFieldSet: 0.20
    levelBoost:
      TRACE: 0.25      # Debug noise: aggressive sampling
      DEBUG: 0.5       # Development logs: light sampling
      INFO: 1.0        # Normal events: baseline threshold
      WARN: 1.5        # Warnings: higher priority
      ERROR: 10.0      # Production errors: always forward
      CRITICAL: 50.0   # System critical: guaranteed pass
```
Real-world scenarios:
- API Gateway (all errors must be captured):
  - Budget: $2/hour across 50 microservices
  - ERROR boost: 100 (guarantees all API errors pass)
  - DEBUG boost: 0.1 (aggressive sampling on verbose request logs)
  - Result: ~95% reduction on debug, zero loss on errors
- Background job queue (low signal, high volume):
  - Budget: $0.50/hour
  - INFO boost: 0.25 (job completion messages sampled 75%)
  - ERROR boost: 20.0 (failed jobs always captured)
  - CRITICAL boost: 100.0 (job queue down = always alerting)
  - Result: cost control on routine ops, full visibility on failures
- Kubernetes cluster (health checks vs. pod crashes):
  - Budget: $5/hour per node
  - DEBUG boost: 0.1 (health check events: 90% dropped)
  - WARN boost: 2.0 (pod warnings: light sampling)
  - ERROR boost: 100.0 (pod failures and crashes: always forward)
  - Result: 95% reduction on health checks, 100% capture on pod issues
Configuration interaction with budget:
The regulator applies: retentionThreshold = max(calculatedThreshold, minThreshold * boost)
This means:
- Events with high boost values are always more likely to pass than events with low boost values, regardless of budget pressure
- Budget still applies as the upper limit -- you won't exceed your per-hour cap
- Lower boost values get sampled first when over budget; higher boost values get sampled last
- A boost: 100 on ERROR doesn't mean "ignore budget" -- it means "ERROR events compete at 100x priority vs. DEBUG events at 0.25x priority"
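The interaction above can be sketched directly from the stated formula. The cap at 1.0 (certain retention) is an added assumption for clarity:

```python
def retention_threshold(calculated: float, min_threshold: float, boost: float) -> float:
    """retentionThreshold = max(calculatedThreshold, minThreshold * boost).

    Capped at 1.0, since a threshold of 1.0 already means certain retention.
    """
    return min(1.0, max(calculated, min_threshold * boost))
```

Under heavy budget pressure (calculated threshold 0.005, floor 0.01), an ERROR boost of 100 forces a threshold of 1.0 (always retained), while a DEBUG boost of 0.25 leaves the low calculated threshold in effect.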
Integration & Deployment
Which log forwarders does Edge Regulator support
Edge Regulator integrates with all major log forwarders, including Fluentd, Fluent Bit, the OpenTelemetry Collector, and Splunk Connect for Kubernetes.
Deployment: Runs as a sidecar process alongside your forwarder. Kubernetes deployment via Helm chart (DaemonSet). Setup time: ~30 minutes.
Resource requirements: 512 MB heap + 2 threads handles 100+ GB/day per node. See Performance FAQ for sizing details and Kubernetes resource specs, and the deployment guide for per-forwarder configuration.
Can Edge Regulator reduce Kubernetes health check volume
Yes. Health checks, liveness probes, and pod lifecycle events are highly repetitive and can represent 20-40% of total log volume in K8s environments. 95% reduction is common for these event types.
How it works:
- Set a budget cap for health check event types using rate regulation
- Pair with Edge Optimizer to losslessly compact remaining events via templates
- Result: 95% reduction on health checks while preserving all pod metadata (namespace, pod name, container name, labels)
Failed health checks are always captured. Works with Splunk Connect for Kubernetes, Fluentd, Fluent Bit, OpenTelemetry Collector.
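As a back-of-the-envelope check of the numbers above (the inputs are assumed examples, not measurements):

```python
def projected_volume(total_gb: float, health_frac: float, reduction: float) -> float:
    """Projected daily log volume after regulating health-check events.

    health_frac: fraction of volume that is health checks (20-40% typical)
    reduction:   reduction applied to that slice (0.95 is common)
    """
    return total_gb * (1 - health_frac * reduction)
```

For example, a node shipping 100 GB/day where 30% is health checks drops to 71.5 GB/day after a 95% reduction on that slice.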