
Rate

Prevent log analytics over-billing while ensuring critical events always reach your analysis tools.

The rate receiver uses automatic message enrichment combined with byte-based cost calculation to apply sampling based on actual ingestion costs per logical event type.

By tracking costs (event byte size × vendor ingestion rate), the receiver provides business-aligned budget enforcement that accounts for variable event sizes (e.g., a 10KB error log is weighted correctly against a 100-byte debug message).
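
As a rough illustration of that weighting, here is a minimal sketch in Python of the per-event cost calculation described above. It assumes a decimal GB (10⁹ bytes) and Splunk-style pricing; it is not the receiver's actual code.

# Hypothetical sketch of per-event cost weighting; the receiver's own
# arithmetic may differ (e.g., binary vs. decimal GB).
INGESTION_COST_PER_GB = 1.50  # USD per GB, e.g. Splunk

def event_cost(size_bytes: int) -> float:
    """Cost of ingesting one event: bytes scaled to GB times $/GB."""
    return size_bytes / 1_000_000_000 * INGESTION_COST_PER_GB

error_log = event_cost(10_240)  # ~$0.0000154 for a 10KB error log
debug_log = event_cost(100)     # ~$0.00000015 for a 100-byte debug message
print(f"error/debug cost ratio: {error_log / debug_log:.0f}x")  # ~102x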

This approach enables more precise control than regex-based rules, which require manual configuration and have no awareness of logical event type, severity, or cost.

Control Strategies

The rate receiver supports two independent strategies for filtering events. They answer different questions and can be run in the same pipeline as layers — mute file as the surgical human-intent layer, per-node budget as the always-on safety net.

In per-node budget mode, forwarders maintain independent cost counters without cross-node communication, tracking spend per event type (symbolMessage) based on byte volume and configured ingestion costs. The benefits are simple per-service setup, fast processing with no network round-trips, and fault isolation that contains issues to a single node.

Trade-offs include decisions limited to local data (risking cluster-wide budget overruns) and no visibility into cross-service patterns.

Example: A single application forwarder tracks its own $0.025/min budget, throttling high-cost debug logs probabilistically based solely on its own traffic.

Activated when rateReceiverLookupFile is not set.

In mute-file mode, a declarative file keyed by the joined rateReceiverFieldNames values (the same key the local receiver uses for its per-node counters) caps specific patterns with an explicit sample rate and expiry. The file is typically committed to a git repo alongside the pipeline config and pulled via gitops, so each mute has a diff, a reviewer, and an audit trail.

File format:

<fieldSet>=<sampleRate>:<untilEpochSec>[:<reason>]

With rateReceiverFieldNames: [symbolMessage] the key is just the symbolMessage value (e.g. Error_syncing_pod); with [symbolMessage, container] it becomes <symbolMessage>_<container> (e.g. heartbeat_debug_frontend).
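
Conceptually, the counter (and mute) key is just those configured field values joined with underscores. A minimal Python sketch, with an illustrative helper name and event shape rather than the receiver's actual API:

def field_set_key(event: dict, field_names: list[str]) -> str:
    """Join the configured field values into a single counter/mute key."""
    return "_".join(str(event[name]) for name in field_names)

event = {"symbolMessage": "heartbeat_debug", "container": "frontend"}
field_set_key(event, ["symbolMessage"])               # 'heartbeat_debug'
field_set_key(event, ["symbolMessage", "container"])  # 'heartbeat_debug_frontend'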

Trade-offs: does nothing about unknown patterns or runaway nodes — this is human-declared intent, not adaptive control. Pair with per-node budget mode (in a separate receiver instance) if you need a fallback safety net.

Example: An operator notices the Reporter attributing $12K/month to Error_syncing_pod. They append Error_syncing_pod=0.10:1744848000:pod error spam OPS-4821 to the mute file, open a PR, merge it. All forwarders pulling the file apply the mute on their next reload. The mute self-expires at the epoch, so nobody has to remember to clean it up.

Activated when rateReceiverLookupFile points at a mute file.

Multi-App Receiving

For central forwarders handling logs from multiple applications (common in Kubernetes), the rate receiver prevents individual apps from bypassing budget caps by scaling pods. Use the k8s container name field to aggregate spend per app across all replicas. Two approaches are available:

Option A: Cap Total App Spend (All Event Types)

Prevents any single app from dominating the budget regardless of how many event types it emits.

rateReceiverFieldNames: [container]  # App only
rateReceiverMaxSharePerFieldSet: 0.2
rateReceiverBudgetPerHour: 1.50

Result:

  • Frontend app (all events, 5 pods): Cannot exceed 20% of total budget ($0.30/hour)
  • Backend app (all events, 2 pods): Cannot exceed 20% ($0.30/hour)
  • Payment app (all events, 1 pod): Cannot exceed 20% ($0.30/hour)

Trade-off: Loses event-type intelligence—can't prioritize ERROR over DEBUG within an app.

Option B: Cap Per Event Type Per App

Enforces fairness within each app—prevents a single noisy event type from dominating that app's spend.

rateReceiverFieldNames: [symbolMessage, container]  # Event type + app
rateReceiverMaxSharePerFieldSet: 0.2
rateReceiverBudgetPerHour: 1.50

Result:

  • "heartbeat_debug|frontend" (5 pods): Cannot exceed 20% of total budget
  • "error_login|frontend" (5 pods): Separate 20% cap
  • "timeout|payment-service" (1 pod): Separate 20% cap

Trade-off: Each (event type × app) combo gets its own 20% cap—apps with many event types could theoretically exceed 20% total (though unlikely in practice).

Key Insight: Use container (not pod) for aggregation—container name is stable across replicas, while pod names are unique per instance. Scaling from 1→10 pods doesn't bypass limits.
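
To see why, consider how replicas map onto counter keys. Pod names change with every replica while the container name stays fixed; a toy Python sketch (not receiver code):

replicas = [
    {"pod": "frontend-7c9d4-abc12", "container": "frontend"},
    {"pod": "frontend-7c9d4-def34", "container": "frontend"},
    {"pod": "frontend-7c9d4-ghi56", "container": "frontend"},
]

# Keyed by pod: three separate counters, each with its own share cap,
# so scaling out effectively multiplies the app's allowance.
print({r["pod"] for r in replicas})        # 3 distinct keys

# Keyed by container: one counter shared by all replicas, so scaling
# from 1 to 10 pods cannot bypass the per-app cap.
print({r["container"] for r in replicas})  # {'frontend'}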

Workflow

The rate receiver executes the following steps:

graph LR
    A["<div style='font-size: 14px;'>📥 Input</div><div style='font-size: 10px; text-align: center;'>TenXObject</div>"] --> B["<div style='font-size: 14px;'>💰 Cost Calc</div><div style='font-size: 10px; text-align: center;'>Bytes × $/GB</div>"]
    B --> C{"<div style='font-size: 14px;'>🗂️ Mute File</div><div style='font-size: 10px; text-align: center;'>Set?</div>"}
    C -->|Yes| D["<div style='font-size: 14px;'>🔇 Lookup</div><div style='font-size: 10px; text-align: center;'>Field Set</div>"]
    C -->|No| E["<div style='font-size: 14px;'>📈 Local</div><div style='font-size: 10px; text-align: center;'>Node Spend</div>"]
    D --> F["<div style='font-size: 14px;'>⚖️ Budget Check</div><div style='font-size: 10px; text-align: center;'>vs Target Rate</div>"]
    E --> F
    F --> G["<div style='font-size: 14px;'>📊 Event Share</div><div style='font-size: 10px; text-align: center;'>vs Max %</div>"]
    G --> H["<div style='font-size: 14px;'>🎯 Boost</div><div style='font-size: 10px; text-align: center;'>by Severity</div>"]
    H --> I{"<div style='font-size: 14px;'>🎲 Sample</div><div style='font-size: 10px; text-align: center;'>Decision</div>"}
    I -->|Drop| J["<div style='font-size: 14px;'>🗑️ Drop</div><div style='font-size: 10px; text-align: center;'>TenXObject</div>"]
    I -->|Keep| K["<div style='font-size: 14px;'>✅ Retain</div><div style='font-size: 10px; text-align: center;'>TenXObject</div>"]

    classDef input fill:#3b82f688,stroke:#2563eb,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef decision fill:#eab30888,stroke:#d97706,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef process fill:#059669,stroke:#047857,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef rate fill:#7c3aed88,stroke:#6d28d9,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef retain fill:#16a34a,stroke:#15803d,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef drop fill:#dc2626,stroke:#b91c1c,color:#ffffff,stroke-width:2px,rx:8,ry:8

    class A input
    class C,I decision
    class B,D,F,G,H process
    class E rate
    class K retain
    class J drop

Local Mode (Without Lookup): Per-Node Filtering

Scenario: A Kubernetes node running 3 pods with config:

  • Budget: $1.50/hour ($0.025/min)
  • Max share per event type: 20%
  • Ingestion cost: $1.50/GB (Splunk)
  • 5-minute tracking window

Step-by-step for a Kubernetes pod error event (ERROR level, 1.8KB):

  1. 📥 Event Arrives: Pod emits a CrashLoopBackOff error with full Kubernetes metadata (see raw JSON, 1835 bytes)

  2. 💰 Cost Calculated: 1835 bytes / 1GB × $1.50 = $0.0000028 per event

  3. 📊 Field Set Identified: symbolMessage = "Error_syncing_pod" (extracted by message enrichment) → counter key: Error_syncing_pod

  4. 📈 Track Spend (Local):

  • Current 5-min window spend: Error_syncing_pod = $0.06, total = $0.10
  • After increment: Error_syncing_pod = $0.0600028, total = $0.1000028
  • Normalize to per-minute: Error_syncing_pod = $0.012/min, total = $0.020/min
  5. ⚖️ Budget Check: Is total over budget?

  • Total spend rate: $0.020/min vs. target $0.025/min → Under budget → globalScale = 1.0 (no throttling)

  6. 📊 Event Share Check: Is "Error_syncing_pod" dominating?

  • Share: $0.06 / $0.10 = 60% vs. max 20% → Over share limit
  • Scale down: fieldSetRate = 0.2 / 0.6 = 0.33 (retain 33% of these events)

  7. 🎯 Severity Boost: ERROR level boost = 2.0

  • baseRate = 1.0 × 0.33 = 0.33
  • finalRate = 0.33 × 2.0 = 0.66 → 66% retention (boost helps but doesn't fully override)

  8. 🎲 Sample Decision: random(0-1) = 0.3 < 0.66 → ✅ Event Kept

Result: "Error_syncing_pod" is heavily over the 20% share (at 60%), so it gets throttled to 33% base rate. The ERROR severity boost (2.0×) increases retention to 66%, meaning 2/3 of these ERROR events are kept. If it were DEBUG (boost=0.5), final rate would be 0.165 → 83% chance of being dropped.


Mute File Mode: Declarative Field-Set Caps

Scenario: A platform engineer sees the Reporter attributing $12K/month to the Error_syncing_pod event type. They want to cap it at 10% sample rate for 24 hours while the application team ships a fix. The pipeline is configured with rateReceiverFieldNames: [symbolMessage], so mute keys are symbolMessage values.

Mute file contents (mutes.csv, pulled via gitops from a config repo):

Error_syncing_pod=0.10:1744848000:pod error spam OPS-4821
heartbeat_debug=0.00:1744416000:k8s liveness 200s
jwt_validated=0.25:1744502400:auth flood after deploy

Step-by-step for an incoming pod error event (INFO level, 1.2KB, symbolMessage = "Error_syncing_pod"):

  1. 📥 Event Arrives: Forwarder receives the event and builds the field-set key by joining rateReceiverFieldNames values — here just Error_syncing_pod.

  2. 🗂️ Mute File Check: Look up Error_syncing_pod in the mute file → entry found: 0.10:1744848000:...

  3. ⏰ Expiry Check: Compare untilEpochSec = 1744848000 against current time.

  • If past expiry: mute self-heals, event is retained with no further checks.
  • If still active: proceed to the sample-rate decision.
  4. 🎯 Severity Floor: This is an INFO event.

  • minRetentionThreshold = 0.1, levelBoost[INFO] = 1.0 → floor = 0.1
  • retentionThreshold = max(0.10, 0.10) = 0.10
  • Had this been an ERROR event, the floor would be 0.1 × 2.0 = 0.20, raising the retention to max(0.10, 0.20) = 0.20 — even under a 10% mute, ERRORs stay at 20%.

  5. 🎲 Sample Decision: random(0-1) = 0.73 > 0.10 → 🗑️ Event Dropped

Result: On average, 10% of Error_syncing_pod events are retained for the next 24 hours. Events with any other field-set are unaffected — the mute is surgical. When the engineer ships the fix, the mute expires automatically; no cleanup required.
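
A minimal Python sketch of the mute-file decision for this event, assuming the entry has already been parsed into a sample rate and expiry; function and variable names are illustrative, not the receiver's API:

import random, time

LEVEL_BOOST = {"TRACE": 0.25, "DEBUG": 0.5, "INFO": 1.0,
               "WARN": 1.5, "ERROR": 2.0, "FATAL": 3.0}
MIN_RETENTION = 0.1

def keep_muted_event(sample_rate: float, until_epoch: int, level: str,
                     now: float | None = None) -> bool:
    now = time.time() if now is None else now
    if now >= until_epoch:   # mute expired: self-heals, event is retained
        return True
    # Severity floor: ERROR/FATAL are never fully suppressed, even at 0.0.
    floor = MIN_RETENTION * LEVEL_BOOST.get(level, 1.0)
    return random.random() < max(sample_rate, floor)

# Error_syncing_pod muted at 10% until epoch 1744848000:
keep_muted_event(0.10, 1744848000, "INFO")   # ~10% of INFO events kept
keep_muted_event(0.10, 1744848000, "ERROR")  # floor 0.1 × 2.0 -> ~20% kept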

Key Difference from per-node budget mode:

  • Scope: per-node mode throttles any pattern that pushes spend over a budget; mute-file mode only touches explicitly declared field-sets and leaves everything else alone.
  • Authority: per-node mode makes autonomous probabilistic decisions based on counters; mute-file mode applies human-declared intent reviewed via PR.
  • Workflow fit: the mute file is edited by operators (often via an AI assistant like Claude Code using the Log10x MCP + GitHub MCP), committed to git, and pulled into each forwarder's config via gitops. The file is the interface.

Config Files

To configure the rate receiver module, edit these files.

Below is the default configuration from: rate/config.yaml.


# 🔟❎ 'run' rate receiver configuration

# rate receivers utilize cost-based sampling to filter noisy telemetry from event outputs (e.g., Splunk)
# Enforces spending limits by tracking byte volume and ingestion costs per event type
# To learn more see https://doc.log10x.com/run/receive/rate/

# Set the 10x pipeline to 'run'
tenx: run

# =============================== Dependencies ================================

include: run/modules/receive/rate

# ============================== Rate Options =================================

rateReceiver:

  # 'fieldNames' specifies the list of TenXObject fields to identify rate counter buckets
  #  The list usually contains the symbolMessage field from message enrichment
  #  Can include additional fields like container (k8s app), country (GeoIP), httpCode, etc.
  fieldNames:
    - $=yield TenXEnv.get("symbolMessageField")  # Matches lookup keys

  # 'resetIntervalMs' specifies reset interval for rate counters in milliseconds
  #  Default of 5 minutes provides stable cost averages while remaining responsive
  #  Trade-offs: 1min (reactive/noisy) vs 5min (balanced) vs 10min+ (stable/slow)
  resetIntervalMs: $=parseDuration("5m")

  # 'minRetentionThreshold' specifies minimum retention threshold for high-spend events (0.0 to 1.0)
  #  Ensures some events are always retained even when budget is exceeded
  #  Example: 0.1 = minimum 10% retention even for very high-spend patterns
  minRetentionThreshold: 0.1

  # 'levelBoost' specifies severity level boost mapping for sampling rates
  #  Higher severity events can be given higher retention rates through boost values
  #  Example: ERROR=2.0 means ERROR events are twice as likely to be retained
  levelBoost:
    - TRACE=0.25
    - DEBUG=0.5
    - INFO=1
    - WARN=1.5
    - ERROR=2
    - FATAL=3

  # ----------------------------- Budget Options ------------------------------

  # 'budgetPerHour' specifies target spending budget per hour in USD
  #  Soft target for total cost across all event types on this node
  #  Actual spend may exceed by 10-20% due to soft enforcement
  #  Examples: $1.50/hour (~$36/day), $0.10/hour (dev/test), $0.02/hour (minimal)
  budgetPerHour: 1.50

  # 'maxSharePerFieldSet' specifies maximum % of total spend any single field set can consume
  #  Enforced independently of budget—prevents noisy field sets from dominating even when under budget
  #  Example: 0.2 means no single event type can exceed 20% of total spend
  #  Applies in per-node budget mode only; ignored when the mute file ('lookup.file') is set
  maxSharePerFieldSet: 0.2

  # 'ingestionCostPerGB' specifies vendor ingestion cost per GB in USD
  #  Used to calculate per-event costs based on byte size (per-node budget mode only)
  #  Common values: Splunk $1.50/GB, Datadog $0.10-$0.25/GB, Elastic $0.109/GB
  ingestionCostPerGB: 1.5

  # ---------------------------- Mute File Options ----------------------------

  # Declarative mute file keyed by the joined field-set defined in 'fieldNames'.
  # When 'file' is set, the receiver switches from per-node budget sampling to
  # mute-file mode: per-event decisions are driven entirely by entries in the file.
  #
  # File format (one entry per line):
  #   <fieldSet>=<sampleRate>:<untilEpochSec>[:<reason>]
  #
  # Example (with fieldNames: [symbolMessage]):
  #   Error_syncing_pod=0.10:1744848000:pod error spam OPS-4821
  #   heartbeat_debug=0.00:1744416000:k8s liveness 200s
  #
  # Entries self-heal past 'untilEpochSec'. Severity boost still applies as a floor
  # so ERROR/FATAL are never fully suppressed even under a 0.0 mute.
  #
  # Periodically pulling the mute file to keep it fresh is done via the gitops
  # configuration — see https://doc.log10x.com/config/github/#config
  #
  lookup:

    # 'file' specifies the mute file path. Will reload on change.
    #  Comment out to use per-node budget sampling (default).
    # file: $=path("data/sample/mutes") + "/mutes.csv"

    # 'retain' specifies the period before the file is marked as stale.
    retain: $=parseDuration("10m")

Options

Specify the options below to configure the rate receiver:

Name Description
rateReceiverFieldNames List of TenXObject fields to identify rate counter buckets
rateReceiverResetIntervalMs Reset interval for rate counters in milliseconds
rateReceiverMinRetentionThreshold Minimum retention threshold for events when budget is exceeded
rateReceiverLevelBoost Severity level boost mapping for retention rates
rateReceiverLookupFile Declarative mute file keyed by field-set
rateReceiverLookupRetain Period before the mute file is considered stale
rateReceiverBudgetPerHour Target spending budget per hour in USD
rateReceiverMaxSharePerFieldSet Maximum % of total spend any single field set can consume
rateReceiverIngestionCostPerGB Vendor ingestion cost per GB in USD

rateReceiverFieldNames

List of TenXObject fields to identify rate counter buckets.

Type Default
List [symbolMessage]

Defines the list of TenXObject field names extracted to identify which rate counter bucket an event belongs to. The list usually contains the symbolMessage field from the message enrichment module but can include additional fields like GeoIP, HTTP code, k8s container name, or custom enrichments for multi-dimensional rate tracking.

Common Use Cases:

Single-app receiving (per event type):

rateReceiverFieldNames:
  - symbolMessage

Multi-dimensional tracking (event type + geography + HTTP status):

rateReceiverFieldNames:
  - symbolMessage
  - country
  - httpCode

Multi-app receiving in Kubernetes:

Option A: Cap total spend per app (all event types combined):

rateReceiverFieldNames:
  - container  # Aggregates all event types for each app

Each app's total spend (across all event types and pods) gets one cap. Simple but loses event-type intelligence.

Option B: Cap spend per event type per app:

rateReceiverFieldNames:
  - symbolMessage  # Event type
  - container      # App identifier (same across all pods)

Each (event type × app) combo gets its own cap. Provides fairness within apps, but an app with many event types could exceed that share in aggregate.

Use container (not pod) for aggregation—container name is stable across replicas while pod names are unique per instance.

rateReceiverResetIntervalMs

Reset interval for rate counters in milliseconds.

Type Default
Number 300000

Defines the interval in milliseconds after which to reset rate counters. Controls how frequently the receiver resets its tracking counters.

Default of 5 minutes (300000ms) provides stable cost averages and smooths out bursts while remaining responsive. Shorter windows (e.g., 1 minute) are more reactive but noisier; longer windows (e.g., 10 minutes) are smoother but slower to adapt.

Trade-offs:

  • 1 minute: Very responsive for bursts, but unstable averages and over-reactive throttling
  • 5 minutes: Balanced—stable averages, catches sustained patterns, aligns with hourly budgets (1/12 hour)
  • 10+ minutes: Very stable for long-term trends, but slower to adapt and less suitable for short-lived nodes

Validation: Must be at least 60000 milliseconds (1 minute).

rateReceiverMinRetentionThreshold

Minimum retention threshold for events when budget is exceeded.

Type Default
Number 0.1

Defines the minimum retention rate (0.0 to 1.0) applied to events even when budget is exceeded. Ensures some events are always retained even for very high-spend patterns, preventing complete data loss.

This value ensures a floor on retention, preventing complete data loss even when budget is exceeded. The severity level boost (from rateReceiverLevelBoost) is multiplied with this value to determine the actual minimum retention threshold for each severity level.

How boost works:

  • Boost only affects the minimum retention threshold, not the calculated threshold based on budget
  • When under budget: retention threshold is based on budget utilization (boost has no effect)
  • When over budget: retention threshold is clamped to minRetentionThreshold * boost

Examples:

  • minRetentionThreshold: 0.1 with boost: 1.0 (INFO) → minimum 10% retention when over budget
  • minRetentionThreshold: 0.1 with boost: 2.0 (ERROR) → minimum 20% retention when over budget
  • minRetentionThreshold: 0.1 with boost: 0.25 (TRACE) → minimum 2.5% retention when over budget

Important: Boost values < 1.0 reduce minimum retention for low-priority events. This prevents them from consuming budget when over limit, while still ensuring some events are retained.
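
Put another way, boost scales only the floor that the budget-derived threshold is clamped against. A short Python sketch of that clamping (names are illustrative):

def retention_floor(min_retention: float, boost: float,
                    budget_based_threshold: float) -> float:
    # The budget-derived threshold is never allowed below min_retention * boost.
    return max(budget_based_threshold, min_retention * boost)

retention_floor(0.1, 1.0, 0.0)   # 0.1   -> INFO: minimum 10% retention over budget
retention_floor(0.1, 2.0, 0.0)   # 0.2   -> ERROR: minimum 20%
retention_floor(0.1, 0.25, 0.0)  # 0.025 -> TRACE: minimum 2.5%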

Trade-offs:

  • 0.01 (1%): Very aggressive throttling, minimal retention when over budget. Use only if budget is critical.
  • 0.1 (10%): Balanced default. Ensures observability even during budget overruns while still enforcing cost control.
  • 0.25 (25%): Conservative. Prioritizes data retention over strict budget enforcement.

Validation: Must be greater than 0.01.

rateReceiverLevelBoost

Severity level boost mapping for retention rates.

Type Default
List []

Defines a map of severity levels to boost multipliers for minimum retention thresholds. Higher severity events can be given higher minimum retention rates through boost values.

The boost multiplier is applied only to rateReceiverMinRetentionThreshold, not to the entire retention threshold. This ensures critical events (ERROR, FATAL) have higher minimum retention floors when budget is exceeded, while preventing boost values < 1.0 from reducing retention when under budget.

How it works:

  • The receiver calculates a retention threshold based on budget utilization
  • The threshold is clamped to at least rateReceiverMinRetentionThreshold * boost
  • Boost only affects the minimum floor, not the calculated threshold
  • Higher boost values result in higher minimum retention for events of that severity when over budget
  • Lower boost values (< 1.0) reduce minimum retention for low-priority events (e.g., DEBUG, TRACE)

For example:

levelBoost:
  - TRACE=0.25
  - DEBUG=0.5
  - INFO=1
  - WARN=1.5
  - ERROR=2
  - FATAL=3

rateReceiverLookupFile

Declarative mute file keyed by field-set.

Type Default
String

Defines the path to a declarative mute file that caps specific log patterns by the joined field-set defined in rateReceiverFieldNames (e.g. symbolMessage, container, httpCode). The key format is the same one the local receiver uses for per-node counters.

When this option is set, the receiver switches from local per-node budget sampling to mute-file mode: it does nothing until an event's field-set matches an entry in the file, at which point that entry's sampleRate and untilEpochSec decide retention.

File format — one entry per line, keyed by the joined field-set value:

<fieldSet>=<sampleRate>:<untilEpochSec>[:<reason>]
  • fieldSet — the joined values of the fields named in rateReceiverFieldNames, separated by _. With rateReceiverFieldNames: [symbolMessage] the key is just symbolMessage (e.g. Error_syncing_pod). With [symbolMessage, container] the key is symbolMessage_container (e.g. heartbeat_debug_frontend).
  • sampleRate — probability in [0.0, 1.0] that a matching event is retained. 0.0 = full mute; 0.1 = keep 10%; 1.0 = no-op.
  • untilEpochSec — mute expires at this Unix epoch (seconds). Past that, the entry becomes a no-op until someone edits or removes it. Self-healing by design.
  • reason — optional free-text string for audit. Not used at runtime.

Example (with rateReceiverFieldNames: [symbolMessage]):

Error_syncing_pod=0.10:1744848000:pod error spam OPS-4821
heartbeat_debug=0.00:1744416000:k8s liveness 200s
jwt_validated=0.25:1744502400:auth flood after deploy
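
A hedged sketch of how such a file could be parsed into lookup entries, in Python. The helper is illustrative; the receiver's own loader may treat whitespace, comments, or malformed lines differently.

from typing import NamedTuple

class MuteEntry(NamedTuple):
    sample_rate: float  # probability in [0.0, 1.0] that a matching event is kept
    until_epoch: int    # Unix epoch (seconds) after which the entry is a no-op
    reason: str         # optional free text, audit only

def parse_mute_file(text: str) -> dict[str, MuteEntry]:
    entries: dict[str, MuteEntry] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        field_set, value = line.split("=", 1)
        rate, until, *reason = value.split(":", 2)
        entries[field_set] = MuteEntry(float(rate), int(until),
                                       reason[0] if reason else "")
    return entries

mutes = parse_mute_file("Error_syncing_pod=0.10:1744848000:pod error spam OPS-4821")
mutes["Error_syncing_pod"].sample_rate  # 0.10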

Why this shape:

  • Field-set keyed, not regex keyed → mute keys are the exact identifiers the Reporter attributes cost to (same rateReceiverFieldNames on both sides), so a "top spender" in the Reporter maps 1:1 to a mute entry.
  • Git-friendly: the file is a human-readable declaration that can live in a config repo, reviewed via PR, git blame-d for who/why.
  • Self-healing: every mute has an explicit expiry, so forgotten entries eventually stop filtering instead of silently dropping production data forever.
  • AI-editable: an operator can ask an assistant (e.g., Claude Code via the Log10x MCP) to "mute Error_syncing_pod for 24 hours at 10%" — the assistant reads the Reporter's cost attribution, appends the entry, and opens a PR. The mute file is the interface.

Severity floor still applies. Even a 0.0 mute will retain high-severity events at rateReceiverMinRetentionThreshold * boost (see rateReceiverLevelBoost). This prevents a poorly-scoped mute from silencing ERROR/FATAL traffic.

When rateReceiverLookupFile is unset, the receiver falls back to the local per-node budget strategy driven by rateReceiverBudgetPerHour, rateReceiverMaxSharePerFieldSet, and rateReceiverIngestionCostPerGB. Those options are ignored in mute-file mode.

rateReceiverLookupRetain

Period before the mute file is considered stale.

Type Default
Number 300000

Defines the staleness period for the mute file referenced by rateReceiverLookupFile.

If the file's last modified time is older than this period, the lookup is considered stale and the receiver falls back to local per-node counter rates until the file is refreshed.

Validation: Must be greater than 60000 milliseconds.
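
A small Python sketch of that staleness rule, assuming the period is checked against the file's modification time (illustrative only):

import os, time

def lookup_is_stale(path: str, retain_ms: int) -> bool:
    """True when the mute file is older than the retain period (or missing)."""
    try:
        age_ms = (time.time() - os.path.getmtime(path)) * 1000
    except OSError:
        return True
    return age_ms > retain_ms

if lookup_is_stale("mutes.csv", 300_000):
    pass  # fall back to local per-node counters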

rateReceiverBudgetPerHour

Target spending budget per hour in USD.

Type Default
Number 1.0

Defines the soft target budget per hour for total cost across all event types on this node. Actual spend may exceed this by 10-20% due to soft enforcement.

Hourly budgets align naturally with cloud infrastructure costs and work for both short-lived and long-lived nodes.

Examples:

  • $1.50/hour → reasonable for a production log forwarder (~$36/day, ~$1080/month if running 24/7)
  • $0.10/hour → conservative for dev/test environments
  • $0.02/hour → minimal for low-volume services.

rateReceiverMaxSharePerFieldSet

Maximum % of total spend any single field set can consume.

Type Default
Number 0.2

Defines the maximum share (0.0 to 1.0) of the total budget that any single unique field set can use. A "field set" is a unique combination of the field values specified in rateReceiverFieldNames.

Enforced independently of whether the total budget is exceeded—prevents noisy field sets from dominating even when under budget.

This option applies in per-node budget mode; in mute-file mode, sampling is driven entirely by the mute file entries (see rateReceiverLookupFile).

Example with rateReceiverFieldNames: [symbolMessage]:

  • 0.2 means no single event type (e.g., "heartbeat_debug") can use more than 20% of total spend
  • If "heartbeat_debug" is costing 35% of total, it gets throttled to 20% regardless of whether you're over budget

Example with rateReceiverFieldNames: [container] (Kubernetes: per-app total):

  • Each app (e.g., "frontend", "backend", "payment-service") is tracked separately
  • 0.2 means no single app can exceed 20% of total spend across ALL its event types
  • Aggregates across all pods and all event types for each app
  • If "frontend" app (all events, 5 pods) costs 30%, it gets throttled to 20%

Example with rateReceiverFieldNames: [symbolMessage, container] (Kubernetes: per event type per app):

  • Each unique combination (e.g., "error_login_frontend", "heartbeat_debug_backend") is tracked separately
  • 0.2 means no single (event type × app) can exceed 20% of total spend
  • If "frontend" app's "heartbeat_debug" events cost 30% across its 5 pods, they get throttled to 20%
  • Note: "frontend" could have multiple event types, each with their own 20% cap.

rateReceiverIngestionCostPerGB

Vendor ingestion cost per GB in USD.

Type Default
Number 1.5

Defines the cost per GB charged by your observability vendor for log ingestion. Used to calculate per-event costs (event byte size × cost per GB) for budget enforcement.

Note: ingestionCostPerGB only affects per-node budget mode. In mute-file mode, sampling is declarative and does not use cost calculations.

Common vendor pricing (2025):

  • Splunk Cloud: ~$1.50/GB (varies by contract, SKU)
  • Datadog Logs: ~$0.10-$0.25/GB (depends on tier: standard, flex, online archives)
  • Elastic Cloud: ~$0.109/GB (standard logging tier)
  • New Relic: ~$0.30/GB (Data Plus)
  • Sumo Logic: ~$1.50/GB (depends on plan)
  • AWS CloudWatch Logs: ~$0.50/GB ingestion + $0.03/GB storage

Example: A 10KB error log at $1.50/GB costs ~$0.000015. Over 1 million such events per hour, that's $15/hour ($360/day). The receiver tracks this spend per field set and enforces your rateReceiverBudgetPerHour by probabilistically sampling events when over budget.
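
The arithmetic behind that example, as a quick sanity check in Python (decimal GB assumed):

cost_per_gb = 1.50
event_bytes = 10_000       # ~10KB error log
events_per_hour = 1_000_000

per_event = event_bytes / 1e9 * cost_per_gb  # $0.000015
per_hour = per_event * events_per_hour       # $15.00 per hour
per_day = per_hour * 24                      # $360 per day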


This module is defined in rate/module.yaml.