KV Store
Validating the KV Store, diagnosing "Consume KV" silent failures, monitoring capacity, recovering from template/event ordering issues, and distributed-cluster setup.
How do I validate that KV Store is working correctly?
Quick Health Check (run all three):

1. Verify the KV collection exists:

   ```
   | rest /servicesNS/nobody/tenx-for-splunk/storage/collections/config
   | search title="tenx_dml"
   ```

   Expected: Returns 1 result. If 0 results, the collection wasn't created.

2. Check KV store population:

   ```
   | inputlookup tenx-dml-lookup | stats count
   ```

   Expected: Shows N (the number of templates). If 0, no templates have loaded yet.

3. Verify the "Consume KV" scheduled search is running:

   ```
   | index=_internal savedsearch_name="Consume KV"
   | stats latest(status) as status, latest(_time) as last_run by savedsearch_name
   ```

   Expected: status=success, last_run within the last 2 minutes.

If any check fails, see troubleshooting below.
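If you run these checks from an external script, the pass/fail logic can be centralized. A minimal Python sketch, assuming you have already fetched each check's result; the `KvHealth` fields and `diagnose` helper are illustrative names, not part of the app (only the 2-minute threshold comes from the checks above):

```python
from dataclasses import dataclass

@dataclass
class KvHealth:
    collection_exists: bool   # check 1: REST query returned the tenx_dml collection
    template_count: int       # check 2: result of | inputlookup tenx-dml-lookup | stats count
    seconds_since_run: float  # check 3: age of the latest "Consume KV" run
    last_status: str          # check 3: latest status field from _internal

def diagnose(h: KvHealth) -> list[str]:
    """Return the list of failed checks; an empty list means healthy."""
    problems = []
    if not h.collection_exists:
        problems.append("KV collection missing: it was never created")
    if h.template_count == 0:
        problems.append("KV store empty: no templates loaded yet")
    if h.last_status != "success" or h.seconds_since_run > 120:
        problems.append('"Consume KV" has not run successfully in the last 2 minutes')
    return problems
```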
"Consume KV" scheduled search is failing silently
The "Consume KV" search populates templates from tenx_dml index into the KV Store. If it fails, templates won't be available for expansion.
Diagnostic procedure:
Step 1: Check scheduler logs
```
| index=_internal sourcetype=scheduler savedsearch_name="Consume KV"
| table _time, status, result_count, alert_action
| stats latest(*) as * by status
```
Common failure modes:

| Status | Cause | Fix |
|---|---|---|
| error | Search syntax error in saved search | Edit saved search "Consume KV" and verify query syntax |
| success / count=0 | No templates in tenx_dml index | Run: `\| index=tenx_dml \| stats count`; if 0, send templates via HEC |
| failure | Alert action (tenx_dml_to_kv.py) failed | Check: `\| index=_internal sourcetype=action_handler savedsearch_name="Consume KV"` |
| No results | Search never ran | Verify the scheduler is enabled (Settings > Scheduled Searches) |
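For automated triage, the table above can be encoded as a lookup from scheduler status to fix. A hypothetical Python sketch; the `FIXES` keys and `recommended_fix` helper are illustrative names, not part of the app:

```python
from typing import Optional

# Fix column of the failure-mode table, keyed by scheduler status
FIXES = {
    "error": 'Edit saved search "Consume KV" and verify query syntax',
    "success/count=0": "Send templates via HEC (tenx_dml index is empty)",
    "failure": "Check action_handler logs for tenx_dml_to_kv.py",
    "no_results": "Enable the scheduler (Settings > Scheduled Searches)",
}

def recommended_fix(status: str, result_count: Optional[int] = None) -> str:
    """Map the latest scheduler status (and result count) to a fix."""
    if status == "success":
        return FIXES["success/count=0"] if result_count == 0 else "Healthy; no fix needed"
    return FIXES.get(status, FIXES["no_results"])
```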
Recovery steps:

1. Verify templates exist:

   ```
   | index=tenx_dml sourcetype=tenx_dml_raw_json | stats count
   ```

2. Force immediate execution: click saved search "Consume KV" > Run (or use `| savedsearch "Consume KV"`).

3. Wait 2 minutes and verify population:

   ```
   | inputlookup tenx-dml-lookup | stats count
   ```

   Should show > 0.

4. If still 0, check that the KV collection exists:

   ```
   | rest /servicesNS/nobody/tenx-for-splunk/storage/collections/config
   ```
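Step 3's wait-and-verify can be automated rather than eyeballed. A sketch assuming `get_template_count` is your own wrapper around `| inputlookup tenx-dml-lookup | stats count`; all names here are illustrative:

```python
import time
from typing import Callable

def wait_for_population(get_template_count: Callable[[], int],
                        timeout_sec: float = 120.0,
                        poll_sec: float = 10.0) -> int:
    """Poll the KV store until templates appear; raise after timeout_sec."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        count = get_template_count()
        if count > 0:
            return count
        time.sleep(poll_sec)
    raise TimeoutError("KV store still empty; check that the collection exists")
```

The 120-second default mirrors the "wait 2 minutes" guidance above.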
How do I monitor KV Store size and capacity?
KV Store size affects search performance. Monitor it proactively:
Monthly capacity check:

```
| inputlookup tenx-dml-lookup
| stats count as num_templates, max(timestamp_format) as latest_update
```
Recommended capacity limits:
| Template Count | Action | Performance |
|---|---|---|
| < 100K | No action needed | Excellent (< 5ms lookup) |
| 100K-500K | Monitor monthly | Good (5-20ms lookup) |
| 500K-1M | Plan optimization | Fair (20-50ms lookup) |
| > 1M | Contact engineering | Needs partitioning |
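The capacity table maps directly to a threshold check. A small illustrative Python helper; the exact boundary handling at 100K/500K/1M is an assumption, since the table leaves it unspecified:

```python
def capacity_action(template_count: int) -> str:
    """Recommended action for a given template count, per the capacity table."""
    if template_count < 100_000:
        return "No action needed"     # excellent: < 5ms lookup
    if template_count <= 500_000:
        return "Monitor monthly"      # good: 5-20ms lookup
    if template_count <= 1_000_000:
        return "Plan optimization"    # fair: 20-50ms lookup
    return "Contact engineering"      # needs partitioning
```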
If approaching 1M templates:
- Option 1: Archive old templates (move them to a secondary collection)
- Option 2: Partition templates across multiple collections
Monitor expansion latency:
```
index=<your-index> sourcetype=tenx_encoded
| `tenx-inflate`
| stats avg(eval(round(now() - _time, 3))) as inflate_latency_sec
```
If latency > 1 second, KV Store may be oversized.
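The latency search averages now() minus each event's _time. The same computation in Python, useful for post-processing exported results; the function name and inputs are illustrative:

```python
import time
from typing import Optional

def avg_inflate_latency(event_times: list[float],
                        now: Optional[float] = None) -> float:
    """Average age in seconds of expanded events (now - _time), rounded
    to 3 decimals, mirroring the stats/eval expression above."""
    if now is None:
        now = time.time()
    return round(sum(now - t for t in event_times) / len(event_times), 3)
```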
What if I accidentally send encoded events before templates are loaded?
If encoded events arrive before templates, expansion will fail silently until templates load.
Prevention:

Always verify template population BEFORE sending encoded events:

```
| inputlookup tenx-dml-lookup | stats count
```

The count must be > 0 before any encoded events are sent.

Recovery (if it already happened):

1. Load the missing templates: re-send the template data via HEC (same format as before), then wait 2-3 minutes for "Consume KV" to process it.
2. Re-index the encoded events (optional).
3. Verify recovery, e.g. by confirming encoded events now expand:

   ```
   index=<your-index> sourcetype=tenx_encoded | `tenx-inflate`
   ```
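The prevention rule can be enforced at the sending side. A hypothetical gate; `get_template_count` and `send` are placeholders for your own HEC client code:

```python
from typing import Callable

def send_encoded_events(events: list[dict],
                        get_template_count: Callable[[], int],
                        send: Callable[[list[dict]], None]) -> None:
    """Refuse to forward encoded events while the KV store holds no
    templates, closing the silent-failure window described above."""
    if get_template_count() == 0:
        raise RuntimeError("KV store empty: load templates via HEC and wait "
                           "for 'Consume KV' before sending encoded events")
    send(events)
```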
Distributed KV Store setup for multi-node Splunk clusters
For production Splunk clusters, KV Store can be:
- Replicated (HA across nodes)
- Partitioned (scaled across multiple collections)
For 3-node Splunk cluster:
KV Store automatically replicates to all nodes (no special config). To verify, run on each node:

```
| rest /servicesNS/nobody/tenx-for-splunk/storage/collections/config
| search title="tenx_dml"
| table label, eai:acl.perms.*
```
All three nodes should return the same collection.
Performance optimization for distributed setup, in the app's `local/collections.conf` (or via REST):

```
[tenx_dml]
field.pattern_hash = string
field.pattern = string
# Accelerate pattern_hash for faster lookups
accelerated_fields.pattern_hash_accel = {"pattern_hash": 1}
```

This creates an index on pattern_hash (faster joins in the expansion macro). Note that `accelerated_fields.<name>` takes a JSON index specification, not a bare field name.
For very large clusters (10+ nodes):
Consider dedicating resources to KV Store. Its settings live in the `[kvstore]` stanza of `$SPLUNK_HOME/etc/system/local/server.conf`; consult Splunk's distributed KV Store documentation for the supported deployment topologies.
Monitoring cluster KV Store health:

```
| rest /servicesNS/nobody/tenx-for-splunk/storage/collections/data/tenx_dml
| stats count as templates_primary
| append
    [| rest /servicesNS/nobody/tenx-for-splunk/storage/collections/data/tenx_dml
     | stats count as templates_replica]
```

Both counts should be equal (healthy replication).
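When collecting the per-node counts programmatically, the health condition is a one-liner. An illustrative Python check; the node names and dict shape are assumptions:

```python
def replication_healthy(counts_by_node: dict[str, int]) -> bool:
    """Replication is healthy when every node reports the same template count."""
    return len(set(counts_by_node.values())) == 1
```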