Optimization

How the 10x Engine affects hot/warm/cold tiering, data-node count reductions, handling high-cardinality fields, and search-time overhead from the L1ES plugin.

Does the 10x Engine help with hot/warm/cold tier optimization?

Yes. The 10x Engine reduces storage requirements in the hot tier, and the tiers that age out of it shrink in proportion. On self-hosted Elasticsearch or OpenSearch, Compact mode shrinks the indexed footprint losslessly in place. On managed Elasticsearch (Elastic Cloud, OpenSearch Service), Compact mode is a no-op at the destination, so the levers there are tier_down, offload, and sample/drop instead.

Without the 10x Engine (100TB/day):

  • Hot tier: 100TB x 7 days = 700TB SSD storage
  • Warm tier: 700TB x 4 weeks = 2.8PB storage

With Compact mode on self-hosted ES/OpenSearch (modeled ~50% reduction of indexed volume):

  • Hot tier: 50TB x 7 days = 350TB SSD storage
  • Warm tier: 350TB x 4 weeks = 1.4PB storage
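The arithmetic behind both scenarios can be sketched in a few lines of Python. This is a minimal model rather than anything shipped with the product: the function name tier_storage_tb and its parameters are hypothetical, and it assumes the warm tier holds four weeks of data at the same weekly rate as the hot tier, as in the figures above.

```python
def tier_storage_tb(daily_tb, reduction=0.0, hot_days=7, warm_weeks=4):
    """Return (hot_tb, warm_tb) for a daily ingest volume given in TB.

    reduction is the modeled fraction of indexed volume removed
    (0.5 for the ~50% Compact mode example); an estimate, not a guarantee.
    """
    indexed_daily_tb = daily_tb * (1.0 - reduction)
    hot_tb = indexed_daily_tb * hot_days   # one week of indexed data on SSD
    warm_tb = hot_tb * warm_weeks          # four weeks at the same weekly rate
    return hot_tb, warm_tb


baseline = tier_storage_tb(100)                  # 700TB hot, 2800TB (2.8PB) warm
compacted = tier_storage_tb(100, reduction=0.5)  # 350TB hot, 1400TB (1.4PB) warm
```

Substituting your own daily volume and a reduction figure measured on your data gives a first-order estimate only; it ignores replicas and index overhead.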

Adding Retriever (offload) pushes the combined reduction higher still, keeping only the hot working set indexed. Offloaded data lands in your own S3 and stays queryable on demand from there; the SIEM does not re-ingest it. Compact and offload are non-destructive, so the data stays reachable; sampling and drop are lossy choices you opt into.

Bonus: Offloaded events land in your own S3 at $0.023/GB-month and the Retriever returns the exact offloaded events on demand, with no rehydration and no SIEM re-ingest.
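As a rough illustration of what that rate means for a given offload volume, the sketch below multiplies it out. The helper name s3_monthly_cost_usd is hypothetical, the conversion assumes decimal terabytes (1TB = 1000GB), and the result covers storage only, not request or retrieval charges.

```python
S3_USD_PER_GB_MONTH = 0.023  # storage rate quoted above
GB_PER_TB = 1000             # assumption: decimal terabytes


def s3_monthly_cost_usd(offloaded_tb):
    """Monthly S3 storage cost for offloaded data (storage only)."""
    return offloaded_tb * GB_PER_TB * S3_USD_PER_GB_MONTH


cost = s3_monthly_cost_usd(300)  # roughly 6,900 USD per month for 300TB
```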

How does volume reduction translate to fewer data nodes?

For most self-managed clusters, data node count is driven by storage capacity, so the relationship is roughly linear: 50% less indexed volume ≈ 50% fewer data nodes. Each decommissioned node removes both its EBS volume and its EC2 instance. Compact mode shrinks data in place only on self-hosted ES/OpenSearch; the percentages below are modeled estimates, not guarantees.

Example, 30 data nodes ingesting 10TB/day, 7-day hot retention, 1 replica:

Metric                         Before   Receiver (Compact mode)   + Retriever
                                        (modeled 50%)             (modeled 80%)
Daily indexed volume           10TB     5TB                       2TB
Hot storage (7d × 1 replica)   140TB    70TB                      28TB
Data nodes                     30       15                        6
S3 offload (your bucket)       -        -                         300TB

Because each decommissioned data node retires an EC2 instance together with its EBS volume, the node count tracks indexed volume directly. Offloaded events are served from your own S3 bucket by the Retriever on demand rather than re-ingested into the cluster. Master nodes (typically 3), coordinating nodes, and Kibana instances stay the same regardless of data volume.

The numbers above assume r6g.2xlarge data nodes with 5TB gp3 EBS each. Your instance types and storage sizes will differ, but the scaling relationship holds: half the indexed volume → half the hot-tier storage → half the data nodes.
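The scaling relationship can be reproduced with a short Python sketch. It is a proportional model only, not a capacity planner: the function name scaled_cluster and its parameters are hypothetical, hot storage is computed as daily volume × retention days × (primary + replica copies), and node count is assumed to follow indexed volume linearly, rounded up.

```python
import math


def scaled_cluster(daily_tb, data_nodes, reduction, hot_days=7, replicas=1):
    """Return (indexed_daily_tb, hot_tb, nodes) after a modeled volume reduction."""
    indexed_daily_tb = round(daily_tb * (1.0 - reduction), 3)
    # Each replica adds one more full copy of the primary data.
    hot_tb = round(indexed_daily_tb * hot_days * (1 + replicas), 3)
    # Linear scaling assumption: node count follows indexed volume.
    # Rounding first keeps float noise from pushing ceil() up by one.
    nodes = math.ceil(round(data_nodes * (1.0 - reduction), 6))
    return indexed_daily_tb, hot_tb, nodes


scenarios = {
    "before": scaled_cluster(10, 30, reduction=0.0),
    "compact": scaled_cluster(10, 30, reduction=0.5),
    "compact + retriever": scaled_cluster(10, 30, reduction=0.8),
}
for label, (daily, hot, nodes) in scenarios.items():
    print(f"{label}: {daily}TB/day indexed, {hot}TB hot storage, {nodes} data nodes")
```

The reduction values are the modeled 50% and 80% from the example; replace them with figures measured on your own data before drawing sizing conclusions.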

Use the ROI Calculator with your actual cluster size for a personalized estimate.

How does the 10x Engine handle high-cardinality fields that cause mapping explosions?

The 10x Engine separates the stable structure of each log event from the variable data inside it (timestamps, IDs, UUIDs) that drives mapping explosions. Fewer distinct field values reach Elasticsearch, so the mappings stay small.

Each event is matched to its stable pattern, the part of the message that stays constant across instances, such as Error_syncing_pod or Receive_ListRecommendations_for_product_ids. That pattern is the unit of tracking, not the raw field values.
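To make the idea of a stable pattern concrete, here is a deliberately naive Python sketch. It is not the 10x Engine's algorithm: the function pattern_of is hypothetical, and it relies on the crude assumption that any token containing a digit (timestamps, IDs, UUIDs, counters) is variable while purely alphabetic tokens are stable.

```python
import re

# Tokens are runs of letters, digits, and hyphens; other punctuation separates them.
_TOKEN = re.compile(r"[A-Za-z0-9-]+")


def pattern_of(message):
    """Join the tokens that look stable (letters, no digits) with underscores."""
    stable = [
        token
        for token in _TOKEN.findall(message)
        if any(ch.isalpha() for ch in token) and not any(ch.isdigit() for ch in token)
    ]
    return "_".join(stable)


a = pattern_of("Error syncing pod 3f2a9c1e-77b0-4c1d-9a51-0e6f2d3b8a10")
b = pattern_of("2024-05-01T12:00:00Z Error syncing pod 8841")
# Both messages collapse to the same pattern despite different IDs and timestamps.
```

A digit heuristic misclassifies stable tokens such as "http2" and variable tokens such as hostnames without digits, which is why it should be read only as an illustration of the grouping step.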

Once cost is tracked per pattern, budget-based sampling can target the noisiest patterns before they reach Elasticsearch. Sampling is a lossy choice you opt into; it cuts ingestion volume and the count of distinct field values your mappings handle, but it does not keep every line.
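One way to picture budget-based sampling is the Python sketch below. It is an assumption-laden illustration, not the 10x Engine's actual policy: the class PatternBudget, its byte budget, and the deterministic 1-in-N rule applied once a pattern exceeds its budget are all hypothetical.

```python
from collections import defaultdict


class PatternBudget:
    """Hypothetical per-pattern byte budget with 1-in-N sampling once exceeded."""

    def __init__(self, budget_bytes, keep_every=10):
        self.budget_bytes = budget_bytes
        self.keep_every = keep_every
        self.bytes_seen = defaultdict(int)         # cumulative bytes per pattern
        self.over_budget_count = defaultdict(int)  # events seen after the budget ran out

    def admit(self, pattern, event):
        """Return True if the event should be forwarded to Elasticsearch."""
        self.bytes_seen[pattern] += len(event.encode("utf-8"))
        if self.bytes_seen[pattern] <= self.budget_bytes:
            return True
        # Over budget: keep every Nth event for this pattern, drop the rest (lossy).
        self.over_budget_count[pattern] += 1
        return self.over_budget_count[pattern] % self.keep_every == 0


budget = PatternBudget(budget_bytes=100, keep_every=5)
events = [f"Error syncing pod {i}" for i in range(30)]
kept = [e for e in events if budget.admit("Error_syncing_pod", e)]
```

Only the noisy pattern is thinned; a pattern that stays under its budget is forwarded in full. A production policy would also reset budgets per time window, which this sketch omits.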

What is the search-time overhead in Elasticsearch?

The L1ES Lucene plugin expands compact events transparently during the fetch phase, so search returns the full original events. The expansion overhead is modest and offset by the smaller index: fewer data nodes, less SSD, lower compute. For managed Elasticsearch (Elastic Cloud, OpenSearch Service), Retriever returns the exact offloaded events on demand.

Because expansion runs in the fetch phase, its cost scales with the number of documents a query returns rather than with index size. For resource requirements, scaling tables, and architecture details, see the Performance FAQ.