Data Protection
Where log processing happens, what data leaves your network, symbol libraries, AI, and how to validate that security logs are not filtered.
Where does log processing happen
All processing happens in your infrastructure:
- Reporter: a DaemonSet alongside your forwarder. Not in the critical log path.
- Receiver: a sidecar to your forwarder (Filter or Compact mode).
- Retriever: deploys in your AWS/cloud account, not ours.
- MCP server: where your agent (Claude or your own LLM) runs to inspect patterns and propose config. The agent proposes; the Receiver enforces what you approve.
You control where processed events go via output configuration (files, forwarders, metric destinations). Log10x never receives log content.
What data does Log10x actually see
Zero log content, always. By default, per-pattern metrics go to your own time-series database and Log10x sees nothing. If you opt into the hosted metrics backend, only aggregated metrics leave your network: event counts and byte volumes grouped by enrichment fields (a recurring log-pattern name derived from symbol tokens in your code, severity, K8s container/namespace, HTTP status). Log messages, PII, and raw events are never sent.
What log data leaves my environment
None. Log content never leaves your infrastructure. The only thing that optionally leaves is aggregated metrics (event counts, byte volumes), and only if you opt into the hosted backend. AI analysis is optional too; the answer under "Is AI optional? What data does it send" covers exactly what it sends.
What specific metrics leave my network
When you opt into the hosted metrics backend, 10x sends aggregated metrics to prometheus.log10x.com over TLS 1.3. The exact fields:
| Label | Example Value | Contains PII? |
|---|---|---|
| `tenx_env` | production | No |
| `tenx_app` | order-service | No |
| `tenx_host_name` | edge-node-1 | No |
| `tenx_pipeline_uuid` | a1b2c3d4-... | No |
| `severity_level` | ERROR | No |
| `message_pattern` | Failed to connect to {} | No |
| `k8s_namespace` | payments | No |
| `k8s_container` | api-gateway | No |
| `http_code` | 503 | No |
| `index_app` | main | No |
message_pattern is a template name derived from log statement structure, with placeholders replacing all variable data. It contains no log content, no request data, and no PII.
Metric names (the values):
| Metric | Type | What It Measures |
|---|---|---|
| `tenx_pipeline_up` | Gauge | Pipeline running (1 = up) |
| `tenx_pipeline_bootstrap_time_seconds` | Gauge | Startup time |
| `tenx_pipeline_runtime_seconds` | Gauge | Total runtime |
| `all_events_summaryVolume_total` | Counter | Input events (before reduction) |
| `all_events_summaryBytes_total` | Counter | Input bytes (before reduction) |
| `emitted_events_summaryBytes_total` | Counter | Output bytes (after processing) |
| `emitted_events_optimized_size_total` | Counter | Compact output bytes |
All values are numeric counters and gauges: event counts, byte volumes, durations. No log content appears in any metric value.
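One way to check the label list for yourself is to capture a sample of the metrics your pipeline emits and flag any label name that is not in the label table above. The following is a minimal sketch, assuming the sample is available in Prometheus text exposition format. The function name `undocumented_labels` and the `user_email` label in the usage example are hypothetical, and the allowed set is simply copied from the table, so extend it if your deployment legitimately emits other labels.

```python
import re

# Label names copied from the label table above (assumption: this is the
# complete set you expect; extend it for any other label you approve).
DOCUMENTED_LABELS = {
    "tenx_env", "tenx_app", "tenx_host_name", "tenx_pipeline_uuid",
    "severity_level", "message_pattern", "k8s_namespace", "k8s_container",
    "http_code", "index_app",
}

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def undocumented_labels(exposition_text):
    """Return {metric_name: set of label names missing from DOCUMENTED_LABELS}."""
    findings = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        names = {key for key, _ in LABEL_RE.findall(match.group("labels") or "")}
        extra = names - DOCUMENTED_LABELS
        if extra:
            findings.setdefault(match.group("name"), set()).update(extra)
    return findings
```

For example, a sample containing `tenx_pipeline_up{tenx_app="order-service",user_email="a@b.c"} 1` would be reported as `{"tenx_pipeline_up": {"user_email"}}`, which is the cue to investigate before enabling the hosted backend.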
Billing telemetry: Engines also send lightweight heartbeats (tenx_pipeline_up, tenx_pipeline_info) containing node ID and pipeline name for license tracking. No log content, no PII. Air-gapped deployments use a local License Receiver instead.
Default (your own TSDB): Nothing is sent to Log10x. All metrics go to your own time-series database; the hosted backend above is opt-in.
Sensitivity note: Metric labels include infrastructure metadata such as application names, Kubernetes namespace names, and log pattern templates. Organizations that classify infrastructure topology as sensitive should deploy self-managed; in that mode no data reaches Log10x systems.
What are symbol libraries and do they contain my code
Symbol libraries contain 64-bit hashes of string constants extracted from your log statements, plus class and method names to identify the source of each log statement. They contain no source code, no log data, and no telemetry. Compilation happens in your CI/CD pipeline, so we never see your repositories, code, or symbol libraries. See the Compiler FAQ for full details.
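To make "64-bit hashes of string constants" concrete, here is a minimal sketch of what hashing a log statement's constants could look like. The hash function Log10x actually uses is not stated here; FNV-1a is substituted purely for illustration, and `symbol_entries` is a hypothetical helper, not part of the compiler.

```python
# FNV-1a 64-bit parameters (standard published constants).
FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(text):
    """Hash a string to an unsigned 64-bit integer with FNV-1a (illustrative choice)."""
    h = FNV64_OFFSET
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64  # keep the result within 64 bits
    return h


def symbol_entries(constants):
    """Map each string constant of a log statement to its 64-bit hash."""
    return [fnv1a_64(constant) for constant in constants]


# The constant part of a hypothetical statement: log.error("Failed to connect to {}", host)
entries = symbol_entries(["Failed to connect to "])
```

Each entry is a fixed-width number rather than the text it was derived from, which is the sense in which a library built this way carries no source code.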
Is AI optional? What data does it send
Fully optional, and you choose how. AI analysis runs two ways:
- Bring Your Own Key: point the agent at your own model (OpenAI, Anthropic, xAI, Azure OpenAI, or self-hosted Ollama)
- Disabled: no data is sent to any AI provider
When AI is enabled, you supply the model and the key; Log10x does not provide an AI key. Only aggregated metrics (event-type names, volume, cost) are ever sent to the model, never raw log content. Self-managed deployments preconfigure no AI; you enable it if and how you want. Disabling AI does not affect optimization.
How do I validate that critical security logs aren't being filtered
The agent proposes which patterns to filter; you review the proposal (a pull request in your own repo) before it takes effect, and the Receiver enforces only what you approve. Several layers let you confirm your security logs are preserved:
- Shadow mode first: Deploy the Reporter as a read-only DaemonSet. It tails the live event stream pre-SIEM without modifying, filtering, or redirecting any data. Compare what would be reduced vs actual security events before the Receiver enforces anything in-path.
- Allowlist approach: Explicitly preserve all logs from security indexes. Allowlist sourcetypes like `firewall`, `ids`, `authentication`.
- Metrics tracking: Dropped event counts are recorded in aggregated metrics: query `all_events{routeState="drop"}` to see exactly what was dropped and confirm nothing unexpected was filtered.
- Recoverable on demand: Patterns the engine offloads land in your own S3, where the Retriever returns the exact offloaded events on demand if a security log is ever needed.
- Compliance reporting: A daily summary reports how many security-source events were dropped (target: none), so you can confirm that nothing from those sources is being filtered. Start with no filtering on security sources, then expand gradually after 30-day validation.
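The metrics-tracking check above can be automated against your own time-series database. The following is a minimal sketch, assuming a Prometheus-compatible HTTP API and assuming the `all_events` series carries a `k8s_namespace` label. The base URL, the function names, and the namespace names in the usage example are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical address of your own Prometheus-compatible TSDB (assumption).
PROMETHEUS_URL = "http://prometheus.example.internal:9090"


def drop_query_url(base_url=PROMETHEUS_URL):
    """Build an instant-query URL that sums dropped events per namespace."""
    promql = 'sum by (k8s_namespace) (all_events{routeState="drop"})'
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def dropped_in_protected(response, protected_namespaces):
    """Return {namespace: dropped_count} for protected namespaces with any drops.

    `response` is the decoded JSON body of a Prometheus instant query.
    """
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    hits = {}
    for series in response["data"]["result"]:
        namespace = series["metric"].get("k8s_namespace", "")
        count = float(series["value"][1])  # value is [timestamp, "number"]
        if namespace in protected_namespaces and count > 0:
            hits[namespace] = count
    return hits
```

Fetch `drop_query_url()` with your usual HTTP client, decode the JSON, and pass it to `dropped_in_protected(response, {"security", "audit"})`. An empty result means no events were dropped from the namespaces you listed; anything else names the namespace and the count to investigate.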