POC submit

Once credentials are in the environment, run a full cost POC against your log analyzer to get numbers tied to your actual billed GB.

Returns a snapshot_id that you poll via POC status to retrieve the 9-section report, whose sections include:

cost drivers,
Receiver recommendations,
native log analyzer exclusion configs,
volume-reduction potential,
risk / dependency checks,
deployment paths.

Events stay on the machine by default.

Example

You

cost POC on our production Datadog, 7d window

Log10x

Pulling ~250K-event sample from Datadog. Returned snapshot_id=snap_abc123. ETA 2–3 min.

Auto-detected daily volume: 1.2 TB/day · auto-detected $/GB: $2.50 · pull window: 7d.

Phases: pulling (1–3 min) → analyzing (3–8 min) → rendering (<5s) → complete. Poll POC status every ~30s; partial_patterns_found stops growing when render is close.

Report will land at /tmp/log10x-reports/poc_from_siem-snap_abc123.md.

More to ask

"POC against /aws/ecs/checkout-svc CloudWatch log group, level=ERROR"
"cost POC on Splunk index main, 30d"
"Elasticsearch POC, no auto-volume probe, override 500GB/day"

Prerequisites

No Log10x API key is required. The analysis runs on the local 10x engine, so events never leave the machine. The engine runs through either:

the tenx CLI (install for macOS/Linux/Windows), or
a local Docker container (LOG10X_TENX_MODE=docker, auto-detected and preferred when the mode is unset).

Schema and samples

Input example

Real call against the demo env (captured by scripts/capture-tool-envelopes.mjs).

{
  "siem": "cloudwatch",
  "window": "7d",
  "target_event_count": 250000,
  "max_pull_minutes": 5,
  "ai_prettify": true
}
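
For readers wiring this up by hand, the arguments above would typically travel inside an MCP tools/call request. The following is a minimal sketch, assuming the tool name log10x_poc_from_siem_submit that appears in the output envelope on this page; the buildSubmitRequest helper and the SubmitArgs type are illustrative names, not part of the Log10x codebase.

```typescript
// Subset of the submit arguments used in the example above;
// the input schema lists the full set.
interface SubmitArgs {
  siem?: string;
  window?: string;
  target_event_count?: number;
  max_pull_minutes?: number;
  ai_prettify?: boolean;
}

// JSON-RPC 2.0 request shape that MCP uses for tool invocations.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: SubmitArgs };
}

// Hypothetical helper: wraps the arguments in a tools/call request.
function buildSubmitRequest(id: number, args: SubmitArgs): ToolsCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "log10x_poc_from_siem_submit", arguments: args },
  };
}

const request = buildSubmitRequest(1, {
  siem: "cloudwatch",
  window: "7d",
  target_event_count: 250000,
  max_pull_minutes: 5,
  ai_prettify: true,
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP host builds this request for you; the sketch only shows where the example arguments end up.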

Input schema

Agent-facing JSON Schema (the canonical shape the MCP server publishes via tools/list):

{
  "type": "object",
  "properties": {
    "siem": {
      "type": "string",
      "enum": [
        "cloudwatch",
        "datadog",
        "sumo",
        "gcp-logging",
        "elasticsearch",
        "azure-monitor",
        "splunk",
        "clickhouse"
      ],
      "description": "Which SIEM to pull from. Omit to auto-detect from ambient credentials. Valid values: cloudwatch, datadog, sumo, gcp-logging, elasticsearch, azure-monitor, splunk, clickhouse."
    },
    "window": {
      "type": "string",
      "default": "14d",
      "description": "Window to pull over. Accepts \"1h\", \"24h\", \"7d\", \"14d\", \"30d\". Default \"14d\". Wide windows unlock the differentiated longitudinal signals (first-seen, growth, stable-vs-new) that the agent cannot compute from a small sample. Pull pacing is automatic; long windows take minutes but the snapshot continues in the background."
    },
    "scope": {
      "type": "string",
      "description": "SIEM-specific resource scope. CloudWatch: log group name or wildcard (`/aws/ecs/*`). Datadog: index name. Sumo: `_sourceCategory`. GCP: project id. Elasticsearch: index pattern. Azure Monitor: workspace id. Splunk: index name. ClickHouse: database name."
    },
    "query": {
      "type": "string",
      "description": "SIEM-native filter expression layered on top of `scope`. Syntax per SIEM (CloudWatch filter pattern; Datadog query; KQL for ES/Azure; SPL for Splunk; SQL WHERE for ClickHouse; Sumo query)."
    },
    "target_event_count": {
      "type": "number",
      "minimum": 1000,
      "maximum": 5000000,
      "default": 1000000,
      "description": "Target event count for the pull. Default 1,000,000 (~500 MB at 500B avg, tokenizes in 5-10 min). The pull self-terminates earlier on saturation: when new patterns per 100k events drops below 2%, the long tail has been covered and the report is generated. This default is intentionally two orders of magnitude beyond what an unaided agent can fit in context."
    },
    "max_pull_minutes": {
      "type": "number",
      "minimum": 1,
      "maximum": 60,
      "default": 30,
      "description": "Hard cap on pull wall-time. Default 30. The pull stops at whichever of target_event_count, max_pull_minutes, or saturation-detected hits first. Long pulls run in the background; poll status while the user does other things."
    },
    "analyzer_cost_per_gb": {
      "type": "number",
      "exclusiveMinimum": 0,
      "description": "Override the $/GB rate for cost calculations. Default is read from vendors.json per detected SIEM."
    },
    "total_daily_gb": {
      "type": "number",
      "exclusiveMinimum": 0,
      "description": "Customer's total daily log volume in GB/day. Pick any one of total_daily_gb / total_monthly_gb / total_annual_gb, whichever unit the user naturally thinks in. The tool normalizes to daily internally. When any is provided (or auto_detect_volume succeeds), per-pattern costs are extrapolated from the pulled sample to the full volume, producing meaningful annual-savings figures instead of sub-cent numbers. Priority: daily > monthly > annual. If the pull was narrowed via `query` to one service, this overstates cost; only a fraction of daily volume matches the filter."
    },
    "total_monthly_gb": {
      "type": "number",
      "exclusiveMinimum": 0,
      "description": "Customer's total monthly log volume in GB/month. See total_daily_gb for semantics."
    },
    "total_annual_gb": {
      "type": "number",
      "exclusiveMinimum": 0,
      "description": "Customer's total annual log volume in GB/year. See total_daily_gb for semantics."
    },
    "auto_detect_volume": {
      "type": "boolean",
      "default": true,
      "description": "Default true: when no total_*_gb arg is provided, probe the SIEM's usage/metrics API to auto-detect daily ingest volume. Per-SIEM best-effort: CloudWatch (describeLogGroups ÷ retention), Datadog (Usage API), Elasticsearch (_stats), Azure (Usage KQL table), GCP (Cloud Monitoring byte_count), ClickHouse (system.parts), Splunk (license API), Sumo (Account Usage API). Fails silently and falls back to scenario brackets if the current creds lack the required scope. Set false to skip the probe and go straight to manual args or scenarios."
    },
    "ai_prettify": {
      "type": "boolean",
      "default": true,
      "description": "Default true: use MCP sampling to ask the host LLM (the same model the user is already chatting with, e.g. Claude Desktop, Claude Code, Cursor) to batch-generate 3-5-word human-readable names for the top patterns. No Log10x-side endpoint, no extra API key; the host uses whatever model + credentials the user already has. Sends only templated pattern identities (no variable values, no raw log content). Skipped automatically when the host does not advertise the `sampling` capability; the report falls back to raw snake_case identities plus a note. Set false to skip unconditionally."
    },
    "enrich_with_host_agent": {
      "type": "boolean",
      "default": true,
      "description": "Default true: after the engine produces measured findings (per-pattern $/mo, growth, incident clusters), ask the MCP host LLM via sampling to contribute operational context the engine cannot see: kubectl events / deploys correlating with GROWING patterns, alert / dashboard dependencies before recommending mute, code-level root-cause refinement on code_fix patterns, and prioritization based on customer context. Single round-trip, capped at 8000 output tokens. Skipped automatically when the host does not advertise sampling; the v2 envelope still ships without enrichment. Contributions land in output.agent_enrichment.contributions with an audit trail (tools_inspected) so the customer sees what the agent says it looked at."
    },
    "enrich_max_tokens": {
      "type": "integer",
      "minimum": 1000,
      "maximum": 32000,
      "default": 8000,
      "description": "Output token cap for the host-agent enrichment call. Default 8000."
    },
    "environment": {
      "type": "string",
      "description": "Optional environment nickname, cosmetic only, for the report header."
    },
    "target_percent_reduction": {
      "type": "number",
      "minimum": 0,
      "maximum": 100,
      "description": "Customer-specified target reduction percent. If absent, POC produces a recommendation-only output. If present, POC produces a feasibility verdict (`output.feasibility`) plus a pre-deploy commitment artifact stub (`output.commitment_artifact`) the agent can surface alongside the per-pattern actions. The cap CSV ready to commit ships in Item 4 of the cost-cutting close list."
    },
    "exception_services": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Services flagged by the customer to stay in the SIEM with full retention (action=pass). Typically 1-3 services for audit, compliance, or executive dashboards. Patterns whose service is in this list are pinned to pass on the envelope outputs and their bytes are subtracted from the achievable reduction pool used for the feasibility verdict."
    },
    "pin_services": {
      "type": "object",
      "additionalProperties": {
        "type": "string",
        "enum": [
          "pass",
          "sample",
          "compact",
          "tier_down",
          "offload",
          "drop"
        ]
      },
      "description": "Primary per-service override surface. Map of service name to action. \"Pin payment-svc to pass\" → {\"payment-svc\":\"pass\"}. Pins are applied AFTER the destination default and AFTER exception_services. Feasibility math reruns with the pins; max_achievable_percent may shift and reason cites the pins."
    },
    "pin_patterns": {
      "type": "object",
      "additionalProperties": {
        "type": "string",
        "enum": [
          "pass",
          "sample",
          "compact",
          "tier_down",
          "offload",
          "drop"
        ]
      },
      "description": "Advanced: most customers will not need this. Map of pattern_hash to action for rare per-pattern overrides within a service. Applied AFTER pin_services. Use only when a single pattern inside an otherwise-reducible service must be excepted (e.g., audit-trail log line inside a chatty service)."
    },
    "clickhouse_table": {
      "type": "string",
      "description": "[ClickHouse] Required: table name holding log events."
    },
    "clickhouse_timestamp_column": {
      "type": "string",
      "description": "[ClickHouse] Column holding the timestamp. Default auto-detected."
    },
    "clickhouse_message_column": {
      "type": "string",
      "description": "[ClickHouse] Column holding the message body. Default auto-detected."
    },
    "clickhouse_service_column": {
      "type": "string",
      "description": "[ClickHouse] Optional column for service name."
    },
    "clickhouse_severity_column": {
      "type": "string",
      "description": "[ClickHouse] Optional column for severity."
    }
  },
  "additionalProperties": false
}
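
The total_*_gb descriptions above define a priority order (daily > monthly > annual) and a normalization to daily volume, followed by extrapolation from the pulled sample to the full volume. One way this could look is sketched below. The 30-day month and 365-day year divisors are assumptions (the schema does not state the conversion factors), the byte-share extrapolation is an assumed formula, and normalizeDailyGb and annualPatternCost are hypothetical names.

```typescript
interface VolumeArgs {
  total_daily_gb?: number;
  total_monthly_gb?: number;
  total_annual_gb?: number;
}

// Assumed conversion factors; the schema does not specify them.
const DAYS_PER_MONTH = 30;
const DAYS_PER_YEAR = 365;

// Returns daily GB using the documented priority daily > monthly > annual,
// or undefined when no override was given (in which case the auto-detect
// probe or the scenario brackets would apply).
function normalizeDailyGb(args: VolumeArgs): number | undefined {
  if (args.total_daily_gb !== undefined) return args.total_daily_gb;
  if (args.total_monthly_gb !== undefined) return args.total_monthly_gb / DAYS_PER_MONTH;
  if (args.total_annual_gb !== undefined) return args.total_annual_gb / DAYS_PER_YEAR;
  return undefined;
}

// Assumed extrapolation: the pattern's share of sampled bytes is applied
// to the total daily volume, priced per GB, and scaled to a year.
function annualPatternCost(
  patternBytes: number,
  sampleBytes: number,
  dailyGb: number,
  costPerGb: number,
): number {
  return (patternBytes / sampleBytes) * dailyGb * costPerGb * DAYS_PER_YEAR;
}

// A pattern holding 25% of sampled bytes, at 1200 GB/day and $2.50/GB.
console.log(annualPatternCost(125, 500, 1200, 2.5)); // 273750
```

As the total_daily_gb description warns, this kind of extrapolation overstates cost when the pull was narrowed via query, because only a fraction of the daily volume matches the filter.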

Source: src/tools/poc-from-siem.ts.
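
The precedence the schema describes for exception_services, pin_services, and pin_patterns (destination default first, then exceptions, then service pins, then pattern pins) can be sketched as a small resolver. resolveAction, PatternRef, and the destinationDefault argument are hypothetical names chosen for illustration; the actual logic in the source file may differ.

```typescript
type Action = "pass" | "sample" | "compact" | "tier_down" | "offload" | "drop";

// Hypothetical pattern descriptor for this sketch.
interface PatternRef {
  service: string;
  pattern_hash: string;
}

interface Overrides {
  exception_services?: string[];
  pin_services?: Record<string, Action>;
  pin_patterns?: Record<string, Action>;
}

// Applies overrides in the documented order; later steps win.
function resolveAction(
  pattern: PatternRef,
  destinationDefault: Action,
  overrides: Overrides,
): Action {
  // 1. Start from the destination default.
  let action: Action = destinationDefault;

  // 2. Exception services are pinned to pass.
  const exceptions = overrides.exception_services ?? [];
  if (exceptions.indexOf(pattern.service) !== -1) action = "pass";

  // 3. Service pins apply after the default and after exceptions.
  const servicePin = overrides.pin_services?.[pattern.service];
  if (servicePin !== undefined) action = servicePin;

  // 4. Pattern pins apply last, for rare per-pattern exceptions.
  const patternPin = overrides.pin_patterns?.[pattern.pattern_hash];
  if (patternPin !== undefined) action = patternPin;

  return action;
}
```

With a default of drop, an audit-trail pattern inside a chatty service resolves to pass when pin_patterns maps its hash to pass, while every other pattern in that service still resolves to drop.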

Output example

Real envelope from the demo env. view: "summary" returns the full StructuredOutput with typed data. Long arrays and base64 PNG bodies are trimmed for readability; the real call returns them in full.

Headline (the 1-line agent-facing answer):

POC submit accepted for cloudwatch (snapshot_id b27c9a4f); estimated 4 min. Poll log10x_poc_from_siem_status.

{
  "schema_version": "1.0",
  "schema_epoch": "2026-05-25",
  "tool": "log10x_poc_from_siem_submit",
  "view": "summary",
  "summary": {
    "headline": "POC submit accepted for cloudwatch (snapshot_id b27c9a4f); estimated 4 min. Poll log10x_poc_from_siem_status."
  },
  "data": {
    "ok": true,
    "snapshot_id": "b27c9a4f-...",
    "siem_detected": "cloudwatch",
    "estimated_duration_minutes": 4,
    "window": "7d",
    "target_event_count": 250000,
    "max_pull_minutes": 5
  },
  "actions": [
    {
      "tool": "log10x_poc_from_siem_status",
      "args": {
        "snapshot_id": "b27c9a4f-..."
      },
      "reason": "poll POC progress; phases: pulling -> analyzing -> rendering -> complete"
    }
  ],
  "generated_at": "2026-05-26T00:00:00.000Z"
}
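
The actions[] hint above points at log10x_poc_from_siem_status. A polling loop along the lines described in the example (roughly every 30s until the phase reaches complete) might look like the sketch below. The status payload shape (phase, partial_patterns_found) and the injected callTool function are assumptions; this page does not document the status tool's output, so treat the field names as placeholders.

```typescript
type Phase = "pulling" | "analyzing" | "rendering" | "complete";

// Assumed status payload; field names are placeholders for this sketch.
interface StatusData {
  phase: Phase;
  partial_patterns_found?: number;
}

// The caller supplies the transport (for example, an MCP client wrapper).
type CallTool = (tool: string, args: { snapshot_id: string }) => Promise<StatusData>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls the status tool until the snapshot completes or maxPolls is reached.
async function pollUntilComplete(
  callTool: CallTool,
  snapshotId: string,
  intervalMs: number = 30000,
  maxPolls: number = 120,
): Promise<StatusData> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const status = await callTool("log10x_poc_from_siem_status", {
      snapshot_id: snapshotId,
    });
    if (status.phase === "complete") return status;
    await sleep(intervalMs);
  }
  throw new Error(`snapshot ${snapshotId} did not complete after ${maxPolls} polls`);
}
```

The maxPolls cap is a design choice for the sketch: it keeps a stalled snapshot from blocking the caller indefinitely.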

Output schema

The data block inside the StructuredOutput envelope:

interface ToolData {
  ok: boolean;
  snapshot_id: string;
  siem_detected: string;
  estimated_duration_minutes: number;
  window: string;
  target_event_count: number;
  max_pull_minutes: number;
}

Envelope-level fields the agent should also read: summary.headline (1-line answer), actions[] (next-call chain hints as {tool, args, reason}), truncated: boolean, images[] (PNG attachments where applicable), schema_epoch (engine-ID stability boundary).
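
Putting the envelope-level fields together with ToolData, an agent-side type and a helper that picks the next call might look like the sketch below. The optionality of truncated and images is an assumption inferred from the output example (neither field appears there), the element type of images is left open, and Envelope, ActionHint, and nextCall are hypothetical names.

```typescript
interface ToolData {
  ok: boolean;
  snapshot_id: string;
  siem_detected: string;
  estimated_duration_minutes: number;
  window: string;
  target_event_count: number;
  max_pull_minutes: number;
}

// Next-call hint, shaped as {tool, args, reason}.
interface ActionHint {
  tool: string;
  args: Record<string, unknown>;
  reason: string;
}

// Assumed envelope shape, inferred from the output example.
interface Envelope<T> {
  schema_version: string;
  schema_epoch: string;
  tool: string;
  view: string;
  summary: { headline: string };
  data: T;
  actions: ActionHint[];
  truncated?: boolean; // assumed optional
  images?: unknown[];  // assumed optional; element shape not documented here
  generated_at: string;
}

// Returns the first next-call hint, or undefined when there is none.
function nextCall<T>(envelope: Envelope<T>): ActionHint | undefined {
  return envelope.actions.length > 0 ? envelope.actions[0] : undefined;
}
```

For the submit envelope shown in the output example, nextCall would return the log10x_poc_from_siem_status hint with the snapshot_id in its args.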

Next: POC status