Health check

Confirm that Log10x is wired up correctly across every component you expect: API auth, Reporter freshness, Retriever reachability, and log analyzer credentials. Returns one pass / warn / fail line per check, with the exact env var to set or kubectl command to run when something is missing.

Example

You

health check

Log10x

✓ 2 environments: prod ★, staging. Default: prod.

✓ prometheus.log10x.com reachable, auth OK

✓ Reporter detected: Cloud, edge reporter metrics fresh within 8s

✓ Retriever endpoint reachable

⚠ Datadog destination credentials missing. Set DD_API_KEY if you need Datadog integration.

✓ Paste endpoint reachable

✓ Enrichment labels present for metric correlation

✓ Scale: 101 TB / 7d · 42 services · 3,212 patterns · 33.9M events per pattern. Retriever deployed: historical events past analyzer retention, dropped-event recovery, metric backfill from archive, and sample-reversal verification are in reach.

More to ask

  • "health check on staging"
  • "check all envs"
  • "is the Reporter wired up correctly?"

Prerequisites

No prerequisites.

Schema and samples

Input example

Real call against the demo env (captured by scripts/capture-tool-envelopes.mjs).

{
  "view": "summary"
}

Input schema

Agent-facing JSON Schema (the canonical shape the MCP server publishes via tools/list):

{
  "type": "object",
  "properties": {
    "environment": {
      "type": "string",
      "description": "Optional environment nickname to probe. In multi-env setups, omit to run the checks against ALL configured environments; pass a specific nickname to check only that one."
    }
  },
  "additionalProperties": false
}

Source: src/tools/doctor.ts.
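
As a minimal sketch of how a caller might build arguments that satisfy the schema above: `DoctorArgs` and `buildDoctorArgs` are hypothetical names, not part of the Log10x codebase. The sketch follows the schema as printed, so it emits only `environment` and leaves the key out entirely when no nickname is given, which the schema description says runs the checks against all configured environments.

```typescript
// Hypothetical argument shape mirroring the published input schema.
// additionalProperties is false, so no other keys are emitted here.
interface DoctorArgs {
  environment?: string;
}

// Hypothetical helper: returns {} for "all environments",
// or { environment } when a non-empty nickname is supplied.
function buildDoctorArgs(environment?: string): DoctorArgs {
  const nickname = environment?.trim();
  return nickname ? { environment: nickname } : {};
}

const allEnvs = buildDoctorArgs();
const stagingOnly = buildDoctorArgs("staging");
console.log(JSON.stringify(allEnvs), JSON.stringify(stagingOnly));
```

Dropping the key rather than sending an empty string keeps the "omit to check all environments" behavior explicit in the request.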

Output example

Real envelope from the demo env. view: "summary" returns the full StructuredOutput with typed data. Long arrays and base64 PNG bodies are trimmed for readability; the real call returns them in full.

Headline (the 1-line agent-facing answer):

Doctor: overall WARN (12 pass, 4 warn, 0 fail).

{
  "schema_version": "1.0",
  "schema_epoch": "2026-05-25",
  "tool": "log10x_doctor",
  "generated_at": "2026-05-26T15:38:41.938Z",
  "view": "summary",
  "summary": {
    "headline": "Doctor: overall WARN (12 pass, 4 warn, 0 fail)."
  },
  "data": {
    "overall": "warn",
    "counts": {
      "pass": 12,
      "warn": 4,
      "fail": 0
    },
    "checks_by_env": {
      "global": [
        {
          "name": "environment_config",
          "status": "pass",
          "message": "1 environment: 10x Demo (read) ★. Default: 10x Demo. (★ = default env; nicknames with (read) are read-only.)"
        },
        {
          "name": "retriever_endpoint",
          "status": "warn",
          "message": "Retriever not reachable from this MCP install. log10x_retriever_query and log10x_backfill_metric cannot retrieve raw events from the S3 archive in this session. For events inside SIEM hot retention (typically <7d), the fastest workaround is direct SIEM query — do not block on retriever setup.\n  - explicit_env: skipped — __SAVE_LOG10X_RETRIEVER_URL__ / __SAVE_LOG10X_RETRIEVER_BUCKET__ not set\n  - terraform_state: skipped — /Users/talweiss/.log10x/retriever.tfstate not present\n  - aws_s3_bucket_pattern: skipped — no AWS_REGION / AWS_PROFILE in env\n  - kubectl_service: skipped — kubectl found 0 log10x-retriever services — none clearly a query-handler",
          "fix": "Options: (a) set __SAVE_LOG10X_RETRIEVER_URL__ + __SAVE_LOG10X_RETRIEVER_BUCKET__ explicitly; (b) expose AWS creds (AWS_REGION + IAM with s3:ListAllMyBuckets) so auto-detect can find a log10x-retriever-* bucket; (c) deploy the Retriever — https://doc.log10x.com/apps/cloud/retriever/"
        },
        {
          "name": "datadog_destination",
          "status": "warn",
          "message": "No DATADOG_API_KEY (or DD_API_KEY) set. backfill_metric to Datadog will error if attempted.",
          "fix": "Set DATADOG_API_KEY in the MCP server environment if you plan to backfill Datadog metrics. DD_SITE / DATADOG_SITE controls the region (defaults to datadoghq.com)."
        },
        "... 4 more elided"
      ],
      "10x Demo": [
        {
          "name": "metrics_backend_reachable",
          "status": "pass",
          "message": "Backend `log10x` at https://prometheus.log10x.com reachable, auth OK for env 10x Demo."
        },
        {
          "name": "reporter_tier",
          "status": "pass",
          "message": "Edge Reporter detected — full-fidelity metrics with dropped-event coverage."
        },
        {
          "name": "metric_freshness",
          "status": "pass",
          "message": "edge reporter emitted within the last 16s — metrics are fresh."
        },
        "... 6 more elided"
      ]
    },
    "failing_checks": [],
    "warning_checks": [
      {
        "env": "global",
        "name": "retriever_endpoint",
        "message": "Retriever not reachable from this MCP install. log10x_retriever_query and log10x_backfill_metric cannot retrieve raw events from the S3 archive in this session. For events inside SIEM hot retention (typically <7d), the fastest workaround is direct SIEM query — do not block on retriever setup.\n  - explicit_env: skipped — __SAVE_LOG10X_RETRIEVER_URL__ / __SAVE_LOG10X_RETRIEVER_BUCKET__ not set\n  - terraform_state: skipped — /Users/talweiss/.log10x/retriever.tfstate not present\n  - aws_s3_bucket_pattern: skipped — no AWS_REGION / AWS_PROFILE in env\n  - kubectl_service: skipped — kubectl found 0 log10x-retriever services — none clearly a query-handler",
        "fix": "Options: (a) set __SAVE_LOG10X_RETRIEVER_URL__ + __SAVE_LOG10X_RETRIEVER_BUCKET__ explicitly; (b) expose AWS creds (AWS_REGION + IAM with s3:ListAllMyBuckets) so auto-detect can find a log10x-retriever-* bucket; (c) deploy the Retriever — https://doc.log10x.com/apps/cloud/retriever/"
      },
      {
        "env": "global",
        "name": "datadog_destination",
        "message": "No DATADOG_API_KEY (or DD_API_KEY) set. backfill_metric to Datadog will error if attempted.",
        "fix": "Set DATADOG_API_KEY in the MCP server environment if you plan to backfill Datadog metrics. DD_SITE / DATADOG_SITE controls the region (defaults to datadoghq.com)."
      },
      {
        "env": "global",
        "name": "templater_paste_endpoint_fallback",
        "message": "Paste endpoint reachable (demo fallback for privacy_mode: false). Do NOT use with production log content — events leave the caller's machine and hit a shared public Lambda."
      },
      "... 1 more elided"
    ]
  },
  "actions": [],
  "truncated": false,
  "warnings": []
}

Output schema

The data block inside the StructuredOutput envelope:

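interface ToolData {
  overall: string;
  counts: { pass: number; warn: number; fail: number };
  // Keys seen in the demo capture: "global" and the environment nickname "10x Demo".
  checks_by_env: { global: Array<{
    name: string;
    status: string;
    message: string;
    fix?: string; // present on some warn checks in the example
  }>; "10x Demo": Array<{
    name: string;
    status: string;
    message: string;
    fix?: string;
  }> };
  failing_checks: unknown[];
  warning_checks: Array<{
    env: string;
    name: string;
    message: string;
    fix?: string; // absent on templater_paste_endpoint_fallback in the example
  }>;
}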
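
To illustrate how the data block could be turned back into one line per check, here is a small sketch. `Check`, `DoctorData`, `GLYPH`, and `renderChecks` are hypothetical names; the pass and warn glyphs match the chat example, while the fail glyph is an assumption. `fix` is treated as optional because one warning in the captured output carries none, and the sample object is a trimmed copy of two checks from the output example, not a full response.

```typescript
type Check = { name: string; status: string; message: string; fix?: string };

interface DoctorData {
  overall: string;
  counts: { pass: number; warn: number; fail: number };
  checks_by_env: Record<string, Check[]>;
}

// "✓" and "⚠" appear in the chat example; "✗" for fail is an assumption.
const GLYPH: Record<string, string> = { pass: "✓", warn: "⚠", fail: "✗" };

// Flattens checks_by_env into one display line per check, appending the fix when present.
function renderChecks(data: DoctorData): string[] {
  const lines: string[] = [];
  for (const [env, checks] of Object.entries(data.checks_by_env)) {
    for (const check of checks) {
      const glyph = GLYPH[check.status] ?? "?";
      const fix = check.fix ? ` Fix: ${check.fix}` : "";
      lines.push(`${glyph} [${env}] ${check.name}: ${check.message}${fix}`);
    }
  }
  return lines;
}

// Trimmed sample: two checks copied from the output example.
const sample: DoctorData = {
  overall: "warn",
  counts: { pass: 1, warn: 1, fail: 0 },
  checks_by_env: {
    global: [
      {
        name: "datadog_destination",
        status: "warn",
        message: "No DATADOG_API_KEY (or DD_API_KEY) set. backfill_metric to Datadog will error if attempted.",
        fix: "Set DATADOG_API_KEY in the MCP server environment if you plan to backfill Datadog metrics.",
      },
    ],
    "10x Demo": [
      {
        name: "metrics_backend_reachable",
        status: "pass",
        message: "Backend `log10x` at https://prometheus.log10x.com reachable, auth OK for env 10x Demo.",
      },
    ],
  },
};

const lines = renderChecks(sample);
for (const line of lines) console.log(line);
```

Typing `checks_by_env` as a `Record<string, Check[]>` is a design choice for this sketch, so the same loop works for any environment nickname rather than only the two keys in the demo capture.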

Envelope-level fields the agent should also read: summary.headline (1-line answer), actions[] (next-call chain hints as {tool, args, reason}), truncated: boolean, images[] (PNG attachments where applicable), schema_epoch (engine-ID stability boundary).
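
The envelope-level fields listed above can be read with a few lines of code. This is a sketch under stated assumptions: `Action`, `Envelope`, and `summarizeEnvelope` are hypothetical names, only the fields named in this section are modeled, and treating `truncated: true` as "some content was cut" is an interpretation rather than documented behavior. The sample values are copied from the output example.

```typescript
// Shape of a next-call hint, as described: {tool, args, reason}.
interface Action {
  tool: string;
  args: Record<string, unknown>;
  reason: string;
}

// Only the envelope-level fields named in this section are modeled.
interface Envelope {
  schema_epoch: string;
  summary: { headline: string };
  actions: Action[];
  truncated: boolean;
  images?: unknown[];
}

// Returns the headline first, then a truncation note (if any), then one line per suggested action.
function summarizeEnvelope(envelope: Envelope): string[] {
  const out: string[] = [envelope.summary.headline];
  if (envelope.truncated) {
    out.push("Note: output was truncated.");
  }
  for (const action of envelope.actions) {
    out.push(`Next: ${action.tool} (${action.reason})`);
  }
  return out;
}

// Values copied from the output example above.
const demoEnvelope: Envelope = {
  schema_epoch: "2026-05-25",
  summary: { headline: "Doctor: overall WARN (12 pass, 4 warn, 0 fail)." },
  actions: [],
  truncated: false,
};

const summaryLines = summarizeEnvelope(demoEnvelope);
console.log(summaryLines.join("\n"));
```

With the demo envelope, `actions` is empty and `truncated` is false, so only the headline is returned.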