
Investigate

Run root-cause analysis on a service or log pattern, and correlate log patterns across pillars with APM, infrastructure, and business metrics.

"spike on payments-svc — what's driving it?"

Root cause: Payment_Gateway_Timeout jumped 200/min → 45,000/min at 14:30. CPU spike on db-replica-2 matches.

"co-movers?"

Datadog: db.replica.cpu (r=0.94) · apm.payments.latency (r=0.91) · kafka.consumer.lag (r=0.87).

"verify"

kubectl describe pod db-replica-2 — node pressure, OOM-killed twice in last hour.
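
The r values in the co-movers answer read like correlation coefficients between the log pattern's event rate and each metric. The page does not say which statistic is used, so the following is only a minimal sketch assuming a plain Pearson correlation over time-aligned samples. The function name and both sample series are hypothetical and invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-minute samples on a shared time grid (not real data).
timeout_events_per_min = [200, 210, 190, 45000, 44000, 43000]
replica_cpu_percent = [35, 36, 34, 97, 96, 95]

r = pearson_r(timeout_events_per_min, replica_cpu_percent)
print(f"r = {r:.2f}")
```

A value near 1 means the two series rise and fall together. It does not by itself establish which one causes the other, which is why the example conversation ends with a verification step.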

You ask | Example answer
spike on payments-svc — what's driving it? | Root cause: Payment_Gateway_Timeout jumped 200/min → 45,000/min at 14:30. CPU spike on db-replica-2 matches. Verify: kubectl describe pod db-replica-2.
re-show investigation inv_a1b2 | Full report by ID (session-local, 50-item cache)
what moves with Payment_Gateway_Timeout? | Datadog: db.replica.cpu (r=0.94) · apm.payments.latency (r=0.91) · kafka.consumer.lag (r=0.87). 3 noise hits filtered.
log patterns behind apm.payments.latency? | Payment_Gateway_Timeout (r=0.91) · DB_Connection_Refused (r=0.83) · Retry_Exhausted (r=0.71)
apm_request_duration_p99{service="payments-svc"} | Direct passthrough to your Datadog / Grafana / Prometheus endpoint
join key for logs ↔ metrics? | Found service (matches in 87% of overlap). Used by Correlate / Translate automatically.
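
The join-key answer reports a label shared by logs and metrics together with a match percentage. How that percentage is computed is not documented here. The sketch below shows one possible reading: for each label present on both sides, take the fraction of log records whose value also appears on some metric series, and keep the best label. The function name, the record shape, and the sample data are all hypothetical.

```python
def best_join_key(log_records, metric_series):
    """Return (key, match_rate) for the shared label whose values overlap most.

    match_rate is the fraction of log records carrying the key whose value
    also appears under the same key on at least one metric series.
    """
    log_keys = set().union(*(r.keys() for r in log_records))
    metric_keys = set().union(*(s.keys() for s in metric_series))
    best = (None, 0.0)
    # Only labels present on both sides can serve as a join key.
    for key in sorted(log_keys & metric_keys):
        metric_values = {s[key] for s in metric_series if key in s}
        with_key = [r for r in log_records if key in r]
        matched = sum(1 for r in with_key if r[key] in metric_values)
        rate = matched / len(with_key)
        if rate > best[1]:
            best = (key, rate)
    return best


# Hypothetical label sets (not real data).
sample_logs = [
    {"service": "payments-svc", "host": "ip-10-0-1-7"},
    {"service": "payments-svc", "host": "ip-10-0-1-8"},
    {"service": "checkout-svc", "host": "ip-10-0-2-3"},
    {"service": "legacy-batch", "host": "ip-10-0-9-9"},
]
sample_metrics = [
    {"service": "payments-svc", "host": "node-a"},
    {"service": "checkout-svc", "host": "node-b"},
]

join_key, match_rate = best_join_key(sample_logs, sample_metrics)
print(join_key, f"{match_rate:.0%}")
```

In this toy data the host labels use different naming schemes on each side, so only service produces any matches.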

Prerequisites

Investigate and Resume require the Reporter to be deployed. The metric-correlation tools (Correlate, Translate, Query, Join keys) also need an APM or infrastructure metric endpoint linked: point LOG10X_CUSTOMER_METRICS_URL at your Grafana, Datadog, or Prometheus instance.
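
As a rough illustration of what a passthrough query against the linked endpoint could look like, the sketch below reads LOG10X_CUSTOMER_METRICS_URL and builds an instant-query URL for the PromQL expression from the table. It assumes the endpoint speaks the Prometheus HTTP API (`/api/v1/query`); Datadog and Grafana expose different APIs, and how the tool itself issues the request is not documented here. The helper name and the fallback host are hypothetical placeholders.

```python
import os
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build a Prometheus HTTP API instant-query URL for a PromQL expression."""
    # urlencode percent-escapes braces, quotes, and '=' in the expression.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Placeholder fallback host, used only when the variable is unset.
base_url = os.environ.get(
    "LOG10X_CUSTOMER_METRICS_URL", "http://prometheus.example.internal:9090"
)
url = instant_query_url(base_url, 'apm_request_duration_p99{service="payments-svc"}')
print(url)
```

The sketch only constructs the URL and sends nothing, so it can be run safely to check that the variable is set as expected before pointing the tools at it.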