Splunk

Configures a cloud or on-premises Splunk input whose events are read and transformed into typed TenXObjects.

Instances of this module define a connection to a hosted or on-premises Splunk cluster from which to retrieve events, as well as the query logic to use, such as chronological direction, start values, time ranges, and the page size of each API request.

Splunk inputs commonly run within scheduled jobs (e.g., k8s CronJob) to retrieve a recent sample of events (e.g., 200MB from the last 10min) and transform them into TenXObjects as part of the Cloud Reporter app.
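
As a rough sketch of the scheduled-job scenario above, the entry below sizes one input for a job that runs every 10 minutes. It uses only options that appear in the default configuration on this page; the input name and host are hypothetical placeholders, and the limits simply mirror the 200MB / 10min example rather than a recommended setting.

```yaml
splunk:
  - name: ScheduledSample            # hypothetical input name
    enabled: true
    host: splunk.example.com         # hypothetical host, replace with your own
    port: 8089
    username: $=TenXEnv.get("SPLUNK_USERNAME")
    password: $=TenXEnv.get("SPLUNK_PASSWORD")
    query: search *
    # Each limit closes the input when it is reached
    totalDuration: $=parseDuration("10min")
    totalBytesLimit: $=parseBytes("200MB")
```

Note that totalEventsLimit (10000 in the default configuration) also caps the run, so it may need raising for the byte limit to take effect.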

Architecture

The Splunk input module uses Apache Camel to poll the Splunk REST API:

graph LR
    A["<div style='font-size: 16px;'>🔍 Splunk API</div><div style='font-size: 14px;'>REST Endpoint</div>"] --> B["<div style='font-size: 16px;'>🛤️ Camel Route</div><div style='font-size: 14px;'>netty-http</div>"]
    B --> C["<div style='font-size: 16px;'>⚙️ 10x Pipeline</div><div style='font-size: 14px;'>Transform</div>"]
    C --> D["<div style='font-size: 16px;'>📈 TenXSummary</div><div style='font-size: 14px;'>Time-Series</div>"]

    classDef splunk fill:#9333ea88,stroke:#7c3aed,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef camel fill:#2563eb88,stroke:#1d4ed8,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef pipeline fill:#059669,stroke:#047857,color:#ffffff,stroke-width:2px,rx:8,ry:8
    classDef objects fill:#ea580c88,stroke:#c2410c,color:#ffffff,stroke-width:2px,rx:8,ry:8

    class A splunk
    class B camel
    class C pipeline
    class D objects

🔍 Splunk API : Polls Splunk's REST API at regular intervals to fetch search results

🛤️ Camel Route : Submits search jobs and retrieves results in configurable page sizes

⚙️ 10x Pipeline : Transforms raw events into structured TenXObjects with symbol enrichment

📈 TenXSummary : Outputs aggregated metrics to time-series outputs

Splunk User Permissions

The Splunk user account needs the following capabilities:

  • search - Execute searches
  • rest_apps_management - Access REST API
  • list_inputs_edit (optional) - For advanced input management

Network Requirements

  • Outbound HTTPS access to Splunk management port (default: 8089)
  • For Splunk Cloud: Ensure your IP is allowlisted

SSL Certificate Errors

Error: PKIX path building failed or unable to find valid certification path

Solution: For self-signed certificates in dev/test environments:

splunk:
  - name: DevSplunk
    verifySSL: false

For production, import the Splunk CA certificate into your Java truststore.

Authentication Failures

Error: 401 Unauthorized or Authentication failed

Checklist:

  1. Verify username/password are correct
  2. Check user has required Splunk capabilities
  3. Ensure credentials are properly passed via environment variables
  4. Test credentials with curl:
    curl -k -u username:password https://splunk-host:8089/services/server/info
    
No Results Returned

Symptoms: Pipeline starts but no events are processed

Checklist:

  1. Verify the search query returns results in Splunk UI
  2. Check totalEventsLimit isn't set too low
  3. Ensure enabled: true is set (or not explicitly set to false)
  4. Review query time range matches available data
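
The checklist above maps onto a handful of settings. A minimal sketch, assuming a hypothetical index named main and a 15-minute lookback expressed with Splunk's earliest time modifier; connection options are omitted, and both the index and the time range should be adjusted to data you know exists:

```yaml
splunk:
  - name: Splunk
    enabled: true                     # the default config ships with enabled: false
    # Hypothetical query: restrict to an index and time range known to hold data
    query: search index=main earliest=-15m
    totalEventsLimit: 10000           # raise if this cap is too low for your data
```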

Connection Timeouts

Error: Connection timed out or Read timed out

Solutions:

  • Check network connectivity to Splunk host
  • Verify firewall rules allow port 8089
  • For Splunk Cloud, ensure IP allowlisting
  • Increase totalDuration for slow networks

Rate Limiting

Symptoms: Intermittent failures, 429 Too Many Requests

Solution: Increase polling interval:

splunk:
  - name: RateLimited
    queryInterval: $=parseDuration("10s")

Credential Management

Never hardcode credentials in configuration files:

# Good - uses environment variables
username: $=TenXEnv.get("SPLUNK_USERNAME")
password: $=TenXEnv.get("SPLUNK_PASSWORD")

# Bad - hardcoded credentials
username: admin
password: secret123

SSL/TLS

  • Always use protocol: https (default)
  • Only disable verifySSL in development environments
  • For production with custom CAs, import certificates to Java truststore

Network Security

  • Use VPN or private networking when possible
  • Restrict Splunk API access to known IP ranges
  • Consider using Splunk tokens instead of username/password where supported

Configuration

To configure the Splunk input module, edit these settings.

Below is the default configuration from: splunk/config.yaml (* Required Fields).

# 🔟❎ 'run' Splunk input configuration

# Configure a Splunk event input
# To learn more see https://doc.log10x.com/run/input/analyzer/splunk/

# Set the 10x pipeline to 'run'
tenx: run

# =============================== Dependencies ================================

include: run/modules/input/analyzer/splunk

# =============================== Splunk Options ==============================

# Multiple Splunk inputs can be defined below

splunk:

    # ---------------------------- General Options ----------------------------

    # 'name' sets a unique logical name across all pipeline inputs
  - name: Splunk
    # Disabled by default - configure host/port to enable
    enabled: false

    # --------------------------- Connection Options --------------------------

    # 'host' and 'port' set the Splunk host address to connect to (e.g., '<deployment-name>.splunkcloud.com')
    host: null    # (❗ REQUIRED)
    port: null    # (Not mandatory if the host already encapsulates it)
    protocol: "https"

    # 'username' and 'password' used to authenticate against the Splunk deployment
    #  To learn more see https://docs.splunk.com/Documentation/Splunk/latest/RESTUM/RESTusing#Authentication_and_authorization
    username: $=TenXEnv.get("SPLUNK_USERNAME") # (❗ EnvVar REQUIRED)
    password: $=TenXEnv.get("SPLUNK_PASSWORD") # (❗ EnvVar REQUIRED)

    # ----------------------------- Query Options -----------------------------

    # 'pageSize' sets the number of events to retrieve with each result page
    #  Performance: Increase to 1000-2000 for high-volume environments
    pageSize: 500

    # 'query' sets the Splunk search query to execute for this job
    query: search *

    # --------------------------- Backpressure Options -----------------------

    # 'queryInterval' sets the interval between queries to the remote API
    #  Performance: Increase for rate-limited APIs; decrease for real-time needs
    queryInterval: $=parseDuration("2s")

    # 'totalDuration' sets the max duration to try reading from the remote input
    #  Performance: Match to your job scheduling interval
    totalDuration: $=parseDuration("5min")

    # 'totalBytesLimit' sets the max total bytes to read from the remote input
    #  Performance: Increase for longer analysis windows (e.g., 200MB for 10min)
    totalBytesLimit: $=parseBytes("50MB")

    # 'totalEventsLimit' sets the max number of events to read from the remote input
    #  Performance: Adjust based on memory capacity; each event consumes memory
    totalEventsLimit: 10000

    # --------------------------- Ancillary Options ---------------------------

    # 'printProgress' controls whether to print a progress gauge to the console
    #  This option helps debug and test the input
    printProgress: $=!TenXEnv.get("quiet")

Options

Specify the options below to configure multiple Splunk inputs:

| Name | Description | Category |
|------|-------------|----------|
| splunkName | Logical name of this Splunk input | General |
| splunkEnabled | Sets whether this input is enabled (default true) | General |
| splunkPrintProgress | Sets whether this input prints throughput stats to the console | General |
| splunkHost | Splunk host address | Authentication |
| splunkPort | Splunk server port | Authentication |
| splunkProtocol | Defines the protocol to connect to Splunk | Authentication |
| splunkUsername | Splunk user name | Authentication |
| splunkPassword | Splunk user password | Authentication |
| splunkVerifySSL | Whether to verify SSL certificates (default true) | Authentication |
| splunkQuery | Search query to execute | Query |
| splunkPageSize | Number of events to retrieve with each result page | Query |
| splunkTotalBytesLimit | Maximum total bytes to read from input before closing | Backpressure |
| splunkTotalEventsLimit | Maximum total events to read from input before closing | Backpressure |
| splunkTotalDuration | Maximum duration to keep input open before closing | Backpressure |
| splunkQueryInterval | Query interval (in milliseconds) for checking new data from remote source | Backpressure |
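
To illustrate defining more than one input, the sketch below lists two entries under the splunk key, following the layout of the default configuration. The input names and hosts are hypothetical placeholders, and both entries reuse the same credential environment variables purely for brevity.

```yaml
splunk:
  - name: SplunkProd                 # hypothetical name, must be unique across inputs
    enabled: true
    host: prod.splunk.example.com    # hypothetical host
    port: 8089
    username: $=TenXEnv.get("SPLUNK_USERNAME")
    password: $=TenXEnv.get("SPLUNK_PASSWORD")
    query: search *

  - name: SplunkDev                  # hypothetical second input
    enabled: true
    host: dev.splunk.example.com     # hypothetical host
    port: 8089
    verifySSL: false                 # self-signed certificate, dev/test only
    username: $=TenXEnv.get("SPLUNK_USERNAME")
    password: $=TenXEnv.get("SPLUNK_PASSWORD")
    query: search *
    pageSize: 1000
```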

General

splunkName

Logical name of this Splunk input.

Type Default Category
String "" General

Sets a logical name (e.g., 'mySplunk') for this input. The inputName field returns this value at run time to allow for identifying and operating on instances originating from this input.

splunkEnabled

Sets whether this input is enabled (default true).

Type Default Category
Boolean true General

Sets whether to open the input stream. To enable this input only when a splunkHost startup argument value is truthy, use:

splunkEnabled: $=TenXEnv.get("splunkHost")

To learn more see TenXEnv.get.

splunkPrintProgress

Sets whether this input prints throughput stats to the console.

Type Default Category
Boolean false General

Sets whether this input prints throughput stats to the console for testing an integration to a remote endpoint.

Authentication

splunkHost

Splunk host address.

Type Default Category
String "" Authentication

Sets the Splunk host address to connect to (e.g., <deployment-name>.splunkcloud.com).

splunkPort

Splunk server port.

Type Default Category
Number 0 Authentication

Sets the Splunk server port to connect to (e.g., 8089). The port is not needed if the provided splunkHost already encapsulates the port.
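
For example, a sketch that sets the host and port separately. The host keeps the placeholder form used on this page, and 8089 is the default Splunk management port mentioned under Network Requirements:

```yaml
# Replace the placeholder with your own deployment name
splunkHost: '<deployment-name>.splunkcloud.com'
splunkPort: 8089
splunkProtocol: https
```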

splunkProtocol

Defines the protocol to connect to Splunk.

Type Default Category
String https Authentication

Sets the protocol to connect to Splunk with (e.g., https).

splunkUsername

Splunk user name.

Type Default Category
String "" Authentication

Sets the Splunk user name to authenticate with. This value is set into the username header of the /services/search/v2/jobs/ endpoint.

splunkPassword

Splunk user password.

Type Default Category
String "" Authentication

Sets the Splunk user password to authenticate with. This value is set into the password header of the /services/search/v2/jobs/ endpoint.

splunkVerifySSL

Whether to verify SSL certificates (default true).

Type Default Category
Boolean true Authentication

Sets whether to verify SSL certificates when connecting to Splunk. Set to false to allow connections to Splunk instances with self-signed certificates.

Warning: Disabling SSL verification is not recommended for production environments.

For example:

splunkVerifySSL: false

Query

splunkQuery

Search query to execute.

Type Required Category
String Query

Sets the Splunk search query to execute for this job.
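
For example, a query narrowed to one index and sourcetype over the last 10 minutes. The index and sourcetype names here are hypothetical; substitute ones that exist in your deployment:

```yaml
# Hypothetical index and sourcetype names
splunkQuery: search index=main sourcetype=access_combined earliest=-10m
```

Narrowing the query in Splunk itself tends to keep the retrieved sample within the byte and event limits described under Backpressure.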

splunkPageSize

Number of events to retrieve with each result page.

Type Default Category
Number 500 Query

Sets the number of events to retrieve with each result page.

Performance: Increase to 1000-2000 for high-volume environments to reduce API round-trips.
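
For example, an illustrative value at the lower end of the range suggested above (not a tuned recommendation):

```yaml
splunkPageSize: 1000
```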

Backpressure

splunkTotalBytesLimit

Maximum total bytes to read from input before closing.

Type Default Category
Number 50000000 Backpressure

Sets the maximum number of bytes a target pipeline input will read into the pipeline. This value limits the volume of events to read from a local/remote source (e.g., log analyzer).

Performance: Increase for longer analysis windows (e.g., 200MB for 10min windows).

For example:

splunkTotalBytesLimit: $=parseBytes("1GB")

splunkTotalEventsLimit

Maximum total events to read from input before closing.

Type Default Category
Number 10000 Backpressure

Sets the maximum number of events a target pipeline input will read into the pipeline. This value limits the volume of events to read from a local/remote source (e.g., log analyzer).

Performance: Adjust based on memory and processing capacity. Each event consumes memory during processing.
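
For example, an illustrative value that raises the cap above the default of 10000; the right number depends on the memory available to the pipeline:

```yaml
splunkTotalEventsLimit: 50000
```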

splunkTotalDuration

Maximum duration to keep input open before closing.

Type Default Category
String 5min Backpressure

Sets the maximum duration a target pipeline input will remain open. When reached, the input will close and no more data will be read.

Performance: Match to your job scheduling interval (e.g., if running every 10min, set to 10min).

For example:

splunkTotalDuration: $=parseDuration("10min")

splunkQueryInterval

Query interval (in milliseconds) for checking new data from remote source.

Type Default Category
Number 2000 Backpressure

Sets the interval between queries to the remote Splunk API. This controls how frequently the input polls for new log data.

Performance: Increase for rate-limited APIs; decrease for real-time needs.

For example:

splunkQueryInterval: $=parseDuration("5s")


This module is defined in splunk/module.yaml.