Group Initializer

Group TenXObjects to filter, aggregate and output as a single logical unit.

Log events read from an input stream can form part of a larger logical group. A typical example is a stack trace, where each line of the trace may be logged as a separate event.

TenXObjects can be composed into logical groups to:

Identify groups consuming the most storage and analytics resources using aggregators. This is especially valuable when storing stack traces that span hundreds of lines and consume a significant amount of resources.
Filter unnecessary groups such as 'noisy' stack traces via group filters and output regulators.
Optimize storage of multi-line events by losslessly compacting them into composite instances, reducing storage footprint by more than 75% compared to storing individual events.
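As a rough illustration of the first point, the sketch below ranks groups by the bytes they occupy. It is not the 10x aggregator API: `rank_groups_by_bytes` is a hypothetical helper, each group is modeled as a plain list of lines, and the head line is assumed to be a good enough key for telling groups apart.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_groups_by_bytes(groups: Iterable[List[str]]) -> List[Tuple[str, int]]:
    """Sum UTF-8 bytes per group head and return heads sorted by total size."""
    totals: Counter = Counter()
    for lines in groups:
        if not lines:
            continue
        head = lines[0]  # assumption: the first line identifies the group
        totals[head] += sum(len(line.encode("utf-8")) for line in lines)
    return totals.most_common()
```

Feeding it a stream of grouped stack traces would surface which exception heads account for most of the volume, which is the kind of question the aggregators mentioned above are meant to answer.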

Group Heads

TenXObjects that evaluate as truthy against groupExpressions are marked as the start of a new group (i.e., a group head).

All subsequent TenXObjects read from the same input will join the current group until either:

A subsequent instance evaluates as truthy against groupExpressions, starting another group.
The number of TenXObjects in the current group exceeds groupMaxSize.
The groupFlushTimeout elapses.

At that point the group is sealed as a new composite TenXObject and flushed forward for aggregation and output. Each composite TenXObject reports the number of instances grouped within it via the groupSize member.
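The sealing rules above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the engine's implementation: `group_events`, `Group`, and the `is_head`, `max_size`, and `flush_timeout` parameters are hypothetical stand-ins for groupExpressions, groupMaxSize, and groupFlushTimeout. The sketch also assumes the timeout is measured from the moment the group started and is only checked when the next event arrives, whereas a real pipeline would likely flush on a timer.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass
class Group:
    """A sealed composite: the first line plus the lines that joined it."""
    lines: List[str] = field(default_factory=list)

    @property
    def group_size(self) -> int:
        # Mirrors the groupSize member described above.
        return len(self.lines)


def group_events(
    events: Iterable[Tuple[float, str]],  # (arrival time in seconds, line)
    is_head: Callable[[str], bool],       # stand-in for groupExpressions
    max_size: int,                        # stand-in for groupMaxSize
    flush_timeout: float,                 # stand-in for groupFlushTimeout
) -> Iterator[Group]:
    current: List[str] = []
    started_at = 0.0
    for arrived_at, line in events:
        if current:
            timed_out = arrived_at - started_at >= flush_timeout
            full = len(current) >= max_size  # adding one more would exceed the cap
            if is_head(line) or full or timed_out:
                yield Group(current)         # seal and flush the active group
                current = []
        if not current:
            started_at = arrived_at          # this line opens a new group
        current.append(line)
    if current:
        yield Group(current)                 # flush whatever remains at end of input
```

In this sketch a line that arrives after a size or timeout flush opens a new group even if it is not a head; whether the actual module behaves that way is not stated here.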

Grouping Logic

The group module calculates two static fields for each TenXTemplate:

isGroup

TenXObjects qualify as group heads (isGroup = true) when they meet one of these conditions:

Timestamp - Presence of a timestamp marks them as heads.
Severity level - Assignment of a severity level designates them as heads.
Group indicator - Starting with any configured groupIndicators pattern confirms them as heads.

Subsequent TenXObjects attach to the active group until encountering another head, exceeding the maximum group size, or hitting the flush timeout.

isStandalone

TenXObjects qualify as standalone (isStandalone = true) when they meet one of these conditions:

Group head - If isGroup is true, the event is standalone.
Not a continuation - If the event does NOT match any groupNegators pattern, it is standalone.

Standalone events have message and origin patterns calculated for them. This optimization skips pattern calculation for known continuation lines (e.g., stack trace frames) while ensuring events without timestamps or severity levels still get patterns calculated unless they're definitely continuations.

isGroup	matches negator	isStandalone
true	any	true
false	no	true
false	yes	false
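The table above can be expressed as two small predicates. This is a simplified sketch: the module computes these fields once per TenXTemplate, while the hypothetical `is_group` and `is_standalone` functions below work on a single raw line. Timestamp and severity detection are not described here, so they are passed in as booleans, and the pattern tuples are only a small subset of the defaults.

```python
from typing import Sequence

# Illustrative subsets of the default groupIndicators and groupNegators.
INDICATORS = ("GET ", "kernel:", "Traceback ", "Exception in thread ")
NEGATORS = ("\tat ", "Caused by: ", '  File "')


def is_group(line: str, has_timestamp: bool, has_severity: bool,
             indicators: Sequence[str] = INDICATORS) -> bool:
    # A head has a timestamp, a severity level, or starts with an indicator.
    return has_timestamp or has_severity or line.startswith(tuple(indicators))


def is_standalone(line: str, group: bool,
                  negators: Sequence[str] = NEGATORS) -> bool:
    # Heads are always standalone; other lines are standalone unless they
    # start with a known continuation pattern.
    return group or not line.startswith(tuple(negators))
```

Prefix matching is done against the unstripped line, on the assumption that leading tabs and spaces are significant, as the separate tab-prefixed and space-prefixed default negators suggest.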

Default Configuration

Default groupIndicators encompass patterns like private IP addresses (e.g., '192.', '10.'), HTTP methods (e.g., 'GET ', 'POST '), system logs (e.g., 'kernel:', 'sshd['), and stack trace starters (e.g., 'Traceback ', 'Exception in thread ').

Default groupNegators include stack trace continuation patterns for Java (e.g., '\tat ', 'Caused by: '), .NET (e.g., '   at '), Python (e.g., '  File "'), Node.js (e.g., '    at '), Go, Ruby, Rust, and PHP.

Users can customize these patterns in their configuration to suit specific log formats.

Configuration

To configure the Group initializer unit, edit these settings.

Below is the default configuration from: group/config.yaml.

# 🔟❎ 'run' event grouping configuration

# Group sequences of TenXObjects to filter, aggregate and output as a single logical unit.
# To learn more see https://doc.log10x.com/run/transform/group/

# Set the 10x pipeline to 'run'
tenx: run

# =============================== Dependencies ================================

include: run/modules/initialize/group

# =============================== Group Options ===============================


# 'indicators'  specifies a list of `value:state` pairs that determine if a log line's start marks a group head (`true`) or child (`false`).
group:

  # 'indicators' specifies a list of strings that, when matched at the start of a log line's text, designate it as a group head.
  # Unmatched lines default to false, indicating they are group children.
  indicators:
    - '192.'              # Indicates a private IP address, common in web server logs (e.g., "192.168.1.1 - - [...]")
    - '10.'               # Indicates a private IP range, often in Kubernetes or internal network logs (e.g., "10.244.0.125 - - [...]")
    - '172.'              # Indicates a private IP range, typical in enterprise network logs (e.g., "172.16.0.1 - - [...]")
    - '127.'              # Indicates localhost, a common web server log initiator (e.g., "127.0.0.1 - - [...]")
    - 'GET '              # Indicates an HTTP GET request, marking the start of a web transaction log (e.g., "GET /index.html HTTP/1.1")
    - 'POST '             # Indicates an HTTP POST request, marking the start of a web transaction log (e.g., "POST /api HTTP/1.1")
    - 'PUT '              # Indicates an HTTP PUT request, marking the start of a web transaction log (e.g., "PUT /resource HTTP/1.1")
    - 'DELETE '           # Indicates an HTTP DELETE request, marking the start of a web transaction log (e.g., "DELETE /resource HTTP/1.1")
    - 'HEAD '             # Indicates an HTTP HEAD request, marking the start of a web transaction log (e.g., "HEAD /index.html HTTP/1.1")
    - 'OPTIONS '          # Indicates an HTTP OPTIONS request, marking the start of a web transaction log (e.g., "OPTIONS /api HTTP/1.1")
    - 'HTTP/'             # Indicates an HTTP protocol version, marking the start of a web transaction log (e.g., "GET /index.html HTTP/1.1")
    - 'kernel:'           # Indicates a Linux kernel log entry, marking the start of a system event (e.g., "kernel: [0.123456] Device initialized")
    - 'sshd['             # Indicates an SSH daemon log entry, marking the start of a security event (e.g., "sshd[1234]: Accepted password ...")
    - 'systemd['          # Indicates a systemd service log entry, marking the start of a system service event (e.g., "systemd[1]: Started service ...")
    - 'cron['             # Indicates a cron daemon log entry, marking the start of a scheduled task event (e.g., "cron[1234]: Running job ...")
    - 'syslog:'           # Indicates a syslog message, marking the start of a system log event (e.g., "syslog: Message ...")
    - 'rsyslogd:'         # Indicates an rsyslog daemon log entry, marking the start of a logging system event (e.g., "rsyslogd: Log started ...")
    - 'auditd['           # Indicates an audit daemon log entry, marking the start of a security audit event (e.g., "auditd[1234]: Audit event ...")
    - 'daemon:'           # Indicates a syslog daemon facility log, marking the start of a system service event (e.g., "daemon: Service started ...")
    - 'user:'             # Indicates a syslog user facility log, marking the start of a user-related event (e.g., "user: User logged in ...")
    - 'local0:'           # Indicates a syslog local facility log, marking the start of a custom system event (e.g., "local0: Custom message ...")
    - 'local1:'           # Indicates a syslog local facility log, marking the start of a custom system event (e.g., "local1: Custom message ...")
    - 'level='            # Indicates a structured log key, marking the start of a key-value log entry (e.g., "level=info msg=Started")
    - 'msg='              # Indicates a structured log key, marking the start of a message log entry (e.g., "msg=Application started")
    - 'message='          # Indicates a structured log key, marking the start of a message log entry (e.g., "message=Application started")
    - 'event='            # Indicates a structured log key, marking the start of an event log entry (e.g., "event=Service startup")
    - 'thread='           # Indicates a structured log key, marking the start of a thread-specific log entry (e.g., "thread=main Processing ...")
    - 'Starting '         # Indicates the start of a process initiation log (e.g., "Starting server on port 8080")
    - 'Stopping '         # Indicates the start of a process termination log (e.g., "Stopping service ...")
    - 'Running '          # Indicates the start of a process status log (e.g., "Running task ...")
    - 'Listening '        # Indicates the start of a network service log (e.g., "Listening on port 8080 ...")
    - 'Connecting '       # Indicates the start of a connection attempt log (e.g., "Connecting to database ...")
    - 'Connected '        # Indicates the start of a successful connection log (e.g., "Connected to database ...")
    - 'Disconnected '     # Indicates the start of a disconnection log (e.g., "Disconnected from server ...")
    - 'Processing '       # Indicates the start of a task processing log (e.g., "Processing request ...")
    - 'Received '         # Indicates the start of a data reception log (e.g., "Received message ...")
    - 'Sent '             # Indicates the start of a data transmission log (e.g., "Sent response ...")
    - 'User '             # Indicates the start of a user action log (e.g., "User logged in ...")
    - 'Authentication '   # Indicates the start of an authentication log (e.g., "Authentication successful ...")
    - 'Authorized '       # Indicates the start of an authorization log (e.g., "Authorized user access ...")
    - 'Failed '           # Indicates the start of a failure log (e.g., "Failed login attempt ...")
    - 'kubelet:=true'     # Indicates a Kubernetes kubelet log entry, marking the start of a node event (e.g., "kubelet: Starting kubelet")
    - 'pod:=true'         # Indicates a Kubernetes pod log entry, marking the start of a pod event (e.g., "pod: Starting container")
    - 'container:=true'   # Indicates a Kubernetes container log entry, marking the start of a container event (e.g., "container: Started")
    - 'namespace:=true'   # Indicates a Kubernetes namespace log entry, marking the start of a namespace event (e.g., "namespace: Created")
    - 'Traceback '        # Indicates the start of a Python stack trace (e.g., "Traceback (most recent call last):")
    - 'File "'            # Indicates a Python stack trace line (e.g., "File "/script.py", line 10")
    - 'Trace:'            # Indicates a Node.js console trace (e.g., "Trace: Show me")
    - 'Error:'            # Indicates a Node.js error log (e.g., "Error: Something went wrong")
    - 'Warning:'          # Indicates a Node.js warning log (e.g., "Warning: Deprecated method")
    - 'Exception in thread ' # Indicates a Java exception header (e.g., "Exception in thread 'main'")
    - 'goroutine '        # Indicates a Go goroutine stack trace (e.g., "goroutine 1 [running]:")
    - 'panic:'            # Indicates a Go panic (e.g., "panic: runtime error")
    - 'thread '           # Indicates a Rust thread panic (e.g., "thread 'main' panicked at")
    - 'stack backtrace:'  # Indicates a Rust backtrace header (e.g., "stack backtrace:")
    - 'PHP Warning:'      # Indicates a PHP warning (e.g., "PHP Warning: Undefined variable")
    - 'PHP Fatal error:'  # Indicates a PHP fatal error (e.g., "PHP Fatal error: Out of memory")
    - 'in /'              # Indicates a PHP/Ruby file path in stack trace (e.g., "in /path/to/file.php:10")
    - 'Exception:'        # Indicates a C# exception (e.g., "Exception: Invalid operation")
    - 'terminate called'  # Indicates a C++ termination (e.g., "terminate called after throwing an instance of 'std::exception'")
    - 'syslog:'           # Indicates a Linux syslog message (e.g., "syslog: Message ...")
    - 'IN='               # Indicates an iptables firewall input log (e.g., "IN=eth0 OUT=")
    - 'OUT='              # Indicates an iptables firewall output log (e.g., "OUT=eth0 SRC=")
    - 'SRC='              # Indicates an iptables firewall source IP log (e.g., "SRC=192.168.1.1 DST=")
    - 'DST='              # Indicates an iptables firewall destination IP log (e.g., "DST=10.0.0.1 LEN=")
    - 'KafkaServer:'      # Indicates a Kafka server log (e.g., "KafkaServer: Starting Kafka server")
    - 'Redis:'            # Indicates a Redis server log (e.g., "Redis: Server initialized")

  # 'negators' specifies patterns that mark a line as a continuation of a previous event (e.g., stack trace lines).
  # Lines matching these patterns are NOT standalone events and typically don't need their own message pattern.
  # Events that are NOT negators (isStandalone=true) will get a message even without timestamp/severity.
  negators:
    # Java/Kotlin/Scala stack traces
    - "\tat "             # Java stack frame (e.g., "\tat com.example.MyClass.method(MyClass.java:42)")
    - "    at "           # Java stack frame with spaces (e.g., "    at com.example.MyClass.method")
    - "\t... "            # Java truncated stack (e.g., "\t... 15 more")
    - "    ... "          # Java truncated stack with spaces
    - "Caused by: "       # Java chained exception (e.g., "Caused by: java.io.IOException")
    - "Suppressed: "      # Java suppressed exception (e.g., "Suppressed: java.lang.Exception")

    # .NET/C# stack traces
    - "   at "            # .NET stack frame (e.g., "   at MyNamespace.MyClass.Method()")
    - "--- End of "       # .NET inner exception marker (e.g., "--- End of inner exception stack trace ---")
    - " ---> "            # .NET inner exception chain

    # Python stack traces
    - "  File \""         # Python stack frame (e.g., '  File "/path/to/script.py", line 10')
    - "    raise "        # Python raise statement in traceback
    - "    return "       # Python return statement in traceback

    # Node.js/JavaScript stack traces
    - "    at "           # Node.js stack frame (e.g., "    at Object.<anonymous> (/path/file.js:10:15)")
    - "    at new "       # Node.js constructor stack frame
    - "    at Module."    # Node.js module stack frame
    - "    at Object."    # Node.js object stack frame

    # Go stack traces
    - "\t/"               # Go stack frame path (e.g., "\t/go/src/main.go:42 +0x123")
    - "created by "       # Go goroutine creator (e.g., "created by main.main")

    # Ruby stack traces
    - "\tfrom "           # Ruby stack frame (e.g., "\tfrom /path/to/file.rb:10:in `method'")
    - "        from "     # Ruby stack frame with spaces

    # Rust stack traces
    - "             at "  # Rust source location (13 spaces + at)
    - "  --> "            # Rust error pointer

    # PHP stack traces
    - "#0 "               # PHP stack frame (e.g., "#0 /path/to/file.php(10): function()")
    - "#1 "               # PHP stack frame
    - "#2 "               # PHP stack frame
    - "#3 "               # PHP stack frame
    - "#4 "               # PHP stack frame
    - "#5 "               # PHP stack frame
    - "  thrown in "      # PHP exception location

Options

Specify the options below to configure the Group initializer:

Name	Description
groupIndicators	List of strings that, when matched at the start of a log line's text, designate it as a group head.
groupNegators	List of patterns that mark a line as a known continuation (e.g., stack trace lines), negating standalone status.

`groupIndicators`

List of strings that, when matched at the start of a log line's text, designate it as a group head.

Type	Default
List	[]

Unmatched lines default to false, indicating they are group children.

'192.' - Indicates a private IP address, common in web server logs (e.g., "192.168.1.1 - - [...]").
'10.' - Indicates a private IP range, often in Kubernetes or internal network logs (e.g., "10.244.0.125 - - [...]").
'GET ' - Indicates an HTTP GET request, marking the start of a web transaction log (e.g., "GET /index.html HTTP/1.1").
Continuation lines, such as Java/C# stack frames that start with 'at ', are not group heads; list them under groupNegators instead.

`groupNegators`

List of patterns that mark a line as a known continuation (e.g., stack trace lines), negating standalone status.

Type	Default
List	[]

A list of strings that, when matched at the start of a log line's text, mark it as a known continuation line.

Lines matching these patterns are identified as continuations (e.g., stack trace frames) and will not have message patterns calculated for them, optimizing processing performance.

'\tat ' - Java/Kotlin stack trace frame (tab-prefixed).
'   at ' - .NET stack trace frame (space-prefixed).
'  File "' - Python stack trace frame.
'    at ' - Node.js stack trace frame.
'Caused by: ' - Java chained exception.
'\t... ' - Java truncated stack frames.

This unit is defined in group/module.yaml.