Regex Pattern Scanner

Extracts symbol values from text files using regular expressions.

Use this scanner as a fallback for file formats not supported by language-specific scanners.

Configuration

To configure the Regex pattern scanner module, edit these settings.

Below is the default configuration from: pattern/config.yaml.

# 🔟❎ 'compile' pattern quoted strings configuration

# This pattern symbol scanner extracts single and double quoted string values
# from target files. This pattern is used as a default for capturing symbol values
# from source files for which no ANTLR grammar or built-in scanner is available.

# Set the 10x pipeline to 'compile'
tenx: compile

# ============================== Pattern Options ==============================

pattern: 

    # 'fileFilter' is set as a sink for prominent source code files. Available built-in and ANTLR
    #  scanners will take precedence over this pattern scanner
  - fileFilter: \b\w+\.(java|cs|php|cpp|cxx|c|h|H|hxx|cc|hh|py|go|js|rs|bash|scala|sc|ts|rb|kt|lua|groovy|swift)\b

    extractQuotes: true

    # 'lineFilter' defines a regex pattern applied to each line within a target file.
    #  Each matching region will be extracted as a symbol value.
    lineFilter: null

    # 'matchContext' sets the source context to assign to any symbols collected using this pattern selector.
    #  For possible values, see https://doc.log10x.com/run/transform/symbol/#contexts
    matchContext: method_invoke

    # 'unescapeScheme' specifies an unescape algorithm to apply to matching sequences of patternLineFilter. 
    #  Supported values: [json,java,xml,html,javaScript].
    unescapeScheme: java

Options

Specify the options below to configure one or more Regex pattern scanners:

| Name | Description |
|------|-------------|
| patternFileFilter | File name filter regex pattern |
| patternLineFilter | File line filter regex pattern |
| patternExtractQuotes | Extract single and double quoted values |
| patternMatchContext | Output symbol context |
| patternUnescapeScheme | Unescape scheme to apply to matching sequences |

patternFileFilter

File name filter regex pattern.

Type Required
String

Defines a regex pattern a file name must match for this scanner to be applied to its contents.
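
As a rough illustration of how a file name filter selects targets, the sketch below applies the default fileFilter value from the configuration above to a few file names. It uses Python's re module and assumes the pattern may match anywhere in the name (search semantics); the scanner's actual matching mode is not stated here, and `is_target_file` is a hypothetical helper name.

```python
import re

# Default 'fileFilter' value copied from pattern/config.yaml.
FILE_FILTER = re.compile(
    r"\b\w+\.(java|cs|php|cpp|cxx|c|h|H|hxx|cc|hh|py|go|js|rs|bash"
    r"|scala|sc|ts|rb|kt|lua|groovy|swift)\b"
)


def is_target_file(file_name: str) -> bool:
    """Return True if the scanner would be applied to this file name."""
    # Assumption: the filter may match anywhere in the name (search, not full match).
    return FILE_FILTER.search(file_name) is not None


for name in ("Main.java", "src/app.py", "README.md", "style.css"):
    print(name, is_target_file(name))
```

The trailing `\b` in the pattern is what keeps extensions such as css or json from matching the shorter cs and js alternatives.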

patternLineFilter

File line filter regex pattern.

Type Default
String ""

Defines a regex pattern applied to each line within a target file. Each matching region is extracted as a symbol value. NOTE: Consider the performance implications of the expression. Patterns prone to heavy backtracking can cause long processing times and/or errors.
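
The per-line behavior described above can be sketched as follows. This is a minimal illustration in Python, not the scanner's implementation; `extract_line_matches` and the `ERR_` pattern are hypothetical and chosen only for the example.

```python
import re


def extract_line_matches(text: str, line_filter: str) -> list[str]:
    """Apply line_filter to every line and collect each matching region."""
    pattern = re.compile(line_filter)  # compile once, reuse for every line
    values: list[str] = []
    for line in text.splitlines():
        # Every non-overlapping match on the line becomes one extracted value.
        values.extend(m.group(0) for m in pattern.finditer(line))
    return values


sample = "if (full) raise(ERR_DISK_FULL, ERR_TIMEOUT)\nreturn ok\n"
print(extract_line_matches(sample, r"\bERR_[A-Z0-9_]+\b"))
```

On the performance note: in backtracking regex engines, nested quantifiers such as `(a+)+$` can take exponential time on near-miss input, so anchored patterns with simple character classes, like the one used here, are the safer choice.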

patternExtractQuotes

Extract single and double quoted values.

Type Required
Boolean

Defines whether to extract single and double quoted values from input lines, taking into consideration escaping and improperly closed quotations. This option provides an efficient method for extracting string constants from input files without relying on regex backtracking. To learn more, see the [pseudo code](https://gist.github.com/talwgx/d31133f6278bb8b640baf050f78047d5).
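
To make the idea concrete, here is a minimal character-scan sketch of quote extraction in Python. It is not the linked pseudo code. It assumes that a backslash escapes the next character, that an unterminated quote is ignored, and that the extracted value keeps its escape sequences raw (unescaping would be a separate step). The function names are hypothetical.

```python
def _find_closing(line: str, start: int) -> int:
    """Return the index of the quote closing the one at `start`, or -1."""
    quote = line[start]
    j = start + 1
    while j < len(line):
        if line[j] == "\\":
            j += 2  # skip the backslash and the character it escapes
            continue
        if line[j] == quote:
            return j
        j += 1
    return -1


def extract_quoted(line: str) -> list[str]:
    """Return the raw contents of properly closed quoted strings in `line`."""
    values: list[str] = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # an escaped character outside quotes never opens a string
            continue
        if ch in "\"'":
            end = _find_closing(line, i)
            if end != -1:
                values.append(line[i + 1:end])
                i = end + 1
                continue
            # Assumption: an unterminated quote is ignored; scanning resumes after it.
        i += 1
    return values


print(extract_quoted('log.info("user saved") + fmt(\'id\')'))
```

Because the scan only moves forward and never retries alternative matches, a stray apostrophe or a missing closing quote costs one extra pass over the rest of the line at most, rather than the repeated retries a backtracking regex might perform.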

patternMatchContext

Output symbol context.

Type Default
String ""

Sets the source context to assign to any symbols collected using this pattern selector. For possible values, see symbol contexts.

If a target input line contains matches for patternLineFilter, this value is set to method, and the matching line contains values specified as logmethods or logstreams outside the match bounds, then the context is set to log.

For example, given the input line below, a pattern scanner set to method that extracts quoted strings assigns the log context to the extracted value, because the line contains the Console value, which is defined as a logging stream identifier.

Console.WriteLine("Accounting \"service started");
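
The rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scanner's code: `resolve_context` is a hypothetical helper, `LOG_IDENTIFIERS` stands in for whatever logmethods and logstreams are configured, and identifiers are compared as whole word tokens.

```python
import re

# Hypothetical stand-ins for the configured logmethods / logstreams values.
LOG_IDENTIFIERS = {"Console", "logger", "println"}


def resolve_context(line: str, match_spans: list[tuple[int, int]], configured: str) -> str:
    """Return 'log' if a log identifier appears outside the matched regions."""
    # Taking the text literally, the upgrade applies only when the context is 'method'.
    if configured != "method":
        return configured
    outside, pos = [], 0
    for start, end in sorted(match_spans):
        outside.append(line[pos:start])  # text before this match
        pos = max(pos, end)
    outside.append(line[pos:])  # text after the last match
    tokens = re.findall(r"\w+", " ".join(outside))
    return "log" if any(t in LOG_IDENTIFIERS for t in tokens) else configured


line = 'Console.WriteLine("Accounting \\"service started");'
span = (line.index('"'), line.rindex('"') + 1)  # the quoted region, quotes included
print(resolve_context(line, [span], "method"))
```

Text inside the match bounds is deliberately excluded, so a quoted string that merely contains the word Console does not by itself change the context.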

patternUnescapeScheme

Unescape scheme to apply to matching sequences.

Type Default
String ""

Specifies an unescape algorithm to apply to matching sequences of patternLineFilter. Supported values: [json,java,xml,html,javaScript].

If not specified, no unescaping is applied and matching sequences are used as-is.

To learn more see Unescape algorithms.
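
As a rough picture of what an unescape scheme does to an extracted value, the sketch below maps three of the scheme names to Python standard library calls. The mapping is an assumption made for illustration and may differ in detail from the scanner's own algorithms. The java and javaScript schemes are left out because Python's standard library has no direct equivalent, and `unescape` is a hypothetical helper name.

```python
import html
import json
from xml.sax.saxutils import unescape as xml_unescape


def unescape(value: str, scheme: str = "") -> str:
    """Apply an unescape scheme to an extracted value (illustrative only)."""
    if not scheme:
        return value  # no scheme: the value is returned unchanged
    if scheme == "json":
        # Wrap the raw value in quotes and let the JSON parser resolve escapes.
        return json.loads(f'"{value}"')
    if scheme == "html":
        return html.unescape(value)
    if scheme == "xml":
        # saxutils handles &lt; &gt; &amp; by default; quotes are added explicitly.
        return xml_unescape(value, {"&quot;": '"', "&apos;": "'"})
    raise ValueError(f"scheme not covered by this sketch: {scheme}")


print(unescape('Accounting \\"service started', "json"))
```

Applied to the value extracted from the Console.WriteLine example, the json scheme turns the escaped quote into a plain one, while leaving the scheme empty keeps the backslash in place.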


This module is defined in pattern/module.yaml.