Regex Pattern Scanner
Extracts symbol values from text files using regular expressions.
Use this scanner as a fallback for file formats not supported by language-specific scanners.
Configuration
To configure the Regex pattern scanner module, edit these settings.
Below is the default configuration from pattern/config.yaml.
# 🔟❎ 'compile' pattern quoted strings configuration
# This pattern symbol scanner extracts single and double quoted string values
# from target files. This pattern is used as a default for capturing symbol values
# from source files for which no ANTLR grammar or built-in scanner is available.
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== Pattern Options ==============================
pattern:
  # 'fileFilter' is set as a sink for prominent source code files. Available built-in and ANTLR
  # scanners will take precedence over this pattern scanner
  - fileFilter: \b\w+\.(java|cs|php|cpp|cxx|c|h|H|hxx|cc|hh|py|go|js|rs|bash|scala|sc|ts|rb|kt|lua|groovy|swift)\b
    extractQuotes: true
    # 'lineFilter' defines a regex pattern applied to each line within a target file.
    # Each matching region will be extracted as a symbol value
    lineFilter: null
    # 'matchContext' sets the source context to assign to any symbols collected using this pattern selector.
    # For possible values, see https://doc.log10x.com/run/transform/symbol/#contexts
    matchContext: method_invoke
    # 'unescapeScheme' specifies an unescape algorithm to apply to matching sequences of patternLineFilter.
    # Supported values: [json,java,xml,html,javaScript].
    unescapeScheme: java
Options
Specify the options below to configure multiple Regex pattern scanners:
| Name | Description |
|---|---|
| patternFileFilter | File name filter regex pattern |
| patternLineFilter | File line filter regex pattern |
| patternExtractQuotes | Extract single and double quoted values |
| patternMatchContext | Output symbol context |
| patternUnescapeScheme | Unescape scheme to apply to matching sequences |
patternFileFilter
File name filter regex pattern.
| Type | Required |
|---|---|
| String | ✔ |
Defines a regex pattern a file name must match for this scanner to be applied to its contents.
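As a rough illustration of how such a filter selects files, the sketch below applies the default fileFilter value from pattern/config.yaml to a few hypothetical file names using Python's re module. Two assumptions are baked in: the scanner's own regex engine may differ in details from Python's, and the sketch uses search semantics (a match anywhere in the name), which this page does not confirm.

```python
import re

# Default 'fileFilter' value copied from pattern/config.yaml.
FILE_FILTER = re.compile(
    r"\b\w+\.(java|cs|php|cpp|cxx|c|h|H|hxx|cc|hh|py|go|js|rs|bash"
    r"|scala|sc|ts|rb|kt|lua|groovy|swift)\b"
)

def is_scanned(file_name: str) -> bool:
    """Return True if the name matches the filter (search semantics assumed)."""
    return FILE_FILTER.search(file_name) is not None

# Hypothetical file names, for illustration only.
for name in ["Main.java", "app.ts", "README.md", "styles.css"]:
    print(name, is_scanned(name))
```

Note that the trailing `\b` keeps an extension such as `css` from being accepted through the shorter `cs` or `c` alternatives.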
patternLineFilter
File line filter regex pattern.
| Type | Default |
|---|---|
| String | "" |
Defines a regex pattern applied to each line within a target file; each matching region is extracted as a symbol value. NOTE: Consider the performance implications of the expression to avoid long processing times and/or errors.
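To make "each matching region" concrete, here is a minimal sketch in Python. The pattern and the input line are hypothetical, and Python's re module stands in for whatever regex engine the scanner actually uses.

```python
import re

# Hypothetical lineFilter value: error codes such as ERR-1001.
LINE_FILTER = re.compile(r"ERR-\d{4}")

def extract_symbols(line: str) -> list[str]:
    """Collect every non-overlapping region of the line that matches the filter."""
    return [m.group(0) for m in LINE_FILTER.finditer(line)]

# Hypothetical source line containing two matching regions.
line = 'throw new AppError("ERR-1001", fallback="ERR-2002");'
print(extract_symbols(line))
```

The example pattern uses a fixed-width quantifier and no nested repetition, which is one way to keep matching time predictable on long lines.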
patternExtractQuotes
Extract single and double quoted values.
| Type | Required |
|---|---|
| Boolean | ✔ |
Defines whether to extract single and double quoted values from input lines, taking into consideration escaping and improperly closed quotations. This option provides an efficient method for extracting string constants from input files without relying on regex backtracking. To learn more, see [pseudo code](https://gist.github.com/talwgx/d31133f6278bb8b640baf050f78047d5).
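The idea of extracting quoted values in a single left-to-right pass, without regex backtracking, can be sketched as follows. This is not the linked pseudo code; it is a simplified illustration, and two behaviors are assumptions: escape sequences are kept raw in the extracted value, and an unterminated quotation ends the scan of that line without emitting a value.

```python
def extract_quoted(line: str) -> list[str]:
    """Extract single and double quoted values in one linear pass."""
    values = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch not in ("'", '"'):
            i += 1
            continue
        quote, j, buf, closed = ch, i + 1, [], False
        while j < n:
            c = line[j]
            if c == "\\" and j + 1 < n:
                # Keep the escape sequence raw so an escaped quote
                # does not terminate the value.
                buf.append(line[j:j + 2])
                j += 2
                continue
            if c == quote:
                closed = True
                break
            buf.append(c)
            j += 1
        if not closed:
            # Assumption: an improperly closed quotation yields no value.
            break
        values.append("".join(buf))
        i = j + 1
    return values

print(extract_quoted(r'Console.WriteLine("Accounting \"service started");'))
```

Because every character is visited at most once, the running time grows linearly with the line length regardless of how the quotes are nested or escaped.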
patternMatchContext
Output symbol context.
| Type | Default |
|---|---|
| String | "" |
Sets the source context to assign to any symbols collected using this pattern selector. For possible values, see symbol [contexts](https://doc.log10x.com/run/transform/symbol/#contexts).
If a target input line contains matches for patternLineFilter AND this value is set to method AND the matching line contains values specified as logmethods or logstreams outside of match bounds, then the context will be set as log.
For example, for the following input line, a pattern scanner set to method that extracts quoted strings
will assign the extracted value's context as log, given the presence of the Console value, which
is defined as a logging stream identifier:
Console.WriteLine("Accounting \"service started");
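The rule above can be sketched in Python as follows. The function name, its parameters, and the way log identifiers are passed in are all hypothetical; the sketch only mirrors the stated conditions (context set to method, and a log identifier present outside the match bounds).

```python
import re

def resolve_context(line, match_spans, configured_context, log_identifiers):
    """Return 'log' when a log identifier appears outside the matched regions."""
    if configured_context != "method":
        return configured_context
    # Blank out the matched regions so only text outside match bounds remains.
    chars = list(line)
    for start, end in match_spans:
        for k in range(start, end):
            chars[k] = " "
    outside = "".join(chars)
    for ident in log_identifiers:
        if re.search(r"\b" + re.escape(ident) + r"\b", outside):
            return "log"
    return configured_context

# 'Console' is outside the quoted match, so the context becomes 'log'.
line = r'Console.WriteLine("Accounting \"service started");'
span = (line.index('"'), line.rindex('"') + 1)
print(resolve_context(line, [span], "method", {"Console"}))

# Here 'Console' only appears inside the match, so the context is unchanged.
other = 'notify("Console ready");'
other_span = (other.index('"'), other.rindex('"') + 1)
print(resolve_context(other, [other_span], "method", {"Console"}))
```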
patternUnescapeScheme
Unescape scheme to apply to matching sequences.
| Type | Default |
|---|---|
| String | "" |
Specifies an unescape algorithm to apply to matching sequences of patternLineFilter. Supported values: [json,java,xml,html,javaScript].
If not specified, characters are treated as not escaped.
To learn more, see [Unescape algorithms](https://www.unbescape.org/usingunbescape.html).
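To show what an unescape scheme does to an extracted value, here is a deliberately simplified Python sketch. It is not the scanner's implementation: the backslash handler covers only a handful of common sequences and is shared by the json, java and javaScript schemes for brevity, while xml and html both use Python's html.unescape, which recognizes more entities than XML defines.

```python
import html
import re
from typing import Optional

# Simplified backslash escapes; real schemes support more sequences than this.
_SIMPLE = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "'": "'", "\\": "\\", "/": "/"}
_BACKSLASH = re.compile(r"\\(u[0-9a-fA-F]{4}|.)")

def _unescape_backslashes(value: str) -> str:
    def repl(m: re.Match) -> str:
        body = m.group(1)
        if len(body) == 5:  # 'u' followed by four hex digits
            return chr(int(body[1:], 16))
        return _SIMPLE.get(body, m.group(0))  # leave unknown escapes untouched
    return _BACKSLASH.sub(repl, value)

def unescape(value: str, scheme: Optional[str] = None) -> str:
    """Apply the named scheme; with no scheme, return the value unchanged."""
    if scheme is None:
        return value
    if scheme in ("json", "java", "javaScript"):
        return _unescape_backslashes(value)
    if scheme in ("xml", "html"):
        return html.unescape(value)
    raise ValueError(f"unsupported scheme: {scheme}")

print(unescape(r'Accounting \"service started', "java"))
print(unescape("a &amp; b", "html"))
```

With no scheme, the escaped quote in the first value would be kept as a backslash followed by a quote, which matches the statement that characters are treated as not escaped.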
This module is defined in pattern/module.yaml.