Executable Scanner

Launches a subprocess to extract symbol values from source code or binary files.

Supports OS utilities like strings for pre-compiled binaries, or custom code for reading from arbitrary files and remote sources (databases, web services).

The subprocess stdout is parsed in one of two modes:

Unstructured

Splits stdout lines using token delimiters and assigns each value an exec context. See the strings configuration example below.

Structured

Parses JSON objects from stdout containing hierarchical symbol trees with context metadata (class, function, enum names). See SymbolUnit.java for an example.

Configuration

To configure the Executable scanner module, edit these settings.

Below is the default configuration from: executable/strings-nix.yaml.

# # 🔟❎ 'compile' executable 'strings' configuration

# This executable symbol scanner configuration uses OS-specific utilities to extract 
# constant string values from pre-compiled binaries (e.g. programs, shared libraries).
# Any matching text values are added to the output symbol files.

# The following configuration is added by default to the 10x 'compile' pipeline.
# If another 'executable' scanner is defined whose 'osFilter' matches that of the 
# current target input file, it will take precedence over the scanners defined below.
# To learn more about the 'executable' scanner, 
# see: https://doc.log10x.com/compile/scanner/executable

# Set the 10x pipeline to 'compile'
tenx: compile

# =============================== Exec Options ================================

exec: 

    # This scanner uses the Linux built-in 'strings' (https://linux.die.net/man/1/strings)
    # command to extract symbol values from a target binary executable file
  - name: stringsNIX

    # This configuration targets NIX OSs
    osFilter: (nix|nux|lux|mac)  

    # try for known extensions before invoking a 'selectorProcess'
    extensions:   
      - .so
      - .dynlib   

    # 'selector' processes provide a method for determining whether a file is an executable binary.
    #  If a selector process is not provided the file's ELF/PE/OSX binary header is tested. 
    #  Since on NIX, executables do not have a known extension,
    #  the 'selector' process uses the built-in 'file' command whose 
    #  output is used to verify the target file is a NIX executable file.   
    selector:
      process: file
      args:
        - file # this macro value is replaced at run-time by the actual input file name
      timeout: 5s      
      outputFilter: ".*?\\bexecutable\\b.*?" # select 'executable' files

    process: strings    
    args:
      - -a   # scan the whole file, not just initialized and loaded sections 
      - file # This macro value is replaced at run-time by the actual text/binary input file   
    timeout: 100s

    symbol: 
      # 'minLength' specifies the min number of chars a value extracted via strings must have
      #  in order to qualify as a symbol. 
      minLength: 4  

      # 'maxLength' specifies the max number of chars a value extracted via strings may have
      #  in order to qualify as a symbol. 
      maxLength: 1024

      # 'filters' defines regex patterns applied to each line read from the scanner process' stdout 
      #  that must NOT match for the line value to be considered a symbol.
      filters: 
        - AWAV.  # https://stackoverflow.com/questions/39322552/meaning-of-a-common-string-in-executables
        - "^(?=^.{0,7}$)(?=.*[^a-zA-Z]).*$" # Matches any string that is shorter than 8 characters (0 to 7 chars) and contains at least one non-alphabetic character.

      # 'setSelectors' specifies regex patterns that determine whether to treat an input value as a set of tokens
      setSelectors:
        - '^[0-9].*$'                  # Starts with digit
        - '^\*.*$'                     # Starts with *
        - '^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}(?:/[^\s]*)?$' # Domain + optional path (e.g., github.com/user/repo)
        - '^type:\\..*$'               # Golang metadata starts with "type:." (any suffix)

Below is the default configuration from: executable/strings-win.yaml.

# # 🔟❎ 'compile' executable 'strings' configuration

# This executable symbol scanner configuration uses OS-specific utilities to extract 
# constant string values from pre-compiled binaries (e.g. programs, shared libraries).
# Any matching text values are added to the output symbol files.

# The following configuration is added by default to the 10x 'compile' pipeline.
# If another 'executable' scanner is defined whose 'osFilter' matches that of the 
# current target input file, it will take precedence over the scanners defined below.
# To learn more about the 'executable' scanner, 
# see: https://doc.log10x.com/compile/scanner/executable/

# Set the 10x pipeline to 'compile'
tenx: compile

# =============================== Exec Options ================================

exec: 

    # This configuration provides a similar capability for scanning Windows executable files.

    # *IMPORTANT*: On Windows, the 'strings' command is not installed by default and
    # should be downloaded from: https://docs.microsoft.com/en-us/sysinternals/downloads/strings   
    # and added to 'PATH'    
  - name: stringsWin

      # target Win
    osFilter: win   

    extensions:
      - .so
      - .dll         

    process: strings
    args:
      - file # This macro value is replaced at run-time by the actual input file name
    timeout: 10s

Options

Specify the options below to configure one or more Executable scanners:

| Name | Description | Category |
| --- | --- | --- |
| execName | A logical name for this exec scanner | Scanner |
| execProcess | Process to launch for extracting symbol values | Scanner |
| execWorkDir | The path of the working directory when running 'execProcess' | Scanner |
| execArgs | List of arguments passed to 'execProcess' ('file' is replaced by scan target) | Scanner |
| execTimeout | Timeout period after which to terminate the 'execProcess' process | Scanner |
| execOsFilter | A pattern that must match the name of the host OS to apply this scanner | Filter |
| execExtensions | A list of file extensions a target file must match to apply this scanner | Filter |
| execFileNameFilter | A pattern that the target input file must match to apply this scanner | Filter |
| execSelectorProcess | Name of a process to launch whose output determines whether to apply this scanner to the input | Selector |
| execSelectorWorkDir | The path of the working directory when running 'execSelectorProcess' | Selector |
| execSelectorArgs | List of arguments passed to 'execSelectorProcess' | Selector |
| execSelectorTimeout | Timeout period after which to terminate the 'execSelectorProcess' process | Selector |
| execSelectorOutputFilter | A filter pattern that must match the output of 'execSelectorProcess' to apply this scanner | Selector |
| execSymbolMinLength | Minimum length a string read from 'execProcess' must have to be read as a symbol | Parse |
| execSymbolMaxLength | Maximum length a string read from 'execProcess' may have to be read as a symbol | Parse |
| execSymbolFilters | Patterns a line read from 'execProcess' must NOT match | Parse |
| execSymbolSetSelectors | Regex patterns that determine whether to treat an input value as a set of tokens | Parse |
| execIsStructured | Sets whether output lines read from 'execProcess' are read as plain text or structured JSON | Parse |
| execSymbolsPrefix | A prefix lines read from 'execProcess' must match to parse as structured JSON symbols | Parse |

Scanner

execName

A logical name for this exec scanner.

Type Required Category
String Yes Scanner

A logical name for this exec scanner, unique among all specified scanner option groups (e.g., 'stringsNIX').

execProcess

Process to launch for extracting symbol values.

Type Default Category
String "" Scanner

Defines a process to launch and whose stdout pipe to parse for symbol values.

execWorkDir

The path of the working directory when running 'execProcess'.

Type Default Category
String "" Scanner

Defines the working directory when running execProcess. If omitted, uses the current working directory.

execArgs

List of arguments passed to 'execProcess' ('file' is replaced by scan target).

Type Default Category
List [] Scanner

Defines the launch arguments for execProcess. The file macro expands into the input file path.

execTimeout

Timeout period after which to terminate 'execProcess' process.

Type Default Category
String "" Scanner

Defines how long execProcess is allowed to run before termination.
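The interaction between execProcess, execArgs, and execTimeout can be illustrated with a minimal Python sketch. This is not the scanner's implementation: the helper names (expand_args, parse_timeout, run_scanner) are hypothetical, and the sketch assumes the file macro replaces only arguments that are exactly equal to 'file' and that timeouts use a seconds suffix such as '100s'.

```python
import subprocess


def expand_args(args, target_path):
    """Replace every argument equal to the 'file' macro with the scan target path."""
    return [target_path if arg == "file" else arg for arg in args]


def parse_timeout(value):
    """Convert a duration such as '100s' into seconds (only the 's' suffix is handled)."""
    return float(value[:-1]) if value.endswith("s") else float(value)


def run_scanner(process, args, target_path, timeout):
    """Launch the process and return its stdout lines.

    subprocess.run kills the child and raises TimeoutExpired if it
    runs longer than the configured timeout.
    """
    result = subprocess.run(
        [process, *expand_args(args, target_path)],
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",  # tolerate non-UTF-8 bytes in the output
        timeout=parse_timeout(timeout),
    )
    return result.stdout.splitlines()
```

With the default NIX configuration, the equivalent call would be run_scanner("strings", ["-a", "file"], "/path/to/binary", "100s"), where the path is a placeholder.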

Filter

execOsFilter

A pattern that must match the name of the host OS to apply this scanner.

Type Required Category
String Filter

Defines a pattern to match against the name of the host OS to determine whether to apply this scanner.

execExtensions

A list of file extensions a target file must match to apply this scanner.

Type Default Category
List [] Filter

An array of file extensions to test a file candidate against. If the candidate matches any extensions, apply this scanner.

execFileNameFilter

A pattern that the target input file must match to apply this scanner.

Type Default Category
String "" Filter

Sets a regex pattern to test a file candidate name against. If the candidate name matches the pattern, apply this scanner.
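The three filter options can be read as independent predicates, sketched below in Python. The function names are hypothetical, and several details are assumptions rather than documented behavior: that patterns are matched with a substring search, that the OS name is compared in lower case, and that extensions are compared case-insensitively. How the scanner combines the three results is not covered by this sketch.

```python
import re
from pathlib import PurePath


def os_filter_matches(os_filter, os_name):
    """True if the OS filter pattern is found in the lower-cased host OS name."""
    return re.search(os_filter, os_name.lower()) is not None


def extension_matches(extensions, path):
    """True if the file's final extension is one of the configured extensions."""
    return PurePath(path).suffix.lower() in {ext.lower() for ext in extensions}


def file_name_matches(name_filter, path):
    """True if the file name (without its directory) matches the regex pattern."""
    return re.search(name_filter, PurePath(path).name) is not None
```

For example, os_filter_matches("(nix|nux|lux|mac)", "Linux") holds because "linux" contains "nux", while the same filter rejects "Windows 10".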

Selector

execSelectorProcess

Name of a process to launch whose output determines whether to apply this scanner to the input.

Type Default Category
String "" Selector

Defines the name of a process to launch to obtain information about the input file. On some OSs (e.g., Linux/Unix), executable files do not necessarily have a known extension, so an OS utility such as the NIX file command can report on the actual file contents.

If a selector process is not provided, the file's ELF/PE/OSX binary header is tested for a possible match.

execSelectorWorkDir

The path of the working directory when running 'execSelectorProcess'.

Type Default Category
String "" Selector

Defines the working directory when running execSelectorProcess. If omitted, uses the current working directory.

execSelectorArgs

List of arguments passed to 'execSelectorProcess'.

Type Default Category
List [] Selector

Defines arguments for the selector process, if provided. The file macro expands into the path of the target input file.

execSelectorTimeout

Timeout period after which to terminate the 'execSelectorProcess' process.

Type Default Category
String "" Selector

Sets how long to allow the selector process to run before termination (e.g. '10s').

execSelectorOutputFilter

A filter pattern that must match the output of 'execSelectorProcess' to apply this scanner.

Type Default Category
String "" Selector

Defines a regex pattern to match against the contents of the stdout stream read from the execSelectorProcess process. If matched, apply this scanner to the input file.
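A selector check along these lines can be sketched in Python as follows. The function name selector_accepts is hypothetical, and the sketch assumes that the output filter is applied as a regex search over the selector's stdout and that a failed launch or a timeout counts as a rejection; the actual scanner may behave differently in both respects.

```python
import re
import subprocess


def selector_accepts(process, args, target_path, output_filter, timeout_seconds):
    """Run the selector process and test its stdout against the output filter."""
    # Expand the 'file' macro into the target path (assumed exact-token match).
    argv = [process] + [target_path if arg == "file" else arg for arg in args]
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except (OSError, subprocess.TimeoutExpired):
        # Assumption: a selector that cannot run or times out rejects the file.
        return False
    return re.search(output_filter, result.stdout) is not None
```

With the default NIX configuration, this corresponds to selector_accepts("file", ["file"], path, r".*?\bexecutable\b.*?", 5) for a given input path.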

Parse

execSymbolMinLength

Minimum length a string read from 'execProcess' must have to be read as a symbol.

Type Default Category
Number 0 Parse

Sets the minimal length a string read from the scanner process stdout must be to be considered a symbol.

execSymbolMaxLength

Maximum length a string read from 'execProcess' may have to be read as a symbol.

Type Default Category
Number 0 Parse

Sets the maximum length a string read from the scanner process stdout may be to be considered a symbol.

execSymbolFilters

Patterns a line read from 'execProcess' must NOT match.

Type Default Category
List [] Parse

Defines regex patterns applied to each line read from the scanner process' stdout that must NOT match for the line value to be captured as a symbol.

For example, to exclude the 'AWAV' strings commonly found in binaries when using the strings OS utility, specify:

execSymbolFilters:
  - ^AWAV
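The length bounds and filters can be combined into a single qualification check, sketched below in Python using the values from the default strings-nix.yaml configuration. The function name is_symbol is hypothetical, and the sketch assumes inclusive length bounds and regex search semantics for filters; neither is confirmed by this page.

```python
import re

# Filters copied from the default strings-nix.yaml configuration.
FILTERS = [
    r"AWAV.",
    r"^(?=^.{0,7}$)(?=.*[^a-zA-Z]).*$",  # shorter than 8 chars with a non-letter
]


def is_symbol(line, min_length=4, max_length=1024, filters=FILTERS):
    """True if the line is within the length bounds and matches none of the filters."""
    if not (min_length <= len(line) <= max_length):
        return False
    return not any(re.search(pattern, line) for pattern in filters)
```

Under these assumptions, 'connection refused' qualifies, while 'abc' (too short), 'AWAVH' (first filter), and 'ab-cdef' (second filter) are rejected.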

execSymbolSetSelectors

Regex patterns that determine whether to treat an input value as a set of tokens.

Type Default Category
List [] Parse

Defines regex patterns that must ALL match for an input value to be treated as a series of independent tokens (e.g., 'error', 'connecting', 'to') rather than as a single line (e.g., 'error connecting to {}'). Splitting values into tokens reduces the cardinality and size of the output symbol file.

For example, when processing files that contain Go compiler symbols such as:

github.com/open-telemetry/opentelemetry-collector-contrib/exporter/fileexporter.group[go.shape.struct

Specify the patterns below to break the string into separate symbol tokens (e.g., shape, struct, ...):

setSelectors: 
- ^(?:github\.com|golang\.org|type:\.eq\.(?:github\.com|golang\.org))(/[a-zA-Z0-9_-]+)*$
- ^[a-zA-Z0-9_/:\.\[\]\*,-]*(?:\.[a-zA-Z0-9]+|\[[a-zA-Z0-9,\.\*/:_-]+\]|-fm)?$
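To experiment with selector patterns before adding them to a configuration, a small Python sketch such as the one below can help. It uses the first three setSelectors from the default strings-nix.yaml and reports which of them match a value; it deliberately does not decide how multiple matches are combined. Both helper names and the tokenizer's delimiter rule (split on any non-alphanumeric run) are assumptions made for illustration, not the scanner's actual behavior.

```python
import re

# First three setSelectors from the default strings-nix.yaml configuration.
SET_SELECTORS = [
    r"^[0-9].*$",  # starts with a digit
    r"^\*.*$",     # starts with *
    r"^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}(?:/[^\s]*)?$",  # domain + optional path
]


def matching_selectors(value, selectors=SET_SELECTORS):
    """Return the selector patterns that match the value."""
    return [pattern for pattern in selectors if re.search(pattern, value)]


def tokenize(value):
    """Split a value into tokens on non-alphanumeric runs (hypothetical delimiter rule)."""
    return [token for token in re.split(r"[^A-Za-z0-9]+", value) if token]
```

For instance, 'github.com/user/repo' matches only the domain selector, and tokenize splits it into 'github', 'com', 'user', and 'repo'.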

execIsStructured

Sets whether output lines read from 'execProcess' are read as plain text or structured JSON.

Type Default Category
Boolean false Parse

Controls whether to read the execProcess stdout as plain text or as a JSON object containing source context information. If false (default), symbol values read from the process' output are marked as having an exec context. To learn more about producing structured symbol values, see SymbolUnit.java.

execSymbolsPrefix

A prefix lines read from 'execProcess' must match to parse as structured JSON symbols.

Type Default Category
String "" Parse

Sets a prefix that must precede lines read from the scan process' stdout to parse as structured JSON symbol trees. Only applies when execIsStructured is true.
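The prefix mechanism can be sketched in Python as follows: only lines that start with the prefix are decoded as JSON, and everything else on stdout is ignored. The function name parse_structured, the 'SYM ' prefix, and the JSON keys in the usage example are all hypothetical; the actual symbol tree schema is defined by the scanner (see SymbolUnit.java) and is not reproduced here.

```python
import json


def parse_structured(lines, prefix):
    """Decode the JSON payload of every line that starts with the prefix."""
    symbols = []
    for line in lines:
        if not line.startswith(prefix):
            continue  # e.g. log or warning output from the subprocess
        try:
            symbols.append(json.loads(line[len(prefix):]))
        except json.JSONDecodeError:
            # Assumption: malformed payloads are skipped rather than failing the scan.
            continue
    return symbols
```

For example, parse_structured(['SYM {"class": "Foo"}', 'compiler warning'], 'SYM ') returns a single decoded object and drops the warning line.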


This module is defined in executable/module.yaml.