ANTLR Scanner
Extracts symbols from source files using ANTLR parsers.
The rules defined for each language control which symbols are extracted from the AST and the context assigned to each one within the source file (e.g., class name, constant, enum). This enables generating symbol units from source code in virtually any programming language for which an ANTLR grammar is defined.
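As an illustration, the entry below is one such rule, copied from the default Java configuration listed further down this page. Judging by the field names, it appears to capture all symbols found under `normalClassDeclaration` nodes and label them with the `class` context. That reading, the comments, and the indentation shown are assumptions, not a documented specification.

```yaml
# Sketch of a single rule entry, copied from the Java defaults.
# Indentation is assumed; the comments are inferred from the field names.
rule:
  - name: normalClassDeclaration   # grammar rule (AST node) to match
    lang: java                     # language definition the rule applies to
    context: class                 # context attached to the captured symbols
    recursive: false               # assumed: do not descend into nested nodes
    capture: allSymbols            # other default entries use 'literalsOnly'
```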
Extensibility
Parsing source code files for a given language can be done in one of two ways:
- Compiling a .g4 grammar file with the ANTLR tool into a parser class and specifying its fully qualified class name via the antlrParserClass argument.
- Loading a .g4 grammar file directly via the antlrGrammarFile argument, provided the target grammar does not require any language-specific code extensions.
The ANTLR scanner uses either pre-compiled or interpreted (.g4) parsers to read the contents of input files and traverse the resulting AST.
To learn more, see the ANTLR tutorial.
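The two approaches correspond to the two shapes of language definition in the default configurations on this page: the Java and Go defaults point at pre-compiled classes, while the C++ and Scala defaults load .g4 files. Below is a minimal sketch of both forms for a made-up language. The name `mylang`, the `com.example` class names, the grammar paths, and the nesting under `antlr` are all hypothetical placeholders, and the two variants are shown as separate YAML documents only for comparison.

```yaml
# Variant 1: pre-compiled parser and lexer classes generated by the ANTLR tool.
# (hypothetical language and class names)
tenx: compile
antlr:
  lang: mylang
  fileExt:
    - .mylang
  parserClass: com.example.antlr.MyLangParser
  lexerClass: com.example.antlr.MyLangLexer
  rootRule: compilationUnit   # must name an entry rule of the grammar
---
# Variant 2: grammar interpreted directly from .g4 files.
# Only suitable when the grammar has no language-specific code extensions.
tenx: compile
antlr:
  lang: mylang
  fileExt:
    - .mylang
  grammarFile: grammar/MyLangParser.g4
  lexerFile: grammar/MyLangLexer.g4
  rootRule: compilationUnit
```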
Modules
Configuration
To configure the ANTLR scanner module, edit these settings.
Below is the default configuration from: antlr/java.yaml.
# 🔟❎ 'compile' Java ANTLR symbol scanner configuration
# The 'java' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from Java source files.
# IMPORTANT: while this scanner is enabled by default, the dedicated Java scanner
# takes precedence over it unless that dedicated scanner is disabled.
# To learn more see https://doc.log10x.com/compile/scanner/javaParser
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR-generated parser class used
# to tokenize and construct an AST for the target Java file.
lang: java
fileExt:
- .java
parserClass: com.log10x.antlr.generated.java.Java9Parser
lexerClass: com.log10x.antlr.generated.java.Java9Lexer
rootRule: compilationUnit
reserved:
- package
- interface
- class
- enum
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
# The 'rule' list defines the nodes within a Java AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: packageDeclaration
lang: java
context: package
recursive: false
capture: allSymbols
- name: methodInvocation
lang: java
context: method_invoke
recursive: true
capture: literalsOnly
- name: enumDeclaration
lang: java
context: class
recursive: false
capture: allSymbols
- name: annotation
lang: java
context: annotation_invoke
recursive: true
capture: literalsOnly
- name: normalClassDeclaration
lang: java
context: class
recursive: false
capture: allSymbols
- name: variableInitializer
lang: java
context: var_assign
recursive: true
capture: literalsOnly
- name: enumConstant
lang: java
context: enum
recursive: false
capture: allSymbols
- name: returnStatement
lang: java
context: var_assign
recursive: true
capture: literalsOnly
- name: methodDeclaration
lang: java
context: method_decl
recursive: false
capture: literalsOnly
- name: argumentList
lang: java
context: method_invoke
recursive: true
capture: literalsOnly
- name: methodInvocation_lfno_primary
lang: java
context: method_invoke
recursive: true
capture: literalsOnly
- name: methodDeclarator
lang: java
context: method_decl
recursive: false
capture: allSymbols
- name: methodInvocation_lf_primary
lang: java
context: method_invoke
recursive: true
capture: literalsOnly
- name: normalInterfaceDeclaration
lang: java
context: class
recursive: false
capture: allSymbols
Below is the default configuration from: antlr/cpp.yaml.
# 🔟❎ 'compile' C++ ANTLR symbol scanner configuration
# The 'cpp' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from C++ source files.
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR .g4 grammar files used
# to tokenize and construct an AST for the target C++ file.
lang: cpp
fileExt:
- .cpp
- .cxx
- .c++
- .c
- .h
- .H
- .hpp
- .hxx
- .h++
- .cc
- .hh
grammarFile: grammar/CPP14Parser.g4
lexerFile: grammar/CPP14Lexer.g4
rootRule: translationUnit
reserved:
- '::'
- '<<'
- inline
- namespace
- <
- class
- '>'
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
- ^.*\bnew\(\);\b.*$
# The 'rule' list defines the nodes within a cpp AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: enumerator
lang: cpp
context: enum
recursive: false
capture: allSymbols
- name: blockDeclaration
lang: cpp
context: method_invoke
recursive: true
capture: literalsOnly
- name: enumbase
lang: cpp
context: enum
recursive: true
capture: literalsOnly
- name: classHeadName
lang: cpp
context: class
recursive: true
capture: allSymbols
- name: enumSpecifier
lang: cpp
context: class
recursive: false
capture: literalsOnly
condition: ^enumHead$
tag: EnumHead
- name: enumHead
lang: cpp
context: class
recursive: false
capture: allSymbols
ifTag: EnumHead
- name: functionDefinition
lang: cpp
context: method_decl
recursive: false
capture: allSymbols
condition: ^functionDefinition$
tag: FunctionDefinition
- name: declaratorid
lang: cpp
context: method_decl
recursive: true
capture: allSymbols
ifTag: FunctionDefinition
- name: jumpStatement
lang: cpp
context: var_assign
recursive: true
capture: literalsOnly
- name: functionBody
lang: cpp
context: method_invoke
recursive: false
capture: literalsOnly
# condition: ^functionBody$
# tag: FunctionBody
- name: statement
lang: cpp
context: method_invoke
recursive: true
capture: literalsOnly
- name: classSpecifier
lang: cpp
context: class
recursive: false
capture: literalsOnly
subRule: memberSpecification
- name: staticAssertDeclaration
lang: cpp
context: var_assign
recursive: true
capture: literalsOnly
- name: parameterDeclarationList
lang: cpp
context: method_invoke
recursive: false
capture: literalsOnly
# condition: ^parameterDeclarationList$
# tag: ParameterDeclarationList
- name: enumeratorDefinition
lang: cpp
context: enum
recursive: false
capture: allSymbols
- name: namespaceDefinition
lang: cpp
context: package
recursive: false
capture: allSymbols
- name: constantExpression
lang: cpp
context: var_assign
recursive: true
capture: literalsOnly
- name: templateArgumentList
lang: cpp
context: method_invoke
recursive: false
capture: literalsOnly
# condition: ^templateArgumentList$
# tag: TemplateArgumentList
Below is the default configuration from: antlr/scala.yaml.
# 🔟❎ 'compile' Scala symbol scanner configuration
# The 'scala' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from Scala source files.
# IMPORTANT: while this scanner is enabled by default, the dedicated Scala scanner
# takes precedence over it unless that dedicated scanner is disabled.
# To learn more see https://doc.log10x.com/compile/scanner/scalameta
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR .g4 grammar file used
# to tokenize and construct an AST for the target Scala file.
lang: scala
fileExt:
- .scala
- .sc
grammarFile: grammar/scala.g4
rootRule: compilationUnit
reserved:
- interface
- class
- enum
- this
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
# The 'rule' list defines the nodes within a scala AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: classParamClauses
lang: scala
context: class
recursive: false
capture: literalsOnly
- name: patVarDef
lang: scala
context: var_assign
recursive: true
capture: literalsOnly
- name: argumentExprs
lang: scala
context: method_invoke
recursive: true
capture: literalsOnly
- name: annotation
lang: scala
context: annotation_invoke
recursive: true
capture: literalsOnly
- name: objectDef
lang: scala
context: class
recursive: true
capture: allSymbols
- name: paramClauses
lang: scala
context: class
recursive: false
capture: literalsOnly
- name: classTemplateOpt
lang: scala
context: class
recursive: false
capture: literalsOnly
- name: funDcl
lang: scala
context: method_decl
recursive: false
capture: allSymbols
- name: classDef
lang: scala
context: class
recursive: true
capture: allSymbols
- name: qualId
lang: scala
context: package
recursive: true
capture: allSymbols
- name: traitDef
lang: scala
context: class
recursive: true
capture: allSymbols
- name: funDef
lang: scala
context: method_decl
recursive: true
capture: allSymbols
subRule: funSig
- name: type_
lang: scala
context: method_decl
recursive: false
capture: literalsOnly
- name: blockStat
lang: scala
context: var_assign
recursive: true
capture: literalsOnly
- name: templateStat
lang: scala
context: method_invoke
recursive: true
capture: literalsOnly
Below is the default configuration from: antlr/go.yaml.
# 🔟❎ 'compile' Go ANTLR symbol scanner configuration
# The 'go' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from Go source files.
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR-generated parser class used
# to tokenize and construct an AST for the target Go file.
lang: go
fileExt:
- .go
parserClass: com.log10x.antlr.generated.golang.GoParser
lexerClass: com.log10x.antlr.generated.golang.GoLexer
rootRule: sourceFile
reserved:
- package
- func
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
# The 'rule' list defines the nodes within a go AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: constSpec
lang: go
context: enum
recursive: false
capture: allSymbols
subRule: identifierList
- name: returnStmt
lang: go
context: var_assign
recursive: true
capture: literalsOnly
- name: assignment
lang: go
context: var_assign
recursive: true
capture: literalsOnly
- name: expressionStmt
lang: go
context: method_invoke
recursive: true
capture: literalsOnly
- name: packageClause
lang: go
context: package
recursive: false
capture: allSymbols
- name: arguments
lang: go
context: method_invoke
recursive: true
capture: literalsOnly
- name: typeSpec
lang: go
context: class
recursive: false
capture: allSymbols
- name: functionDecl
lang: go
context: method_decl
recursive: false
capture: allSymbols
- name: methodDecl
lang: go
context: method_decl
recursive: false
capture: allSymbols
Below is the default configuration from: antlr/javascript.yaml.
# 🔟❎ 'compile' JavaScript ANTLR symbol scanner configuration
# The 'javascript' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from JavaScript source files.
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR-generated parser class used
# to tokenize and construct an AST for the target JavaScript file.
lang: javascript
fileExt:
- .js
parserClass: com.log10x.antlr.parsers.JavaScriptDecoratedParser
lexerClass: com.log10x.antlr.generated.javascript.JavaScriptLexer
rootRule: program2
reserved:
- function
- constructor
- class
maxLineLength: 4096
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
# The 'rule' list defines the nodes within a js AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: functionDeclaration
lang: javascript
context: method_decl
recursive: false
capture: allSymbols
subRule: identifier
- name: returnStatement
lang: javascript
context: var_assign
recursive: true
capture: literalsOnly
- name: classDeclaration
lang: javascript
context: class
recursive: false
capture: allSymbols
subRule: identifier
- name: methodDefinition
lang: javascript
context: method_decl
recursive: true
capture: allSymbols
subRule: identifier
- name: formalParameterList
lang: javascript
context: method_decl
recursive: false
capture: literalsOnly
- name: functionBody
lang: javascript
context: method_invoke
recursive: false
capture: literalsOnly
- name: functionDecl
lang: javascript
context: method_decl
recursive: false
capture: allSymbols
- name: anoymousFunctionDecl
lang: javascript
context: method_decl
recursive: false
capture: allSymbols
- name: argumentsExpression
lang: javascript
context: method_invoke
recursive: true
capture: literalsOnly
- name: variableDeclaration
lang: javascript
context: var_assign
recursive: true
capture: literalsOnly
- name: assignmentExpression
lang: javascript
context: var_assign
recursive: true
capture: literalsOnly
Below is the default configuration from: antlr/python.yaml.
# 🔟❎ 'compile' Python symbol scanner configuration
# The 'python' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from Python source files.
# NOTE: while this scanner is enabled by default, the 'pythonAST' scanner, which utilizes
# the Python runtime's built-in AST parser, takes precedence over this scanner
# unless a Python runtime is not installed or the 'pythonAST' scanner is disabled.
# To learn more, see https://doc.log10x.com/compile/scanner/pythonAST
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR-generated parser class used
# to tokenize and construct an AST for the target Python file.
lang: python
fileExt:
- .py
parserClass: com.log10x.antlr.generated.python.PythonParser
lexerClass: com.log10x.antlr.generated.python.PythonLexer
rootRule: root
reserved:
- def
- __str__
- __init__
- class
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
# The 'rule' list defines the nodes within a Python AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: classdef
lang: python
context: class
recursive: false
capture: allSymbols
subRule: name
condition: (^Enum$|^enum$)
tag: enumLiteral
- name: testlist_star_expr
lang: python
context: enum
recursive: true
capture: allSymbolsIfMatchCond
ifTag: enumLiteral
- name: trailer
lang: python
context: method_invoke
recursive: true
capture: literalsOnly
subRule: name
- name: return_stmt
lang: python
context: var_assign
recursive: true
capture: literalsOnly
- name: decorator
lang: python
context: annotation_invoke
recursive: true
capture: literalsOnly
subRule: arglist
- name: arglist
lang: python
context: method_invoke
recursive: true
capture: literalsOnly
- name: funcdef
lang: python
context: method_decl
recursive: false
capture: allSymbols
subRule: name
- name: assign_part
lang: python
context: var_assign
recursive: true
capture: literalsOnly
- name: dictorsetmaker
lang: python
context: var_assign
recursive: true
capture: literalsOnly
Below is the default configuration from: antlr/csharp.yaml.
# 🔟❎ 'compile' C# ANTLR symbol scanner configuration
# The 'csharp' configuration instructs the ANTLR scanner which symbol values (e.g. class/func names, code constants)
# to extract from C# source files.
# Set the 10x pipeline to 'compile'
tenx: compile
# ============================== ANTLR Options ================================
antlr:
# The 'antlrLang' module is defined in: https://doc.log10x.com/compile/scanner/antlr/langs
# These values define the ANTLR-generated parser class used
# to tokenize and construct an AST for the target C# file.
lang: csharp
fileExt:
- .csx
- .cs
parserClass: com.log10x.antlr.generated.csharp.CSharpParser
lexerClass: com.log10x.antlr.generated.csharp.CSharpLexer
# 'charStreamClass' is set to remove language elements introduced after C# 6.0, the latest version supported by this grammar.
# To learn more see https://github.com/antlr/grammars-v4/tree/master/csharp#readme
charStreamClass: com.log10x.antlr.parsers.CSharpDowngrader
rootRule: compilation_unit
reserved:
- namespace
- interface
- class
- enum
# 'lineFilters' specifies a list of regex patterns for skipping input lines
lineFilters:
- ^\s*(//.*|/\*.*\*/)$
# Remove unsupported C# v10 file-scoped namespaces
- ^namespace.*;$
# Remove unsupported C# v9 Target-typed new operator
# - "(?<!\\w)new\\s*\\(\\s*\\)\\s*;"
- ".*new\\(.*"
# The 'rule' list defines the nodes within a csharp AST tree that are captured
# and added to the output symbol unit.
# To learn more see: https://doc.log10x.com/compile/scanner/antlr/rules/
rule:
- name: interface_definition
lang: csharp
context: class
recursive: false
capture: allSymbols
subRule: identifier
- name: enum_member_declaration
lang: csharp
context: enum
recursive: true
capture: allSymbols
- name: enum_definition
lang: csharp
context: class
recursive: false
capture: allSymbols
subRule: identifier
- name: namespace_body
lang: csharp
context: package
recursive: true
capture: literalsOnly
- name: method_invocation
lang: csharp
context: method_invoke
recursive: true
capture: literalsOnly
- name: object_creation_expression
lang: csharp
context: method_invoke
recursive: true
capture: literalsOnly
- name: interpolated_regular_string_part
lang: csharp
context: method_invoke
recursive: false
capture: allSymbols
subRule: interpolated_regular_string_part
- name: namespace_declaration
lang: csharp
context: package
recursive: true
capture: allSymbols
- name: statement_list
lang: csharp
context: method_invoke
recursive: true
capture: literalsOnly
- name: constant_declaration
lang: csharp
context: var_assign
recursive: true
capture: literalsOnly
- name: method_declaration
lang: csharp
context: method_decl
recursive: true
capture: allSymbols
subRule: identifier
- name: method_body
lang: csharp
context: method_decl
recursive: false
capture: literalsOnly
#- name: method_member_name
# lang: csharp
# context: method_decl
# recursive: true
# capture: allSymbols
- name: local_variable_initializer
lang: csharp
context: var_assign
recursive: true
capture: literalsOnly
- name: returnStatement
lang: csharp
context: var_assign
recursive: true
capture: literalsOnly
- name: class_definition
lang: csharp
context: class
recursive: false
capture: allSymbols
subRule: identifier
- name: attribute_argument
lang: csharp
context: annotation_invoke
recursive: true
capture: literalsOnly
Options
Specify the options below to configure the ANTLR scanner:
| Name | Description |
|---|---|
| antlrParseUnitTimeout | Source file parse timeout |
antlrParseUnitTimeout
Source file parse timeout.
| Type | Default |
|---|---|
| String | "" |
Defines the timeout interval for parsing and scanning a source/binary input file before aborting. Default value: 60sec.
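For instance, the timeout could be set explicitly as sketched below. The top-level placement of the key and the `60sec` duration format are assumptions based on the option name and the default value quoted above; the accepted syntax should be confirmed against the module definition.

```yaml
# Hypothetical override; key placement and duration format are assumed.
tenx: compile
# Abort parsing and scanning of any single input file after this interval.
antlrParseUnitTimeout: 60sec
```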
This module is defined in antlr/module.yaml.