ANTLR Language
Defines language configurations for scanning symbols using ANTLR grammars.
Specify a grammar or parser to produce an AST for symbol extraction via configurable rules.
Options
Specify the options below to configure one or more ANTLR languages:
| Name | Description | Category |
|---|---|---|
| antlrLang | Programming language name | General |
| antlrFileExt | Language file extensions | General |
| antlrRootRule | Syntax root rule | General |
| antlrGrammarFile | ANTLR .g4 grammar file | G4 grammar |
| antlrLexerFile | ANTLR .g4 lexer file | G4 grammar |
| antlrParserClass | Java parser class name | Compiled grammar |
| antlrLexerClass | Java lexer class name | Compiled grammar |
| antlrTokenStreamClass | Java token stream class name | Compiled grammar |
| antlrCharStreamClass | ANTLR CharStream factory class name | Compiled grammar |
| antlrReserved | Language-specific keywords | Filter |
| antlrLineFilters | Regex patterns for skipping input lines | Filter |
| antlrMaxLineLength | Max line length to parse | Filter |
General
antlrLang
Programming language name.
| Type | Required | Category |
|---|---|---|
| String | ✔ | General |
Defines a logical name of the programming language processed by this scanner (e.g., 'cpp', 'go').
antlrFileExt
Language file extensions.
| Type | Required | Category |
|---|---|---|
| List | ✔ | General |
Defines the file extensions to be scanned by this ANTLR syntax (e.g., '.cpp', '.js').
antlrRootRule
Syntax root rule.
| Type | Required | Category |
|---|---|---|
| String | ✔ | General |
Provides the root grammar rule of the ANTLR-generated parser class, used as the entry point for iterating over the root of the AST structure. To learn more, see ANTLR start rules.
G4 grammar
antlrGrammarFile
ANTLR .g4 grammar file.
| Type | Default | Category |
|---|---|---|
| String | "" | G4 grammar |
Provides an ANTLR grammar file that is used directly to parse source files for the language. This value takes precedence over an 'antlrParserClass' value. This option enables parsing source files without first compiling the grammar into .java files, and applies to grammars that do not have language-specific extensions. If this value is not specified, 'antlrParserClass' must be set.
antlrLexerFile
ANTLR .g4 lexer file.
| Type | Default | Category |
|---|---|---|
| String | "" | G4 grammar |
Provides an optional ANTLR lexer file to use with 'antlrGrammarFile'.
Compiled grammar
antlrParserClass
Java parser class name.
| Type | Default | Category |
|---|---|---|
| String | "" | Compiled grammar |
Provides the fully qualified Java class name of the ANTLR-generated parser class used to parse the token stream from the input source. This option applies when the ANTLR syntax requires additional Java code to parse target sources, such as custom parser, lexer, or token stream classes.
To learn more, see https://ocw.mit.edu/ans7870/6/6.005/s16/classes/18-parser-generators/ and https://www.baeldung.com/java-antlr#1-prepare-a-grammar-file. For an example implementation, see com.log10x.antlr.generated.cpp.CPP14Parser.
If this value is not specified, 'antlrGrammarFile' must be set.
antlrLexerClass
Java lexer class name.
| Type | Default | Category |
|---|---|---|
| String | "" | Compiled grammar |
Provides the fully qualified Java class name of the ANTLR-generated lexer class used to draw input symbols from a character stream. For an example implementation, see com.log10x.antlr.generated.cpp.CPP14Lexer.
antlrTokenStreamClass
Java token stream class name.
| Type | Default | Category |
|---|---|---|
| String | "" | Compiled grammar |
Provides an optional Java class name for the ANTLR-generated token stream to bridge between the lexer and parser. This option supports ANTLR syntaxes that require a custom token stream.
The class must be a subclass of ANTLR CommonTokenStream.
The token stream class must define a constructor that receives an ANTLR TokenSource reference and a timeout in milliseconds ('timeout').
If 'timeout' is exceeded from the moment of instantiation to a subsequent call to the stream's 'seek' or 'consume' methods, the stream should throw an exception to indicate a timeout to halt the scanning for the current source file.
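The constructor and timeout contract described above could be sketched as follows. This is a minimal illustration and not the module's own implementation. The class name TimeoutTokenStream, the parameter name timeoutMillis, and the choice of ANTLR's ParseCancellationException are assumptions. The text only requires a CommonTokenStream subclass that throws some exception once the timeout has elapsed.

```java
import java.util.concurrent.TimeUnit;

import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.TokenSource;
import org.antlr.v4.runtime.misc.ParseCancellationException;

// Hypothetical example class. Only the CommonTokenStream superclass and the
// (TokenSource, timeout) constructor shape come from the description above.
// For real use, declare the class public in its own source file.
class TimeoutTokenStream extends CommonTokenStream {

    // Absolute deadline, computed once at instantiation.
    private final long deadlineNanos;

    public TimeoutTokenStream(TokenSource tokenSource, long timeoutMillis) {
        super(tokenSource);
        this.deadlineNanos =
            System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    // Throws once the deadline has passed. The exception type is an assumption.
    private void checkDeadline() {
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new ParseCancellationException("token stream timed out");
        }
    }

    @Override
    public void consume() {
        checkDeadline();
        super.consume();
    }

    @Override
    public void seek(int index) {
        checkDeadline();
        super.seek(index);
    }
}
```

Computing the deadline in the constructor keeps the check in 'consume' and 'seek' down to a single clock read, which matters because the parser calls these methods for every token.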
antlrCharStreamClass
ANTLR CharStream factory class name.
| Type | Default | Category |
|---|---|---|
| String | CharStreams | Compiled grammar |
Provides a fully qualified Java class name that acts as a factory for CharStream instances by declaring a factory method that returns a CharStream.
If not specified, defaults to CharStreams.fromStream.
This option provides a mechanism for preprocessing input code before it is parsed by the target ANTLR grammar's lexer and parser.
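As an illustration of such preprocessing, the sketch below blanks out lines that start with '#' before handing the text to ANTLR. The class name, the static 'fromStream(InputStream)' signature (modeled on CharStreams.fromStream), the UTF-8 decoding, and the preprocessing rule itself are all assumptions made for this example. The exact method the module expects is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;

// Hypothetical factory class. The method shape mirrors
// CharStreams.fromStream(InputStream) and is an assumption.
final class DirectiveStrippingCharStreams {

    private DirectiveStrippingCharStreams() {
    }

    public static CharStream fromStream(InputStream in) throws IOException {
        // Assumes UTF-8 encoded sources (requires Java 9+ for readAllBytes).
        String source = new String(in.readAllBytes(), StandardCharsets.UTF_8);

        // Replace directive lines with empty lines rather than removing them,
        // so that line numbers reported by the parser still match the file.
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].trim().startsWith("#")) {
                lines[i] = "";
            }
        }
        return CharStreams.fromString(String.join("\n", lines));
    }
}
```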
Filter
antlrReserved
Language-specific keywords.
| Type | Default | Category |
|---|---|---|
| List | [] | Filter |
Provides a list of reserved words in the target language syntax (e.g., 'class', 'int') to skip when scanning for symbol values.
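The effect of this option can be pictured with the short sketch below, which drops reserved words from a list of candidate symbols. It is illustrative only: the class and method names are hypothetical, and exact, case-sensitive matching is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the module.
final class ReservedWordFilter {

    private ReservedWordFilter() {
    }

    // Returns the candidates that are not reserved words, in original order.
    // Assumes exact, case-sensitive comparison.
    static List<String> dropReserved(List<String> candidates, Set<String> reserved) {
        List<String> symbols = new ArrayList<>();
        for (String candidate : candidates) {
            if (!reserved.contains(candidate)) {
                symbols.add(candidate);
            }
        }
        return symbols;
    }
}
```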
antlrLineFilters
Regex patterns for skipping input lines.
| Type | Default | Category |
|---|---|---|
| List | [] | Filter |
Specifies a list of patterns for filtering out matching lines of input code. This option can be used to filter out comments as well as language constructs unsupported by the target grammar. A line is skipped if any pattern in the list matches it.
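A minimal sketch of this kind of filtering is shown below. The class and method names are hypothetical. It assumes that a pattern "matches" when it is found anywhere in the line (Matcher.find) rather than requiring a full-line match, which may differ from the module's behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the module.
final class LineFilterSketch {

    private LineFilterSketch() {
    }

    // Returns the lines that no pattern matches, in original order.
    static List<String> keepLines(List<String> lines, List<String> regexes) {
        // Compile each pattern once, up front.
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex));
        }

        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            boolean skip = false;
            for (Pattern pattern : patterns) {
                // Assumption: a partial match is enough to skip the line.
                if (pattern.matcher(line).find()) {
                    skip = true;
                    break;
                }
            }
            if (!skip) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

For example, the patterns `^\s*//` and `^\s*#` would skip single-line comments and preprocessor directives in C-like sources.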
antlrMaxLineLength
Max line length to parse.
| Type | Default | Category |
|---|---|---|
| Number | 0 | Filter |
Specifies the maximum number of characters an input line may contain for it to be parsed. This option provides a way to skip minified code (e.g., .js), which may slow down the ANTLR parser.
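The length check can be sketched as below. The class and method names are hypothetical. Treating the default value 0 as "no limit" is an assumption made for this example and is not stated above.

```java
// Hypothetical helper, not part of the module.
final class MaxLineLengthSketch {

    private MaxLineLengthSketch() {
    }

    // Returns true if the line is short enough to be parsed.
    // Assumption: a non-positive limit (such as the default 0) disables the check.
    static boolean shouldParse(String line, int maxLineLength) {
        return maxLineLength <= 0 || line.length() <= maxLineLength;
    }
}
```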
This module is defined in langs/module.yaml.