欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

An Example of ANTLR

程序员文章站 2022-07-14 13:40:20
...

ANTLR Introduction

Create a new maven project

add dependency in pom.xml

    <dependencies>
        <dependency>
            <groupId>org.antlr</groupId>
            <artifactId>antlr4-maven-plugin</artifactId>
            <version>4.7.1</version>
        </dependency>

    </dependencies>

Create ANTLTR file LabeledExpr.g4

/** Grammars always start with a grammar header. This grammar is called LabeledExpr 
 * and must match the file name:LabeledExpr.
 */
grammar LabeledExpr;

/** A rule called prog  */
prog:   stat+ ;

// A rule called stat
stat:  expr NEWLINE                # printExpr  // '#' indicate a label
      |  ID '=' expr NEWLINE       # assign
      | NEWLINE                    # blank
      ;
// expr rule      
expr: expr op=('*' | '/') expr          # MulDiv
      |  expr op=('+' | '-')  expr      # AddSub
      |  INT                            # int      
      |  ID                             # id
      | '('  expr ')'                   # parens
      ;
/*
 * parser rules start with lower case letters, lexer rules with uppercase  * 
 */      
MUL :   '*' ; // ASSIGNS TOKEN NAME TO '*' USED ABOVE IN GRAMMAR
DIV   : '/' ; 
ADD :       '+' ;
SUB :       '-' ;

ID   :    [a-zA-Z]+ ;
INT :    [0-9]+ ; // Define token INT as one or more digits.
NEWLINE: '\r' ? '\n' ;
WS :    [ \t]+ -> skip; // Define white space rule, toss it out

Parsing Terms

This chapter introduced a number of important language recognition terms.

  • Language A language is a set of valid sentences; sentences are composed of phrases,
    which are composed of subphrases, and so on.
  • Grammar A grammar formally defines the syntax rules of a language. Each rule in
    a grammar expresses the structure of a subphrase.
  • Syntax tree or parse tree This represents the structure of the sentence where each
    subtree root gives an abstract name to the elements beneath it. The subtree roots
    correspond to grammar rule names. The leaves of the tree are symbols or tokens
    of the sentence.
  • Token A token is a vocabulary symbol in a language; these can represent a category
    of symbols such as “identifier” or can represent a single operator or keyword.
  • Lexer or tokenizer This breaks up an input character stream into tokens. A lexer
    performs lexical analysis.
  • Parser A parser checks sentences for membership in a specific language by checking
    the sentence’s structure against the rules of a grammar. The best analogy for
    parsing is traversing a maze, comparing words of a sentence to words written
    along the floor to go from entrance to exit. ANTLR generates top-down parsers
    called ALL(*) that can use all remaining input symbols to make decisions. Top-
    down parsers are goal-oriented and start matching at the rule associated with
    the coarsest construct, such as program or inputFile .
  • Recursive-descent parser This is a specific kind of top-down parser implemented
    with a function for each rule in the grammar.
    Lookahead Parsers use lookahead to make decisions by comparing the symbols that
    begin each alternative.

Generate files

antlr4 LabeledExpr.g4 -visitor
Generate the following files

-rw-r--r--. 1 houzhizhen root  3884 6月   6 17:53 LabeledExprBaseListener.java
-rw-r--r--. 1 houzhizhen root  1545 6月   6 17:53 LabeledExpr.interp
-rw-r--r--. 1 houzhizhen root  2199 6月   6 17:53 LabeledExprLexer.interp
-rw-r--r--. 1 houzhizhen root  4181 6月   6 17:53 LabeledExprLexer.java
-rw-r--r--. 1 houzhizhen root   115 6月   6 17:53 LabeledExprLexer.tokens
-rw-r--r--. 1 houzhizhen root  3894 6月   6 17:53 LabeledExprListener.java
-rw-r--r--. 1 houzhizhen root 15231 6月   6 17:53 LabeledExprParser.java
-rw-r--r--. 1 houzhizhen root   115 6月   6 17:53 LabeledExpr.tokens

Parsing Process

An Example of ANTLR

Use grun LabeledExpr prog -tokens to Generate tokens

After input grun LabeledExpr prog -tokens command, you can input the expression you let ANTLR parse. After input 3 * (1 + 2), input CTRL + D to terminate the input.

$ grun LabeledExpr  prog -tokens
30 * ( 123 + 4567)
[@0,0:1='30',<INT>,1:0]
[@1,3:3='*',<'*'>,1:3]
[@2,5:5='(',<'('>,1:5]
[@3,7:9='123',<INT>,1:7]
[@4,11:11='+',<'+'>,1:11]
[@5,13:16='4567',<INT>,1:13]
[@6,17:17=')',<')'>,1:17]
[@7,18:18='\n',<NEWLINE>,1:18]
[@8,19:18='<EOF>',<EOF>,2:0]

The meaning of tokens

[@0,0:1='30',<INT>,1:0] indicates that the token is first token( indexed from 0), goes from character position 0 to 1, has text ‘30’, has token type ‘INT’, is on line 1 from 1, and is at character position 0.

Use grun LabeledExpr prog -gui to display the parse tree visually in a dialog box

Input 30 * (123 + 4567) and CTRL + D will launch the following diaglog box.
An Example of ANTLR

Input

a = 2
b = 3
c = 4
a * (b+c)

and CTRL + D will launch the following diaglog box.
An Example of ANTLR

ANTLR Core Notation

Syntax Description
x Match token, rule reference, or subrule x .
x y ... z Match a sequence of rule elements.
(... | ... | ...) Subrule with multiple alternatives.
x ? Match x or skip it.
x * Match x zero or more times.
x + Match x one or more times.
r : ... ; Define rule r .
r : ... | ... | ... ; Define rule r with multiple alternatives.

Core Language Patterns

Pattern Name Description

  • Sequence
    This is a finite or arbitrarily long sequence of tokens or subphrases. Examples include variable declarations (type followed by identifier) and lists of integers. Here are some sample implementations:
    x y ... z //
    '[' INT+ ']' // Matlab vector of integers
  • Sequence with terminator.
    This is an arbitrarily long, potentially empty sequence of tokens or subphrases separated by a token, usually a semicolon or newline. Examples include statement lists from C-like languages and rows of data terminated with newlines. Here are some sample implementations:
    (statement ';')* // Java statement list
    (row '\n')* // Lines of data
  • Sequence with separator
    This is a nonempty arbitrarily long sequence of tokens or subphrases separated by a token, usually a comma, semicolon, or period. Examples include function argument definition lists, function call argument lists, languages where statements are separated but not terminated, and directory names. Here are some sample implementations:
    expr (',' expr)* // function call arguments
    ( expr (',' expr)* )? // optional function call arguments
    '/'? name ('/' name)* //simplified directory name
    stat ('.' stat)*// SmallTalk statement list
  • Choice
    This is a set of alternative phrases. Examples include the different kinds of types, statements, expressions, or XML tags. Here are some sample implementations:
    type : 'int' | 'float' ;
    stat : ifstat | whilestat | 'return' expr ';' ;
    expr : '(' expr ')' | INT | ID ;
    tag : '<' Name attribute* '>' | '<' '/' Name '>' ;

  • Token dependency
    The presence of one token requires the presence of one or more subsequent tokens. Examples include matching parentheses,curly braces, square bracket, and angle brackets. Here are some sample implementations:
    '(' expr ')' // nested expression
    ID '[' expr ']' //array index
    '{' stat* '}' //statements grouped in curlies
    '<' ID (',' ID)* '>' //generic type specifier

  • Nested phrase
    This is a self-similar language structure. Examples include expressions, nested Java classes, nested code blocks, and nested Python function definitions. Here are some sample implementations:
    expr : '(' expr ')' | ID ;
    classDef : 'class' ID '{' (classDef|method|field) '}' ;

Calculate values with a Visitor

import java.io.FileInputStream;
import java.io.InputStream;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;

public class Calc {
    public static void main(String[] args) throws Exception {
        String inputFile = null;
        if ( args.length>0 ) inputFile = args[0];
        InputStream is = System.in;
        if ( inputFile!=null ) is = new FileInputStream(inputFile);
        ANTLRInputStream input = new ANTLRInputStream(is);
        LabeledExprLexer lexer = new LabeledExprLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        LabeledExprParser parser = new LabeledExprParser(tokens);
        ParseTree tree = parser.prog(); // parse
        //TreeUitls.printlnTree(tree,1);
        EvalVisitor eval = new EvalVisitor();
        eval.visit(tree);
    }
}

Visitor interface

$ antrl4 LabeledExpr.g4 -visitor generates a visitor interface with a method for each labeled alternative name.

// Generated from LabeledExpr.g4 by ANTLR 4.7.1
import org.antlr.v4.runtime.tree.ParseTreeListener;

/**
 * This interface defines a complete listener for a parse tree produced by
 * {@link LabeledExprParser}.
 */
public interface LabeledExprListener extends ParseTreeListener {
    /**
     * Enter a parse tree produced by {@link LabeledExprParser#prog}.
     * @param ctx the parse tree
     */
    void enterProg(LabeledExprParser.ProgContext ctx);
    /**
     * Exit a parse tree produced by {@link LabeledExprParser#prog}.
     * @param ctx the parse tree
     */
    void exitProg(LabeledExprParser.ProgContext ctx);
    /**
     * Enter a parse tree produced by the {@code printExpr}
     * labeled alternative in {@link LabeledExprParser#stat}.
     * @param ctx the parse tree
     */
    void enterPrintExpr(LabeledExprParser.PrintExprContext ctx);
    /**
     * Exit a parse tree produced by the {@code printExpr}
     * labeled alternative in {@link LabeledExprParser#stat}.
     * @param ctx the parse tree
     */
    void exitPrintExpr(LabeledExprParser.PrintExprContext ctx);
    /**
     * Enter a parse tree produced by the {@code assign}
     * labeled alternative in {@link LabeledExprParser#stat}.
     * @param ctx the parse tree
     */
    void enterAssign(LabeledExprParser.AssignContext ctx);
    /**
     * Exit a parse tree produced by the {@code assign}
     * labeled alternative in {@link LabeledExprParser#stat}.
     * @param ctx the parse tree
     */
    void exitAssign(LabeledExprParser.AssignContext ctx);
    /**
     * Enter a parse tree produced by the {@code blank}
     * labeled alternative in {@link LabeledExprParser#stat}.
     * @param ctx the parse tree
     */
    void enterBlank(LabeledExprParser.BlankContext ctx);
    /**
     * Exit a parse tree produced by the {@code blank}
     * labeled alternative in {@link LabeledExprParser#stat}.
     * @param ctx the parse tree
     */
    void exitBlank(LabeledExprParser.BlankContext ctx);
    /**
     * Enter a parse tree produced by the {@code parens}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void enterParens(LabeledExprParser.ParensContext ctx);
    /**
     * Exit a parse tree produced by the {@code parens}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void exitParens(LabeledExprParser.ParensContext ctx);
    /**
     * Enter a parse tree produced by the {@code MulDiv}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void enterMulDiv(LabeledExprParser.MulDivContext ctx);
    /**
     * Exit a parse tree produced by the {@code MulDiv}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void exitMulDiv(LabeledExprParser.MulDivContext ctx);
    /**
     * Enter a parse tree produced by the {@code AddSub}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void enterAddSub(LabeledExprParser.AddSubContext ctx);
    /**
     * Exit a parse tree produced by the {@code AddSub}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void exitAddSub(LabeledExprParser.AddSubContext ctx);
    /**
     * Enter a parse tree produced by the {@code id}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void enterId(LabeledExprParser.IdContext ctx);
    /**
     * Exit a parse tree produced by the {@code id}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void exitId(LabeledExprParser.IdContext ctx);
    /**
     * Enter a parse tree produced by the {@code int}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void enterInt(LabeledExprParser.IntContext ctx);
    /**
     * Exit a parse tree produced by the {@code int}
     * labeled alternative in {@link LabeledExprParser#expr}.
     * @param ctx the parse tree
     */
    void exitInt(LabeledExprParser.IntContext ctx);
}

Visitor Base

$ antrl4 LabeledExpr.g4 -visitor generates a default visitor implementation called LabeledExprBaseVisitor

// Generated from LabeledExpr.g4 by ANTLR 4.7.1
import org.antlr.v4.runtime.tree.AbstractParseTreeVisitor;

/**
 * This class provides an empty implementation of {@link LabeledExprVisitor},
 * which can be extended to create a visitor which only needs to handle a subset
 * of the available methods.
 *
 * @param <T> The return type of the visit operation. Use {@link Void} for
 * operations with no return type.
 */
public class LabeledExprBaseVisitor<T> extends AbstractParseTreeVisitor<T> implements LabeledExprVisitor<T> {
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation returns the result of calling
     * {@link #visitChildren} on {@code ctx}.</p>
     */
    @Override public T visitProg(LabeledExprParser.ProgContext ctx) { return visitChildren(ctx); }
    // ...
}

EvalVisitor

public class EvalVisitor extends LabeledExprBaseVisitor<Integer> {
    Map<String, Integer> memory = new HashMap<String, Integer>();

    @Override
    public Integer visitAssign(LabeledExprParser.AssignContext ctx) {
        String id = ctx.ID().getText();
        int value = visit(ctx.expr());
        memory.put(id, value);
        return value;
    }

    @Override
    public Integer visitPrintExpr(LabeledExprParser.PrintExprContext ctx) {
        Integer value = visit(ctx.expr());
        System.out.println(value);
        return value;
    }

    @Override
    public Integer visitInt(LabeledExprParser.IntContext ctx) {
        return Integer.valueOf(ctx.INT().getText());
    }

    @Override
    public Integer visitId(LabeledExprParser.IdContext ctx) {
        String id = ctx.ID().getText();
        if (memory.containsKey(id)) {
            return memory.get(id);
        }
        return 0;
    }

    @Override
    public Integer visitMulDiv(LabeledExprParser.MulDivContext ctx) {
        int left = visit(ctx.expr(0));
        int right = visit(ctx.expr(1));
        if (ctx.op.getType() == LabeledExprParser.MUL) {
            return left * right;
        } else {
            return left / right;
        }
    }

    @Override
    public Integer visitAddSub(LabeledExprParser.AddSubContext ctx) {
        int left = visit(ctx.expr(0));
        int right = visit(ctx.expr(1));
        if (ctx.op.getType() == LabeledExprParser.ADD) {
            return left + right;
        } else {
            return left - right;
        }
    }

    @Override
    public Integer visitParens(LabeledExprParser.ParensContext ctx) {
        return visit(ctx.expr());
    }

}

Tree Listener

The biggest difference between the listener and visitor mechanisms is that listener methods are called by the ANTLR-provided walker object, whereas visitor methods must walk their children with explicit visit calls.

Antlr generates listener interface and base listener as visitor.

Generated LabeledExprBaseListener

import java.util.HashMap;
import java.util.Map;
import java.util.Stack;

import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.tree.ErrorNode;
import org.antlr.v4.runtime.tree.TerminalNode;

/**
 * Created by houzhizhen on 18-6-8.
 */
public class LabeledExprListenerImpl extends LabeledExprBaseListener {
    Map<String, Integer> memory = new HashMap<String, Integer>();
    Stack<Integer> stack = new Stack<Integer>();
    @Override public void enterProg(LabeledExprParser.ProgContext ctx) {

    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitProg(LabeledExprParser.ProgContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterPrintExpr(LabeledExprParser.PrintExprContext ctx) {
         ctx.expr();
    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitPrintExpr(LabeledExprParser.PrintExprContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterAssign(LabeledExprParser.AssignContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitAssign(LabeledExprParser.AssignContext ctx) {
        String id = ctx.ID().getText();

        int value = 0;
        memory.put(id, value);

    }

    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterBlank(LabeledExprParser.BlankContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitBlank(LabeledExprParser.BlankContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterParens(LabeledExprParser.ParensContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitParens(LabeledExprParser.ParensContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterMulDiv(LabeledExprParser.MulDivContext ctx) {


    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitMulDiv(LabeledExprParser.MulDivContext ctx) {
        int right = stack.pop();
        int left = stack.pop();
        if (ctx.op.getType() == LabeledExprParser.MUL) {
            stack.push(left * right);
        } else {
            stack.push(left / right) ;
        }
    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterAddSub(LabeledExprParser.AddSubContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitAddSub(LabeledExprParser.AddSubContext ctx) {
        int right = stack.pop();
        int left = stack.pop();
        if (ctx.op.getType() == LabeledExprParser.ADD) {
            stack.push(left + right);
        } else {
            stack.push(left + right) ;
        }
    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterId(LabeledExprParser.IdContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitId(LabeledExprParser.IdContext ctx) {
        Integer value = memory.get(ctx.ID().getText());
        if (value == null) {
            throw new RuntimeException(ctx.ID().getText() + " is not specified a value yet.");
        }
        stack.push(value);
    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterInt(LabeledExprParser.IntContext ctx) {
        stack.push( Integer.valueOf(ctx.INT().getText()));
    }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitInt(LabeledExprParser.IntContext ctx) { }

    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void enterEveryRule(ParserRuleContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void exitEveryRule(ParserRuleContext ctx) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void visitTerminal(TerminalNode node) { }
    /**
     * {@inheritDoc}
     *
     * <p>The default implementation does nothing.</p>
     */
    @Override public void visitErrorNode(ErrorNode node) { }

    public int getValue() {
        return 0;
    }
}

CalculatorWithListenerTest

public static void main(String[] args) throws Exception {
        String inputFile = null;
        if ( args.length>0 ) inputFile = args[0];
        InputStream is = System.in;
        if ( inputFile!=null ) is = new FileInputStream(inputFile);
        ANTLRInputStream input = new ANTLRInputStream(is);
        LabeledExprLexer lexer = new LabeledExprLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        LabeledExprParser parser = new LabeledExprParser(tokens);
        ParseTree tree = parser.prog(); // parse
        ParseTreeWalker walker = new ParseTreeWalker();
        LabeledExprListenerImpl listener = new LabeledExprListenerImpl();
        walker.walk(listener , tree);
        System.out.println(listener.stack.pop());
    }

The input and output of CalculatorWithListenerTest.

2 * (2 + 14 /2)
^D
18

ANTLR History