1. Background



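2. Extracting comments

Without further ado: how do we extract the comments from source code? Section 12.1, "Broadcasting Tokens on Different Channels", of The Definitive ANTLR 4 Reference is devoted to exactly this.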

2.1 Grammar definition: routing tokens to channels



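A grammar normally discards whitespace and comments with the skip command:

WS  : [ \t\n\r]+ -> skip ;

SL_COMMENT
    : '//' .*? '\n' -> skip ;

With skip the lexer produces no token at all, so the comment text is lost. To keep it, replace skip with a channel command such as -> channel(COMMENTS), where COMMENTS is a channel declared in the grammar (the Java code in section 2.2 refers to it as CymbolLexer.COMMENTS).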
The figure below shows the effect. The default channel is channel 0; every user-defined channel is a hidden channel.
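As a rough illustration of what routing to a channel means, here is a toy model in plain Java. It is not ANTLR code: the class `ChannelSketch`, the `Tok` type, and the channel number 2 are all made up for this sketch. Whitespace is dropped (like `-> skip`), while `//` comments are kept as tokens on a separate channel that a parser reading only channel 0 would never see.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of lexer channels; hypothetical names, not the ANTLR API. */
class ChannelSketch {
    static final int DEFAULT = 0;   // the channel a parser would read
    static final int COMMENTS = 2;  // hidden channel; the number is an assumption

    /** A token is just its text plus the channel it was sent to. */
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    /** Splits the input into words, single punctuation characters, and "//" comments. */
    static List<Tok> lex(String src) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                    // like "-> skip": no token is produced
            } else if (src.startsWith("//", i)) {
                int end = src.indexOf('\n', i);
                if (end < 0) end = src.length();
                // like "-> channel(COMMENTS)": the token is kept, but off the default channel
                // (unlike the grammar rule, the newline is not part of the token here)
                out.add(new Tok(src.substring(i, end), COMMENTS));
                i = end;
            } else if (Character.isLetterOrDigit(c)) {
                int end = i;
                while (end < src.length() && Character.isLetterOrDigit(src.charAt(end))) end++;
                out.add(new Tok(src.substring(i, end), DEFAULT));
                i = end;
            } else {
                out.add(new Tok(String.valueOf(c), DEFAULT));
                i++;
            }
        }
        return out;
    }

    /** Returns the texts of all tokens on one channel, in source order. */
    static List<String> onChannel(List<Tok> toks, int channel) {
        List<String> out = new ArrayList<>();
        for (Tok t : toks) {
            if (t.channel == channel) out.add(t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> toks = lex("int i; // counter\nint j;\n");
        System.out.println(onChannel(toks, DEFAULT));   // what the parser sees
        System.out.println(onChannel(toks, COMMENTS));  // what stays hidden
    }
}
```

The point of the design is that nothing is thrown away: the comment is still in the token list with its position intact, so later code can look it up relative to any default-channel token.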

2.2 Extracting by rule (position)




/***
 * Excerpted from "The Definitive ANTLR 4 Reference",
 * published by The Pragmatic Bookshelf.
 * Copyrights apply to this code. It may not be used to create training material,
 * courses, books, articles, and the like. Contact us if you are in doubt.
 * We make no guarantees that this code is fit for any purpose.
 * Visit http://www.pragmaticprogrammer.com/titles/tpantlr2 for more book information.
***/
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

public class ShiftVarComments {
    public static class CommentShifter extends CymbolBaseListener {
        BufferedTokenStream tokens;
        TokenStreamRewriter rewriter;
        /** Create TokenStreamRewriter attached to token stream
         *  sitting between the Cymbol lexer and parser.
         */
        public CommentShifter(BufferedTokenStream tokens) {
            this.tokens = tokens;
            rewriter = new TokenStreamRewriter(tokens);
        }

        @Override
        public void exitVarDecl(CymbolParser.VarDeclContext ctx) {
            Token semi = ctx.getStop();                 // last token of the declaration, the ';'
            int i = semi.getTokenIndex();
            // Hidden tokens on the COMMENTS channel that follow the ';'
            List<Token> cmtChannel =
                tokens.getHiddenTokensToRight(i, CymbolLexer.COMMENTS);
            if ( cmtChannel!=null ) {
                Token cmt = cmtChannel.get(0);
                if ( cmt!=null ) {
                    String txt = cmt.getText().substring(2);    // drop the leading "//"
                    String newCmt = "/* " + txt.trim() + " */\n";
                    rewriter.insertBefore(ctx.start, newCmt);   // block comment goes before the declaration
                    rewriter.replace(cmt, "\n");                // the old line comment becomes a newline
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String inputFile = null;
        if ( args.length>0 ) inputFile = args[0];
        InputStream is = System.in;
        if ( inputFile!=null ) {
            is = new FileInputStream(inputFile);
        }
        // ANTLRInputStream is deprecated since ANTLR 4.7; CharStreams.fromStream(is) replaces it.
        ANTLRInputStream input = new ANTLRInputStream(is);
        CymbolLexer lexer = new CymbolLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        CymbolParser parser = new CymbolParser(tokens);
        RuleContext tree = parser.file();

        ParseTreeWalker walker = new ParseTreeWalker();
        CommentShifter shifter = new CommentShifter(tokens);
        walker.walk(shifter, tree);
        // Print the rewritten token stream; the original stream itself is left unchanged.
        System.out.print(shifter.rewriter.getText());
    }
}
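As the code above shows, CommentShifter extends the generated listener class (CymbolBaseListener) and overrides exitVarDecl. While the ParseTreeWalker traverses the parse tree, it calls exitVarDecl automatically, and that call is where the comment is moved. exitVarDecl corresponds to the variable declaration rule in the grammar file, so it fires once for every variable declaration.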
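To make the effect of the rewrite concrete, the sketch below reproduces the same text transformation on a single line with plain string operations. It is only an illustration: `ShiftSketch` and `shift` are hypothetical names, there is no parser involved, and the naive `indexOf("//")` would misfire on a `//` inside a string literal, which the token-based version avoids.

```java
/** String-level imitation of what exitVarDecl does to one declaration line. */
class ShiftSketch {
    /** Turns "decl; // text" into a block comment line followed by the declaration. */
    static String shift(String line) {
        int at = line.indexOf("//");
        if (at < 0) {
            return line + "\n";                      // no trailing comment: nothing to move
        }
        String decl = line.substring(0, at).trim();  // the declaration itself
        String txt = line.substring(at + 2).trim();  // same "drop // and trim" step as exitVarDecl
        return "/* " + txt + " */\n" + decl + "\n";  // comment first, then the declaration
    }

    public static void main(String[] args) {
        System.out.print(shift("int i; // counter"));
        System.out.print(shift("int j;"));
    }
}
```

For the input `int i; // counter` this yields the line `/* counter */` followed by `int i;`. The real rewriter keeps the original whitespace tokens as they were, whereas this sketch trims them.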

2.3 Extracting all comments by channel



    // Requires: import org.antlr.v4.runtime.*; import java.util.List;
    private static void printComments(String code){
        CPP14Lexer lexer = new CPP14Lexer(new ANTLRInputStream(code));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        // The stream is lazy: load every token first, or getTokens() returns
        // only what has been buffered so far.
        tokens.fill();

        List<Token> lt = tokens.getTokens();
        for(Token t:lt){
            // Skip tokens that are themselves on channel 2, the comments channel
            // (configured in the grammar file). Otherwise, for two adjacent comment
            // lines, the first comment line would be printed twice.
            if(t.getChannel() == 2) continue;

            // getHiddenTokensToLeft is enough to collect every comment;
            // there is no need to call getHiddenTokensToRight as well.
            int tokenIndex = t.getTokenIndex();
            List<Token> comments = tokens.getHiddenTokensToLeft(tokenIndex);
            if(comments != null && comments.size() > 0){
                for(Token c:comments){
                    System.out.println("    " + c.getText());
                }
            }
        }
    }
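The reason for the `continue` on channel 2 is easier to see on a toy model. The sketch below is not ANTLR code: `LeftScanSketch`, `hiddenToLeft`, and `collect` are hypothetical names, and tokens are reduced to an array of channel numbers. `hiddenToLeft` imitates the documented idea of collecting the off-channel tokens between a token and the previous default-channel token.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of scanning left for hidden tokens; hypothetical names, not the ANTLR API. */
class LeftScanSketch {
    static final int COMMENTS = 2;  // assumed comments channel, as in printComments

    /** Indexes of the off-channel tokens between token `index` and the previous channel-0 token. */
    static List<Integer> hiddenToLeft(int[] channels, int index) {
        int from = index - 1;
        while (from >= 0 && channels[from] != 0) {
            from--;                                  // walk back to the previous default-channel token
        }
        List<Integer> out = new ArrayList<>();
        for (int k = from + 1; k < index; k++) {
            out.add(k);
        }
        return out;
    }

    /** Visits every token like printComments; skipComments switches the "continue" check on or off. */
    static List<Integer> collect(int[] channels, boolean skipComments) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < channels.length; i++) {
            if (skipComments && channels[i] == COMMENTS) continue;
            out.addAll(hiddenToLeft(channels, i));
        }
        return out;
    }

    public static void main(String[] args) {
        // token 0: code, tokens 1 and 2: two adjacent comments, token 3: code
        int[] channels = {0, 2, 2, 0};
        System.out.println(collect(channels, true));   // each comment once
        System.out.println(collect(channels, false));  // the first comment shows up twice
    }
}
```

Without the check, the second comment also looks to its left and reports the first one, which the following code token then reports again. Visiting only default-channel tokens means each run of comments is reported exactly once, by the code token that follows it.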