Inside the JVM: The Source Code Compilation Mechanism

Date: 2015-05-14 20:36:38

Tags: jvm, source compilation

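My reference for the JVM source compilation mechanism is the book 《分布式Java应用基础与实践》 (Distributed Java Applications: Fundamentals and Practice). What follows is a rough summary of what I learned from it.
I haven't updated the blog lately; my head has been a bit of a mess ╮(╯▽╰)╭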

Please credit the source when reposting: http://blog.csdn.net/supera_li/article/details/45725213

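javac compiles .java files into .class files.
Step 1: parse the source and enter its symbols into the symbol table.
Step 2: annotation processing.

This step is only supported from Sun JDK 6 onward.

Step 3: semantic analysis and class file generation.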

[Figure: the Gen class]

[Figure: final mind map of the compilation steps]
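The three steps above can be observed from the outside through the Compiler Tree API, whose JavacTask exposes parse(), analyze() and generate(). The following is only a minimal sketch, assuming a full JDK (not a JRE-only install) is running the code. CompilePhases and StringSource are hypothetical names made up for this illustration, and this is not how javac's own driver is written.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.JavacTask;

import javax.lang.model.element.Element;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CompilePhases {

    /** A source file held in memory (hypothetical helper for this sketch). */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Runs the phases one by one and returns {syntax trees, analyzed elements, class files}. */
    public static int[] compile(String className, String code) throws IOException {
        // Returns null when no compiler is available (for example on a JRE-only install).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // Write class files to a temporary directory instead of the working directory.
        Path out = Files.createTempDirectory("classes");
        List<String> options = Arrays.asList("-d", out.toString());
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, options, null,
                Collections.singletonList(new StringSource(className, code)));

        int trees = 0, elements = 0, classFiles = 0;
        for (CompilationUnitTree t : task.parse()) trees++;       // source text -> syntax trees
        for (Element e : task.analyze()) elements++;              // symbols entered, semantics checked
        for (JavaFileObject f : task.generate()) classFiles++;    // class files written
        return new int[] {trees, elements, classFiles};
    }

    public static void main(String[] args) throws IOException {
        int[] r = compile("Hello", "public class Hello { int answer() { return 42; } }");
        System.out.println("trees=" + r[0] + " elements=" + r[1] + " classFiles=" + r[2]);
    }
}
```

Calling the phases separately like this is mainly useful for poking at the intermediate results; a normal build would just call task.call().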

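Source code analysis:
Lexical analysis within the parse step.
This part is all about tokens; in fact, its job is to generate them.
Let's see how it is implemented. Start with the fields, to work out what needs to be stored.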

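We can infer a lot from the names.
The first two fields hold tokens temporarily after they have been generated.
prevToken, as the name suggests, refers to the token that was current before the latest advance, so the next operation can look back at it.
savedTokens stores tokens that have already been generated.
JavaTokenizer is the one we are looking for: the class that generates tokens. Let's step into it.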
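Before stepping in, here is how those fields might cooperate. This is a simplified sketch under my own assumptions, not javac's actual Scanner: tokens are plain strings, LookaheadScanner is a hypothetical class, and an Iterator stands in for JavaTokenizer.readToken().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LookaheadScanner {
    private final Iterator<String> tokenizer;                   // stand-in for the real tokenizer
    private final List<String> savedTokens = new ArrayList<>(); // tokens read ahead but not yet consumed
    private String token;                                       // current token
    private String prevToken;                                   // token that was current before the last advance

    public LookaheadScanner(List<String> source) {
        this.tokenizer = source.iterator();
    }

    private String readToken() {
        return tokenizer.hasNext() ? tokenizer.next() : "EOF";
    }

    /** Advances by one token: remembers the old one, then prefers buffered tokens over fresh ones. */
    public void nextToken() {
        prevToken = token;
        token = savedTokens.isEmpty() ? readToken() : savedTokens.remove(0);
    }

    /** Looks ahead without consuming anything; a lookahead of 0 is the current token. */
    public String peekToken(int lookahead) {
        if (lookahead == 0) {
            return token;
        }
        while (savedTokens.size() < lookahead) {
            savedTokens.add(readToken());
        }
        return savedTokens.get(lookahead - 1);
    }

    public String token() { return token; }

    public String prevToken() { return prevToken; }

    public static void main(String[] args) {
        LookaheadScanner s = new LookaheadScanner(Arrays.asList("int", "x", ";"));
        s.nextToken();
        System.out.println(s.token() + " then, two ahead, " + s.peekToken(2));
    }
}
```

The point of the buffer is that a parser can peek a few tokens ahead to decide which rule applies, and the tokens it peeked at are not lost.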
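At a glance, readToken is the core method.
The methods whose names start with scan evidently scan the input;
the rest (skip, is, add and so on) are helper operations.
Let's analyze the readToken method.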

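public Token readToken() {
        // Decompiled from com.sun.tools.javac.parser.JavaTokenizer, so the locals have synthetic names:
        // var9 is the start position of the token, var3 collects the comments that precede it.
        this.reader.sp = 0;
        this.name = null;
        this.radix = 0;
        boolean var1 = false;
        boolean var2 = false;
        List var3 = null;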

        try {
            int var9;
            label474:
            while(true) {
                var9 = this.reader.bp;
                int var4;
                boolean var11;
                switch(this.reader.ch) {
                case '\t':
                case '\f':
                case ' ':
                    do {
                        do {
                            this.reader.scanChar();
                        } while(this.reader.ch == 32);
                    } while(this.reader.ch == 9 || this.reader.ch == 12);

                    this.processWhiteSpace(var9, this.reader.bp);
                    break;
                case '\n':
                    this.reader.scanChar();
                    this.processLineTerminator(var9, this.reader.bp);
                    break;
                case '\u000b':
                case '\u000e':
                case '\u000f':
                case '\u0010':
                case '\u0011':
                case '\u0012':
                case '\u0013':
                case '\u0014':
                case '\u0015':
                case '\u0016':
                case '\u0017':
                case '\u0018':
                case '\u0019':
                case '\u001a':
                case '\u001b':
                case '\u001c':
                case '\u001d':
                case '\u001e':
                case '\u001f':
                case '!':
                case '#':
                case '%':
                case '&':
                case '*':
                case '+':
                case '-':
                case ':':
                case '<':
                case '=':
                case '>':
                case '?':
                case '@':
                case '\\':
                case '^':
                case '`':
                case '|':
                default:
                    if(this.isSpecial(this.reader.ch)) {
                        this.scanOperator();
                    } else {
                        if(this.reader.ch < 128) {
                            var11 = false;
                        } else {
                            char var13 = this.reader.scanSurrogates();
                            if(var13 != 0) {
                                this.reader.putChar(var13);
                                var11 = Character.isJavaIdentifierStart(Character.toCodePoint(var13, this.reader.ch));
                            } else {
                                var11 = Character.isJavaIdentifierStart(this.reader.ch);
                            }
                        }

                        if(var11) {
                            this.scanIdent();
                        } else if(this.reader.bp != this.reader.buflen && (this.reader.ch != 26 || this.reader.bp + 1 != this.reader.buflen)) {
                            String var14 = 32 < this.reader.ch && this.reader.ch < 127?String.format("%s", new Object[]{Character.valueOf(this.reader.ch)}):String.format("\\u%04x", new Object[]{Integer.valueOf(this.reader.ch)});
                            this.lexError(var9, "illegal.char", new Object[]{var14});
                            this.reader.scanChar();
                        } else {
                            this.tk = TokenKind.EOF;
                            var9 = this.reader.buflen;
                        }
                    }
                    break label474;
                case '\r':
                    this.reader.scanChar();
                    if(this.reader.ch == 10) {
                        this.reader.scanChar();
                    }

                    this.processLineTerminator(var9, this.reader.bp);
                    break;
                case '\"':
                    this.reader.scanChar();

                    while(this.reader.ch != 34 && this.reader.ch != 13 && this.reader.ch != 10 && this.reader.bp < this.reader.buflen) {
                        this.scanLitChar(var9);
                    }

                    if(this.reader.ch == 34) {
                        this.tk = TokenKind.STRINGLITERAL;
                        this.reader.scanChar();
                    } else {
                        this.lexError(var9, "unclosed.str.lit", new Object[0]);
                    }
                    break label474;
                case '$':
                case 'A':
                case 'B':
                case 'C':
                case 'D':
                case 'E':
                case 'F':
                case 'G':
                case 'H':
                case 'I':
                case 'J':
                case 'K':
                case 'L':
                case 'M':
                case 'N':
                case 'O':
                case 'P':
                case 'Q':
                case 'R':
                case 'S':
                case 'T':
                case 'U':
                case 'V':
                case 'W':
                case 'X':
                case 'Y':
                case 'Z':
                case '_':
                case 'a':
                case 'b':
                case 'c':
                case 'd':
                case 'e':
                case 'f':
                case 'g':
                case 'h':
                case 'i':
                case 'j':
                case 'k':
                case 'l':
                case 'm':
                case 'n':
                case 'o':
                case 'p':
                case 'q':
                case 'r':
                case 's':
                case 't':
                case 'u':
                case 'v':
                case 'w':
                case 'x':
                case 'y':
                case 'z':
                    this.scanIdent();
                    break label474;
                case '\'':
                    this.reader.scanChar();
                    if(this.reader.ch == 39) {
                        this.lexError(var9, "empty.char.lit", new Object[0]);
                    } else {
                        if(this.reader.ch == 13 || this.reader.ch == 10) {
                            this.lexError(var9, "illegal.line.end.in.char.lit", new Object[0]);
                        }

                        this.scanLitChar(var9);
                        char var12 = this.reader.ch;
                        if(this.reader.ch == 39) {
                            this.reader.scanChar();
                            this.tk = TokenKind.CHARLITERAL;
                        } else {
                            this.lexError(var9, "unclosed.char.lit", new Object[0]);
                        }
                    }
                    break label474;
                case '(':
                    this.reader.scanChar();
                    this.tk = TokenKind.LPAREN;
                    break label474;
                case ')':
                    this.reader.scanChar();
                    this.tk = TokenKind.RPAREN;
                    break label474;
                case ',':
                    this.reader.scanChar();
                    this.tk = TokenKind.COMMA;
                    break label474;
                case '.':
                    this.reader.scanChar();
                    if(48 <= this.reader.ch && this.reader.ch <= 57) {
                        this.reader.putChar('.');
                        this.scanFractionAndSuffix(var9);
                    } else if(this.reader.ch == 46) {
                        var4 = this.reader.bp;
                        this.reader.putChar('.');
                        this.reader.putChar('.', true);
                        if(this.reader.ch == 46) {
                            this.reader.scanChar();
                            this.reader.putChar('.');
                            this.tk = TokenKind.ELLIPSIS;
                        } else {
                            this.lexError(var4, "illegal.dot", new Object[0]);
                        }
                    } else {
                        this.tk = TokenKind.DOT;
                    }
                    break label474;
                case '/':
                    this.reader.scanChar();
                    if(this.reader.ch == 47) {
                        do {
                            this.reader.scanCommentChar();
                        } while(this.reader.ch != 13 && this.reader.ch != 10 && this.reader.bp < this.reader.buflen);

                        if(this.reader.bp < this.reader.buflen) {
                            var3 = this.addComment(var3, this.processComment(var9, this.reader.bp, CommentStyle.LINE));
                        }
                        break;
                    } else {
                        if(this.reader.ch != 42) {
                            if(this.reader.ch == 61) {
                                this.tk = TokenKind.SLASHEQ;
                                this.reader.scanChar();
                            } else {
                                this.tk = TokenKind.SLASH;
                            }
                            break label474;
                        }

                        var11 = false;
                        this.reader.scanChar();
                        CommentStyle var5;
                        if(this.reader.ch == 42) {
                            var5 = CommentStyle.JAVADOC;
                            this.reader.scanCommentChar();
                            if(this.reader.ch == 47) {
                                var11 = true;
                            }
                        } else {
                            var5 = CommentStyle.BLOCK;
                        }

                        while(!var11 && this.reader.bp < this.reader.buflen) {
                            if(this.reader.ch == 42) {
                                this.reader.scanChar();
                                if(this.reader.ch == 47) {
                                    break;
                                }
                            } else {
                                this.reader.scanCommentChar();
                            }
                        }

                        if(this.reader.ch == 47) {
                            this.reader.scanChar();
                            var3 = this.addComment(var3, this.processComment(var9, this.reader.bp, var5));
                            break;
                        }

                        this.lexError(var9, "unclosed.comment", new Object[0]);
                        break label474;
                    }
                case '0':
                    this.reader.scanChar();
                    if(this.reader.ch != 120 && this.reader.ch != 88) { // not 'x' or 'X', so not hex
                        if(this.reader.ch != 98 && this.reader.ch != 66) { // not 'b' or 'B', so not binary
                            this.reader.putChar('0');
                            if(this.reader.ch == 95) {
                                var4 = this.reader.bp;

                                do {
                                    this.reader.scanChar();
                                } while(this.reader.ch == 95);

                                if(this.reader.digit(var9, 10) < 0) {
                                    this.lexError(var4, "illegal.underscore", new Object[0]);
                                }
                            }

                            this.scanNumber(var9, 8);
                        } else {
                            if(!this.allowBinaryLiterals) {
                                this.lexError(var9, "unsupported.binary.lit", new Object[]{this.source.name});
                                this.allowBinaryLiterals = true;
                            }

                            this.reader.scanChar();
                            this.skipIllegalUnderscores();
                            if(this.reader.digit(var9, 2) < 0) {
                                this.lexError(var9, "invalid.binary.number", new Object[0]);
                            } else {
                                this.scanNumber(var9, 2);
                            }
                        }
                    } else {
                        this.reader.scanChar();
                        this.skipIllegalUnderscores();
                        if(this.reader.ch == 46) {
                            this.scanHexFractionAndSuffix(var9, false);
                        } else if(this.reader.digit(var9, 16) < 0) {
                            this.lexError(var9, "invalid.hex.number", new Object[0]);
                        } else {
                            this.scanNumber(var9, 16);
                        }
                    }
                    break label474;
                case '1':
                case '2':
                case '3':
                case '4':
                case '5':
                case '6':
                case '7':
                case '8':
                case '9':
                    this.scanNumber(var9, 10);
                    break label474;
                case ';':
                    this.reader.scanChar();
                    this.tk = TokenKind.SEMI;
                    break label474;
                case '[':
                    this.reader.scanChar();
                    this.tk = TokenKind.LBRACKET;
                    break label474;
                case ']':
                    this.reader.scanChar();
                    this.tk = TokenKind.RBRACKET;
                    break label474;
                case '{':
                    this.reader.scanChar();
                    this.tk = TokenKind.LBRACE;
                    break label474;
                case '}':
                    this.reader.scanChar();
                    this.tk = TokenKind.RBRACE;
                    break label474;
                }
            }

            int var10 = this.reader.bp;
            // The decompiler emitted a synthetic $SwitchMap lookup here; in source form it is a switch on the token kind's tag.
            switch(this.tk.tag) {
            case DEFAULT:
                Token var18 = new Token(this.tk, var9, var10, var3);
                return var18;
            case NAMED:
                NamedToken var17 = new NamedToken(this.tk, var9, var10, this.name, var3);
                return var17;
            case STRING:
                StringToken var16 = new StringToken(this.tk, var9, var10, this.reader.chars(), var3);
                return var16;
            case NUMERIC:
                NumericToken var15 = new NumericToken(this.tk, var9, var10, this.reader.chars(), this.radix, var3);
                return var15;
            default:
                throw new AssertionError();
            }
        } finally {
            ;
        }
    }

readToken dispatches on the first character of each token (the switch on this.reader.ch).

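Handling whitespace:
32 space
9 HT (horizontal tab)
12 FF (NP form feed, new page)
When one of these characters is encountered, the whole run of whitespace is skipped and handed to processWhiteSpace.

Handling line terminators: \n, \r and \r\n each count as a single line terminator and are handed to processLineTerminator.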
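The whitespace and line terminator handling can be sketched on a plain String. This is an illustration only: WhitespaceSkipper and both method names are hypothetical, and the real code works on a reader with a current character and buffer position rather than on String indices.

```java
public class WhitespaceSkipper {

    /** Returns the index of the first char at or after pos that is not space (32), tab (9) or form feed (12). */
    public static int skipWhitespace(String src, int pos) {
        while (pos < src.length()) {
            char ch = src.charAt(pos);
            if (ch == ' ' || ch == '\t' || ch == '\f') {
                pos++;
            } else {
                break;
            }
        }
        return pos;
    }

    /** If a line terminator starts at pos, returns the index just past it; otherwise returns pos unchanged. */
    public static int skipLineTerminator(String src, int pos) {
        if (pos >= src.length()) {
            return pos;
        }
        char ch = src.charAt(pos);
        if (ch == '\n') {
            return pos + 1;
        }
        if (ch == '\r') {
            pos++;
            // A carriage return followed by a line feed is one terminator, not two.
            if (pos < src.length() && src.charAt(pos) == '\n') {
                pos++;
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(skipWhitespace(" \t\fint", 0));
        System.out.println(skipLineTerminator("\r\nint", 0));
    }
}
```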

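Handling code characters:
First, operator characters are handled (isSpecial and scanOperator).
Each remaining character is checked to see whether it can start a Java identifier; for a surrogate pair, the two chars are first combined into a supplementary code point and the check is made on that.
Further down, comments are recognized and kept out of the token itself (they are attached to the token that follows), and their style (line, block or Javadoc) is determined.
Letters and digits are handled (identifiers and numeric literals).
Delimiters such as [] and {} that group code into blocks are handled.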
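The identifier-start check, including the surrogate pair case, might look like this when written against a String. It is a sketch under my own assumptions: IdentifierStart and startsIdentifier are hypothetical names, and unlike readToken (which handles ASCII letters through its case labels) this version tests the ASCII range explicitly.

```java
public class IdentifierStart {

    /** Returns true if a Java identifier can start at index pos of src. */
    public static boolean startsIdentifier(String src, int pos) {
        char ch = src.charAt(pos);
        if (ch < 128) {
            // ASCII fast path: letters, '$' and '_'.
            return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') || ch == '$' || ch == '_';
        }
        if (Character.isHighSurrogate(ch)
                && pos + 1 < src.length()
                && Character.isLowSurrogate(src.charAt(pos + 1))) {
            // Two chars form one supplementary code point; test the code point, not each half.
            int codePoint = Character.toCodePoint(ch, src.charAt(pos + 1));
            return Character.isJavaIdentifierStart(codePoint);
        }
        return Character.isJavaIdentifierStart(ch);
    }

    public static void main(String[] args) {
        System.out.println(startsIdentifier("count", 0));
        System.out.println(startsIdentifier("9lives", 0));
    }
}
```

Testing each half of a surrogate pair on its own would reject letters outside the Basic Multilingual Plane, which is why the pair is combined first.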
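With the steps above, the code is turned into a token sequence, from which it is easy to build the logic into a syntax tree; once the syntax tree exists, the classes can be converted into symbol table entries and stored.
For how the syntax tree is generated, see the Parser class.
The conversion into the symbol table is done by the Enter class.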
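To make "code to token sequence" concrete, here is a toy tokenizer that dispatches on the first character in the same spirit as readToken. It is a deliberately tiny sketch: MiniTokenizer, its Kind enum and its Token class are hypothetical, keywords are not distinguished from identifiers, and comments, string literals and multi-character operators are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniTokenizer {

    /** Token kinds for this sketch only; javac's TokenKind enum is far larger. */
    public enum Kind { IDENTIFIER, INTLITERAL, LPAREN, RPAREN, LBRACE, RBRACE, SEMI, OPERATOR, EOF }

    public static final class Token {
        public final Kind kind;
        public final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")";
        }
    }

    public static List<Token> tokenize(String src) {
        List<Token> tokens = new ArrayList<>();
        int bp = 0; // buffer position, named after the reader field in the decompiled code
        while (bp < src.length()) {
            char ch = src.charAt(bp);
            int pos = bp; // start of the current token
            if (Character.isWhitespace(ch)) {
                bp++; // whitespace separates tokens but produces none
            } else if (Character.isJavaIdentifierStart(ch)) {
                while (bp < src.length() && Character.isJavaIdentifierPart(src.charAt(bp))) bp++;
                tokens.add(new Token(Kind.IDENTIFIER, src.substring(pos, bp)));
            } else if (Character.isDigit(ch)) {
                while (bp < src.length() && Character.isDigit(src.charAt(bp))) bp++;
                tokens.add(new Token(Kind.INTLITERAL, src.substring(pos, bp)));
            } else {
                bp++;
                Kind kind;
                switch (ch) {
                    case '(': kind = Kind.LPAREN; break;
                    case ')': kind = Kind.RPAREN; break;
                    case '{': kind = Kind.LBRACE; break;
                    case '}': kind = Kind.RBRACE; break;
                    case ';': kind = Kind.SEMI; break;
                    default:  kind = Kind.OPERATOR; // everything else is a one-char operator here
                }
                tokens.add(new Token(kind, String.valueOf(ch)));
            }
        }
        tokens.add(new Token(Kind.EOF, ""));
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int x = 42;"));
    }
}
```

A parser would then consume this list from left to right, which is far easier than reasoning about raw characters.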
