
A Deep Dive into Clang (7): Clang Lexer Code Reading Notes, Part "Lexer"

Date: 2016-08-10 17:41:59


Author: 史宁宁 (snsn1984)

Source location: clang/lib/Lex/Lexer.cpp

Source online: http://clang.llvm.org/doxygen/Lexer_8cpp_source.html


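Lexer.cpp is the main file of the lexer in the Clang front end, and it contains the implementation of the Lexer class. The file's own header comment describes it this way: "This file implements the Lexer and Token interfaces." In practice, however, only two simple Token functions are implemented here; everything else is the implementation of Lexer. To understand how Clang's lexer works, you therefore need a thorough understanding of this file.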

We start with the Lexer's initialization function:

void Lexer::InitLexer(const char *BufStart, const char *BufPtr,
                      const char *BufEnd) {
  BufferStart = BufStart;
  BufferPtr = BufPtr;
  BufferEnd = BufEnd;

  assert(BufEnd[0] == 0 &&
         "We assume that the input buffer has a null character at the end"
         " to simplify lexing!");

  // Check whether we have a BOM in the beginning of the buffer. If yes - act
  // accordingly. Right now we support only UTF-8 with and without BOM, so, just
  // skip the UTF-8 BOM if it's present.
  if (BufferStart == BufferPtr) {
    // Determine the size of the BOM.
    StringRef Buf(BufferStart, BufferEnd - BufferStart);
    size_t BOMLength = llvm::StringSwitch<size_t>(Buf)
      .StartsWith("\xEF\xBB\xBF", 3) // UTF-8 BOM
      .Default(0);

    // Skip the BOM.
    BufferPtr += BOMLength;
  }

  Is_PragmaLexer = false;
  CurrentConflictMarkerState = CMK_None;

  // Start of the file is a start of line.
  IsAtStartOfLine = true;
  IsAtPhysicalStartOfLine = true;

  HasLeadingSpace = false;
  HasLeadingEmptyMacro = false;

  // We are not after parsing a #.
  ParsingPreprocessorDirective = false;

  // We are not after parsing #include.
  ParsingFilename = false;

  // We are not in raw mode.  Raw mode disables diagnostics and interpretation
  // of tokens (e.g. identifiers, thus disabling macro expansion).  It is used
  // to quickly lex the tokens of the buffer, e.g. when handling a "#if 0" block
  // or otherwise skipping over tokens.
  LexingRawMode = false;

  // Default to not keeping comments.
  ExtendedTokenMode = 0;
}
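
Two ideas in InitLexer are easy to try outside of Clang: the UTF-8 BOM is skipped only when lexing starts at the very beginning of the buffer, and the trailing null character acts as a sentinel so that scanning loops need no explicit end-of-buffer check. The following is a minimal standalone sketch of both, not Clang's code. MiniLexerState, utf8BOMLength, initMiniLexer, isIdentChar and scanIdentifier are hypothetical names invented for this illustration, and std::string_view stands in for llvm::StringRef and llvm::StringSwitch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a few of the Lexer fields that InitLexer sets.
struct MiniLexerState {
  const char *BufferStart = nullptr;
  const char *BufferPtr = nullptr;
  const char *BufferEnd = nullptr;
  bool IsAtStartOfLine = false;
  bool LexingRawMode = false;
};

// Returns 3 if Buf begins with the UTF-8 BOM (EF BB BF), otherwise 0.
inline std::size_t utf8BOMLength(std::string_view Buf) {
  constexpr std::string_view BOM = "\xEF\xBB\xBF";
  return Buf.substr(0, BOM.size()) == BOM ? BOM.size() : 0;
}

// Follows the same order as InitLexer: store the pointers, check the
// sentinel, then skip a BOM only if lexing starts at the buffer start.
inline void initMiniLexer(MiniLexerState &L, const char *BufStart,
                          const char *BufPtr, const char *BufEnd) {
  L.BufferStart = BufStart;
  L.BufferPtr = BufPtr;
  L.BufferEnd = BufEnd;

  assert(BufEnd[0] == 0 && "input buffer must end with a null character");

  if (L.BufferStart == L.BufferPtr) {
    std::string_view Buf(BufStart, static_cast<std::size_t>(BufEnd - BufStart));
    L.BufferPtr += utf8BOMLength(Buf);
  }

  L.IsAtStartOfLine = true; // start of the file is a start of line
  L.LexingRawMode = false;  // not in raw mode by default
}

// ASCII-only identifier test (an assumption made to keep the sketch small).
inline bool isIdentChar(char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
         (C >= '0' && C <= '9') || C == '_';
}

// Thanks to the sentinel, this loop never compares against BufferEnd:
// '\0' is not an identifier character, so the scan stops there by itself.
inline std::string_view scanIdentifier(MiniLexerState &L) {
  const char *Start = L.BufferPtr;
  while (isIdentChar(*L.BufferPtr))
    ++L.BufferPtr;
  return std::string_view(Start, static_cast<std::size_t>(L.BufferPtr - Start));
}
```

To try it, back the buffer with a std::string, whose data() is guaranteed to be null-terminated at data() + size(), and pass data() as both BufStart and BufPtr. Passing a BufPtr that points into the middle of the buffer leaves it untouched, which matches the BufferStart == BufferPtr guard in the listing above.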



Original article: http://blog.csdn.net/snsn1984/article/details/52173275
