The format of an operating system's executable file is in many ways a mirror of the operating system.
Winnt.h is a very important header file: it defines most of the internal structures used under Windows.
The PE format is documented (in the loosest sense of the word) in the WINNT.H header file. About midway through WINNT.H is a section titled "Image Format." This section starts out with small tidbits from the old familiar MS-DOS MZ format and NE format headers before moving into the newer PE information. WINNT.H provides definitions of the raw data structures used by PE files, but contains only a few useful comments to make sense of what the structures and flags mean. Whoever wrote the header file for the PE format (the name Michael J. O'Leary keeps popping up) is certainly a believer in long, descriptive names, along with deeply nested structures and macros. When coding with WINNT.H, it's not uncommon to have expressions like this:
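Matt Pietrek, the article's author, wrote a program called PEDUMP to analyze the PE format. An open-source tool of the same name is available at: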
http://github.com/zed-0xff/pedump
You can also upload a PE file to http://pedump.me/ to analyze its format online.
The online analysis output looks like this:
imports:
Let's go over a few fundamental ideas that permeate the design of a PE file (see Figure 1). I'll use the term "module" to mean the code, data, and resources of an executable file or DLL that have been loaded into memory.
The first important thing to know about PE files is that the executable file on disk is very similar to what the module will look like after Windows has loaded it. The Windows loader doesn't need to work extremely hard to create a process from the disk file. The loader uses the memory-mapped file mechanism to map the appropriate pieces of the file into the virtual address space. (In short: an EXE's on-disk layout closely resembles its layout once loaded, so the memory-mapped file mechanism is enough to map it into the virtual address space.)
For Win32, all the memory used by the module for code, data, resources, import tables, export tables, and other required module data structures is in one contiguous block of memory. All you need to know in this situation is where the loader mapped the file into memory. You can easily find all the various pieces of the module by following pointers that are stored as part of the image.
Another idea you should be acquainted with is the Relative Virtual Address (RVA). Many fields in PE files are specified in terms of RVAs. An RVA is simply the offset of some item, relative to where the file is memory-mapped. For example, let's say the loader maps a PE file into memory starting at address 0x10000 in the virtual address space. If a certain table in the image starts at address 0x10464, then the table's RVA is 0x464.
To convert an RVA into a usable pointer, simply add the RVA to the base address of the module. The base address is the starting address of a memory-mapped EXE or DLL and is an important concept in Win32. For the sake of convenience, Windows NT and Windows 95 use the base address of a module as the module's instance handle (HINSTANCE). What's important for Win32 is that you can call GetModuleHandle for any DLL that your process uses to get a pointer for accessing the module's components. (In short: adding an RVA to the base address yields a usable address, and the base address is the module handle of the DLL or EXE, which GetModuleHandle returns.)
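The arithmetic above can be sketched in a few lines of C. This is only an illustration: the function names `rva_to_va` and `va_to_rva` are hypothetical and do not come from WINNT.H, and addresses are modeled as plain integers rather than real pointers.

```c
#include <stdint.h>

/* Hypothetical helper: turn an RVA into a virtual address by adding
   it to the base address at which the module was mapped. */
uintptr_t rva_to_va(uintptr_t image_base, uint32_t rva)
{
    return image_base + rva;
}

/* Hypothetical helper: the reverse direction. The RVA of an item is
   its address minus the module's base address. */
uint32_t va_to_rva(uintptr_t image_base, uintptr_t va)
{
    return (uint32_t)(va - image_base);
}
```

With the numbers from the example, a base address of 0x10000 and a table at 0x10464 give an RVA of 0x464. In a real Win32 program the base address would be the value returned by GetModuleHandle, cast to an integer or byte pointer.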
The final concept that you need to know about PE files is sections. Unlike segments, sections are blocks of contiguous memory with no size constraints. Some sections contain code or data that your program declared and uses directly, while other data sections are created for you by the linker and librarian, and contain information vital to the operating system. In some descriptions of the PE format, sections are also referred to as objects. The term object has so many overloaded meanings that I'll stick to calling the code and data areas sections.
Like all other executable file formats, the PE file has a collection of fields at a known (or easy to find) location that define what the rest of the file looks like. This header contains information such as the locations and sizes of the code and data areas, what operating system the file is intended for, the initial stack size, and other vital pieces of information. As with other executable formats from Microsoft, this main header isn't at the very beginning of the file. The first few hundred bytes of the typical PE file are taken up by the MS-DOS stub. This stub is a tiny program that prints out something to the effect of "This program cannot be run in MS-DOS mode."
As in other Microsoft executable formats, you find the real header by looking up its starting offset, which is stored in the MS-DOS stub header. The WINNT.H file includes a structure definition for the MS-DOS stub header that makes it very easy to look up where the PE header starts. The e_lfanew field is a relative offset (or RVA, if you prefer) to the actual PE header. To get a pointer to the PE header in memory, just add that field's value to the image base:
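A minimal, portable sketch of that lookup follows. It does not use WINNT.H. Instead it reads e_lfanew directly from its byte offset (0x3C) in the MS-DOS header. The function name `find_pe_header`, the helper `read_u32_le`, and the extra "MZ" and bounds checks are illustrative assumptions, not part of the original article.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of e_lfanew within the MS-DOS header (IMAGE_DOS_HEADER). */
#define E_LFANEW_OFFSET 0x3C

/* Read a little-endian DWORD byte by byte, independent of host byte order. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: return a pointer to the PE header inside a mapped
   image, or NULL if the image is too small, does not start with "MZ",
   or e_lfanew points outside the buffer. */
const uint8_t *find_pe_header(const uint8_t *image_base, size_t image_size)
{
    uint32_t e_lfanew;

    if (image_size < E_LFANEW_OFFSET + 4)
        return NULL;
    if (image_base[0] != 'M' || image_base[1] != 'Z')
        return NULL;

    e_lfanew = read_u32_le(image_base + E_LFANEW_OFFSET);
    if (e_lfanew > image_size - 4)
        return NULL;

    /* PE header address = image base + e_lfanew */
    return image_base + e_lfanew;
}
```

With WINNT.H available, the same step is usually written by casting the image base to a pointer to the MS-DOS header structure and adding its e_lfanew field to the base as a byte offset.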
Once you have a pointer to the main PE header, the fun can begin. The main PE header is a structure of type IMAGE_NT_HEADERS, which is defined in WINNT.H. This structure is composed of a DWORD and two substructures and is laid out as follows:
Visually, it looks like this:
In more detail, it looks like this:
The Signature field viewed as ASCII text is "PE\0\0".
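As a rough sketch of that layout, the offsets below follow from the description: a 4-byte Signature, then the IMAGE_FILE_HEADER (20 bytes, going by the field list that follows), then the IMAGE_OPTIONAL_HEADER. The macro names and `has_pe_signature` are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* Offsets inside IMAGE_NT_HEADERS (hypothetical macro names). */
#define NT_SIGNATURE_OFFSET        0   /* DWORD Signature                 */
#define NT_FILE_HEADER_OFFSET      4   /* IMAGE_FILE_HEADER, 20 bytes     */
#define NT_OPTIONAL_HEADER_OFFSET  24  /* IMAGE_OPTIONAL_HEADER follows   */

/* Hypothetical helper: returns nonzero if the four bytes at nt_headers
   spell "PE\0\0". */
int has_pe_signature(const uint8_t *nt_headers)
{
    const uint8_t *sig = nt_headers + NT_SIGNATURE_OFFSET;

    return sig[0] == 'P' && sig[1] == 'E' && sig[2] == 0 && sig[3] == 0;
}
```

Checking the signature before touching the other two substructures is a cheap way to reject files that are not PE images, such as old NE executables.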
Following the PE signature DWORD in the PE header is a structure of type IMAGE_FILE_HEADER. The fields of this structure contain only the most basic information about the file.
IMAGE_FILE_HEADER Fields
WORD Machine
WORD NumberOfSections
DWORD TimeDateStamp
DWORD PointerToSymbolTable
DWORD NumberOfSymbols
WORD SizeOfOptionalHeader
WORD Characteristics
Other fields are defined in WINNT.H
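The field list above maps onto a C structure fairly directly. The sketch below mirrors it with fixed-width types (WORD as `uint16_t`, DWORD as `uint32_t`) and fills it from raw little-endian bytes. The type name `FileHeader` and the helpers `rd16`, `rd32`, and `parse_file_header` are hypothetical. Real code would normally use the IMAGE_FILE_HEADER definition from WINNT.H.

```c
#include <stdint.h>

/* Mirrors the IMAGE_FILE_HEADER field list; 20 bytes in total. */
typedef struct {
    uint16_t Machine;
    uint16_t NumberOfSections;
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable;
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader;
    uint16_t Characteristics;
} FileHeader;

/* Little-endian readers that do not depend on host byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 20 bytes that follow the PE signature. */
FileHeader parse_file_header(const uint8_t *raw)
{
    FileHeader h;

    h.Machine              = rd16(raw + 0);
    h.NumberOfSections     = rd16(raw + 2);
    h.TimeDateStamp        = rd32(raw + 4);
    h.PointerToSymbolTable = rd32(raw + 8);
    h.NumberOfSymbols      = rd32(raw + 12);
    h.SizeOfOptionalHeader = rd16(raw + 16);
    h.Characteristics      = rd16(raw + 18);
    return h;
}
```

NumberOfSections and SizeOfOptionalHeader are the two values a parser typically needs next, since together they locate the section table that follows the optional header.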
The third component of the PE header is a structure of type IMAGE_OPTIONAL_HEADER. For PE files, this portion certainly isn't optional.
The COFF format allows individual implementations to define a structure of additional information beyond the standard IMAGE_FILE_HEADER. The fields in the IMAGE_OPTIONAL_HEADER are what the PE designers felt was critical information beyond the basic information in the IMAGE_FILE_HEADER.
Not all of the fields of the IMAGE_OPTIONAL_HEADER are important to know about. The more important ones to be aware of are the ImageBase and the Subsystem fields. You can skim or skip the description of the fields.
IMAGE_OPTIONAL_HEADER Fields
WORD Magic
BYTE MajorLinkerVersion
BYTE MinorLinkerVersion
DWORD SizeOfCode
DWORD SizeOfInitializedData
DWORD SizeOfUninitializedData
DWORD AddressOfEntryPoint
DWORD BaseOfCode
DWORD BaseOfData
DWORD ImageBase
DWORD SectionAlignment
DWORD FileAlignment
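(SectionAlignment is the alignment unit in memory, while FileAlignment is the alignment unit within the PE file on disk. As the figure above also shows, this is why a PE file's size on disk differs from its size in memory.)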
WORD MajorOperatingSystemVersion
WORD MinorOperatingSystemVersion
WORD MajorImageVersion
WORD MinorImageVersion
WORD MajorSubsystemVersion
WORD MinorSubsystemVersion
DWORD Reserved1
DWORD SizeOfImage
DWORD SizeOfHeaders
DWORD CheckSum
WORD Subsystem
WORD DllCharacteristics
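The note on SectionAlignment and FileAlignment can be made concrete with a small rounding helper. This is a sketch under two assumptions not stated in the text: both alignment values are powers of two, and the example values 0x200 (file) and 0x1000 (memory) are merely common choices. The name `align_up` is hypothetical.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
   Assumes alignment is a nonzero power of two. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}
```

Under those assumptions, a section holding 0x1234 bytes of data would occupy `align_up(0x1234, 0x200)`, or 0x1400 bytes, in the file, but `align_up(0x1234, 0x1000)`, or 0x2000 bytes, once mapped. That difference is why the on-disk and in-memory sizes of the same image disagree.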
References:
https://msdn.microsoft.com/en-us/library/ms809762.aspx
The relationship between RVA, VA, and ImageBase:
http://blog.csdn.net/fantcy/article/details/4474604
Reading notes on "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format" (unfinished)
Original post: http://www.cnblogs.com/predator-wang/p/4882448.html