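A coredump is a powerful tool for analyzing Android native exceptions and kernel exceptions. Roughly speaking, when the system or a process hits a fault it cannot recover from, the system captures the affected memory and packs it into a core dump that engineers can analyze offline. With a coredump you can not only pinpoint the file and line where the fault occurred, but also debug offline and reconstruct the failure step by step until the real culprit is found. Often, though, the system goes down too abruptly, or for some other reason the coredump is never written, and all that is left is a backtrace. Problems like this, with little debug information, are usually hard to debug. This is where addr2line from the GNU toolchain (binutils) helps: given an address and the matching library with symbols, it "translates" the address into a concrete source location, so you can move from the log to source-level analysis of the crash.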
Here is the backtrace of a crashed process:
    Revision: '0'
    ABI: 'arm64'
    pid: 24377, tid: 24377, name: gx_fpd  >>> /system/bin/gx_fpd <<<
    signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
        x0   0000000000000000  x1   0000000000005f39  x2   0000000000000006  x3   0000000000000000
        x4   0000000000000000  x5   0000000000000001  x6   0000000000000000  x7   0000000000000000
        x8   0000000000000083  x9   0000007fb4eec110  x10  0000000000000002  x11  0000000000000003
        x12  0000000000000000  x13  0000000000000043  x14  0000007fcc97a768  x15  0000000000000000
        x16  0000007fb4b866a8  x17  0000007fb4b48b6c  x18  0000000000000002  x19  0000007fb4f670a8
        x20  0000007fb4f66fe8  x21  000000000000000b  x22  0000000000000006  x23  0000005582219f90
        x24  0000007fcc97ac90  x25  0000007fb4e04d18  x26  0000000000000000  x27  0000000000000000
        x28  0000000000000000  x29  0000007fcc97ab60  x30  0000007fb4b46308
        sp   0000007fcc97ab60  pc   0000007fb4b48b74  pstate 0000000020000000
        v0   2e2e2e2e2e2e2e2e2e2e2e2e2e2e2e2e  v1   006370692e67756265642e6f6e6e6974
        v2   636f69203a4457540000000000000031  v3   80000000000000000000000000000000
        v4   00000000000000008020080280200800  v5   00000000400000000000040000000000
        v6   00000000000000000000000000000000  v7   80200802802008028020080280200802
        v8   00000000000000000000000000000000  v9   00000000000000000000000000000000
        v10  00000000000000000000000000000000  v11  00000000000000000000000000000000
        v12  00000000000000000000000000000000  v13  00000000000000000000000000000000
        v14  00000000000000000000000000000000  v15  00000000000000000000000000000000
        v16  40100401401004014010040140100401  v17  00000000a00aa0080000aaa880400400
        v18  00000000000000008020080280200800  v19  0833083a082f08240828083c082e0832
        v20  0c950c920c9a0c950c960c950c970c9a  v21  000000000000000000000055822a6c18
        v22  083a083e083408380834083b084f084b  v23  0c950c960c960c930c970c8d0c930c9a
        v24  000000000000000000000055822a6c08  v25  085908470837083f083e083f08410843
        v26  0c950c930c920c940c950c960c920c97  v27  000000000000000000000055822a6bf8
        v28  0862084c084e083b084608350826082e  v29  0c920c960c930c950c920c970c900c98
        v30  000000000000000000000055822a6be8  v31  0838083c0850085a08410851082f0846
        fpsr 00000000  fpcr 00000000

    backtrace:
        #00 pc 000000000006ab74  /system/lib64/libc.so (tgkill+8)
        #01 pc 0000000000068304  /system/lib64/libc.so (pthread_kill+68)
        #02 pc 00000000000212f8  /system/lib64/libc.so (raise+28)
        #03 pc 000000000001ba98  /system/lib64/libc.so (abort+60)
        #04 pc 000000000002e104  /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+216)
        #05 pc 0000000000004c5c  /system/bin/gx_fpd (main+236)
        #06 pc 0000000000019794  /system/lib64/libc.so (__libc_init+100)
        #07 pc 0000000000004d78  /system/bin/gx_fpd
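Each frame line in the backtrace has the same layout: frame number, the keyword pc, the address, the path of the binary, and optionally the symbol. As a minimal sketch (the helper name extract_frames is made up for illustration, and it assumes the frame layout shown above), the address and path that addr2line needs can be pulled out with awk:

```shell
# Hypothetical helper: read a tombstone on stdin and print
# "<pc address> <binary path>" for every backtrace frame line.
extract_frames() {
    awk '$1 ~ /^#[0-9]+$/ && $2 == "pc" { print $3, $4 }'
}

# Two frame lines copied from the backtrace above.
printf '%s\n' \
    '#00 pc 000000000006ab74  /system/lib64/libc.so (tgkill+8)' \
    '#05 pc 0000000000004c5c  /system/bin/gx_fpd (main+236)' |
    extract_frames
```

Lines that are not frames (register dumps, the "backtrace:" header) are skipped because their first field does not look like #NN.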
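Judging from the signal, signal 6 (SIGABRT), my first guess was a NULL pointer access that the MMU intercepted and that ARM exception handling reported as a data abort. What matters here is knowing which source code each backtrace frame corresponds to, that is, translating the backtrace into source-level locations, and addr2line provides exactly this.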
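Usage (be sure to use the libraries under the symbols directory, which still carry debug information; the address is the pc value printed in the backtrace frame):

    addr2line -e <library with symbols> <address>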
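In practice it is handy to add -f (print the function name) and -C (demangle C++ names), both standard GNU addr2line options. The sketch below only builds the command line for one frame and does not run it. The variable names ADDR2LINE and SYMBOLS and the helper addr2line_cmd are assumptions for illustration; the defaults mirror the toolchain binary and symbols directory used in this article.

```shell
# Assumed locations; adjust to your own toolchain and build output.
ADDR2LINE=${ADDR2LINE:-./aarch64-linux-android-addr2line}
SYMBOLS=${SYMBOLS:-symbols}

# Hypothetical helper: print the addr2line command for one frame.
#   $1 = binary path as printed in the backtrace
#   $2 = pc address from the same frame
addr2line_cmd() {
    printf '%s -f -C -e %s%s %s\n' "$ADDR2LINE" "$SYMBOLS" "$1" "$2"
}

addr2line_cmd /system/lib64/libc.so 000000000006ab74
```

The on-device path is appended to the symbols directory unchanged, because the symbols tree in this example keeps the same layout as the device filesystem.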
Resolving each frame gives:
    ./aarch64-linux-android-addr2line -e symbols/system/lib64/libc.so 000000000006ab74
    bionic/libc/arch-arm64/syscalls/tgkill.S:9
    ./aarch64-linux-android-addr2line -e symbols/system/lib64/libc.so 0000000000068304
    bionic/libc/bionic/pthread_kill.cpp:45 (discriminator 1)
    ./aarch64-linux-android-addr2line -e symbols/system/lib64/libc.so 00000000000212f8
    bionic/libc/bionic/raise.cpp:34 (discriminator 1)
    ./aarch64-linux-android-addr2line -e symbols/system/lib64/libc.so 000000000001ba98
    bionic/libc/bionic/abort.cpp:47
    ./aarch64-linux-android-addr2line -e symbols/system/lib64/libbinder.so 000000000002e104
    frameworks/native/libs/binder/IPCThreadState.cpp:608
Merging these results back into the backtrace:
    backtrace:
        #00 pc 000000000006ab74  /system/lib64/libc.so (tgkill+8)  tgkill.S:9
        #01 pc 0000000000068304  /system/lib64/libc.so (pthread_kill+68)  pthread_kill.cpp:45
        #02 pc 00000000000212f8  /system/lib64/libc.so (raise+28)  raise.cpp:34
        #03 pc 000000000001ba98  /system/lib64/libc.so (abort+60)  abort.cpp:47
        #04 pc 000000000002e104  /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+216)  IPCThreadState.cpp:608
        #05 pc 0000000000004c5c  /system/bin/gx_fpd (main+236)
        #06 pc 0000000000019794  /system/lib64/libc.so (__libc_init+100)
        #07 pc 0000000000004d78  /system/bin/gx_fpd
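Annotating frames by hand gets tedious for long backtraces. The following is a rough sketch of how the merge could be scripted, not a polished tool. The names ADDR2LINE, SYMBOLS, resolve and annotate are all made up for illustration, and the script assumes the frame layout shown above and a symbols directory that mirrors the device paths.

```shell
# Assumed locations; point these at your toolchain's addr2line and at the
# build's symbols directory.
ADDR2LINE=${ADDR2LINE:-addr2line}
SYMBOLS=${SYMBOLS:-symbols}

# resolve <binary path> <pc address>: print "file:line", or nothing on failure.
resolve() {
    "$ADDR2LINE" -e "$SYMBOLS$1" "$2" 2>/dev/null || true
}

# annotate: copy stdin to stdout, appending the source location to every
# frame line that could be resolved; all other lines pass through unchanged.
annotate() {
    while IFS= read -r line; do
        set -f            # disable globbing while the line is word-split
        set -- $line
        set +f
        loc=''
        if [ "$#" -ge 4 ] && [ "$2" = pc ]; then
            loc=$(resolve "$4" "$3")
        fi
        case $loc in
            ''|'??:'*) printf '%s\n' "$line" ;;                 # unresolved
            *)         printf '%s  %s\n' "$line" "${loc##*/}" ;; # keep basename:line
        esac
    done
}
```

A typical call would be `annotate < tombstone.txt`, where tombstone.txt is a placeholder for wherever the crash log was saved. Frames from binaries without symbols come back from addr2line as `??:0` or similar and are left as they are.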
Note that gx_fpd is a third-party binary shipped without symbols, so its frames cannot be resolved to source locations.
Next, look at the code where the abort occurred, IPCThreadState.cpp:608:
    void IPCThreadState::joinThreadPool(bool isMain)
    {
        LOG_THREADPOOL("**** THREAD %p (PID %d) IS JOINING THE THREAD POOL\n", (void*)pthread_self(), getpid());

        mOut.writeInt32(isMain ? BC_ENTER_LOOPER : BC_REGISTER_LOOPER);

        // This thread may have been spawned by a thread that was in the background
        // scheduling group, so first we will make sure it is in the foreground
        // one to avoid performing an initial transaction in the background.
        set_sched_policy(mMyThreadId, SP_FOREGROUND);

        status_t result;
        do {
            processPendingDerefs();
            // now get the next command to be processed, waiting if necessary
            result = getAndExecuteCommand();

            if (result < NO_ERROR && result != TIMED_OUT && result != -ECONNREFUSED && result != -EBADF) {
                ALOGE("getAndExecuteCommand(fd=%d) returned unexpected error %d, aborting",
                      mProcess->mDriverFD, result);
                abort();
            }
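The code shows that this abort was not caused by a NULL pointer (my initial guess was wrong; a NULL dereference would normally surface as SIGSEGV, not SIGABRT). Instead, it is a deliberate abort() trap that fires when result holds an unexpected error. The next step is to work out why result took such a value and ended up in this trap. This is core binder IPC code, so finding the answer requires a solid understanding of how binder works and real familiarity with its code.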
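To get a feel for which return values reach the abort(), the condition from joinThreadPool can be mirrored as a small predicate. This is only an illustration. The numeric values are assumptions based on Linux errno numbers (ETIMEDOUT = 110, ECONNREFUSED = 111, EBADF = 9) and on NO_ERROR being 0 and TIMED_OUT being -ETIMEDOUT; check them against the headers of your own tree. The function name would_abort is made up.

```shell
# Hypothetical mirror of:
#   result < NO_ERROR && result != TIMED_OUT
#       && result != -ECONNREFUSED && result != -EBADF
# Assumed values: NO_ERROR=0, TIMED_OUT=-110, -ECONNREFUSED=-111, -EBADF=-9.
would_abort() {
    [ "$1" -lt 0 ] && [ "$1" -ne -110 ] && [ "$1" -ne -111 ] && [ "$1" -ne -9 ]
}

# Try a few sample results; -22 (-EINVAL on Linux) is just an example of an
# error that is not on the exemption list.
for r in 0 -110 -111 -9 -22; do
    if would_abort "$r"; then
        echo "$r: abort"
    else
        echo "$r: keep looping"
    fi
done
```

Any negative result outside the three exempted codes trips the trap, so the ALOGE line printed just before the abort ("returned unexpected error %d") is the first thing to look for in the log.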
Original article: http://www.cnblogs.com/gmy296778322/p/5764660.html