January 09, 2007

GCC 函式追蹤功能

昨天有一位同事問及 ARM call frame 相關的問題,我給的建議是透過 GCC Function instrumentation 的機制。該機制出現於 GCC 2.x,由 Cygnus (現為 RedHat) 所提出,在 GCC 中對應的選項為:"finstrument-functions",查看 man page 可得以下資訊:
    -finstrument-functions

    Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the following profiling functions will be called with the address of the current function and its call site. (On some platforms, "__builtin_return_address" does not work beyond the current function, so the call site information may not be available to the profiling functions otherwise.)
        void __cyg_profile_func_enter (void *this_fn,
                                       void *call_site);
        void __cyg_profile_func_exit  (void *this_fn,
                                       void *call_site);
    
    The first argument is the address of the start of the current function, which may be looked up exactly in the symbol table.

    This instrumentation is also done for functions expanded inline in other functions. The profiling calls will indicate where, conceptually, the inline function is entered and exited. This means that addressable versions of such functions must be available. If all your uses of a function are expanded inline, this may mean an additional expansion of code size. If you use extern inline in your C code, an addressable version of such functions must be provided. (This is normally the case anyways, but if you get lucky and the optimizer always expands the functions inline, you might have gotten away without providing static copies.)

    A function may be given the attribute "no_instrument_function", in which case this instrumentation will not be done. This can be used, for example, for the profiling functions listed above, high-priority interrupt routines, and any functions from which the profiling functions cannot safely be called (perhaps signal handlers, if the profiling routines generate output or allocate memory).
不過儘管有如此詳細的資訊,我們還是依循杜威博士的「作中學」方式來學習吧。首先,來寫個 "Hello World" 小程式: (hello.c)
#include <stdio.h>
#define DUMP(func, call) \
        printf("%s: func = %p, called by = %p\n", __FUNCTION__, func, call)

void __attribute__((__no_instrument_function__))
__cyg_profile_func_enter(void *this_func, void *call_site)
{
        DUMP(this_func, call_site);
}
void __attribute__((__no_instrument_function__))
__cyg_profile_func_exit(void *this_func, void *call_site)
{
        DUMP(this_func, call_site);
}

main()
{
        puts("Hello World!");
        return 0;
}
編譯與執行:
$ gcc -finstrument-functions hello.c -o hello
$ ./hello
__cyg_profile_func_enter: func = 0x8048468, called by = 0xb7e36ebc
Hello World!
__cyg_profile_func_exit: func = 0x8048468, called by = 0xb7e36ebc
看到 "__attribute__" 就知道一定是 GNU extension,而前述的 man page 也提到 -finstrument-functions 會在每次進入與退出函式前呼叫 "__cyg_profile_func_enter" 與 "__cyg_profile_func_exit" 這兩個 hook function。等等,「進入」與「退出」是又何解?C Programming Language 最經典之處在於,雖然沒有定義語言實做的方式,但實際上 function call 皆以 stack frame 的形式存在,去年在「深入淺出 Hello World」Part II 有提過。所以上述那一大段英文就是說,如果我們不透過 GCC 內建函式 "__builtin_return_address" 取得 caller 與 callee 相關的動態位址,那麼仍可透過 -finstrument-functions,讓 GCC 合成相關的處理指令,讓我們得以追蹤。而看到 __cyg 開頭的函式,就知道是來自 Cygnus 的貢獻,在 gcc 2.x 內部設計可瞥見不少。

當我們試著移除 "__attribute__((__no_instrument_function__))" 那兩行來看看: (wrong.c)
編譯與執行:
$ gcc -g -finstrument-functions wrong.c -o wrong
$ ./wrong
程式記憶體區段錯誤
發生什麼事情呢?請出 gdb 協助:
$ gdb ./wrong
GNU gdb 6.6-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) run
Starting program: /home/jserv/HelloWorld/helloworld/instrument/wrong 

Program received signal SIGSEGV, Segmentation fault.
0x0804841d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
7	{
(gdb) bt
#0  0x0804841d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
#1  0x0804842d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
#2  0x0804842d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
...
#30 0x0804842d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
#31 0x0804842d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
#32 0x0804842d in __cyg_profile_func_enter (this_func=0x8048414, call_site=0x804842d) at wrong.c:7
---Type  to continue, or q  to quit---
既然 "__cyg_profile_func_enter" 是 function hook,則本身不得也被施以 "function instrument",否則就無止盡遞迴了,不過我們也可以發現一件有趣的事情:
$ nm wrong | grep 8048414
08048414 T __cyg_profile_func_enter
如我們所見,"__cyg_profile_func_enter" 的位址被不斷代入 __cyg_profile_func_enter function arg 中。GNU binutils 裡面有個小工具 addr2line,我們可以該工具取得虛擬位址對應的行號或符號:
$ addr2line -e wrong 0x8048414
/home/jserv/HelloWorld/helloworld/instrument/wrong.c:7
就 Linux 應用程式開發來說,我們可透過這個機制作以下應用:
  • 撰寫特製的 profiler
  • 取得執行時期的 Call Graph
  • 置入自製的 signal handler,實做 backtrace 功能
  • 模擬 Reflection 機制
之前的 blog [透過 GCC 作 Call Graph 視覺化輸出] 提過一個透過 compiler code analysis 的方式,輸出 call graph 圖形的途徑,事實上,我們可透過 GCC "-finstrument-functions" 來作動態的分析處理,詳情可見 M. Tim Jones 的文章 [用 Graphviz 可視化函數調用],展示了如何整合這一系列的工具,達到構建一個動態的圖形函式呼叫調用生成。

撰寫特製的 profiler 的應用可見 [CygProfiler suite] 與 [FunctionCheck],那為何不用 binutils 中的 gprof 呢?因為很多時候系統會有缺陷,而 gprof 又依賴 kernel 與 glibc 的機制,當我們在移植新的系統時,就得用 "PoorMan's solution" (按:當然沒有 "PoorMan" 這家公司,這裡是強調「親手打造」之意) 來打穩底子。同樣的,「取得執行時期的 Call Graph」與「置入自製的 signal handler,實做 backtrace 功能」可參考知名的 [DirectFB] 計畫,當我們在編譯時期加入 "--enable-trace" (enable call tracing) 與 "--enable-debug" (enable debugging) 兩個選項時,編譯過程就會加入 GCC "-finstrument-functions" 選項,並且也提供特製的 signal handler 模擬 gdb backtrace 輸出功能。

又因為 "-finstrument-functions" 對系統偵錯是如此重要,我們來看看內部實做。我們好奇的是,main 函式中 puts("Hello World!") 呼叫的前後,到底被 GCC 施以什麼魔法呢?還是透過實驗的途徑:
$ objdump -d hello | grep -B2 puts
... (省略) ...
--
 8048488:	e8 87 ff ff ff       	call   8048414 <__cyg_profile_func_enter>
 804848d:	c7 04 24 df 85 04 08 	movl   $0x80485df,(%esp)
 8048494:	e8 c3 fe ff ff       	call   804835c 
看來 GCC 自動生成 __cyg_profile_func_enter 的 function call 動作,並置於 puts 呼叫之前,我們繼續觀察:
$ objdump -d hello | grep -A5 puts
... (省略) ...
--
 8048494:	e8 c3 fe ff ff       	call   804835c 
 8048499:	bb 00 00 00 00       	mov    $0x0,%ebx
 804849e:	8b 45 04             	mov    0x4(%ebp),%eax
 80484a1:	89 44 24 04          	mov    %eax,0x4(%esp)
 80484a5:	c7 04 24 68 84 04 08 	movl   $0x8048468,(%esp)
 80484ac:	e8 8d ff ff ff       	call   804843e <__cyg_profile_func_exit>
puts 呼叫之後,GCC 也自動生成 __cyg_profile_func_exit 的 function call 動作,這樣的技巧也被用於 KFT (Kernel Function Trace,前身為 KFI [Kernel function Instrumentation]),同樣以 GCC "-finstrument-functions" 來達到對每個 Linux Kernel function enter/exit 的追蹤功能,但得考慮大量的 static inline function。
由 jserv 發表於 January 9, 2007 04:59 PM
迴響

每次都覺得jserv大的文都像是變魔術...
很神

kornelius 發表於 January 9, 2007 09:53 PM

谢谢分享。

lgfang 發表於 September 27, 2008 11:05 AM
發表迴響









記住我的資訊?