When your program receives SIGSEGV (segmentation fault), the kernel automatically terminates it (unless the application handles SIGSEGV).
After a couple of long nights of debugging, tracing and drinking coffee, you finally find the line in the sources where your application causes the system to send this crafty signal.
The most common problem is that you usually cannot run the application within a debugger. The fault may be caused by special circumstances, and it is really painful to sit in front of the debugger trying to reproduce it. Multi-threading, multiple processes and network interaction add even more complexity.
Core dumps would be a good solution here.
The Linux kernel is able to write a core dump if the application crashes. This core dump records the state of the process at the time of the crash.
Later you can use gdb to analyze the core dump.
Core dumps are usually disabled by default on Linux (the core file size limit is set to 0).
To enable them, run the following in the shell that will start the application:

    ulimit -c unlimited

By default the kernel writes the core dump to the current working directory of the application. You can customize the file path pattern for core dumps by writing it to /proc/sys/kernel/core_pattern.
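The ulimit setting only affects the current shell and the programs started from it. As a rough sketch of an alternative, a program could raise its own core size limit at startup with getrlimit(2) and setrlimit(2). The function name enable_core_dumps is made up for this example, and it assumes the hard limit is not already 0 (in that case the soft limit cannot be raised by an unprivileged process and core dumps stay disabled).

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit of the calling
 * process to its hard limit, roughly what "ulimit -c unlimited" does when
 * the hard limit is unlimited.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise the soft limit only up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;

    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling enable_core_dumps() early in main() would be the natural place, before anything has a chance to crash.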
According to the current documentation, the pattern may contain the following templates:

    %%  a single % character
    %p  PID of the dumped process
    %u  real UID of the dumped process
    %g  real GID of the dumped process
    %s  number of the signal causing the dump
    %t  time of the dump (seconds since 0:00h, 1 Jan 1970)
    %h  hostname (same as the 'nodename' returned by uname(2))
    %e  executable filename

So with (as root)

    echo /tmp/%e-%p.core > /proc/sys/kernel/core_pattern

Linux should put core dumps into /tmp with names of the form <executable>-<PID>.core.
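To make the template expansion more concrete, here is a small sketch of how a pattern such as /tmp/%e-%p.core turns into a file name. It is only an illustration, not the kernel's code: the function expand_core_pattern is hypothetical, and it handles just %%, %e and %p, silently dropping every other template.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT the kernel implementation.
 * Expands %% (literal %), %e (executable name) and %p (PID) in pattern
 * and writes the result to out. Other templates are dropped.
 * Returns 0 on success, -1 if the result does not fit into outsz bytes. */
int expand_core_pattern(const char *pattern, const char *exe, long pid,
                        char *out, size_t outsz)
{
    size_t used = 0;

    assert(pattern != NULL && exe != NULL && out != NULL && outsz > 0);

    for (const char *p = pattern; *p != '\0'; p++) {
        char piece[32];
        const char *src = piece;

        if (*p != '%') {
            piece[0] = *p;              /* ordinary character, copied as is */
            piece[1] = '\0';
        } else {
            p++;
            if (*p == '\0')
                break;                  /* lone % at the end of the pattern */
            if (*p == '%') {
                piece[0] = '%';
                piece[1] = '\0';
            } else if (*p == 'e') {
                src = exe;
            } else if (*p == 'p') {
                snprintf(piece, sizeof piece, "%ld", pid);
            } else {
                piece[0] = '\0';        /* template not handled in this sketch */
            }
        }

        size_t len = strlen(src);
        if (used + len >= outsz)        /* keep room for the terminating NUL */
            return -1;
        memcpy(out + used, src, len);
        used += len;
    }

    out[used] = '\0';
    return 0;
}
```

With the pattern above, the executable name test and PID 25301, this produces /tmp/test-25301.core.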
Let's try all this.
Say we have this code
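    #include <stdlib.h>   /* free() */

    void crash()
    {
        char a[0];
        free(a);   /* bug on purpose: a lives on the stack, not on the heap */
    }

    int main(int argc, char **argv)
    {
        crash();
        return 0;
    }

As you can see, the application should cause a segmentation violation on the free call. Let's compile it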
    gcc test.c -g -o test

and execute
    ./test
    Segmentation fault (core dumped)

The system tells us that the core was dumped. Let's see what we have

    ll /tmp/*core
    -rw------- 1 niam niam 151552 2008-10-15 15:19 /tmp/test-25301.core

Got it. Now I'm going to run gdb

    gdb --core /tmp/test-25301.core ./test

gdb clearly tells us that the application was terminated with SIGSEGV

    Core was generated by `./test'.
    Program terminated with signal 11, Segmentation fault.
    #0  0xb7e4ff97 in free () from /lib/libc.so.6

Now we can use the power of gdb to catch the problem code
    (gdb) bt
    #0  0xb7e4ff97 in free () from /lib/libc.so.6
    #1  0x08048392 in crash () at 1.c:9
    #2  0x080483aa in main () at 1.c:15
    (gdb) up
    #1  0x08048392 in crash () at 1.c:9
    9           free(a);
    (gdb) p a
    $1 = 0xbfc5daf8 "\b..."
    (gdb) whatis a
    type = char [0]
    (gdb) info frame
    Stack level 1, frame at 0xbfc5db00:
     eip = 0x8048392 in crash (1.c:9); saved eip 0x80483aa
     called by frame at 0xbfc5db10, caller of frame at 0xbfc5daf0
     source language c.
     Arglist at 0xbfc5daf8, args:
     Locals at 0xbfc5daf8, Previous frame's sp is 0xbfc5db00
     Saved registers:
      ebp at 0xbfc5daf8, eip at 0xbfc5dafc

We can see here that free attempted to release memory that lives on the stack. 'whatis a' shows that a is a local array (char [0]), not a heap pointer, and its address is inside the stack frame of crash: a is stored at 0xbfc5daf8, exactly where 'info frame' places the locals, just below the frame address 0xbfc5db00. (The string gdb printed after the address of a is only uninitialized stack garbage and is abbreviated here.)
gdb gave us all the information needed for further investigation. The only thing left is to understand who taught you to free an array on the stack o_O.
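For contrast, here is a minimal sketch of what a valid counterpart of crash() might look like. The function name no_crash and the buffer size are made up for this example. The only point is that free() may be given nothing but a pointer that came from malloc() (or calloc()/realloc()), or NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counterpart of crash(): the buffer is allocated on the heap,
 * so passing it to free() is valid.
 * Returns 0 on success, -1 if the allocation fails. */
int no_crash(void)
{
    char *a = malloc(16);       /* heap memory, owned until free() */
    if (a == NULL)
        return -1;

    strcpy(a, "heap");          /* use the buffer (5 bytes including NUL) */
    free(a);                    /* fine: a was returned by malloc() */
    return 0;
}
```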