How do I translate a kernel "trap divide error" (rip:79dd99 rsp:2b6d2ea40450) back to a location in the source?
A customer reported a division by zero error in one of our programs. We only have this VLM line:
kernel: myprog[16122] trap divide error rip:79dd99 rsp:2b6d2ea40450 error:0
I don't believe there is a core file for this.
I have searched the web for a way to identify the line of the program that caused this division by zero, but so far without success.
I understand that 16122 is the pid of the program, so it won't help me.
I suspect rsp: 2b6d2ea40450 has something to do with the address of the line that caused the error (0x2b6d2ea40450), but is it true?
If so, how can I translate it to an approximate location in the source, assuming I can load the debug version of myprog into gdb and ask it to show the context around that address?
Any, any help would be greatly appreciated!
rip is the instruction pointer, rsp is the stack pointer. The stack pointer is not very useful unless you have a core image or a running process.
You can use either addr2line or the disassemble command in gdb to see the line that caused the error, based on the ip.
$ cat divtest.c
main()
{
    int a, b;

    a = 1; b = a / 0;
}
$ ./divtest
Floating point exception (core dumped)
$ dmesg | tail -1
[ 6827.463256] traps: divtest[3255] trap divide error ip:400504 sp:7fff54e81330 error:0 in divtest[400000+1000]
$ addr2line -e divtest 400504
./divtest.c:5
$ gdb divtest
(gdb) disass /m 0x400504
Dump of assembler code for function main:
2       {
   0x00000000004004f0:  push   %rbp
   0x00000000004004f1:  mov    %rsp,%rbp

3           int a, b;
4
5           a = 1; b = a / 0;
   0x00000000004004f4:  movl   $0x1,-0x4(%rbp)
   0x00000000004004fb:  mov    -0x4(%rbp),%eax
   0x00000000004004fe:  mov    $0x0,%ecx
   0x0000000000400503:  cltd
   0x0000000000400504:  idiv   %ecx
   0x0000000000400506:  mov    %eax,-0x8(%rbp)

6       }
   0x0000000000400509:  pop    %rbp
   0x000000000040050a:  retq

End of assembler dump.
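Applied to the log line from the question, the same steps could be scripted roughly as follows. This is only a sketch: trap_ip is a hypothetical helper name, ./myprog is an assumed path to the debug build, and it assumes the logged address can be passed to addr2line unchanged, as in the divtest example above. It accepts both the older "rip:" and the newer "ip:" field names.

```shell
#!/bin/sh
# trap_ip (hypothetical helper): print the instruction pointer found in a
# kernel "trap divide error" log line. Matches "rip:" as well as "ip:".
trap_ip() {
    printf '%s\n' "$1" |
        sed -n 's/.*[[:space:]]r\{0,1\}ip:[[:space:]]*\([0-9a-fA-F][0-9a-fA-F]*\).*/\1/p'
}

# Log line taken from the question.
line='kernel: myprog[16122] trap divide error rip:79dd99 rsp:2b6d2ea40450 error:0'
ip=$(trap_ip "$line")
echo "faulting instruction: 0x$ip"

# ./myprog is an assumed path; it must be the same build the customer ran,
# with debug info. -f also prints the enclosing function name.
if [ -f ./myprog ] && command -v addr2line >/dev/null 2>&1; then
    addr2line -f -e ./myprog "0x$ip"
fi
```

The rsp value is deliberately ignored here, since without a core file or a live process there is no stack to inspect at that address.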