Duplicate memory management symbols in libc.so and ld-linux.so
Some preamble
It seems that malloc, calloc, realloc and free are all duplicated in ld-linux.so and libc.so. As I understand it, the dynamic loader does this so that it can manage memory within ld-linux.so before libc.so is loaded and makes its memory management functions available. However, I have some questions about these duplicated symbols:
Here's a very simple C program calling malloc and exiting:
#include <stdlib.h>

int main()
{
    void *p = malloc(8);
    return 0;
}
I compile it with gcc on an x86_64 Linux box and do some debugging with gdb:
$ gcc -g -o main main.c
$ gdb ./main
(gdb) start
Temporary breakpoint 1 at 0x4004f8
Starting program: main
Temporary breakpoint 1, 0x00000000004004f8 in main ()
(gdb) info symbol malloc
malloc in section .text of /lib64/ld-linux-x86-64.so.2
(gdb) b malloc
Breakpoint 2 at 0x7ffff7df0930: malloc. (2 locations)
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   <MULTIPLE>
2.1                         y   0x00007ffff7df0930 in malloc at dl-minimal.c:95
2.2                         y   0x00007ffff7a9f9d0 in __GI___libc_malloc at malloc.c:2910
Running nm on libc.so and ld.so shows the following:
$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep malloc
00000000000829d0 T __libc_malloc
00000000003b6700 V __malloc_hook
00000000003b8b00 V __malloc_initialize_hook
00000000000829d0 T malloc
0000000000082db0 W malloc_get_state
00000000000847c0 T malloc_info
0000000000082480 W malloc_set_state
00000000000844f0 W malloc_stats
0000000000084160 W malloc_trim
00000000000844b0 W malloc_usable_size
$ nm -D /lib64/ld-linux-x86-64.so.2 | grep malloc
0000000000016930 W malloc
Questions
- malloc is duplicated in libc.so and ld-linux.so, but in ld-linux.so it is a weak symbol, so both should resolve to the same address. Also, as I understand it, the dynamic loader's symbol resolution table is global and resolves only one address per symbol (correct me if I'm wrong). However, gdb clearly shows otherwise (two different addresses). Why is that?
- gdb effectively breaks on two different addresses when I type break malloc, but only shows information about the symbol in ld.so when I type info symbol malloc. Why is that?
- Although I break on malloc, and libc.so defines a malloc symbol of its own (as shown by nm), gdb breaks on the symbol __GI___libc_malloc. Why is that?
- I suspect that GDB is simply placing a breakpoint on every malloc symbol it can find, "just in case", so to speak. GDB uses its own internal symbol table, not the dynamic loader's. That is how it can break on non-exported symbols if you have debug symbols. Only one address is listed in the command's feedback, to reduce noise when there are many matches. It still mentions "2 locations", so you can check them yourself with info breakpoints.
- My guess is that the info symbol developer just didn't foresee this situation, so it only prints the first match.
- __GI___libc_malloc is the name of the internal, actual implementation of malloc inside libc.so. Since you also get the source line information "at malloc.c:2910", I assume the name comes from the debug symbols and not from the ELF symtab. Again, one location can have several names (see the __libc_malloc symbol in your nm listing, which has the same address as malloc), so GDB just picks one.
BTW, the malloc pointer in ld.so's GOT is replaced with the address of libc's malloc when libc.so is loaded (it initially points to the internal implementation). So you end up with the same address for both by the time the process's entry point is reached, and ld.so's malloc is no longer used.