Duplicate memory management symbols in libc.so and ld-linux.so

Some preamble

It seems that malloc, calloc, realloc and free are duplicated in ld-linux.so and libc.so. As I understand it, this is done so that the dynamic loader can manage memory inside ld-linux.so before libc.so is loaded and makes its memory management functions available. However, I have some questions about these duplicated symbols:

Here's a very simple C program calling malloc and exiting:

#include <stdlib.h>

int main()
{
  void *p = malloc(8);
  return 0;
}

      

I compile it with gcc on an x86_64 Linux box and do some debugging with gdb:

$ gcc -g -o main main.c
$ gdb ./main
(gdb) start
Temporary breakpoint 1 at 0x4004f8
Starting program: main 

Temporary breakpoint 1, 0x00000000004004f8 in main ()
(gdb) info symbol malloc
malloc in section .text of /lib64/ld-linux-x86-64.so.2
(gdb) b malloc
Breakpoint 2 at 0x7ffff7df0930: malloc. (2 locations)
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   <MULTIPLE>         
2.1                         y     0x00007ffff7df0930 in malloc at dl-minimal.c:95
2.2                         y     0x00007ffff7a9f9d0 in __GI___libc_malloc at malloc.c:2910

      

Running nm on libc.so and ld.so shows the following:

$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep malloc
00000000000829d0 T __libc_malloc
00000000003b6700 V __malloc_hook
00000000003b8b00 V __malloc_initialize_hook
00000000000829d0 T malloc
0000000000082db0 W malloc_get_state
00000000000847c0 T malloc_info
0000000000082480 W malloc_set_state
00000000000844f0 W malloc_stats
0000000000084160 W malloc_trim
00000000000844b0 W malloc_usable_size

$ nm -D /lib64/ld-linux-x86-64.so.2 | grep malloc
0000000000016930 W malloc

      

Questions

  • malloc is duplicated in libc.so and ld-linux.so, but in ld-linux.so it is a weak symbol, so both should resolve to the same address. Also, as I understand it, the dynamic loader's symbol resolution table is global and allows only one address per symbol (correct me if I'm wrong).

    However, gdb clearly shows the opposite (two different addresses). Why is that?

  • gdb effectively breaks at two different addresses when I type break malloc, but only shows symbol information for ld.so when I type info symbol malloc. Why is this?

  • Although I break on malloc, and libc.so defines its own malloc symbol (as shown by nm), gdb breaks on the symbol __GI___libc_malloc. Why is this?



2 answers


  • I suspect that GDB simply puts a breakpoint on every malloc symbol it can find, "just in case," so to speak. GDB uses its own internal symbol table, not the dynamic loader's, so it can break on non-exported symbols if you have debug symbols. Only one address is listed in the command feedback to reduce noise when there are many matches. It still mentions "2 locations", so you can check them yourself with info breakpoints.
  • I'm guessing the developer of info symbol just didn't foresee this situation, so it only prints the first match.
  • __GI___libc_malloc is the name of the internal, actual implementation of malloc inside libc.so. Since you are also getting source line information ("at malloc.c:2910"), I assume the name comes from the debug symbols and not from the ELF symtab. Again, one location can have many names (see __libc_malloc in the symbol list), so GDB just picks one.


BTW, the malloc pointer in ld.so's GOT is replaced with the address of libc's malloc when libc.so is loaded (it initially points to ld.so's internal implementation). This way both resolve to the same address by the time the process's entry point is reached, and ld.so's own malloc is no longer used.



ld.so is not a library; it is the dynamic linker (which is invoked implicitly when the program starts, to build the executable image in memory by linking the executable with its shared libraries).


