OSMesa context creation segfaults in C++, but only when statically linked

I made a C++ tool for off-screen rendering of 3D models. Rendering is performed using the OSMesa library.

The software worked flawlessly for over a year, and I stopped updating it about 6 months ago. In the meantime, my development environment has been updated several times.

Now I compiled it again and encountered an unexpected error.

The dynamically linked version of the software still works as expected, but the statically linked version fails.

I guess this is a bug in OSMesa's configure / compile / link procedure rather than in the library code, but any tips for debugging the segmentation fault are appreciated.

Having tried numerous variations of the compilation process without success, I am now stuck. Can anyone see something silly that I am doing in some of the steps below?


I built a static OSMesa library from the same version as the shared library that works on my system (12.0.6), disabling all unnecessary features (I am on an Ubuntu-based system, and no static OSMesa library is available from the repositories):

./configure \
    --disable-xvmc \
    --disable-glx \
    --disable-dri \
    --with-dri-drivers="" \
    --with-gallium-drivers="" \
    --disable-shared-glapi \
    --disable-egl \
    --with-egl-platforms="" \
    --enable-osmesa \
    --enable-gallium-llvm=no \
    --disable-gles1 \
    --disable-gles2 \
    --enable-static \
    --disable-shared

This is the command to compile my offscreen display tool:

g++ -std=c++11 -Wall -O3 -g -static -static-libgcc -static-libstdc++ ./src/measure_model.cpp model.o thumbnail.o -o measure_model_debug -pthread -lOSMesa -ldl -lm -lpng -lz -lcrypto

This is the warning I get when linking statically with OSMesa. It was already present a year ago, when the static binary still worked:

/home/XXX/XXX/backend/lambda/mesa/mesa-12.0.6/src/mesa/main/dlopen.h:52: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

This is what I get from running the tool:

Segmentation fault (core dumped)

But no segmentation fault occurs if I just skip the step of creating the OSMesa context (and obviously the whole 3D rendering).

This is a backtrace:

#0  0x0000000000000000 in ?? ()
#1  0x00000000004af20a in mtx_init (type=4, mtx=0xe10f70) at ../../include/c11/threads_posix.h:215
#2  _mesa_NewHashTable () at main/hash.c:135
#3  0x000000000052f295 in _mesa_alloc_shared_state (ctx=ctx@entry=0xdcc9b0) at main/shared.c:67
#4  0x000000000046e717 in _mesa_initialize_context (ctx=ctx@entry=0xdcc9b0, api=api@entry=API_OPENGL_COMPAT, visual=, share_list=share_list@entry=0x0, driverFunctions=driverFunctions@entry=0x7fffffffcd40) at main/context.c:1192
#5  0x000000000046c870 in OSMesaCreateContextAttribs (attribList=attribList@entry=0x7fffffffd290, sharelist=) at osmesa.c:834
#6  0x000000000046ccdc in OSMesaCreateContextExt (format=, depthBits=, stencilBits=, accumBits=, sharelist=) at osmesa.c:660
#7  0x0000000000468742 in generate_thumbnail(Model*, Json::Value) ()
#8  0x0000000000401c7d in main (argc=, argv=) at ./src/measure_model.cpp:107

A statically linked binary is a strict requirement.

The segmentation fault occurs on the same machine that I use to compile the tool (the static OSMesa library was built on that machine too). The dynamically linked build of the same tool does not crash.


1 answer


This is what I get from running the tool: Segmentation fault (core dumped)

But no segmentation fault occurs if I just skip the step of creating the OSMesa context (and obviously the whole 3D rendering)

So the problem is in OSMesa context creation. Your backtrace shows that the top frame was executing at instruction pointer zero (a jump or call to NULL), so mtx_init, which is part of creating the OSMesa context, called some function whose address turned out to be NULL.

#0  0x0000000000000000 in ?? ()
#1  0x00000000004af20a in mtx_init (type=4, mtx=0xe10f70) at ../../include/c11/threads_posix.h:215
#2  _mesa_NewHashTable () at main/hash.c:135
#3  0x000000000052f295 in _mesa_alloc_shared_state (ctx=ctx@entry=0xdcc9b0) at main/shared.c:67
#4  0x000000000046e717 in _mesa_initialize_context (ctx=ctx@entry=0xdcc9b0, api=api@entry=API_OPENGL_COMPAT, visual=, share_list=share_list@entry=0x0, driverFunctions=driverFunctions@entry=0x7fffffffcd40) at main/context.c:1192
#5  0x000000000046c870 in OSMesaCreateContextAttribs (attribList=attribList@entry=0x7fffffffd290, sharelist=) at osmesa.c:834
#6  0x000000000046ccdc in OSMesaCreateContextExt (format=, depthBits=, stencilBits=, accumBits=, sharelist=) at osmesa.c:660
#7  0x0000000000468742 in generate_thumbnail(Model*, Json::Value) ()
#8  0x0000000000401c7d in main (argc=, argv=) at ./src/measure_model.cpp:107

      

Which function was it? According to the source of mtx_init() in include/c11/threads_posix.h on GitHub, it only calls pthread_mutex_init, pthread_mutexattr_init and a few other mutex-related libpthread (-lpthread) functions.
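For orientation, here is a minimal sketch of what such a mutex initializer typically looks like. It is not the Mesa source: sketch_mtx_init, the sketch_mtx_* constants and sketch_recursive_lock_works are hypothetical names chosen for illustration, and only the pthread_* calls are real. On a glibc that keeps these functions in libpthread.a, each of them has to be pulled out of that archive in a static link, so any one of them could be the NULL target.

```cpp
#include <pthread.h>

// Hypothetical flag values for this sketch only (not Mesa's constants).
enum { sketch_mtx_plain = 0, sketch_mtx_recursive = 1 };

// Hypothetical stand-in for a C11-style mtx_init built on pthreads.
// Returns 0 on success and -1 on failure.
int sketch_mtx_init(pthread_mutex_t* mtx, int type) {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        return -1;
    }
    const int kind = (type & sketch_mtx_recursive) ? PTHREAD_MUTEX_RECURSIVE
                                                   : PTHREAD_MUTEX_NORMAL;
    int rc = pthread_mutexattr_settype(&attr, kind);
    if (rc == 0) {
        rc = pthread_mutex_init(mtx, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}

// Small self-check: a recursive mutex can be locked twice by the same thread.
bool sketch_recursive_lock_works() {
    pthread_mutex_t m;
    if (sketch_mtx_init(&m, sketch_mtx_recursive) != 0) {
        return false;
    }
    bool ok = pthread_mutex_lock(&m) == 0 && pthread_mutex_lock(&m) == 0;
    ok = ok && pthread_mutex_unlock(&m) == 0 && pthread_mutex_unlock(&m) == 0;
    pthread_mutex_destroy(&m);
    return ok;
}
```

Calling sketch_recursive_lock_works() from a dynamically linked test program is a quick way to confirm that the pthread calls themselves behave; if the same program crashes only when built with -static, the link step is the likely suspect.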

Why was NULL called instead of the real function? Probably because of the static linking and some problem in how glibc and / or libpthread were linked. I have not pinned down the exact cause yet (I did find a report about libpthread.a being statically linked into a shared library, which is wrong and can never work).

In your case, pthread_mutex_init exists only as a (strong) alias in glibc / nptl / pthread_mutex_init.c (line 150): strong_alias (__pthread_mutex_init, pthread_mutex_init). There may also be a weak symbol alias in glibc itself, possibly left unresolved. Something went wrong in your link options and / or in ld's symbol resolution, so ld did not find or link nptl/pthread_mutex_init.o (a member of the libpthread.a archive), which holds the real symbol, into the final executable, and the relocation was left pointing to NULL. (ld routinely skips archive members it considers unused and does not link them into the final executable.) A glibc expert may know more; Employed Russian is one such expert on SO.
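A call landing on NULL is consistent with how an unresolved weak reference behaves. The sketch below assumes GCC or Clang on Linux / ELF; hypothetical_missing_function is a made-up symbol that is deliberately never defined, and the two helper names are also only for illustration. The link still succeeds, the symbol's address is null, and calling it without the guard would jump to address 0, giving a frame #0 like the one in your backtrace.

```cpp
// Hypothetical symbol: declared weak and never defined in any object or library.
// With a weak declaration the linker does not report an undefined reference;
// it resolves the address to 0 instead.
extern "C" int hypothetical_missing_function(int) __attribute__((weak));

// True only if the linker found a real definition for the weak reference.
bool weak_symbol_is_resolved() {
    return &hypothetical_missing_function != nullptr;
}

// Guarded call: returns -1 instead of jumping to address 0 when the
// definition is missing from the final executable.
int call_if_present(int x) {
    if (!weak_symbol_is_resolved()) {
        return -1;
    }
    return hypothetical_missing_function(x);
}
```

A similar address check on the pthread functions in the static binary (for example in gdb with print pthread_mutex_init) may show whether the real implementations made it into the executable.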

I suggest linking statically only against your own libraries, and perhaps also against ordinary non-system libraries such as Mesa. You can use the options -Wl,-Bstatic -lyour_lib -Wl,-Bdynamic to switch the linker to static mode temporarily for the libraries listed in between, or use the -l: trick, as in -l:libYour_lib.a (found by Radek in the same question). But do not link statically against the core glibc libraries such as libc, libpthread and librt. (Static glibc has known problems with NSS: the target system must have exactly the same version of the dynamic glibc for NSS to work.)



If you want to ship your application to older machines and need some glibc functionality, you can also bundle your own copy of the glibc shared libraries with the application: put them in a subdirectory, add an rpath linker option to change the library search path, and change the INTERP section from the default loader ld-linux.so.2 to your own copy of ld-linux.so.2 from your glibc version. You will still have problems on kernels that are too old, because newer glibc versions require features (syscalls, structs) of a reasonably recent kernel.

Or you can package your application in a container such as Docker, or in some other isolation solution (or a chroot?), so that it always has its own libraries.

UPDATE: I just found a report of a similar backtrace, with NULL in place of the mutex implementation from nptl: https://bugzilla.redhat.com/show_bug.cgi?id=163083 "A statically linked C++ program using pthreads will segfault" (2005-2007)

pthread_mutex_init(&lock, NULL);

g++ -g -static foo.cpp -o foo -lpthread

where

#0 0x00000000 in ?? ()
#1 0x08048232 in main () at foo.cpp:7

This seems to be due to the fact that some pthreads functions are not included in the output executable. This bug may be a duplicate of #115157, and I apologize if so, but hopefully a useful test case will help.

Additional Information:

The suggestion in #115157 to force linking of all of libpthread.a is a valid workaround.

https://bugzilla.redhat.com/show_bug.cgi?id=115157 "executables statically linked to /usr/lib/nptl/libpthread.a fail" - 2004-2009 CLOSED WONTFIX

Jakub Jelinek 2004-10-29 05:26:10 EDT

First of all, avoid -static if possible; it creates problems, both with portability and otherwise.

If you really need to create a statically linked binary with -lpthread linked in, then just use -Wl,--whole-archive -lpthread -Wl,--no-whole-archive instead of -pthread. Everything else has many problems.
