Weird segfault in XCloseDisplay (called from glfwTerminate)

Hi,

I am posting this here rather than on the GitHub issue tracker because I doubt it is actually a bug in GLFW, but I would appreciate any help in fixing it.

On this specific PC glfwTerminate is reliably causing a segfault somewhere in XCloseDisplay. Output of the glfwinfo test in gdb:

GLFW header version: 3.2.0
GLFW library version: 3.2.0
GLFW library version string: "3.2.0 X11 GLX EGL clock_gettime /dev/js Xf86vm shared"
OpenGL context version string: "4.5.0 NVIDIA 352.79"
OpenGL context version parsed by GLFW: 4.5.0
OpenGL context flags (0x00000000):
OpenGL context flags parsed by GLFW:
OpenGL profile mask (0x00000000): unknown
OpenGL profile mask parsed by GLFW: compat
OpenGL robustness strategy (0x00008261): none
OpenGL robustness strategy parsed by GLFW: none
OpenGL context renderer string: "GeForce GTX 770/PCIe/SSE2"
OpenGL context vendor string: "NVIDIA Corporation"
OpenGL context shading language version: "4.50 NVIDIA"
OpenGL framebuffer:
 red: 8 green: 8 blue: 8 alpha: 8 depth: 24 stencil: 8
 samples: 0 sample buffers: 0
 accum red: 16 accum green: 16 accum blue: 16 accum alpha: 16 aux buffers: 4
Vulkan loader: missing

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff2054560 in ?? ()
(gdb) bt
#0  0x00007ffff2054560 in ?? ()
#1  0x00007ffff6ddfe22 in XCloseDisplay ()
   from /usr/lib/x86_64-linux-gnu/libX11.so.6
#2  0x00007ffff7bc93ce in _glfwPlatformTerminate ()
   from /home/me/Development/glfw-3.2-build/src/libglfw.so.3
#3  0x00007ffff7bc15ec in glfwTerminate ()
   from /home/me/Development/glfw-3.2-build/src/libglfw.so.3
#4  0x000000000040c2a5 in main ()

I am mystified as to the cause of this error, as everything was working fine until recently. It’s also strange that all of my code runs fine (creating a window, loading OpenGL function pointers, rendering, etc.) up until this segfault at the end. Any ideas?

If you comment out the call to _glfwTerminateEGL in x11_init.c, does it still segfault?

Removing the call to _glfwTerminateEGL allows my program (and glfwinfo) to exit normally with no segfault. Interesting… I can bisect if you think it might be helpful?

I suspect a bisect will point here but I certainly won’t turn down verification if you have the time.

https://github.com/glfw/glfw/commit/ef80beab812ee3282493e87f2d7ad8cf10cb8a95

Thought of a quicker way than a full bisect: instead of commenting out the call to _glfwTerminateEGL, comment out the call to _glfw_dlclose in that function in egl_context.c. That shouldn’t segfault either.

Thanks for the shortcut; I can confirm that commenting out just _glfw_dlclose also resolves the problem.
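One plausible reading of this result, offered as a guess rather than a confirmed diagnosis: a library loaded for EGL leaves a cleanup callback registered on the X display, and unloading that library before the display is closed leaves the callback pointing at unmapped code. That would match frame #0 of the backtrace, an unresolved address called from XCloseDisplay. The sketch below is a toy model of that ordering problem only. Every name in it (Module, Display, register_close_hook and so on) is hypothetical and none of it is Xlib, EGL or GLFW code.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in Xlib, EGL, or GLFW. */

#define MAX_HOOKS 4

typedef struct Module {
    int loaded;                 /* 1 while the "shared library" is mapped */
} Module;

typedef struct Hook {
    Module *owner;              /* module whose code the hook lives in */
} Hook;

typedef struct Display {
    Hook hooks[MAX_HOOKS];
    size_t hook_count;
} Display;

/* A module registers a close hook on the display, as a driver library might. */
void register_close_hook(Display *dpy, Module *owner)
{
    if (dpy->hook_count < MAX_HOOKS)
        dpy->hooks[dpy->hook_count++].owner = owner;
}

/* Stands in for dlclose: after this, the module's code is gone. */
void unload_module(Module *m)
{
    m->loaded = 0;
}

/* Stands in for closing the display. Returns the number of hooks whose
 * owning module was already unloaded. In a real process each of those
 * would be a call into unmapped memory instead of a counted event. */
int close_display(Display *dpy)
{
    int stale = 0;
    for (size_t i = 0; i < dpy->hook_count; i++)
        if (!dpy->hooks[i].owner->loaded)
            stale++;
    dpy->hook_count = 0;
    return stale;
}

/* Teardown order that would match the crash: unload first, then close. */
int teardown_unload_first(void)
{
    Module driver = { 1 };
    Display dpy = { 0 };
    register_close_hook(&dpy, &driver);
    unload_module(&driver);
    return close_display(&dpy);
}

/* Teardown order that avoids it: close first, then unload. */
int teardown_close_first(void)
{
    Module driver = { 1 };
    Display dpy = { 0 };
    register_close_hook(&dpy, &driver);
    int stale = close_display(&dpy);
    unload_module(&driver);
    return stale;
}
```

In this model, teardown_unload_first() reports one stale hook and teardown_close_first() reports none. Skipping the unload entirely, as the commented-out _glfw_dlclose does, also leaves no stale hooks.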

Edit: Just tested and confirmed the segfault also happens with latest master (my previous tests were with the 3.2 release).

Edit 2: It didn’t take long to do a full bisect; I can confirm that the segfault was introduced in https://github.com/glfw/glfw/commit/ef80beab812ee3282493e87f2d7ad8cf10cb8a95

Since this looks like a GLFW bug after all (or at least something external that can be worked around in GLFW), should I add an issue on GitHub?

Please do. I think it is related to this and this, but this is sufficiently different from both.

This might be (mostly, implicitly) fixed in master.

I’ll test it again over the weekend. Delayed submitting an issue because I’m still convinced the root cause of the problem was me messing up some package selections… several problems started manifesting at the same time on that PC which do not occur on my other Debian machine.

Confirmed the segfault is gone as of d5e00e6, although glfwinfo now gives the following at the end of its output:

Error: Vulkan: Loader not found
Vulkan loader: missing
Error: Vulkan: Loader not found

The doubled error message is a little inelegant. Will fix that.

Edit: Neither error message should be there.

Do you have a Vulkan loader on that system?

I haven’t deliberately installed anything Vulkan-related on this machine.

When building glfw, CMake reports this:

-- Could NOT find Vulkan (missing: VULKAN_LIBRARY VULKAN_INCLUDE_DIR)

and the generated CMakeCache.txt has these lines:

//Path to a file.
VULKAN_INCLUDE_DIR:PATH=VULKAN_INCLUDE_DIR-NOTFOUND

//Path to a library.
VULKAN_LIBRARY:FILEPATH=VULKAN_LIBRARY-NOTFOUND

Right, thank you! Will fix this after I’m done with the current bug.

Question: should the GLFW library version string contain “EGL” if I haven’t enabled EGL in my CMakeCache.txt?

You’re welcome! Sorry I can’t contribute a fix myself… I don’t have experience writing this kind of code.

Since version 3.2, EGL is a run-time option and not a compile-time one.
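If EGL is chosen at run time, the "EGL" token in the version string presumably describes what the library was built to support, not what a given context actually uses. A small sketch of checking for such a token follows. The helper name version_has_token is made up for illustration and is not part of GLFW; only glfwGetVersionString, mentioned in the usage note, is a real GLFW function.

```c
#include <string.h>

/* Hypothetical helper, not part of GLFW.
 * Returns 1 if `token` appears as a whole space-separated word in `s`,
 * and 0 otherwise (including when `token` is empty). */
int version_has_token(const char *s, const char *token)
{
    size_t n = strlen(token);
    if (n == 0)
        return 0;

    while (*s) {
        /* Skip separators, then find the end of the current word. */
        while (*s == ' ')
            s++;
        const char *end = s;
        while (*end && *end != ' ')
            end++;

        /* Compare whole words only, so "GL" does not match "GLX". */
        if ((size_t)(end - s) == n && strncmp(s, token, n) == 0)
            return 1;
        s = end;
    }
    return 0;
}
```

With the string from the glfwinfo output above, version_has_token(str, "EGL") returns 1. In a real program the string would come from glfwGetVersionString().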

There is still a CMake option GLFW_USE_EGL:BOOL; is that no longer used?

That looks like a line from CMakeCache.txt. It’s probably still there because it’s been around since before the option was removed from CMakeLists.txt.