glfwSwapBuffers vs. glXSwapBuffers

I am trying to implement multiple full-screen windows, each on its own X screen.
As far as I understand, this is not currently possible with GLFW because it can only open one X Display.
My application is extremely sensitive to latency, so I have to run in full-screen mode to avoid compositing.

The code I have using GLFW on a single monitor (full-screen window) works great, so I am trying to use a GLX/Xlib command sequence similar to what GLFW implements, hoping to overcome the single X screen limitation while keeping the good performance.

I have the GLX code running and displaying an image; however, the latency is much higher than what I get using GLFW.

The main difference I noticed is that glfwSwapBuffers doesn't block, while my call to glXSwapBuffers does.
Any idea what might be causing my glXSwapBuffers call to block?
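
To quantify the difference, a small timing helper along these lines can be wrapped around either swap call. This is a minimal sketch, not my actual code: `time_call_ms`, `looks_blocking`, and `swap_placeholder` are hypothetical names, and the half-a-frame threshold is an arbitrary assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict C modes */
#endif
#include <time.h>

/* Signature of the call being timed; `arg` is passed through unchanged. */
typedef void (*timed_fn)(void *arg);

/* Returns how long fn(arg) took, in milliseconds, using a monotonic clock. */
static double time_call_ms(timed_fn fn, void *arg)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec) * 1e3
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Treats a call as "blocking" if it took at least half a refresh period.
 * The 0.5 factor is an arbitrary threshold chosen for illustration. */
static int looks_blocking(double elapsed_ms, double refresh_hz)
{
    double frame_ms = 1000.0 / refresh_hz;
    return elapsed_ms >= 0.5 * frame_ms;
}

/* Hypothetical stand-in for the real swap call; it does nothing here. */
static void swap_placeholder(void *arg)
{
    (void)arg;
}
```

In the real programs, `swap_placeholder` would be replaced by a thin wrapper that calls glfwSwapBuffers(window) or glXSwapBuffers(display, drawable). A measured time close to a full refresh period (about 16.7 ms at 60 Hz) would suggest the call is waiting for the vertical retrace.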

The rendering code is identical between the GLX and GLFW implementations, apart from the MakeCurrent and SwapBuffers calls.
The window setup code for glfw is basically glfwInit → glfwCreateWindow (without a monitor) → glfwSetWindowMonitor (with monitor).