Windows glfwSwapBuffers locking from 60 to 30 FPS randomly, but Linux is fine


#1

I have written a basic render loop following principles from the “Fix Your Timestep!” article. In a basic form: it polls for input, updates all object positions as many times as required, issues GL commands then swaps the buffers. Pretty standard, nothing too special.
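For readers unfamiliar with the pattern, here is a minimal sketch of the accumulator logic that article describes. This is not the poster's code: the class name `FixedTimestep`, the 250 ms clamp, and the `poll_input`/`update`/`render`/`swap_buffers` names in the comment are all hypothetical. Integer nanoseconds are used so the step count is exact.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using std::chrono::nanoseconds;

// Hypothetical helper: accumulates real frame time and reports how many
// fixed-size simulation steps should run before the next render.
class FixedTimestep {
public:
    explicit FixedTimestep(nanoseconds step,
                           nanoseconds max_frame = std::chrono::milliseconds(250))
        : step_(step), max_frame_(max_frame) {
        assert(step.count() > 0);
    }

    // Feed in the duration of the last frame; returns the number of updates due.
    int advance(nanoseconds frame_time) {
        // Clamp very long frames so a stall does not trigger a burst of updates.
        accumulator_ += std::min(frame_time, max_frame_);
        int steps = 0;
        while (accumulator_ >= step_) {
            accumulator_ -= step_;
            ++steps;
        }
        return steps;
    }

    // Fraction of a step left in the accumulator, usable for render interpolation.
    double alpha() const {
        return static_cast<double>(accumulator_.count()) /
               static_cast<double>(step_.count());
    }

private:
    nanoseconds step_;
    nanoseconds max_frame_;
    nanoseconds accumulator_{0};
};

// Inside a render loop this might be used roughly as (all names hypothetical):
//   poll_input();
//   for (int i = ts.advance(last_frame_time); i > 0; --i) update(step);
//   render(ts.alpha());
//   swap_buffers();   // e.g. glfwSwapBuffers(window)
```

Note that a swap taking two refresh intervals simply produces two updates on the next frame, so the simulation keeps pace even though the displayed frame rate halves.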

When rendering a simple square with 6 vertices (with vsync enabled via glfwSwapInterval(1)), it maintains 60 frames per second, then quickly dips down, locking at 30 for a few seconds, before finally going back to 60. This happens randomly, without warning or any apparent cause.

TL;DR: I initially thought that I had a bug in my rendering code, but after ensuring all commands were synced and flushed, I found that glfwSwapBuffers was taking 16.51 ms (~60 FPS) then suddenly decided to take 33.02 ms (~30 FPS) causing the sudden drop in frame rate.

This can’t be a coincidence, can it? Those wait times match the 60 FPS and 30 FPS frame periods almost exactly.
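The arithmetic behind that observation can be sketched as follows. On a 60 Hz display one refresh interval is 1000 / 60 ≈ 16.67 ms, and with vsync a frame can only be presented on a refresh boundary, so a frame that misses one boundary waits for the next and lands near 33.33 ms. The function names are hypothetical, and the 60 Hz refresh rate is an assumption about the poster's monitor.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Length of one refresh interval in milliseconds.
double refresh_period_ms(double refresh_hz) {
    assert(refresh_hz > 0.0);
    return 1000.0 / refresh_hz;
}

// Nearest whole number of refresh intervals that a measured frame time
// corresponds to (at least 1). Rounding absorbs small timer jitter.
long refresh_intervals_spanned(double frame_ms, double refresh_hz) {
    const double intervals = frame_ms / refresh_period_ms(refresh_hz);
    return std::max(1L, std::lround(intervals));
}

// Frame rate that results if every frame spans that many refresh intervals.
double effective_fps(double frame_ms, double refresh_hz) {
    return refresh_hz /
           static_cast<double>(refresh_intervals_spanned(frame_ms, refresh_hz));
}
```

With the figures quoted above, 16.51 ms maps to one interval (60 FPS) and 33.02 ms maps to two intervals (30 FPS), which is what a vsync-locked swap would show when a single refresh is missed per frame.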

It works fine on Linux and maintains a constant 60 FPS, where glfwSwapBuffers always blocks for a similar amount of time. On Windows, it is very hard to reproduce because it happens so sporadically and unpredictably, locking to 30 FPS for a few seconds at a time.

I have tried on friends’ computers, updated my drivers and completely rewritten my rendering code, but swapping the buffers always seems to bottleneck it at random.

Is this the driver limiting the FPS from 60 to 30 randomly for a few seconds (and why)? Is this an issue with Windows sleep functions/internals of GLFW? It seems vaguely similar to https://github.com/glfw/glfw/issues/603.

Here is what the profiling log looks like (notice that the time it takes for the swap randomly doubles, then returns to normal):

[2017-12-13 01:42:12.645] [engine] [info] FPS: 60
[2017-12-13 01:42:12.646] [engine] [info] Render: 1.0003e+006 ns
[2017-12-13 01:42:12.646] [engine] [info] Swap: 1.50109e+007 ns
[2017-12-13 01:42:14.647] [engine] [info] FPS: 60
[2017-12-13 01:42:14.648] [engine] [info] Render: 1.0006e+006 ns
[2017-12-13 01:42:14.648] [engine] [info] Swap: 1.50102e+007 ns
[2017-12-13 01:42:15.665] [engine] [info] FPS: 42
[2017-12-13 01:42:15.665] [engine] [info] Render: 1.0011e+006 ns
[2017-12-13 01:42:15.666] [engine] [info] Swap: 3.30234e+007 ns
[2017-12-13 01:42:16.665] [engine] [info] FPS: 53
[2017-12-13 01:42:16.666] [engine] [info] Render: 2.0011e+006 ns
[2017-12-13 01:42:16.666] [engine] [info] Swap: 1.50108e+007 ns
[2017-12-13 01:42:17.666] [engine] [info] FPS: 60
[2017-12-13 01:42:17.667] [engine] [info] Render: 1.0002e+006 ns
[2017-12-13 01:42:17.667] [engine] [info] Swap: 1.50106e+007 ns
[2017-12-13 01:42:18.667] [engine] [info] FPS: 60
[2017-12-13 01:42:18.668] [engine] [info] Render: 2.0009e+006 ns
[2017-12-13 01:42:18.668] [engine] [info] Swap: 1.40099e+007 ns
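For anyone wanting to collect similar numbers, here is a minimal sketch of timing a single call in nanoseconds with a monotonic clock. This is not the poster's profiling code; `time_call` is a hypothetical helper, and the usage in the comment assumes a current GL context and a valid `window`.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed time.
// steady_clock is monotonic, so the result is never affected by wall-clock
// adjustments.
template <typename Work>
std::chrono::nanoseconds time_call(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

// In a render loop this might be used roughly as:
//   auto render_ns = time_call([&] { draw_scene(); glFinish(); });
//   auto swap_ns   = time_call([&] { glfwSwapBuffers(window); });
// GL commands are queued asynchronously, so without a sync such as glFinish()
// the "render" figure only measures command submission, and the remaining GPU
// work shows up inside the swap time instead. draw_scene() is a placeholder.
```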

I’m extremely confused about how to fix this, so any help would be appreciated!


#2

It might help to know what type of GPU you are using. AMD/Nvidia/Intel? Embedded?

Are you calling any kind of sleep function anywhere in your code? The default thread scheduling interval on Windows is about 15.6 ms IIRC, which kills any attempt to lock to 60 Hz with sleeps. You might want to play with timeBeginPeriod() (https://msdn.microsoft.com/en-us/library/windows/desktop/dd757633(v=vs.85).aspx) to see if that makes any difference.
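The timeBeginPeriod() suggestion could be tried with something like the sketch below. The wrapper class `ScopedTimerResolution` is hypothetical. timeBeginPeriod and timeEndPeriod are the real Win32 calls (link against winmm.lib), and each successful timeBeginPeriod should be paired with a timeEndPeriod using the same value. On other platforms the wrapper does nothing. It only matters for code that sleeps or waits on timers.

```cpp
#include <cassert>

#ifdef _WIN32
#  include <windows.h>
#  include <mmsystem.h>  // timeBeginPeriod / timeEndPeriod, link with winmm.lib
#endif

// Hypothetical RAII wrapper: requests a finer system timer resolution for the
// lifetime of the object and restores it on destruction.
class ScopedTimerResolution {
public:
    explicit ScopedTimerResolution(unsigned period_ms) : period_ms_(period_ms) {
        assert(period_ms > 0);
#ifdef _WIN32
        // TIMERR_NOERROR means the requested resolution was accepted.
        active_ = (timeBeginPeriod(period_ms_) == TIMERR_NOERROR);
#endif
    }

    ~ScopedTimerResolution() {
#ifdef _WIN32
        if (active_) timeEndPeriod(period_ms_);
#endif
    }

    ScopedTimerResolution(const ScopedTimerResolution&) = delete;
    ScopedTimerResolution& operator=(const ScopedTimerResolution&) = delete;

    bool active() const { return active_; }        // always false off Windows
    unsigned period_ms() const { return period_ms_; }

private:
    unsigned period_ms_;
    bool active_ = false;
};

// Possible usage: create one instance at the top of main(), before the loop:
//   ScopedTimerResolution timer_resolution(1);  // request 1 ms
```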

Edit: Ha! I see that I gave almost exactly the same advice on the issue you linked, which was never resolved.


#3

Does this happen only when the window is not fullscreen and the swap interval is set to a non-zero value?

If so there is a chance it could be either DWM composition (if using an older version of GLFW) or the swapBuffersWGL implementation in wgl_context.c which uses DwmFlush to get around DWM composition issues.

It is also possible that you’re just seeing other applications using the GPU, such as browsers and the desktop window manager, which are big, spiky users of the GPU. Alternatively, the GPU may be throttling down when not in heavy use and not throttling back up fast enough because of bubbles in how your rendering is handled.

Sadly this is a complex issue. Knowing the GLFW version, GPU and driver, OS version etc. may help identify the cause, as would knowing whether the issue appears in fullscreen or windowed mode. If you are able to put a breakpoint in swapBuffersWGL when windowed, check whether isCompositionEnabled returns false.


#4

No sleep functions, but it is non-fullscreen and the swap interval was set to 1. This occurred on any Nvidia GPU with the Windows driver. After testing I found that the AMD drivers were not affected by it and, even stranger, if I forced Adaptive VSync, the problem went away with a solid 60 FPS.
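Adaptive VSync was forced at the driver level here. For anyone who wants to request the same behaviour from the application instead, GLFW documents that a negative swap interval is accepted when the context supports WGL_EXT_swap_control_tear or GLX_EXT_swap_control_tear. A minimal sketch of that decision follows. The `VsyncMode` enum and `swap_interval_for` function are hypothetical, and whether this avoids the 30 FPS lock on a given driver is untested.

```cpp
#include <cassert>

// Hypothetical setting an application might expose.
enum class VsyncMode { Off, On, Adaptive };

// Picks the value to pass to glfwSwapInterval.
//  0 : no vsync
//  1 : wait for one refresh per swap
// -1 : adaptive, i.e. sync when on time but swap immediately when a frame is
//      late; only valid if the swap_control_tear extension is present, so
//      fall back to plain vsync otherwise.
int swap_interval_for(VsyncMode mode, bool tear_control_supported) {
    switch (mode) {
        case VsyncMode::Off:      return 0;
        case VsyncMode::On:       return 1;
        case VsyncMode::Adaptive: return tear_control_supported ? -1 : 1;
    }
    assert(false && "unhandled VsyncMode");
    return 1;
}

// With a current GLFW context this might be wired up roughly as:
//   bool tear = glfwExtensionSupported("WGL_EXT_swap_control_tear") ||
//               glfwExtensionSupported("GLX_EXT_swap_control_tear");
//   glfwSwapInterval(swap_interval_for(VsyncMode::Adaptive, tear));
```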

The issue is now fixed after following some of the suggestions, but I’m not sure which one caused the fix.

For anyone else stumbling upon a similar problem, here’s what I did:

  • Clean re-install Nvidia driver from scratch (resetting all settings)
  • Pull and re-build GLFW from the very latest commit removing any old traces
  • Re-do any buffer hints (e.g. change GL_STATIC_DRAW to GL_STREAM_DRAW if applicable)
  • Fresh restart with no running programs (disabled Synergy, Skype, etc) and try again

I have a feeling it was either my GLFW version being outdated (as I am using a very old Docker image) or one of my other applications was randomly hogging GPU time for various reasons, causing the rendering time to suddenly take longer, then return to normal.

That’s as detailed a diagnosis as I can give, but I hope it helps anyone else with a similar issue.


#5

I have/had exactly the same issue. I made quite a few attempts to fix the frame drops, but nothing helped. My game engine had this issue even with barely anything on the screen; having said that, there is a slight overhead from the engine itself. To be honest, the Windows GLFW version is just a testing ground for other platforms (consoles/iOS) where I use native mechanics, but it is one of the reasons why I never got around to releasing anything on Windows using this particular engine.

Fixes like “reinstall the drivers” are usually off the table, since no one expects games to behave like that on “stock” drivers anyway.

Machines used are actually pretty old (describing from memory):
laptop: Dell XPS 17 (l702x) - cpu: older i7, gpu: nvidia 550m & ~intel hd 3000
laptop: macbook pro 2013 - cpu: older i7, gpu: ~intel hd 3000
desktop: cpu: i7 8700k, gpu: intel UHD 630

Now, slightly off topic: I’m using OpenGL, not Vulkan. Just being honest with myself here: I don’t think I’d be able to use Vulkan’s power to squeeze out more speed. Even with OpenGL there are tons of things that would speed up the engine, like proper multi-threading and DOD (data-oriented design), which is a job for months if not years. With every version GLFW becomes more of an SDL equivalent than a lightweight window creation library, and having two graphics APIs probably introduces some noise or even overhead. With that in mind, I treat GLFW more as a good (actually great) base for a multi-platform engine, which could be replaced by a native implementation in a production environment anyway (unless everything you had in mind just works).


#6

GLFW remains a lightweight, simple API for creating windows, contexts and surfaces, and for receiving input and events. The support for Vulkan adds only 5 functions to the API. Most of the complexity of GLFW is in the multitude of platforms it handles.

Rendering performance can be complicated, and the usual culprit is the application code rather than GLFW, since GLFW handles very little of the per-frame code. Whilst glfwSwapBuffers might appear to ‘hang’, this is down to the driver/OS implementation rather than GLFW itself.

Although @Crunkle mentions that they re-installed their driver, they also mention that they don’t know if this was the fix.