Most frames are not drawn on Intel graphics and Windows 10

Hi everyone,
I am running into some extremely bizarre behavior in my app, which I later reproduced with a single triangle Hello World. The triangle is supposed to smoothly move in a circle at a steady speed, as per this tutorial: https://paroj.github.io/gltut/Positioning/Tutorial%2003.html

What happens instead is that only about one frame in 30 is actually shown on the screen. I verified that glfwSwapBuffers is actually called 60 times per second (or far more often with vsync off); it’s just that most of these frames never reach the screen. The problem mysteriously disappears when I do any of the following:

  • Running glFinish in the main loop,
  • Running glGetError in the main loop,
  • Running the app through RenderDoc (no OpenGL errors reported),
  • Using OBS to try capturing the problem.

It seems like I am running into the same problem someone else had, but I would like to find a cleaner solution that doesn’t force a CPU/GPU sync, which would hurt performance on non-Intel systems too:

Not willing to simply “bump” an old issue, I continued trying out various setups to gather more information… This is where things get really weird.

I enabled fullscreen. Here a similar thing happens, with one small difference: between the “jumps” of the triangle, there is pixel crawl visible on the edges. Since capture software makes the problem go away, I had to record a video with my phone. You should still be able to see it when I zoom in on the screen in the second half.

The triangle is supposed to be moving much faster than this (a full circle around the screen in 5 seconds), so with fullscreen the app actually generates frames it wouldn’t otherwise… Which implies that glfwGetTime reports wrong values. Indeed, a quick per-frame debug output confirms exactly this.

The first batch of output is in windowed mode: evenly spaced, if a little stuttery. The second batch is fullscreen, where you can see differences of about a millisecond or less followed by huge gaps of over half a second. Both are with swap interval 1.

11:27:35 [DEBUG] Time: 3.798262
11:27:35 [DEBUG] Time: 3.814984
11:27:35 [DEBUG] Time: 3.835489
11:27:35 [DEBUG] Time: 3.848199
11:27:35 [DEBUG] Time: 3.865583
11:27:35 [DEBUG] Time: 3.881927
11:27:35 [DEBUG] Time: 3.898950
11:27:35 [DEBUG] Time: 3.915315
11:27:35 [DEBUG] Time: 3.931963
11:27:35 [DEBUG] Time: 3.948715
11:27:35 [DEBUG] Time: 3.965419
11:27:35 [DEBUG] Time: 3.981549
11:27:35 [DEBUG] Time: 3.998368
11:27:35 [DEBUG] Time: 4.015092
11:27:35 [DEBUG] Time: 4.031708
11:27:35 [DEBUG] Time: 4.048560
11:27:35 [DEBUG] Time: 4.064962
11:27:35 [DEBUG] Time: 4.082015
11:27:35 [DEBUG] Time: 4.098696
11:27:35 [DEBUG] Time: 4.115313
11:27:35 [DEBUG] Time: 4.131848
11:27:35 [DEBUG] Time: 4.148595
11:27:35 [DEBUG] Time: 4.165345
11:27:35 [DEBUG] Time: 4.181943
11:27:35 [DEBUG] Time: 4.198230
11:27:35 [DEBUG] Time: 4.214815
11:27:35 [DEBUG] Time: 4.231602
11:27:35 [DEBUG] Time: 4.248223
11:27:35 [DEBUG] Time: 4.264636
11:27:35 [DEBUG] Time: 4.281548
11:27:35 [DEBUG] Time: 4.299057
11:27:35 [DEBUG] Time: 4.314845
11:27:35 [DEBUG] Time: 4.331880
11:27:35 [DEBUG] Time: 4.348516
11:27:35 [DEBUG] Time: 4.467110
--------------------------------------------------
11:25:19 [DEBUG] Time: 2.129351
11:25:19 [DEBUG] Time: 2.130135
11:25:19 [DEBUG] Time: 2.130606
11:25:19 [DEBUG] Time: 2.131112
11:25:19 [DEBUG] Time: 2.131585
11:25:19 [DEBUG] Time: 2.132102
11:25:19 [DEBUG] Time: 2.132578
11:25:19 [DEBUG] Time: 2.133448
11:25:19 [DEBUG] Time: 2.133899
11:25:20 [DEBUG] Time: 2.709859
11:25:20 [DEBUG] Time: 2.710703
11:25:20 [DEBUG] Time: 2.711527
11:25:20 [DEBUG] Time: 2.712742
11:25:20 [DEBUG] Time: 2.713270
11:25:20 [DEBUG] Time: 2.714037
11:25:20 [DEBUG] Time: 2.714670
11:25:20 [DEBUG] Time: 2.715125
11:25:20 [DEBUG] Time: 2.715613
11:25:20 [DEBUG] Time: 2.716782
11:25:20 [DEBUG] Time: 2.717389
11:25:20 [DEBUG] Time: 2.717882
11:25:20 [DEBUG] Time: 2.718700
11:25:20 [DEBUG] Time: 2.719321
11:25:20 [DEBUG] Time: 2.719824
11:25:20 [DEBUG] Time: 2.720364
11:25:20 [DEBUG] Time: 2.720868
11:25:20 [DEBUG] Time: 2.721474
11:25:20 [DEBUG] Time: 2.721945
11:25:20 [DEBUG] Time: 2.722607
11:25:20 [DEBUG] Time: 2.723179
11:25:20 [DEBUG] Time: 2.723714
11:25:20 [DEBUG] Time: 2.724147
11:25:20 [DEBUG] Time: 2.725027
11:25:20 [DEBUG] Time: 2.725443
11:25:20 [DEBUG] Time: 2.725906
11:25:20 [DEBUG] Time: 2.726509
11:25:20 [DEBUG] Time: 2.726998
11:25:20 [DEBUG] Time: 2.727455
11:25:20 [DEBUG] Time: 2.727917
11:25:20 [DEBUG] Time: 2.728341
11:25:20 [DEBUG] Time: 2.728854
11:25:20 [DEBUG] Time: 2.729356
11:25:20 [DEBUG] Time: 2.729860
11:25:20 [DEBUG] Time: 2.730370
11:25:20 [DEBUG] Time: 3.309454
11:25:20 [DEBUG] Time: 3.310663
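
The pattern in the fullscreen batch can also be checked mechanically instead of by eye. The following is a minimal sketch, assuming the timestamps have been collected into an array; `FrameStats` and `frame_stats` are names invented for this example, and the stall threshold is whatever the caller considers a suspicious gap.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of the spacing between consecutive timestamps, in seconds. */
typedef struct {
    double min_delta;
    double max_delta;
    size_t stalls; /* number of deltas longer than the threshold */
} FrameStats;

/* times: timestamps in seconds, e.g. one glfwGetTime() value per frame. */
FrameStats frame_stats(const double *times, size_t count, double stall_threshold)
{
    FrameStats s = { 0.0, 0.0, 0 };

    assert(times != NULL || count == 0);

    for (size_t i = 1; i < count; ++i) {
        double delta = times[i] - times[i - 1];
        if (i == 1 || delta < s.min_delta) s.min_delta = delta;
        if (i == 1 || delta > s.max_delta) s.max_delta = delta;
        if (delta > stall_threshold) ++s.stalls;
    }
    return s;
}
```

Fed with the five fullscreen samples from 2.132578 to 2.710703 and a threshold of 0.1 s, this reports one stall of roughly 0.576 s between frames that are otherwise less than a millisecond apart.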

If at first this was mysterious, now it’s completely nonsensical. Even the clock is wrong. Please send help.

My environment, for reference:
C99, CMake, MinGW-w64 from MSYS2, GLFW 3.3 from MSYS2
Windows 10 build 1809, Intel i5-6300U, Intel HD Graphics 520

It does sound like you might have a similar issue to the one in the linked post. Both of you are using an Intel GPU (you an Intel HD Graphics 520 and @KornelH an HD Graphics 620), so I suspect this could be an OpenGL driver issue.

One thing you could try, which is a good idea anyway, is to use glFenceSync and glClientWaitSync to limit the number of frames of latency. For example, every frame you insert a fence and store the returned sync object in a circular buffer, then N frames later you wait on that sync object. Setting N = 3 would be a decent default.
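
The circular buffer of fences could look roughly like the sketch below. This is only an outline under some assumptions: `FenceRing`, `fence_ring_init` and `fence_ring_push` are names made up for this example, and the handle is stored as a plain `void *` (GLsync is itself a pointer type) so that the bookkeeping compiles without any GL headers. The actual GL calls stay in the caller and are shown in the comment.

```c
#include <assert.h>
#include <stddef.h>

#define FRAMES_IN_FLIGHT 3 /* N from the suggestion above */

typedef void *FenceHandle; /* stands in for GLsync */

typedef struct {
    FenceHandle slots[FRAMES_IN_FLIGHT];
    size_t next; /* slot that the current frame will reuse */
} FenceRing;

void fence_ring_init(FenceRing *ring)
{
    assert(ring != NULL);
    for (size_t i = 0; i < FRAMES_IN_FLIGHT; ++i)
        ring->slots[i] = NULL;
    ring->next = 0;
}

/* Stores this frame's fence and returns the fence inserted
   FRAMES_IN_FLIGHT frames earlier (NULL during the first frames).
   The caller waits on the returned fence and then deletes it. */
FenceHandle fence_ring_push(FenceRing *ring, FenceHandle new_fence)
{
    assert(ring != NULL);
    FenceHandle oldest = ring->slots[ring->next];
    ring->slots[ring->next] = new_fence;
    ring->next = (ring->next + 1) % FRAMES_IN_FLIGHT;
    return oldest;
}

/* Per frame, with a current GL context (not compiled here;
   timeout_ns is a hypothetical timeout in nanoseconds):

     GLsync old = fence_ring_push(&ring,
                      glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0));
     if (old) {
         glClientWaitSync(old, GL_SYNC_FLUSH_COMMANDS_BIT, timeout_ns);
         glDeleteSync(old);
     }
     glfwSwapBuffers(window);
*/
```

With this arrangement the CPU can run at most FRAMES_IN_FLIGHT frames ahead of the GPU; lowering the constant to 1 tightens that to a single frame.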

I don’t think this has anything to do with your issue, but do check that you are using glfwPollEvents() rather than glfwWaitEvents(), as the latter will block execution until it receives an event.

Thank you for the response. I was not familiar with glFenceSync but the idea behind it seems rather straightforward. I tested what I believe should be a trivial same-frame synchronization equivalent to glFinish:

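GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
// The timeout is in nanoseconds (MSEC is assumed to be 1000000); how does the timeout affect it?
glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, MSEC * 100);
glDeleteSync(fence); // release the sync object so one is not leaked every frame
glfwSwapBuffers(window);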

And, predictably, this does solve the issue. I will implement the buffer and see how waiting 1 frame or more affects the behavior.