FPS variance is too high. Asking for help


photo


Hi, I have run into a serious problem with FPS. The average fps is about 220 for a virtual studio indoor scene, but the minimum fps can drop as low as 110.

Then I tried the simple "player" sample project (in the API/nodes/player directory; the scene contains only four boxes), and the variance is still very large.

The Unigine version is 2.6 release-double x64, compiled with VS2017.

I used RenderDoc to profile the player project; the GPU time seems normal and consistent between frames.

Then I profiled with the latest NVIDIA Nsight, and it seems there are some random time gaps between composite_deferred.frag and srgb_correction.frag. I checked the DX11 commands in between and found nothing wrong. Another problem is that deferred_light.frag takes a long time. I set render_lights_tile_grid_size to 2, and there are 8 draw() calls in the function RenderLights::renderDeferredLights.

I would appreciate any help and advice. Thanks a lot.

 

fps.png

deferred_light_frag_take_long_time.png

gap_between_composite_deferred and srgb_correction.png

Link to post

Hi morbid,

Thanks for your reply.

I modified the timing code in Unigine, because 1) the present time in 2.6 is actually not accurate, and 2) it outputs average statistics instead of two consecutive frames (one good, the next bad).

The attachment contains the Microprofile log. "bad" means the fps dropped below 80% of the average fps; "good" means the fps is close to the average fps.
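For clarity, the good/bad labeling used in the log can be sketched like this (classify_frame is a hypothetical helper of my own, not Unigine code):

```cpp
#include <string>

// Hypothetical helper (not Unigine code): label a frame the way the log
// does -- "bad" when its fps drops below 80% of the average fps.
std::string classify_frame(double frame_fps, double average_fps)
{
    return (frame_fps < 0.8 * average_fps) ? "bad" : "good";
}
```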

I account the frame time in the following functions:

void AppWindow::doUpdate()
{
    // update fps counter
    updateFps(); // sets current time to app->frame_time

    beginPreUpdate();

    ... // handle Windows messages; this section is the "Preupdate" period in the log

    endPreUpdate();

    update();
}

 

void Engine::do_update()
{
    // update statistics
    frame_begin = Timer::getTime();
    present_time = (float)(frame_begin - frame_end) / Timer::CLOCKS_PER_SECOND;
    // fps counter
    engine.fps = engine.app->getFps();
    engine.ifps = engine.app->getIFps();
    engine.ftime = engine.app->getFTime();

    stopFPS();

    ... // profile and statistics section

    startFPS();

    ...

    // update time
    long long update_begin = Timer::getTime();

    // world manager
    {
        MICROPROFILE_SCOPEI("Engine", "world manager update", 0xffffffff);
        engine.worlds->update();
    }

    ...

    // render plugins
    {
        MICROPROFILE_SCOPEI("Engine", "plugins render", 0xffffffff);
        engine.plugins->render();
    }

    long long update_end = Timer::getTime();
    update_time = (float)(update_end - update_begin) / Timer::CLOCKS_PER_SECOND; // this section is the "update" period in the log

    ...
}

 

void Engine::do_render()
{
    long long render_begin = Timer::getTime();

    ...

    engine.sound->renderWorld(0);
    engine.render->renderWorld();

    if (first_frame)
    {
        engine.render->setFirstFrame(0);
        engine.app->startFps();
    }

    long long render_end = Timer::getTime();
    render_time = (float)(render_end - render_begin) / Timer::CLOCKS_PER_SECOND; // this section is the "render" period in the log

    // post render begin

    ...

    // post render end // this section is the "post render" period in the log
} // end of do_render()

void Engine::do_swap()
{
    // swap begin

    ...

    // swap end // this section is the "swap" period in the log
}

 

The time between the end of the last swap and the beginning of the next update is the "waitGPU" period in the log.
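To make the bookkeeping concrete, here is a minimal sketch (field names are my own, not the engine's) of how the logged periods should add up to the full frame time:

```cpp
// Sketch with made-up field names: the per-frame periods in the log
// (Preupdate, update, render, post render, swap, waitGPU) should sum
// to the total frame time.
struct FramePeriods
{
    double preupdate, update, render, post_render, swap, wait_gpu; // seconds

    double total() const
    {
        return preupdate + update + render + post_render + swap + wait_gpu;
    }
};
```

A frame whose waitGPU share balloons while the other periods stay flat points at the Present() call rather than the CPU-side work.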

 

4736-bad

5187-good

5188-bad

4735-good

Link to post

I found that sometimes the waitGPU time is large. In fact, it is the following line in the function "int D3D11RenderContext::swapWindow()" that takes a long time:

auto result = swap_chain->Present(parent_app->getVSync(), DXGI_SWAP_EFFECT_DISCARD);

So I guess it is some GPU lag. After carefully comparing frames in NVIDIA Nsight, I found some random gaps between the fragment shaders, although the DX11 commands in between seem OK. And the fragment shaders take almost the same time each frame, except deferred_light.frag (this shader sometimes takes longer). Then I used RenderDoc to trace 10 consecutive frames, and it turns out the times of the different frames are almost equal. It seems to me that the GPU code is fine.

Then I profiled with Intel VTune. I guessed that maybe WaitForSingleObject in cpuext.cpp consumes some time during thread switches, so I tried to work around the Windows message section and WaitForSingleObject (I chose not to use threads). But the FPS variance is still there.

 

I don't know what is causing the FPS variance.

Any suggestion is really appreciated. Thanks a lot.

 

Link to post

P.S.

I tried the release build with all the profiling-related code commented out; the FPS still varies.

I also tried clearing the render targets and depth buffer instead of releasing them; that does not help either.

I disabled vsync in the NVIDIA control panel and in Unigine, and also turned off most effects. (Attached is the configuration for the sample project "Players".)

unigine.cfg

Link to post

zhengyu

Some applications (for example, web browsers like Google Chrome) can affect GPU behavior even when minimized and therefore cause spikes.

I've tried running the Players demo, and with an average frame rate of 300 I'm getting a lowest fps of about 250 (I'm using show_fps 2 + show_profiler 2, fullscreen, DX11 API, Release build). Is that what you're observing on your PC? If possible, can you try running in a clean environment (where only the GPU drivers and the engine are installed and running) and see whether there's any difference?

Thanks!

 

How to submit a good bug report
---
FTP server for test scenes and user uploads:

Link to post

The environment is:

Win10, VS2017, NVIDIA 1080 Ti (driver version 417.01), high-performance mode in the Windows 10 power settings. The CPU is an Intel Xeon E5-2696 v3.

Link to post

Can you show a screenshot with the profiler enabled (show_profiler 2) so we can see the spikes? I'd also recommend turning off all background processes and using fullscreen mode (video_fullscreen 1).

On this screen I've manually limited the fps to 200 (render_max_fps 200). As you can see, the profiler line is pretty much straight, no spikes or anything:

image.png

Thanks!


Link to post

Thank you very, very much.

I turned off all other apps.

Yes, I can get the same result on my machine after render_max_fps 200. But without limiting the max fps, the low fps can be 2300 (average is 3000, profiler off); with the profiler on, the low fps can be 1600 (average is 1800).

Then I tried another scene made by my colleagues, a more complex virtual studio scene. The fps varies between 90 and 240 (average is 224). After setting render_max_fps to 90, the profiler line looks stable, no spikes.

So is it something about the screen refresh rate? I use two monitors, both 2K at 60 Hz.

Can I assume that if I set render_max_fps close to the lowest fps observed without an fps limit, it will be stable? I mean, how do I choose the right render_max_fps value? Thanks a lot for your help.
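What I have in mind can be sketched like this (suggest_max_fps and margin are hypothetical names of my own, not engine API):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical rule of thumb: pick render_max_fps at (or slightly below)
// the lowest fps observed while running uncapped, so every frame can meet
// the cap and the frame time stays flat.
int suggest_max_fps(const std::vector<double>& observed_fps, double margin = 1.0)
{
    double lowest = *std::min_element(observed_fps.begin(), observed_fps.end());
    return static_cast<int>(lowest * margin);
}
```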

virtualstudio.png

Link to post

Yep, it's all about the minimum framerate. If you don't need a very high refresh rate, the solution can be simply to turn on vsync. If your app's minimum framerate is always above 60, you should get a stable 60 in that case, without tearing or anything.
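The arithmetic behind that can be sketched with a simple double-buffered vsync model (an idealization of my own, not engine code):

```cpp
#include <cmath>

// Idealized double-buffered vsync model: a frame is shown on the first
// vblank after its work finishes, so the effective fps drops in integer
// divisions of the refresh rate whenever a frame misses its vblank.
double effective_vsync_fps(double work_ms, double refresh_hz)
{
    double interval_ms = 1000.0 / refresh_hz; // ~16.67 ms at 60 Hz
    double slots = std::ceil(work_ms / interval_ms);
    if (slots < 1.0)
        slots = 1.0;
    return refresh_hz / slots;
}
```

Under this model, a scene whose slowest frame stays under one refresh interval holds a stable 60 fps at 60 Hz, which is why keeping the minimum framerate above 60 matters more than the average.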

Thanks!


Link to post

Got it. Thanks a lot. Right now we are testing a complex scene at 4K resolution. We want to achieve 60 fps, but the average fps is below 60, so I guess I will try to raise the lowest fps.

One more question: do you prefer testing with one 4K monitor, or with two 2K monitors? I wonder if driving two 2K outputs from the 1080 Ti may lower the fps a bit. Thanks a lot.

 

Link to post

I believe that should not affect FPS much (I normally use 2x 1080p monitors). For 4K you need to tune the render settings. If you have a lot of post-processing, lower the post-effects resolution if possible (that would give you a nice perf boost).

You can check our render settings in Superposition (we have 4K and 8K presets there) as an example to get an idea of how to move forward.
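As a back-of-envelope model (my own assumption, not measured engine behavior), post-effect cost scales roughly with the number of shaded pixels, so halving the post-effects resolution cuts their cost to about a quarter:

```cpp
// Rough model: post-processing cost scales with pixel count, so running
// post effects at scale s of full resolution costs about s*s of the
// full-resolution time.
double scaled_post_cost_ms(double full_res_cost_ms, double resolution_scale)
{
    return full_res_cost_ms * resolution_scale * resolution_scale;
}
```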

Thanks!


Link to post

Thanks a lot for your advice.

Yes, I used RenderDoc and found that 40% of the time is consumed by post-processing; I will try to reduce those costs first. Is the Superposition you mentioned a demo among the samples in the SDK browser? Sorry, I could not find an example called Superposition. My version is Unigine 2 Sim src 2.6. I do find demos such as Tank and Oil Platform.

Link to post

Superposition is available in the 2.7.x SDKs, please check. Also, you can use our built-in Microprofile tool to measure performance instead of RenderDoc.

Thanks!


Link to post