zhengyu Posted December 6, 2018

Hi, I found a serious FPS problem. The average fps is about 220 for the inner scene of our virtual studio, but the minimum fps can be as low as 110. Then I tried the simple "player" sample project (in the API/nodes/player directory; the scene contains only four boxes), and the variance is still large. The UNIGINE version is 2.6 release-double x64, compiled with VS2017. I profiled the player project with RenderDoc; the GPU time seems normal and consistent between frames. Then I profiled with the latest NVIDIA Nsight, and it shows some random time gaps between composite_deferred.frag and srgb_correction.frag. I checked the DX11 commands in between and found nothing wrong. Another problem is that deferred_light.frag takes a long time. I set render_lights_tile_grid_size to 2, and there are 8 draw() calls in RenderLights::renderDeferredLights. I'd appreciate any help and advice. Thanks a lot.
morbid Posted December 6, 2018

Hello @zhengyu, may I ask you to attach a dump from our Microprofile? It was available in 2.6; you can refer to this article or video. Please note that the video tutorial is based on the latest SDK version. Thank you.

How to submit a good bug report
---
FTP server for test scenes and user uploads: ftp://files.unigine.com
user: upload
password: 6xYkd6vLYWjpW6SN
zhengyu Posted December 7, 2018 (Author)

Hi morbid,

Thanks for your reply. I modified the timing code in the engine, because 1) the present time in 2.6 is actually not accurate, and 2) it outputs averaged statistics instead of two consecutive frames (one good, the next bad). The Microprofile log is in the attachment. "bad" means the fps drops below 80% of the average FPS; "good" means the fps is close to the average FPS. I start accounting the frame time in:

```cpp
void AppWindow::doUpdate()
{
	// update fps counter
	updateFps();

	// set current time to app->frame_time
	beginPreUpdate();
	... // handle Windows messages; this section is the period "PreUpdate" in the log
	endPreUpdate();

	update();
}

void Engine::do_update()
{
	// update statistics
	frame_begin = Timer::getTime();
	present_time = (float)(frame_begin - frame_end) / Timer::CLOCKS_PER_SECOND;

	// fps counter
	engine.fps = engine.app->getFps();
	engine.ifps = engine.app->getIFps();
	engine.ftime = engine.app->getFTime();

	stopFPS();
	... // profile and statistics section
	startFPS();
	...

	// update time
	long long update_begin = Timer::getTime();

	// world manager
	{
		MICROPROFILE_SCOPEI("Engine", "world manager update", 0xffffffff);
		engine.worlds->update();
	}
	...

	// render plugins
	{
		MICROPROFILE_SCOPEI("Engine", "plugins render", 0xffffffff);
		engine.plugins->render();
	}

	long long update_end = Timer::getTime();
	update_time = (float)(update_end - update_begin) / Timer::CLOCKS_PER_SECOND; // this section is the period "update" in the log
	...
}

void Engine::do_render()
{
	long long render_begin = Timer::getTime();
	...
	engine.sound->renderWorld(0);
	engine.render->renderWorld();
	if (first_frame)
	{
		engine.render->setFirstFrame(0);
		engine.app->startFps();
	}
	long long render_end = Timer::getTime();
	render_time = (float)(render_end - render_begin) / Timer::CLOCKS_PER_SECOND; // this section is the period "render" in the log

	// post render begin
	...
	// post render end
	// this section is the period "post render" in the log
} // end of do_render()

void Engine::do_swap()
{
	// swap begin
	...
	// swap end
	// this section is the period "swap" in the log
}
```

The time between the end of the last swap and the beginning of the next update is the period "waitGPU" in the log.

Attachments: 4736-bad, 5187-good, 5188-bad, 4735-good
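The good/bad classification described above (a frame is "bad" when its fps drops below 80% of the average) can be sketched as a small standalone helper. This is a minimal sketch with hypothetical names, not UNIGINE code; it assumes you have already captured per-frame durations in seconds:

```cpp
#include <cassert>
#include <vector>

// Flag each frame as "bad" (true) when its instantaneous fps falls below
// `threshold` (e.g. 0.8 = 80%) of the average fps over the whole capture.
// frame_times holds per-frame durations in seconds; names are hypothetical.
std::vector<bool> classify_bad_frames(const std::vector<double> &frame_times,
                                      double threshold = 0.8)
{
	double total = 0.0;
	for (double t : frame_times)
		total += t;
	double avg_fps = frame_times.size() / total;

	std::vector<bool> bad;
	bad.reserve(frame_times.size());
	for (double t : frame_times)
		bad.push_back((1.0 / t) < threshold * avg_fps);
	return bad;
}
```

Feeding this the dumped frame times would pick out the same "bad" frames the log labels, without eyeballing the profiler graph.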
zhengyu Posted December 7, 2018 (Author)

I found that the waitGPU time is sometimes large. In fact, it is the following line in D3D11RenderContext::swapWindow() that takes a long time:

auto result = swap_chain->Present(parent_app->getVSync(), DXGI_SWAP_EFFECT_DISCARD);

So I guess it is some GPU lag. After carefully comparing frames in NVIDIA Nsight, I found some random gaps between the fragment shaders, although the DX11 commands in between seem OK. The fragment shaders take almost the same time in every frame, except deferred_light.frag (this shader sometimes takes more time). Then I used RenderDoc to trace 10 consecutive frames, and the times of the different frames turn out to be almost equal. It seems to me that the GPU side is fine. Then I profiled with Intel VTune. I guessed that maybe WaitForSingleObject in cpuext.cpp consumes some time during thread switches, so I tried to work around the Windows message section and WaitForSingleObject (I chose not to use threads). But the FPS variance still exists. I don't know what is wrong with the FPS variance. Any suggestion is really appreciated. Thanks a lot.
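One way to confirm that Present() itself is where the time goes is to time the call directly, the same way the engine's period timers work. A minimal, portable sketch (the lambda below is a stand-in for the real swap_chain->Present call, which is Windows/DX11 only):

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Measure how long a single call blocks, in milliseconds.
// Useful for timing swap/present-style calls that can stall on the GPU.
double time_blocking_call_ms(const std::function<void()> &call)
{
	auto begin = std::chrono::steady_clock::now();
	call();
	auto end = std::chrono::steady_clock::now();
	return std::chrono::duration<double, std::milli>(end - begin).count();
}
```

Logging this value per frame (instead of an average) makes the occasional long Present stand out immediately, matching the "waitGPU" spikes seen in the Microprofile dump.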
zhengyu Posted December 7, 2018 (Author)

P.S. I tried the release build with all profiling-related code commented out; FPS still varies. I also tried clearing the render targets and depth instead of releasing them; that does not work either. I disabled vsync in the NVIDIA control panel and in UNIGINE, and also turned off most effects. (Attached is the config for the "Players" sample project: unigine.cfg)
silent Posted December 7, 2018

zhengyu, some applications (for example, web browsers like Google Chrome) can affect GPU behavior even when minimized and therefore cause spikes. I've tried to run the Players demo, and with an average frame rate of 300 I'm getting a lowest fps of about 250 (using show_fps 2 + show_profiler 2, fullscreen, DX11 API, Release). Is that what you are observing on your PC? If possible, can you try to run in a clean environment (where only the GPU drivers and the Engine are installed and running)? Will there be any difference? Thanks!
zhengyu Posted December 7, 2018 Author Share Posted December 7, 2018 The environment is : win10, vs2017, nvidia 1080ti ( 417.01 driver version), high performance mode in power settings for win 10. cpu is intel xeon e5-2696 v3. Link to comment
silent Posted December 7, 2018

Can you show a screenshot with the profiler enabled (show_profiler 2) so we can see the spikes? I'd also recommend turning off all background processes and using fullscreen mode (video_fullscreen 1). On this screenshot I've manually limited the fps to 200 (render_max_fps 200). As you can see, the profiler line is pretty much straight, with no spikes whatsoever: Thanks!
zhengyu Posted December 7, 2018 (Author)

Thank you very much. I turned off all other apps, and yes, I can get the same result on my machine after render_max_fps 200. But without limiting the max fps, the low fps can be 2300 (average is 3000, profiler off); with the profiler on, the low fps can be 1600 (average is 1800). Then I tried another scene made by my colleagues, a more complex virtual studio scene. The fps varies between 90 and 240 (average is 224). After setting render_max_fps to 90, the profiler line looks stable, with no spikes. So is it something about the screen refresh rate? I use two monitors, both 2K at 60 Hz. Can I assume that if I set render_max_fps close to the low fps value measured without the limit, it will be stable? I mean, how can I choose the right render_max_fps value? Thanks a lot for your help.
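For reference, the idea behind an fps cap like render_max_fps can be sketched as a simple frame limiter: after each frame, sleep out whatever remains of the per-frame time budget. This is a hypothetical standalone sketch, not UNIGINE's implementation:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Minimal frame limiter sketch: caps the loop at target_fps by sleeping
// out the rest of each frame's budget. Names are hypothetical.
class FrameLimiter
{
	using clock = std::chrono::steady_clock;

public:
	explicit FrameLimiter(double target_fps)
		: budget_(1.0 / target_fps), last_(clock::now()) {}

	// Call once per frame, after rendering and swap.
	void wait()
	{
		std::chrono::duration<double> elapsed = clock::now() - last_;
		if (elapsed.count() < budget_)
			std::this_thread::sleep_for(
				std::chrono::duration<double>(budget_ - elapsed.count()));
		last_ = clock::now();
	}

private:
	double budget_; // seconds per frame
	clock::time_point last_;
};
```

Capping at (or slightly below) the observed minimum fps means every frame fits the budget, so frame pacing stays even, which is why the profiler line flattens out after render_max_fps 90.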
silent Posted December 7, 2018

Yep, it's all about the minimal framerate. If you don't need a very high refresh rate, the solution can be simply to turn on VSync. If your app's minimum framerate is always above 60, you should get a stable 60 in that case, without tearing or other artifacts. Thanks!
zhengyu Posted December 7, 2018 (Author)

Got it. Thanks a lot. Right now we are testing a complex scene at 4K resolution. We wish to achieve 60 fps, but the average fps is below 60, so I guess I will try to raise the low fps. One more question: do you prefer testing on one 4K monitor or on two 2K monitors? I wonder if driving two 2K outputs from the 1080 Ti may make the fps a bit lower. Thanks a lot.
silent Posted December 7, 2018

I believe that should not affect FPS much (I normally use two 1080p monitors). For 4K you need to tune the render settings. If you have a lot of post-processing, lower the post-effects resolution if possible (that would give you a nice performance boost). You can check, as an example, our render settings in Superposition (we have 4K and 8K presets there) to get an idea of how to move further. Thanks!
zhengyu Posted December 7, 2018 (Author)

Thanks a lot for your advice. :D Yes, I used RenderDoc and found that 40% of the time is consumed by post-processing. I will try to reduce these costs first. Is the Superposition you mentioned a demo among the samples in the SDK browser? Sorry, I did not find an example called Superposition. My version is UNIGINE 2 Sim (source) 2.6; I found demos such as Tank and Oil Platform.
silent Posted December 7, 2018

Superposition is available in the 2.7.x SDKs, please check. You can also use our built-in Microprofile tool to measure performance instead of RenderDoc. Thanks!