
Lower performance in 2.9



Are there any new settings in 2.9 that can impact performance? In some scenes we get roughly 2x lower framerate than in 2.8. We checked settings such as shadow penumbra, but it is turned off.

Link to comment

demostenes

The easiest way to find out what is happening is to use the Microprofile tool (available in development builds).

You can load the scene, set the camera position and angle (via the camera_set command) and run the microprofile_dump_html command, once with the 2.8 SDK and once with 2.9. After that you can compare the dumps and see what happens in your case.

Thanks!

How to submit a good bug report
---
FTP server for test scenes and user uploads:

Link to comment

It seems that the performance problem is across the board (almost 50 percent down), but we don't see any changes in the render settings. We will prepare these dumps.

Edit: Same place, same camera position: 2.8.0.1 (our project rev 1079) gives about 50 fps; 2.9.0.2 (rev 1093, but it is the same right after migration to 2.9.0.0, rev 1083) gives about 30 fps. Render settings are the same. The location does not matter much; performance is lower in the whole world.
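
To compare these FPS figures with the millisecond timings shown in Microprofile, it can help to convert them to frame time. A minimal Python sketch of that arithmetic follows; the helper names are made up for illustration and are not part of any engine API.

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fps_drop_percent(fps_before: float, fps_after: float) -> float:
    """Relative FPS decrease, in percent of the earlier value."""
    return (fps_before - fps_after) / fps_before * 100.0


if __name__ == "__main__":
    # Approximate FPS reported above for 2.8.0.1 and 2.9.0.2.
    before, after = 50.0, 30.0
    extra = frame_time_ms(after) - frame_time_ms(before)
    print(f"{frame_time_ms(before):.1f} ms -> {frame_time_ms(after):.1f} ms "
          f"(+{extra:.1f} ms per frame, "
          f"{fps_drop_percent(before, after):.0f}% fewer FPS)")
```

With the figures above, 50 fps is 20 ms per frame and 30 fps is about 33.3 ms, so the gap to look for in the dumps is roughly 13 ms per frame.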

microprofile.zip

Edited by demostenes
Link to comment

demostenes

Could you please also reduce the number of frames that are saved to disk (for the 2.9.x release)? The resulting HTML file is too big to render in any web browser. You can do it via the console command microprofile_dump_frames 200.

Also, what kind of CPU is used in this test? How many cores / threads?

Thanks!


Link to comment
In reply to silent:

OK, we will capture it again. If I remember correctly, it was an i7, but it happens on all of our computers (Intel CPUs with at least 4 cores, at least 8 GB RAM, and a 1070 or better, or a comparable Radeon).

Edit: Here it is.

microprofile2.zip

Edit 2: This reply was merged with my previous post. Replies now merge automatically with the last post; I don't know why.

Edited by demostenes
Link to comment

Hi

This is the configuration of the computer on which the dumps were captured:


 

12:50:32 Version: 2.9.0.0 master-da34947 Aug 21 2019
12:50:32 Binary: Windows 64-bit Visual C++ 1900 Release
12:50:32 Engine features: OpenGL OpenGL4.5 Direct3D11 OpenAL XPad360 Joystick HalfTexCoords Microprofile OpenEXR Geodetic
12:50:32 App path: D:/esq-devel/esqgame/bin/
12:50:32 Data path: D:/esq-devel/esqgame/data/
12:50:32 Save path: D:/esq-devel/esqgame/bin/
12:50:32
12:50:32 ---- GPU Detection ----
12:50:32 GPU 0 Active: NVIDIA GeForce GTX 1060 6052 MB
12:50:32 GPU 1        : Microsoft Basic Render Driver 8170 MB
12:50:32
12:50:32 ---- System ----
12:50:32 OS:        Windows 10 (build 17763) 64-bit
12:50:32
12:50:32 CPU:    Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
12:50:32         Frequency: 2807MHz
12:50:32         Extensions: MMX SSE SSE2 SSE3 SSSE3 SSE41 SSE42 AVX AVX2 HTT
12:50:32         Cores:4 Threads:8
12:50:32
12:50:32         System memory: 16340 MB
12:50:32         Sync threads: 7
12:50:32         Async threads: 8
12:50:32
12:50:32 GPU:    NVIDIA GeForce GTX 1060
12:50:32         Video memory: 6052 MB
12:50:32
12:50:32 ---- MathLib ----
12:50:32 Set SSE2 simd processor
12:50:32
12:50:32 ---- Sound ----
12:50:32 Renderer: OpenAL Soft on Reproduktory (Realtek High Definition Audio)
12:50:32 OpenAL vendor: OpenAL Community
12:50:32 OpenAL renderer: OpenAL Soft
12:50:32 OpenAL version: 1.1 ALSOFT 1.19.1
12:50:32 Found AL_EXT_LINEAR_DISTANCE
12:50:32 Found AL_EXT_OFFSET
12:50:32 Found ALC_EXT_EFX
12:50:32 Found ALC_SOFT_HRTF
12:50:32 Found EFX Filter
12:50:32 Found EFX Reverb
12:50:32 Found EAX Reverb
12:50:32 Found QUAD16 format
12:50:32 Found 51CHN16 format
12:50:32 Found 61CHN16 format
12:50:32 Found 71CHN16 format
12:50:32 Device enumeration supported
12:50:32 Maximum sources: 256
12:50:32 Maximum effect slots: 16
12:50:32 Maximum auxiliary sends: 2
12:50:32 Output sampling frequency: 192000hz
12:50:32 HRTF not enabled!
12:50:32
12:50:32 ---- Render ----
12:50:32 Renderer: NVidia 6052MB
12:50:32 Renderer API: Direct3D 11.0
12:50:32 Maximum texture size: 16384
12:50:32 Maximum texture units: 16
12:50:32 ---- Physics ----
12:50:32 Physics: Multi-threaded
12:50:32
12:50:32 ---- PathFind ----
12:50:32 PathFind: Multi-threaded

 

Link to comment

Thanks for the information!

As far as I can see, all logs were captured with the Editor running? I can also see that some additional visualizer is being rendered in the 2.9.0.2 case (3 ms on CPU + 6 ms of additional scene intersections), which gives a total difference of ~10 ms.

Could you please take a quick look and see if the runtime performance is also lower with the 2.9 release? If so, could you please send us two runtime microprofile logs?

It's always better to compare performance without the Editor, since it can heavily affect overall performance.


Link to comment
In reply to silent:

What do you mean by an additional visualizer? We have only basic features turned on, and the settings are the same as in 2.8. According to the microprofiles, almost everything takes longer. For example, world::get intersection is many times slower (intersections were improved in 2.9; maybe that has this impact?), render g buffer terrain global is slower, etc.

We will also try in runtime.

 

 

 

perf.jpg

Edited by demostenes
Link to comment
  • 2 weeks later...

We are not able to find any cause; the render settings are similar to those in 2.8 (we exported and imported them). The performance drop is also present in runtime, where FPS is roughly the same as in the Editor (I would actually expect it to be slightly higher in runtime, without the editor overhead).

 

Link to comment

Hello,

Could you please specify the exact location (camera_get coordinates would be nice) where you're experiencing the 2x performance drop? So far we have only been able to reproduce a slight difference, and we will keep investigating.

Thanks

Link to comment
In reply to vvvaseckiy:

It happens everywhere: above the forest, in the forest (50-70 FPS vs 40-50), in the city (40-50 vs 15-40)... The decrease is across the board. If you want a heavy scene, you can jump to any node in the medea WL (esqworld-MC.world). Be sure to test on the revisions mentioned above; later ones have lots of differences.

Edit: the performance dumps were captured with this camera position:

player.setPosition(Vec3(1977.0f,3066.0f,401.0f));

player.setDirection(Vec3(90.0f,320.0f,0.0f));

 

Edited by demostenes
update
Link to comment

Unfortunately, we're unable to reproduce this drop. I took measurements at the coordinates you specified on all three revisions you mentioned. I found a slight drop (under ~2 ms) on 1083 compared to 1079, but on revision 1093 the total frame time decreased to ~14-15 ms, which is lower than in 2.8.0.1. I ran the measurements on the most similar setup we have: i7-4790K and GTX Titan X. I've attached the results.
Could you please double-check that your measurements weren't affected by any third-party software? If it still reproduces on your machine, please attach a microprofile dump from runtime (the default main_x64.exe without any logic except the camera would be perfect).

Thanks

esq_perf.zip

Link to comment
In reply to vvvaseckiy:

OK, we can do that. I am quite sure there is no third-party software interfering, because it happens on several computers, and after rolling back to 2.8 everything is OK. The microprofiles from the Editor are several posts above. You can see that almost everything takes more time; in particular, world::get intersection is many times slower.

Edited by demostenes
Link to comment

It seems that the versions in the file names of the runtime dumps are swapped, is that right? Could you please check, because from the files you've sent it looks like runtime performance increased.

Also, there was a minor change in 2.9 affecting thread priority in the OS: the process_priority command existed in the previous version but was slightly reworked, and there is a new command, gpu_thread_priority. Normally, after migration, this should not affect performance at default values. But it may in your case, so please try different options for this command, starting with the highest values (3 and 7), and check whether that affects performance on your hardware. Meanwhile, we will keep investigating this performance drop.

Thanks

Link to comment
In reply to vvvaseckiy:

Yes, they are swapped, sorry about that.

If you look at the 2.9 dump, world::get intersection and worldspatial::get intersection are about 3x slower. And intersection was reworked in 2.9. Couldn't this be the cause?

Thanks for the hint, we will try changing the thread priority.

 

Link to comment

From your microprofile dumps I see that, apart from the *::getIntersection methods, the timings increased consistently across almost all methods, which is very strange behavior. I also noticed a difference in the number of CPU threads in your dumps. That is expected between releases, but there shouldn't be such a change between editor and runtime. Please check that no special parameters are set for the engine or editor processes in the OS task manager (Task Manager → Processes → *engine process* → RMB → Set Affinity/Set Priority).

Since we were unable to reproduce this issue on the configurations we have, we would first like to test it on the closest possible configuration. Could you please send us the dxdiag info from the machine you got the dumps from, so we can build a system with the same driver version, etc.?
If this does not bring results, we would like to do a TeamViewer session, so we can inspect and capture the system and engine state when the issue happens. I will PM you if that is required.

Thanks

Link to comment

Here it is, from the PCs where we made the first dumps. I've also added a dump from a second PC: it gets 26-35 FPS (unstable) on 2.9, while on 2.8 it holds a fairly stable 39-41. Also, 2.9 behaves rather randomly: I enlarged the render window and scaled it back, and after that I had a fairly stable 25-26 FPS. 2.8 behaves consistently.

DxDiag.txt

DxDiag_pc2.txt

Edited by demostenes
Link to comment
  • 2 weeks later...

Hello,
We're still struggling to reproduce this drop on our side. So far we have found some reports of multi-threaded performance drops in specific Win10 versions. Could you please check that all available quality updates are installed on the PCs where performance is bad? Also, it would be great if you could try a fresh install of Win10 and see if it affects performance.
Thanks
 

Link to comment

Hello,

It does not look like it has anything to do with the Windows version, updates or age of the installation:

PC 1: version 1903, build 18362.356 (all available Windows updates, latest GPU drivers, old installation): perf issue

PC 2: version 1903, build 18362.356 (new installation): perf OK

PC 3: version 1903, build 18362.356 (new installation): perf OK

PC 4: version 1809, build 17763.737 (old installation): perf issue

PC 5: version 1809, build 17763.737 (old installation): perf OK

Anyway, the nightly build you sent us was OK, so I would suggest the following approach: send us several versions (each with more "suspicious" commits), and we can test them and hopefully find which commit causes it. We can also try the bisection method. If your last 2.8 commit is n and the latest 2.9 commit is n+100, send us n+50. We will test it and, based on the result, try n+25 or n+75, and so on. We should be able to get there quite fast.
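
The bisection idea above can be sketched in Python. This is only an illustration: `first_bad_build` and the `is_bad` callback are hypothetical names, and in practice `is_bad` would mean "install that build and measure FPS by hand". It assumes build `good` is fast, build `bad` is slow, and the regression stays in place once it has been introduced.

```python
from typing import Callable, List, Tuple


def first_bad_build(good: int, bad: int,
                    is_bad: Callable[[int], bool]) -> Tuple[int, List[int]]:
    """Return the first slow build in (good, bad] and the builds tested on the way.

    `good` must be a known-fast build and `bad` a known-slow one.
    """
    if good >= bad:
        raise ValueError("good must come before bad")
    tested: List[int] = []
    while bad - good > 1:
        mid = (good + bad) // 2   # e.g. n+50 for the range n .. n+100
        tested.append(mid)
        if is_bad(mid):
            bad = mid             # regression is at mid or earlier
        else:
            good = mid            # regression is after mid
    return bad, tested


if __name__ == "__main__":
    # Pretend the regression landed in build n+37 (a made-up number for the demo).
    culprit, tested = first_bad_build(0, 100, lambda build: build >= 37)
    print("first slow build:", culprit)
    print("builds tested:", tested)
```

For a range of 100 commits this needs at most 7 test builds, which matches the expectation of getting there quite fast.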

Edited by demostenes
Link to comment

Hi Jirka,

Thanks for the additional testing.

We already found a commit that causes this behavior.

We need to check whether a fresh Windows 10 install will eliminate this issue on your side. We have tested more than 30 PCs and none of them shows the performance drop. We tried using the same hardware as you, changed Windows versions, updates and so on, but without any luck.

In theory, there should be something on your side (either in Windows or in BIOS tweaks) that kills multi-threading performance. Maybe it's some software installed a long time ago, or maybe some wrong setting in the BIOS (I can't tell you which one, because it can be literally anything).

Could you please do a fresh Windows 10 install (maybe on a separate HDD) on the PCs where performance is lower, to verify whether our theory is correct?

Thanks again for your help!


Link to comment

We've done a completely clean installation on a PC with the perf issue, and it seems OK (without any BIOS changes). This is not good news, because a fresh Windows installation with nothing else on it is not a realistic use case, and we really cannot tell customers: buy our nice new game and, by the way, if you want good performance, you need a fresh installation of Windows; and if performance drops, reinstall again. These issues started with 2.9, so the root cause must be found, so that we can at least warn users about the specific setup or make the code compatible with such a configuration (as 2.8 was).

Edited by demostenes
Link to comment

It's actually a pretty rare issue, and it seems to occur mostly on PCs where Win10 was updated after a hardware modification (e.g. switching the CPU or GPU, or just swapping storage devices). The thing is that the issue lies in bad multithreading performance (multithreading is the main change between 2.8 and 2.9, and the build I sent you earlier used the old single-threaded implementation of some engine features), so a PC with malfunctioning CPU multithreading will be slow not only in UNIGINE-based projects but in any other multi-threaded software.

Win10 is not the only thing that could affect performance. It could also have been some third-party software or service, or even some kind of malware. So if you continue using the PC where you reinstalled Win10, performance might drop again after you install some software you used before. If that happens, please let us know.
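
One rough way to check whether a machine scales across threads at all, independent of the engine, is to time the same CPU-bound work on one thread and on several. The Python sketch below is an illustration under stated assumptions and not a UNIGINE tool: it relies on CPython's hashlib releasing the GIL while hashing large buffers, and the function names, task count and buffer size are arbitrary choices.

```python
import hashlib
import os
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Optional


def hash_work(data: bytes, rounds: int) -> str:
    """CPU-bound task: feed the same buffer into SHA-256 several times."""
    digest = hashlib.sha256()
    for _ in range(rounds):
        digest.update(data)  # CPython releases the GIL here for large buffers
    return digest.hexdigest()


def timed_run(workers: int, tasks: int, data: bytes, rounds: int) -> float:
    """Run `tasks` identical jobs on `workers` threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda _: hash_work(data, rounds), range(tasks)))
    return time.perf_counter() - start


def scaling_report(max_workers: Optional[int] = None, tasks: int = 16,
                   size: int = 1 << 22, rounds: int = 8) -> Dict[str, float]:
    """Time the same workload on 1 thread and on `max_workers` threads."""
    workers = max_workers or os.cpu_count() or 1
    data = bytes(size)  # zero-filled buffer; the content does not matter
    single = timed_run(1, tasks, data, rounds)
    multi = timed_run(workers, tasks, data, rounds)
    return {"workers": workers, "single_s": single, "multi_s": multi,
            "speedup": single / multi}


if __name__ == "__main__":
    report = scaling_report()
    print(f"{report['workers']} threads: {report['single_s']:.2f} s single, "
          f"{report['multi_s']:.2f} s multi, speedup x{report['speedup']:.2f}")
```

Comparing the reported speedup between an affected PC and a healthy one with similar hardware gives a hint whether the problem is outside the engine. A speedup that stays close to 1 on a multi-core CPU would point in that direction, but this is a coarse check and not a diagnosis.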

Link to comment