
[SOLVED] Performance problems related to caching?




Hi,

 

We're running Unigine as a simulation in which we drive our trains from one point in the world to another. A simulation run can potentially be on the order of 50 km in distance. Between simulation runs we typically leave the Unigine instance running (but unload the world between runs).

One thing that we have observed with regular frequency is that performance on the first run after starting Unigine can be quite poor. We see frame rate drops and stutters, which could be due to a number of factors: loading from disk (we are using an SSD), texture/mesh upload to the card, etc. However, on subsequent simulation runs (that start from the same location in the world), the performance is far smoother. The poor performance occurs in different spots in the scene each time, so it's not like we can characterise it by content. We have also optimised our content fairly aggressively to reduce the number of draw calls. I certainly don't think our problem is hardware related - we're running an i7-4970K with 32GB RAM and a triple-SLI 780 Ti configuration (and these problems aren't related to microstutter, since we see the same issues on our single-Titan box).

 

So it seems to me that on the first simulation run some sort of caching is being done under the hood, which results in subsequent simulation runs being smoother. So I'm wondering: are there any explicit API calls we can look into to control the caching/file/resource loading so that our performance is smoother? I understand from my somewhat vague description that you won't be able to completely characterise the problem, but I'm wondering if anyone has hints on which configuration settings or API calls we should be looking at to explicitly control the loading and caching of resources.

 

thanks

 

Craig 


Craig, first of all you have to identify which resource(s) cause the rendering stalls at certain points of your scene: shader compiles, texture uploads or mesh uploads.

 

Of course this is not so easy in practice, and the best approach depends heavily on how your scene is structured (use of a single world file including all nodes vs. node-file background streaming, material library usage, parameter tuning optimized for a fixed 30/60 Hz frame rate, etc.).

 

UNIGINE has some console variables, e.g. to force shader compilation/warming at startup. This might be helpful for detecting render stalls caused by shader compiles during simulation.
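
For illustration only (these are the variables referenced in the thread Craig links below; please verify the exact names and accepted values in the console of your engine version), forcing shader loading/warming at startup from the console or your startup cfg might look like:

    render_manager_load_shaders 1
    render_manager_warm_shaders 1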

 

In the same way, textures and meshes can also be forced to load at startup, but I would expect this to cause cache thrashing, as your large world will not fit completely into memory.

 

As far as I can remember you are using quite large 2K-4K textures, e.g. for the terrain and all kinds of masks. This might also be a source of render stalls due to GPU upload stalls. Once again, the tricky part will be to identify the root of the overload. The overload might also be caused by multiple texture uploads per frame, not only a single one.
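
As a starting point for that, the built-in engine profiler overlay shows per-frame render/update timings while you drive the route, so upload-related spikes become visible. The console variable name below is from the engine versions I have at hand; please verify it in your build:

    show_profiler 1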

 

One possible tuning parameter might be the time delay between individual file loads in the file-reader background thread, as this would indirectly limit the number of texture uploads per frame. The same might be true for tuning the UNIGINE frame-time budgets for resource uploads and grass/clutter generation, though I am not sure whether this only applies to node-file streaming, and the current implementation is not optimal.

 

If your scene has a "layered" structure (e.g. terrain / grass+clutter / vegetation / buildings etc.), I would recommend running your simulation with just single layers enabled and checking for initial stuttering at the problematic locations of your database. If no stuttering occurs, activate an alternative/additional layer and check again. Maybe this will at least give an indication of the scene geometry type causing the problem.
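
A rough UnigineScript sketch of such a layer isolation test, assuming your layer nodes can be identified by a naming convention (the "vegetation_" prefix is just a placeholder, and the editor/node calls are written from memory, so please double-check them against your SDK documentation):

    // Disable every editor node whose name starts with the given layer prefix,
    // then re-run the problematic route and check whether the stutter is gone.
    void disable_layer(string prefix) {
        forloop(int i = 0; engine.editor.getNumNodes()) {
            Node node = engine.editor.getNode(i);
            if(strstr(node.getName(), prefix) == 0) {
                node.setEnabled(0);
                log.message("disabled node: %s\n", node.getName());
            }
        }
    }

    // Example usage: disable_layer("vegetation_");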

 

In all cases, caching can be controlled "manually" via the engine.filesystem cache functions, e.g. addCacheFile(). engine.filesystem also has functions for checking the number of queued files/images/meshes etc., which might be useful for tracking down the actual cause of a stall by comparing file counts (or, even better, file names) just before and after a frame stall.
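
A minimal UnigineScript sketch of that kind of comparison (addCacheFile() is the call mentioned above; the queued-count getter name is written from memory and the texture path is only a placeholder, so verify the exact engine.filesystem functions in your SDK documentation):

    // Pin a known heavy file in the cache and log whenever the load queue changes,
    // so queue jumps can be correlated with frame stalls in the profiler.
    engine.filesystem.addCacheFile("textures/terrain/heavy_mask.png");    // placeholder path

    int prev_queued = 0;

    void update() {
        int queued = engine.filesystem.getNumQueuedFiles();    // name assumed, check the docs
        if(queued != prev_queued) {
            log.message("queued files: %d -> %d\n", prev_queued, queued);
            prev_queued = queued;
        }
    }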

 

Just some ideas for finding the bottleneck


Just to follow up further on this post: we found that a small number of our stalls were related to using PNG files for our terrain tile masks, but the larger number of stalls could be explained by shader compiles. We followed some advice in this thread https://developer.unigine.com/forum/topic/325-render-manager-load-shaders-and-render-manager-warm-shaders/ with regard to this particular quote: "You will have a shaders compile stall if you have some node references with unique materials which are not presented in the .world materials."

 

After modifying our content we didn't see any real reduction in stalls. However, we did notice that once the file 'shader_d3d11.cache' is present, the second simulation run has no stalls at all. Similar to what was mentioned in the thread above, we can set 'render_manager_check_shaders' to 1 in our cfg file, but this is supposedly only for debugging purposes since it generates a large cache file. That's fine; in practice we're not going to run with this option turned on, since it takes nearly 45 minutes to generate the cache file. But I'm wondering:

 

  1. Is there another way of batch-producing that 'shader_d3d11.cache' file without changing your cfg, e.g. some batch command? And, more importantly:
  2. The 'shader_d3d11.cache' file produced by the 'render_manager_check_shaders' option is presumably a compilation of shaders for all materials in the data path - is this correct?
  3. Is it a valid approach to run Unigine with 'render_manager_check_shaders' set to 1 to produce the cache file and ship our product with this cache file (obviously with render_manager_check_shaders set to 0) to avoid possible shader-compile-related render stalls, or is this not a valid approach?
  4. Does the 'shader_d3d11.cache' differ between different cards? E.g. if I generated the cache file on a GTX 680 and then ran it on a 780 Ti or Titan, would the cache file still be valid?

Hello, I was just wondering if any devs from Unigine have any feedback on this?

Based on my testing so far, 3 and 4 seem to be a valid way of doing things, but I want to confirm.


Hi Craig,
 
Sorry for the late reply. 
 

Is there another way of batch-producing that 'shader_d3d11.cache' file without changing your cfg, e.g. some batch command?


Yes, there is a bunch of scripts located in <SDK>/source/tools/Interpreter/scripts/render which can generate the shader cache files:

  • shader_cache_d3d9_default.usc
  • shader_cache_d3d11_default.usc
  • shader_cache_d3d11_simple.usc (Simple mobile shaders)

You can modify the defines (for example: quality, MSAA level) and the materials list (for example, if you are not interested in generating the shader cache for some materials) inside these scripts.
 
For shader generation the Microsoft DirectX SDK is required. Also, fxc.exe should be in the PATH variable.
 
To generate the shader cache you need to perform the following steps:

  • Copy usc_x64 (or usc_x86) to <SDK>/source/tools/Interpreter/scripts/render
  • Execute this command inside that folder:

    usc_x64 <needed_shader_script.usc>
     
  • The output shader cache will be generated inside the <SDK>/data folder.

By default the generated cache files will be more than 400 MB, and the generation process will take more than 20 hours in total for the first two scripts (without the simple shaders).
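
Purely as an illustration (the DirectX SDK path below is just an example install location; adjust it to your machine), the two D3D11 scripts could be collected into a small batch file run from that folder:

    rem generate_shader_cache.bat -- run from <SDK>/source/tools/Interpreter/scripts/render
    rem fxc.exe from the Microsoft DirectX SDK must be reachable via PATH
    set PATH=%PATH%;C:\Program Files (x86)\Microsoft DirectX SDK (June 2010)\Utilities\bin\x64

    usc_x64 shader_cache_d3d11_default.usc
    usc_x64 shader_cache_d3d11_simple.usc

    rem the resulting cache files end up in the <SDK>/data folder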



the 'shader_d3d11.cache' file produced by the 'render_manager_check_shaders' option is presumably a compilation of shaders for all materials in the data path - is this correct?


Yes, this is correct. Moreover, shaders will be generated with all defines (with all possible combinations of quality, MSAA and other settings).
 

Is it a valid approach to run Unigine with 'render_manager_check_shaders' set to 1 to produce the cache file and ship our product with this cache file (obviously with render_manager_check_shaders set to 0) to avoid possible shader-compile-related render stalls, or is this not a valid approach?


Yes, you can use the engine to generate cache files. Or you can generate them with the fxc.exe tool from the Microsoft Windows SDK.
 

Does the 'shader_d3d11.cache' differ between different cards? E.g. if I generated the cache file on a GTX 680 and then ran it on a 780 Ti or Titan, would the cache file still be valid?


Cache files should be compatible between different GPUs.

Thanks!

