Asynchronous Data Streaming
Data streaming is an optimization technique intended to reduce the spikes caused by loading graphics resources. Not all data is loaded into random access memory (RAM) at once: only the required data is loaded first, and the rest is loaded progressively, on demand.
Resources are loaded and transferred to the GPU in separate asynchronous threads. After that, they are synchronized and added to the virtual scene on the CPU side.
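The general pattern described above (background threads load data, and the main thread then adds the finished resources to the scene) can be sketched in a few lines of Python. This is only an illustration of the idea, not engine code: load_resource, loader_worker and stream are hypothetical names, and the "loading" is a stand-in for real disk and GPU work.

```python
import queue
import threading

def load_resource(name):
    # Stand-in for reading and decoding a file (hypothetical helper).
    return {"name": name, "data": f"<bytes of {name}>"}

def loader_worker(requests, ready):
    # Runs on a background thread: take a request, load it, hand the result back.
    while True:
        name = requests.get()
        if name is None:  # sentinel value: stop this worker
            break
        ready.put(load_resource(name))

def stream(names, num_threads=2):
    requests, ready = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=loader_worker, args=(requests, ready))
               for _ in range(num_threads)]
    for worker in workers:
        worker.start()
    for name in names:
        requests.put(name)
    for _ in workers:
        requests.put(None)  # one sentinel per worker

    scene = {}
    # "Main thread" side: collect each finished resource and add it to the scene.
    for _ in range(len(names)):
        resource = ready.get()
        scene[resource["name"]] = resource["data"]
    for worker in workers:
        worker.join()
    return scene
```

The main thread never performs the loading itself; it only picks up resources that are already complete, which is what keeps loading work out of the frame.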
In UNIGINE, asynchronous data streaming is enabled by default. When it is necessary to force-load all meshes and/or textures required for each frame at once (e.g. when grabbing frame sequences, rendering node previews, or warming up), enable the Forced mode in the Streaming Settings in the Editor or run the render_streaming_mode 1 console command.
The streaming system asynchronously loads the following data into RAM:
- All texture runtime files and textures with the Unchanged option enabled, including cubemaps, voxel probe maps and shadow maps of baked shadows.
- Meshes of ObjectMeshStatic, ObjectMeshClutter and ObjectMeshCluster objects.
General information on streamed resources can be obtained by using the render_streaming_info console command.
It is also possible to print the list of loaded resources and detailed information on them by using the render_streaming_list console command.
Procedurally generated objects such as ObjectMeshClutter and ObjectGrass are generated in a separate thread, which significantly reduces performance costs.
Common Streaming Settings
The number of loaded/created graphic resources during a frame is limited by the Render Budget parameter. It can be used to find a balance between loading speed and performance.
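The effect of a per-frame limit can be illustrated with a small Python sketch. It assumes, purely for illustration, that the budget is expressed as a number of resources per frame; the actual units of the Render Budget parameter are not stated here, and run_frames is a hypothetical name.

```python
from collections import deque

def run_frames(pending, budget_per_frame):
    """Return a list with the resources created in each frame."""
    if budget_per_frame < 1:
        raise ValueError("budget_per_frame must be at least 1")
    pending = deque(pending)
    frames = []
    while pending:
        created = []
        # Create resources until the budget for this frame is spent.
        while pending and len(created) < budget_per_frame:
            created.append(pending.popleft())
        frames.append(created)
    return frames
```

A smaller budget spreads the same work over more frames (slower loading, smoother frame time), while a larger budget finishes sooner at the cost of more work per frame.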
To take advantage of multithreading, set the maximum number of threads used for resource streaming with the render_streaming_max_threads console parameter. A higher number of threads results in faster streaming, but may cause spikes if GPU resources are consumed excessively.
By default, the Memory Limit control is enabled. When the specified memory volume is exceeded, meshes and textures that are not currently required for rendering are unloaded. Maximum memory amounts are defined separately for meshes and textures via the Meshes Memory Limit and Textures Memory Limit parameters. The values are specified as a percentage of the total GPU memory.
Graphics resources are checked regularly for modification so that they can be reloaded or deleted. The corresponding check intervals are specified by the Check Duration and Destroy Duration parameters, respectively.
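The memory limit behavior can be sketched as follows. This is a simplified model, not the engine's algorithm: enforce_limit is a hypothetical function, and the choice to unload the oldest unneeded resources first is an assumption made only for the example.

```python
def enforce_limit(resources, required, total_gpu_bytes, limit_percent):
    """Unload resources that are not required until usage fits the limit.

    resources: dict mapping name -> size in bytes, in load order (modified in place).
    required:  set of names needed for rendering right now (never unloaded).
    Returns the list of unloaded names.
    """
    limit = total_gpu_bytes * limit_percent // 100
    used = sum(resources.values())
    unloaded = []
    for name in list(resources):
        if used <= limit:
            break
        if name in required:
            continue  # still needed for rendering: keep it
        used -= resources.pop(name)
        unloaded.append(name)
    return unloaded
```

For instance, with 200 bytes of total memory and a 50% limit, three 40-byte resources exceed the 100-byte limit, so one resource that is not required is unloaded.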
The streaming system uses a texture cache composed of minimized copies that are generated for all textures at a user-defined resolution and stored in the data/.cache_textures folder. These copies are used instead of the originals while the originals are being loaded.
The texture cache is loaded at engine startup and stays in memory after loading. To provide smooth loading and rendering of resources, streamed entities have the following loading priorities:
- Texture cache. Uncached textures cause spikes because the texture cache is generated for them on the fly. Materials with uncached and unloaded textures applied are rendered black.
- Full-size textures.
The texture cache can be preloaded or loaded after geometry data; the Texture Cache Preload flag controls its loading priority.
By default, texture cache files are generated with a resolution of 16×16. Such a low resolution causes visual artifacts during loading. The resolution can be increased by setting the Texture Cache Resolution parameter. Existing cache files should then be wiped by using the render_streaming_textures_cache_destroy console command, after which the texture cache is regenerated automatically at the new resolution.
The amount of video memory currently occupied by the texture cache is shown in the Performance Profiler tool.
The render_streaming_textures_cache_load and render_streaming_textures_cache_unload console commands control loading of the texture cache. For example, after the full-size textures have been loaded, the texture cache can be unloaded from video memory for better performance.
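Raising the cache resolution trades memory for quality. The rough estimate below shows how quickly the footprint grows; it assumes uncompressed 4-byte-per-pixel square copies without mipmaps, which is an illustrative assumption and not the engine's actual cache format. The cache_bytes helper is hypothetical.

```python
def cache_bytes(num_textures, resolution, bytes_per_pixel=4):
    """Estimate cache size for square minimized copies (assumed uncompressed)."""
    return num_textures * resolution * resolution * bytes_per_pixel

small = cache_bytes(1000, 16)  # 1000 textures cached at 16x16
large = cache_bytes(1000, 64)  # the same textures cached at 64x64
print(small, large, large // small)
```

Under these assumptions, going from 16×16 to 64×64 multiplies the cache size by 16, because the pixel count grows with the square of the resolution.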
Settings and workflow for the OpenGL API differ slightly from those for the DirectX API.
Under OpenGL, the data streaming system uses two intermediate buffers to transfer data between the CPU and a new resource:
- Async Buffer used for mesh and texture streaming.
- Async Buffer Indices used for streaming of vertex indices of meshes.
The size of the Async Buffer must correspond to the size of the largest resource (mesh/texture). Otherwise, when a larger resource is streamed, the buffer is resized, causing a spike.
Buffer synchronization is enabled when the Async Buffer Synchronization parameter is set to 1. In this case, async buffers are created only once and then synchronized, which reduces the time spent on allocating and freeing memory. When synchronization is disabled, both Async Buffer and Async Buffer Indices are created anew for each new resource. This reduces the number of buffer synchronizations but increases the number of memory allocations.
Sometimes, depending on the hardware and driver used (e.g. when the main thread is affected by synchronization primitives in other threads), memory allocation may be faster than synchronization. In such cases, when streaming becomes unacceptably slow, it is recommended to disable buffer synchronization.
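The trade-off between the two modes can be made concrete by counting operations in a toy model. The sketch below is not how the driver works internally: simulate is a hypothetical function that counts one allocation per buffer creation or resize and one synchronization per resource uploaded through a reused buffer.

```python
def simulate(resource_sizes, synchronized, initial_capacity=0):
    """Count buffer allocations and synchronizations for a stream of resources."""
    allocations = 0
    synchronizations = 0
    capacity = initial_capacity
    if synchronized and capacity > 0:
        allocations += 1  # the reusable buffer is created once up front
    for size in resource_sizes:
        if synchronized:
            if size > capacity:
                capacity = size
                allocations += 1  # resize: the source of a potential spike
            synchronizations += 1  # reuse of the buffer requires a sync
        else:
            allocations += 1  # a fresh buffer for every resource, no sync
    return {"allocations": allocations, "synchronizations": synchronizations}
```

With resources of sizes 4, 8, 2 and 16, a reused buffer sized for the largest resource needs 1 allocation and 4 synchronizations, an undersized one (capacity 4) needs 3 allocations because of two resizes, and the unsynchronized mode needs 4 allocations and no synchronizations.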
There are known issues and workarounds for some hardware and driver software:
- Mesa 3D GL: Buffer synchronization must be disabled (gl_async_buffer_synchronization 0) for better performance. The updated Open Graphics Drivers are required.
- Intel: Take into account that VRAM is limited by the OS to one-half of RAM.
- Intel Mesa 3D GL: Low performance may be fixed by adjusting the Framerate Stabilization parameters.
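For the Intel note above, the following sketch shows how the one-half rule interacts with percentage-based memory limits. The percentages used here are placeholders chosen for the example, not recommended or default values, and memory_limits_bytes is a hypothetical helper.

```python
GIB = 1024 ** 3

def memory_limits_bytes(ram_bytes, meshes_percent, textures_percent):
    """Derive absolute limits when VRAM is capped at one-half of RAM."""
    vram = ram_bytes // 2
    return {
        "vram": vram,
        "meshes": vram * meshes_percent // 100,
        "textures": vram * textures_percent // 100,
    }

# Example: 16 GiB of RAM, with placeholder limits of 25% and 50%.
limits = memory_limits_bytes(16 * GIB, 25, 50)
print(limits["vram"] // GIB, limits["meshes"] // GIB, limits["textures"] // GIB)
```

In this example, 16 GiB of RAM gives 8 GiB of usable VRAM, so a 50% textures limit corresponds to 4 GiB rather than 8 GiB.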