Performance Profiler
A performance profiler displays performance data in a timeline. It reports how much time is spent per each frame for updating all aspects of your project: rendering nodes that are in view, updating their states, executing scripts with game logic, calculating physics, etc.
With the profiler, you can:
- Detect the bottlenecks of your application
- Check if art assets optimization is required
- Check if code optimization is required
- Compare the profiling results before and after the changes
Activating Profiler#
To show profiler statistics in runtime, type the show_profiler 1 command in the console. To disable the profiler in runtime, type show_profiler 0 in the console.
You can configure the information displayed by the profiler by using the corresponding console commands.
The following profiling modes are available:
- Generic shows only the general statistics data.
- Rendering shows the detailed rendering statistics.
- Memory shows the detailed visualization of the memory consumption data.
- Physics shows the detailed physics-related statistics (within the Physics radius).
- World Management shows the statistics on the whole loaded world.
- Thread shows the statistics on loading threaded resources.
- Visualization of the table and timeline charts is also configurable by the console commands.
In the Editor, the profiler is also accessible via the toolbar (Tools -> Performance Profiler):
Generic Profiler#
Total | Total time in milliseconds taken to both calculate and render the current frame. This is the duration of the main loop in the application execution sequence. |
---|---|
Total CPU | Total time in milliseconds taken to prepare the current frame (including update, render, and swap). |
Total GPU | GPU execution time measured using timestamp queries from the graphics API. Represents the duration of GPU work for a single frame. Notice Since the data is taken from the graphics API, therefore, convergence with the other timings received by the engine itself is not guaranteed. |
Update | Time taken to update application logic. This includes executing all steps in the update() function of the world script. It also includes the update of states of all nodes (for example, update of the skinned animation or of a particle system to spawn new particles).
|
Render CPU | Time taken to prepare all data to be rendered in the current frame and feed rendering commands from the CPU to the GPU. If Render CPU time is too high, it signals that art assets may need to be optimized, for example:
|
Waiting GPU | Time between completing all calculations on the CPU up to the moment when the GPU has finished rendering the frame. (See the illustration). This counter is useful to analyze the bottleneck in your application's performance.
|
Interface | Time taken to render all GUI widgets. |
Total Physics | Time taken to perform all physics calculations. |
Waiting Physics | Time period during which the physics module waits for the completion of rendering process to return to the Main thread and execute the rest of the physics ticks. Notice In addition to the individual values listed above, average characteristics are also calculated and displayed. |
CPU ram free <FREE> / <TOTAL> |
Physical system memory information.
|
CPU ram usage <USED> / <STREAMING_LIMIT> |
Amount of system RAM used by the engine process with the configured streaming memory limit.
|
CPU ram active | Amount of system RAM actively used by the engine (excluding any cached memory) CPU ram active = CPU ram usage - CPU ram cache |
CPU ram cache | Amount of system RAM used to cache streamed assets, including skinned meshes, static meshes, and textures. CPU ram cache = CPU ram usage - CPU ram active |
CPU ram committed free <COMMITTED_FREE>/<COMMITTED_TOTAL> | Amount of committed memory that is currently available for use, after reserving memory for streaming.
|
CPU ram committed usage <COMMITTED_USAGE>/<COMMITTED_LIMIT> |
Amount of memory currently committed by the engine process.
|
CPU ram malloc | The amount of memory allocated by the UNIGINE custom allocator. |
Frame Allocations | The number of allocations made per frame. The UNIGINE allocator allocates memory in pools which allows the allocation to be faster and more efficient. |
Live Allocations <CURRENT>/<PEAK> | The current number of allocations made / The maximum number of allocations made during runtime (peak consumption). |
GPU vram free | Amount of GPU video memory currently available for use, calculated based on configured streaming limits and budget. |
GPU vram usage <USAGE>/<LIMIT> | The amount of VRAM used by GPU. This value is provided by a graphics driver
The following values are taken into account while the LIMIT is calculated: |
GPU vram active | Amount of GPU video memory currently in active use, excluding cached resources. GPU vram usage - GPU vram cache |
GPU vram cache | Amount of GPU memory used to store cached streamed assets, including skinned meshes, static meshes, and textures. VRAM Textures Unused + VRAM Meshes Static Unused + VRAM Meshes Skinned unused You can limit maximum size of VRAM Caches using render_streaming_cache_vram |
GPU ram usage | Amount of system RAM currently used by the GPU, taken from graphic API. |
Memory Profiler#
The following statistics are displayed in addition to the generic one:
CPU ram static pool | Memory pool pre-allocated at engine startup for small-sized allocations (features fixed-size blocks, e.g., 16, 32, 48 bytes). Static pools must be configured and tuned using memory statisctics (memory_info and memory_optimize_static_pools) |
---|---|
CPU ram dynamic pool | Flexible memory pool that grows and shrinks dynamically for allocation sizes beyond configured static pools capacity. Available on Windows only. |
CPU ram instance pool | Developer-defined memory pools, share allocator behavior with dynamic pools but with no size limitations. Notice Learn more about pools here: Memory Allocator |
GPU vram alloc | The amount of vram allocated made by the engine on the GPU side. |
GPU ram alloc | The amount of ram allocated made by the engine on the GPU side. |
GPU Frame Allocations | The number of video memory allocations made per frame. |
GPU Live Allocations <CURRENT>/<PEAK> | The current number of allocations made / The maximum number of allocations made during runtime (peak consumption). |
GPU Allocator small pool size | The maximum size of the VRAM pool for small allocations. |
GPU Allocator small pool usage | Actual usage of the video memory pool for small allocations. |
RAM ImagePool | RAM usage managed by the ImagePool, which handles all Image class instances. |
RAM StringPool | RAM usage managed by the StringPool, which handles all String instances. The string pool increases performance, as it stores only one string instead of its several duplicates. However, this pool is cleared only on the engine shutdown, so it is important to keep its size at the adequate level during runtime. |
RAM GUI | Memory (RAM) amount currently used for GUI, in megabytes. |
RAM FFP | Memory (RAM) amount currently used for FFP, in megabytes. |
RAM Sounds | Memory (RAM) amount currently used for sound assets, in megabytes. Cache limits for sounds can be adjusted using: |
RAM Terrain | Memory (RAM) amount currently used for terrain streaming, and the memory limit set for it, in megabytes. The memory limit is the maximum amount of RAM that can be used for terrain streaming. Limit is adjustable via render_landscape_cache_cpu_size. |
RAM Animations | Memory (RAM) amount currently used for animations (stored in the .anim files). |
RAM Meshes Static | Total amount of RAM currently used for streaming static meshes.
Notice
The same static mesh will use more RAM than VRAM, as additional data is generated in RAM to calculate collisions, intersections, spatial trees and so on. |
Ram Meshes Procedural Dynamic | The total amount of RAM currently used for streaming procedural dynamic meshes. |
RAM Meshes Skinned | Total amount of RAM currently used for streaming skinned meshes. |
RAM Meshes Static Used | Amount of RAM actively used for static meshes rendering. |
RAM Meshes Static Unused | Amount of RAM reserved for cached static meshes that are currently not in use. |
RAM Meshes Skinned Used | Amount of RAM actively used for skinned meshes rendering. |
RAM Meshes Skinned Unused | Amount of RAM reserved for skinned meshes that are currently not in use. |
VRAM Render Buffers Used | Amount of VRAM actively used for rendering buffers (Gbuffer, post-effects, etc.), in megabytes. |
VRAM Render Buffers Unused | Amount of VRAM currently reserved for rendering buffers (Gbuffer, post-effects, etc.) that are currently not in use. |
VRAM Textures | Total amount of VRAM currently used for textures, in megabytes. |
VRAM GUI | Amount of VRAM currently used for GUI, in megabytes. Visible only if GUI is present in the world. |
VRAM Geometry Grass | Amount of VRAM currently used for vertices of grass, in megabytes. Visible only if Grass present in the world. |
VRAM Geometry Particles <USAGE>/<LIMIT> | Amount of VRAM currently used for vertices of particle systems, and the maximum amount of VRAM that can be used for them, in megabytes. Visible only if Particle Systems are present in the world. Avaiable only when using Vulkan API. This values are adjustable via: |
VRAM Geometry Decals | Amount of VRAM currently used for decals, in megabytes. |
VRAM Meshes Static | Total amount of VRAM currently used for streaming static meshes. |
VRAM Meshes Procedural Dynamic | Total amount of VRAM currently used for procedural meshes. |
VRAM Meshes Skinned | Total amount of VRAM currently used for skinned meshes. |
VRAM Small Pool unallocated | Free VRAM on the Small Pool GPU Allocator small pool size - GPU Allocator small pool usage. |
VRAM Unaccounted | Represents VRAM not allocated through the engine. GPU VRAM Usage - GPU VRAM Alloc |
VRAM Textures Used | Amount of VRAM actively used for textures rendering. |
VRAM Textures Unused | Amount of VRAM currently used for textures cache, in megabytes. |
VRAM Meshes Static Used | Amount of VRAM actively used for static meshes rendering. |
VRAM Meshes Static Unused | Amount of VRAM used for cached static meshes. |
VRAM Meshes Skinned Used | Amount of VRAM actively used for skinned meshes rendering. |
VRAM Meshes Skinned Unused | Amount of VRAM used for cached skinned meshes. |
VRAM Meshes Skinned Allocator Size | Total allocated video memory reserved for skinned mesh data; if usage exceeds current size, the allocator expands. You can configure the size of a skinned mesh chunk using the skinned_mesh_pool_chunk_size setting. Reducing the chunk size may lead to performance degradation, as it's more efficient to perform skinning when the geometry fits within a single chunk. |
VRAM Meshes Skinned Allocator Usage | Amount of video memory currently in use for storing skinned mesh data; reflects how much of the reserved allocator space is actively occupied. |
VRAM Meshes Decals Allocator Size | Total video memory allocated for decal mesh data; defines the reserved memory pool size used specifically for rendering mesh-based decals. You can configure the size of a decal chunk using the decal_pool_chunk_size setting. Unlike skinned meshes, reducing the chunk size does not negatively impact performance. |
VRAM Meshes Decals Allocator Usage | Amount of allocated video memory currently used by mesh decals; indicates how much of the decal memory pool is actively occupied. |
VRAM Terrain Cache <USAGE>/<LIMIT> | GPU memory used for cached terrain tiles (landscape data) and the VRAM cache limit for terrain. Visible only if ObjectLandscapeTerrain is present. Limit is adjustable via: . |
VRAM Terrain Virtual Texture | Amount of VRAM currently used for the Landscape Terrain virtual texture, in megabytes. Visible only if ObjectLandscapeTerrain is present. The amount of VRAM allocated for virtual textures can be adjusted using the render_landscape_terrain_vt_memory_size command. Note that this value is stored in a custom preset, which is activated by setting render_landscape_terrain_streaming_preset to 5. |
VRAM Terrain Detail Textures | Amount of VRAM currently used for Landscape Terrain details, in megabytes. Visible only if ObjectLandscapeTerrain is present. |
RAM Streaming need | Amount of system RAM additionally needed to load currently required assets. A positive value indicates that the engine is short on RAM for streaming and will start freeing caches to load new assets. |
VRAM Streaming need | Amount of system VRAM additionally needed to load currently required assets. A positive value indicates that the engine is short on VRAM for streaming and will start freeing caches to load new assets. |
Rendering Profiler#
The following statistics are displayed in addition to the generic and memory statistics:
Terrain Reload Tiles | Number of Landscape Terrain tiles that are currently awaiting reloading after being modified. |
---|---|
Terrain Reload Bounds | Number of events causing reloading of tiles (one bound may cause reloading of multiple tiles). |
Lights All | Number of light passes rendered per frame. This means that the counter displays the number of all light sources that are currently seen illuminating something in the viewport. This value also includes additional passes for rendering lights in the reflecting surfaces (if dynamical reflections are used). Plain 2D reflection will multiply the number of rendering passes by two, while cubemap-based reflection with six faces updated each frame will multiply the number of rendering passes by six.
Notice
Each light redraws mesh polygons it illuminates. That is why the higher the number of light sources, the higher the number of polygons the graphics card has to render, and the lower the performance. For example, using two omni lights will as much as double the rendered geometry they shine on. |
Lights Omni (Viewports) | Number of rendered Omni lights visible in the main viewport. |
Lights Proj (Viewports) | Number of rendered Projected lights visible in the main viewport. |
Lights World (Viewports) | Number of rendered World lights visible in the main viewport. |
Lights Environment Probe (Viewports) | Number of rendered Environment probes in the main viewport. |
Lights Voxel Probe (Viewports) | Number of rendered Voxel probes in the main viewport. |
Lights Planar Probe (Viewports) | Number of rendered Planar probes in the main viewport. |
Lights Omni (Reflections) | Number of Omni lights rendered in reflection passes. |
Lights Proj (Reflections) | Number of Projected lights rendered in reflection passes. |
Lights World (Reflections) | Number of World lights rendered in reflection passes. |
Lights Environment Probe (Reflections) | Number of Environment probes rendered in reflection passes. |
Lights Voxel Probe (Reflections) | Number of Voxel probes rendered in reflection passes. |
Lights Planar Probe (Reflections) | Number of Planar probes rendered in reflection passes. Notice Planar probes reflect other lights, but do not reflect each other. |
Decals All |
Number of decals rendered per frame (in all rendering passes). If a decal has any additional effect (is transparent, emits light, lit by a light source, or has any other visual effect) that involves an additional rendering pass for this decal, this decal will be rendered additionally in every corresponding pass, thus the total number of rendered decals will increase. |
Decals Ortho (Viewports) | Number of Orthographic decals rendered in the main viewport. |
Decals Proj (Viewports) | Number of Projected decals rendered in the main viewport. |
Decals Mesh (Viewports) | Number of Mesh decals rendered in the main viewport. |
Decals Ortho (Reflections) | Number of Orthographic decals rendered in reflection passes. |
Decals Proj (Reflections) | Number of Projected decals rendered in reflection passes. |
Decals Mesh (Reflections) | Number of Mesh decals rendered in reflection passes. |
Surfaces All |
Number of surfaces rendered per frame (in all rendering passes). If a surface has any additional effect (is transparent, emits light, lit by a light source, or has any other visual effect) that involves an additional rendering pass for this surface, this surface will be rendered additionally in every corresponding pass, thus the total number of rendered surfaces will increase. |
Surfaces (Shadows) | Number of surfaces rendered in shadow map passes. |
Surfaces (Viewports) | Number of surfaces rendered in the main viewport. |
Surfaces (Reflections) | Number of surfaces rendered in reflection passes. |
Dynamic Reflections | The number of dynamic reflections drawn per frame. In case of cubemap reflections, if all six faces are updated, six reflections are rendered each frame. |
Shadows | Number of shadow passes rendered per frame. Each light requires a shadow pass to calculate the shadows. Again, if there are reflecting surfaces with shadows drawn reflected, this will increase the number of shadow passes. |
Triangles All | Total number of triangles rendered per frame including all polygons that are currently visible in the viewport as well as the ones rendered in the process of shadows rendering. |
Triangles (Shadows) | Number of triangles rendered per frame in the process of shadows rendering. Each light source has to redraw the geometry it illuminates, increasing the overall count of rendered triangles. In order to avoid GPU bottleneck, keep the number of dynamic light sources and their radius as low, as possible. |
Triangles (Viewports) | Number of triangles rendered per frame. This includes all polygons that are currently visible in the viewport (geometry). |
Triangles (Reflections) | Number of triangles rendered for reflective surfaces; includes geometry drawn in reflection passes such as planar probes or planar water surfaces. |
Primitives | Number of geometric primitives rendered per frame. This includes points, lines, triangles, and polygons. The visualizer and the profiler itself also add to this counter. The value differs dramatically if tessellation is used. In this case, Triangles reports the number of triangles in the coarse mesh, while Primitives shows statistics on the number of tessellated primitives. |
Dips | The number of draw calls. The higher the number of identical mesh surfaces with the same material, the more effective the instancing is (enabled by default). This means, the number of draw calls is minimized offloading both the CPU and the GPU.
You can compare the number of surfaces (Surfaces) and the number of DIPs used to render them. For example, if there are 30000 surfaces and 1000 DIPs, it means that 30 instanced surfaces of meshes are rendered per only one draw call (Surfaces/Dips). Thus the instancing provides performance boost. |
Barriers | The number of barriers per frame represents how many times GPU resources change states during a single rendering frame. Each transition ensures proper synchronization when a resource is accessed differently, such as switching from rendering to shader reading. This metric reflects the complexity and efficiency of the rendering pipeline - higher counts may indicate frequent state changes, which can lead to pipeline stalls and reduced performance. Reducing unnecessary transitions can improve frame consistency and GPU utilization. |
Materials | Number of materials set per frame. (Materials are set in each of the rendering passes.) |
Shaders | Number of shaders set per frame. (Shaders are set in each of the rendering passes; hence if only one material used, its shader still needs to be set several times. When nothing is visible and the screen is black, even in this case the composite shader is still used.) |
Compiled PSO | Number of pipeline state objects compiled during execution. Reflects runtime PSO creation, which may cause stuttering or delays if not handled in advance. |
Loaded PSO | Number of pipeline state objects loaded from cache or disk. Shows how many precompiled states were reused efficiently, helping to avoid runtime compilation costs. |
Compiled Shaders | Number of shaders that are currently compiled. |
Loaded Shaders | Number of shaders loaded from a precompiled cache or disk. Reflects successful reuse of prebuilt shaders, which improves performance and reduces runtime overhead. |
CPU Loading resources | Number of resources currently being streamed or loaded by the CPU. Represents how many assets are actively being prepared on the CPU side for rendering. |
GPU Loading resources | Number of resources currently being streamed or loaded by the GPU. Represents how many assets are actively being prepared on the GPU side for rendering. |
Physics Profiler#
This profiler shows statistics within the Physics radius.
The following statistics are displayed in addition to the generic one:
PIslands | Number of independent physics islands (groups of bodies that can simulate without interacting with others) within the physics simulation radius. More islands allow better multi-threading of physics; very few islands mean most objects are inter-connected in one simulation group |
---|---|
PBodies | Number of bodies within the physics radius. |
PJoints | Number of joints within the physics radius. |
PContacts | Total number of contacts within the physics radius; this includes contacts between the bodies (their shapes) and body-mesh contacts. |
PCollision | Duration of the collision detection phase of physic simulation when collisions between objects are found. |
PSimulation | Duration of all simulation phases added together. |
Watch an overview of the Physics Profiler options in our video tutorial on physics.
World Management Profiler#
This profiler shows statistics on the whole world.
The following statistics is displayed in addition to the generic one:
WNodes | Total number of nodes in the world (both enabled and disabled). |
---|---|
WBodies | Total number of bodies in the world. |
WShapes | Total number of shapes in the world. |
WJoints | Total number of joints in the world. |
WSpawn | Time in milliseconds that the engine spends on generating content in procedural nodes (such as grass, clutters, world layers). |
Thread Profiler#
The following statistics are displayed in addition to the generic one:
CPUShaders Threads | Number of internal engine threads used for CPU shaders. |
---|---|
AsyncQueue | Time of asynchronous loading of resources (files/meshes/images/nodes), in milliseconds. In addition to timing metrics, you can also inspect detailed information about files, images, meshes, and nodes using the async_queue_info console command. |
Sound | Time of asynchronous loading of sounds, in milliseconds. |
PathFind | Time of asynchronous pathfinding calculations, in milliseconds. |
The information on this page is valid for UNIGINE 2.20 SDK.