Warning! This version of documentation is OUTDATED, as it describes an older SDK version! Please switch to the documentation for the latest SDK version.
Warning! This version of documentation describes an old SDK version which is no longer supported! Please upgrade to the latest SDK version.

Performance Profiler

The performance profiler displays performance data in a timeline. It reports how much time each frame spends on updating all aspects of your project: rendering nodes that are in view, updating their states, executing scripts with game logic, calculating physics, etc.

With the profiler, you can:

  • Detect the bottlenecks of your application
  • Check whether art assets need optimization
  • Check whether the code needs optimization
  • Compare the profiling results before and after the changes

Activate Profilers

To turn the profiler on, click Optimize -> Performance Profiler and choose the required profiler mode.

Notice
You can also set a hot key (or a key combination) that runs the profiler via the Controls window. To do so, you will have to create a custom preset. You can then press this hot key several times to cycle through the profiling modes.

The following profiling modes are available:

  1. Generic profiler shows only the general statistics block.
  2. Rendering profiler shows the detailed rendering statistics and the timeline chart.
  3. Physics profiler shows the detailed physics-related statistics (within the Physics radius) and the timeline chart.
  4. World Management profiler shows the statistics on the whole loaded world.
  5. Thread profiler shows the statistics on loading threaded resources.

To show profiler statistics in the in-game mode, exit UnigineEditor with the enabled profiler (by typing editor_quit in the console) or type show_profiler command and a value from 1 to 5 in the console. To disable the profiler in the in-game mode, type show_profiler 0 in the console.
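The console values described above can be summarized in a small helper. This is only a sketch in Python, not part of the engine: the `PROFILER_MODES` table and `show_profiler_command` function are hypothetical names, and the mapping of values 1 to 5 onto the modes assumes they follow the order of the list above.

```python
# Assumption: console values 1-5 follow the order of the mode list above.
PROFILER_MODES = {
    0: "disabled",
    1: "Generic",
    2: "Rendering",
    3: "Physics",
    4: "World Management",
    5: "Thread",
}


def show_profiler_command(mode: int) -> str:
    """Build the console command string for a profiler mode (hypothetical helper)."""
    if mode not in PROFILER_MODES:
        raise ValueError("mode must be an integer from 0 to 5")
    return f"show_profiler {mode}"


for mode, name in PROFILER_MODES.items():
    print(f"{show_profiler_command(mode)}  ->  {name}")
```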

You can also enable the additional Renderer profiler block via render_profiler console command. For this block to be shown, the base profiler (in any mode) should be enabled.

Generic Profiler

Generic Profiler Enabled

Total The total time, in milliseconds, that both rendering and calculating the frame took. This is the duration of the main loop in the application execution sequence.
You can calculate the framerate of your application by summing the Total time (preparing data on the CPU) and the Present time (waiting until the GPU has finished its rendering work):

FPS = 1 / (Total + Present) x 1000

For example, 5 ms of Total time and 5 ms of Present time result in 100 FPS. A sum of 20 ms means 50 FPS. At 40-50 ms (20-25 FPS) the framerate is too low and the application needs to be optimized.
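The formula can be checked with a few lines of Python. This is a minimal sketch: the function name `fps_from_timings` is made up for illustration, and both arguments are assumed to be in milliseconds, as reported by the profiler.

```python
def fps_from_timings(total_ms: float, present_ms: float) -> float:
    """Estimate frames per second from the Total and Present counters (in ms)."""
    frame_ms = total_ms + present_ms
    if frame_ms <= 0:
        raise ValueError("Total + Present must be positive")
    return 1000.0 / frame_ms


print(fps_from_timings(5, 5))    # 100.0
print(fps_from_timings(12, 8))   # 50.0
print(fps_from_timings(30, 15))  # about 22.2, too low
```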
Update The time to update application logic. This includes executing all functions in the update() loop of the world script. It also includes the update of states of all nodes (for example, update of the skinned animation or of a particle system to spawn new particles). To sum it up, this is the duration of the update loop.
  • If the Update time is too high, it signals that you need to optimize the application logic executed each frame.
  • You may also need to decrease the number of objects in the world, as updating their states (spawning particles in particle systems, playing skinned mesh animation, etc.) increases the load.
Render The time it took to prepare all data to be rendered in the current frame and feed rendering commands from the CPU to the GPU. If the Render time is too high, it signals that art assets may need to be optimized, for example:
  • the LOD system needs to be used;
  • the polygon count of models should be reduced.
Interface The time required to render all GUI widgets.
Physics The time required to perform all physics calculations.
Present The time from completing all calculations on the CPU to the moment when the GPU has finished rendering the frame. (See the illustration). This counter is useful for analyzing the bottleneck in your application's performance.
  • When Present time is equal to 0, it signals that scripts take too long to update and the calculations are too intensive for the CPU to perform them fast enough. In this case, your application has a CPU bottleneck. Optimize the update block in your world scripts or reduce the number of objects updated each frame.
  • When Present time is high, it means either of the following:
    • If the framerate is low, it signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.
    • If the framerate is consistently high, it means you have free CPU resources available to crunch more numbers in the update() function of the world script.
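The reasoning about the Present counter can be written down as a rough decision rule. The Python sketch below is illustrative only: the function name and all thresholds (`target_fps`, `present_epsilon_ms`, `high_present_share`) are assumptions, since the text does not define what counts as a "high" Present time or a "low" framerate.

```python
def classify_bottleneck(total_ms, present_ms, target_fps=30.0,
                        present_epsilon_ms=0.1, high_present_share=0.5):
    """Rough classification from the Total and Present counters (in ms).

    All thresholds are hypothetical and should be tuned per project.
    """
    frame_ms = total_ms + present_ms
    if frame_ms <= 0:
        raise ValueError("Total + Present must be positive")
    fps = 1000.0 / frame_ms
    if present_ms <= present_epsilon_ms:
        return "cpu-bound"        # Present is (almost) 0: the CPU is the limit
    if present_ms / frame_ms >= high_present_share:
        # Present is high: either the GPU is the limit or the CPU has spare time
        return "gpu-bound" if fps < target_fps else "cpu-headroom"
    return "inconclusive"


print(classify_bottleneck(25, 0))   # cpu-bound
print(classify_bottleneck(10, 40))  # gpu-bound
print(classify_bottleneck(3, 7))    # cpu-headroom
```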
Heap The size of all memory pools allocated for the application. The Unigine allocator allocates memory in pools, which makes allocation faster and more efficient (when the USE_MEMORY directive is used, as it is by default). Because the memory is allocated in pools, the counter value increases stepwise.
Memory The size of all memory blocks allocated on demand. This counter reports how much memory in the allocated pools the application resources actually use.
  • If Memory size steadily increases, it may signal that there are memory leaks. Check if all created objects and variables that are no longer used are properly deleted.
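One way to spot a steadily growing Memory counter is to note its value at regular intervals and check whether it ever drops. The Python sketch below assumes the samples were written down by hand or collected by your own tooling; the function name, `min_samples`, and `tolerance_mb` are hypothetical, and a positive result is only a hint to investigate, not proof of a leak.

```python
def looks_like_leak(memory_samples_mb, min_samples=5, tolerance_mb=0.0):
    """Return True if the sampled Memory counter never drops and ends higher than it started."""
    if len(memory_samples_mb) < min_samples:
        return False  # too few samples to judge
    pairs = zip(memory_samples_mb, memory_samples_mb[1:])
    never_drops = all(later >= earlier - tolerance_mb for earlier, later in pairs)
    return never_drops and memory_samples_mb[-1] > memory_samples_mb[0]


print(looks_like_leak([100, 104, 109, 115, 121]))  # True
print(looks_like_leak([100, 130, 95, 128, 101]))   # False
```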
System The amount of RAM used by the application.
Allocations The number of allocation calls during the frame. (This counter reports an allocation call even if only several bytes are requested.)
Meshes The size of the memory used for mesh geometry.
Textures The size of the memory used for textures in materials.
Samples The size of the memory used for sound samples.

Rendering Profiler

Rendering Profiler Enabled

The following statistics are displayed in addition to the generic ones:

RLights The number of light passes rendered per frame. This means that the counter displays the number of all light sources that are currently seen illuminating something in the viewport. This value also includes additional passes for rendering lights in reflecting surfaces (if dynamic reflections are used). A plain 2D reflection multiplies the number of rendering passes by two, while a cubemap-based reflection with six faces updated each frame multiplies the number of rendering passes by six.
Notice
Each light redraws the mesh polygons it illuminates. That is why the higher the number of light sources, the higher the number of polygons the graphics card has to render, and the lower the performance. For example, using two omni lights can as much as double the rendered geometry they illuminate.
RShadows The number of shadow passes rendered per frame. Each light requires a shadow pass to calculate the shadows. Again, if there are reflecting surfaces in which shadows are reflected, this increases the number of shadow passes.
RReflections The number of reflections drawn per frame. In case of cubemap reflections, if all six faces are updated, six reflections are rendered each frame.
RProcedurals The number of procedural textures rendered per frame. Procedural textures of post-process materials applied to the other procedural textures are also taken into account.
RShaders The number of shaders set per frame. (Shaders are set in each of the rendering passes; hence, even if only one material is used, its shader still needs to be set several times. Even when nothing is visible and the screen is black, the composite shader is still used.)
RMaterials The number of materials set per frame. (Materials are set in each of the rendering passes.)
RTriangles The number of triangles rendered per frame. This includes all polygons that are currently visible in the viewport. In addition, each light source has to redraw the geometry it illuminates, increasing the overall count of rendered triangles. To avoid a GPU bottleneck, keep the number of dynamic light sources and their radius as low as possible.
RPrimitives The number of geometric primitives rendered per frame. This includes points, lines, triangles, and polygons. The visualizer and the profiler itself also add to this counter. The value differs dramatically if tessellation is used. In this case, RTriangles reports the number of triangles in the coarse mesh, while RPrimitives shows statistics on the number of tessellated primitives.
Notice
Primitives statistics are available only under DirectX 10 and 11.
RSurfaces The number of surfaces rendered per frame (in all rendering passes). Each light source doubles the number of surfaces if they are lit.
RDecals The number of decals rendered per frame (in all rendering passes).
RDips The number of draw calls. The higher the number of identical mesh surfaces with the same material, the more effective the instancing is (enabled by default). This means that the number of draw calls is minimized, reducing the load on both the CPU and the GPU.
You can compare the number of surfaces (RSurfaces) and the number of DIPs used to render them. For example, if there are 30000 surfaces and 1000 DIPs, it means that 30 instanced mesh surfaces are rendered per draw call (RSurfaces/RDips). Thus the instancing provides a performance boost.
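The ratio is simple enough to compute by hand, but a short Python sketch makes the comparison explicit. The function name `surfaces_per_draw_call` is hypothetical; the inputs are the RSurfaces and RDips values read from the profiler.

```python
def surfaces_per_draw_call(rsurfaces: int, rdips: int) -> float:
    """Average number of surfaces rendered per draw call (RSurfaces / RDips)."""
    if rdips <= 0:
        raise ValueError("RDips must be positive")
    return rsurfaces / rdips


print(surfaces_per_draw_call(30000, 1000))  # 30.0
print(surfaces_per_draw_call(1200, 1200))   # 1.0, instancing has no effect here
```

A ratio close to 1 suggests that almost every surface needs its own draw call, so instancing is not helping much in that scene.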
RMTris/sec The number of millions of triangles rendered by the graphics card per second.
RKSurf/sec The number of thousands of surfaces rendered by the graphics card per second.
RKDips/sec The number of thousands of draw calls made by the graphics card per second.
RSpawn The time in milliseconds that the engine spends on loading meshes and textures.

Physics Profiler

This profiler shows statistics within the Physics radius.

Physics Profiler Enabled

Notice
To show valid information on PUpdate, PResponse and PIntegrate, the physics simulation should be run in the single-threaded mode (physics_threaded console command should be set to 0 or 1).

The following statistics are displayed in addition to the generic ones:

PIslands The number of physical islands within the physics radius that could be calculated separately. The lower this number, the less efficient multi-threading is, if enabled.
PBodies The number of bodies within the physics radius.
PJoints The number of joints within the physics radius.
PContacts The total number of contacts within the physics radius; this includes contacts between the bodies (their shapes) and body-mesh contacts.
PBroad The duration of the broad phase of physics simulation when potentially colliding objects are found.
PNarrow The duration of the narrow phase when exact collision tests are performed.
PUpdate The duration of the update phase when objects are prepared for their collision response to be calculated.
PResponse The duration of the response phase when collision response is calculated and joints are solved.
PIntegrate The duration of the integrate phase when physics simulation results are applied to bodies.
PSimulation The duration of all simulation phases added together.
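To see which phase dominates the simulation time, the phase counters can be added up and expressed as shares. The Python sketch below assumes that the five phase counters listed above (PBroad, PNarrow, PUpdate, PResponse, PIntegrate) are exactly the phases that make up PSimulation; the function names and the sample values are made up for illustration.

```python
# Assumption: these five counters are the phases summed into PSimulation.
PHASES = ("PBroad", "PNarrow", "PUpdate", "PResponse", "PIntegrate")


def simulation_time(counters):
    """Sum of all phase durations (same unit as the inputs)."""
    return sum(counters[name] for name in PHASES)


def phase_shares(counters):
    """Fraction of the total simulation time spent in each phase."""
    total = simulation_time(counters)
    if total <= 0:
        raise ValueError("total simulation time must be positive")
    return {name: counters[name] / total for name in PHASES}


# Hypothetical readings, not measured data.
sample = {"PBroad": 0.5, "PNarrow": 1.0, "PUpdate": 0.25,
          "PResponse": 2.0, "PIntegrate": 0.25}
print(simulation_time(sample))            # 4.0
print(phase_shares(sample)["PResponse"])  # 0.5
```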

World Management Profiler

This profiler shows statistics on the whole world.

Notice
In the editor mode, enabling or disabling a node, body or joint can increase the values of the world profiler counters, as a (limited) number of clones are created for undo/redo purposes.

World Management Profiler Enabled

The following statistics are displayed in addition to the generic ones:

WNodes The total number of nodes in the world (both enabled and disabled).
WBodies The total number of bodies in the world.
WJoints The total number of joints in the world.
WSpawn The time in milliseconds that the engine spends on generating content in procedural nodes (such as grass, clutters, world layers).

Thread Profiler

Thread Profiler Enabled

The following statistics are displayed in addition to the generic ones:

World The time, in milliseconds, spent on asynchronous loading of the current queue of nodes.
Sound The time, in milliseconds, spent on asynchronous loading of sounds.
PathFind The time, in milliseconds, spent on asynchronous pathfinding calculations.
FileSystem The time, in milliseconds, spent on asynchronous loading of files.

Additional Renderer Profiler

The renderer profiler helps you find out which aspects of art content could be optimized to increase overall performance. However, enabling it incurs a very large overhead, and the application runs significantly slower while profiling, because the GPU is synchronized with the CPU to measure how long each rendering task takes.

A counter is hidden if the renderer option it reports statistics on is not used.

RPIntersection The time required to go down the BSP tree and cut off all nodes that are currently not visible in the frustum.
RPReflections The time required to render reflections.
RPUpdate The time required to prepare surfaces for rendering. This includes setting alpha-blend fading and tessellation switches.
RPSort The time required to sort all polygons to be rendered in the proper order.
RPDeferred The duration of the deferred pass.
RPQueries The time required to render nodes with the Query flag on.
RPDeferredLight The duration of the deferred light pass.
RPOpacityAmbient The duration of the ambient pass for opaque objects.
RPOpacityLight The duration of the light passes for opaque objects.
RPDecalsAmbient The duration of the ambient pass for decals.
RPDecalsLight The duration of the light passes for decals.
RPDeferredLightProb The duration of the pass for rendering the global illumination created with the probe light.
RPTransparent The duration of the pass for rendering transparent objects.
RPScattering The time required to render the light scattering pass.
RPVolumetric The time required to render volumetric shadows.
RPDOF The time required to render the depth of field effect.
RPComposite The time required to compose the final viewport image in the composite shader (before applying postprocesses).
RPRefraction The time required to render refractive materials.
RPOcclusion The time required to render ambient occlusion and global illumination.
RPRender The time required to render Render postprocess materials.
RPPost The time required to render Post postprocess materials.
RPHDR The time required to render the HDR effect.
RPGlow The time required to render the glow effect.
RPVelocity The time required to render the velocity buffer with moving physical objects for motion blur.
RPAuxiliary The time required to render the auxiliary pass.
RPShadowWorldIntersection The time required to find objects that cast shadows from world light sources and that are currently visible in the view frustum.
RPShadowWorldRender The time required to render shadow maps from world light sources, if any.
RPShadowProjIntersection The time required to find objects that cast shadows from projected light sources and that are currently visible in the view frustum.
RPShadowProjRender The time required to render shadow maps from projected light sources, if any.
RPShadowOmniIntersection The time required to find objects that cast shadows from omni light sources and that are currently visible in the view frustum.
RPShadowOmniRender The time required to render shadow maps from omni light sources, if any.
Last update: 2017-07-03