Execution Sequence
This article describes the details of the UNIGINE execution sequence, that is, what happens under the hood of the engine. For a high-level overview of the UNIGINE workflow, see the Engine Architecture article.
The internal code of the UNIGINE engine and the logic of your application (which can be extended using plugins in C++ or C#) are executed in a predefined order:
- Initialization. During this stage, the required resources are prepared and initialized. As soon as these resources are ready for use, the engine enters the main loop.
- Main loop. When UNIGINE enters the main loop, all its actions can be divided into three stages, which are performed one by one in a cycle:
  - Update stage, containing all logic of your application that is performed every frame
  - Rendering stage, containing all rendering-related operations, physics simulation calculations, and pathfinding
  - Swap stage, containing all synchronization operations performed in order to switch between the buffers
  This cycle is repeated every frame while the application is running.
- Shutdown. When UNIGINE stops execution of the application, it performs operations related to the application shutdown and resource cleanup.
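The order described above can be sketched as a plain C++ trace. This is only an illustration of the sequence, not engine code: `SequenceTrace` and `runSequence` are hypothetical names, and each stage is reduced to a label.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an application run; not part of the UNIGINE API.
struct SequenceTrace {
    std::vector<std::string> events;  // stage labels in the order they ran
};

// Records initialization, `frames` iterations of the main loop
// (update -> render -> swap), and finally shutdown.
SequenceTrace runSequence(int frames) {
    assert(frames >= 0);
    SequenceTrace trace;
    trace.events.push_back("init");
    for (int i = 0; i < frames; ++i) {
        trace.events.push_back("update");  // per-frame application logic
        trace.events.push_back("render");  // rendering, physics, pathfinding
        trace.events.push_back("swap");    // buffer synchronization
    }
    trace.events.push_back("shutdown");
    return trace;
}
```

Calling `runSequence(2)` yields one `init`, two update/render/swap triples, and one `shutdown`, in that order.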
The main loop of the execution sequence can be performed in either of the following modes:
- Single-threaded mode. This information is provided for a basic understanding of how a separate thread is executed.
- Multi-threaded mode. This is the default execution sequence for the UNIGINE engine.
The following diagrams represent the main stages of the UNIGINE execution sequence in the single-threaded and multi-threaded modes.
UNIGINE execution sequence in the single-threaded mode
UNIGINE execution sequence in the multi-threaded mode
Single-Threaded Mode
Most of the time, UNIGINE finishes all stages of the main loop faster than the GPU can actually render the frame. That is why double buffering is used: it allows frames to be rendered faster by swapping the GPU buffers (the back and front ones) into which rendering is performed.
When all scripts have been updated and all calculations on the CPU have been completed, the GPU is still rendering the frame calculated on the CPU. So, the CPU has to wait until the GPU finishes rendering the frame and the rendering buffers are swapped. This waiting time is shown by the Waiting GPU counter in the performance profiler.
If the Waiting GPU time is too high, it may signal that a GPU bottleneck exists and that art content needs to be optimized. But if the frame rate is nevertheless consistently high, it means you still have CPU resources available to crunch more numbers.
The time between the moment when all scripts have been updated and all CPU calculations have been completed and the moment when the GPU finishes rendering the frame also depends on whether vertical synchronization (VSync) is enabled (it can be toggled via the System menu). If VSync is enabled, the CPU waits until the GPU finishes rendering and the vertical synchronization is performed. In this case, the Waiting GPU counter value will be higher.
The four schemes below show different scenarios of CPU and GPU performance with VSync disabled and enabled.
- The first two schemes show calculation and rendering of the frame when VSync is disabled (in both cases the monitor vertical retrace is ignored):
  - Scheme 1. The CPU calculates the frame faster than the GPU can render it. So, the CPU waits for the GPU (the Waiting GPU time is high).
  - Scheme 2. The CPU calculations are performed slower than the GPU renders the frame. So, the GPU has to wait while the CPU finishes its calculations. In this case, the Waiting GPU time is small.
- Schemes 3 and 4 show calculation and rendering of the frame when VSync is enabled (the monitor vertical retrace is taken into account):
  - Scheme 3. The CPU calculates the frame faster than the GPU can render it, and the CPU waits for the GPU. However, in this case, both the CPU and the GPU also wait for VSync.
  - Scheme 4. The CPU calculates the frame slower than the GPU renders it. In this case, the GPU waits not only for the CPU to finish its calculations, but also for VSync.
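The four scenarios can be approximated with a small timing model. This is an idealized sketch, not how the profiler computes its counters: it assumes the CPU and the GPU start working on a frame at the same moment, that the frame ends when the slower of the two finishes, and that with VSync enabled the end is rounded up to the next vertical retrace. `FrameTiming` and `modelFrame` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Idealized per-frame timings, all in milliseconds.
struct FrameTiming {
    double frame_ms;     // time until the buffers are swapped
    double cpu_wait_ms;  // time the CPU idles (a rough "Waiting GPU" analogue)
    double gpu_idle_ms;  // time the GPU is not rendering
};

// vsync_period_ms <= 0 means VSync is disabled (assumption of this sketch).
FrameTiming modelFrame(double cpu_ms, double gpu_ms, double vsync_period_ms) {
    assert(cpu_ms >= 0.0 && gpu_ms >= 0.0);
    // The frame cannot end before both the CPU and the GPU are done.
    double frame = std::max(cpu_ms, gpu_ms);
    if (vsync_period_ms > 0.0) {
        // With VSync, the swap is delayed until the next vertical retrace.
        frame = std::ceil(frame / vsync_period_ms) * vsync_period_ms;
    }
    return {frame, frame - cpu_ms, frame - gpu_ms};
}
```

For example, with 4 ms of CPU work and 12 ms of GPU work, the model gives an 8 ms CPU wait without VSync (Scheme 1) and a 12 ms CPU wait with a 16 ms retrace period (Scheme 3), which matches the statement that the wait grows when VSync is enabled.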
Multi-Threaded Mode
In the multi-threaded mode, the engine uses all available threads to update visible nodes simultaneously, instead of updating them one by one, and starts updating multi-threaded physics and pathfinding at the end of the update stage in order to perform them in parallel with the rendering stage.
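To see why this helps, consider a back-of-the-envelope model of the CPU time per frame. It is a sketch under strong assumptions: node updates split evenly across threads with no overhead, and physics with pathfinding overlaps perfectly with rendering. `StageCosts`, `singleThreadedMs`, and `multiThreadedMs` are hypothetical names, not engine API.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-frame CPU costs, in milliseconds.
struct StageCosts {
    double node_update_ms;  // cost of updating all visible nodes on one thread
    double physics_ms;      // physics and pathfinding
    double render_ms;       // rendering-related CPU work
};

// Single-threaded: every stage runs one after another.
double singleThreadedMs(const StageCosts& c) {
    return c.node_update_ms + c.physics_ms + c.render_ms;
}

// Multi-threaded: nodes are updated on `threads` threads at once, then
// physics/pathfinding runs alongside rendering, so only the longer of
// the two contributes to the frame time.
double multiThreadedMs(const StageCosts& c, int threads) {
    assert(threads >= 1);
    double update = c.node_update_ms / threads;  // ideal even split (assumption)
    return update + std::max(c.physics_ms, c.render_ms);
}
```

With costs of 8, 4, and 6 ms, the model gives 18 ms for the single-threaded case and 8 ms on four threads. Real gains are smaller, since synchronization and uneven node workloads are ignored here.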
The following scheme demonstrates calculation and rendering of a frame with VSync enabled:
Correlation between Rendering and Physics Framerates
The rendering framerate usually varies, while the physics simulation framerate is fixed. This means that the update() and updatePhysics() functions of your world logic are called at different frequencies.
The picture above describes what happens when the physics framerate is fixed at 60 FPS and the rendering framerate varies. In general, there are three possible cases:
- The rendering framerate is much higher. In this case, physics calculations are performed once per two or more rendered frames. This does not cause any problems, as the positions of moving objects are interpolated between the calculations.
- The rendering framerate is the same or almost the same. This situation is also fine: the calculations are performed once per frame, and the physics keeps pace with the graphics and vice versa.
- The rendering framerate is much lower. This is where problems begin. First, as you can see from the picture, the physics has to be calculated two or more times per frame, which does not speed up the overall rendering process. Second, you cannot set the physics framerate too low, as in this case the calculations will lose too much precision.
There is also no point in setting the physics framerate too high: if the physics calculations take more than 40 ms, physics is not computed in full, and any additional iterations that would be needed are skipped.
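The three cases and the time limit can be illustrated with a fixed-timestep accumulator. This is a generic sketch of the technique, not the engine's scheduler: `PhysicsClock` and `stepsForFrame` are hypothetical names, the cost per step is a made-up input, and the budget is passed in as a parameter (40 ms in the text above).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical fixed-step clock; times are in milliseconds.
struct PhysicsClock {
    double step_ms;         // fixed physics step, e.g. 1000.0 / 60.0
    double accumulated_ms;  // rendered time not yet consumed by physics
};

// Returns how many physics iterations run for a rendered frame of
// `frame_ms`, given an assumed cost per iteration and a time budget.
int stepsForFrame(PhysicsClock& clock, double frame_ms,
                  double cost_per_step_ms, double budget_ms) {
    assert(clock.step_ms > 0.0 && frame_ms >= 0.0);
    clock.accumulated_ms += frame_ms;
    int steps = 0;
    double spent_ms = 0.0;
    while (clock.accumulated_ms >= clock.step_ms) {
        if (spent_ms + cost_per_step_ms > budget_ms) {
            // Budget exhausted: skip the remaining whole iterations and
            // keep only the fractional remainder of a step.
            clock.accumulated_ms = std::fmod(clock.accumulated_ms, clock.step_ms);
            break;
        }
        clock.accumulated_ms -= clock.step_ms;
        spent_ms += cost_per_step_ms;
        ++steps;
    }
    return steps;
}
```

With a 16 ms step, an 8 ms frame triggers an iteration only on every second call (rendering is faster), a 16 ms frame triggers exactly one, and a 48 ms frame asks for three. If each iteration is assumed to cost 25 ms against a 40 ms budget, only one of those three runs and the rest are skipped.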
Limiting Rendering FPS to Physics One
The rendering FPS can be limited to the physics FPS if the rendering FPS is higher. To do this, the engine.physics.isFixed(1) flag is set in code. This limitation allows physics to be calculated for each rendered frame (rather than interpolated, as when the flag is set to 0). In this mode, physical objects with non-linear velocities do not twitch. (If the rendering FPS is lower than the physics FPS, this flag has no effect.)
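The effect of the flag on frame duration can be expressed in one line. The function below is a hypothetical illustration of the rule stated above, not the engine's implementation: `effectiveFrameMs` is an invented name, and `fixed` stands in for the flag.

```cpp
#include <algorithm>
#include <cassert>

// Returns the frame duration in milliseconds after applying the limit.
// When `fixed` is true, a frame is never shorter than one physics step,
// so a faster renderer is slowed down to the physics rate; a slower
// renderer is left unchanged.
double effectiveFrameMs(double render_ms, double physics_step_ms, bool fixed) {
    assert(render_ms >= 0.0 && physics_step_ms > 0.0);
    return fixed ? std::max(render_ms, physics_step_ms) : render_ms;
}
```

For a 16 ms physics step, an 8 ms frame is stretched to 16 ms when the limit is on, while a 32 ms frame stays at 32 ms, which corresponds to the note that the flag has no effect when rendering is slower than physics.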