Performance Profiler

章节:

A performance profiler displays performance data in a timeline. It reports how much time is spent per each frame for updating all aspects of your project: rendering nodes that are in view, updating their states, executing scripts with game logic, calculating physics, etc.性能分析器在时间轴上显示性能数据。它报告每帧花费多少时间来更新项目的各个方面：渲染可见的节点，更新其状态，使用游戏逻辑执行脚本，计算物理量等等。

With the profiler, you can:使用事件探查器，您可以：

Detect the bottlenecks of your application检测应用程序的瓶颈
Check if art assets optimization is required检查是否需要优化艺术品资源
Check if code optimization is required检查是否需要代码优化
Compare the profiling results before and after the changes比较更改前后的概要分析结果

性能分析器

Activate Profilers激活分析器#

To turn the profiler on, click Tools -> Performance Profiler and choose the required profiling mode:要打开分析器，请单击Tools -> Performance Profiler并选择所需的分析模式：

注意

您还可以设置一个热键（或组合键），该热键将在窗口的HotkeysSettings部分中运行分析器。但是，您将必须创建一个自定义预设。然后，您可以按几次此热键以循环选择分析模式。

The following profiling modes are available:可以使用以下分析模式：

Generic profiler shows only the general statistics block. Generic profiler仅显示常规统计信息块。
Rendering profiler shows the detailed rendering statistics and the timeline chart. Rendering profiler显示详细的渲染统计信息和时间线图。
Physics profiler shows the detailed physics-related statistics (within the Physics radius) and the timeline chart. Physics profiler显示了与物理相关的详细统计信息（在 Physics radius 之内）和时间线图。
World Management profiler shows the statistics on the whole loaded world. World Management profiler显示了整个已加载世界的统计信息。
Thread profiler shows the statistics on loading threaded resources. Thread profiler显示有关加载线程资源的统计信息。

To show profiler statistics in the in-game mode type show_profiler command and a value from 1 to 5 in the console. To disable the profiler in the in-game mode, type show_profiler 0 in the console. 要在游戏内模式下显示探查器统计信息，请键入show_profiler命令，并在控制台中输入从 1 到 5 的值。要在游戏内模式下禁用事件探查器，请在控制台中键入 show_profiler 0 。

Generic Profiler通用分析器#

启用通用配置文件

Total	Total time in milliseconds taken to both calculate and render the current frame. This is the duration of the main loop in the application execution sequence. Total = Total CPU + Waiting GPU 计算和渲染当前帧所花费的总时间（以毫秒为单位）。这是应用程序执行顺序中的主循环的持续时间。 Total = Total CPU + Waiting GPU
Total CPU	Total time in milliseconds taken to prepare the current frame (including update, render, and swap). 准备当前帧（包括 update，render 和 swap ）花费的总时间（以毫秒为单位）。
Total GPU	Total time in milliseconds taken to render the current frame on the GPU. 注意 This counter may not work on some GPUs.This counter may not work on some GPUs. This counter may not work on some GPUs.在GPU上渲染当前帧所花费的总时间（以毫秒为单位）。注意 This counter may not work on some GPUs.此计数器可能在某些GPU上不起作用。
Update	Time taken to update application logic. This includes executing all steps in the *update()* function of the world script. It also includes the update of states of all nodes (for example, update of the skinned animation or of a particle system to spawn new particles). If the Update time is too high, it signals that you need to optimize the application logic executed each frame.If the Update time is too high, it signals that you need to optimize the application logic executed each frame. You may also need to decrease the number of objects in the world, as updating their states (spawn particles by the particle systems, play skinned mesh animation, etc.) increases the load.You may also need to decrease the number of objects in the world, as updating their states (spawn particles by the particle systems, play skinned mesh animation, etc.) increases the load. If the Update time is too high, it signals that you need to optimize the application logic executed each frame.You may also need to decrease the number of objects in the world, as updating their states (spawn particles by the particle systems, play skinned mesh animation, etc.) increases the load.更新应用程序逻辑所花费的时间。这包括在世界脚本的 *update（）* 功能中执行所有步骤。它还包括所有节点状态的更新（例如，更新蒙皮动画或更新粒子系统以生成新粒子）。 If the Update time is too high, it signals that you need to optimize the application logic executed each frame.如果更新时间太长，则表明您需要优化每个帧执行的应用程序逻辑。 You may also need to decrease the number of objects in the world, as updating their states (spawn particles by the particle systems, play skinned mesh animation, etc.) increases the load.您可能还需要减少世界上的对象数量，因为更新它们的状态（通过粒子系统生成粒子，播放蒙皮的网格动画等）会增加负担。
Render CPU	Time taken to prepare all data to be rendered in the current frame and feed rendering commands from the CPU to the GPU. If Render CPU time is too high, it signals that art assets may need to be optimized, for example: LOD system should be used;LOD system should be used; polygon count of models should be reduced.polygon count of models should be reduced. LOD system should be used;polygon count of models should be reduced.准备要在当前帧中渲染的所有数据并将渲染命令从CPU馈送到GPU所花费的时间。如果“渲染CPU”时间过长，则表示可能需要优化艺术品资源，例如： LOD system should be used;应当使用 LOD 系统； polygon count of models should be reduced.应该减少模型的多边形数量。
Waiting GPU	Time between completing all calculations on the CPU up to the moment when the GPU has finished rendering the frame. (See the illustration). This counter is useful to analyze the bottleneck in your application's performance. When Waiting GPU time is equal to 0, it signals that scripts take too long to be updated and calculations are too intensive for CPU to perform them fast enough. In this case, your application has a CPU bottleneck. Optimize your update block in the world scripts or reduce the number of objects updated each frame.When Waiting GPU time is equal to 0, it signals that scripts take too long to be updated and calculations are too intensive for CPU to perform them fast enough. In this case, your application has a CPU bottleneck. Optimize your update block in the world scripts or reduce the number of objects updated each frame. High Waiting GPU time means one of the following: low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case. consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script. low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.High Waiting GPU time means one of the following: low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case. consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script. When Waiting GPU time is equal to 0, it signals that scripts take too long to be updated and calculations are too intensive for CPU to perform them fast enough. In this case, your application has a CPU bottleneck. Optimize your update block in the world scripts or reduce the number of objects updated each frame.High Waiting GPU time means one of the following: low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case. consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script. low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.从完成CPU上的所有计算到GPU完成渲染帧之间的时间。（请参见插图）。此计数器对于分析应用程序性能的瓶颈很有用。 When Waiting GPU time is equal to 0, it signals that scripts take too long to be updated and calculations are too intensive for CPU to perform them fast enough. In this case, your application has a CPU bottleneck. Optimize your update block in the world scripts or reduce the number of objects updated each frame.等待GPU时间等于 0 时，它表示脚本花费的时间太长，无法更新，并且计算量太大，CPU无法快速执行它们。在这种情况下，您的应用程序会遇到CPU瓶颈。在世界脚本中优化您的更新块，或减少每帧更新的对象数。 High Waiting GPU time means one of the following: low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case. consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script. low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.GPU等待时间过长意味着以下情况之一： low framerate signals that there exists a GPU bottleneck. The art content needs to be optimized in this case.低帧率表示存在GPU瓶颈。在这种情况下，艺术内容需要进行优化。 consistently high framerate means you have free CPU resources available to process more numbers in the update() of the world script.持续高的帧速率意味着您可以在世界脚本的 update（）中拥有免费的CPU资源来处理更多数字。
Interface	Time taken to render all GUI widgets.呈现所有GUI小部件所需的时间。
Physics	Time taken to perform all physics calculations.执行所有物理计算所需的时间。
Heap	Time period during which the physics module waits for the completion of rendering process to return to the Main thread and execute the rest of the physics ticks.为应用程序分配的所有内存池的大小。 Unigine分配器在池中分配内存，这使分配更快，更有效（默认情况下，如果使用 USE_MEMORY 指令）。随着在池中分配内存，计数器值逐步增加。
Memory	按需分配的所有内存块的大小。此计数器报告分配的池应用程序资源中实际使用了多少内存。如果内存大小稳定增加，则可能表明存在内存泄漏。检查是否已正确删除所有已创建的不再使用的对象和变量。
System	Size of all memory blocks allocated on demand. This counter reports how much memory in allocated pools application's resources really use. If Memory size steadily increases, it may signal that there are memory leaks. Check if all created objects and variables that are no longer used are properly deleted.If Memory size steadily increases, it may signal that there are memory leaks. Check if all created objects and variables that are no longer used are properly deleted. If Memory size steadily increases, it may signal that there are memory leaks. Check if all created objects and variables that are no longer used are properly deleted.用于应用程序的RAM内存的大小。
Allocations	Size of RAM memory used for the application.帧期间的分配呼叫数。（即使请求分配几个字节，此计数器也会报告分配调用。）
Sync Threads	Number of allocation calls during the frame. (This counter reports an allocation call even if several bytes are requested to be allocated).同步内部引擎线程数（默认情况下，该值等于可用CPU内核数减去1并受32限制）。
Async Threads	Number of internal engine threads used for CPU shaders.异步内部引擎线程的数量（默认情况下，该值等于可用CPU内核的数量，并且受32个限制）。

Rendering Profiler渲染配置文件#

启用渲染分析器

The following statistics are displayed in addition to the generic one:除了 generic 之外，还显示以下统计信息：

Terrain Cache CPU	Memory (RAM) amount currently used for sound samples, in megabytes.当前用于“景观地形”图块的CPU内存缓存量，以兆字节为单位。
Terrain Cache GPU	Memory (RAM) amount currently used for terrain streaming, and the memory limit set for it, in megabytes. The memory limit is the maximum amount of RAM that can be used for terrain streaming.当前用于景观地形图块的VRAM缓存量，以兆字节为单位。
Terrain Virtual Texture	当前用于“景观地形”虚拟纹理的内存量，以兆字节为单位。
Terrain Detail Textures	Amount of RAM currently used for streaming static meshes, and the memory limit set for it, in megabytes. The memory limit is the maximum amount of RAM that can be used for streaming static meshes. 注意 The same static mesh will use more RAM than VRAM, as additional data is generated in RAM to calculate collisions, intersections, spatial trees and so on.The same static mesh will use more RAM than VRAM, as additional data is generated in RAM to calculate collisions, intersections, spatial trees and so on. The same static mesh will use more RAM than VRAM, as additional data is generated in RAM to calculate collisions, intersections, spatial trees and so on.当前用于“地形地形”详细信息的内存量，以兆字节为单位。
Terrain Reload Tiles	Amount of VRAM cache currently used for Landscape Terrain tiles, in megabytes.修改后当前等待重新加载的“地形地形”图块的数量。
Terrain Reload Bounds	Amount of VRAM currently used for the Landscape Terrain virtual texture, in megabytes.导致重新加载图块的事件数（一个边界可能导致重新加载多个图块）。
Sounds	Amount of VRAM currently used for Landscape Terrain details, in megabytes.当前用于声音采样的内存量，以兆字节为单位。
Meshes Limit	Amount of VRAM currently used for streaming static meshes, and the VRAM limit set for it, in megabytes. The VRAM limit is the maximum amount of VRAM that can be used for streaming static meshes.可用于网格几何图形的VRAM的最大数量，以兆字节为单位。
Meshes	Amount of VRAM currently used for skinned meshes, and the VRAM limit set for it, in megabytes. The VRAM limit is the maximum amount of VRAM that can be used for skinned meshes.当前用于网格几何图形的VRAM量，以兆字节为单位。
Textures Limit	Amount of VRAM currently used for textures, and the maximum amount of VRAM that can be used for them, in megabytes.可用于纹理的VRAM的最大数量，以兆字节为单位。
Textures	Amount of VRAM currently used for textures cache, in megabytes.当前用于纹理的VRAM量，以兆字节为单位。
Textures Cache	Amount of VRAM currently used for rendering buffers (Gbuffer, post-effects, etc.), in megabytes.当前用于纹理缓存的VRAM量，以兆字节为单位。
Buffers Render	Amount of VRAM currently used for vertices of particle systems, and the maximum amount of VRAM that can be used for them, in megabytes.当前用于渲染缓冲区（Gbuffer，后效果等）的VRAM量，以兆字节为单位。
Buffers Shadows	当前用于阴影贴图的VRAM量，以兆字节为单位。
Async Buffer	当前用于异步缓冲区的内存量，以兆字节为单位。在流处理过程中将资源异步加载到的中间缓冲区。注意仅适用于OpenGL。
Async Buffer Indices	The total VRAM amount that is currently used.当前用于网格索引缓冲区的内存量，以兆字节为单位。该缓冲区用于存储在流处理过程中异步加载的网格的索引。注意仅适用于OpenGL。
Grasses	Number of Landscape Terrain tiles that are currently awaiting reloading after being modified.当前用于草节点的VRAM量，以兆字节为单位。
Terrains	Number of events causing reloading of tiles (one bound may cause reloading of multiple tiles).当前用于地形的VRAM量，以兆字节为单位。
Allocations Textures	The number of dynamic reflections drawn per frame. In case of cubemap reflections, if all six faces are updated, six reflections are rendered each frame.帧期间纹理的内存分配数。
Compile Shaders	在帧期间编译的着色器的数量。
Dynamic Reflections	Number of light passes rendered per frame. This means that the counter displays the number of all light sources that are currently seen illuminating something in the viewport. This value also includes additional passes for rendering lights in the reflecting surfaces (if dynamical reflections are used). Plain 2D reflection will multiply the number of rendering passes by two, while cubemap-based reflection with six faces updated each frame will multiply the number of rendering passes by six. 注意 Each light redraws mesh polygons it illuminates. That is why the higher the number of light sources, the higher the number of polygons the graphics card has to render, and the lower the performance. For example, using two omni lights will as much as double the rendered geometry they shine on.Each light redraws mesh polygons it illuminates. That is why the higher the number of light sources, the higher the number of polygons the graphics card has to render, and the lower the performance. For example, using two omni lights will as much as double the rendered geometry they shine on. Each light redraws mesh polygons it illuminates. That is why the higher the number of light sources, the higher the number of polygons the graphics card has to render, and the lower the performance. For example, using two omni lights will as much as double the rendered geometry they shine on.每帧绘制的动态反射数。如果是立方体贴图反射，则如果更新了所有六个面，则每帧会渲染六个反射。
Lights	每帧渲染的光通过次数。这意味着计数器将显示当前看到的照明视口中的所有光源的数量。该值还包括用于在反射表面上渲染光的其他遍次（如果使用了动态反射）。普通的2D反射将渲染次数乘以2，而基于立方体贴图的反射（每帧更新了六个面）会将渲染次数乘以6。注意每个灯光都会重绘其照亮的网格多边形。这就是为什么光源数量越多，图形卡必须渲染的多边形数量越多，性能越低的原因。例如，使用两个全向光源将使它们照亮的渲染几何体增加一倍。
Shadows	每帧渲染的阴影遍数。每盏灯都需要经过阴影才能计算出阴影。同样，如果有反射的表面带有绘制的阴影，则会增加阴影的通过次数。
Decals	每帧渲染的贴花数量（在所有渲染遍次中）。如果一个贴花有任何额外的效果(透明的，发光的，被光源照亮的，或有任何其他视觉效果)，涉及到该贴花的额外渲染通道，该贴花将在每个相应的通道中额外渲染，因此已渲染的贴花的总数将增加。
Surfaces	每帧渲染的曲面数（在所有渲染遍次中）。如果一个曲面有任何额外的效果(透明，发光，由光源照明，或有任何其他视觉效果)，涉及到这个曲面的额外渲染通道，这个曲面将在每个相应的通道中额外渲染，因此渲染曲面的总数将增加。
Triangles All	Number of triangles rendered per frame. This includes all polygons that are currently visible in the viewport (geometry).每帧渲染的三角形总数，包括当前在视口中可见的所有多边形以及在阴影渲染过程中渲染的多边形。
Triangles Shadows	Number of geometric primitives rendered per frame. This includes points, lines, triangles, and polygons. The visualizer and the profiler itself also add to this counter. The value differs dramatically if tessellation is used. In this case, Triangles reports the number of triangles in the coarse mesh, while Primitives shows statistics on the number of tessellated primitives.在阴影渲染过程中每帧渲染的三角形数量。每个光源都必须重新绘制其照明的几何形状，从而增加了渲染三角形的总数。为了避免GPU瓶颈，请保持动态光源的数量及其半径尽可能小。
Triangles Viewport	The number of draw calls. The higher the number of identical mesh surfaces with the same material, the more effective the instancing is (enabled by default). This means, the number of draw calls is minimized offloading both the CPU and the GPU. You can compare the number of surfaces ( Surfaces) and the number of DIPs used to render them. For example, if there are 30000 surfaces and 1000 DIPs, it means that 30 instanced surfaces of meshes are rendered per only one draw call (Surfaces/Dips). Thus the instancing provides performance boost.每帧渲染的三角形数量。这包括当前在视口（几何形状）中可见的所有多边形。
Primitives	Number of materials set per frame. (Materials are set in each of the rendering passes.)每帧渲染的几何图元数。这包括点，线，三角形和多边形。可视化器和分析器本身也添加到此计数器中。如果使用镶嵌细分，则该值会显着不同。在这种情况下， Triangles 报告粗糙网格中的三角形数量，而 Primitives 显示有关镶嵌细分图元数量的统计信息。
Dips	Number of shaders set per frame. (Shaders are set in each of the rendering passes; hence if only one material used, its shader still needs to be set several times. When nothing is visible and the screen is black, even in this case the composite shader is still used.) 抽奖电话数。具有相同材质的相同网格曲面的数量越多，实例化效果越好（默认情况下启用）。这意味着，绘图调用的数量得以最小化，从而减轻了CPU和GPU的负担。您可以比较表面的数量（Surfaces）和用于渲染它们的DIP的数量。例如，如果有30000个曲面和1000个DIP，则意味着仅一个绘制调用（Surfaces/Dips）将渲染30个实例化的网格面。因此，实例化可提高性能。
Shaders	Number of shaders compiled during the frame.每帧设置的着色器数量。（在每个渲染过程中都设置了着色器；因此，如果仅使用一种材质，则仍需要对其着色器进行多次设置。当看不到任何东西且屏幕为黑色时，即使在这种情况下，组合着色器仍在使用。）
Materials	每帧设置的材质数量。（在每个渲染过程中都设置了材质。）

注意

小于1 Mb的值显示为 0 。

Physics Profiler物理探查器#

此分析器显示物理半径内的统计信息。

启用了物理探查器

除了 generic 之外，还显示以下统计信息：

PIslands	Number of bodies within the physics radius.物理半径中的物理岛屿的数量，可以单独计算。此数字越低，则多线程is的效率越低。
PBodies	Number of joints within the physics radius.物理半径中的实体的数量。
PJoints	Total number of contacts within the physics radius; this includes contacts between the bodies (their shapes) and body-mesh contacts.物理半径中的关节的数量。
PContacts	Duration of the collision detection phase of physic simulation when collisions between objects are found.物理范围内的触点总数半径；这包括物体（它们的形状）和物体与网格物体之间的接触。
PCollision	Duration of all simulation phases added together.发现对象之间的碰撞时，物理模拟的碰撞检测阶段的持续时间。
PResponse	计算碰撞响应并解决关节时响应阶段的持续时间。
PIntegrate	物理模拟结果应用于人体时积分阶段的持续时间。
PSimulation	所有模拟阶段的持续时间。

在我们的物理学视频教程中观看Physics Profiler选项的概述。

World Management Profiler世界管理概况#

The following statistics is displayed in addition to the generic one:此探查器显示了整个世界的统计信息。

注意

在编辑器模式下，启用和禁用节点，主体或关节会增加World Profiler计数器的值，因为创建（有限数量）的克隆是为了撤消/重做。

启用了World Management Profiler

除了 generic 之外，还显示以下统计信息：

WNodes	Total number of joints in the world.世界上的节点总数（已启用和已禁用）。
WBodies	Time in milliseconds that the engine spends on generating content in procedural nodes (such as grass, clutters, world layers). 世界上个实体的总数。
WJoints	世界上个关节的总数。
WSpawn	引擎花费在程序节点上的时间（以毫秒为单位）（例如草，杂波，世界层）。

Thread Profiler线程分析器#

启用线程分析器

除了 generic 之外，还显示以下统计信息：

AsyncQueue	Time of asynchronous pathfinding calculations, in milliseconds.异步加载资源（文件/网格/图像/节点）的时间，以毫秒为单位。
Sound	异步加载声音的时间，以毫秒为单位。
PathFind	异步寻路计算时间，以毫秒为单位。

最新更新: 2023-06-23

Help improve this article

Was this article helpful?

Report a problem related to this article

(or select a word/phrase and press Ctrl+Enter)