CPU bottleneck in Unigine::Node::setTransform()


Hello,

 

We are running into a CPU bottleneck while updating matrices via setTransform on nodes.

We have 50 instances, each with about 20 sub-transformations, and all of them are updated every frame.

Is there a faster/better way to update positions/orientations?

Version is 2.8, renderer is DirectX.

 

Thank you for your help.

 

transformation001.PNG

transformation002.PNG

transformation003.PNG

Edited by sebastian.vesenmayer
Link to comment

Just wanted to know how many transformations are too many.

I did some measurements: in total we make around 2400 setTransform calls per frame (90 entities * 26-27 dynamic sub-nodes).

We also have collisions enabled for screen picking and particle collision, and each entity has around 5 particle systems attached, which are disabled.

 

Edited by sebastian.vesenmayer
Link to comment

Hello Sebastian,

Thank you for the sample. We discussed it and checked some hypotheses. 

At the moment, this number of per-frame transformations pushes the engine's spatial tree to its performance limit. A minor optimization is to disable the following options: Clutter Interaction, Grass Interaction, Trigger Interaction, and Visibility by Sector/Portal. All of them can be accessed via the Node class.

An alternative solution can be implemented with a multi-threaded cluster. However, clusters only work for similar meshes. I suggest checking the stress/cluster_03 sample for this.

If this doesn't work for you, the only solution is to limit the number of operations per frame.

How to submit a good bug report
---
FTP server for test scenes and user uploads:

Link to comment
  • 3 weeks later...

Thanks morbid for the answer.

That is what I thought: the spatial tree generation takes very long in that case.

Unfortunately, I don't think the options you suggested will help in this case.

I will start by removing disabled nodes and loading them on the fly when they are needed.

I will also try to limit the calls to setTransform by first checking whether orientations and positions have changed.
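The change check mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: `Mat4`, `FakeNode`, `differs` and `CachedTransform` are hypothetical names made up for this post, not engine API, and `FakeNode` merely counts calls in place of a real node. The idea is to remember the last matrix that was actually sent and skip the call when the new one is equal within a tolerance.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a 4x4 matrix; a real project would use the engine's type.
using Mat4 = std::array<float, 16>;

// Hypothetical stand-in for an engine node. It only counts how often the
// (expensive) transform setter is actually invoked.
struct FakeNode {
    Mat4 transform{};
    int set_calls = 0;
    void setTransform(const Mat4& m) { transform = m; ++set_calls; }
};

// Returns true if any element differs by more than eps.
inline bool differs(const Mat4& a, const Mat4& b, float eps = 1e-6f) {
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::fabs(a[i] - b[i]) > eps) return true;
    return false;
}

// Caches the last transform sent to the node and skips redundant updates.
template <typename NodeT>
class CachedTransform {
public:
    explicit CachedTransform(NodeT& node) : node_(node) {}

    // Forwards to node.setTransform() only if the matrix changed since the
    // last forwarded value; returns whether the call was made.
    bool update(const Mat4& m) {
        if (has_last_ && !differs(last_, m)) return false;
        node_.setTransform(m);
        last_ = m;
        has_last_ = true;
        return true;
    }

private:
    NodeT& node_;
    Mat4 last_{};
    bool has_last_ = false;
};
```

Because the comparison is always against the last matrix that was sent, small per-frame changes below the tolerance cannot pile up unnoticed. The comparison itself costs 16 float compares per node, so it only pays off if a reasonable share of sub-nodes is static in a typical frame.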

 

Link to comment
  • 4 weeks later...

@sebastian.vesenmayer

We've found a way to speed up node transformations (at least in purely synthetic tests), so it would be nice to see whether there is an improvement on real-world content. Could you share a typical scene with your moving objects (>50 instances with about 20 sub-transformations each, updated every frame)?

Alternatively, it should be enough to see the hierarchy of your node (a screenshot from the Editor's World Hierarchy window).

Thanks!

Link to comment
  • 1 year later...

Hello,

I missed silent's request for a sample, sorry about that. I will try to set up a reproducer for you.

I am running into this problem again because we have a customer who has 150 moving airborne objects in his exercise, most of which are outside the view frustum.

Most of it is background traffic, but it needs to be updated because we never know when it will enter the view frustum.

It looks like BVH recalculation takes some time here again when using setTransform on a node tree.

We still have ~20 objects in each subtree because of animation changes that can occur.

Any updates on this topic?

Thanks.

 

Link to comment

Hi Sebastian,

Distant moving objects should be simplified. Their internal structure should contain as few nodes as possible. If each of the 150 moving objects is built internally from 20 nodes, you actually need to move 3000 (150*20) objects per frame.

Reducing the complexity of the node hierarchy is the key point here.

You can also disable the collision / intersection flags for these nodes (and for each surface of the meshes). Disabling additional node flags can also help:

image.png

Another option is to split the objects into groups and update 50 objects per frame instead of 150, so that all 150 objects are updated over 3 frames instead of 1. Since you can't see them, it shouldn't be a big issue?

Upgrading the CPU can also be an option :) We recently tested a Ryzen 5 5600X against a Core i7-10700K, and the AMD CPU was almost twice as fast in the same scenario (moving thousands of nodes each frame).
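A minimal sketch of that round-robin split, assuming the objects live in an indexable container. The helper name `sliceForFrame` is hypothetical and not part of the engine; it only computes which index range to touch on a given frame, and the caller would then apply the transforms for that range.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Computes the half-open index range [first, last) of objects to update on
// `frame` when `total` objects are spread over `groups` consecutive frames.
inline std::pair<std::size_t, std::size_t>
sliceForFrame(std::size_t total, std::size_t groups, std::size_t frame) {
    if (total == 0 || groups == 0) return {0, 0};
    // Ceiling division so that every object falls into exactly one group.
    const std::size_t per_group = (total + groups - 1) / groups;
    const std::size_t g = frame % groups;
    const std::size_t first = std::min(g * per_group, total);
    const std::size_t last = std::min(first + per_group, total);
    return {first, last};
}
```

With 150 objects and 3 groups, frame 0 covers indices 0 to 49, frame 1 covers 50 to 99, frame 2 covers 100 to 149, and frame 3 starts over. The trade-off is that each object's transform can be up to two frames stale, which is why this is suggested only for objects that are not currently visible.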

Thanks!

Link to comment
24 minutes ago, silent said:

Another option is to split the objects into groups and update 50 objects per frame instead of 150, so that all 150 objects are updated over 3 frames instead of 1. Since you can't see them, it shouldn't be a big issue?

So we would have to implement an additional pre-frustum-culling step ourselves? Even when objects are outside the view frustum, the engine never knows when they will be displayed again.

26 minutes ago, silent said:

Distant moving objects should be simplified. Their internal structure should contain as few nodes as possible. If each of the 150 moving objects is built internally from 20 nodes, you actually need to move 3000 (150*20) objects per frame.

This has already been optimized down to 20; our models may have more animated parts in the future.

27 minutes ago, silent said:

You can also disable the collision / intersection flags for these nodes (and for each surface of the meshes). Disabling additional node flags can also help:

Does this also work when I disable subnodes?

28 minutes ago, silent said:

Upgrading the CPU can also be an option :) We recently tested a Ryzen 5 5600X against a Core i7-10700K, and the AMD CPU was almost twice as fast in the same scenario (moving thousands of nodes each frame).

Already running on an Intel 10900X :)

 

Thanks

Link to comment