Unigine stops operating on a virtualized graphics card with DirectX



We are currently testing a virtualized system with an Nvidia A40 card (roughly equivalent to the RTX 3090).

When we pass all graphics card resources to a single virtual machine, Unigine works fine with DirectX.

When we split the graphics card resources across more than one virtual machine, Unigine crashes with the error message "Can't create Texture".

The log output states that somehow the DirectX device gets lost.

We encounter no problems with OpenGL.

This happens when we create many texture objects at once at application startup, without calling a frame swap in between. The video RAM size is 24 GB.
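
The allocation pattern is roughly the following (a simplified sketch, assuming the Unigine 2 C++ Texture API; Texture::create(), create2D() and FORMAT_RGBA8 are from memory and may need adjusting for your SDK version, and num_textures is just a placeholder):

    // create many texture objects in one go, before the first frame swap
    std::vector<Unigine::TexturePtr> textures;
    for (int i = 0; i < num_textures; i++)
    {
        Unigine::TexturePtr tex = Unigine::Texture::create();
        tex->create2D(2048, 2048, Unigine::Texture::FORMAT_RGBA8); // allocates VRAM immediately
        textures.push_back(tex);
    }
    // no swap/present is issued until all allocations are done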

Do you have any idea what could cause this behavior?

Thanks,

Sebastian


Hi Sebastian,

Hard to tell. Maybe the amount of VRAM that the driver allows the machine to use after splitting is somehow not enough for DX11.

Try launching a debug build with -video_debug 2 to get some additional debug messages from the driver. You can also try to find the point at which the error stops appearing (reduce the texture count / resolution to see if that really changes anything).
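
For example, something like this (the binary name is just a typical SDK debug binary name and only an assumption; the relevant part is the -video_debug 2 flag):

    main_x64d.exe -video_debug 2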

Thanks!


Thanks for the hints. I am going to try to deploy a debug build on the system to get additional information.

After splitting, the virtual machine has 24 GB of VRAM, and Unigine displays it correctly. We are only using about 2 GB for textures.

When we skip the large scenery file, the application starts.

I will try to shrink texture resolutions and check if anything changes.

I have attached the last log output.

2106021251_log.txt


That is one physical GPU split into two. I will test it when I get my hands on the system again, but I think this won't work because that video adapter is connected to the other VM.


Hello silent,

 

Good and bad news: we managed to get things working.

It happens when we bake our light texture for one spotlight. The light texture is generated from over 450 single spotlights by rendering to a 4096x4096 texture.

When the number of spotlights exceeds 450, the application crashes.

We added lights piece by piece to check where the limit is.

When we reduce the resolution to 2048x2048 pixels it works out of the box, but we lose resolution.

I guess the driver has a timeout after which it kicks the Direct3D device, and rendering stops working.
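
As a rough estimate: 450 spotlights x 4096x4096 pixels is about 7.5 billion per-pixel light evaluations in a single submission, while 450 x 2048x2048 is only about 1.9 billion, so halving the resolution cuts the work by a factor of four, which would explain why the smaller bake stays under the driver's default 2-second timeout.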

Do you know of any setting in the Windows registry or the Nvidia driver that would prevent the driver from killing the Direct3D device when rendering takes a long time?

 

Thanks

Sebastian


You can try to increase the TDR delay in the registry. Here are the instructions:
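
For reference, these are the standard Microsoft TDR registry values (all REG_DWORD, located under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, reboot required after changing):

  • TdrDelay — number of seconds before the GPU is considered hung (default 2)
  • TdrDdiDelay — number of seconds a thread may stay inside the driver (default 5)
  • TdrLevel — 0 disables timeout detection entirely, 3 (the default) recovers on timeout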

However, if you can reproduce it on a regular GPU with a simplified scene, it would be interesting to see what actually happens here inside the engine itself.

Maybe there is a way to improve this behavior somehow (especially keeping in mind that OpenGL is working just fine).

Thanks!


Hello silent,

 

I set the TDR delay to 1 minute, but the problem is still there.

I also tried disabling it, but that freezes the operating system.

I am trying to reproduce this with a simplified scene, without any results yet.

 

To work around this problem for now: is it possible to render to a texture without clearing it, so that I can render the texture in multiple passes?
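
What I have in mind is roughly the following batching scheme (an engine-agnostic C++ sketch; Spotlight, clearLightTexture(), bakeLights() and presentFrame() are hypothetical stand-ins for whatever the engine provides, not real Unigine calls):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct Spotlight { /* position, color, angle, ... */ };

    // hypothetical stand-ins for the engine-side work
    void clearLightTexture() { /* clear the 4096x4096 target once */ }
    void bakeLights(const std::vector<Spotlight> &lights, std::size_t first, std::size_t count)
    { /* render lights [first, first + count) additively into the target */ }
    void presentFrame() { /* swap/flush so each GPU submission stays short */ }

    int main()
    {
        std::vector<Spotlight> lights(450);
        const std::size_t batch = 64; // keep each submission well below the TDR limit

        clearLightTexture(); // clear only once, before the first pass
        for (std::size_t i = 0; i < lights.size(); i += batch)
        {
            bakeLights(lights, i, std::min(batch, lights.size() - i)); // accumulate on top of previous passes
            presentFrame(); // split the work across multiple frames
        }
        return 0;
    }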

Thanks


We were able to reproduce this on the virtual machine with a minimal sample.

I only had one run on my workstation PC where the device was lost, but it looks like it has something to do with the light settings we use.


Hello silent,

 

We changed the attenuation distance of the LightProj object when we create it.

It works for the value 125.f, but still breaks when we double it to 250.f.

We first had a default value of 1000000.f in our configuration.

So I guess there is some calculation in the shader which kills the driver.
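
For reference, the creation code looks roughly like this (simplified, assuming the Unigine 2 C++ API; the LightProj::create() argument order, the field of view and the texture path are placeholders and may differ in your SDK version):

    // bake light: clamping the attenuation distance avoids the device reset
    Unigine::LightProjPtr light = Unigine::LightProj::create(
        Unigine::Math::vec4(1.0f, 1.0f, 1.0f, 1.0f), // color
        125.0f,           // attenuation distance: 125.f works, 250.f already breaks
        60.0f,            // field of view (placeholder)
        "spotlight.tex"); // projection texture (placeholder path)
    // previously we used the default from our configuration:
    // light->setAttenuationDistance(1000000.0f);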

 

I could not reproduce this when using a non-shared graphics adapter.

I don't know if you have a setup to test graphics adapter sharing on a virtual machine, but I can provide the example.

 

Thanks,

Sebastian


Can you tell us more about the server setup so we can try to build a similar test environment? We found some ancient GRID GPUs in our office; maybe they would be enough for reproduction :)

Could you please specify:

  • OS on the host PC (Windows Server 2019 or Linux)?
  • Which VM software are you using on the server and the clients?

Thanks!


Hi silent,

The host operating system is Dell VxRail, based on vSphere ESXi from VMware.

The guest operating systems have been Windows 10.

 
