Decreasing FPS Problem (viewport, renderTexture2D)


Hello everyone ,

I'll get straight to the point. In our project, rendering a camera view into a texture drops the application's frame rate to 25 fps. It rises to about 40 fps when I remove the render call. How can I prevent the fps from dropping when I render a 2D image into a texture? Is there another way?

Unigine::ImagePtr image;
Unigine::TexturePtr texture;
Unigine::ViewportPtr viewport;

void init(QString pName, int w, int h)
{
    cW = w;
    cH = h;
    tW = spw;
    tH = sph;
    partial = !(w == spw && h == sph);
    initTextureAndImage(w, h);
    createViewport();
}

// Create the GPU render target and a CPU-side image of the same size.
void initTextureAndImage(int w, int h)
{
    texture = Unigine::Texture::create();
    texture->create2D(w, h, Unigine::Texture::FORMAT_RGBA8, Unigine::Texture::USAGE_RENDER | Unigine::Texture::FILTER_POINT);

    image = Unigine::Image::create();
    image->create2D(w, h, Unigine::Image::FORMAT_RGBA8);
}

Unigine::ViewportPtr createViewport()
{
    viewport = Unigine::Viewport::create();
    viewport->removeSkipFlags(Unigine::Viewport::SKIP_VISUALIZER);
    viewport->setRenderMode(Unigine::Viewport::RENDER_DEPTH_GBUFFER_FINAL);

    return viewport;
}

// Render the camera view into the texture (pTexture is unused in this excerpt).
void renderImage(const CameraTextures &pTexture)
{
    if (!stream)
        return;
    Unigine::Profiler::get()->begin("renderTexture2D");
    viewport->renderTexture2D(camera->getCamera(), texture);
    Unigine::Profiler::get()->end();
}

// then call
init("[CAM]", 1920, 1080);
while (true) {
    renderImage(texture); // when I disable this call, fps goes up from 25 to 40-50
}

What is an efficient way to handle this 2D image rendering for a camera?

Thanks in advance.

Edited by burakdogancay

You can try using WidgetSpriteViewport if you need to render an additional camera. That would let you skip the unnecessary Image CPU copy.

Each additional viewport means additional CPU rendering work for all the objects, so don't expect it to be completely free.


Merhaba Burak,


I have a few questions about your code.
Why did you create an image? You never use it in the code you posted.
When I render to a texture, I don't need the engine to render the world as well, so I disable it.

Render::setEnabled(0);

Why do you need an image? What is your requirement?
The texture lives on the GPU and the image lives in system RAM. PCIe transfers between them always cost fps.
Once you explain why you need the image, we can probably discuss other approaches.

Regards,

Rohit

 


Hi Rohit, merhaba :)

Actually, we use the image in the next step. We stream the image and share it through QSharedMemory, so the next step looks like this:

After 

renderImage(texture)

we copy the texture into the image with

texture->getImage(image);

Then we share the image through our QSharedMemory infrastructure.

So the problem is this:

the fps falls to 25 when I use the functions below

 

viewport->renderTexture2D(camera->getCamera(), texture);

//and

texture->getImage(image);

If I remove these two calls, the fps goes up to 100 (which would be heaven for us). But, as you know, the image then contains no pixel values, and our streaming cannot work without these two calls. So I am looking for an alternative method that keeps the fps between 45 and 100 while we stream camera views (images).

According to my investigation, texture->getImage(image) transfers the image directly from the GPU to the CPU, and it is a performance killer, as you mentioned above.

So are there any other methods I can use?

 

My best regards,

Edited by burakdogancay

Dear Burak,

Your name sounds like a Turkic name. Are you from Turkey? 

I know elementary Turkish.

I had guessed your problem, and I know a solution, because I have implemented it for 4K in and 4K out. I have a video I/O card installed, and this card delivers video buffers into system RAM. We upload that data to the GPU as textures. After compositing the frame, we download the data back to system RAM, and the buffer is sent out through the video I/O card for video output. Sending a buffer out for streaming is the same case as yours.

So what makes it possible for me to do I/O with a 4096 x 2160 RGB buffer at a constant 60 fps, while the world still has a good amount of geometry to render?

  1. I wrote a separate Unigine app (a custom app) in which Render::setEnabled(0) is used.
  2. I render to a texture, as you do with viewport->renderTexture2D().
  3. I cache this texture in a ring buffer of, say, 2 to 4 entries, depending on your usage.
  4. Another thread pops the queued texture from the ring buffer and maps it for the GPU-to-CPU transfer with texture->getImage().
  5. I synchronize the rendering fps with render_max_fps, so the video-out and rendering frame rates match, and there were no timing issues when animations were used in rendering.
  6. You need to manage the synchronization between these threads.
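
The ring buffer and reader thread from points 3, 4 and 6 can be sketched roughly as follows. This is only a minimal illustration in standard C++, not Unigine code: Frame, FrameRing, PipelineResult and run_pipeline are hypothetical names, the Frame struct stands in for a rendered texture, and the place where a real application would call texture->getImage() is marked with a comment. Dropping the oldest frame when the ring is full is one possible policy, chosen here so the render loop never waits.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a rendered frame (in the engine: a texture handle).
struct Frame {
    int id;
    std::vector<std::uint8_t> pixels;
};

// Bounded queue shared by the render thread (producer) and the readback thread (consumer).
class FrameRing {
public:
    explicit FrameRing(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Render thread: never blocks; if the ring is full, the oldest frame is dropped.
    void push(Frame frame) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (queue_.size() == capacity_) {
                queue_.pop_front();
                ++dropped_;
            }
            queue_.push_back(std::move(frame));
        }
        ready_.notify_one();
    }

    // Readback thread: waits for a frame; returns false once closed and drained.
    bool pop(Frame &out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty())
            return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    // Signals the readback thread that no more frames will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    int dropped() {
        std::lock_guard<std::mutex> lock(mutex_);
        return dropped_;
    }

private:
    std::size_t capacity_;
    std::deque<Frame> queue_;
    std::mutex mutex_;
    std::condition_variable ready_;
    bool closed_ = false;
    int dropped_ = 0;
};

struct PipelineResult {
    int consumed;
    int dropped;
    int last_id;
};

// Simulates frame_count rendered frames flowing through a ring of the given capacity.
PipelineResult run_pipeline(int frame_count, std::size_t capacity) {
    FrameRing ring(capacity);
    PipelineResult result{0, 0, -1};

    std::thread reader([&ring, &result] {
        Frame frame{-1, {}};
        while (ring.pop(frame)) {
            // Assumption: the GPU-to-CPU copy and the hand-off to streaming would go here.
            ++result.consumed;
            result.last_id = frame.id;
        }
    });

    for (int i = 0; i < frame_count; ++i)
        ring.push(Frame{i, std::vector<std::uint8_t>(16, 0)});  // render thread side

    ring.close();
    reader.join();
    result.dropped = ring.dropped();
    return result;
}
```

Calling run_pipeline(100, 3) pushes 100 frames through a 3-entry ring; every frame is either consumed by the reader or counted as dropped, and the most recent frame is always delivered. Whether the engine allows getImage() to be called from a second thread is something to verify for your Unigine version.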

Now with this approach, I am able to have 4K resolution at 60 FPS.

But there are more points to consider.

  1. Normally the texture on the GPU is RGBA, but video out uses the YUV color space. So we convert the RGBA texture to YUV on the GPU, which takes exactly half the bandwidth of RGBA. That means we render 4K, but the PCIe transfers are halved. Can you apply this kind of compression for your needs? It is very important for saving transfer bandwidth.
  2. We also struggle sometimes when a video I/O device needs RGBA buffers. If you know the DirectX Map function, which Unigine must be using for getImage(), it blocks the rendering thread until the transfer has finished.
  3. In that scenario, we can reduce the blocking time by using NVIDIA Quadro (Direct 2 GPU) and AMD Radeon WX (DGMA) functionality. These cards help in such a scenario. These topics are somewhat difficult and proprietary, which makes them hard to implement, but they work.
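
To make the "half the bandwidth" claim in point 1 concrete, here is a small CPU-side sketch of packing RGBA8 pixels into a 4:2:2 YUV layout (UYVY), where two neighbouring pixels share one U and one V sample. It is an illustration only: the post does the conversion on the GPU and does not say which YUV variant or coefficients are used, so the UYVY layout, the full-range BT.601 coefficients, and the function name rgba_to_uyvy are all assumptions. Alpha is discarded.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp to [0, 255] and round to the nearest 8-bit value.
static std::uint8_t to_u8(float v) {
    return static_cast<std::uint8_t>(std::lround(std::min(255.0f, std::max(0.0f, v))));
}

// Packs tightly stored RGBA8 pixels into UYVY (4:2:2).
// Input:  4 bytes per pixel. Output: 2 bytes per pixel (U0 Y0 V0 Y1 per pixel pair).
// Assumes an even width and full-range BT.601 coefficients (assumptions, see above).
std::vector<std::uint8_t> rgba_to_uyvy(const std::vector<std::uint8_t> &rgba,
                                       int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0);
    const std::size_t pixels = static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    assert(rgba.size() == pixels * 4);

    std::vector<std::uint8_t> out;
    out.reserve(pixels * 2);

    // Rows are tightly packed and the width is even, so pairs never straddle two rows.
    for (std::size_t p = 0; p < pixels; p += 2) {
        float y[2];
        float u = 0.0f;
        float v = 0.0f;
        for (int k = 0; k < 2; ++k) {
            const std::uint8_t *px = &rgba[(p + k) * 4];
            const float r = px[0], g = px[1], b = px[2];
            y[k] = 0.299f * r + 0.587f * g + 0.114f * b;
            u += -0.168736f * r - 0.331264f * g + 0.5f * b;
            v += 0.5f * r - 0.418688f * g - 0.081312f * b;
        }
        out.push_back(to_u8(u * 0.5f + 128.0f));  // U averaged over the pixel pair
        out.push_back(to_u8(y[0]));
        out.push_back(to_u8(v * 0.5f + 128.0f));  // V averaged over the pixel pair
        out.push_back(to_u8(y[1]));
    }
    return out;
}
```

For a 4096 x 2160 frame, the RGBA8 buffer is 4096 x 2160 x 4 = 35,389,440 bytes, while the packed 4:2:2 buffer is 4096 x 2160 x 2 = 17,694,720 bytes, so each readback moves half as much data across PCIe.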

In conclusion, if you work through the six points above, the ring buffer and the separate thread for image mapping will give you the boost you need. I am 100% sure you will achieve the desired frame rate.

Try it and let me know how it goes, as it will not be a simple job. You will probably need 1-2 weeks to implement all six points.

A few more questions:

  1. Are you working on Linux?
  2. How big is the buffer that you want to stream?
  3. The subject of this thread should have been "Decreasing FPS Problem", shouldn't it? At first glance I wondered how increasing FPS could be a problem. :-)

I hope this helps. Sorry for such a long answer, but it was necessary.

Regards,

Rohit


Yes, I am from Turkey.

Unfortunately, we are out of time, and we cannot add any extra hardware (a video I/O card) for our customer. We were hoping this problem would not be so difficult to solve. The frame rate should not drop by 45 fps for a single image transfer.

I need an official answer or confirmation from Unigine about this situation.

By the way, Rohit,

thank you for your answer and the information you provided. We will evaluate it.

 

Edited by burakdogancay

Dear Burak,

I was in Izmir for six months. Turkey is a beautiful country with loving people. I am a fan of Kunefe.

I can understand the out-of-time situation. A few more points:

  • You don't need a video I/O card. That was just my case, given as an example for yours.
  • To solve your problem for now, move the map calls to a separate thread. It is not difficult.

Otherwise, mapping in the same thread will never help you, because GPU architectures based on the DirectX and OpenGL specs block the render thread until the transfer finishes. NVIDIA and AMD provide direct memory access functionality through specialized implementations, and they put it in the Quadro and WX series cards rather than in consumer cards. They know that these PCIe transfer cases will force clients to buy the expensive cards, even though an RTX 2080 is sometimes much more powerful than an expensive RTX 6000. You pay more only for GPUDirect. My point is that even the graphics card vendors know the problem and have a solution. But a single thread doing both rendering and mapping will fail.

For a faster turnaround, you may check the Unreal plugins for Blackmagic and AJA. They do the mapping in a separate thread. Unfortunately, they don't use a ring buffer either, and their plugin fails when going above one output. But you will get the idea. I saw this code a year ago, so I am unaware of any updates since then.

Hope this helps. 

Over to Unigine guys.

Regards,

Rohit

 
