[SOLVED] Intersection performance



Hi everyone!

I'm trying to replicate (roughly) the behaviour of a LIDAR in Unigine.

The one I'm trying to simulate shoots 64 vertical beams with a range of 200 m. It has a vertical aperture of ~45°, so there is about 0.7° vertically between adjacent beams. As it does a 360° turn, it fires 1024 times, increasing the azimuth by 360/1024 ≈ 0.35° to scan around. So every time a 360° turn is completed we capture a point cloud of 1024 × 64 = 65536 points. It's able to do up to 10 turns per second.

I tried to approximate this by using World::getIntersection() to check whether each of the 65536 rays encounters any object around, and to get the coordinates of the intersection point.
 

WorldIntersectionPtr wi = WorldIntersection::create();
double azimuth_step = 360.0 / 1024.0; // 360/1024 would be integer division and yield 0

for (unsigned int i = 0; i < 1024; i++) {
    // rotation around the vertical axis for the current azimuth
    mat4 rotation_z = rotate(lidar_node_->getDirection(AXIS_Z), i * azimuth_step);

    for (unsigned int j = 0; j < 64; j++) {
      unsigned int index = i * 64 + j;

      // the vertical angles of the beams are hardcoded in a std::vector
      mat4 rotation_x = rotate(lidar_node_->getDirection(AXIS_X), beam_altitude_angles.at(j));
      vec3 direction = lidar_node_->getDirection(AXIS_Y) * rotation_x * rotation_z;
      // ray end point: 200 m from the sensor along the beam direction
      Vec3 target_position = lidar_position_ + (Vec3)direction * 200;

      ObjectPtr obj = World::getIntersection(lidar_position_, target_position, 1, wi);
      if (obj) {
        points_.at(index) = wi->getPoint();
      }
    }
  }

 

World::getIntersection() execution time on my rig is between 5 and 10 µs per call. One "turn" of the lidar takes 150 ms+ with a small number of objects, and up to 400-500 ms with multiple objects. That is too slow for me: at 10 turns per second I have at most 100 ms per turn.

Is there any better way to find the intersections between the rays and the surrounding objects? Or at least, is there a more optimized (maybe less accurate) way to use getIntersection()?

I tried to parallelize the nested for loop with OpenMP, and I managed to reach 110 ms with multiple objects around. That is still a bit too slow for me; would there be any way to parallelize this on the GPU, with CUDA for example?
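For reference, the OpenMP attempt looks roughly like this (a sketch, not my exact code; it assumes the engine tolerates getIntersection() calls from worker threads, which the 110 ms result suggests, and gives each thread its own WorldIntersection since every query writes into it):

#pragma omp parallel for collapse(2) schedule(static)
for (int i = 0; i < 1024; i++) {
    for (int j = 0; j < 64; j++) {
        // one reusable intersection object per thread
        static thread_local WorldIntersectionPtr twi = WorldIntersection::create();
        unsigned int index = i * 64 + j;
        mat4 rotation_z = rotate(lidar_node_->getDirection(AXIS_Z), i * (360.0 / 1024.0));
        mat4 rotation_x = rotate(lidar_node_->getDirection(AXIS_X), beam_altitude_angles.at(j));
        vec3 direction = lidar_node_->getDirection(AXIS_Y) * rotation_x * rotation_z;
        ObjectPtr obj = World::getIntersection(lidar_position_, lidar_position_ + (Vec3)direction * 200, 1, twi);
        if (obj)
            points_.at(index) = twi->getPoint();
    }
}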


I would think about a specialized custom implementation:

1. Query all objects in the surrounding 200 m

2. Maybe drop irrelevant geometry, e.g. back-facing surfaces, to reduce triangle intersections

3. Reuse this limited object subset, at least for some limited movement variation

4. Maybe precalculate ray positions and use optimized rotation matrices if possible instead of general 3-DOF matrices (see the sketch after this list)

5. Build some index structure that assigns objects to angular segments, do the same for the rays, so for a given ray only a small subset of ray-object intersections has to be tested

6. ...

In general, use all optimizations possible in your specific setup. In most cases this is by principle much more efficient than any generic approach of the game engine. Most important first step: measure exactly which sections of your code spend most of the processing time.
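For point 4, a minimal sketch in plain C++ (names and frame conventions are my assumptions, not engine API): compute all 65536 unit directions once in the sensor's local frame, so a full turn only needs the sensor's current world rotation applied to each cached ray.

#include <cmath>
#include <vector>

struct Dir { float x, y, z; };

// precompute all 1024 x 64 unit ray directions in the sensor frame
// (assumed: Y forward, Z up, azimuth rotating around Z)
std::vector<Dir> precomputeRayDirections(const std::vector<float> &beam_altitude_angles)
{
    const float deg2rad = 3.14159265f / 180.0f;
    std::vector<Dir> dirs;
    dirs.reserve(1024 * 64);
    for (int i = 0; i < 1024; i++)
    {
        const float az = i * (360.0f / 1024.0f) * deg2rad;
        for (int j = 0; j < 64; j++)
        {
            const float el = beam_altitude_angles[j] * deg2rad;
            // spherical to Cartesian, elevation measured from the XY plane
            dirs.push_back({ std::sin(az) * std::cos(el),
                            std::cos(az) * std::cos(el),
                            std::sin(el) });
        }
    }
    return dirs;
}

Per turn this replaces the two rotate() calls per ray with a single rotation of a cached vector.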


If you are looking for GPU-accelerated ray casting, have a look at the DX12 ray tracing extension. In combination with Nvidia's latest RTX 3000 series this provides insane real-time ray-casting counts (but of course it requires much, much more custom programming... nothing is for free)

https://developer.nvidia.com/rtx/raytracing/dxr/DX12-Raytracing-tutorial-Part-1

https://developer.nvidia.com/rtx/raytracing/dxr/DX12-Raytracing-tutorial-Part-2


Hi Damien,

You can try to use depth maps for this. Render 4 viewports around the LiDAR and use the result to find the intersection points.


I've attached a sample to this post. Just create a new C++ project (double precision) and copy the files from the attachment into it.
On my system it takes 8 ms to render the 4 viewports (1.5 ms), convert the render textures to images (GPU to CPU, 3.5 ms) and render the 65536 dots (3 ms). I believe it's possible to increase performance several times, to around 2-3 ms.
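The reconstruction of a point from a depth sample is simple; a rough sketch (illustrative, not the exact sample code; it assumes square 90°-FOV viewports, OpenGL-style view space with -Z forward, and linearized eye-space depth as written by linearized_depth.frag):

// u, v in [0, 1]: texture coordinates of the sample,
// depth: linearized eye-space depth (distance along the camera's forward axis)
vec3 viewSpacePoint(float u, float v, float depth)
{
    // with a 90° field of view, tan(fov / 2) == 1, so the NDC coordinates
    // map directly to offsets on the view plane at distance 1
    float x = (2.0f * u - 1.0f) * depth;
    float y = (2.0f * v - 1.0f) * depth;
    return vec3(x, y, -depth);
}

Transforming that point by the inverse view matrix of the viewport whose frustum contains the beam gives the world-space point of the cloud.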

Best regards,
Alexander

lidar_sample.zip


@alexander Well that is amazing, thanks a lot! I didn't expect anyone to answer with such a complete solution! :)

I'm having some difficulty running the example on my machine though (Ubuntu, Unigine Sim 2.12.0.2):


 

The viewports are inverted and the points are scattered around.

The wrong point positions seem to come from the linearized_depth.frag shader, as I had the following error at runtime:

Failed to compile fragment shader: linearized_depth.frag
Compilation log:
	0(3439) : error C7011: implicit cast from "float" to "vec4"
	glsl 3439: s_frag_color = srgb(getLinearizedDepth(s_texture_TEX_DEPTH, TEXTURE_TEX_DEPTH_SLOT, uv) * s_depth_range.w);

Material::create_shader(): can't compile pass:"post" material:"post_depth"

On OpenGL, OUT_COLOR gets replaced by s_frag_color (which seems to be a vec4). I kinda fixed it, probably in the worst way, by casting the srgb output to a vec4:

s_frag_color = vec4(srgb(getLinearizedDepth(s_texture_TEX_DEPTH, TEXTURE_TEX_DEPTH_SLOT, uv) * s_depth_range.w));

It fixed the point scattering, but the viewports and the points are still inverted:


I tried to fix it by rotating the lidar node on the X axis; this fixed the viewport inversion but not the point inversion.

I probably missed something obvious here, any ideas? I just replaced the files in a new project with the ones you provided.

Thanks again for your answer!

@ulf.schroeter Thanks for the answer, I didn't know Nvidia provided a ray tracing API.

 

 

2 hours ago, Coppel.Damien said:

I'm having some difficulties to run the example on my machine though (Ubuntu, Unigine sim 2.12.0.2)

Oops, forgot to test it on OpenGL. :)

Here is a fixed version of the linearized_depth.frag shader:

#include <core/shaders/common/fragment.h>

INIT_TEXTURE(0, TEX_DEPTH)

MAIN_BEGIN(FRAGMENT_OUT, FRAGMENT_IN)
	// screen-space UV of the current fragment
	float2 uv = IN_POSITION.xy * s_viewport.zw;
	// OpenGL's texture origin is bottom-left, so flip V there
	#ifdef OPENGL
		uv.y = 1.0f - uv.y;
	#endif
	// to_float4() avoids the implicit float-to-vec4 cast that GLSL rejects
	OUT_COLOR = to_float4(srgb(getLinearizedDepth(TEXTURE_OUT(TEX_DEPTH), uv) * s_depth_range.w));
MAIN_END

linearized_depth.frag
