Collection of LiDAR data

Compared with other modalities such as images, video, audio, or text, very few LiDAR datasets are publicly available. The reason is that LiDAR data is difficult both to acquire and to annotate. Because even high-end LiDAR systems produce sparse, colorless point clouds, sensors of other modalities (usually cameras) are typically added when collecting LiDAR data.

Sparsity can make very different objects look alike when they are far from the sensor. For example, a person standing near the edge of a LiDAR system's sensing radius can be indistinguishable from a small tree to a human annotator. High-quality images from the cameras accompanying the LiDAR system help resolve this ambiguity.
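To make the effect of range on sparsity concrete, here is a rough back-of-the-envelope sketch. It assumes a flat, sensor-facing rectangular target and hypothetical angular resolutions (0.2° horizontal, 0.4° vertical); real sensors differ, and the function name is only illustrative.

```python
import math

def estimated_returns(width_m, height_m, range_m, h_res_deg, v_res_deg):
    """Rough count of LiDAR returns on a flat target facing the sensor."""
    # Spacing between neighbouring beams at this range (arc length = r * angle).
    h_spacing = range_m * math.radians(h_res_deg)
    v_spacing = range_m * math.radians(v_res_deg)
    # Number of whole beam spacings that fit across the target in each direction.
    cols = int(width_m / h_spacing)
    rows = int(height_m / v_spacing)
    return cols * rows

# A person-sized target (0.5 m x 1.7 m) at increasing distances.
# The resolutions below are assumed values, not the specification of any sensor.
for range_m in (10, 50, 100):
    n = estimated_returns(0.5, 1.7, range_m, h_res_deg=0.2, v_res_deg=0.4)
    print(f"{range_m:>4} m: about {n} returns")
```

Because beam spacing grows linearly with range, the number of returns on a target falls roughly with the square of the distance. This is why distant objects are left with only a handful of points.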

This solution sounds simple, but in practice calibrating the sensors against each other is tedious. Sensor fusion requires the sensors to be mounted in precise, fixed positions. Given those positions and the sensor-specific parameters, called intrinsics, the sensors then go through alignment routines that usually involve large printed chessboard patterns placed at various angles relative to the sensors.

The calibration process of LiDAR and camera sensor fusion.

The result is an extrinsic matrix, which is used together with the camera intrinsic matrices to project points from the point cloud to pixel space and vice versa. This two-way connection between sensors not only makes annotation easier but also enables the use of state-of-the-art image understanding models. These models detect or segment the objects captured by the camera, and the resulting labels are projected onto the 3D point clouds captured by the LiDAR system. This either completes the LiDAR labeling job or greatly augments it.
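The projection from point cloud to pixel space can be sketched as follows. This is a minimal illustration assuming an ideal pinhole camera with no lens distortion. The function name, the intrinsic matrix `K`, and the identity extrinsic matrix `T` are hypothetical values chosen for the example, not the output of a real calibration.

```python
import numpy as np

def project_lidar_to_image(points_lidar, T_cam_from_lidar, K):
    """Project (N, 3) LiDAR points to pixel coordinates.

    T_cam_from_lidar: (4, 4) extrinsic matrix mapping LiDAR frame to camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns the (M, 2) pixel coordinates of the points in front of the camera
    and a boolean mask of length N marking which input points those are.
    """
    n = points_lidar.shape[0]
    homogeneous = np.hstack([points_lidar, np.ones((n, 1))])
    # Extrinsics: move the points into the camera coordinate frame.
    points_cam = (T_cam_from_lidar @ homogeneous.T).T[:, :3]
    # Points behind the image plane cannot be seen by the camera.
    in_front = points_cam[:, 2] > 0
    points_cam = points_cam[in_front]
    # Intrinsics: perspective projection, then divide by depth.
    pixels_h = (K @ points_cam.T).T
    pixels = pixels_h[:, :2] / pixels_h[:, 2:3]
    return pixels, in_front

# Hypothetical calibration: focal length 1000 px, principal point (640, 360).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
# Assumption for the example only: both frames coincide, with z pointing forward.
T = np.eye(4)

points = np.array([[0.0, 0.0, 10.0],
                   [1.0, 0.5, 20.0],
                   [0.0, 0.0, -5.0]])  # the last point lies behind the camera
pixels, mask = project_lidar_to_image(points, T, K)
```

Once each point has a pixel coordinate, a label predicted for that pixel by an image model can be copied to the corresponding 3D point. Points that fall outside the image bounds would need to be filtered as well.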

Uses of deep learning with LiDAR data

Given the type of output that LiDAR systems generate, combining them with neural networks seems like a natural fit, and neural networks operating on point clouds have indeed proven effective. Deep neural networks can be applied to LiDAR data for classification and semantic segmentation. For instance, U-Net-like architectures operating directly on point clouds have been shown to achieve results superior to those obtained with image-based models.

Applications extend to increasingly complex tasks in the domain, such as instance segmentation, object detection, object completion, and pose estimation. This progress eventually contributed to generative 3D models such as Point-E, a point cloud generation model released by OpenAI in December 2022.

Challenges of neural networks with LiDAR

  • A challenge for neural networks operating on LiDAR data is the large variation caused by scanning times, weather conditions, sensor types, distances, backgrounds, and many other factors. Because of the way LiDAR works, the point density and return intensity of objects vary considerably.
  • Sensors are often noisy, and LiDAR data in particular is often incomplete (because of factors such as the low surface reflectance of certain materials and cluttered backgrounds in cities), so neural networks working with LiDAR data must handle a lot of variation. Another problem with 3D point data is that, unlike the pixels of a 2D image, the points from a LiDAR sensor have no intuitive order. The model therefore needs permutation and orientation invariance, which not all architectures satisfy.
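The permutation-invariance requirement can be illustrated with a small sketch. It assumes a single shared linear layer with random, untrained weights followed by max-pooling. This is an illustration of the property only, not a trained model, and it does not address orientation invariance.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 3))   # a toy point cloud with 128 points
W = rng.normal(size=(3, 16))         # random weights shared by every point
b = rng.normal(size=16)

def global_feature(pts):
    """Apply the same layer to every point, then max-pool over the points."""
    per_point = np.maximum(pts @ W + b, 0.0)   # shared linear layer + ReLU
    return per_point.max(axis=0)               # symmetric function: order-free

# Present the same points in a different order.
shuffled = points[rng.permutation(len(points))]

# The pooled feature is unchanged by the reordering.
pooled_is_invariant = np.allclose(global_feature(points), global_feature(shuffled))

# A model that consumed the flattened coordinates would see a different input.
flat_input_changes = not np.allclose(points.reshape(-1), shuffled.reshape(-1))
```

The design point is that any symmetric aggregation (max, sum, or mean) over per-point features gives the same output regardless of point order, whereas a layer applied to the flattened array does not.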

Four interesting families of architectures have been proposed to deal with LiDAR data:

1) Point cloud-based methods: These networks operate directly on the point clouds using different approaches. One such approach is learning the spatial features of each point directly via MLPs and accumulating them via max-pooling.

2) Voxel-based methods: The 3D data is divided into a 3D grid of voxels (essentially a grid of cubes), and 3D convolution and pooling are applied in a CNN-like architecture.

3) Graph-based methods: These methods use the inherent geometry present in point clouds to construct graphs out of them and apply common GNN architectures like graph CNNs and graph attention networks (which also happen to satisfy the earlier mentioned condition of permutation invariance).

4) View-based methods: These methods create 2D projections of the point clouds and then apply tried-and-tested architectures from 2D computer vision. In this case, a tactic that can help improve model performance is to create multiple projections from different angles and vote for a final prediction.
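The families above differ mainly in how the point cloud is represented before it reaches the network. Point cloud-based methods consume the raw (N, 3) array directly, while the other three need a conversion step. The sketch below shows one simple way to build each of those inputs with NumPy. The function names, grid sizes, and the brute-force neighbour search are illustrative assumptions, not taken from a specific library or paper.

```python
import numpy as np

def voxelize(points, voxel_size, grid_min, grid_shape):
    """Voxel-based input: binary occupancy grid of shape grid_shape."""
    idx = np.floor((points - grid_min) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid = np.zeros(grid_shape, dtype=bool)
    grid[tuple(idx[inside].T)] = True   # mark every voxel containing a point
    return grid

def knn_edges(points, k):
    """Graph-based input: directed edges from each point to its k nearest
    neighbours. Brute force, O(N^2) memory, so only suitable for small N."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)      # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    sources = np.repeat(np.arange(len(points)), k)
    return np.stack([sources, neighbours.reshape(-1)], axis=0)  # shape (2, N*k)

def birds_eye_view(points, cell_size, xy_min, image_shape):
    """View-based input: top-down image storing the maximum height per cell.
    Empty cells are set to 0."""
    ij = np.floor((points[:, :2] - xy_min) / cell_size).astype(int)
    inside = np.all((ij >= 0) & (ij < np.array(image_shape)), axis=1)
    ij, z = ij[inside], points[inside, 2]
    image = np.full(image_shape, -np.inf)
    np.maximum.at(image, (ij[:, 0], ij[:, 1]), z)  # unbuffered per-cell max
    image[np.isinf(image)] = 0.0
    return image

# Toy point cloud with four points inside a 2 m cube.
points = np.array([[0.1, 0.1, 0.1],
                   [0.2, 0.1, 0.3],
                   [1.5, 1.5, 0.5],
                   [1.6, 0.2, 1.9]])

grid = voxelize(points, voxel_size=1.0, grid_min=np.zeros(3), grid_shape=(2, 2, 2))
edges = knn_edges(points, k=1)
bev = birds_eye_view(points, cell_size=1.0, xy_min=np.zeros(2), image_shape=(2, 2))
```

The occupancy grid would feed a 3D CNN, the edge list a GNN, and the top-down image a standard 2D CNN. For the multi-view voting tactic, the projection step would be repeated from several viewpoints before aggregating the predictions.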
