Observance - Preface

As mentioned in the last post, we will now share what we do, what Observance is, how we implemented Transformers and other AI-based models in our processing pipeline, and the basic building blocks needed to understand all of this. One can follow these posts to understand the state of the art in 3D reconstruction, LiDAR, and sensor fusion, and how to put Transformers to use. We want these articles to be an informative starting point for anyone trying to solve similar problems. So instead of going deeper into why Observance exists, we will look at the problems we had to solve while building it and how we solved them.


What exactly are we solving?

We are trying to perform a 3D reconstruction of a scene using techniques that are fast and highly accurate. These scenes can be very large (>1 million sq. ft), and we do not have a lot of time for data collection (less than an hour, or as fast as one can walk). Once the data is captured, we need to convert it into a lightweight 3D model with colored texture and compare it with the actual 3D or 2D model (after automatic alignment) to identify structural deviations or defects.

And of course, this process has to be extremely cost-effective: leading service providers today would charge around $300k and take a month to scan a 1M sq. ft area.


What is 3D reconstruction?

3D reconstruction is the process of creating a three-dimensional model of an object or scene from a set of two-dimensional images or measurements. This can be done using a variety of techniques, including photogrammetry, Structure from Motion (SfM), and SLAM, as well as AI techniques like Monocular Depth Estimation. While we will predominantly focus on techniques similar to SLAM using LiDAR, I remain heavily bullish on Monocular Depth Estimation.

[Image] Ref: Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging, by S. Mahdi H. Miangoleh et al.
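Monocular depth estimation has become remarkably accessible. As an illustration (not our pipeline), here is a minimal sketch that runs a pretrained MiDaS model on a single image via torch.hub; the file name scene.jpg is a placeholder, and the small model variant also requires the timm package.

```python
import cv2
import torch

# Load a small pretrained MiDaS model and its matching input transform.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = midas_transforms.small_transform

# scene.jpg is a placeholder for any RGB photo of the scene.
img = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img))
    # Resize the prediction back to the original image resolution.
    depth = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze().numpy()

# Note: the output is relative (inverse) depth, not metric distance.
```

A single forward pass gives a plausible depth map from one photo, which hints at why I remain bullish on this direction despite our LiDAR-first approach.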


What is SLAM?

SLAM, or Simultaneous Localization and Mapping, is a technique used to create a map of an environment using a camera, LiDAR, inertial measurement units, or other sensors, while simultaneously tracking the location of those sensors within the map. At Inkers, we are using a sensor fusion framework combining LiDAR, camera, and inertial sensors to achieve robust and accurate 3D point estimation.


What is sensor fusion?

Sensor fusion is the process of combining multiple sources of information to obtain a more accurate, reliable, and comprehensive view of the world. This is often done in situations where sensors are measuring different aspects of the same phenomenon, such as in self-driving cars, where cameras, LiDAR, and radar are used together to understand the environment around the vehicle. Let's break this down.

Let us say we are using LiDAR to capture the 3D point cloud. LiDAR captures instantaneous 3D data extremely well (but not the color texture). But during the capture, we do not have access to the most accurate 3D location of the LiDAR itself (GPS is ruled out in our case, as we need to scan areas where a GPS signal might not be available). The images on the left below show how the overlaid scans might look, while we need the scans to align as shown on the right.

[Image] Point clouds from two datasets: (top) Gazebo with an overlap of 0.9, and (bottom) ETH with an overlap of 0.59. Reading and reference point clouds (left) prior to registration and (right) aligned according to ground truth. The reference is displayed in blue, and the reading in orange tones.

There are some techniques, like Iterative Closest Point (ICP), but in practice they are good only for small regions. At Inkers, we capture LiDAR frames at 10 fps. Assuming a walkthrough of a building takes around 15 minutes, we need to align ~9,000 point clouds! Approaches like ICP also depend heavily on a correct initial localization.
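For a sense of what pairwise registration looks like in practice, here is a minimal sketch using Open3D's built-in ICP (the scan file names are placeholders). Note the init argument: without a reasonable initial guess, ICP happily converges to a wrong local minimum, which is exactly why it does not scale to thousands of scans on its own.

```python
import numpy as np
import open3d as o3d

# Align one LiDAR scan (source) to the previous one (target).
source = o3d.io.read_point_cloud("scan_0001.pcd")  # placeholder file names
target = o3d.io.read_point_cloud("scan_0000.pcd")

result = o3d.pipelines.registration.registration_icp(
    source, target,
    max_correspondence_distance=0.1,  # metres; farther point pairs are ignored
    init=np.eye(4),                   # ICP needs a sane initial pose guess
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)

print(result.fitness, result.inlier_rmse)  # alignment quality metrics
print(result.transformation)               # 4x4 rigid transform: source -> target
```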

Instead of LiDAR, one can use multiple camera images and align them to create 3D models as well, just as Matterport does.

[Image] SRC: https://matterport.com/discover/space/greenvale-elementary-main-level

This works amazingly well when we want to explore a site visually or capture the texture (as shown in the image below):

[Image] SRC: https://matterport.com/discover/space/greenvale-elementary-main-level

But there is only so much detail and accuracy one can capture using just camera-based techniques:

[Image] SRC: https://matterport.com/discover/space/greenvale-elementary-main-level

Models would remain incomplete, with measurements off by a large margin:

[Image] SRC: https://matterport.com/discover/space/greenvale-elementary-main-level

If we just need the current location, then we can also use an accelerometer or IMU. An IMU works well for instantaneous motion, but its errors accumulate over time and result in large deviations.
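To make the fix concrete, here is a toy 1D Kalman filter that fuses IMU dead reckoning (prediction, where uncertainty grows every step) with an occasional absolute position fix, such as one obtained from scan matching. All the noise values below are made-up illustrations, not our tuned parameters.

```python
import numpy as np

DT = 0.01                  # 100 Hz IMU
Q = np.diag([1e-6, 1e-4])  # process noise: how fast dead reckoning degrades
R = 0.05 ** 2              # measurement noise of an absolute position fix

def predict(x, v, P, accel):
    """IMU step (dead reckoning): integrate acceleration; covariance grows."""
    F = np.array([[1.0, DT], [0.0, 1.0]])
    x, v = x + v * DT, v + accel * DT
    return x, v, F @ P @ F.T + Q

def update(x, v, P, z):
    """Absolute position fix (e.g., from scan matching) pulls the drift back."""
    H = np.array([[1.0, 0.0]])
    S = H @ P @ H.T + R
    K = P @ H.T / S                        # Kalman gain, shape (2, 1)
    innovation = z - x
    x += float(K[0, 0]) * innovation
    v += float(K[1, 0]) * innovation
    return x, v, (np.eye(2) - K @ H) @ P

x, v, P = 0.0, 0.0, np.eye(2) * 0.01       # position, velocity, covariance
for step in range(3000):                   # 30 seconds of synthetic IMU data
    x, v, P = predict(x, v, P, accel=0.0)  # error accumulates between fixes...
    if step % 100 == 99:                   # ...until a 1 Hz position fix arrives
        x, v, P = update(x, v, P, z=0.0)
```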

So, LiDAR provides accurate instantaneous 3D locations of points in space, the camera provides amazing texture, and the IMU provides a good short-term estimate of our own motion.

So, can we

  • combine LiDAR and a camera to get a colored point cloud (see the sketch after this list)
  • combine LiDAR, camera, and IMU to get an accurate location
  • come up with an algorithm to track the exact location over 100s of meters with precision?
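The first item is essentially a projection problem. Assuming the camera intrinsics K and the LiDAR-to-camera extrinsics T_cam_lidar are already calibrated (which, as we discuss below, is the hard part), a minimal colorization sketch looks like this:

```python
import numpy as np

def colorize_points(points_lidar, image, K, T_cam_lidar):
    """Project LiDAR points into a camera image and pick up per-point colors.

    points_lidar: (N, 3) points in the LiDAR frame
    image:        (H, W, 3) RGB image captured at (roughly) the same instant
    K:            (3, 3) camera intrinsic matrix          -- assumed calibrated
    T_cam_lidar:  (4, 4) LiDAR-to-camera rigid transform  -- assumed calibrated
    """
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    in_front = pts_cam[:, 2] > 0                # keep points the camera can see
    pts_cam, pts_lidar = pts_cam[in_front], points_lidar[in_front]

    uv = (K @ pts_cam.T).T
    uv = (uv[:, :2] / uv[:, 2:3]).astype(int)   # perspective divide -> pixels

    h, w = image.shape[:2]
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return pts_lidar[ok], image[uv[ok, 1], uv[ok, 0]]  # 3D points, RGB colors
```

Every real-world complication on the list below (sync, calibration, FoV mismatch) shows up as an error term somewhere in these few lines.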

These are the problems that we wanted to solve. But as we started to develop our solution, we realized that these are not the only problems at hand.

  • It is extremely hard to sync a LiDAR frame with a camera frame (remember, the cost has to stay low; PTP-based camera solutions can make this problem easier, but they are expensive)
  • Calibrating a LiDAR with a camera is a very tedious process, especially when both have their own aberrations
  • LiDARs and cameras might have limited FoVs, and if not designed well, one might capture more data than the other. LiDARs also do not work well with reflective surfaces (imagine water on the floor at a construction site, or just after the rains)
  • LiDARs do not work well at short distances, so how will we capture lift lobbies, small rooms, and small passages?
  • Cameras wouldn't work at all in the dark :|
  • The amount of data generated is extraordinary (a 100,000 sq. ft scan can take up to 1TB of raw storage!), and writing it out fast enough poses hardware challenges
  • We cannot throw a 1TB file at our clients; we need heavy compression to store this data for the future
  • We still need a MESH, not a point cloud! The mesh needs a low polygon count, yet must still be good enough to capture HVAC, plumbing, and other objects found at a construction site (see the sketch after this list)
  • Point cloud to mesh conversion cannot introduce any approximation error
  • We need to segment the generated 3D mesh into walls, columns, beams, pipes, HVAC components, floor, etc.
  • We need to recognize concrete and other defects (in the images) and then show them on the 3D model
  • We also need to know the temperature of every point we capture, which means we need to integrate a thermal camera as well!
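On the mesh point: a common baseline (not necessarily what we ended up shipping) is Poisson surface reconstruction followed by decimation to meet a polygon budget. Here is a minimal Open3D sketch of that idea; the file names and the triangle budget are placeholders.

```python
import open3d as o3d

# Load a registered point cloud; the file name is a placeholder.
pcd = o3d.io.read_point_cloud("site.ply")
pcd.estimate_normals()  # Poisson reconstruction needs oriented normals

# Poisson surface reconstruction: higher depth = more detail, more triangles.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)

# Decimate to a low polygon count so the model stays lightweight.
mesh = mesh.simplify_quadric_decimation(target_number_of_triangles=200_000)
o3d.io.write_triangle_mesh("site_mesh.ply", mesh)
```

The catch is that each of these steps is an approximation; keeping the error bounded while staying lightweight is where the real work lies.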

Well, the list is long and doesn't end here.

In the articles to come, we will cover the problems described above, and the solutions we opted for, in detail. These solutions apply to companies working in interior design, building construction, video game production, filmmaking, architecture, restoration, engineering, scientific research, autonomous driving, and robotics.

Stay tuned!
