Sameer Shirur
Leading Technology Solutions | IISc | Data Science AI I EPC Digitalisation Expert | Industry4.0 Implementation to E&C | Author, Thought leader in EPC Process & Associated Industry |
Purpose
This article aims to give industry leaders a basic understanding of computer vision technology as they embark on their digital transformation journey. Computer vision is one of the Industry 4.0 technologies that has seen considerable advancement and has reached a good level of maturity in terms of available concepts, skills, and academic research.
In this article, we will cover an overview of computer vision technology and its applications. We will review the programming concepts and cloud computing platforms that construction industry professionals and technology managers can use.
Origin & History of Computer Vision:
The 1920s saw the Bartlane cable picture transmission system, one of the original digital image processing systems, used to transmit digitized newspaper images over submarine cable lines between London and New York.
The 1960s saw image processing advanced by research laboratories such as Bell Laboratories, MIT, and others, in satellite imagery and in health care.
The 1970s saw the introduction of optical character recognition (OCR) technology, which enabled extracting text from images. This was the first step towards analysing the images being processed.
The 1980s saw the development of algorithms to understand images and recognize patterns in them.
The 2000s saw the development of object recognition, a technology that enabled tagging, annotating, and standardizing images. ImageNet, with its vast database of labelled images (more than 14 million), provided the large data set needed to train machine learning models for image classification.
All this while, the cost of image processing was coming down thanks to innovation in the hardware needed to capture and analyze images, such as metal-oxide-semiconductor (MOS) technology, the charge-coupled device (CCD), and later the CMOS sensor, followed by the discrete cosine transform (DCT) for image compression. The latest semiconductor technology keeps making processors smaller while increasing their computing power.
Understanding the basics:
Image processing and computer vision are often used interchangeably, but they mean different things. Processing an image from a construction site so that the items in it can be viewed is image processing. Extracting information from that image, for example whether items are in the right place or whether a person is wearing a helmet, and if not, sending an alarm or notification to the HSE manager at site, is computer vision.
Image processing : This takes raw images as input and enhances them for further use, which makes it a subset of computer vision. It is well known to us, especially from X-rays and CT scans in the medical field; other applications include robotic vision and digital image processing.
Computer Vision : This focuses on extracting information from input images or videos, understanding their content and the differences between images, so that inferences can be drawn from them, most of the time by the program itself.
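To make the distinction concrete, here is a minimal sketch in plain Python. The tiny grayscale grid, the function names, and the brightness threshold are all assumptions made for illustration; real systems would use a library and a trained model. The first function returns an enhanced image (image processing), while the second returns information about the image (computer vision).

```python
def stretch_contrast(image):
    """Image processing: image in, enhanced image out (values rescaled to 0-255)."""
    flat = [pixel for row in image for pixel in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in image]
    return [[round((pixel - lo) * 255 / (hi - lo)) for pixel in row] for row in image]


def find_bright_object(image, threshold=128):
    """Computer vision: image in, information out (a bounding box, or None)."""
    hits = [
        (r, c)
        for r, row in enumerate(image)
        for c, pixel in enumerate(row)
        if pixel >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return {"top": min(rows), "left": min(cols), "bottom": max(rows), "right": max(cols)}


# Hypothetical low-contrast 4x4 grayscale image with a slightly brighter object.
site_image = [
    [50, 50, 50, 50],
    [50, 90, 90, 50],
    [50, 90, 90, 50],
    [50, 50, 50, 50],
]

enhanced = stretch_contrast(site_image)   # still an image
detection = find_bright_object(enhanced)  # a statement about the image
print(detection)
```

In this sketch the object is too dim to be found in the raw image, but it is found after enhancement, which is one way the two disciplines work together.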
As stated earlier, this article focuses on computer vision and its applications to the construction industry.
WHAT IS COMPUTER VISION?
At an abstract level, the goal of computer vision problems is to use the observed image data to infer something about the world.
Source : Computer Vision: Models, Learning, and Inference, 2012, page 83.
Why computer vision in engineering & construction?
According to an earlier McKinsey report, more than 80% of capital projects are either delayed or have cost overruns. At the same time, worldwide investment in infrastructure projects is predicted to exceed 12 trillion dollars by the 2030s. Closer to home, some reports suggest that India is planning to invest more than 1 trillion dollars in its infrastructure sector over the decade.
With such huge investment planned, and with technologies such as image capture, drone streaming, and robots playing a major role in executing that investment at construction sites, it becomes imperative that technology managers adopt modern technologies such as computer vision, especially since it has already matured, with clear expectations and proven solutions.
“One estimate says that more than 400,000 images are captured on a typical construction project of ~17,000 sq ft.”
Computer vision can improve how these visual inputs from robotics, surveillance cameras, and live streaming are recorded, processed, and interpreted to realize benefits.
Many researchers have identified the following areas where computer vision technology is already deployed, or where Proofs of Concept and Proofs of Value are under way:
- Safety at Site
- Progress Monitoring
- Quality Management
General use cases in construction
General use cases for computer vision in construction include, but are not limited to, those described in the research paper below.
There is a research paper on this topic by Suman Paneru and Idris Jeelani; I would advise readers to review it for more details.
“Computer vision applications in construction: Current state, opportunities & challenges” (https://doi.org/10.1016/j.autcon.2021.103940)
General Tasks involved in implementation of computer vision.
The steps above are indicative; there is no standard set of tasks. Another research article may combine some of these tasks under a different name, or use entirely different names for tasks that identify, classify, and track objects.
My attempt is to provide the steps commonly accepted in the industry, with details that readers can relate to.
“With the basics of computer vision covered, let us now go a little further and understand the machine learning concepts and libraries that are being used, along with various cloud computing services, and how these fit together architecturally to realize an implementation.”
Machine Learning concepts in Computer Vision:
In computer vision, the following are key to preparing the right machine learning model with the highest accuracy:
Quality of Data : Images and videos should be detailed enough to be usable.
Quantity of Data : Larger data sets often give more accurate results.
Variety : Collected data needs variety; the more variations of the target use case it covers, the better the training data set.
There are many tools and enterprise-grade platforms that can be used to achieve the above. For our industry, I would often recommend creating custom images/videos with labelled data for the specific use case. Once we have the right data, we move on to adopting the right machine learning concepts.
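As a rough illustration of checking quantity and variety before training, the sketch below counts how many labelled examples exist per class and flags classes that are under-represented. The label names, the helper name audit_labels, and the minimum of three examples per class are assumptions chosen only to keep the example small; a real project would set a much higher threshold.

```python
from collections import Counter


def audit_labels(labels, min_per_class=3):
    """Count examples per class and list classes below the minimum."""
    counts = Counter(labels)
    under_represented = sorted(
        name for name, n in counts.items() if n < min_per_class
    )
    return counts, under_represented


# Hypothetical labels attached to images collected from a site.
labels = [
    "helmet", "helmet", "helmet", "no_helmet",
    "helmet", "vest", "vest", "vest",
]

counts, under_represented = audit_labels(labels)
print(dict(counts))        # examples available per class
print(under_represented)   # classes that need more images
```

Here the "no_helmet" class has only one example, so more images of that case would need to be collected before the data set could be called varied.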
In terms of machine learning concepts, programmers use model architectures such as R-CNN, Fast R-CNN, and others for object detection, together with the libraries described later in this article, to solve computer vision problems.
CNN : Convolutional Neural Network
This is the most widely used deep learning neural network architecture for computer vision; the key word here is “convolution”. Convolution is a mathematical operation on two functions that produces a third function, which expresses how the shape of one is modified by the other (source : Wikipedia).
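For images, the two "functions" are the image and a small grid of numbers called a kernel. The sketch below, in plain Python, slides a kernel over a tiny image and sums the element-wise products at each position. Note the assumption: like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation. The image values, the kernel, and the function name are illustrative only.

```python
def slide_kernel(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# 4x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# 2x2 kernel that responds to a dark-to-bright change from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = slide_kernel(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is non-zero only in the middle column, where the dark and bright regions meet, which is how a convolution layer highlights a feature such as an edge.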
Computer vision needs a large set of data, as explained above, and the first step in solving a computer vision problem is classifying and labelling that data. A CNN addresses this problem.
A CNN is basically an algorithm made up of neurons (a neuron is a mathematical function that takes multiple inputs and produces a single output) organized in layers, each layer with its own weights and biases. Each layer performs a specific action to extract and refine the target features, and training adjusts the weights and biases to reduce the error of the mathematical model.
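The definition of a neuron above can be sketched in a few lines of plain Python. The weights, biases, and input values are made up for illustration, and the ReLU activation is an assumption (it is a common choice, but the article does not name one).

```python
def relu(x):
    """Activation: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def neuron(inputs, weights, bias):
    """Multiple inputs, one output: weighted sum plus bias, then activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return relu(weighted_sum + bias)


def layer(inputs, weight_rows, biases):
    """A layer is several neurons reading the same inputs, each with its own weights and bias."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


features = [1.0, 2.0]  # hypothetical inputs
outputs = layer(
    features,
    weight_rows=[[0.5, 0.25], [0.5, -1.0]],
    biases=[0.25, 0.25],
)
print(outputs)  # [1.25, 0.0]
```

Training a network means adjusting the numbers in weight_rows and biases, layer by layer, until the outputs produce the lowest error on the labelled data.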
With advanced machine learning techniques such as hyperparameter tuning and feature engineering, we can fine-tune the output for the best accuracy.
For more details, refer to : https://poloclub.github.io/cnn-explainer/#article-convolution
Let us look at a representative diagram of a CNN:
Python libraries for implementing Computer Vision.
Let us look at a couple of libraries and their reference pointers.
OpenCV :
Open Source : OpenCV is open source and released under the Apache 2 License. It is free for commercial use.
Optimized : OpenCV is a highly optimized library with a focus on real-time applications.
Cross-Platform : C++, Python, and Java interfaces support Linux, macOS, Windows, iOS, and Android.
scikit-image :
scikit-image is an image processing toolbox which builds on numpy, scipy.ndimage and other libraries to provide a versatile set of image processing routines in Python.
Source : https://scikit-image.org/
There are many other open-source libraries under development; I would suggest that readers review and test them properly.
These libraries can be used for resizing, rescaling, rotation, segmentation, geometric transformations, color space manipulation, filtering, morphology, and feature detection.
Cloud Computing Technologies for Computer Vision
Let us look at the technical details and reference points for computer vision use cases on three major cloud service providers.
Microsoft Azure :
The Azure AI Vision service gives you access to advanced algorithms that process images and return information based on their visual features.
How does it operate?
Key Consideration for Implementation :
Azure Data Factory : acts as the data integration and transformation layer, with more than 90 built-in connectors to acquire data from big data sources such as Amazon Redshift, Google BigQuery, and HDFS; enterprise data warehouses such as Oracle Exadata and Teradata; SaaS apps such as Salesforce, Marketo, and ServiceNow; and all Azure data services.
Azure AI Services : offers the Computer Vision and Custom Vision services.
Computer Vision services can be used for:
- Image analysis, which draws on more than 10,000 concepts and objects to detect, classify, caption, and generate insights.
- Spatial analysis, to understand people's presence and movements within physical areas in real time.
- Optical character recognition (OCR), to extract printed and handwritten text from images with varied languages and writing styles.
- Facial recognition, to create intelligent applications that recognize and verify human identity.
Custom Vision : customizes image recognition to your scenario, supports intuitive model creation, and can run in the cloud or on the edge in containers.
“Similar to Microsoft Azure, AWS and Google Cloud offer cognitive or AI services specifically designed for computer vision use cases.” You can refer to their websites for a detailed understanding.
Computer Vision on AWS
Amazon Rekognition is a service that makes it easy and quick to add deep learning-based visual search and image classification to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces, recognize celebrities, and identify inappropriate content.
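As a hedged sketch of what object and scene detection looks like from Python, the example below calls Rekognition's DetectLabels operation through the boto3 SDK and then filters the returned labels by confidence. It assumes boto3 is installed and AWS credentials and a region are configured; the helper names and the sample response are hand-written for illustration and are not real service output.

```python
def detect_labels(image_bytes, max_labels=20, min_confidence=50.0):
    """Send raw image bytes to Amazon Rekognition and return the raw response."""
    import boto3  # imported here so the helper below works without AWS set up

    client = boto3.client("rekognition")
    return client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )


def label_names(response, min_confidence=80.0):
    """Keep only the label names the service is reasonably confident about."""
    return [
        label["Name"]
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]


# Hand-written stand-in with the same shape as a DetectLabels response.
sample_response = {
    "Labels": [
        {"Name": "Helmet", "Confidence": 97.2},
        {"Name": "Person", "Confidence": 99.1},
        {"Name": "Crane", "Confidence": 61.4},
    ]
}

names = label_names(sample_response)
print(names)  # ['Helmet', 'Person']
```

With credentials in place, a site photograph could be passed in as detect_labels(open(path, "rb").read()), where path is whatever image file is being checked.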
Google Cloud Vision:
Vertex AI Vision is a low-code/no-code platform where one can build end-to-end machine learning pipelines covering the steps, including inference and analytics, that apply to computer vision. It targets business decision-makers and analysts who want to build analytics based on computer vision without dealing with complex code. Vertex AI Vision also has an SDK for developers to extend the functionality and embed the output in web and mobile applications.
Some research papers published on utilizing computer vision in construction:
Conclusion :
As we went through the article, we saw that there are enough research papers published in the area of computer vision for the construction industry.
There is enough evidence that current advancements in software and hardware, delivered via cloud computing service providers, make computer vision accessible to implement.
According to the available research papers, the construction industry has enough use cases in the areas of worker safety, quality monitoring, and progress monitoring.
I would recommend that construction industry professionals and technology managers be more aggressive in adopting computer vision to solve business problems.