The Time of Real-Time Data
Craig Brown, PhD
SME - #Leadership, #SolutionsArchitecture (#Cloud, #BigData, #VectorDB, #LLM, #DataScience, #GenAI, #DataAnalytics, #DataEngineering, #DataArchitecture, #MachineLearning, #ArtificialIntelligence, #YugabyteDB, #CockRoach)
Edge computing is one of the newest innovative approaches to network architecture, helping enterprises overcome limitations imposed by traditional networks, and even by some cloud-based networks, although cloud computing still makes up the backbone of modern network architecture. The benefits offered by the Internet of Things (IoT), connecting physical devices capable of processing the data they gather at the source, are forcing companies to rethink their approach to IT infrastructure.
Edge Computing is a form of computing architecture that serves as an alternative to cloud computing. Rather than transferring the data generated by IoT-connected devices to the Cloud or a Data Center, it processes the data at the edge of the network, directly where it is generated. Learn below about the definition, operation, benefits, and use cases of Edge Computing.
The amount of data generated by connected objects and other mobile devices (smartphones and tablets) is exploding. In this context, specialists like Cisco's Helder Antunes believe it is unrealistic to expect all this data to be transmitted between its sources and Cloud Data Centers in a stable and fast way for analysis.
However, companies in many industries, such as manufacturing, healthcare, telecommunications, and finance, need to analyze their most important data as quickly as possible, almost at the same time as it is collected. Faced with this challenge, Edge Computing could be the solution.
What Is Edge Computing and What Is It For?
To put it simply, Edge Computing, or data processing at the edge of the network, is an open, distributed computing architecture. It offers decentralized processing power: rather than being transmitted to a remote Data Center, the data is processed directly by the device that generates it (connected object, smartphone, etc.) or by a local computer or server.
Specifically, Edge Computing can be thought of as a mesh network of Micro Data Centers, each with a footprint of less than 10 square meters, that process or store critical data locally. Only selected data is then transmitted to a central data center or cloud storage.
Most often, Edge Computing is used in the field of the Internet of Things. Part of the vast amount of data collected by connected devices is processed locally to reduce traffic to the cloud or data centers and to enable important data analysis in real time (or near real time).
The connected object usually transfers the data to a small local device capable of processing and storing it. The data is thus processed at the edge of the network before being transmitted to the cloud or the data center.
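The pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not a real gateway implementation: the function name, payload shape, and anomaly threshold are all assumptions made for the example, and the idea is simply that raw readings are reduced locally so only a compact summary travels upstream.

```python
import statistics

# Hypothetical edge-gateway logic: names and thresholds are illustrative,
# not taken from any specific product.
def process_at_edge(readings, anomaly_threshold=75.0):
    """Aggregate raw sensor readings locally and flag anomalies.

    Only this compact summary (not the raw stream) would be sent to the
    cloud or data center, reducing network traffic.
    """
    return {
        "count": len(readings),
        "mean": statistics.mean(readings),
        "max": max(readings),
        "anomalies": [r for r in readings if r > anomaly_threshold],
    }

# Example: 8 raw temperature readings collapse into one small payload.
raw = [61.2, 60.8, 62.0, 79.5, 61.1, 60.9, 61.4, 61.0]
payload = process_at_edge(raw)
print(payload["count"], payload["anomalies"])  # 8 [79.5]
```

Eight raw readings become one payload of four fields; at scale, this is the traffic reduction that makes edge processing attractive.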
What are the Benefits and Use Cases of Edge Computing?
Edge Computing can be relevant in many situations. For example, when IoT-connected objects have poor connectivity, it is not efficient to keep them connected to a central cloud at all times. Processing at the edge helps solve this problem.
Edge Computing also reduces information-processing latency, because data does not need to traverse a large network to reach a remote Data Center or cloud server. This reduction in latency is particularly important in areas such as financial services and manufacturing.
According to IDC analyst Kelly Quinn, Edge Computing could take off as part of the deployment of 5G mobile networks. She predicts that telecom providers will add more and more Micro Data Centers to their 5G antennas, allowing business customers to rent space in these miniature data centers and reap the benefits of Edge Computing.
Is Edge Computing More Secure Than Cloud Computing?
The security of Edge Computing is hotly debated. In one sense, security can be considered higher at the periphery of the network because the data does not cross the network and remains close to its source. Thus, if a cloud environment or an internal Data Center is compromised, the amount of vulnerable data is minimized.
However, devices at the edge of the network are, a priori, more vulnerable than the Cloud or Data Centers. As a result, an Edge Computing environment may actually be less secure than a cloud environment or an internal Data Center. To ensure the security of an Edge Computing environment, special care must therefore be taken: data encryption, access control, and use of a VPN are essential measures.
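One concrete measure in that list, protecting payload integrity in transit, can be sketched with Python's standard-library `hmac` module. This is an illustrative sketch, not a complete security design: the pre-shared key, payload shape, and function names are assumptions, and a real deployment would also need key provisioning, rotation, and transport encryption (e.g., TLS or a VPN).

```python
import hashlib
import hmac
import json

# Assumption for the sketch: each edge device holds a pre-shared key,
# provisioned securely out of band.
SECRET_KEY = b"pre-shared-edge-key"

def sign_payload(payload, key=SECRET_KEY):
    """Serialize a payload and attach an HMAC-SHA256 tag before transmission."""
    body = json.dumps(payload, sort_keys=True).encode()
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, tag

def verify_payload(body, tag, key=SECRET_KEY):
    """Receiving side: recompute the tag and compare in constant time."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

body, tag = sign_payload({"device": "sensor-42", "temp": 61.2})
print(verify_payload(body, tag))         # True: message intact
print(verify_payload(body + b"x", tag))  # False: message tampered with
```

The constant-time comparison (`hmac.compare_digest`) matters here: a naive `==` comparison can leak timing information to an attacker probing the verifier.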
New Issues in Big Data Storage and Processing
There was a time when, to handle the surge of data, a three-step protocol was applied: ETL (Extract, Transform, Load) processing of raw data, then storage in huge Data Warehouses, and finally loading into application-specific Data Marts to meet business needs.
The appearance of Data Lakes in the mid-2010s compressed this lengthy process into one step: it became possible to store raw data directly in the Data Lake without intermediate processing, following a simple "Extract, Load, Transform" (ELT) process that gives business users easy access to data without the silo effect of Data Marts.
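The difference between the two pipelines is purely one of ordering, which a toy sketch makes concrete. The dicts below stand in for real storage systems (they are assumptions for illustration, not actual warehouse or lake APIs): in ETL the transform runs before loading, so the warehouse only ever holds clean rows; in ELT the raw rows land first and are transformed on demand.

```python
# Toy contrast of ETL vs ELT; "warehouse" and "lake" are plain dicts
# standing in for real storage systems.
raw_records = [
    {"user": "alice", "amount": "10.50"},
    {"user": "bob", "amount": "3.25"},
    {"user": None, "amount": "bad"},  # malformed row
]

def transform(records):
    """Clean and type the raw rows, dropping malformed ones."""
    out = []
    for r in records:
        if r["user"] is None:
            continue
        try:
            out.append({"user": r["user"], "amount": float(r["amount"])})
        except ValueError:
            continue
    return out

# ETL: transform *before* loading, so only clean rows are stored.
warehouse = {"sales": transform(raw_records)}

# ELT: load raw data into the lake first; transform later, per use case.
lake = {"sales_raw": raw_records}
sales_view = transform(lake["sales_raw"])

print(len(warehouse["sales"]), len(lake["sales_raw"]))  # 2 3
```

Note the trade-off the article describes: the lake keeps all three raw rows (including the malformed one), so different business users can apply different transforms later without a Data Mart silo.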
As usage became the central issue in the development of Big Data, the debate refocused on the downstream phase of data analysis, the application phase, opening new fields of practice in data storage and processing: Cloud, Edge Computing, and Blockchain.
Fast Data Vs Big Data ... The Time of Real-Time
Fast Data is the application of big data analytics to smaller data sets in real time or near real time to create business value or solve a problem. The goal of Fast Data is to quickly gather and mine data (both structured and unstructured) so that action can be taken.
While batch processing of large volumes of data remains a major challenge within the company, used to identify patterns and meet long-term business needs, the need for immediacy applies to smaller, more volatile data segments. Specifically, for IoT or personalized-marketing applications, what matters is the system's ability to respond rapidly to an event: for example, formulating a dedicated promotional offer for a customer who is performing a price comparison on the Internet, or a voice assistant's answer to its user's question.
The concept of Fast Data then took hold, built on data-streaming technologies (Spark, Storm, Kafka, etc.) that reduce processing time and memory footprint. Fast Data is designed to process and analyze small incoming data sets (structured or unstructured) that may lose their value if not analyzed immediately. The process relies on flash storage tools and velocity-oriented databases that extract and process data at very high speed (on the order of several million events per second). This phenomenon should accelerate with the advent of the Internet of Things.
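A core building block of the streaming technologies just mentioned is windowed aggregation: events are grouped into fixed time windows and summarized as they arrive. The stdlib-only sketch below shows a tumbling-window count in the spirit of Spark Streaming or Kafka Streams; their real APIs differ, and the function name and event shape here are assumptions for illustration.

```python
# Minimal tumbling-window aggregation over an event stream. Real stream
# processors (Spark, Kafka Streams) do this incrementally and in a
# distributed fashion; this sketch only shows the windowing logic.
def tumbling_window_counts(events, window_size=3):
    """Group (timestamp, value) events into fixed windows and count them.

    Returns a sorted list of (window_start, count) pairs.
    """
    counts = {}
    for ts, _value in events:
        window_start = (ts // window_size) * window_size
        counts[window_start] = counts.get(window_start, 0) + 1
    return sorted(counts.items())

# Events as (timestamp_in_seconds, payload) pairs.
events = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e"), (7, "f")]
print(tumbling_window_counts(events))  # [(0, 3), (3, 2), (6, 1)]
```

The same grouping idea underlies higher-level operations (per-window averages, anomaly rates, top-K), which is why windowing is the first primitive most streaming frameworks expose.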
In their study, "Big & Fast Data: The Rise of Insight-Driven Business", Capgemini and EMC indicate that 54% of companies surveyed for the study consider the analysis of Fast Data to be more important than that of Big Data.
In fact, it is the complementarity of the two systems that seems to be the number one issue in the coming years: segmenting streaming data on the one hand and data at rest on the other, depending on the differentiated uses to be made of them. Added to that is the problem of cost (a Fast Data infrastructure costs roughly double a Big Data infrastructure), a likely headache for CIOs!
IoT and Edge Computing
The advent of Fast Data did not happen in a vacuum: it is the exponential increase in Internet of Things devices that raised the question of simplified processing. The IoT is capable of generating 5 gigabytes of data per second, and without real-time processing, this data is lost.
Beyond the question of Fast Data processing tools, it is the challenge of data flows that is at the heart of the matter: how can processing time be reduced if the data must pass through a cloud server? To answer this question and free sensors from their dependence on connectivity, many "on the edge" processing devices, that is, devices backed by nearby storage infrastructure, have recently emerged.
Thanks to Edge Computing, data and algorithms are hosted on servers close to the information-capture devices, allowing more immediate and more secure processing. Increasingly popular in industry, Edge Computing now has a market of commercial offerings, which highlights the cost advantages of this technique compared to Cloud or Data Lake storage. The only prerequisite: ensuring interoperability between the IoT system and the Edge Computing infrastructure.
Author - Craig Brown, PhD