Why a High-Performance Data Management Platform is an Essential Component for Connected-Transportation Analytics in DOT Projects
Martin Schmid
National Sales State and Local Government & Education (SLED), Federal Civilian and Department of Defense - SAP Finance at SAP Americas
Introduction
Connected vehicles, and real-time access to the data they generate, are a reality today. Two standards for embedded vehicle-to-everything (V2X) communication have been developed: the older IEEE Dedicated Short Range Communication (DSRC) and the more recent cellular V2X (C-V2X). Both are designed to enable direct real-time safety communication between road users (vehicle-to-vehicle, V2V) and between vehicles and infrastructure (V2I). DSRC achieved mass-market adoption in 2017 while C-V2X will appear in chipsets in 2021. Government agencies are actively deciding now on policies for the allocation of dedicated V2X spectrum for these devices.
Many companies today are working actively on enhanced data capture from vehicles via embedded V2X-enabled telematics units, via mobile apps on phones riding in the vehicle, and via increasingly capable roadside devices designed for V2X communication. Public-private sector investments to leverage the advantages of this new connected-transportation ecosystem are in place and have ambitious near-term agendas. The near-term possibilities for real-time traffic shaping and predictive models for vehicle movement represent exciting breakthroughs in safety and efficiency that could result in significant wins for individual travelers, transportation companies, and infrastructure management organizations. Telemetry aggregator companies like Wejo and Arity are now making their aggregated multi-source, multi-location data available to clients in the commercial and public sector.
Underlying all these scenarios, the connected-vehicle data sets are the key resource to be exploited. The volume of this data is by its nature huge, since it most often represents continuous 24x7 reporting from devices or device readers for large numbers of vehicles. Furthermore, the most interesting and potentially game-changing data are the real-time communications that can be used to take in-the-moment actions to improve safety and efficiency. These scenarios will ultimately include car-to-car communication in flowing traffic, but today the most exciting possibilities relate to real-time traffic management actions by the transportation department overseeing the roads.
Realizing the Potential of Connected-Vehicle Data Sets
Given this situation, connected-transportation projects are focusing on one key technical challenge: data capture and data analysis capabilities that can operate at high scale and, in some cases, in real time. Data analytics and real-time decision-making are the keys to unlocking the potential for much safer and much more efficient use of roads that connected vehicles and connected transportation networks make possible.
Data analytics dashboards are only as good as the data behind them. If there are limitations in capturing and moving the raw data coming from vehicle-data streams to the data analytics solution, then the data analytics is limited by definition. One way to state this is the principle of “garbage in, garbage out” – that is, any analytics results are only as good as the completeness and quality of the underlying data used in the analysis.
In the case of offline analysis, data aggregators offer rich pre-processed data sets for public sector or commercial clients. These data sets must be accessed via software APIs and intelligently filtered and processed in order to operationalize analytics within the client’s domain in a timely fashion. If data is available “close to the device” – that is, as true real-time data streams from in-vehicle or roadside devices – then the data management layer must itself be real-time software with high-performance processing capabilities. In this way, the strength of the real-time data management platform that connects raw data with the target analytics for traffic management or other connected-vehicle systems will determine the limits of what can be accomplished in the project. If this data management platform is weak, then the analytics will fail in its objectives because of “garbage in, garbage out”.
Why SAP Convergent Mediation for Enabling New Connected-Transportation Solutions?
SAP Convergent Mediation (SAP CM) is one component of the SAP Billing and Revenue Innovation Management product suite and provides an extremely capable data capture, data management, and data orchestration layer for high-volume real-time vehicle data. SAP CM was first developed for telecommunications networks, where it handles extremely demanding management of device-generated data (from network switches and equipment) at extremely high volumes (hundreds of billions of records per day in some cases). It has been deployed in over 350 telecom networks worldwide.
As a result, SAP CM is extremely capable of real-time data capture and data processing. Some specific key functionalities are the following:
- dealing with collection APIs and format control for information coming directly from diverse devices, sensors, and networked equipment as well as from databases and back-office systems
- intelligently and flexibly correlating data from different sources – e.g., time-based correlations of sensor data; SAP CM correlates events as part of its real-time processing of a data stream
- aggregating and combining related data into summary objects for target systems
- enrichment of data records with derived metrics and with additional attributes available via calling out to an external database or system – e.g., matching up an in-coming vehicle-id with information available in back-office vehicle registration systems
- filtering, masking and encrypting sensitive data at the field level in raw input records to support appropriate data sharing and privacy concerns
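Two of the functions listed above, time-based correlation and field-level masking, can be illustrated with a small generic sketch. This is not SAP CM configuration or its API. The `Reading` type, the function names, the one-second window, and the salted-hash masking scheme are all assumptions chosen for illustration only.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor reading; field names are illustrative."""
    vehicle_id: str
    timestamp: float  # seconds since some reference time
    source: str
    value: float


def mask_vehicle_id(vehicle_id: str, salt: str = "demo-salt") -> str:
    """Replace an identifier with a salted SHA-256 digest prefix.

    The same input always yields the same token, so masked records
    can still be joined with each other without exposing the raw id.
    """
    digest = hashlib.sha256((salt + vehicle_id).encode("utf-8")).hexdigest()
    return digest[:12]


def correlate_by_time(
    left: List[Reading], right: List[Reading], window_s: float = 1.0
) -> List[Tuple[Reading, Reading]]:
    """Pair readings of the same vehicle whose timestamps differ by at most window_s.

    A nested loop keeps the sketch short; a production system would
    index or sort by time instead.
    """
    pairs = []
    for a in left:
        for b in right:
            if a.vehicle_id == b.vehicle_id and abs(a.timestamp - b.timestamp) <= window_s:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    speed = [Reading("V1", 10.0, "speed", 88.0), Reading("V2", 10.2, "speed", 40.0)]
    brake = [Reading("V1", 10.4, "brake", 1.0), Reading("V2", 15.0, "brake", 1.0)]
    for a, b in correlate_by_time(speed, brake):
        print(mask_vehicle_id(a.vehicle_id), a.source, a.value, b.source, b.value)
```

In this example only the V1 readings fall inside the one-second window, so a single correlated pair is printed, with the vehicle identifier replaced by its masked token.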
In terms of performance and scalability, SAP CM has been proven at very large scale. Several telecom customers use SAP CM to collect and process over 100 billion device-level events per day. In one instance, the volume exceeds 500 billion events each day in continuous processing at a sustained rate above 4 million events per second. The input events, in this case, are being collected from network equipment spread throughout a nation-wide wireless network in North America.
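As a quick sanity check on these figures, the daily volumes can be converted to average per-second rates. The short calculation below assumes the load is spread evenly over a 24-hour day, which is a simplification; real traffic has peaks and troughs.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: float) -> float:
    """Average rate implied by a daily volume, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for label, daily in [("100 billion/day", 100e9), ("500 billion/day", 500e9)]:
        rate_millions = average_events_per_second(daily) / 1e6
        print(f"{label}: about {rate_millions:.2f} million events per second on average")
```

Under that assumption, 100 billion events per day is about 1.16 million events per second, and 500 billion per day is about 5.79 million per second, which is consistent with the stated sustained rate above 4 million events per second.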
Scenarios for Department of Transportation (DOT) Adoption of SAP Convergent Mediation
SAP CM solves the high-volume data management needed by DOT organizations in a comprehensive manner. Specifically, SAP CM has all the required functionality to ensure internal DOT analytics can achieve effective, immediate, and impactful access to rich 3rd-party connected car data feeds.
Use Case 1: Ingestion of Vehicle Data from an Aggregator Platform
SAP CM can provide robust automated collection and transformation of any vehicle information set or derived information made available to a Department of Transportation (DOT) by third-party aggregator vendors.
Bringing data from the aggregator data sources into the DOT systems involves the following steps:
1. Collection of the data from the external source using APIs provided by the Data Aggregator service. SAP CM allows a DOT agency to collect 3rd-party data in a fully controlled fashion using the APIs provided by the data aggregator. Because SAP CM has built-in support for any set of standard APIs, an agency can choose any data vendor without incurring internal IT development costs to use the APIs specific to that vendor. Further, while vendor APIs are designed for targeted data retrieval from the massive connected-car data sets of the Data Aggregator, the retrieval process often involves multiple API calls in a coordinated sequence and is optimally done on-demand or in a scheduled fashion. SAP CM handles all these requirements with a simple configuration.
2. Combining and formatting multiple data streams. SAP CM also has configuration-based controls for dealing with the diverse data formats inherent in multiple data sets and especially with device-based data. SAP CM processing allows for intelligent combining and filtering logic to merge data sets in-memory while eliminating any irrelevant data elements. SAP CM reduces and refines the raw data set without storing it (in-memory processing) and thus eliminates cost issues associated with storing unimportant data in on-site DOT analytics systems. Storage costs are a significant issue because the size of connected-car data repositories is massive and continues to grow exponentially.
3. Enrichment with DOT internal data. The 3rd-party data sets come from many sources (apps, vehicles, equipment, value-add analysis) and many locations. In some cases, the DOT client would like to correlate this data with ‘local data’ obtained or derived from the DOT’s own vehicle registration and control systems or databases. Indeed, the most interesting data will be data either taken from or directly correlated to the DOT’s own transportation network.
Towards this end, SAP CM can use input data to pull out identifiers (e.g., vehicle type, location, or tag) that enable dynamic calls to DOT systems or databases to match up internal DOT information to corresponding retrieved data items. Again, it is important to emphasize that this combining and augmentation of input data sets via dynamic API calls to internal systems is all done with in-memory processing and thus avoids storing the intermediate data sets.
4. Delivery over existing APIs to in-house systems. Once SAP CM has assembled the right data set with the right granularity, SAP CM delivers that data through an API to the internal DOT analytics platform and possibly other internal systems. With its extensive library of built-in connectors, SAP CM can be easily adapted to existing APIs and delivery methods (SQL, APIs, middleware), so getting new 3rd-party data into existing in-house DOT systems is quick to implement and easy to adapt to new use cases.
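The four steps above can be sketched as a chain of in-memory stages. This is a generic illustration of the pattern, not SAP CM's configuration model or API. The paged `fetch_page` callable, the `next_cursor` field, the record fields, and the in-memory `PAGES` and `REGISTRY` stand-ins for an aggregator API and a DOT registration system are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Record = Dict[str, object]


def collect(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[Record]:
    """Step 1: follow a cursor through a sequence of coordinated API calls."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["records"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


def combine_and_filter(
    streams: Iterable[Iterable[Record]], keep_fields: List[str]
) -> Iterator[Record]:
    """Step 2: merge streams and keep only relevant fields.

    keep_fields should include "vehicle_id", which step 3 relies on.
    """
    for stream in streams:
        for record in stream:
            if "vehicle_id" not in record:
                continue  # cannot be matched to internal data later
            yield {k: record[k] for k in keep_fields if k in record}


def enrich(
    records: Iterable[Record], lookup: Callable[[object], Optional[Record]]
) -> Iterator[Record]:
    """Step 3: add attributes from an internal system, keyed by vehicle_id."""
    for record in records:
        extra = lookup(record["vehicle_id"])
        yield {**record, **(extra or {})}


def deliver(records: Iterable[Record], sink: Callable[[Record], None]) -> int:
    """Step 4: hand each finished record to the target system; return the count."""
    count = 0
    for record in records:
        sink(record)
        count += 1
    return count


# Hypothetical stand-ins for the aggregator API and the DOT registry.
PAGES = {
    None: {"records": [{"vehicle_id": "V1", "speed_kph": 88, "debug": "x"}], "next_cursor": "p2"},
    "p2": {"records": [{"vehicle_id": "V2", "speed_kph": 42}, {"speed_kph": 10}], "next_cursor": None},
}
REGISTRY = {"V1": {"vehicle_class": "truck"}}

if __name__ == "__main__":
    delivered: List[Record] = []
    stages = combine_and_filter([collect(PAGES.__getitem__)], ["vehicle_id", "speed_kph"])
    count = deliver(enrich(stages, REGISTRY.get), delivered.append)
    print(count, delivered)
```

Because every stage is a generator, records flow through one at a time and no intermediate data set is materialized, which mirrors the in-memory processing described in steps 2 and 3.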
In summary, with SAP CM in place, the DOT will be able to consume any 3rd-party vendor data. SAP CM allows the agency to collect 3rd-party data in a fully controlled fashion, using the data sharing mechanisms offered by the vendor (e.g., APIs), and subject to any timing requirements or schedule. Further, the SAP CM product has built-in configurable real-time alerting for failures or unexpected events (e.g., no data was available at the expected interval) during the execution of its automated collection processes.
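The alerting idea mentioned here, raising an alert when a scheduled collection fails or returns nothing at the expected interval, can be sketched generically as follows. This is not SAP CM's alerting feature. The `CollectionMonitor` class, its parameters, and the alert message format are assumptions made for illustration, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionMonitor:
    """Hypothetical watchdog for a scheduled collection job."""
    expected_interval_s: float
    grace_s: float = 0.0
    last_success: Optional[float] = None
    alerts: List[str] = field(default_factory=list)

    def record_run(self, now: float, records_received: int, error: Optional[str] = None) -> None:
        """Record the outcome of one collection run and alert on failure or empty results."""
        if error is not None:
            self.alerts.append(f"t={now:.0f}s collection failed: {error}")
        elif records_received == 0:
            self.alerts.append(f"t={now:.0f}s no data available at the expected interval")
        else:
            self.last_success = now

    def check_overdue(self, now: float) -> bool:
        """Alert if the last successful run is older than the interval plus grace period."""
        if self.last_success is None:
            return False  # nothing to compare against yet
        overdue = now - self.last_success > self.expected_interval_s + self.grace_s
        if overdue:
            self.alerts.append(f"t={now:.0f}s last successful collection was at t={self.last_success:.0f}s")
        return overdue


if __name__ == "__main__":
    monitor = CollectionMonitor(expected_interval_s=300, grace_s=30)
    monitor.record_run(now=0, records_received=120)   # normal run
    monitor.record_run(now=300, records_received=0)   # empty result
    monitor.check_overdue(now=400)                    # more than 330 s since last success
    for alert in monitor.alerts:
        print(alert)
```

In this run the empty result at 300 seconds and the overdue check at 400 seconds each produce one alert.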
Use Case 2: Scenarios involving Real-time Data Flowing from Sensors or Devices
As the connected-vehicle ecosystem and V2X mass-market adoption accelerate, access to real-time data directly from devices will become more widely available, extending beyond device vendors and a limited number of low-level aggregators. Real-time information flows will become increasingly exposed to larger agencies, companies, and stakeholders.
In this environment, a Department of Transportation will have the opportunity to develop real-time solutions within its own Transportation Network and potentially using real-time streams made available from the ecosystem of 3rd-party Connected Car Aggregators. But these capabilities will be possible only if the DOT has a data management layer capable of consuming and controlling the real-time data streams for effective use.
Therefore, as shown in Figure 1, a final critical capability inherent to SAP CM is the following:
5. Enabling Real-time Data Flows for Data and for Bi-Directional Control Messages directly to Road-side Infrastructure.
Due to its telecom heritage, SAP CM is designed and proven for handling high-volume, real-time data streams with low delay. As a true real-time platform, SAP CM can apply all of the capture and transformation functionality described above to real-time data streams, connecting devices, equipment, sensors, and other IoT sources directly to real-time analysis and real-time decisioning systems.
In this way, SAP CM can enable completely new real-time traffic management and traffic analytics solutions. Many exciting possibilities are emerging today for a DOT that has the capability to consider real-time data scenarios.
About Bert Dempsey
Bert Dempsey is a Senior Solutions Consultant for SAP Mediation at DigitalRoute. He has worked for 20 years in technical roles within software product companies focused on telecom mediation and usage data management software. To start his career, after obtaining a PhD in Computer Science, he was an Associate Professor at the University of North Carolina, Chapel Hill (1995 – 2003) where he co-authored 35 research papers on topics in real-time networking and distributed systems.