Future of Big Data
Michael Spencer
A.I. Writer, researcher and curator - full-time Newsletter publication manager.
Big data refers to data sets so large and complex that traditional data processing applications aren't adequate. Adequate for what, you might ask?
- Data analysis
- Data capture
- Data curation
- Search
- Sharing
- Storage
- Transfer
- Streaming
- Visualization
- Information privacy
With the emerging Internet of Things, mobile devices and sensors are being placed in everything and everywhere, generating an enormous amount of data. Large datasets are already involved in internet search, finance and business informatics, as well as academic science, physics and environmental simulations. So when will Big Data become mainstream?
Big Data has a lot of hype, but many mainstream enterprises are still waiting for better technologies to emerge. Hadoop batch processing alone is not enough to persuade most companies to run a pilot study. They are waiting for streaming analytics: analyzing data as a stream, whether historical data or incoming real-time data, is what the future of Big Data is going to look like.
Stream analytics provides better scalability and more flexibility, regardless of whether the data is real-time or historical. Data is becoming more fluid, and new tools are emerging that make it easier to transport as a stream, removing the need to interrupt, translate and transform the data in batches.
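To make that idea concrete, here is a minimal sketch in plain Python (no particular streaming framework) of how the same rolling analytic can run over a replayed historical file or an incoming real-time feed, simply by treating both as streams of records. The file name, field names and window size are illustrative assumptions, not details from any specific product.

```python
# Minimal sketch: one streaming analytic applied to historical or live data.
# Assumptions (illustrative only): records are dicts with a numeric "value" field,
# the historical source is a CSV file named "readings.csv", the live feed is simulated.

import csv
import random
import time
from collections import deque


def historical_stream(path):
    """Replay a historical CSV file as a stream of records."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield {"value": float(row["value"])}


def live_stream(n=10):
    """Simulate an incoming real-time feed (e.g. a sensor or message queue)."""
    for _ in range(n):
        time.sleep(0.1)  # stand-in for waiting on new data to arrive
        yield {"value": random.uniform(0.0, 100.0)}


def rolling_average(stream, window=5):
    """Emit a rolling average as each record arrives -- no batch window required."""
    recent = deque(maxlen=window)
    for record in stream:
        recent.append(record["value"])
        yield sum(recent) / len(recent)


# The analytic is identical for both sources; only the stream that feeds it differs.
for avg in rolling_average(live_stream()):
    print(f"rolling average: {avg:.2f}")
```

The point of the sketch is the interface, not the arithmetic: once sources are exposed as streams, the same analytic code serves historical replay and live ingestion alike.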
As modern architectures bypass ETL (Extract, Transform and Load) processes, real-time analytics on data from its original source becomes simpler. The industry may be saying goodbye to batch windows and recovery from batch failures, with data handled as a stream end to end. While Hadoop encouraged batch thinking, some people believe it will soon become outdated.
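As a rough illustration of what handling data "as a stream end to end" means for failure recovery, here is a small sketch in which each record is transformed and loaded as it arrives, and a persisted offset lets processing resume from the last record handled rather than rerunning an entire batch. The offset file and the in-memory sink are assumptions made for the example, not part of any specific system.

```python
# Sketch: per-record transform-and-load with a checkpointed offset, so a failure
# resumes from the last processed record instead of replaying a whole batch.
# The offset file name and the in-memory "sink" are illustrative assumptions.

import json
import os

OFFSET_FILE = "offset.json"  # assumed checkpoint location
sink = []                    # stand-in for a target store


def load_offset():
    if os.path.exists(OFFSET_FILE):
        with open(OFFSET_FILE) as f:
            return json.load(f)["offset"]
    return 0


def save_offset(offset):
    with open(OFFSET_FILE, "w") as f:
        json.dump({"offset": offset}, f)


def process(records):
    """Transform and load each record in flight, checkpointing as we go."""
    start = load_offset()
    for i, record in enumerate(records):
        if i < start:
            continue                 # already handled before a restart
        sink.append(record.upper())  # the in-flight "transform" step
        save_offset(i + 1)           # checkpoint: no batch window to redo


process(["alpha", "beta", "gamma"])
print(sink)
```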
As data volumes grow larger, the tools enterprises have at their fingertips will become more capable, with greater computational power and machine learning intelligence. As Big Data unfolds, we can expect mobile data collection to keep growing in complexity and cloud applications to improve, while the cost of storage continues to decrease. So where is all of this heading?
Obviously, humans cannot handle this kind of mass-scale data analytics, which is what brought platforms like Hadoop to the forefront. The characteristics of Big Data are commonly cited as:
- Volume
- Variety
- Velocity
- Variability
- Veracity
- Complexity
Implementations can be thought of as:
- Connection (sensors and networks)
- Cloud (data on demand & computing)
- Cyber (model and memory)
- Content & context (meaning and correlation)
- Community (sharing & collaboration)
- Customization (personalization and value)
From DARPA programs in the early 2000s to 2015, we have witnessed the evolution of Big Data and are starting to get a better feel for how it will turn out. However, the tools and software are still evolving, as is the overall infrastructure. We can expect them to be more mature by 2020 and robust in the enterprise in another five years. By then, machine learning and predictive analytics will also be more sophisticated.