5 challenges big data faces in moving forward
As the buzz around big data and what’s possible with it grows, it becomes all the more important to address the challenges on the road ahead.
Because big data seems to be outgrowing the ability of current technology to store, manage, and process it, we have tried to highlight some of the challenges that truly deserve your attention and effort.
Storage and transportation
Each time a new storage medium has been invented, the quantity of data has exploded. This time it is different. Data is now being created by everyone and everything (thanks to social media and the IoT), not just by professionals such as scientists, writers, or journalists, yet no new storage medium has appeared. Current disk technology tops out at about 4 Terabytes per disk, so an Exabyte would require 250,000 such disks. Even if an Exabyte of data could be processed on a single computer system, that system would be unable to directly attach the requisite number of disks.
Besides, access to such voluminous information would overwhelm current communication networks. Let’s say that a 1 Gigabit per second network has an effective sustainable transfer rate of around 80%, which gives a sustainable bandwidth of about 100 Megabytes per second. Transferring an Exabyte would then take about 2.8 million hours (more than 300 years), assuming a sustained transfer could be maintained. This means that it could take longer to transmit the data from the collection or storage point to the processing point than to actually process it!
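The storage and transfer estimates above can be checked with a few lines of Python. This is only a sketch: it assumes decimal units (1 Exabyte = 10^18 bytes), a 4 Terabyte disk, and a sustained rate of 100 Megabytes per second, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the storage and transfer figures.
# Assumption: decimal units, so 1 Exabyte = 10**18 bytes,
# 1 Terabyte = 10**12 bytes and 1 Megabyte = 10**6 bytes.
EXABYTE = 10**18
TERABYTE = 10**12
MEGABYTE = 10**6


def disks_needed(total_bytes: int, disk_bytes: int) -> int:
    """Number of disks needed to hold total_bytes, rounded up."""
    return -(-total_bytes // disk_bytes)  # ceiling division on integers


def transfer_hours(total_bytes: int, bytes_per_second: int) -> float:
    """Hours needed to move total_bytes at a constant sustained rate."""
    return total_bytes / bytes_per_second / 3600


disks = disks_needed(EXABYTE, 4 * TERABYTE)
hours = transfer_hours(EXABYTE, 100 * MEGABYTE)
years = hours / (24 * 365)

print(f"disks needed: {disks:,}")
print(f"transfer time: {hours:,.0f} hours (about {years:,.0f} years)")
```

Under these assumptions the calculation gives 250,000 disks and roughly 2.8 million hours, or about 317 years, for a single sustained transfer of one Exabyte.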
Management
Unlike data collected by manual methods, big data is too richly varied in its representation for a bespoke collection methodology to be practical. The data is also very fine-grained, as with metering or clickstream data. Given the volume, it is impractical to validate every data item, so data qualification focuses on missing data and outliers rather than on validating every item. The sources of big data also vary, both temporally and spatially, in their format and method of collection. In short, there isn’t any perfect big data management solution yet.
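To make the idea of qualifying data by missing values and outliers more concrete, here is a minimal sketch. It is not a method described in this article: the interquartile-range rule, the `qualify` function, and the sample meter readings are all assumptions chosen for illustration.

```python
import statistics
from typing import Optional, Sequence


def qualify(readings: Sequence[Optional[float]], k: float = 1.5) -> dict:
    """Summarize a batch: count missing values and flag IQR outliers.

    Assumption: None marks a missing reading, and a value is an outlier
    when it lies more than k * IQR outside the first or third quartile.
    """
    present = [r for r in readings if r is not None]
    missing = len(readings) - len(present)
    q1, _, q3 = statistics.quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [r for r in present if r < low or r > high]
    return {"count": len(readings), "missing": missing, "outliers": outliers}


# Hypothetical meter readings; None marks a missing value.
batch = [10.1, 9.8, None, 10.3, 10.0, 97.5, 9.9, None, 10.2]
report = qualify(batch)
print(report)
```

The point of the sketch is that the check summarizes the batch as a whole instead of validating each item against a rule, which is what makes this style of qualification workable at volume.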
Processing
Let’s say an Exabyte of data needs to be processed in its entirety. For simplicity, assume that this data is chunked into blocks of 8 words, and recall that 1 Exabyte = 1K Petabytes. Now, if a processor running at 5 Gigahertz expends 100 instructions on a single block, the time spent processing that block end to end would be 20 nanoseconds (assuming one instruction per clock cycle). At this rate, a single processor would need roughly 635 years to work through 1K Petabytes if it spent 20 nanoseconds on every byte, and still about 10 years if each block holds 64 bytes (8 words of 8 bytes). Therefore, for effective processing of Exabytes of data, we would need extensive parallel processing and new analytics algorithms.
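The processing estimate depends heavily on how many bytes a block holds, which a short calculation makes visible. The sketch below assumes decimal units (1 Exabyte = 10^18 bytes), one instruction per clock cycle, and ideal parallel speed-up with no coordination overhead; the function names are hypothetical.

```python
import math

CLOCK_HZ = 5e9                 # 5 Gigahertz
INSTRUCTIONS_PER_BLOCK = 100   # from the scenario above
EXABYTE = 10**18               # assumption: decimal units
SECONDS_PER_YEAR = 365 * 24 * 3600


def processing_seconds(total_bytes: int, block_bytes: int) -> float:
    """Single-processor time, assuming one instruction per clock cycle."""
    seconds_per_block = INSTRUCTIONS_PER_BLOCK / CLOCK_HZ  # 20 nanoseconds
    return (total_bytes / block_bytes) * seconds_per_block


def processing_years(total_bytes: int, block_bytes: int) -> float:
    return processing_seconds(total_bytes, block_bytes) / SECONDS_PER_YEAR


def processors_for_deadline(total_bytes: int, block_bytes: int,
                            deadline_seconds: float) -> int:
    """Processors needed to meet a deadline under ideal linear speed-up."""
    return math.ceil(processing_seconds(total_bytes, block_bytes)
                     / deadline_seconds)


for block_bytes in (1, 8, 64):
    years = processing_years(EXABYTE, block_bytes)
    print(f"{block_bytes:>2}-byte blocks: {years:,.1f} years on one processor")

one_day = 24 * 3600
print("processors to finish 64-byte blocks in a day:",
      processors_for_deadline(EXABYTE, 64, one_day))
```

Even under the most favorable block size here, finishing within a day takes thousands of perfectly parallel processors, which is why parallel processing is unavoidable at this scale.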
Ownership
Ownership is a particular challenge in the social media arena. While petabytes of social media data reside on the servers of Facebook and Twitter, the data is not really owned by them, although they may claim ownership because it resides on their systems. The owners of the pages or accounts, on the other hand, believe this data to be theirs. A clear dichotomy arises here.
Compliance and Security
In certain domains, such as healthcare or social media, as more and more data is accumulated about individuals, there is a fear that certain organizations will know too much about individuals. The biggest threat is the unregulated accumulation of data by numerous social media companies, especially when individuals are so willingly surrendering personal information to these companies. Big data definitely needs to be strongly secured with respect to privacy and security laws.
The list above identifies some of the cardinal challenges big data faces in moving forward. We need to address these issues promptly and responsibly to ensure a smooth and successful venture around big data.