Drowning in Data?  Challenges and Opportunities

The "digital universe" is growing 40% a year into the next decade, expanding to include not only the increasing number of people and enterprises doing everything online, but also all the “things” – smart devices – connected to the Internet, unleashing a new wave of challenges and opportunities for businesses and people around the world.


Like the physical universe, the digital universe is large: by 2020 it will contain nearly as many digital bits as there are stars in the universe. It is doubling in size every two years, and by 2020 the digital universe – the data we create and copy annually – will reach 44 zettabytes, or 44 trillion gigabytes, according to IDC.
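The growth figures above can be checked with a little arithmetic. The sketch below is only an illustration: it uses the numbers quoted in this article (40% a year, 44 zettabytes) and assumes the decimal definition of a zettabyte (10^21 bytes). The function names are made up for this example.

```python
import math

# Figures quoted in the article; the decimal (SI) unit sizes are an assumption.
ANNUAL_GROWTH = 0.40        # 40% growth per year
BYTES_PER_GB = 10**9        # gigabyte, decimal definition
BYTES_PER_ZB = 10**21       # zettabyte, decimal definition


def growth_factor(years, rate=ANNUAL_GROWTH):
    """How many times larger the data volume is after `years` of compound growth."""
    return (1 + rate) ** years


def years_to_multiply(factor, rate=ANNUAL_GROWTH):
    """Years of compound growth needed to grow by `factor` times."""
    return math.log(factor) / math.log(1 + rate)


def zb_to_gb(zettabytes):
    """Convert whole zettabytes to gigabytes using decimal units."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


if __name__ == "__main__":
    print(f"Growth over two years: x{growth_factor(2):.2f}")
    print(f"Years to double:       {years_to_multiply(2):.2f}")
    print(f"Years to grow tenfold: {years_to_multiply(10):.2f}")
    print(f"44 ZB in gigabytes:    {zb_to_gb(44):,}")
```

Growing 40% a year compounds to a factor of 1.96 over two years, which is why the two claims (40% a year, doubling every two years) describe roughly the same curve. At that rate a tenfold increase, such as the jump from 4.4 ZB to 44 ZB mentioned later in this article, takes a little under seven years.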

Numbers on this scale are hard to grasp, and so are the challenges that this data deluge creates.

Data scientists break big data into three dimensions: volume, velocity, and variety.

Volume

The benefit gained from the ability to process large amounts of information is the main attraction of big data analytics. Having more data often beats having better models: simple bits of math can be unreasonably effective given large amounts of data. If you could run a demand forecast that takes 300 factors into account rather than 6, could you predict demand better?

Velocity

The importance of data's velocity, the increasing rate at which data flows into an organization, has followed a similar pattern to that of volume. Problems previously restricted to segments of industry are now presenting themselves in a much broader setting. Specialized companies such as financial traders have long turned systems that cope with fast-moving data to their advantage.

Variety

Rarely does data present itself in a form perfectly ordered and ready for processing. A common theme in big data systems is that the source data is diverse and doesn't fall into neat relational structures. It could be text from social networks, image data, or a raw feed directly from a sensor. None of these comes ready for integration into an application.

Implications for business and IT professionals:

  • Data warehouses will need to be upgraded or swapped out for more flexible data repositories that can handle various data types, automatic tagging, autonomous data “check-in,” and many terabytes. These warehouses must be able to store the vast amount of data on the most efficient infrastructure, bowing to the reality that only a fraction of stored data is actually engaged at any given moment.
  • Data analytic output will need to be driven to more parts of the organization, including real-time input to operational decision making.
  • Big data is messy: most Data Hub projects are essentially data cleanup projects. Long stretches of cleaning with no tangible output will doom a good number of these ambitious projects.
  • The far-reaching nature of big data analytics projects can have uncomfortable aspects: data must be broken out of silos in order to be mined, and the organization must learn how to communicate and interpret the results of analysis. This is a cultural problem, and one which will require rethinking the interface between multiple (often siloed and geographically dispersed) business organizations and the IT services group. The role of the CIO will change, and all executives must be engaged in the initiatives.
  • Real transformation to a data-driven or software-defined enterprise is an all-hands-on-deck imperative. IT alone will never be able to make the transition.

______________________________________________________

WHAT ARE BITS AND BYTES?

A "bit" (binary digit) is the smallest unit of information that can be stored in a computer; either a 1 or 0 (or on/off state).  All computer calculations are in bits.

A "byte" is a collection of 8 bits. Bytes are convenient units because 8 bits can take 256 different values (2 to the power of 8), enough to represent a single character such as a letter or a digit. So a byte is 8 times larger than a bit.
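As a small illustration of the relationship between bits and bytes, the sketch below counts how many values a group of bits can hold and prints the 8-bit pattern behind one character. The function names are hypothetical and exist only for this example.

```python
BITS_PER_BYTE = 8


def distinct_values(bits):
    """Number of different values that `bits` binary digits can represent."""
    return 2 ** bits


def byte_to_bits(value):
    """Return the 8-bit pattern for a value that fits in one byte (0 to 255)."""
    if not 0 <= value <= 255:
        raise ValueError("a single byte holds values from 0 to 255")
    return format(value, "08b")


if __name__ == "__main__":
    print(distinct_values(1))               # one bit: 2 values (0 or 1)
    print(distinct_values(BITS_PER_BYTE))   # one byte: 256 values
    print(byte_to_bits(ord("A")))           # the letter A as eight bits
```

In ASCII the letter A has the code 65, so the last line prints 01000001.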

Bytes are typically mentioned in multiples of 1,000, such as kilobyte (1,000 bytes), megabyte, gigabyte, etc. The progression is as follows:

Unit         Symbol   Size
Bit          b        1 or 0
Byte         B        8 bits
Kilobyte     KB       1,000 bytes
Megabyte     MB       1,000 KB
Gigabyte     GB       1,000 MB
Terabyte     TB       1,000 GB
Petabyte     PB       1,000 TB
Exabyte      EB       1,000 PB
Zettabyte    ZB       1,000 EB

This seems simple enough, except sometimes multiples of bytes are considered as powers of 2, since the original machine language only has two states, 1 or 0. So a kilobyte is 2^10 bytes, or 1,024 bytes.

A megabyte would be 2^20 bytes, or 1,024 kilobytes, and so on.
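The difference between the two conventions is small for kilobytes but grows with every step up the scale. The sketch below compares the decimal (powers of 1,000) and binary (powers of 1,024) sizes for each prefix. The names are invented for this illustration.

```python
# Prefixes in ascending order, as in the progression listed in this article.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta"]

DECIMAL_BASE = 1000   # 10^3
BINARY_BASE = 1024    # 2^10


def unit_size(prefix, base=DECIMAL_BASE):
    """Size in bytes of one unit with the given prefix, for the chosen base."""
    power = PREFIXES.index(prefix) + 1
    return base ** power


def gap_percent(prefix):
    """How much larger the binary unit is than the decimal one, in percent."""
    ratio = unit_size(prefix, BINARY_BASE) / unit_size(prefix, DECIMAL_BASE)
    return (ratio - 1) * 100


if __name__ == "__main__":
    for prefix in PREFIXES:
        print(f"{prefix + 'byte':<10} binary is {gap_percent(prefix):.1f}% larger")
```

A binary kilobyte is only 2.4% larger than a decimal one, but at the zettabyte level the two definitions differ by about 18%, so it matters which one a figure is quoted in.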

Put it into Context

A short novel                                              1 MB
A meter of shelved books                                   100 MB
A stack of tablets reaching 3/4 of the way to the moon     4.4 ZB (today)
6.6 stacks of tablets, each reaching the moon              44 ZB (2020)
