Big Data Volume

Divagar Carlmarx

Published: February 1, 2020

Data volume is characterized by the amount of data that is generated continuously.
Different data types come in different sizes. For example, a blog text is a few kilobytes; voice calls or video files are a few megabytes; sensor data, machine logs, and clickstream data can be in gigabytes.

Example: the case of Sasstify, Inc. shows how quickly the volume aspect of Big Data can become overwhelming.

The complexity comes not only from the type of data but from its size too: 100 MB every four hours versus 1 MB per second makes a big difference in the amount of compute and associated processing cycles required.
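
To make the comparison concrete, here is a minimal sketch of the arithmetic in Python. The two rates come from the paragraph above; the decimal unit convention (1 GB = 1000 MB) and the helper name `mb_per_day` are assumptions made only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def mb_per_day(megabytes: float, interval_seconds: float) -> float:
    """Daily volume in MB for a source emitting `megabytes` every `interval_seconds`."""
    return megabytes * SECONDS_PER_DAY / interval_seconds


# Rates taken from the text above.
batch_feed = mb_per_day(100, 4 * 60 * 60)  # 100 MB every four hours
stream_feed = mb_per_day(1, 1)             # 1 MB every second

# Assumption: decimal units, 1 GB = 1000 MB.
print(f"batch feed:  {batch_feed:,.0f} MB/day")
print(f"stream feed: {stream_feed:,.0f} MB/day ({stream_feed / 1000:.1f} GB/day)")
print(f"ratio:       {stream_feed / batch_feed:.0f}x")
```

Under these assumptions the batch source produces 600 MB per day while the streaming source produces 86.4 GB per day, roughly 144 times as much, which is why the arrival rate matters as much as the data type.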

The most important point to think about here is from your organization’s point of view: What are some of the Big Data specifics that can fall into this category, and what are the complexities associated with that data?

Let us examine another consumer-oriented corporation and how they looked at this situation within their organization.

1. Sasstify, Inc. has been a leading photography and videography equipment manufacturer since 1975, providing industry-leading equipment for both commercial and personal use. The company thrived for over 20 years and was known for its superior customer service.
2. Sasstify employed traditional customer relationship management (CRM) techniques to maintain customer loyalty with incentives like club cards, discount coupons, and processing services. With the advent of Web 2.0 and the availability of the Internet, smartphones, and lower-priced competitive offerings, the customer base for Sasstify started declining.
3. The traditional decision support platform was able to provide trending, analytics, and KPIs, but was not able to support any causal analysis. Sasstify lost ground both in its customer base and in the stock market.
4. The executive management of Sasstify commissioned a leading market research agency to validate the weakness in the data that was used in the decision support platform. The research report pointed out several missing pieces of data that would have provided insights, including sentiment data, data from clickstream analysis, data from online communities, and competitive analysis provided by consumers. Furthermore, the research pointed to the fact that the company did not have a customer-friendly website and its social media presence was lacking; as a result, its connection with Gen X and Gen Y consumers was nearly nonexistent.
5. Sasstify decided to reinvent the business model, moving from product-centric to customer-centric. As a part of the makeover, the CRM system was revamped, the customer-facing website was redone, and a strong social media team was formed and tasked with creating connections with Gen X and Gen Y customers. Product research and competitive intelligence were areas of focus with direct reporting to the executive leadership.
6. As the business intelligence team started understanding the data requirements for all the new initiatives, it became clear that additional data was needed, and that the company had never dealt with this kind of data in its prior life cycle. The additional data sources documented included:

i. Market research reports

ii. Consumer research reports

iii. Survey data

iv. Call center voice calls

v. Emails

vi. Social media data

vii. Excel spreadsheets from multiple business units

viii. Data from interactive web channels

7. The bigger part of the problem was identifying the content and the context within the new data and aligning it to the enterprise data architecture. In its planning phase, the data warehouse and business intelligence teams estimated the current data to be about 2.5 TB and the new data to be between 2 TB and 3 TB (raw data) per month, which would be between 150 GB and 275 GB after processing. The team decided to adopt a scalable platform that could handle this volatility in the volume of data to be processed, and the options included all the Big Data technologies and emerging database technologies.

8. After the implementation cycle, the business intelligence teams across the enterprise were able to use the new platform to successfully plan the business model transformation. The key learning points for the teams included the following. First, a new data architecture roadmap and strategy are essential to understand the data, especially considering the volume.

i. Data volume will always be a challenge with Big Data.

ii. Data security requirements can be determined only after processing.

iii. Data acquisition is first and then comes the analysis and discovery.

iv. Data velocity is unpredictable.

v. Nontraditional and unorthodox data processing techniques need to be devised for processing this type of data.

vi. Metadata is essential for processing this data successfully.

vii. Metrics and KPIs are key to providing visualization.

viii. Raw data does not need to be stored online for access.

ix. Processed output needs to be integrated into an enterprise-level analytical ecosystem to provide better insights and visibility into the trends and outcomes of business exercises including CRM, inventory optimization, clickstream analysis, and more.

x. The enterprise data warehouse (EDW) is needed for analytics and reporting.
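
The planning estimates in item 7 can be turned into a rough capacity projection. The sketch below is only illustrative. The 12-month horizon, the decimal units (1 TB = 1000 GB), the assumption that only processed output is kept online (in line with learning point viii), the pairing of the low raw estimate with the low processed estimate, and the helper name `project_online_tb` are all assumptions, not part of the case study.

```python
# Figures from the planning estimate in item 7.
CURRENT_TB = 2.5                     # existing data
RAW_TB_PER_MONTH = (2, 3)            # low and high raw estimates
PROCESSED_GB_PER_MONTH = (150, 275)  # low and high estimates after processing


def project_online_tb(months: int, processed_gb_per_month: float) -> float:
    """Online footprint in TB if only processed output is added each month."""
    # Assumption: decimal units, 1 TB = 1000 GB.
    return CURRENT_TB + months * processed_gb_per_month / 1000


MONTHS = 12  # hypothetical planning horizon
low, high = (project_online_tb(MONTHS, gb) for gb in PROCESSED_GB_PER_MONTH)
print(f"online data after {MONTHS} months: {low:.1f} to {high:.1f} TB")

# Share of the raw volume that survives processing, pairing low with low
# and high with high (an assumption about how the estimates line up).
for raw_tb, processed_gb in zip(RAW_TB_PER_MONTH, PROCESSED_GB_PER_MONTH):
    share = processed_gb / (raw_tb * 1000)
    print(f"{raw_tb} TB raw -> {processed_gb} GB processed ({share:.1%})")
```

With these assumptions the online footprint grows from 2.5 TB to somewhere between 4.3 TB and 5.8 TB in a year, while well under a tenth of the raw monthly volume needs to stay online.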

The business model transformation brought with it a tsunami of data that needed to be harnessed and processed for meaningful insights. With a successful change in the data architecture and strategy, Sasstify was able to quickly re-establish itself as a leading provider of photography services, including products. With the new business model, the company was able to gain better insights into the expectations of its legacy and new-generation customers, market trends and their gaps, the competition as seen by the company and by its customers, and much more. There are nuggets of insight to be found in this extreme volume of information.

The point to pause and ponder is: how might your own organization adapt to new business models? What data might be out there that can help your organization uncover some of these possibilities?

Please share your thoughts in the comments :)
