Big Data Means "No More Modeling"... Right?
Brian Farish
#BigID | IT Consulting | Information Management | Data Architecture | Data Modeling | Data Management/Governance | Thoughts -> Words -> Design -> Technology
Everybody wants to get applications running as quickly as possible. Unfortunately, one of the most time-consuming parts of effectively implementing a useful application is the part where you actually figure out what the application should be doing and what the data is actually supposed to mean.
There are a variety of technical solutions that can facilitate the identification, collection, analysis and structuring of data and information requirements. With the recent proliferation of "big data" tools, I found myself wondering (very quietly, because no one wants to be the "slow one in the herd" about figuring this stuff out) how to tell the difference between situations where "big data" tools and approaches are appropriate and situations where data modeling activities are recommended (or even required).
While the name "Big Data" emphasizes the quantity of data involved, the fact is that there are some pretty big "structured" data sets out there (e.g. telephone company transaction data, stock market transaction data). The quantity of the data is not the biggest difference between big data and structured data. The biggest difference I've noticed is the undefined or semi-defined nature of the data sets that folks are calling "big data" (e.g. social media data, emails, video streams, audio).
While the automated capabilities of the tools being aimed at these big data feeds are becoming more impressive with each passing day, the work still boils down to pattern analysis, pattern matching and affinity analysis. The results therefore require a fairly high level of expertise on the part of the end user to be interpreted accurately.
With these assumptions in mind, here's what I came up with (and please feel free to share in the comments below):
Big Data
Consumer Expertise Level Required: High
Typical Consumer Roles: Data Scientists, Data Analysts, Power Subject Matter Experts
Summary: Basically highly trained (and typically expensive) resources.
Structured Data
Consumer Expertise Level Required: Low and possibly automated
Typical Consumer Roles: Clerical, Web Services, Enterprise Services, Executives
Summary: Basically people (and technology) needing things very clearly communicated to them with as little ambiguity as possible.
Now, there might be those who would say "But we can just put a front end on the big data and..." ...and now we are right back to the need for analyzing the data and documenting its meaning (aka "data modeling").
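To make that point concrete, here is a minimal Python sketch, not a definitive implementation. Every name in it (`CallRecord`, `raw_posts`, `mentions_phone`) and all of the sample data are hypothetical, invented purely for illustration. It contrasts a structured record, whose meaning is agreed on up front, with a semi-defined feed, where the consumer has to write an interpretation rule before any "front end" can show a number.

```python
from dataclasses import dataclass


# Structured: the meaning of each field is agreed on up front (the data model).
@dataclass(frozen=True)
class CallRecord:
    caller: str
    callee: str
    duration_seconds: int


def total_duration(records):
    """A low-expertise consumer can use this total without interpretation."""
    return sum(r.duration_seconds for r in records)


# Semi-defined: a hypothetical feed of free-form posts with no agreed schema.
raw_posts = [
    {"text": "Love my new phone!", "user": "a"},
    {"body": "phone battery died again", "author": "b"},
    {"text": "great weather today"},
]


def mentions_phone(post):
    """Interpretation rule the consumer has to invent: where the text lives
    and what counts as a 'phone mention'. Writing this rule down is itself
    a small act of data modeling."""
    text = post.get("text") or post.get("body") or ""
    return "phone" in text.lower()


calls = [
    CallRecord("555-0100", "555-0101", 120),
    CallRecord("555-0101", "555-0102", 45),
]
structured_total = total_duration(calls)
phone_post_count = sum(mentions_phone(p) for p in raw_posts)
```

The structured total needs no explanation, while the post count is only as trustworthy as the assumptions baked into `mentions_phone`. Any front end built on the feed would have to settle and document those assumptions first.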
So, my take... use the right tool for the right job and accept that there is no free lunch. Be prepared to invest at either the front end (structured analysis) or the back end (big data).
Nice ... also, you will pay with the Customer disruption/dissatisfaction if either the front end or back end ignores the Vision of what the Customer wants. Well stated Brian and thanks. I will now steal this for later use. Chris Church ... he is not kidding about the iPhone 6
Experienced Operations and Business Intelligence Leader
(10 years ago) Amen ... and amen! You took the thoughts right out of my head when you wrote, "very quietly, because no one wants to be the 'slow one in the herd' about figuring this stuff out." I keep thinking that I'm missing something in this big data conversation, but I think you nailed it.
Data Strategy Consultant at 303Computing
(10 years ago) Nice write-up Brian. This post has forced me into the following observation/summarization: "Structure me now, or structure me later" :)