Turning your data into a product
Eli Steinberger
VP AI Products | Multidisciplinary Product Manager | Business Tactics | Founder | Machine Learning Strategist | Fintech | Quantitative Models Developer | Leading Impactful AI Deliverables End to End
"Data is the new oil" - a quote from 2006 that took the headlines in 2019. Now, what should we do with this "oil"? Build a drill, a refinery, or feed a new combustion engine to ship commodities across the globe?
You've decided to transform your data into an internal, revenue-improving resource. Good idea. We've all heard by now that "data is the new oil". However, before building it as a product, we need to treat it as a product: a new product, an independent venture hoping to penetrate a market.
Why? Well - just like any other product, your data needs to solve a problem or create a new opportunity. It's a startup within your company. Keeping the oil comparison in mind, you have to decide and plan ahead: are you going to sell the crude oil? Refined gasoline? Are you creating a new venture of motor engines? (I'll let the true visionaries imagine how they are going to ship commodities across the globe with those engines... start with the basics).
In order to answer this question, and to derive the tactics that follow from it, you need to figure out your motivation. Why are you venturing out on this journey? Is it because the competition is doing it? Is it because management caught this new "bug"? Or do you have a gut feeling that you can significantly increase your performance by mastering the data you hold? (If you find yourself lacking motivation, see this article by Franklin Morris from Sisense on how data can transform your business.)
If your focus is first and foremost on improving marketing, I'd recommend reading DMI's blog.
As Marty Cagan mentions many times, a true product manager needs to act as the CEO of the product; data product managers need to act exactly as such, with an even more entrepreneurial spirit. You are leading a new product for the company. Will it be internal? External? What pains, needs or opportunities do we wish to address with it? Are we answering questions? Empowering decision makers? Or are we looking to create additional impact for our users?
To widen the spread of considerations, we should examine a wider vision and ask many hypothetical questions about the impact we wish our data product to create. Following the Data = Oil line of thinking, we could compare raw/clean data to the oil, cloud services to the drilling method, data-science/machine-learning processes to refineries, BI/analytics to products (ranging from refined gasoline to plastic), and new product lines to combustion engines. So whether you're going to gather and sell "crude oil" or evangelize a new "plastic" industry, the process starts in the same manner.
The main steps of creating this "internal startup" are detailed beautifully in Ulla Kruhse-Lehtonen's blog. Here I'll map out the process steps, so you can start identifying gaps and create a road map for implementation, borrowing the commonly used ETL abbreviation from big data: Extract, Transform and Load:
Extract (in our case, gather the data) - in most companies you should start with data you already log. It is often tempting to store all possible data, with storage costs being so low and still dropping, and it may very well be a good default. However, there's a trade-off: more available data may result in paralysis by analysis at a later stage. Re-evaluate this resource periodically, separating the wheat from the chaff.
- Store existing data as-is, with a unique source id and timestamps (a minimal sketch of this follows the list below)
- Identify your data limits (cost? size? security? language? latency?)
- Analyze data dirt vs. storage costs, so you can decide how much effort, if any, should be put into minimizing the data stored
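To make the "store as-is" bullet concrete, here is a minimal sketch in Python. The event structure, the JSON-lines file and the helper names are my own illustrative assumptions, not a prescription for any particular stack; the point is only that every raw record gets a unique id, a source id and a timestamp before being appended, untouched, to storage.

```python
# A minimal sketch of "store as-is": each raw record is wrapped with a
# unique id, a source id and an ingestion timestamp, then appended to a
# JSON-lines file (a stand-in for whatever sink your stack uses).
import json
import uuid
from datetime import datetime, timezone

def wrap_raw_record(record: dict, source: str) -> dict:
    """Attach provenance metadata without touching the raw payload."""
    return {
        "record_id": str(uuid.uuid4()),   # unique id per stored record
        "source_id": source,              # where the data came from
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "payload": record,                # the raw data, stored as-is
    }

def append_to_store(record: dict, source: str, path: str = "raw_events.jsonl") -> None:
    """Append one wrapped record to the store."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(wrap_raw_record(record, source)) + "\n")

append_to_store({"user": "u-42", "action": "click", "page": "/pricing"}, source="web-app")
```

Keeping the payload untouched means you can always re-process history later, when your Transform step evolves.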
Transform - create a process where data is organized into a natural, basic language. These are not insights or conclusions yet, but a common, clean language transforming the raw data into a normalized structure.
- Clean the data into comprehensible tables and columns, with short documentation alongside. This data will often need normalizing into a proper, consistent format and language.
- Compromise - data is always messy. Never aim for "clean data", because you'll never get it. Never settle for dirty data, because GIGO. And make it a rule to always work within a sandbox, on a sample that is ~1% of your entire storage, before taking even the smallest step forward (see the sketch after this list).
- Create a Data Model. This can be anything from an XML to a presentation or a spreadsheet. It's a stand-alone document, centralized and maintained by you, with one purpose: aligning all stakeholders on this new lingo. The Data Model is the first object that needs to be familiar to most stakeholders within the organization. Whether you're a search engine like Pipl or Experian, a marketing-tech company learning users' habits to attribute monetization like AppsFlyer, a high-end user-oriented mogul like Netflix or Amazon, or a company storing anonymous trip data like Otonomo - most stakeholders, and ALL product managers, need to be aware of what data types and fields you have.
- Communicate internally - not only with your team of product analysts and data scientists. Discuss with your Product peers, R&D team leaders, Sales, Marketing, Business Development, Customer Success, Growth - present what you have, with examples.
- Brainstorm questions - remember that you are the CEO of the data? Good! This is your road map for an MVP. This is your time to map Known Unknowns and how to take them apart. As a group, identify the main hypotheses and assumptions you have about the data, and mainly about the impact. What can you unveil with this data? What are the most important questions you think you could answer? How far do we think this data could take us? The output of this process is your backlog of MVPs: the prioritized list of ideas you are going to test, validate and research.
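To ground the Transform bullets, here is a minimal sketch using pandas (an assumption; any dataframe tool works). The country mapping stands in for a tiny Data Model, dirty values are coerced rather than allowed to crash the pipeline, and the work happens on a ~1% sandbox sample as suggested above. All column names and values are invented for illustration.

```python
# A minimal Transform sketch: normalize raw fields into one consistent
# lingo, and work on a ~1% sandbox sample before touching the full store.
import pandas as pd

raw = pd.DataFrame({
    "Country": ["US", "usa", "United States", "IL"],
    "amount ($)": ["10.5", "7", None, "3.2"],
})

# A tiny stand-in for the Data Model: one agreed spelling per concept.
COUNTRY_LINGO = {"us": "US", "usa": "US", "united states": "US", "il": "IL"}

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame()
    out["country"] = df["Country"].str.strip().str.lower().map(COUNTRY_LINGO)
    # Dirty values become NaN instead of crashing the pipeline (the "compromise").
    out["amount_usd"] = pd.to_numeric(df["amount ($)"], errors="coerce")
    return out

# Work in a sandbox on ~1% of storage (the toy frame is too small to sample).
sandbox = raw.sample(frac=0.01, random_state=42) if len(raw) >= 100 else raw
clean = normalize(sandbox)
print(clean)
```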
Load, or if I may: LOAD (Link, Observe, Analyze, Derive) - after finishing the brainstorming session on potential data leads, these are the execution stages/sprints (a sketch of the loop follows the list below). Each study needs to "move a needle"; don't let FOMO lead you into analyzing useless information and chasing the wrong questions. Now you need to find the data that will create impact:
- Link - apply the prioritization decided in the previous step to the real-time data gathered; link it with other fields along the same time frame (if relevant)
- Observe - demonstrate what a hypothesis about the data really looks like (graphs are always better than words). Do we have what we anticipated in the raw material? Do we see (at least) an initial correlation between the data and the predicted impact?
- Analyze - if the first two steps were successful, you need to check your hypothesis in the fuller landscape. You know that correlation doesn't mean causation; make sure you truly understand the connection between the data and the implications. Don't be tempted to run complicated machine learning tools (if you "have to", use fast tools like Dataiku, DataRobot or DominoDataLab, for example). Don't let the excellent defeat the good: simpler is usually better, especially in the first iterations.
- Derive conclusions - now your internal startup meets the "market". It will almost always be an internal audience first, but that doesn't reduce the impact one bit. If you communicated well with the stakeholders, this is the beginning of a very successful product, one that will always be essential and impactful. As with the Lean approach: measure fast, measure often, and improve your questions.
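Here is a hedged sketch of the Link-Observe-Analyze part of the loop, assuming pandas and matplotlib; the tables, the columns and the ad-spend/signups hypothesis are invented for illustration. Link joins two data sets on a shared time frame, Observe turns the hypothesis into a graph, and Analyze starts with a plain correlation before any heavier machinery.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Two invented data sets sharing a time frame.
signups = pd.DataFrame({"day": pd.date_range("2024-01-01", periods=30),
                        "signups": range(30)})
campaign = pd.DataFrame({"day": pd.date_range("2024-01-01", periods=30),
                         "ad_spend": [100 + 3 * i for i in range(30)]})

# Link: join on the shared time frame.
linked = signups.merge(campaign, on="day")

# Observe: graphs are always better than words.
linked.plot(x="ad_spend", y="signups", kind="scatter")
plt.savefig("observe.png")

# Analyze: a simple correlation first - correlation is not causation,
# but a near-zero value kills the hypothesis cheaply.
print("correlation:", linked["ad_spend"].corr(linked["signups"]))
```

If the correlation is near zero, the hypothesis dies cheaply and you move on to the next item in the backlog; that is the measure-fast, measure-often loop from the Derive step.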
The last step, or even a thought process if you get stuck on your way: Fantasy Data. Ask yourself: across my entire operational environment, what data, if I had it clean and precise, would really create value for our clients, internally or externally? Prioritize by complexity (or cost) and importance, and go fetch it.
There are many issues I haven't touched on here that you should be aware of in your data strategy. Any use of data must handle both privacy laws and the company's image in the eyes of its users; there are data engineering concerns (in retrieval, processing and storing), maintainability concerns, integration with legacy tools and sources, security concerns, Data Debt vs. Technical Debt in your organization, vanity (data) metrics and their hazardous potential to skew decisions, etc. Last but not least: execution pains, since management will intervene in your backlog with the same vigor and volatility with which the CEO of an early-days startup interferes with the development backlog.
Most importantly, always remember that whatever your process is, data alone can't always provide value, since data won't answer the "why" question. Data only answers the pragmatic questions of what happened and when. Data would only have told Henry Ford about the market's horses and their owners, not the reasons horses were used.
Exactly like classic product management, data product management is filled with landmines and considerations you need to weigh. Make sure you are the CEO of your data.
Good luck, and may the logs be with you