Buy, don't build
One of the principles of enterprise architecture is that we should buy applications rather than build them in-house. The same principle applies in data architecture. A data platform consists of many components: data stores, file stores, data ingestion tools, a data catalog, reporting tools, and so on. It is clearly not practical to build all of them in-house. We have to buy those tools and platforms from another company, off the shelf so to speak.
But the data lake or data warehouse itself cannot be bought off the shelf. Say you are a fashion retailer or a cake factory. There is no off-the-shelf data lake or data warehouse out there which can provide you with an analysis of your fashion sales numbers or your cake cost breakdown. Why? Because it does not have your sales data or your production cost data. So your company will have to build it itself. You need to build data pipelines to bring the various data into your data lake/warehouse, and build all the reports/dashboards which support your business analysis of sales and costs. That, my friends, has been going on for 30 years, and will still be happening in the next 30 years.
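To make the split concrete, here is a minimal sketch in Python. The data store (SQLite from the standard library, standing in for a purchased platform) comes ready-made, while the table layout, the load step and the report query have to be written around your own data. The sample rows, the `fact_sales` table and the function names are all hypothetical, invented for this illustration.

```python
import sqlite3

# Hypothetical rows exported from a fashion retailer's own sales system.
# The column layout is an assumption made for this sketch.
SALES_ROWS = [
    ("2024-01-05", "dress", 2, 59.90),
    ("2024-01-05", "scarf", 1, 19.50),
    ("2024-01-06", "dress", 1, 59.90),
]


def load_sales(conn, rows):
    """Pipeline step: land the company's own sales rows in the warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales ("
        "sale_date TEXT, product TEXT, quantity INTEGER, unit_price REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def sales_by_product(conn):
    """Report step: units and revenue per product, as a dashboard might query."""
    cursor = conn.execute(
        "SELECT product, SUM(quantity) AS units, "
        "ROUND(SUM(quantity * unit_price), 2) AS revenue "
        "FROM fact_sales GROUP BY product ORDER BY product"
    )
    return cursor.fetchall()


# The database engine is the "bought" part; everything above is the "built" part.
conn = sqlite3.connect(":memory:")
load_sales(conn, SALES_ROWS)
report = sales_by_product(conn)
for product, units, revenue in report:
    print(product, units, revenue)
```

A cake factory would need a different table and a different query, which is the point: no vendor can ship these in advance.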
So it is all very well to say that your company, as a principle, does not build systems or applications but buys them instead. In reality, there is one category of system which you can't buy and have to build: the data warehouse/BI/data lake. The other category which you can't buy is the AI system. Whether it is forecasting your sales, clustering your customers and products, or processing credit applications, any AI system has to be built. You can't buy it off the shelf, just like the DW/BI/DL. Of course, you can hire a company or a team of contractors to build it for you.
So the principle of buy, don't build applies to the data stores, file stores, data ingestion tools, data catalog, reporting tools, etc. But the data warehouse/BI/data lake/AI systems themselves will have to be built in-house. You cannot buy them off the shelf.
Data Eng | DeltaLake | Databricks | AI & BI - Views are mine
1 year ago
Had to laugh once. Someone said to me "or get the off-the-shelf enterprise data warehouse SAP"... oh yeah, that's what they sell you, then you pay the cost of a small country in services and plugins configuring it to your specific requirements. You're trading bespoke and nimble for fat, overcomplicated configuration. Nearly every successful data stack I've worked on (and we all know most of them historically fail) ditched the "enterprise". Model and build data marts using agile for specific business requirements, and conform their dimensions together so you can drill across when you need to. So many "enterprise" data warehouses cost a small fortune, are modelled on the transactional ops process instead of strategic business processes and decision making, and ultimately are a fat waste of money.
Data Consultant, Advisor, Leader, Mentor, Data Architect, Data Engineer, Community Organiser, Charity Trustee
1 year ago
From a data warehouse/lake point of view, business rules can be so bespoke that even off-the-shelf packages require customisation. This can still be better than a build from scratch... until it gets to the point that the vendor needs you to upgrade. That can be just as costly as a build, depending on the level of customisation.