Is SAP HANA An Expensive Data Storage Option?
Introduction
In a recent conversation the subject of data storage came up, with unfavourable comments being made about SAP costs versus other storage options. Not being privy to the quotes received, I was in no position to argue; however, reflecting on the conversation, I could not help but wonder if a fair comparison had been made. After all, in the physical world, my garden shed is ‘storage’ but it is a far cry from an automated warehouse full of clever gizmos such as PreBilt.
When you talk about ‘SAP’ and ‘Data’, then the likelihood is that the data will, one way or another, end up in SAP’s database: HANA. In this article I highlight some (and by no means all) features of the HANA database and its overall context which contribute to its value proposition.
DB Compression
Straight out of the blocks, a gigabyte-for-gigabyte comparison of data storage is pretty meaningless. I have overseen a few migrations from other, traditional, databases to HANA. In my experience, HANA database compression results in a six- to seven-fold reduction in disk space consumption compared to a database which does not use compression.
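To make the point concrete, here is a minimal sketch of the arithmetic. The prices are hypothetical placeholders, not real quotes, and the function name is my own; the only figure taken from the text above is the compression factor.

```python
def effective_cost_per_source_gb(price_per_stored_gb, compression_factor):
    """Cost of holding 1 GB of uncompressed source data once compression is applied."""
    if compression_factor <= 0:
        raise ValueError("compression_factor must be positive")
    return price_per_stored_gb / compression_factor


# Hypothetical list prices per stored GB, for illustration only.
compressed_db_price = 7.0    # database that compresses data roughly 6.5x
uncompressed_db_price = 2.0  # database that stores data as-is

compressed_effective = effective_cost_per_source_gb(compressed_db_price, 6.5)
uncompressed_effective = effective_cost_per_source_gb(uncompressed_db_price, 1.0)

print(f"Compressed database:   {compressed_effective:.2f} per source GB")
print(f"Uncompressed database: {uncompressed_effective:.2f} per source GB")
```

With these made-up numbers, the option with the higher price per stored gigabyte turns out to be the cheaper one per gigabyte of source data, which is why the headline price alone tells you little.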
Stellar Performance
Rather than the row-based storage typical of traditional relational databases, HANA utilises column storage for most of its tables. Additionally, ‘hot’ tables (more on that later) are loaded into memory. Calculation results are also stored in memory.
These features result in system performance which is far faster than a traditional database. Exactly how much faster depends on what you are comparing it to. My anecdotal evidence suggests HANA is roughly ten times faster for inserts and one hundred to a thousand times faster for reads when compared to traditional big-name databases.
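For readers who have not met column storage before, the sketch below shows dictionary encoding, a technique commonly used by column stores in general. It is a toy illustration of the idea, not a description of HANA's internals; the function names and sample data are my own.

```python
def dictionary_encode(column):
    """Replace each value with a small integer ID pointing into a sorted
    dictionary of the column's distinct values."""
    dictionary = sorted(set(column))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in column]


def count_equal(dictionary, encoded, value):
    """Count matching rows by comparing integer IDs rather than original values."""
    if value not in dictionary:
        return 0
    target = dictionary.index(value)
    return sum(1 for code in encoded if code == target)


# One column of a table, stored on its own rather than row by row.
countries = ["GB", "DE", "GB", "FR", "GB", "DE"]
dictionary, encoded = dictionary_encode(countries)

print(dictionary)  # distinct values, stored once
print(encoded)     # one small integer per row
print(count_equal(dictionary, encoded, "GB"))
```

Because repeated values collapse to small integers, the column takes less space, and a filter only has to scan one compact list instead of every field of every row.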
Self-managing
In the bad old days, database management involved a lot of manual intervention such as rebuilding indexes and reorganising tables. HANA controls many of its own processes. For example, HANA manages its own memory, choosing which tables and calculation results should be kept and which are of lower importance and can be ‘unloaded’. Similarly, the amalgamation of new data with existing data is handled as a staged approach called delta merging. Again, this process is self-managed to ensure that data is always available and the underlying data operations happen transparently to the user.
This capability automatically optimises the size of the system and reduces the overhead of running it.
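The staged approach behind delta merging can be sketched in a few lines. This is a deliberately simplified model under my own assumptions (a sorted ‘main’ list and an append-only ‘delta’ list, with a hypothetical class name); the real mechanism is considerably more sophisticated.

```python
import bisect


class ColumnWithDelta:
    """Toy column with a read-optimised main store and a write-optimised delta."""

    def __init__(self):
        self.main = []   # kept sorted, cheap to search
        self.delta = []  # append-only, cheap to write

    def insert(self, value):
        # New data only ever touches the delta, so writes stay fast.
        self.delta.append(value)

    def contains(self, value):
        # Reads consult both stores, so new data is visible before any merge.
        i = bisect.bisect_left(self.main, value)
        in_main = i < len(self.main) and self.main[i] == value
        return in_main or value in self.delta

    def merge_delta(self):
        # Periodically fold the delta into a freshly built main store.
        self.main = sorted(self.main + self.delta)
        self.delta = []


column = ColumnWithDelta()
column.insert(5)
column.insert(3)
visible_before_merge = column.contains(3)
column.merge_delta()
```

The point to take away is that the user never waits for the merge: data is readable the moment it is written, and the tidy-up happens in the background.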
Structured and Semi-Structured Data
Data comes in different formats. The data that the ERP world is most familiar with is structured data. This typically comes in tables with defined columns and field values. This data is the staple of the data warehouse and is what is most commonly found in HANA systems.
Unstructured data is usually generated in vast volumes and is stored in a data lake. For the record, comparing the cost of storage of a data lake against the cost of HANA would be like comparing the cost of a tractor against a Ferrari.
There is also semi-structured data, which is not formatted as tables with columns but has a structure defined within the data itself. HANA is able to work with semi-structured data through its JSON document store. This capability gives HANA a significant advantage as it can bring data generated by transaction processing systems together with data generated through, for example, internet forms.
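As a small illustration of what bringing the two kinds of data together means, the sketch below combines structured order rows with JSON documents from a web form. It uses plain Python rather than HANA, and the field names and sample records are invented for the example.

```python
import json

# Structured data: every row has the same fixed columns.
orders = [
    {"customer_id": 1, "order_total": 120.0},
    {"customer_id": 2, "order_total": 80.0},
    {"customer_id": 1, "order_total": 30.0},
]

# Semi-structured data: each document carries its own structure,
# and optional fields may simply be absent.
form_submissions = [
    '{"customer_id": 1, "feedback": {"rating": 4, "tags": ["delivery"]}}',
    '{"customer_id": 2, "feedback": {"rating": 2}}',
]


def ratings_by_customer(raw_documents):
    """Extract the rating from each JSON document, skipping any without one."""
    ratings = {}
    for raw in raw_documents:
        doc = json.loads(raw)
        rating = doc.get("feedback", {}).get("rating")
        if rating is not None:
            ratings[doc["customer_id"]] = rating
    return ratings


def spend_with_rating(order_rows, raw_documents):
    """Total spend per customer, paired with that customer's form rating."""
    ratings = ratings_by_customer(raw_documents)
    totals = {}
    for row in order_rows:
        cid = row["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + row["order_total"]
    return {cid: (total, ratings.get(cid)) for cid, total in totals.items()}


combined = spend_with_rating(orders, form_submissions)
print(combined)
```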
Machine Learning
Having data is one thing. Doing something with it is another. The HANA database provides an environment which supports far more than the expected basic calculations. Embedded in HANA is the Predictive Analysis Library (PAL), which supports the most commonly required ML algorithms. If more complexity is required, it also supports integration with R and with external machine learning frameworks such as TensorFlow.
SAP uses this capability extensively in S/4HANA to improve processes and reduce manual workload, but it is also available to data scientists working with HANA natively or through connected products.
Graph Engine
Another advanced data analysis approach that HANA supports is graph processing. The SAP HANA Graph Data Model allows data scientists to analyse relationships between data entities. Entities in this situation could be people in social networks, geographic locations or any sort of networked or hierarchical data. This has use cases such as fraud detection, supply chain transparency, product recommendation and geo-spatial analytics, such as predicting the best store sites or likely locations for activities.
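To give a flavour of the kind of question graph processing answers, here is a minimal sketch of a breadth-first search that finds everything within a given number of hops of a starting entity, the sort of query a fraud investigation might begin with. It is generic Python, not the HANA graph engine, and the account names are made up.

```python
from collections import deque


def within_hops(edges, start, max_hops):
    """Return {entity: hop_count} for every entity reachable from `start`
    in at most `max_hops` steps, treating relationships as two-way."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not explore beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    distance.pop(start)
    return distance


# Hypothetical relationships between accounts (e.g. shared address or payment card).
links = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
print(within_hops(links, "A", 2))
```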
Data Tiering and Federation
You don’t actually have to store the data in HANA to get the benefit from HANA and its environment. Dynamic Tiering is a feature that allows you to identify data as having different performance requirements or ‘temperatures’ and to store them accordingly. Hot data is stored in memory, warm data on disk and cold data in near-line storage.
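The idea of data ‘temperature’ can be expressed as a simple rule. The sketch below assigns a tier based on how recently data was accessed; the thresholds and the function name are my own assumptions for illustration, and in practice the criteria would be set by the business.

```python
def assign_tier(days_since_last_access, hot_days=30, warm_days=365):
    """Map data age to a storage tier. Thresholds are illustrative only."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access cannot be negative")
    if days_since_last_access <= hot_days:
        return "hot"    # kept in memory
    if days_since_last_access <= warm_days:
        return "warm"   # kept on disk
    return "cold"       # moved to near-line storage


# Hypothetical tables with the number of days since they were last read.
last_access = {"current_orders": 2, "last_year_orders": 200, "archive_2015": 3000}
tiers = {table: assign_tier(days) for table, days in last_access.items()}
print(tiers)
```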
Alternatively, data can be accessed wherever it currently resides through functionalities such as HANA Smart Data Access, HANA Smart Data Integration and BTP integration options.
The Bigger Picture
Storing data in a database is the bottom-most layer of a range of functionalities that are required to support a data-driven organisation. The database cannot be viewed in isolation and the successful integration of the database with the layers that sit above it is critical.
Whilst it is possible to buy the complete set of capabilities from different vendors based on price alone, the vendor management overhead and overall solution risk increase with this approach. Incompatible systems, divergent development and finger pointing during issue resolution are all potential challenges.
The HANA DB is the bedrock for a holistic approach to data delivery which spans the range of capabilities needed to produce a full data fabric. Over and above the database, one must consider:
Data Warehousing
This is the application sitting on top of a database to ingest, transform and make ready data for consumption.
The HANA DB supports not one, but three options for this use case: native HANA DB, SAP Business Warehouse (including BW/4HANA and BW on HANA) and Datasphere. These can be used individually or in concert, allowing the customer their best blend of on-premises and cloud solutions. Comparing these different data warehousing options could be an article in its own right, so I shall resist the temptation to expand on it here.
Analytics
Once you have acquired your data and transformed it, you have to present it to your users. SAP’s offering in this department is SAP Analytics Cloud. SAP seeks to differentiate this product through its ‘smart’ capabilities: augmented analytics, gen AI dashboards, predictive planning and natural language processing, to name just a few. These functionalities are possible because of the tight integration between the front-end product and the provisioning database, HANA.
Data Democratisation
Zooming out even further, the modern data nirvana is a data democracy where all users have access to the data that they need, wherever it resides and howsoever it is created. This drives complex requirements for data pipelines, system-to-system integration, access and security. SAP have all this covered with their Business Technology Platform (BTP). The Identity Authentication Service controls who can access what, Datasphere takes care of your pipelines, and the Integration Suite ensures you have the right tools to connect your systems together.
Data Governance and Management
Numbers on a screen are not helpful if you don’t know where they came from, how they were calculated or if a colleague has a different number with the same label. Once again, SAP have accounted for these requirements with SAP Datasphere providing users with data lineage and cataloguing functionality. This in turn can be consumed through partner tools such as Collibra to provide the vital context that business users need.
Conclusion
Returning to my physical world analogy, with SAP you are not only getting a state-of-the-art warehouse, but you are also putting yourself into an environment which has round-the-clock staffing, a packing service, a fleet of lorries, a planning and analytics capability (literally)... a full end-to-end service. Better still, if you really want to, you can keep some of your inventory in the shed till you need it.
Few of the considerations listed above will appear on any data storage price comparison sheet. More importantly, when you put data into a HANA database, you are not simply buying ‘storage’. HANA, through its own native capabilities and its position in a suite of best-of-breed products, gives you an exceptional opportunity to extract value from your data.
If you have different views, I would love to hear about them in the comments. Alternatively, feel free to contact me directly. For general info, see our website: https://www.nac-it.com/.