Snowflake advancing analytics
Each patient engagement with the health system leaves behind a trail of data about that particular visit and about how the individual's health record might look in the future. Analysing these interactions helps us make a quantifiable assessment of how patients and healthcare providers can use analytics to deliver more efficient, higher-quality, safer and more personalised care coordination.
This diverse set of health data includes medical records, disease registries, health surveys, administrative enrolment data and billing records from hospitals, clinical systems, physicians and health plans.
Data about patients and their records builds up quickly in our systems and is difficult to unify, integrate and analyse. To derive analytics from the data we receive, we must store, access, centralise and normalise all types of data and make it actionable for business intelligence and data science initiatives.
Traditional analytics today is built on a legacy data warehouse platform. Development cycles and analytical operations suffer: sending BI reports to customers in real time is difficult, and queries are slow because the data warehouse is often used not only for reporting but also for budgeting, planning and audit activity, which gives the system mission-critical status.
Today’s organisations are evolving and trying to solve problems related to exponentially growing data; the problem is complex, no debate. Organisations integrate data platforms with existing operational systems, many of which were developed decades ago, and the culture and processes built around those systems have hardened over the years. Even at Cerner, we had problems dealing with diverse types of data.
Analytics reporting with Snowflake data cloud
At Cerner, we have chosen Snowflake as our virtual data warehouse. It is a fully managed, true software-as-a-service (SaaS) offering backed by a fully SQL-compliant database designed for the cloud.
This provides flexibility, scalability and more configuration options without much administration and maintenance.
As it supports standard SQL like a relational database management system (RDBMS), Snowflake works seamlessly with enterprise BI tools like Tableau, BusinessObjects and Power BI, so there is little or no learning curve for experienced analytics users. The key features as used at Cerner:
Separation of storage and compute lets each scale independently, so customers can use and pay for storage and computation separately.
Cloning provides a convenient way to quickly take a snapshot of any table, schema or database and create a derived copy of that object, which initially shares the underlying storage. This is extremely useful for creating instant backups that incur no additional storage cost until the copy diverges from the original.
Data sharing allows customers to share (import/export) data across Snowflake accounts seamlessly, with decreased processing and development time.
The compute layer of the architecture contains multiple virtual warehouses, and every query runs on one virtual warehouse.
Read and write query integrity is maintained by the cloud services layer, which accesses the required virtual warehouse and its compute nodes.
Tight integration with cloud storage allows Snowflake to be moved between cloud providers by changing only the storage addresses, without affecting your customers.
For instance, if data is stored in Amazon S3, Google Cloud or Azure, you can create a Snowflake environment in each and then ingest the data using SQL commands and configuration.
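As a rough illustration of the cloning and ingestion features above, the following sketch builds the two SQL statements involved: `CREATE ... CLONE` for a snapshot and `COPY INTO` for loading staged files. The helper names, the table and stage names and the identifier check are assumptions made for this example, not Cerner's tooling; only the general statement shapes come from Snowflake SQL.

```python
import re

# Plain or dot-qualified identifiers only (hypothetical safety check for this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def _check_identifier(name: str) -> str:
    """Reject anything that is not a plain, optionally qualified identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def clone_statement(kind: str, source: str, target: str) -> str:
    """Build a clone statement for a table, schema or database."""
    kind = kind.upper()
    if kind not in {"TABLE", "SCHEMA", "DATABASE"}:
        raise ValueError(f"cannot clone object kind: {kind}")
    return f"CREATE {kind} {_check_identifier(target)} CLONE {_check_identifier(source)}"


def copy_into_statement(table: str, stage: str, file_type: str = "JSON") -> str:
    """Build a COPY INTO statement that loads files from a named stage."""
    file_type = file_type.upper()
    if file_type not in {"CSV", "JSON", "AVRO", "ORC", "PARQUET", "XML"}:
        raise ValueError(f"unsupported file type: {file_type}")
    return (
        f"COPY INTO {_check_identifier(table)} "
        f"FROM @{_check_identifier(stage)} "
        f"FILE_FORMAT = (TYPE = '{file_type}')"
    )


# Hypothetical object names, for illustration only.
print(clone_statement("table", "visits", "visits_backup"))
print(copy_into_statement("raw.readings", "device_stage"))
```

The generated statements would then be run through whichever Snowflake client is in use; the stage itself (pointing at S3, Google Cloud or Azure storage) is assumed to exist already.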
At Cerner, we leverage the tools provided by the Snowflake ecosystem to connect to the warehouse.
What makes Snowflake analytics-friendly in the data cloud
Table storage
Snowflake automatically partitions tables by grouping data into micro-partitions of 50-500 MB each, without the user having to define partition boundaries explicitly. It applies columnar compression and retrieves only the columns a query needs. At execution time, micro-partitions are further pruned using the metadata stored for each micro-partition.
A clustering key is recommended for tables holding more than 1 TB of data; it orders the records across micro-partitions by the key and is maintained by Snowflake itself. Snowflake periodically reorders the table to avoid unnecessary scanning of micro-partitions.
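To make pruning and clustering more concrete, here is a toy model, not Snowflake's actual implementation. Each "micro-partition" is just a small list of values with its minimum and maximum recorded as metadata, and a range query skips any partition whose range cannot match. The class, function names and patient IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Toy stand-in for a micro-partition: rows plus min/max metadata."""
    rows: list
    min_value: int
    max_value: int


def build_partitions(values, rows_per_partition):
    """Split values into fixed-size chunks and record each chunk's min/max."""
    partitions = []
    for start in range(0, len(values), rows_per_partition):
        chunk = values[start:start + rows_per_partition]
        partitions.append(MicroPartition(chunk, min(chunk), max(chunk)))
    return partitions


def scan(partitions, low, high):
    """Return matching rows and the number of partitions actually read."""
    matches, scanned = [], 0
    for part in partitions:
        # Prune: skip partitions whose [min, max] range cannot overlap [low, high].
        if part.max_value < low or part.min_value > high:
            continue
        scanned += 1
        matches.extend(v for v in part.rows if low <= v <= high)
    return matches, scanned


# Hypothetical patient IDs arriving in no particular order.
patient_ids = [42, 7, 91, 15, 63, 28, 84, 3, 56, 70, 19, 35]

unclustered = build_partitions(patient_ids, rows_per_partition=4)
# Sorting by the key before partitioning plays the role of a clustering key.
clustered = build_partitions(sorted(patient_ids), rows_per_partition=4)

matches_unclustered, scanned_unclustered = scan(unclustered, 60, 95)
matches_clustered, scanned_clustered = scan(clustered, 60, 95)
print(scanned_unclustered, scanned_clustered)
```

In this small example the unordered layout has to read every partition, while the layout sorted by the key reads only one, which is the effect a clustering key aims for on very large tables.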
Support for semi-structured data
Snowflake’s ability to combine structured and semi-structured data helps Cerner store geospatial data, which comes in multiple forms from sources such as machine-generated data, sensors and health devices. Snowflake supports ingestion of semi-structured data in formats such as JSON, Avro, Parquet and XML through the VARIANT data type, which imposes a 16 MB size limit per value.
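Because of the size limit mentioned above, it can be useful to check a document before loading it and to split long lists of readings into smaller documents. The sketch below is one way to do that, under these assumptions: the limit is treated as 16 * 2**20 bytes, the size is approximated by the compact UTF-8 JSON encoding, and the function names and device payload fields are hypothetical.

```python
import json

# 16 MB limit as stated in the text; treating 1 MB as 2**20 bytes is an assumption.
VARIANT_LIMIT_BYTES = 16 * 1024 * 1024


def fits_in_variant(document) -> bool:
    """Return True if the compact JSON encoding is within the size limit."""
    encoded = json.dumps(document, separators=(",", ":")).encode("utf-8")
    return len(encoded) <= VARIANT_LIMIT_BYTES


def split_readings(readings, batch_size):
    """Break a long list of readings into smaller lists of batch_size items."""
    return [readings[i:i + batch_size] for i in range(0, len(readings), batch_size)]


# Hypothetical device payload; field names are illustrative only.
reading = {
    "device_id": "hr-monitor-01",
    "bpm": 72,
    "location": {"lat": 39.1, "lon": -94.6},
}

print(fits_in_variant(reading))
print(split_readings(list(range(5)), batch_size=2))
```

The size Snowflake actually measures may differ from this estimate, so the check is best treated as an early warning rather than a guarantee.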
Result caching
Snowflake's architecture includes caching at various levels to help speed up your queries and minimise costs. When a query is executed, Snowflake holds its results for 24 hours. If the same query is executed again within that period, by the same user or by another user in the account, the results are already available to be returned, provided that the underlying data has not changed.
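The behaviour described above can be pictured with a small toy model: results are kept per query text, reused for 24 hours, and discarded once the underlying data changes. This is a simplified sketch rather than Snowflake's implementation; the class, its methods and the explicit `now` timestamps (in seconds) are assumptions made so the example stays deterministic.

```python
class ResultCache:
    """Toy result cache: entries expire after 24 hours or when data changes."""

    TTL_SECONDS = 24 * 60 * 60

    def __init__(self):
        self._entries = {}     # query text -> (result, stored_at, data_version)
        self.data_version = 0  # bumped whenever the underlying data changes

    def mark_data_changed(self):
        """Record that the underlying data changed, invalidating older results."""
        self.data_version += 1

    def run(self, query, execute, now):
        """Return (result, from_cache); `execute` is called only on a miss."""
        entry = self._entries.get(query)
        if entry is not None:
            result, stored_at, version = entry
            fresh = now - stored_at < self.TTL_SECONDS
            if fresh and version == self.data_version:
                return result, True
        result = execute(query)
        self._entries[query] = (result, now, self.data_version)
        return result, False


calls = []


def fake_execute(query):
    """Stand-in for running a query on a warehouse; counts executions."""
    calls.append(query)
    return len(calls)


cache = ResultCache()
query = "SELECT COUNT(*) FROM visits"  # hypothetical query

first = cache.run(query, fake_execute, now=0)       # computed
second = cache.run(query, fake_execute, now=3600)   # served from the cache
cache.mark_data_changed()
third = cache.run(query, fake_execute, now=7200)    # recomputed after the change
print(first, second, third)
```

The cache is shared across callers here because it is keyed only by query text, mirroring the point that another user in the same account can benefit from an earlier result.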
Cerner usage workflow
1. The data comes from disparate sources (unstructured, semi-structured, structured, geospatial) and is then extracted, loaded and transformed so that solutions can consume the processed data.
2. The data is then loaded into the Snowflake warehouse.
3. Solutions like HealtheAnalytics and Analytics Query Tool query and retrieve the data in real time, allowing the customer to process the dataset and data model and display BI reports.
4. Snowflake provides a platform that lets users focus on BI projects while accelerating business intelligence tool performance. Cerner enables real-time analysis, discovery and reporting on data to help the customer make more informed business decisions.
5. The BI tools that Cerner uses, like Tableau and SAP BusinessObjects, connect to Snowflake and get real-time data that our customers use to generate different kinds of health reports.
6. With the above integrations with Snowflake, we were able to solve the problems related to storing, managing and deriving analytics in real time.