Snowflake's Data Workload Orchestration in the Cloud
Snowflake's architecture is designed for scalability and separates functionality into three distinct layers:
[1] Storage Layer: This layer handles persistent data storage. Data is stored in a compressed, columnar format optimized for fast retrieval and querying. You can scale the storage layer independently based on your data volume requirements.
[2] Compute Layer: This layer handles query processing. Queries run on virtual warehouses, which are essentially cloud-based compute clusters that you provision and that Snowflake can suspend and resume automatically. These virtual warehouses process your queries and return the results. You can scale the compute layer independently based on your processing needs.
[3] Cloud Services Layer: This layer acts as the control center for Snowflake. It manages tasks such as user authentication, authorization, metadata management, and query routing. Snowflake manages this layer itself and scales it to handle your workload.
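To make the independent compute scaling described for the Compute Layer more concrete, here is a minimal sketch in Python that only builds the SQL text for creating and resizing a virtual warehouse. The warehouse name `reporting_wh`, the helper function names, and the shortened list of sizes are assumptions made for illustration. The statements would still need to be executed through a Snowflake client or worksheet, which is not shown here.

```python
# Illustrative helpers only: they build SQL strings and do not connect to Snowflake.
# The warehouse name and function names are hypothetical.

# A shortened list of warehouse sizes, kept small for the example.
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")


def _check(name: str, size: str) -> str:
    """Validate the inputs and return the size in upper case."""
    if not name.isidentifier():
        raise ValueError(f"unsafe warehouse name: {name!r}")
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return size


def create_warehouse_sql(name: str, size: str = "XSMALL", auto_suspend_s: int = 60) -> str:
    """Build a CREATE WAREHOUSE statement with auto-suspend and auto-resume."""
    size = _check(name, size)
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {int(auto_suspend_s)} AUTO_RESUME = TRUE"
    )


def resize_warehouse_sql(name: str, size: str) -> str:
    """Build an ALTER WAREHOUSE statement that changes only the compute size."""
    size = _check(name, size)
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}'"


if __name__ == "__main__":
    print(create_warehouse_sql("reporting_wh"))
    print(resize_warehouse_sql("reporting_wh", "large"))
```

Note that resizing touches only the warehouse: no statement about the stored data is needed, which is the practical meaning of scaling compute separately from storage.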
A Snowflake account is created on top of a hyperscaler cloud platform (AWS, Azure, or GCP). All data is stored in the blob storage of the cloud on which the Snowflake account is set up.
The Snowflake Cloud Services Layer keeps metadata about these raw data files and logs every change made to them. This history is maintained until the table's retention period ends. When the retention period is over, the data moves to the Fail-safe state and is no longer available for the user to read directly. The retention period is the window within which you can use Time Travel to query earlier versions of your data. For a Standard Edition account, the retention period is up to 1 day, while for an Enterprise Edition account it is up to 90 days.
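The retention rules above can be sketched as a small Python function that classifies a past version of the data by its age. This is an illustration of the timeline, not Snowflake's implementation. The function name `data_state` and the state labels are hypothetical, and the 7-day Fail-safe window is an assumption that applies to permanent tables.

```python
from datetime import datetime, timedelta

# Assumption: a permanent table, for which Fail-safe lasts 7 days.
FAIL_SAFE_DAYS = 7


def data_state(changed_at: datetime, now: datetime,
               retention_days: int, fail_safe_days: int = FAIL_SAFE_DAYS) -> str:
    """Classify a historical version of the data by its age.

    "time_travel": still inside the retention period, the user can query it.
    "fail_safe":   retention is over, the user can no longer read it directly.
    "purged":      both windows have passed.
    """
    age = now - changed_at
    if age <= timedelta(days=retention_days):
        return "time_travel"
    if age <= timedelta(days=retention_days + fail_safe_days):
        return "fail_safe"
    return "purged"


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    three_days_ago = now - timedelta(days=3)
    # Standard Edition style retention of 1 day
    print(data_state(three_days_ago, now, retention_days=1))
    # Enterprise Edition style retention of 90 days
    print(data_state(three_days_ago, now, retention_days=90))
```

With a 1-day retention period, a change made three days ago is already out of reach of Time Travel, while with a 90-day retention period the same change is still queryable.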
A Deep Dive into the Cloud Services Layer in Snowflake:
Here's an analogy:
Imagine the Cloud Services Layer as a high-end restaurant kitchen.
The Cloud Services Layer relies on the public cloud for resources but maintains its own internal functionalities for core operations. This separation allows Snowflake to offer a consistent user experience across different cloud platforms.