Snowflake
What is a Snowflake data warehouse?
Snowflake is the first analytics database built for the cloud and delivered as a data warehouse as a service. It runs on popular cloud providers such as AWS, Azure, and Google Cloud Platform. There is no hardware (virtual or physical) or software to install, configure, or manage; it runs entirely on public cloud infrastructure. It's ideal for data warehousing, data engineering, data lakes, data science, and developing data applications. What sets it apart is its architecture and its data sharing capabilities.
What is Snowflake Architecture?
Snowflake architecture is built for the cloud. Its multi-cluster, shared data architecture delivers the performance, concurrency, and elasticity that organizations require. It handles all aspects of authentication, resource management, optimization, data protection, configuration, availability, and more. Snowflake features compute, storage, and global services layers that are physically separated but logically integrated.
Architecturally, the Snowflake data warehouse consists of three key layers:
#1 Database storage in Snowflake
Snowflake stores all data in databases. A database is a logical grouping of objects, consisting primarily of tables and views, organized into one or more schemas. Snowflake can store both structured and semi-structured data, and all data-related tasks are handled through SQL operations. When Snowflake runs on AWS, the underlying storage is Amazon S3 in Snowflake's own account, where data is encrypted, compressed, and distributed to optimize performance.
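As a rough illustration of the database, schema, and table hierarchy described above, the following Python sketch only builds the SQL text one might send to Snowflake; it does not connect to anything. The names demo_db, raw, and events and the helper storage_ddl are hypothetical, and the VARIANT column is assumed here as the way to hold semi-structured JSON.

```python
def storage_ddl(database: str, schema: str, table: str) -> list[str]:
    """Return Snowflake SQL statements that create a database, a schema, and a table."""
    qualified = f"{database}.{schema}.{table}"
    return [
        f"CREATE DATABASE IF NOT EXISTS {database};",
        f"CREATE SCHEMA IF NOT EXISTS {database}.{schema};",
        # id holds structured data; payload (VARIANT) holds semi-structured JSON
        f"CREATE TABLE IF NOT EXISTS {qualified} (id NUMBER, payload VARIANT);",
        # Path notation reads one field out of the JSON and casts it to a string
        f"SELECT id, payload:user.name::STRING AS user_name FROM {qualified};",
    ]


# Hypothetical object names, chosen only for illustration
statements = storage_ddl("demo_db", "raw", "events")
for statement in statements:
    print(statement)
```

The point of the sketch is the naming: every table lives inside a schema, and every schema inside a database, so a table is fully addressed as database.schema.table.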
#2 Query processing in Snowflake
Snowflake processes queries using virtual warehouses. Each virtual warehouse (or cluster) can access all the data in the storage layer but runs independently, so warehouses do not share or compete for compute resources. Virtual warehouses are used for loading data and for running queries, and can do both at the same time. A virtual warehouse can be scaled up or down without downtime or disruption.
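To make the idea of scaling a warehouse concrete, here is a minimal Python sketch that builds the CREATE WAREHOUSE and ALTER WAREHOUSE statements one might run. It only produces SQL text. The warehouse name load_wh, the helper functions, and the 60-second auto-suspend value are assumptions for illustration, and VALID_SIZES lists only a subset of the sizes Snowflake offers.

```python
# A subset of Snowflake warehouse sizes, enough for this illustration
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")


def check_size(size: str) -> str:
    """Reject sizes that this sketch does not know about."""
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    return size


def create_warehouse_sql(name: str, size: str = "XSMALL") -> str:
    """Build a statement that creates a warehouse which suspends itself when idle."""
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{check_size(size)}' "
        "AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;"
    )


def resize_warehouse_sql(name: str, size: str) -> str:
    """Build a statement that scales an existing warehouse up or down."""
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{check_size(size)}';"


# Hypothetical warehouse used for loading data
create_sql = create_warehouse_sql("load_wh")
resize_sql = resize_warehouse_sql("load_wh", "LARGE")
print(create_sql)
print(resize_sql)
```

Because each warehouse is its own compute cluster, a team could create one warehouse for loading and another for reporting and resize either one without affecting the other.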
#3 Cloud services in Snowflake
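The services layer coordinates and handles all other services in Snowflake, including sessions, encryption, SQL compilation, and more. It eliminates the need for manual data warehouse management and tuning. Services in this layer include authentication, infrastructure management, metadata management, query parsing and optimization, and access control.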
By design, each of these layers scales independently and is redundant.
To see how the different layers work together, let's walk through the lifecycle of a query.
After connecting to Snowflake through one of the supported clients and starting a session, the user submits a query to the services layer. The services layer verifies that the user is authorized to access the data in the database and to perform the operations defined in the query, and then creates an optimized query plan. Next, the services layer sends query execution instructions to the virtual warehouse, which allocates resources, requests any needed data from the storage layer, and executes the query. The results are returned to the user.
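The query lifecycle can be sketched as a toy model in Python. This is not how Snowflake is implemented; the classes ServicesLayer, VirtualWarehouse, and StorageLayer, the grants dictionary, and the sample orders table are all hypothetical and exist only to show which layer does what.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StorageLayer:
    """Holds table data; stands in for the storage layer."""
    tables: dict[str, list[dict]]

    def read(self, table: str) -> list[dict]:
        return self.tables[table]


@dataclass
class VirtualWarehouse:
    """Compute: pulls the data it needs from storage and runs the plan."""
    name: str
    storage: StorageLayer

    def execute(self, plan: dict) -> list[dict]:
        rows = self.storage.read(plan["table"])
        return [row for row in rows if plan["predicate"](row)]


@dataclass
class ServicesLayer:
    """Checks authorization and produces a (trivial) query plan."""
    grants: dict[str, set[str]]  # user name -> tables the user may read

    def plan(self, user: str, table: str, predicate: Callable[[dict], bool]) -> dict:
        if table not in self.grants.get(user, set()):
            raise PermissionError(f"{user} may not read {table}")
        return {"table": table, "predicate": predicate}


def run_query(services, warehouse, user, table, predicate):
    plan = services.plan(user, table, predicate)  # authorize, then plan
    return warehouse.execute(plan)                # fetch data and execute


storage = StorageLayer({"orders": [{"id": 1, "amount": 40}, {"id": 2, "amount": 250}]})
warehouse = VirtualWarehouse("demo_wh", storage)
services = ServicesLayer({"alice": {"orders"}})

result = run_query(services, warehouse, "alice", "orders", lambda row: row["amount"] > 100)
print(result)
```

Note that the warehouse never decides who may read a table, and the services layer never touches the rows; that split mirrors the separation of layers described above.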
How to connect to Snowflake?
Snowflake can connect with other services in several ways: