Snowflake

What is a Snowflake data warehouse?

Snowflake is an analytics database built for the cloud and delivered as a data warehouse as a service. It runs on popular cloud providers such as AWS, Azure, and Google Cloud Platform. There is no hardware (virtual or physical) or software to install, configure, or manage; it runs entirely on public cloud infrastructure. It is well suited to data warehousing, data engineering, data lakes, data science, and developing data applications. What sets it apart is its architecture and its data sharing capabilities.

What is Snowflake Architecture?

Snowflake's architecture is built for the cloud. Its multi-cluster, shared data architecture delivers the performance, concurrency, and elasticity that organizations require. It handles authentication, resource management, optimization, data protection, configuration, availability, and more. The compute, storage, and cloud services layers are physically separated but logically integrated.

Architecturally, the Snowflake data warehouse consists of three key layers:

  • Database Storage
  • Query Processing
  • Cloud Services

#1 Database storage in Snowflake

Snowflake stores all data in databases. A database is a logical grouping of objects, primarily tables and views, organized into one or more schemas. Snowflake can store structured and semi-structured data, and all data-related tasks are handled through SQL operations. The underlying file system is backed by the cloud provider's object storage (for example, Amazon S3 on AWS) in Snowflake's own account, where data is encrypted, compressed, and distributed to optimize performance.
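The database, schema, and object hierarchy described above can be sketched in a few lines of Python. This is only an illustration of how fully qualified names are formed; the class names and the SALES_DB, RAW, and EVENTS identifiers are hypothetical, and the SQL strings are examples of Snowflake's VARIANT column type and path notation rather than statements taken from the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Schema:
    """A schema groups tables and views inside a database."""
    name: str
    tables: List[str] = field(default_factory=list)
    views: List[str] = field(default_factory=list)


@dataclass
class Database:
    """A database is a logical grouping of schemas."""
    name: str
    schemas: Dict[str, Schema] = field(default_factory=dict)

    def add_table(self, schema: str, table: str) -> str:
        """Register a table and return its fully qualified name."""
        self.schemas.setdefault(schema, Schema(schema)).tables.append(table)
        return f"{self.name}.{schema}.{table}"


# Hypothetical names, for illustration only.
db = Database("SALES_DB")
orders = db.add_table("PUBLIC", "ORDERS")
events = db.add_table("RAW", "EVENTS")

# Semi-structured data is typically kept in a VARIANT column and read
# with path notation. These are illustrative SQL strings, not executed here.
create_events = f"CREATE TABLE {events} (id NUMBER, payload VARIANT)"
select_city = f"SELECT payload:customer.city::STRING FROM {events}"
```

Every object is addressed as database.schema.object, which is why the helper returns names such as SALES_DB.RAW.EVENTS.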

#2 Query processing in Snowflake

Snowflake processes queries using virtual warehouses. Each virtual warehouse (or cluster) can access all the data in the storage layer but runs independently, so warehouses do not share or compete for compute resources. Virtual warehouses are used for loading data and running queries, and can do both at the same time. A virtual warehouse can be scaled up or down without downtime or disruption.
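As a rough sketch of what creating and resizing warehouses looks like, the helpers below build the corresponding SQL statements as strings. The function names, the warehouse names LOAD_WH and BI_WH, and the 60-second auto-suspend value are assumptions made for illustration, and VALID_SIZES lists only a subset of the sizes Snowflake offers.

```python
# Subset of warehouse sizes, kept short for the example (assumption).
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")


def _check_size(size: str) -> str:
    """Normalize a size name and reject values outside the example subset."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported warehouse size: {size}")
    return size


def create_warehouse_sql(name: str, size: str = "XSMALL",
                         auto_suspend_s: int = 60) -> str:
    """Build a CREATE WAREHOUSE statement for an independent compute cluster."""
    size = _check_size(size)
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_s} AUTO_RESUME = TRUE"
    )


def resize_warehouse_sql(name: str, size: str) -> str:
    """Build an ALTER WAREHOUSE statement that scales a warehouse up or down."""
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{_check_size(size)}'"


# One warehouse for loading and one for BI queries (hypothetical names):
# they read the same stored data but do not compete for compute.
statements = [
    create_warehouse_sql("LOAD_WH", "SMALL"),
    create_warehouse_sql("BI_WH", "MEDIUM"),
    resize_warehouse_sql("BI_WH", "LARGE"),
]
```

Keeping loading and reporting on separate warehouses is one common way to use the isolation described above; the statements would be run through any of the supported clients.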

#3 Cloud services in Snowflake

The services layer coordinates and handles all other services in Snowflake, including sessions, encryption, SQL compilation, and more. It removes the need for manual data warehouse management and tuning. Services in this layer include:

  • Authentication
  • Infrastructure management
  • Metadata management
  • Query parsing and optimization
  • Access control

By design, each of these layers scales independently and is redundant.

To see how the layers work together, consider the lifecycle of a query.

After connecting to Snowflake through one of the supported clients and starting a session, the user submits a query against a virtual warehouse. The services layer first verifies that the user is authorized to access the data referenced in the query, then compiles the query and creates an optimized query plan. Next, the services layer sends query execution instructions to the virtual warehouse, which allocates resources, requests any needed data from the storage layer, and executes the query. The results are returned to the user.
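The lifecycle above can be mimicked with a toy model in Python. This is not how Snowflake is implemented; it is a minimal sketch in which the class names, the in-memory STORAGE and GRANTS dictionaries, and the single "sum" operation are all hypothetical stand-ins for the three layers.

```python
from typing import Dict, List, Set

# Storage layer stand-in: table name -> rows (hypothetical data).
STORAGE: Dict[str, List[dict]] = {
    "ORDERS": [{"id": 1, "amount": 40}, {"id": 2, "amount": 60}],
}
# Access-control metadata held by the services layer (hypothetical).
GRANTS: Dict[str, Set[str]] = {"analyst": {"ORDERS"}}


class VirtualWarehouse:
    """Compute layer: pulls needed data from storage and runs the plan."""

    def __init__(self, name: str) -> None:
        self.name = name

    def execute(self, plan: dict) -> int:
        rows = STORAGE[plan["table"]]  # request data from the storage layer
        return sum(row[plan["column"]] for row in rows)


class CloudServices:
    """Services layer: checks access, builds a plan, dispatches it."""

    def submit(self, user: str, table: str, column: str,
               warehouse: VirtualWarehouse) -> int:
        # Step 1: verify the user may access the referenced data.
        if table not in GRANTS.get(user, set()):
            raise PermissionError(f"{user} may not read {table}")
        # Step 2: "compile" the query into a (trivial) plan.
        plan = {"op": "sum", "table": table, "column": column}
        # Step 3: send execution instructions to the warehouse.
        return warehouse.execute(plan)


services = CloudServices()
total = services.submit("analyst", "ORDERS", "amount", VirtualWarehouse("BI_WH"))
```

The point of the sketch is the ordering: authorization and planning happen in the services layer before any compute is used, and only the warehouse touches stored data.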

How to connect to Snowflake?

You can connect to Snowflake in several ways:

  • Web-based user interface
  • ODBC and JDBC drivers
  • Command-line clients
  • Native connectors
  • Third-party connectors such as ETL tools and BI tools
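As one example of a native connector, the sketch below uses the Snowflake Connector for Python. It assumes the snowflake-connector-python package is installed and that credentials are supplied through environment variables; the SNOWFLAKE_* variable names and both helper functions are conventions chosen for this example, not something the connector requires.

```python
import os
from typing import Dict, List, Mapping

REQUIRED = ("account", "user", "password")
OPTIONAL = ("warehouse", "database", "schema")


def connection_params(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect connection settings from SNOWFLAKE_* variables (assumed names)."""
    params: Dict[str, str] = {}
    for key in REQUIRED:
        var = f"SNOWFLAKE_{key.upper()}"
        if not env.get(var):
            raise KeyError(f"missing environment variable {var}")
        params[key] = env[var]
    for key in OPTIONAL:
        value = env.get(f"SNOWFLAKE_{key.upper()}")
        if value:
            params[key] = value
    return params


def run_query(sql: str, env: Mapping[str, str] = os.environ) -> List[tuple]:
    """Open a session, run one statement, and return all rows."""
    # Imported lazily so the helper above works without the package installed.
    import snowflake.connector

    conn = snowflake.connector.connect(**connection_params(env))
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()
```

With the variables set, a call such as run_query("SELECT CURRENT_VERSION()") would open a session, run the statement on the configured warehouse, and close the connection.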
