10 Snowflake Concepts You Need To Know (+15 You Ought To)
### INTERFACE
1. Web UI: The central interface for managing and querying data.
2. CLI (Command-line Interface): An alternative method to connect to Snowflake using an IDE like Visual Studio Code.
### SESSION VARIABLES?
3. Virtual Warehouses: The compute engines responsible for performing data processing tasks.
4. Worksheets: Interface used to execute SQL queries. Each worksheet can have its own combination of role, virtual warehouse, database, and schema.
5. Context Functions: Provide information about various elements like session, account, users, etc., in the context of the executing query.
### OBJECT HIERARCHY
6. Schemas: Structures within a database used to group logically related tables and views, forming a part of the namespace along with the database.
- Public Schema: A default schema created with every database.
- Information Schema: Acts as a data dictionary and contains metadata like tables, columns, and data types.
### TABLES & VIEWS
7. Permanent Table: Stores data until explicitly dropped.
8. Transient Tables: Offer lower levels of data protection, suitable for less critical data.
9. Temp Tables: Exists for the duration of a user session, not visible to other users, and automatically dropped at the end of a session.
10. External Tables: Provide a view on top of files stored in a stage, used for reading external data.
11. Standard View: A database object that treats the result of a query as a virtual table, allowing it to be used in further queries and joins.
12. Materialized View: A type of view where the query results are stored, improving performance by eliminating the need to re-run the query.
### DATA LOADING
13. File Format: Specification defining the structure of files to be processed, including aspects like header rows.
14. Stage: A logical concept representing an abstraction of a filesystem location, used for data storage and manipulation.
- External Stage: A reference to a cloud storage location, used for managing external data.
- Internal Stages: Area for loading data, typically from on-premises systems, using the CLI.
15. Snowpipe: A feature enabling continuous, automated data ingestion, or streaming.
### SEMI-STRUCTURED DATA
16. VARIANT: A universal data type in Snowflake capable of storing data of any type.
### DATA PROCESSING
17. Tasks: Scheduled execution of SQL statements, used for regular workloads like data transformation or processing.
18. Conjugate Task: A method to connect multiple tasks in a parent-child relationship, creating a hierarchy of tasks.
19. Stream: Monitors changes (inserts, updates, deletes) in a table, facilitating trigger-based data processing.
### DATA SHARING
20. Secure Data Sharing: Share data with other accounts via a soft reference to the tables, without physical data movement.
21. Reader Account: An account type for sharing data with non-Snowflake customers, acting as a subsidiary to the main account.
### ROLES
22. System Roles: Predefined roles include ACCOUNTADMIN, SYSADMIN, SECURITYADMIN, USERADMIN, and PUBLIC, each with specific privileges and responsibilities.
23. Custom Roles: User-defined roles. It's recommended to grant custom roles to the SYSADMIN role for comprehensive administrative privileges.
### PERFORMANCE OPTIMISATION & CONTINUOUS DATA PROTECTION
24. Clustering: A technique for data distribution into related storage blocks, involving sorting data into micro-partitions.
25. Time Travel: Enables users to access and retrieve historical data at or before a specific timestamp, including changes to data in tables, schemas, and databases.