Automating Schema Inference and Table Creation in Snowflake Using Staged Files
Snowflake offers powerful features for dynamically creating tables by inferring schema directly from staged files. This functionality allows data engineers to automate table creation without manually defining columns. Here, I'll walk through an example where we upload a CSV file to a Snowflake stage, infer its schema, and use that schema to create a new table—all through Snowflake SQL commands.
Generate Column Descriptions
Snowflake’s GENERATE_COLUMN_DESCRIPTION function creates descriptive metadata based on inferred columns from your staged file. This is a useful step if you’re creating tables dynamically and want column names and types generated without manual effort.
SELECT GENERATE_COLUMN_DESCRIPTION(
ARRAY_AGG(OBJECT_CONSTRUCT(*)), 'table') AS column_descriptions
FROM TABLE(INFER_SCHEMA(
LOCATION => '@MYSTAGE',
FILES => 'emp.csv',
FILE_FORMAT => 'MYFILEFORMAT'
));
Create the Table Dynamically Using a Template
USING TEMPLATE, Snowflake allows you to create tables based on inferred schema. This ensures that the table will match the structure of the staged file, including column names and data types.
CREATE OR REPLACE TABLE EMP USING TEMPLATE(
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT(*)) AS column_descriptions
FROM TABLE(INFER_SCHEMA(
LOCATION => '@MYSTAGE',
FILES => 'emp.csv',
FILE_FORMAT => 'MYFILEFORMAT'
)));
check for the table
And the file in the stage.