Snowflake Stored Procedure using Javascript
Yes you can create stored procedures using SQL and yes they are very useful. But there are things which you can't do using SQL, such as array, some string manipulation, etc. And for that you will need to write your proc in Javascript.
For example, you want to loop around each table in your database, and for each table you want to do “select max(len(column_name))” on each column. In other words you want to get the maximum length of the data values in each column.
To do this you need to write a stored procedure like below. To make it simpler to understand, in this example I’m going to show you how to loop around list of tables, then do “select count(*)” on each table. Once you manage to get this running, you can then develop it further to get the max column length like above.
So first, paste this SQL into Snowflake UI and execute it. And I’ll walk you through it.
CREATE OR REPLACE PROCEDURE PROC1()
RETURNS STRING
LANGUAGE JAVASCRIPT
AS
$$
var tables = snowflake.execute({sqlText:
'SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = CURRENT_SCHEMA()'});
while (tables.next())
{ var table_name = tables.getColumnValue(1);
var row_count = snowflake.execute({sqlText: 'SELECT COUNT(*) FROM ' + table_name});
row_count.next()
var insert_sql = 'INSERT INTO METADATA.ROW_COUNT VALUES (\'' + table_name + '\',' + row_count.getColumnValue(1) + ')'
snowflake.execute({sqlText: insert_sql});
}
$$;
The basic mechanism in the above proc is running “SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES” which will get you a list of tables. And then we loop around each row using “while (tables.next())”. This next() function is necessary in order to be able to read that row using “tables.getColumnValue(1)”. 1 here means column 1.
So to recap:
1.????? Get list of tables from Information Schema using snowflake.execute.
2.????? Loop around each row of the result set using next() function.
3.????? Get the cell value in each row using getColumnValue() function.
That’s how you get the table names. Then you do “select count(*)” on each table. And get the output using the same thing: first do next(), then do getColumnValue(), which is the row count. Now that we have the row count for each table, insert the table name and the row count into a table called Metadata.Row_Count.
So to recap: Once you get a list of tables, do the following for each table:
1.????? Do “select count(*)” to get row count. Note: to get max column width, you just need to replace this part.
2.????? Do next() to read the row.
3.????? Do getColumnValue() to read the cell. (the row count)
4.????? Insert that output (the row count) into another table.
Now that you understand how it works, let’s run the proc.
1.????? First, create a schema called Metadata: “create schema Metadata”.
2.????? Then in this Metadata schema, create that Row_Count table. This table contains 2 columns: table name and row count. So the SQL is like this:
领英推荐
CREATE OR REPLACE TABLE ROW_COUNT
( TABLE_NAME VARCHAR(100),
ROW_COUNT INT
);
3.????? You can create this table in the same schema as everything else, but you might want to add “where table_name <> ‘Row_Count’ ” in the “SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES”.
4.????? Now you can run that proc like this: “CALL PROC1()”.
Now a few notes about the code:
1.????? Second line: you have to return something.
2.????? Third line: indicate that the language is Javascript.
3.????? 3rd line from bottom (insert_sql): to write a single quote within a string prefix it with a back slash.
4.????? Note that at the start and the end of the proc you need to put $$.
5.????? To see the insert_sql, you can add this within the while loop: “output.push(insert_sql)” At the top you need to define “output” as an array like this: “var output = []”. Then at the bottom you convert that output array to string (because your proc returns a string) like this: “return output.join("\n")”.
Have a nice try and good luck.
PS.
1.????? To get the row count for each table you can read the Information Schema: “SELECT TABLE_NAME, ROW_COUNT FROM INFORMATION_SCHEMA.TABLES”.
2.????? To get the max length on each column on each table, you need to get a list of columns for each table like this: “SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS”.
References:
1.????? Basic of writing Javascript stored procedure in Snowflake: https://docs.snowflake.com/en/developer-guide/stored-procedure/stored-procedures-javascript
2.????? Next(), getColumnValue(), getColumnCount(), getColumnSqlType(): https://docs.snowflake.com/en/developer-guide/stored-procedure/stored-procedures-api
3.????? Good example for looping around list of tables: https://community.snowflake.com/s/article/Snowflake-Introduces-Javascript-Stored-Procedures
?