Unlocking High-Performance Snowflake Integrations with OpenResty
INTRODUCING RESTY-SNOWFLAKE
Efficiency and performance are paramount in the ever-evolving landscape of web applications and data management. Today, I am thrilled to introduce resty-snowflake, a Lua library that bridges the powerful capabilities of Snowflake with the high-performance web platform OpenResty. It leverages the Snowflake SQL REST API and supports fast JSON processing using libraries like rsjson, dkjson, or cjson. Users can configure the library via environment variables or directly in Lua scripts. The provided Taskfile simplifies key setup, JWT generation, and connectivity testing. Ideal for boosting performance, simplifying architecture, and ensuring scalability in modern web applications.
This article will delve into this integration's motivations, benefits, and technical intricacies, showcasing how resty-snowflake can revolutionize your data handling workflows.
Why resty-snowflake?
Snowflake has emerged as a leading cloud data platform, renowned for handling vast amounts of data with speed and scalability. However, integrating Snowflake into web applications often involves additional backend layers, which can introduce latency and complexity. This is where resty-snowflake shines.
The Motivation
The primary motivation behind resty-snowflake is to harness the power of Snowflake directly within OpenResty, eliminating the need for a separate backend. This approach offers several compelling advantages:
The Power of OpenResty and LuaJIT
OpenResty is a robust web platform based on Nginx and LuaJIT, designed to deliver high-performance web applications. LuaJIT is a Just-In-Time Compiler for Lua, known for its speed and efficiency. Combining these with Snowflake's data management prowess creates a potent solution for modern web applications.
Key Features of resty-snowflake
GETTING STARTED
Setting Up Your Snowflake User
To make resty-snowflake work, you must generate public and private keys using the generate-keypair task. Then, assign the public key to the user in Snowflake (for the sake of the demo, we will imagine DEVUSER):
ALTER USER DEVUSER SET RSA_PUBLIC_KEY='MII...';
DESC USER DEVUSER;
SELECT TRIM((SELECT "value" FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
?WHERE "property" = 'RSA_PUBLIC_KEY_FP'), 'SHA256:');
The result of this query will provide you with the Public Key Fingerprint for the key-pair authorization. You can read about it more at https://docs.snowflake.com/en/developer-guide/sql-api/authenticating
Taskfile for Automation
The following Taskfile simplifies the setup and testing of resty-snowflake:
version: 3
silent: true
vars:
DOCKER_TAG: resty-snowflake-demo
PRIVATE_KEY_PATH: _keys/rsa_key.p8
SNOWFLAKE_ACCOUNT_NAME: YOUR_ACCOUNT_NAME
SNOWFLAKE_USERNAME: YOUR_USERNAME
SNOWFLAKE_PRIVATE_KEY:
? sh: cat {{.PRIVATE_KEY_PATH}}
SNOWFLAKE_PUBLIC_KEY_FINGERPRINT: "YOUR_PUBLIC_KEY_FINGERPRINT"
tasks:
generate-keypair:
? cmds:
? ? - mkdir -p _keys
? ? - openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out _keys/rsa_key.p8 -nocrypt
? ? - openssl rsa -in _keys/rsa_key.p8 -pubout -out _keys/rsa_key.pub
test-keypair:
? cmds:
? ? - snowsql -a {{.SNOWFLAKE_ACCOUNT_NAME}} -u {{.SNOWFLAKE_USERNAME}} --private-key-path _keys/rsa_key.p8 -q "SELECT CURRENT_TIMESTAMP() AS now"
generate-jwt:
? cmds:
? ? - |
? ? ? HEADER='{
? ? ? ? "alg": "RS256",
? ? ? ? "typ": "JWT"
? ? ? ?}'
? ? ? ACCOUNT="{{.SNOWFLAKE_ACCOUNT_NAME}}"
? ? ? USER="{{.SNOWFLAKE_USERNAME}}"
? ? ? PUBLIC_KEY_FP="{{.SNOWFLAKE_PUBLIC_KEY_FINGERPRINT}}"
? ? ? NOW=$(date +%s)
? ? ? EXP=$(($NOW + 3540))
? ? ? QUALIFIED_USERNAME=$(echo "$ACCOUNT.$USER" | tr '[:lower:]' '[:upper:]')
? ? ? PAYLOAD=$(jq -n --arg iss "$QUALIFIED_USERNAME.SHA256:$PUBLIC_KEY_FP" \
? ? ? ? ? ? ? ? --arg sub "$QUALIFIED_USERNAME" \
? ? ? ? ? ? ? ? --argjson iat $NOW \
? ? ? ? ? ? ? ? --argjson exp $EXP \
? ? ? ? ? ? ? ? '{
? ? ? ? ? ? ? ? ? ? iss: $iss,
? ? ? ? ? ? ? ? ? ? sub: $sub,
? ? ? ? ? ? ? ? ? ? iat: $iat,
? ? ? ? ? ? ? ? ? ? exp: $exp
? ? ? ? ? ? ? ? ? }')
? ? ? HEADER_BASE64=$(echo -n "$HEADER" | openssl base64 -A | tr -d '=' | tr '/+' '_-')
? ? ? PAYLOAD_BASE64=$(echo -n "$PAYLOAD" | openssl base64 -A | tr -d '=' | tr '/+' '_-')
? ? ? SIGNATURE=$(echo -n "$HEADER_BASE64.$PAYLOAD_BASE64" | openssl dgst -sha256 -sign {{.PRIVATE_KEY_PATH}} | openssl base64 -A | tr -d '=' | tr '/+' '_-')
? ? ? GENERATED_JWT="$HEADER_BASE64.$PAYLOAD_BASE64.$SIGNATURE"
? ? ? echo $GENERATED_JWT
test-sql:
? cmds:
? ? - |
? ? ? JWT="$(task generate-jwt)"
? ? ? echo "Testing SQL..."
? ? ? ?
? ? ? curl -iv -X POST \
? ? ? -H "Content-Type: application/json" \
? ? ? -H "Authorization: Bearer $JWT" \
? ? ? -H "X-Snowflake-Authorization-Token-Type: KEYPAIR_JWT" \
? ? ? -d '{"statement": "SELECT CURRENT_TIMESTAMP()"}' \
? ? ? ? ?"https://{{.SNOWFLAKE_ACCOUNT_NAME}}.snowflakecomputing.com/api/v2/statements"
docker-build:
? cmds:
? ? - docker build --progress plain -t {{.DOCKER_TAG}} .
docker-run:
? cmds:
? ? - |
? ? ? docker run -it -d \
? ? ? ? -p "8000:80" \
? ? ? ? -v "$(pwd)/app:/app" \
? ? ? ? -v "$(pwd)/lib:/app-libs" \
? ? ? ? -v "$(pwd)/nginx/conf/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf" \
? ? ? ? -e SNOWFLAKE_ACCOUNT_NAME="{{.SNOWFLAKE_ACCOUNT_NAME}}" \
? ? ? ? -e SNOWFLAKE_USERNAME="{{.SNOWFLAKE_USERNAME}}" \
? ? ? ? -e SNOWFLAKE_PRIVATE_KEY="{{.SNOWFLAKE_PRIVATE_KEY}}" \
? ? ? ? -e SNOWFLAKE_PUBLIC_KEY_FINGERPRINT="{{.SNOWFLAKE_PUBLIC_KEY_FINGERPRINT}}" \
? ? ? ? ?{{.DOCKER_TAG}}
SETTING UP RESTY-SNOWFLAKE
Installation
To start with resty-snowflake, you can try it with Docker, so you need Docker installed on your machine. The following Dockerfile builds your Docker image:
FROM openresty/openresty:alpine
RUN mkdir -p /app/lib && mkdir -p /vendor/lib \
? && apk add --no-cache curl jq libc6-compat perl ca-certificates \
? && opm get bsiara/dkjson \
? && opm get fffonion/lua-resty-openssl \
? && opm get pintsized/lua-resty-http \
? && opm get SkyLothar/lua-resty-jwt \
? && opm get fffonion/lua-resty-openssl \
? && apk del curl
EXPOSE 80
CMD ["/usr/local/openresty/bin/openresty", "-g", "daemon off;"]
Configuration
Ensure that your nginx.conf is set up to include the necessary Lua scripts. Here's an example configuration. Let's do a bare minimum (the environment variables used in his config are described below):
worker_processes auto;
error_log /dev/stdout info;
env SNOWFLAKE_ACCOUNT_NAME;
env SNOWFLAKE_USERNAME;
env SNOWFLAKE_PRIVATE_KEY;
env SNOWFLAKE_PUBLIC_KEY_FINGERPRINT;
events {
? worker_connections 1024;
? multi_accept on;
? use epoll;
}
http {
? default_type application/octet-stream;
? resolver 1.1.1.1 1.0.0.1 valid=300s;
? resolver_timeout 5s;
? lua_package_path "/usr/local/openresty/lualib/?.lua;/app-libs/?.lua;/vendor/lib/?.lua;;";
? lua_package_cpath "/usr/local/openresty/lualib/?.so;/app-libs/lib?.so;;";
? lua_shared_dict my_cache 10m;
? lua_code_cache off; #comment for production
? init_by_lua_file /app/nginx_init.lua;
? sendfile on;
? tcp_nopush on;
? tcp_nodelay on;
? keepalive_timeout 65;
? types_hash_max_size 2048;
? log_format main '$remote_addr - $remote_user [$time_local] "$request" '
? ? ? ? ? ? ? ? ? '$status $body_bytes_sent "$http_referer" '
? ? ? ? ? ? ? ? ? '"$http_user_agent" "$http_x_forwarded_for"';
? gzip on;
? gzip_disable "msie6";
? gzip_vary on;
? gzip_proxied any;
? gzip_comp_level 6;
? gzip_buffers 16 8k;
? gzip_http_version 1.1;
? gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
? ssl_protocols TLSv1.2 TLSv1.3;
? ssl_prefer_server_ciphers on;
? server {
? ? ? listen 80;
? ? ? location / {
? ? ? ? ? default_type 'text/plain';
? ? ? ? ? content_by_lua_file /app/demo.lua;
? ? ? }
? }
}
Initialization
In nginx_init.lua, the resty-snowflake library is initialized and configured to use rsjson for JSON processing. However, you can choose any JSON library that fits your requirements, such as dkjson or cjson. The key is to ensure that the chosen library is globally mapped as json.
json = require ("rsjson")
Mapping the JSON Library Globally
The essential requirement is that whichever JSON library you choose, it must be globally accessible as json. This allows resty-snowflake to use it seamlessly for JSON parsing and serialization. Here’s a detailed step-by-step guide:
Here’s a consolidated example for nginx_init.lua showing the flexibility:
领英推荐
-- Uncomment the JSON library you want to use
-- json = require("rsjson") -- High-performance Rust-based library
-- json = require("dkjson") -- Pure Lua library
-- json = require("cjson") ? -- Fast C-based library
-- Initialize the snowflake library
local snowflake = require("snowflake")
snowflake.init()
Benefits of Custom JSON Libraries
By allowing users to choose their preferred JSON library, resty-snowflake provides the flexibility to optimize performance and compatibility according to specific requirements. As long as the selected library is globally mapped as json, resty-snowflake will function seamlessly, making it adaptable to various scenarios and user preferences.
CONFIGURATION OPTIONS: ENVIRONMENT VARIABLES VS. LUA CONFIGURATION
When using resty-snowflake, you have the flexibility to configure the library either through environment variables or directly within your Lua scripts. This dual approach ensures that you can choose the configuration method that best suits your deployment environment and personal preferences. Let’s explore both options in detail.
Using Environment Variables
Environment variables offer a convenient way to configure resty-snowflake, especially in dynamic and containerized environments like Docker. By setting environment variables, you can decouple your configuration from your code, making it easier to manage different environments (e.g., development, staging, production) without altering your Lua scripts.
Setting Environment Variables
In your Docker configuration or shell environment, you can set the necessary variables as follows:
docker run -it -d \
-p "8000:80" \
-v "$(pwd)/app:/app" \
-v "$(pwd)/lib:/app-libs" \
-v "$(pwd)/nginx/conf/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf" \
-e SNOWFLAKE_ACCOUNT_NAME="your_account_name" \
-e SNOWFLAKE_USERNAME="your_username" \
-e SNOWFLAKE_PRIVATE_KEY="your_private_key" \
-e SNOWFLAKE_PUBLIC_KEY_FINGERPRINT="your_public_key_fingerprint" \
resty-snowflake-demo
In your nginx.conf, ensure you declare the environment variables to pass them to the worker processes:
env SNOWFLAKE_ACCOUNT_NAME;
env SNOWFLAKE_USERNAME;
env SNOWFLAKE_PRIVATE_KEY;
env SNOWFLAKE_PUBLIC_KEY_FINGERPRINT;
The resty-snowflake library will automatically pick up these environment variables without any additional code changes.
Configuring Directly in Lua
For more fine-grained control, or if you prefer to keep configuration within your Lua scripts, you can set the necessary configurations directly in the Lua code. This method can be beneficial in environments where modifying environment variables is less convenient or if you need to configure multiple instances with different settings programmatically.
Setting Configuration in Lua
To configure resty-snowflake directly in Lua, you need to pass a configuration table to the init function of the library. Here’s an example of how to do this:
local snowflake = require("snowflake")
-- Initialize with Lua configuration
snowflake.init({
? account_name = "your_account_name",
? username = "your_username",
? private_key = "your_private_key",
? private_key_password = "your_private_key_password", -- optional
? public_key_fp = "your_public_key_fingerprint"
})
ngx.header.content_type = "text/plain"
ngx.log(ngx.INFO, "Accessing Snowflake...")
-- Execute a sample SQL query
local result, err_exec_sql = snowflake.execute_sql("SELECT CURRENT_TIMESTAMP()")
if err_exec_sql then
? ngx.say("Error: " .. err_exec_sql)
? return
end
ngx.say("Result: ", result.data[1])
Benefits of Each Approach
Environment Variables:
Lua Configuration:
Choosing the Right Method
The choice between environment variables and Lua configuration depends on your specific use case and deployment environment:
More advanced example
Here is a sample demo of how to use the resty-snowflake library.lua script:
local snowflake = require("snowflake")
ngx.header.content_type = "text/plain"
snowflake.init()
ngx.log(ngx.INFO, "Accessing snowflake...")
-- Get current timestamp
local result, err_exec_sql = snowflake.execute_sql("SELECT CURRENT_TIMESTAMP()")
if err_exec_sql then
? ngx.say("Error: " .. err_exec_sql)
? return
end
ngx.say("Result: ", result.data[1])
-- Testing bindings
local result_bindings, err_exec_sql_with_bindings = snowflake.execute_sql("SELECT 1=?", {
? ["1"] = {
? ? ? type = "FIXED",
? ? ? value = "123"
? }
})
if err_exec_sql_with_bindings then
? ngx.say("Error: " .. err_exec_sql_with_bindings)
? return
end
ngx.say("Binding Result: ", result_bindings.data[1])
-- Testing CortexAI Translation services
ngx.log(ngx.INFO, "Testing CortexAI Translation services...")
local result_translation, err_translation = snowflake.execute_sql("SELECT SNOWFLAKE.CORTEX.TRANSLATE(?, '', ?) AS TRANSLATION_RESULT", {
? ["1"] = {
? ? ? type = "TEXT",
? ? ? value = "Skitsanos does pretty awesome things!"
? },
? ["2"] = {
? ? ? type = "TEXT",
? ? ? value = "de"
? }
})
if err_translation then
? ngx.say("Error: " .. err_translation)
? return
end
ngx.say("Translation Result: ", result_translation.data[1])
Final Thoughts
The resty-snowflake library offers a streamlined solution for integrating Snowflake directly with OpenResty, significantly enhancing performance and simplifying web application architectures. It reduces latency, boosts response times, and simplifies system maintenance by eliminating the need for additional backend layers. This integration leverages OpenResty's non-blocking I/O and LuaJIT's performance capabilities, ensuring scalability even under high-traffic conditions.
With resty-snowflake, developers can efficiently manage configurations, dynamically adjust setups, and harness Snowflake's powerful data handling capabilities directly within their web applications. This approach optimizes performance and supports advanced functionalities like fast JSON processing and secure JWT authentication.
Embracing resty-snowflake means fully utilizing Snowflake's potential while maintaining a lean, efficient, and scalable web infrastructure, making it a valuable addition to any modern web application's toolkit.
Activate Innovation Ecosystems | Tech Ambassador | Founder of Alchemy Crew Ventures + Scouting for Growth Podcast | Chair, Board Member, Advisor | Honorary Senior Visiting Fellow-Bayes Business School (formerly CASS)
3 个月Innovative approach streamlining data access and web delivery.