Snowpark Python On Jupyter Notebook

  • Snowpark provides a data programmability interface for Snowflake. It offers a developer experience that brings deeply integrated, DataFrame-style programming to Snowflake in the languages developers like to use, including Scala, Java, and Python.
  • Snowpark is a developer library in Snowflake that provides an API to process data using programming languages such as Scala, Java, and Python instead of SQL.
  • All our data remains inside our Snowflake Data Cloud while it is processed, which reduces the cost of moving data out of the cloud to other services. Snowpark uses Snowflake's own processing engine, so it eliminates the cost of external infrastructure such as a Spark cluster.

Below are the prerequisites to run snowpark-python using Anaconda.

  1. Download and install Jupyter Notebook on your local machine.
  2. Create a Snowflake account and the required database objects.
  3. Python 3.8.
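The environment file in Step 1 pins Python 3.8. As a quick local sanity check (a minimal sketch, not part of the quickstart itself), you can confirm your interpreter meets that minimum before creating the environment:

```python
import sys

# Snowpark 0.7.0's environment pins Python 3.8; check the local interpreter.
meets_minimum = sys.version_info[:2] >= (3, 8)
print(sys.version_info[:2], "OK" if meets_minimum else "upgrade Python")
```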

Step 1 : We are using Anaconda on our local machine, so we need to create a conda environment.

Please run the commands below in your command prompt.

 1. conda env create -f jupyter_conf_env.yml
 2. conda activate getting-started-snowpark-python        

First, create the jupyter_conf_env.yml file with the following contents.

name: getting-started-snowpark-python
channels:
  - https://repo.anaconda.com/pkgs/snowflake
dependencies:
  - python=3.8
  - scikit-learn==1.0.2
  - jupyter==1.0.0
  - numpy
  - matplotlib
  - seaborn
  - snowflake-snowpark-python[pandas]==0.7.0        

Conda will automatically install snowflake-snowpark-python==0.7.0 and all other dependencies for you.

Step 2 : Once Snowpark is installed, create a separate kernel for Jupyter:

python -m ipykernel install --user --name=getting-started-snowpark-python        

Step 3 : Now, launch Jupyter Notebook using the command below on your local machine:

jupyter notebook        

Step 4 : Open the config.py file in Jupyter and update it with your account, username, and password information.

snowflake_conn_prop = {
    "account": "ACCOUNT",
    "user": "USER",
    "password": "PASSWORD",
    "role": "ACCOUNTADMIN",
    "database": "snowpark_quickstart",
    "schema": "TELCO",
    "warehouse": "sp_qs_wh",
}
        
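Hard-coding a password in config.py is fine for a quickstart, but a common alternative is to read secrets from environment variables so they never land in source control. A minimal sketch, assuming hypothetical variable names like SNOWFLAKE_PASSWORD (not part of the original quickstart):

```python
import os

# Hypothetical alternative to hard-coding credentials in config.py:
# pull secrets from environment variables, falling back to placeholders.
snowflake_conn_prop = {
    "account": os.environ.get("SNOWFLAKE_ACCOUNT", "ACCOUNT"),
    "user": os.environ.get("SNOWFLAKE_USER", "USER"),
    "password": os.environ.get("SNOWFLAKE_PASSWORD", "PASSWORD"),
    "role": "ACCOUNTADMIN",
    "database": "snowpark_quickstart",
    "schema": "TELCO",
    "warehouse": "sp_qs_wh",
}
print(sorted(snowflake_conn_prop))
```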

Step 5 : Now, you are ready to get started with the notebooks. For each notebook, make sure that you select the getting-started-snowpark-python kernel when running. After launching each notebook, you can do this by navigating to:

Kernel => Change Kernel => select getting-started-snowpark-python.


Step 6 : Write a Python program to import the necessary dependencies in the Jupyter notebook.

from snowflake.snowpark.session import Session
from snowflake.snowpark import functions as F
from snowflake.snowpark.types import *
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt
# Snowflake connection info is saved in config.py
from config import snowflake_conn_prop

# Import some transformation functions
from snowflake.snowpark.functions import udf, col, lit, translate, is_null, iff
from snowflake.snowpark import version        

Step 7 : Write the Python code that joins the employee and salary tables on empId and id using DataFrames.

# Create a session using the Snowflake config file
session = Session.builder.configs(snowflake_conn_prop).create()

dfLoc = session.table("employee")
dfServ = session.table("salary")

dfJoin = dfLoc.join(dfServ, dfLoc.col("id") == dfServ.col("empId"))

dfResult = dfJoin.select(col("id"), col("name"), col("salary"))

dfResult.show()        
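The same join logic can be sanity-checked locally with pandas before running it against Snowflake. The rows below are made up for illustration; only the column names mirror the Snowpark example:

```python
import pandas as pd

# Toy rows (hypothetical) mirroring the employee and salary tables above.
employee = pd.DataFrame({"id": [1, 2], "name": ["Asha", "Ben"]})
salary = pd.DataFrame({"empId": [1, 2], "salary": [50000, 60000]})

# Join on id == empId and keep id, name, salary, as in the Snowpark example.
result = employee.merge(salary, left_on="id", right_on="empId")[["id", "name", "salary"]]
print(result)
```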

Output of the above Python code :




In the screenshot above, we can see how the DataFrame is processed by Snowflake; in the background, Snowflake converts the DataFrame operations into a SQL query and runs it in its own environment.
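Conceptually, the DataFrame pipeline compiles to a SQL join along the lines below. As an illustration only (Snowflake's actual generated SQL will differ), the equivalent query can be run locally against sqlite3 with toy rows:

```python
import sqlite3

# Illustrative only: approximate SQL that the DataFrame pipeline compiles to.
# Snowflake's real generated SQL will differ; sqlite3 stands in here so the
# query can be demonstrated locally on made-up rows.
sql = """
SELECT e.id, e.name, s.salary
FROM employee e
JOIN salary s ON e.id = s.empId
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE salary (empId INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO salary VALUES (?, ?)", [(1, 50000), (2, 60000)])

rows = conn.execute(sql).fetchall()
print(rows)
```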
