Process of Unloading Snowflake into Amazon S3

In this technological era, most organizations are data-driven and need a standardized system to analyze and maintain their data efficiently. That data is generated across applications, software, and websites. One of the best ways to store large volumes of structured and unstructured data is to unload it from Snowflake into Amazon S3, a simple storage service. This approach suits all types of data and supports business growth.

Snowflake provides a cloud data warehousing system that addresses storage-related issues and delivers accurate data analysis. Unloading from Snowflake to Amazon S3 is a common choice for teams seeking a more affordable storage option. With the help of SQL commands and the console, it enables business intelligence (BI) plus storage, management, and analysis of huge amounts of data.

What is Snowflake?


The Snowflake platform enables businesses to manage, store, and analyze enormous amounts of data in a cloud data warehouse. Its Software-as-a-Service interface was originally built on the Amazon Web Services platform. Using a single piece of software for data management, storage, and analysis saves the time and effort of stitching tools together, and it removes the inconvenience of manual upkeep such as software upgrades and ongoing maintenance.

Snowflake is a scalable, user-friendly cloud data warehouse that enables businesses to expand more quickly and smoothly. It provides ample storage, offers fast query performance, and uses virtual compute instances to handle terabytes of data.

Features of the Snowflake:

  • Simple data export and import: Snowflake makes it easy to import and export data, and supports character encoding options, delimited data files, and file compression.
  • Easy sync or third-party integration: Snowflake's user-friendly platform lets businesses sync or integrate with third-party apps and software.
  • Support for SQL language: The Snowflake Cloud Data Warehouse is simple to access and accepts standard SQL, including DDL, DML, and other advanced commands. A small illustration is shown below.
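As a quick sketch of the SQL support and export features listed above, the following statements use made-up names (demo_orders and demo_csv_format are hypothetical, not taken from this article) to show a DDL statement, a DML statement, and a delimited, compressed file format definition:

-- Hypothetical example: object names are for illustration only
create table demo_orders (id integer, amount number(10,2));   -- DDL
insert into demo_orders values (1, 99.50);                     -- DML
create file format demo_csv_format                             -- delimited, gzip-compressed export format
  type = 'csv'
  field_delimiter = ','
  compression = 'gzip';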

Click here to learn more about Snowflake.

Supercharge Snowflake ETL and analysis using Lyftrondata's low-code/no-code data pipeline.

Lyftrondata supports 300+ integrations to SaaS platforms, including leading ERP, CRM, and accounting systems. Lyftrondata is a low-code/no-code, automatic ANSI SQL data pipeline designed to lift, shift, and load any type of data onto Snowflake instantly. With just a few clicks, Lyftrondata lets you select your most important data and pull it from all of your connected data sources. It is easy to set up and can be up and running in minutes without any assistance from IT developers.

What is Amazon S3?


In Amazon S3, S3 stands for Simple Storage Service, which is offered by Amazon Web Services (AWS). It stores data as objects accessed through a web service interface. It is highly scalable, easily adaptable, supports internet-scale applications, and offers backup and recovery capabilities. Unlike block or file storage, it stores each object together with its metadata under a unique key.

It is widely trusted and used by top brands such as Netflix, Amazon's e-commerce business, and Twitter. Each object is stored independently with its complete metadata and a unique object identifier.

Features of Amazon S3:

  • Automated data transfer: Amazon gives users the flexibility to access stored data easily so they can understand, analyze, and optimize their storage usage.
  • Data auditing and management: Amazon ensures data security alongside storage and management of data, and offers features such as access auditing that help businesses scale.
  • Transparent and analytical: Data processing can be automated within AWS, with AWS Lambda and other relevant services supporting data transformation activities.

STEPS FOR SNOWFLAKE UNLOADING TO AMAZON S3


STEP 1: Grant access to the Virtual Private Cloud IDs

The first step in unloading from Snowflake to Amazon S3 is to explicitly grant Snowflake access to your AWS storage account. The Amazon S3 bucket should be in the same region as your Snowflake account.

  • First, log in to your Snowflake account.
  • Set ACCOUNTADMIN as the active role for the user session on the Snowflake console by executing the following command.

use role accountadmin;

  • Query the SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO function in the console to retrieve the IDs of the AWS virtual network (VPC) in which your Snowflake account is located. Refer to the following code.
  • Note the VPC IDs returned by the query.
  • Create an Amazon S3 bucket policy that allows access from those specific VPC IDs.
  • To grant access to the Amazon S3 bucket, provide an AWS Identity and Access Management (IAM) role to Snowflake.

select system$get_snowflake_platform_info();

STEP 2: Configure Snowflake unload to the Amazon S3 bucket

  • To allow Snowflake to unload data to the Amazon S3 bucket and create new files in the target folder, the following permissions are required:

  1. s3:DeleteObject
  2. s3:PutObject

  • Log in to the AWS Management Console to create an IAM policy: from the dashboard, navigate to the Identity and Access Management (IAM) option under the Security, Identity, and Compliance section.
  • From the left navigation menu, click on "Account Settings".
  • In the "Security Token Service Regions" list, find the entry corresponding to your AWS account region.
  • Click "Activate" if the default setting is "Inactive".
  • In the left navigation bar, click on "Policies".
  • Choose the "Create Policy" option.
  • Click on the "JSON" tab and create a policy document that permits Snowflake to unload to Amazon S3.
  • A sample JSON policy looks like the following.


????"Version": "2023-01-11",

????"Statement": [

????????{

????????????"Effect": "Allow",

????????????"Action": [

??????????????"Amazon_s3:PutObject",

??????????????"Amazon_s3:GetObject",

??????????????"Amazon_s3:GetObjectVersion",

??????????????"Amazon_s3:DeleteObject",

??????????????"Amazon_s3:DeleteObjectVersion"

????????????],

????????????"Resource": "arn:aws:s3:::<bucket01>/<prefix>/*"

????????},

????????{

????????????"Effect": "Allow",

????????????"Action": [

????????????????"Amazon_s3:ListBucket",

????????????????"Amazon_s3:GetBucketLocation"

????????????],

????????????"Resource": "ARN:AWS:Amazon_s3:::<bucket>",

????????????"Condition": {

????????????????"StringLike": {

????????????????????"Amazon_s3:prefix": [

????????????????????????"<prefix>/*"

????????????????????]

????????????????}

????????????}

????????}

????]

}


{        

  • Edit the above policy by replacing <bucket> with your bucket name and <prefix> with the folder path prefix.
  • Select "Review policy" and name the policy snowflake_access.
  • Click "Create Policy".
  • Next, create an IAM role in AWS for unloading from Snowflake to Amazon S3.
  • In the left navigation bar, click on "Roles" and select the "Create Role" option.
  • Choose "Another AWS Account" as the trusted entity type.
  • Enter your AWS account ID in the Account ID field.
  • Check the "Require External ID" option and enter a dummy ID such as "0000"; you will change it later in the process.
  • Click "Next", select the policy you created earlier, and click "Next" again.
  • Add a valid name and description and create the role. You now have an IAM policy for the bucket, an IAM role, and the policy attached to the role.
  • Record the "Role ARN" value from the role summary page.
  • Next, create a storage integration in Snowflake that references this role.
  • Use the "CREATE STORAGE INTEGRATION" command to create the cloud storage integration.
  • Open the console and run the following command.

CREATE STORAGE INTEGRATION <integration_name>
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '<iam_role>'
  [ STORAGE_AWS_OBJECT_ACL = 'bucket-owner-full-control' ]
  STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
  [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ];

  • The storage integration is ready; now retrieve the AWS IAM user that Snowflake created for your account.
  • To retrieve the ARN, execute the command shown below.
  • Replace "integration_name" with the name of the integration you created.
  • Check the command and sample output below for reference.
  • Record the output values for "STORAGE_AWS_IAM_USER_ARN" and "STORAGE_AWS_EXTERNAL_ID".
  • In the AWS console, return to the "Roles" option in the side navigation bar, open the role, and click on "Trust Relationships".
  • Afterward, click on the "Edit Trust Relationship" tab.
  • Modify the policy document as needed using the "DESC STORAGE INTEGRATION" output values you just recorded.
  • Refer to the sample command and output below for unloading from Snowflake to Amazon S3.

desc integration s3_int;

+---------------------------+---------------+------------------------------------------------------------------------------+------------------+
| property                  | property_type | property_value                                                               | property_default |
+---------------------------+---------------+------------------------------------------------------------------------------+------------------+
| ENABLED                   | Boolean       | true                                                                         | false            |
| STORAGE_ALLOWED_LOCATIONS | List          | s3://mybucket1/mypath1/,s3://mybucket2/mypath2/                              | []               |
| STORAGE_BLOCKED_LOCATIONS | List          | s3://mybucket1/mypath1/sensitivedata/,s3://mybucket2/mypath2/sensitivedata/  | []               |
| STORAGE_AWS_IAM_USER_ARN  | String        | arn:aws:iam::123456789001:user/abc1-b-self1000                               |                  |
| STORAGE_AWS_ROLE_ARN      | String        | arn:aws:iam::0987654321:role/myrole                                          |                  |
| STORAGE_AWS_EXTERNAL_ID   | String        | MYACCOUNT_SFCRole=2_a987654/s0qwertyuiop=                                    |                  |
+---------------------------+---------------+------------------------------------------------------------------------------+------------------+

  • Replace the placeholder values with the "STORAGE_AWS_IAM_USER_ARN" and "STORAGE_AWS_EXTERNAL_ID" values you recorded; a sample trust policy document is shown below.
  • Click the "Update Trust Policy" button.
  • With the trust relationship in place, you can create an external stage that references the storage integration for unloading from Snowflake to Amazon S3.
  • The stage is created with the "CREATE STAGE" command.


??"Version": "2023-01-11",

??"Statement": [

????{

??????"Sid": "",

??????"Effect": "Allow",

??????"Principal": {

????????"AWS": "<snowflake_user_arn>"

??????},

??????"Action": "sts:TableRole",

??????"Condition": {

????????"StringEquals": {

??????????"sts:ExternalId01": "<snowflake_external_id01>"

????????}

??????}

????}

??]

}


{        

  • Set up "my. public" as the present database and schema for the user session by creating the named "my_s3_stage". Scroll down through the sample codes for reference.

STEP 3: Proceed with unloading data into an external stage

  • In the web interface, click on Databases, select the database "<db_name>", and go to the Stages tab to create a named external stage.
  • The same step can be done using SQL commands; see the sample statements after this list.
  • The sample creates an external stage for unloading from Snowflake to Amazon S3.
  • In the sample, the stage is named "my_ext_unload_stage", the bucket is named "unload", and the files are written under the folder prefix "path".
  • The data is unloaded using a named file format, "my_csv_unload_format".
  • Use the COPY INTO <location> command to unload data from a table to Amazon S3 through the external stage.
  • Apply the code shown below to unload from Snowflake to Amazon S3.
  • All the rows in "mytable" are then unloaded from Snowflake to the Amazon S3 bucket, with the prefix "d1" applied to the generated file names.
  • Finally, verify the unloaded objects in the Amazon S3 bucket, or list them from the SQL console.
  • Hurray! You have successfully unloaded data from Snowflake to Amazon S3.
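The statements below are a sketch of the flow those bullets describe, using the names mentioned above (my_csv_unload_format, my_ext_unload_stage, the "unload" bucket with the "path" folder, the table mytable, and the file prefix d1). The storage integration name s3_int is an assumption carried over from the earlier example:

-- Named file format used when unloading the data
create or replace file format my_csv_unload_format
  type = 'csv'
  field_delimiter = '|';

-- External stage pointing at the S3 bucket and folder
create or replace stage my_ext_unload_stage
  url = 's3://unload/path/'
  storage_integration = s3_int
  file_format = my_csv_unload_format;

-- Unload all rows of mytable to the stage, prefixing the generated files with d1
copy into @my_ext_unload_stage/d1
  from mytable;

-- Verify the unloaded files from the SQL console
list @my_ext_unload_stage;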

CONCLUSION

The step-by-step guidelines above will help you save money and reduce the cost of building a storage or data warehousing system. To unload from Snowflake to Amazon S3, go through the steps thoroughly.

Big organizations often struggle to manage huge databases, and analyzing them is another stressful endeavor altogether. Lyftrondata makes it easy and quick to store large datasets in Snowflake through an automated process, and it can integrate with 300+ sources in real time without technical glitches.

TAP HERE TO VISIT LYFTRONDATA WEBSITE

Bisheng Chen

Data Engineer & BI Engineer

2y

Great! In contrast to loading, the core step of the unloading process is using COPY INTO in a Snowflake worksheet to move data from database tables into an external stage (Amazon S3 / GCP Cloud Storage / Azure Blob Storage).

