Handle Large CSV Files (>12MB, Even Over 36MB) in Apex with AWS Lambda Integration!

At first glance, handling a CSV file larger than 12MB in Apex seems impossible: 12MB is the Apex heap size limit for asynchronous transactions (synchronous transactions are capped at 6MB). Hang tight! With the right approach, even CSV files of more than 36MB can be handled. Join me in exploring how Salesforce and AWS Lambda can be a game-changing combination.

Before diving into the details: AWS Lambda has a FREE tier! Per AWS, "400,000 GB-seconds of compute time are always free per month as part of the AWS Free Usage Tier (Global-Lambda-GB-Second)." For example, a function configured with 512MB of memory that runs for 10 seconds per invocation uses 5 GB-seconds, so the free tier covers roughly 80,000 such invocations a month.

Our journey began with an integration between Google Ad Manager and Salesforce, where Google provided data via CSV files. Initially, these files were manageable, under 10MB. However, as our operations expanded, so did the file sizes, ballooning to over 40MB. This posed a significant challenge as we needed to process and store each row into the Salesforce database.

The Solution: Here's how we tackled this challenge:

  1. Retrieve the CSV File Link: Using Apex, we fetched the CSV file's download link through an API.
  2. Utilize AWS Lambda: That link was sent to an AWS Lambda Function, which downloaded the file.
  3. File Splitting: The Lambda Function then split the file into manageable chunks of roughly 3MB each.
  4. Processing Each Chunk: These smaller files were sent back to an Apex API Service for processing.

Below is a simplified version of the Apex code used to send the CSV file link to the AWS Lambda Function. Note: in your own implementation, prefer a Named Credential for the endpoint (see the variant after the snippet) for better security and simplicity.

// Send the CSV file link to the Lambda Function as JSON
Http h = new Http();
HttpRequest req = new HttpRequest();
req.setEndpoint('Lambda Function Endpoint'); // the Lambda Function URL
req.setMethod('POST');
req.setHeader('Content-Type', 'application/json');
req.setBody('{"url": "CSV Link"}'); // the CSV link fetched from the Google Ad Manager API
HttpResponse res = h.send(req);
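
With a Named Credential in place (say one named Lambda_CSV_Splitter, a hypothetical name pointing at the Function URL), the endpoint line becomes the following, and no environment-specific URL needs to live in the code:

req.setEndpoint('callout:Lambda_CSV_Splitter');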

Handling Smaller CSV Files in Apex: The next step involves creating an Apex class to manage these smaller CSV files.

@RestResource(urlMapping='/CSVHandler/*')
global with sharing class CSVHandler {
    @HttpPost
    global static String processCSV() {
        // [Code to process CSV and handle exceptions]
    }
}        
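
The processing body is omitted above. As a minimal sketch, assuming each chunk arrives as plain CSV text with the header row that the Lambda prepends, and that rows map to a hypothetical custom object Report_Row__c with a Value__c field, it could look something like this:

@RestResource(urlMapping='/CSVHandler/*')
global with sharing class CSVHandler {
    @HttpPost
    global static String processCSV() {
        try {
            // Raw CSV chunk sent by the Lambda Function
            String csvBody = RestContext.request.requestBody.toString();
            List<String> lines = csvBody.split('\n');

            List<Report_Row__c> rows = new List<Report_Row__c>();
            // Start at 1 to skip the header row prepended to every chunk
            for (Integer i = 1; i < lines.size(); i++) {
                if (String.isBlank(lines[i])) { continue; }
                // Naive split; real data with quoted commas needs a proper CSV parser
                List<String> cells = lines[i].split(',');
                rows.add(new Report_Row__c(Name = cells[0], Value__c = cells[1]));
            }
            insert rows;
            return 'Processed ' + rows.size() + ' rows';
        } catch (Exception e) {
            return 'Error: ' + e.getMessage();
        }
    }
}

A 2.5-3MB chunk keeps the request body well under the 6MB limit for synchronous Apex REST calls; if a single chunk could contain more than 10,000 rows, the inserts would need to be deferred to an asynchronous job.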

Setting Up the Lambda Function: The final piece of the puzzle is configuring the Lambda Function with an endpoint URL. For a detailed guide, check out AWS Lambda URLs Tutorial. In summary, create a Python 3.9 function and enable the 'Function URL' under advanced settings.

Here's a snippet of the Lambda Function code that downloads, splits, and forwards the CSV file. One note: the requests library is not bundled with the Lambda Python runtime, so add it as a Lambda layer or package it with your deployment:

import csv
import io
import json

import requests

def lambda_handler(event, context):
    # Replace these placeholders with your connected app and user credentials
    # (better: read them from environment variables or Secrets Manager)
    client_id = "YOUR_CLIENT_ID"
    client_secret = "YOUR_CLIENT_SECRET"
    username = "YOUR_EMAIL"
    password = "YOUR_PASSWORD_WITH_SECURITY_TOKEN"

    try:
        # Salesforce OAuth2 username-password authentication
        params = {
            "grant_type": "password",
            "client_id": client_id,
            "client_secret": client_secret,
            "username": username,
            "password": password
        }
        auth_response = requests.post("https://login.salesforce.com/services/oauth2/token", data=params)
        auth_response.raise_for_status()
        access_token = "Bearer " + auth_response.json().get("access_token")
        instance_url = auth_response.json().get("instance_url")
        print("Instance URL:", instance_url)

        # Get the CSV url from the request. With a Lambda Function URL, the JSON
        # body sent from Apex ({"url": "..."}) arrives as a string in event['body'].
        # (If you trigger the function from SQS instead, the payload would be in
        # event['Records'][0]['body'].)
        body = json.loads(event["body"])
        url = body["url"]
        print("CSV URL:", url)

        # Downloading CSV file
        response = requests.get(url)
        print("CSV download status:", response.status_code)
        print("CSV size (characters):", len(response.text))
        csv_file = io.StringIO(response.text)

        # Splitting CSV into chunks small enough for a single Apex REST call
        max_chunk_size = 2.5 * 1024 * 1024  # ~2.5 MB per chunk
        reader = csv.reader(csv_file)
        header = next(reader)
        current_chunk = []
        current_chunk_size = 0
        chunks = []

        for row in reader:
            current_chunk.append(row)
            current_chunk_size += len(",".join(row)) + 1  # rough size incl. newline
            if current_chunk_size >= max_chunk_size:
                print("Chunk size:", current_chunk_size)
                chunks.append(current_chunk)
                current_chunk = []
                current_chunk_size = 0

        # Don't forget the last, partially filled chunk
        if len(current_chunk) > 0:
            chunks.append(current_chunk)

        # Sending each chunk to the Salesforce Apex REST service
        for chunk in chunks:
            # Add the header row to each chunk so every piece is a valid CSV
            chunk.insert(0, header)

            # Serialize the chunk back into CSV text
            chunk_file = io.StringIO()
            writer = csv.writer(chunk_file)
            writer.writerows(chunk)
            chunk_body = chunk_file.getvalue()
            chunk_file.close()

            headers = {
                "Authorization": access_token,
                "Content-Type": "text/csv; charset=utf-8"
            }
            sf_response = requests.post(
                instance_url + "/services/apexrest/CSVHandler",
                data=chunk_body.encode("utf-8"),
                headers=headers
            )
            print("Salesforce response:", sf_response.text)

        return {
            'statusCode': 200,
            'body': "Success"
        }

    except Exception as e:
        print("Error:", e)
        raise
        
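Before pointing Apex at the Lambda, you can sanity-check the function by calling its Function URL directly. A quick test script (the Function URL and CSV link below are placeholders; substitute your own):

import requests

# Placeholder values; use your own Function URL and a reachable CSV link
function_url = "https://abc123.lambda-url.us-east-1.on.aws/"
payload = {"url": "https://example.com/report.csv"}

resp = requests.post(function_url, json=payload)
print(resp.status_code, resp.text)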

Conclusion: There you have it - a seamless process to manage large CSV files in Apex using AWS Lambda! The integration not only solves the size limitation issue but also streamlines the data handling process.

Hope this guide helps you navigate similar challenges with ease.




