Managing PostgreSQL Query Disconnections during AWS Batch Job Termination

Nikhil Surendran

Cloud & DevOps Specialist | Platform Engineering | AWS | GCP , Kubernetes, GitHub Certified | Software Engineer at Heart | Building Scalable, Resilient Infrastructure | Driving Innovation in Tech

Published: November 8, 2024

Recently, one of our project teams reached out to me about an issue they had run into: terminating an AWS Batch job did not stop the SQL query it had started on the PostgreSQL database. This article describes how we resolved it.

Challenge

The team had built multiple AWS Batch jobs on a Docker image that runs an SQL query against PostgreSQL through a psql command and exports the data to an S3 bucket. The queries looked like this:

SELECT * from aws_s3.query_export_to_s3( select col1,col2…. from TABLE …

Most of the jobs take about 10 to 15 minutes to complete the SQL execution. However, when a job was terminated, whether automatically or manually, the SQL query started by the batch job kept running instead of stopping. With long-running jobs this becomes a bottleneck, because repeated terminations can leave multiple queries running in the database.
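To check whether terminated jobs have left export queries behind, one option is to look at pg_stat_activity. The sketch below only builds the SQL text; the helper name build_orphan_query_sql and the age threshold are hypothetical, and it assumes the leftover queries can be recognized by the aws_s3.query_export_to_s3 call in their text.

```python
def build_orphan_query_sql(min_minutes: int, cancel: bool = False) -> str:
    """Return SQL that lists active aws_s3 export queries older than min_minutes.

    With cancel=True the statement also calls pg_cancel_backend(pid), which
    asks each matching backend to cancel its current query.
    """
    if min_minutes < 0:
        raise ValueError("min_minutes must be non-negative")
    action = "pg_cancel_backend(pid) AS cancelled, " if cancel else ""
    return (
        f"SELECT {action}pid, usename, query_start, left(query, 80) AS query_head\n"
        "FROM pg_stat_activity\n"
        "WHERE state = 'active'\n"
        "  AND pid <> pg_backend_pid()\n"  # skip the session running this check
        "  AND query ILIKE '%aws_s3.query_export_to_s3%'\n"
        f"  AND now() - query_start > interval '{int(min_minutes)} minutes';"
    )


if __name__ == "__main__":
    # Print a read-only check for export queries older than 30 minutes.
    print(build_orphan_query_sql(30))
```

The printed statement can be run through psql. Cancelling another role's query requires sufficient privileges on the database, so the read-only form is the safer starting point.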

Solution

AWS Batch sends a SIGTERM signal to the Docker container before it terminates the job. Inside the container, we can catch the SIGTERM signal and control what action is performed.

In our case, since the job ran psql from a shell command, I decided to add a trap that catches the SIGTERM signal and gracefully stops the running psql process.
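As an illustration of what stopping a process gracefully can look like, the sketch below sends SIGTERM to a child process, waits for it to exit, and only escalates to SIGKILL if it lingers. The helper name stop_child_gracefully and the 10-second timeout are assumptions made for this example, and a sleeping Python process stands in for psql so the demo needs no database. It assumes a POSIX system.

```python
import signal
import subprocess
import sys


def stop_child_gracefully(child: subprocess.Popen, timeout: float = 10.0) -> int:
    """Send SIGTERM to the child, wait, and fall back to SIGKILL on timeout."""
    child.terminate()  # sends SIGTERM on POSIX
    try:
        return child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()  # the child ignored SIGTERM, so force it to stop
        return child.wait()


if __name__ == "__main__":
    # Stand-in for a long-running psql process (assumption for the demo).
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    code = stop_child_gracefully(child)
    print(f"child exited with return code {code}")
```

On POSIX, a negative return code from Popen means the child was ended by that signal number.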


Batch job definition before the fix

{
 "jobRoleArn": "${batch_job_role_arn}",
 "image": "${docker_image_tag}",
 "memory": ${memory},
 "vcpus": ${vcpus},
 "command": ["sh", "-c", "psql -h $PGHOST -p $PGPORT -U $PGUSER -d $PGDATABASE -f /script.sql"],
 "environment": [{
   ..... 
  } 
 ]
}

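Batch job definition after the fix

We added “trap ‘kill -TERM $(pgrep psql)’ SIGTERM; exec” in front of the psql command so that SIGTERM reaches psql and the SQL is stopped before the job terminates. Note that exec replaces the shell with psql, so from that point on psql receives the signal directly.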

{
 "jobRoleArn": "${batch_job_role_arn}",
 "image": "${docker_image_tag}",
 "memory": ${memory},
 "vcpus": ${vcpus},
 "command": ["sh", "-c", "trap 'kill -TERM $(pgrep psql)' SIGTERM; exec  psql -h $PGHOST -p $PGPORT -U $PGUSER -d $PGDATABASE -f /script.sql"],
 "environment": [{
   ..... # Removed 
  } 
 ]
}

Another Solution

We can improve this further by running psql from a Python script that handles SIGTERM in a handle_sigterm function.

import os
import signal
import subprocess
import sys

# Connection settings are read from the environment
pg_username = os.getenv("PG_USERNAME")
pg_password = os.getenv("PG_PASSWORD")
pg_host = os.getenv("PGHOST")
pg_port = os.getenv("PGPORT")
pg_database = os.getenv("PGDATABASE")

# Defined up front so the handler and the finally block can test it safely
process = None

def handle_sigterm(signum, frame):
    print("Received SIGTERM, terminating PostgreSQL command...")
    if process:
        process.terminate()  # forward SIGTERM to psql
        process.wait()       # wait for psql to exit
    sys.exit(0)

# Register the SIGTERM handler
signal.signal(signal.SIGTERM, handle_sigterm)

# PostgreSQL command to run with credentials
command = [
    "psql",
    "-h", pg_host,
    "-p", pg_port,
    "-U", pg_username,
    "-d", pg_database,
    "-f", "./script.sql"
]

# Pass the password to psql through PGPASSWORD
env = os.environ.copy()
if pg_password:
    env["PGPASSWORD"] = pg_password

try:
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)
    stdout, stderr = process.communicate()
    print(stdout.decode())
    if stderr:
        print(stderr.decode())
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Make sure psql is not left running if we exit early
    if process and process.poll() is None:
        process.terminate()
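A variant of the same idea, shown here only as a sketch, keeps the signal handler as small as possible: the handler forwards the signal to the child, and the main flow waits for the child and returns its exit code. The function name run_with_signal_forwarding is hypothetical, and forwarding SIGINT in addition to SIGTERM is an assumption of this example rather than something the job above requires. It assumes a POSIX system and Python 3.9 or later.

```python
import signal
import subprocess
import sys


def run_with_signal_forwarding(command: list[str]) -> int:
    """Run command, forward SIGTERM and SIGINT to it, and return its exit code."""
    child = subprocess.Popen(command)

    def forward(signum, frame):
        # Only pass the signal on; the wait() below collects the exit status.
        child.send_signal(signum)

    # Install the forwarding handler and remember what was there before.
    previous = {
        sig: signal.signal(sig, forward)
        for sig in (signal.SIGTERM, signal.SIGINT)
    }
    try:
        return child.wait()
    finally:
        # Restore the earlier handlers so the caller's behavior is unchanged.
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)
```

Called with the psql command list from the script above, the function returns psql's own exit code, or a negative signal number if psql was ended by a signal, so the job can report a meaningful status instead of always exiting with 0.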
