Thoughts on Apache Airflow AWS Lambda Operator

Apache Airflow is a popular open-source workflow management platform. Tasks are typically run remotely by Celery workers for scalability. On AWS, however, scalability can also be achieved more simply with serverless computing services. For example, the ECS Operator runs dockerized tasks and, with the Fargate launch type, they can run in a serverless environment.

The ECS Operator alone is not sufficient, however, because it can take up to several minutes to pull a Docker image and set up a network interface (in the case of the Fargate launch type). Because of this latency, it is not suitable for frequently running tasks. The latency of a Lambda function, by contrast, is negligible, which makes it better suited to such tasks.

This post demonstrates how AWS Lambda can be integrated with Apache Airflow using a custom operator inspired by the ECS Operator.
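
As a rough illustration of what such an operator might do at its core, the sketch below invokes a function synchronously through a boto3-style Lambda client and raises if the function reports an error. It is not the operator from the post: the helper name `invoke_lambda`, the function name and the payload are hypothetical, and in a custom operator this logic would presumably sit in `execute`, with the client created by `boto3.client("lambda")`.

```python
import json


def invoke_lambda(client, function_name, payload):
    """Invoke a Lambda function synchronously and return its decoded result.

    `client` is assumed to behave like boto3.client("lambda").
    """
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # block until the function returns
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The response payload is a stream; read it once and decode it.
    body = response["Payload"].read().decode("utf-8")
    # Lambda sets FunctionError when the handler raised or timed out.
    if response.get("FunctionError"):
        raise RuntimeError(f"{function_name} failed: {body}")
    return json.loads(body) if body else None
```

Passing the client in as an argument keeps the sketch testable without AWS credentials. A synchronous call like this holds the Airflow worker for as long as the function runs.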


Thanks for sharing! I was wondering: it looks like the wait on the task is a busy-wait (on the Airflow servers). In that case, wouldn't I prefer an Operator that invokes the service and a Sensor (mode=reschedule) that checks whether the task has finished?
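
The split suggested in this comment could look roughly like the sketch below: one step fires the function asynchronously and returns at once, and a separate check reports whether the work has finished. An asynchronous Lambda invocation does not report completion by itself, so the sketch assumes the function writes a marker object to S3 when it is done. That convention, both helper names, and the bucket and key arguments are hypothetical. In Airflow, `is_finished` could serve as the `python_callable` of a `PythonSensor` with `mode="reschedule"`, which releases the worker slot between checks.

```python
import json


def start_lambda(lambda_client, function_name, payload):
    """Fire the function asynchronously; return once Lambda accepts the event."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: Lambda queues the event
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # Lambda answers 202 when an asynchronous invocation has been accepted.
    if response["StatusCode"] != 202:
        raise RuntimeError(f"unexpected status {response['StatusCode']}")


def is_finished(s3_client, bucket, marker_key):
    """Poke function: True once the assumed completion marker exists in S3."""
    response = s3_client.list_objects_v2(
        Bucket=bucket, Prefix=marker_key, MaxKeys=1
    )
    # Keys are listed in sorted order, so an exact match would come first.
    return any(obj["Key"] == marker_key for obj in response.get("Contents", []))
```

With this arrangement the waiting happens in the scheduler's rescheduling loop rather than in a worker process, at the cost of an extra task and a completion signal that the function has to emit.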

More articles by Jaehyeon Kim