Auto-Scaling iDialogue Document Workflows
Michael Leach
Intelligent Document Processing (IDP) and Generative AI/GPT for Salesforce
When building the iDialogue document automation web service, we wanted to ensure that everything auto-scales. Whether processing 1 document or 1 million, the platform needed to handle any workload.
We chose AWS Lambda and AWS Step Functions as our primary technology stack for document processing. For the edge cases where we rely on third-party solutions, this meant ensuring their software also worked in the AWS auto-scaling stack.
One such partner is Native Documents, with whom we collaborated on DOCX to PDF conversion. They were able to quickly re-package their converter into a Serverless Lambda CloudFormation repository, which we dropped right into an AWS Step Functions workflow.
This modular approach embraces the Unix philosophy of "Do one thing well". The Native Documents docx-to-pdf converter does one thing extremely well: it converts Word docs into PDF files. Our upstream and downstream conversion filters follow the same philosophy, and when combined in series, these modules perform well at any scale.
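To make the "combined in series" idea concrete, here is a minimal sketch of how single-purpose Lambda tasks might be chained in an Amazon States Language definition. The stage names, the Lambda ARNs, and the retry settings are hypothetical placeholders chosen for illustration; they are not the actual iDialogue or Native Documents configuration.

```python
import json

# Hypothetical stage names and Lambda ARNs, for illustration only.
STAGES = [
    ("PrepareDocx", "arn:aws:lambda:us-east-1:123456789012:function:prepare-docx"),
    ("ConvertDocxToPdf", "arn:aws:lambda:us-east-1:123456789012:function:docx-to-pdf"),
    ("StorePdf", "arn:aws:lambda:us-east-1:123456789012:function:store-pdf"),
]


def build_state_machine(stages):
    """Chain single-purpose Lambda tasks into one Amazon States Language definition."""
    if not stages:
        raise ValueError("at least one stage is required")
    states = {}
    for index, (name, arn) in enumerate(stages):
        state = {
            "Type": "Task",
            "Resource": arn,
            # Assumed retry policy: retry failed tasks with exponential backoff.
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
        }
        if index + 1 < len(stages):
            # Hand the output of this task to the next stage in the series.
            state["Next"] = stages[index + 1][0]
        else:
            # The last stage terminates the workflow.
            state["End"] = True
        states[name] = state
    return {
        "Comment": "Illustrative document pipeline: each task does one thing",
        "StartAt": stages[0][0],
        "States": states,
    }


definition = build_state_machine(STAGES)
print(json.dumps(definition, indent=2))
```

Because each stage is its own Task state, a converter such as the docx-to-pdf Lambda can be swapped or scaled independently of the filters around it.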
We're able to set performance metric goals and monitor the performance of individual document tasks in CloudWatch. These metrics helped us identify some bottlenecks and optimize the document processing pipeline. At the time of writing this article, AWS was in the process of further improving Lambda performance, which we hope will ultimately get performance down to sub-second processing.
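As a rough sketch of per-task monitoring, the snippet below times one task and builds a metric entry in the shape CloudWatch expects for custom metrics. The metric name, the dimension name, the 1000 ms goal, and the stand-in converter are all assumptions made for illustration. Lambda also reports its own duration metrics to CloudWatch, so a custom metric like this may only be needed for finer-grained steps.

```python
import time


def timed_metric(task_name, task, *args):
    """Run one task and return (result, CloudWatch-style metric datum)."""
    start = time.perf_counter()
    result = task(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    datum = {
        "MetricName": "TaskDuration",  # hypothetical metric name
        "Dimensions": [{"Name": "Task", "Value": task_name}],
        "Value": elapsed_ms,
        "Unit": "Milliseconds",
    }
    return result, datum


def exceeds_goal(datum, goal_ms):
    """Return True when a task ran longer than its performance goal."""
    return datum["Value"] > goal_ms


def fake_convert(docx_bytes):
    """Stand-in for a real conversion task; it only prefixes a PDF marker."""
    return b"%PDF-" + docx_bytes


pdf, datum = timed_metric("ConvertDocxToPdf", fake_convert, b"demo")
print(datum["MetricName"], datum["Unit"], exceeds_goal(datum, goal_ms=1000.0))
```

With boto3 installed and AWS credentials configured, a datum like this could be published with `boto3.client("cloudwatch").put_metric_data(Namespace="DocPipeline", MetricData=[datum])`, where the `DocPipeline` namespace is again a placeholder.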
Jason Harrop, Founder of Native Documents, recently gave a presentation on "Millions of PDFs Once a Month: Serverless eStatements". iDialogue's Quote-to-Cash invoice generation engine is built upon the same principle: using auto-scaling serverless architecture to cost-effectively scale document generation services on demand.