Plan an Azure AI Document Intelligence Solution

Introduction

Today, I delved into Azure AI Document Intelligence, a service within Azure AI Services that turns scanned forms into structured data. It is particularly valuable for businesses that need to process large volumes of forms quickly and accurately.

Understanding AI Document Intelligence

Azure AI Document Intelligence streamlines data entry workflows. Traditionally, extracting data from forms required manual input, a time-consuming and error-prone process; the service automates that extraction step.


Key Features of Azure AI Document Intelligence

What is Azure AI Document Intelligence?

Azure AI Document Intelligence is an Azure service designed to analyze forms and extract the data they contain. This service can significantly reduce the time and cost associated with manual data entry while minimizing errors.

Responsible Use of AI

Microsoft emphasizes six principles for responsible AI use: Fairness, Reliability and Safety, Privacy and Security, Inclusiveness, Transparency, and Accountability. Ensuring adherence to these principles is crucial for developing ethical and effective AI solutions.

Models in Azure AI Document Intelligence

Azure AI Document Intelligence provides several prebuilt models for common forms like invoices, receipts, and business cards. For unique forms, custom models can be trained to meet specific requirements. These models can even be combined into composed models for analyzing different types of documents with a single endpoint.

In Azure AI Document Intelligence, three of the prebuilt models are for general document analysis:

  • Read
  • General Document
  • Layout

The other prebuilt models expect a common type of form or document:

  • Invoice
  • Receipt
  • W-2 US tax declaration
  • ID Document
  • Business card
  • Health insurance card

If you have an unusual or unique type of form, you can use the general document analysis prebuilt models above to extract information from it. However, if you want to extract more specific information than the prebuilt models support, you can create a custom model and train it by using examples of completed forms.

You can also associate multiple custom models, trained on different types of document, into a single model, known as a composed model. With a composed model, users can submit forms of different types to a single service, which identifies them and selects the most appropriate custom model to use in their analysis.

Using Azure AI Document Intelligence

Integration with Azure AI Vision

Azure AI Document Intelligence builds on Azure AI Vision's OCR capabilities but offers more sophisticated document analysis, such as identifying key-value pairs and tables. This makes it ideal for comprehensive document analysis solutions.

Tools and APIs

Azure AI Document Intelligence Studio allows for code-free exploration and testing. For integration into applications, APIs are available in multiple languages, including C#/.NET, Java, Python, and JavaScript.
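The language SDKs wrap a REST API, so it can help to see roughly what a raw analyze call looks like. The sketch below only builds the request and does not send it. It assumes the v3.1 REST route (`/formrecognizer/documentModels/{modelId}:analyze` with `api-version=2023-07-31`); newer API versions use a different path, and the function name `build_analyze_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint: str, key: str, model_id: str, document_url: str,
                          api_version: str = "2023-07-31") -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an analysis."""
    # Assumed v3.1 route; check the API version your resource supports.
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{urllib.parse.quote(model_id)}:analyze?api-version={api_version}"
    )
    # The document is referenced by URL rather than uploaded as bytes.
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,  # the resource's access key
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Analysis is a long-running operation, so a real client would send this request and then poll the operation URL returned in the response headers. The SDK pollers do that for you.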

Planning and Deployment

Creating Azure AI Document Intelligence Resources

To start, you'll need to create and configure resources in your Azure subscription. This involves selecting the appropriate pricing tier and obtaining the necessary connection details (Endpoint and Access Key).

Steps:

  1. In the Azure portal, select Create a resource.
  2. In the Search services and marketplace box, type Document Intelligence and then press Enter.
  3. In the Document Intelligence page, select Create.
  4. In the Create Document Intelligence page, under Project Details, select your Subscription and either select an existing Resource group or create a new one.
  5. Under Instance details, select a Region near your users.
  6. In the Name textbox, type a unique name for the resource.
  7. Select a Pricing tier and then select Review + create.
  8. If the validation tests pass, select Create. Azure deploys the new Azure AI Document Intelligence resource.
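Once the resource exists, the endpoint and key are usually kept out of source code. Here is a minimal sketch of loading them from environment variables; the variable names `DOCUMENT_INTELLIGENCE_ENDPOINT` and `DOCUMENT_INTELLIGENCE_KEY` and the helper `load_settings` are assumptions for illustration, not names the service requires.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DocumentIntelligenceSettings:
    endpoint: str
    key: str


def load_settings(env: Mapping[str, str] = os.environ) -> DocumentIntelligenceSettings:
    """Read the endpoint and key from environment variables (names are assumptions)."""
    endpoint = env.get("DOCUMENT_INTELLIGENCE_ENDPOINT", "").strip()
    key = env.get("DOCUMENT_INTELLIGENCE_KEY", "").strip()
    if not endpoint.startswith("https://"):
        raise ValueError("DOCUMENT_INTELLIGENCE_ENDPOINT must be an https:// URL")
    if not key:
        raise ValueError("DOCUMENT_INTELLIGENCE_KEY is not set")
    # Drop a trailing slash so later URL joins do not produce a double slash.
    return DocumentIntelligenceSettings(endpoint=endpoint.rstrip("/"), key=key)
```

Failing fast on a missing value gives a clearer error than an authentication failure on the first request.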

Example Code for Integration

Here’s a snippet of Python code to connect your application to Azure AI Document Intelligence:

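# Requires the azure-ai-formrecognizer package (pip install azure-ai-formrecognizer).
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import AnalyzeResult, DocumentAnalysisClient

# Connection details from the resource's Keys and Endpoint page.
endpoint = "<your-endpoint>"
key = "<your-key>"
doc_url = "<url-of-document-to-analyze>"

# begin_analyze_document_from_url belongs to DocumentAnalysisClient; the newer
# DocumentIntelligenceClient does not provide a method with this name.
document_analysis_client = DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Start the long-running analysis with the General Document model, then wait for it.
poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-document", doc_url)
result: AnalyzeResult = poller.result()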
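Once `result` is available, the extracted key-value pairs can be flattened into a plain dictionary. The sketch below is duck-typed: it assumes only that the result exposes `key_value_pairs`, each with `key.content` and an optional `value.content`, which is the shape of the SDK's analyze result. The helper name and the sample data are made up so the example runs without an Azure resource.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def key_value_pairs_to_dict(result: Any) -> Dict[str, Optional[str]]:
    """Flatten extracted key-value pairs into {key text: value text or None}."""
    pairs: Dict[str, Optional[str]] = {}
    for pair in result.key_value_pairs or []:
        if pair.key is None:
            continue
        # A key can be detected without a value, e.g. an empty form field.
        pairs[pair.key.content] = pair.value.content if pair.value is not None else None
    return pairs


# Stand-in for a real analysis result (hypothetical data).
sample_result = SimpleNamespace(key_value_pairs=[
    SimpleNamespace(key=SimpleNamespace(content="Name"), value=SimpleNamespace(content="Ada Lovelace")),
    SimpleNamespace(key=SimpleNamespace(content="Signature"), value=None),
])

print(key_value_pairs_to_dict(sample_result))
```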

Choosing the Right Model

Azure AI Document Intelligence offers both prebuilt and custom models. Prebuilt models are ideal for standard documents like invoices and receipts, while custom models are suited for unique forms. By training custom models with examples, you can achieve high accuracy in data extraction for specialized documents.

Prebuilt Models in Azure AI Document Intelligence

Document types like invoices and receipts often have similar structures and key-value pairs across different businesses. For example, the "Total cost" field may appear as "Total," "Sum," or another term. Azure AI Document Intelligence provides several prebuilt models to handle these common types of documents, making it quick and easy to create solutions without needing to train your own models.
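In practice this means code can ask for a normalized field name regardless of how the label was printed on the page. Below is a minimal sketch of reading fields from one analyzed invoice with a confidence cutoff. `VendorName` and `InvoiceTotal` are field names from the prebuilt invoice model; the helper `read_field`, the 0.8 threshold, and the sample values are assumptions, and a real `InvoiceTotal` value is a currency object rather than a bare number.

```python
from types import SimpleNamespace
from typing import Any, Optional


def read_field(document: Any, name: str, min_confidence: float = 0.8) -> Optional[Any]:
    """Return a field's value, or None if it is missing or below the threshold."""
    field = document.fields.get(name)
    if field is None or field.confidence is None:
        return None
    return field.value if field.confidence >= min_confidence else None


# Stand-in for one analyzed invoice; a real one would come from result.documents.
sample_invoice = SimpleNamespace(fields={
    "VendorName": SimpleNamespace(value="Contoso", confidence=0.97),
    "InvoiceTotal": SimpleNamespace(value=110.0, confidence=0.55),
})

print(read_field(sample_invoice, "VendorName"))
print(read_field(sample_invoice, "InvoiceTotal"))
print(read_field(sample_invoice, "InvoiceTotal", min_confidence=0.5))
```

Values that fall below the threshold are good candidates for a manual review queue instead of being silently dropped.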

General Document Analysis Models

  • Read: Extracts words and lines from both printed and handwritten documents, and detects the document's language.
  • General Document: Extracts key-value pairs and tables.
  • Layout: Extracts text, tables, and structure information from forms, including selection marks like checkboxes and radio buttons.
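Tables come back as a flat list of cells, so a common first step is to rebuild the grid. This sketch assumes each table exposes `row_count`, `column_count`, and `cells` carrying `row_index`, `column_index`, and `content`, as the SDK's table objects do. It ignores merged cells (row and column spans) for brevity, and the sample table is invented.

```python
from types import SimpleNamespace
from typing import Any, List


def table_to_grid(table: Any) -> List[List[str]]:
    """Rebuild a row-major grid of cell text from a flat list of cells."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in for one extracted table (hypothetical data).
sample_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, content="Item"),
        SimpleNamespace(row_index=0, column_index=1, content="Price"),
        SimpleNamespace(row_index=1, column_index=0, content="Pen"),
        SimpleNamespace(row_index=1, column_index=1, content="2.50"),
    ],
)

for row in table_to_grid(sample_table):
    print(" | ".join(row))
```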

Specific Document Type Models

  • Invoice: Extracts key information from sales invoices in English and Spanish.
  • Receipt: Extracts data from both printed and handwritten receipts.
  • W-2: Extracts data from the U.S. government's W-2 tax declaration form.
  • ID Document: Extracts data from U.S. driver's licenses and international passports.
  • Business Card: Extracts names and contact details from business cards.
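In code, each of these models is selected by passing its model ID to the analyze call. The sketch below maps the document types listed above to the IDs published for the v3.x API (availability varies by API version, so treat the table as something to verify against your resource). The helper `resolve_model_id` and its fallback to the layout model are assumptions.

```python
from typing import Dict, Optional

# Model IDs as published for the v3.x API; check which ones your API version offers.
PREBUILT_MODEL_IDS: Dict[str, str] = {
    "read": "prebuilt-read",
    "general document": "prebuilt-document",
    "layout": "prebuilt-layout",
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "w-2": "prebuilt-tax.us.w2",
    "id document": "prebuilt-idDocument",
    "business card": "prebuilt-businessCard",
}


def resolve_model_id(document_kind: str, custom_models: Optional[Dict[str, str]] = None) -> str:
    """Pick a model ID: custom model first, then prebuilt, then a general fallback."""
    kind = document_kind.strip().lower()
    if custom_models and kind in custom_models:
        return custom_models[kind]
    # Unknown form types fall back to a general analysis model (an assumption).
    return PREBUILT_MODEL_IDS.get(kind, "prebuilt-layout")


print(resolve_model_id("Invoice"))
print(resolve_model_id("delivery note", {"delivery note": "my-delivery-note-model"}))
print(resolve_model_id("unknown form"))
```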

Custom Models

If prebuilt models don’t meet your needs, you can create custom models tailored to the specific documents you'll be analyzing. While general document analyzers can extract rich information, custom models provide more predictable and standardized results for unique form types.

Training Custom Models

To train a custom model, provide at least five examples of completed forms. The more examples you supply, the greater the confidence levels in the analysis. Include a range of document variations, such as both handwritten and printed entries, to ensure reliability.

There are two types of custom models:

  1. Custom Template Models: Ideal for forms with a consistent visual template. If the blank forms are identical, use this model. It supports 9 languages for handwritten text and a wide range for printed text. For different template variations, train a model for each and compose them together.
  2. Custom Neural Models: Suitable for both structured and unstructured documents, such as contracts with no defined structure. These models work best in English but also support other languages written in the Latin script, such as German, French, Italian, Spanish, and Dutch.
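The two decisions above (enough samples, and template versus neural) can be checked before any training job is submitted. A minimal sketch, assuming the build mode is passed to the service as the string `"template"` or `"neural"`; the helper names and sample file names are hypothetical.

```python
from typing import Sequence

MIN_TRAINING_SAMPLES = 5  # minimum number of completed forms stated above


def choose_build_mode(fixed_template: bool) -> str:
    """Template mode for visually identical forms, neural mode otherwise."""
    return "template" if fixed_template else "neural"


def check_training_set(sample_names: Sequence[str]) -> int:
    """Return the number of distinct samples, or raise if there are too few."""
    distinct = len(set(sample_names))
    if distinct < MIN_TRAINING_SAMPLES:
        raise ValueError(
            f"Need at least {MIN_TRAINING_SAMPLES} distinct samples, got {distinct}"
        )
    return distinct


samples = [f"form-{i}.pdf" for i in range(1, 7)]  # hypothetical file names
print(check_training_set(samples), choose_build_mode(fixed_template=True))
```

Counting distinct names guards against padding the set with duplicates, which would meet the minimum without adding any variation.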

Composed Models

Composed models consist of multiple custom models, which is useful when you don’t know the document type in advance. When a form is submitted, the service classifies it to determine which custom model to use. This is beneficial for handling a variety of similar forms and simplifies publishing a single endpoint for all form types. The results include the docType property, indicating the chosen custom model for each form.
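After analysis with a composed model, results can be routed by that document type. The sketch below groups analyzed documents by `doc_type`, the Python SDK's name for the `docType` property. The helper name and the document type strings are hypothetical, and the sample objects stand in for a real result.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Dict, List


def group_by_doc_type(result: Any) -> Dict[str, List[Any]]:
    """Group analyzed documents by the custom model the service chose for them."""
    groups: Dict[str, List[Any]] = defaultdict(list)
    for document in result.documents or []:
        groups[document.doc_type].append(document)
    return dict(groups)


# Stand-in for a composed-model result; the doc_type strings are made up.
sample_result = SimpleNamespace(documents=[
    SimpleNamespace(doc_type="supplier-a-order", confidence=0.93),
    SimpleNamespace(doc_type="supplier-b-order", confidence=0.88),
    SimpleNamespace(doc_type="supplier-a-order", confidence=0.91),
])

for doc_type, documents in group_by_doc_type(sample_result).items():
    print(doc_type, len(documents))
```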

By understanding the differences between prebuilt, custom, and composed models, you can effectively utilize Azure AI Document Intelligence to streamline document processing in various scenarios. This knowledge not only helps in selecting the right model but also in designing efficient and accurate data extraction solutions.

Conclusion

Exploring Azure AI Document Intelligence has been an insightful journey. This powerful tool not only automates data extraction but also adheres to ethical AI principles, ensuring fair, reliable, and secure operations. Whether using prebuilt models for common documents or custom models for unique forms, Azure AI Document Intelligence provides a robust solution for modern data entry challenges.
