Custom Document Extractors with Google Document AI

Vijay Chaudhary

Lead Software Engineer

Published: December 8, 2024

GCP Document AI broadly offers three categories of document extraction models: general document processors (Layout, Form and Doc OCR), specialized processors (invoices, tax forms, lending forms, contracts, etc.) and custom processors (custom classifiers, splitters and extractors). In this article, we will focus on the Custom Extractor feature, sometimes also called CDE (custom document extractor). CDEs are suited to extracting data from images of specific business documents (forms unique to your organization). The processor identifies and extracts entities from documents; once trained, it can be used on new documents for data extraction.

High-level steps for CDE creation and testing:

Create a Custom Document Extractor - Initialize a custom processor in Document AI Workbench.

Create the Processor Schema - Add fields and data types.

Import Documents - Upload the documents to be used for training.

Annotate Documents Manually - Label document data manually to build a training dataset.

Use Generative AI to Auto-Label - Use AI to reduce the manual labelling effort.

Train the Processor - Start the training job to fine-tune the model.

Test & Evaluate the Trained Model - Validate the model's accuracy by testing it on a separate set of documents.

Deploy the Processor - Make the trained model available in production to extract data from similar documents.

Let's try the same steps in the GCP console for a better understanding.

Step [1] Enable the Document AI API in the console. Once it is enabled, go to the Custom Processors section and create a custom extractor.

Step [2] Enter the name and other details. On the next screen, click Get Started under the Customize tile.

Step [3] Create the fields and declare their data types.
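As a rough illustration of what a schema amounts to, the sketch below models fields and data types as plain Python data. The field names (`employee_name`, `ssn`, `hire_date`), the `SchemaField` class and the data type and occurrence labels are hypothetical and only approximate the options offered in the console. In practice you define the schema through the console UI, not through this code.

```python
from dataclasses import dataclass

# Assumed labels that approximate the choices offered in the console.
DATA_TYPES = {"plain_text", "number", "money", "datetime", "address", "checkbox"}
OCCURRENCES = {"required_once", "required_multiple", "optional_once", "optional_multiple"}


@dataclass(frozen=True)
class SchemaField:
    """One field of a (hypothetical) extractor schema."""
    name: str
    data_type: str
    occurrence: str = "optional_once"

    def __post_init__(self):
        # Reject values outside the assumed sets so typos surface early.
        if self.data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {self.data_type}")
        if self.occurrence not in OCCURRENCES:
            raise ValueError(f"unknown occurrence: {self.occurrence}")


def validate_schema(fields):
    """Return the field names in order, rejecting duplicate names."""
    names = [field.name for field in fields]
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate field names: {duplicates}")
    return names


# Hypothetical schema for an employee form.
SCHEMA = [
    SchemaField("employee_name", "plain_text", "required_once"),
    SchemaField("ssn", "plain_text", "required_once"),
    SchemaField("hire_date", "datetime"),
]

print(validate_schema(SCHEMA))
```

Writing the schema down like this before opening the console can help you agree on field names and types with the people who will label the documents.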

Step [4] Add documents from your local device or from Google Cloud Storage. These documents are used for training and testing, and are labelled in the annotation exercise that follows.

Step [5] Start labelling. Documents are auto-labelled by Generative AI; confirm a document if you think its suggestions are correct. Repeat these steps for all training and testing documents.

Review each field carefully, because the suggestions can be wrong. In the example used for this article, the SSN value was not read properly.
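One way to support this review is a simple format check on fields that follow a known pattern. The sketch below flags SSN-like values that do not match the common `NNN-NN-NNNN` layout. The function name, the `ssn` field name and the decision to check only the layout (not whether the number is actually valid) are assumptions made for illustration.

```python
import re

# Layout check only: three digits, two digits, four digits, separated by hyphens.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def needs_review(field_name, value):
    """Return True when an auto-labelled value looks suspicious.

    Only the hypothetical `ssn` field is checked here; other fields pass through.
    """
    if field_name != "ssn":
        return False
    return SSN_PATTERN.fullmatch(value.strip()) is None


# A misread character (letter "l" instead of digit "1") is flagged for review.
print(needs_review("ssn", "l23-45-6789"))
print(needs_review("ssn", "123-45-6789"))
```

A check like this does not replace the manual review; it only helps you find the suggestions that most obviously need correcting.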

You can run into issues if the minimum criteria for creating a custom model are not met. A minimum of ten documents is needed for each of the training and test datasets. Once you have met all the prerequisites, you should be able to create the model.
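The ten-document minimum can be checked before you start a training job. The helper below is a minimal sketch: the function names are hypothetical, and the threshold of ten per dataset is taken from the paragraph above rather than from any API.

```python
# At least ten labelled documents each for training and test, as stated above.
MIN_DOCS_PER_SPLIT = 10


def missing_documents(train_count, test_count, minimum=MIN_DOCS_PER_SPLIT):
    """Return how many more labelled documents each dataset still needs."""
    return {
        "training": max(0, minimum - train_count),
        "test": max(0, minimum - test_count),
    }


def ready_to_train(train_count, test_count, minimum=MIN_DOCS_PER_SPLIT):
    """True when both datasets meet the minimum size."""
    shortfall = missing_documents(train_count, test_count, minimum)
    return all(count == 0 for count in shortfall.values())


print(missing_documents(train_count=12, test_count=7))
print(ready_to_train(train_count=12, test_count=7))
```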

Step [6] Once the annotation exercise is complete, go to the Train a custom model tile and select Create New Version.

Step [7] To check whether the model is available, go to the Deploy & Use tab and check its status.

Step [8] Once the model is available, run the evaluation to check its accuracy against the test document set. The metrics in this example were obtained from the out-of-the-box Generative AI based model (you can select your model from the top left corner).
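To make the evaluation numbers easier to interpret, the sketch below computes precision, recall and F1 for a single field from pairs of predicted and expected values. The console calculates its metrics for you; the exact-string matching rule used here, the function name and the sample values are assumptions for illustration and may differ from how the service scores a match.

```python
def field_metrics(pairs):
    """Compute precision, recall and F1 for one field.

    `pairs` is a list of (predicted, expected) values, one per test document.
    None means the field was not extracted (predicted) or not present (expected).
    """
    tp = fp = fn = 0
    for predicted, expected in pairs:
        if predicted is not None and predicted == expected:
            tp += 1  # extracted and correct
        else:
            if predicted is not None:
                fp += 1  # extracted, but wrong or not expected
            if expected is not None:
                fn += 1  # expected value was missed or misread
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical results for an `ssn` field on four test documents.
ssn_pairs = [
    ("123-45-6789", "123-45-6789"),  # correct
    ("123-45-6780", "123-45-6789"),  # misread
    (None, "987-65-4321"),           # missed
    ("111-22-3333", "111-22-3333"),  # correct
]
print(field_metrics(ssn_pairs))
```

In this sample, two of three extracted values are correct (precision 2/3) and two of four expected values are found (recall 1/2), which gives an F1 of 4/7.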

Step [9] Once model training is complete, you can upload a document and see whether the model extracts the required fields. Once you are happy with the evaluation metrics on the test set and a few random spot tests, deploy the model and start using it on production images.
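Once deployed, the processor can also be called from code instead of the console. The sketch below assumes the `google-cloud-documentai` Python client library is installed and credentials are configured; the project, location and processor IDs are placeholders you would supply, and the helper names are hypothetical. The last lines use stand-in objects, not real API output, to show the shape of the result without calling the service.

```python
from types import SimpleNamespace


def entities_to_rows(entities, min_confidence=0.0):
    """Flatten extracted entities into dicts, dropping low-confidence ones."""
    rows = [
        {"field": e.type_, "value": e.mention_text, "confidence": e.confidence}
        for e in entities
    ]
    return [row for row in rows if row["confidence"] >= min_confidence]


def extract_entities(project_id, location, processor_id, file_path,
                     mime_type="application/pdf"):
    """Send one local file to a deployed processor and return its entities."""
    # Imported here so entities_to_rows works without the client library.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-documentai.googleapis.com"
        )
    )
    # Uses the processor's default version.
    name = client.processor_path(project_id, location, processor_id)
    with open(file_path, "rb") as handle:
        raw_document = documentai.RawDocument(
            content=handle.read(), mime_type=mime_type
        )
    result = client.process_document(
        request=documentai.ProcessRequest(name=name, raw_document=raw_document)
    )
    return entities_to_rows(result.document.entities)


# Stand-in entities (not real API output) that mimic the attributes used above.
sample = [
    SimpleNamespace(type_="employee_name", mention_text="Jane Doe", confidence=0.98),
    SimpleNamespace(type_="ssn", mention_text="l23-45-6789", confidence=0.41),
]
print(entities_to_rows(sample, min_confidence=0.9))
```

Filtering on confidence is one possible design choice: low-confidence fields can be routed to a person for review instead of being accepted automatically.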

Summary

We saw the creation and testing of Custom Document Extractors (CDE) in Google Cloud Platform's Document AI. CDEs are ideal for extracting data from business documents, such as forms unique to an organization. The process involves initializing a custom processor, defining the processor schema with fields and data types, uploading documents for training, and annotating the data manually. Generative AI can be used to auto-label documents, reducing the manual effort. Once sufficient labelled data is available, a training job is started to fine-tune the processor. After training, the processor is tested for accuracy on a separate dataset. Once validated, the model can be deployed for production use to process similar documents. By following these steps, users can build extraction solutions tailored to their enterprise.
