Understanding Custom Classifiers in Google Document AI

Vijay Chaudhary

Lead Software Engineer

Published: December 29, 2024

GCP Document AI offers three categories of models or services: general document processors (Layout, Form and Doc OCR), specialized processors (invoices, tax forms, lending forms, contracts, etc.) and custom processors (custom classifiers, splitters and extractors). This article focuses on the custom classifier feature, CDC (custom document classifier). CDC is used for document classification, mainly to identify the type of business documents that do not fall into the general document categories (such as invoice or passport). Classifying a document, that is, identifying its type, is an important prerequisite for extraction. The end goal of document capture is to lift field data automatically from an image and export it, and to know which data fields need to be lifted you need to know the type of the document (for example, you won't be able to lift an invoice number or invoice date from a gas bill). The custom model is trained on documents specific to the business, and custom classes are created according to the requirements of the implementation.

High-level steps to create and use a GCP custom classifier:

Initialize a Custom Classifier - Set up a new custom classifier in Document AI.

Prepare a Dataset - Create a dataset using a designated Cloud Storage bucket.

Import Documents - Upload the documents required for training and testing.

Annotate Documents - Label document data manually in Document AI.

Train the Classifier - Initiate a training job to build a custom classification model.

Evaluate the Model - Check performance on the test dataset to validate accuracy.

Deploy the Classifier - Publish the trained classifier for production use.

Test with Production-like Images - Test the classifier on images similar to those expected in production.

Start Using the Classifier - Begin processing images.

Let's try the same steps in the GCP console for a better understanding.

Step [1] Enable the Document AI API in the console. Once it is enabled, go to the Custom Processors section and create a custom classifier.

Step [2] Enter the name and other details. On the next screen, click the Configure Your Dataset button, then either opt for Google-managed storage or provide the details of a storage bucket you already have for the dataset configuration.

Step [3] Create a storage bucket for the documents that will be part of the classification training. Upload the sample training and test images to this bucket.

Step [4] Click the Import Documents button and provide the storage bucket address. Keep the auto-split configuration if all your documents are in one folder. If you have separate folders for the training and test sets, you can add each folder separately.

It will take a few minutes for all documents to be imported.
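To make the auto-split idea in Step [4] concrete, here is a minimal local sketch of a random train/test split. It only illustrates the concept and is not how Document AI implements the feature. The 80/20 ratio, the fixed seed and the `gs://my-bucket/...` paths are assumptions chosen for the example.

```python
import random


def auto_split(doc_uris, test_fraction=0.2, seed=0):
    """Shuffle documents and split them into (training, test) lists.

    test_fraction and seed are illustrative assumptions, not values
    taken from Document AI.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(doc_uris)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one test document whenever there is anything to split.
    n_test = max(1, round(len(shuffled) * test_fraction)) if shuffled else 0
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical bucket and file names, for illustration only.
docs = [f"gs://my-bucket/samples/doc_{i:02d}.pdf" for i in range(15)]
train_docs, test_docs = auto_split(docs)
print(len(train_docs), "training documents,", len(test_docs), "test documents")
```

If your documents are already organized into separate training and test folders, no split like this is needed; you import each folder into its own set instead.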

Step [5] Click Edit Schema and add the document labels (the document type names).

Step [6] Double-click any document tile to start the labelling exercise: manually categorize which image belongs to which class. This information will be used to train the model. Select the class -> click Mark as Labelled -> continue doing the same for the other documents.

Review each image and select the document type.

Step [7] Once the minimum number of documents is labeled, train the model by clicking Train New Version.

You can run into issues if the minimum criteria for creating a custom model are not met. A minimum of ten documents is needed for the training set and two for the test set. Once you have met all the prerequisites, you should be able to create the model.

Note - To overcome these issues, import and label more documents.
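Before starting a training run, the minimum criteria above can be checked locally. The sketch below uses the thresholds stated in this article (ten training documents, two test documents). The function name and messages are hypothetical, and the actual limits enforced by the console may differ or change.

```python
# Thresholds as stated in this article; verify them against the console.
MIN_TRAIN_DOCS = 10
MIN_TEST_DOCS = 2


def dataset_problems(train_count, test_count):
    """Return a list of human-readable problems; empty means ready to train."""
    problems = []
    if train_count < MIN_TRAIN_DOCS:
        problems.append(
            f"training set has {train_count} labeled documents, "
            f"needs at least {MIN_TRAIN_DOCS}"
        )
    if test_count < MIN_TEST_DOCS:
        problems.append(
            f"test set has {test_count} labeled documents, "
            f"needs at least {MIN_TEST_DOCS}"
        )
    return problems


for problem in dataset_problems(train_count=6, test_count=1):
    print("Not ready:", problem)
```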

Step [8] Enter the version and click on the Start Training button.

Step [9] To check the status of model training, go to Manage Versions.

Step [10] Once training is completed, check the F1, precision and recall scores on the test set. These give you an indication of how well the model is performing.

Next, you can deploy the model if you are happy with the metrics; otherwise, import more samples and train the model again.
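For readers less familiar with the metrics in Step [10], the sketch below shows how precision, recall and F1 are conventionally computed for a single class from true positives, false positives and false negatives. The counts are invented for illustration; Document AI computes and reports these scores for you on the test set.

```python
def classification_metrics(tp, fp, fn):
    """Return (precision, recall, f1) for one class from raw counts."""
    # Precision: of the documents predicted as this class, how many were right.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the documents truly in this class, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Invented counts: 8 correct predictions, 2 false alarms, 2 misses.
precision, recall, f1 = classification_metrics(tp=8, fp=2, fn=2)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A low recall for one class usually suggests that class needs more labeled samples, which matches the advice above to import more documents and retrain.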

Step [11] Once deployment is completed, you can test by uploading an image. Go to the Evaluate & Test tab and upload a document.

In a few seconds you can see the document classification results and the confidence score as a percentage.
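When the same result is read programmatically, the predicted classes typically appear as entities on the returned document, each with a type (the class label) and a confidence between 0 and 1. The sketch below picks the highest-confidence class from such a response. The sample response is hand-written, the labels `gas_bill` and `bank_statement` are hypothetical, and the `document.entities[].type` / `confidence` layout is an assumption you should confirm against a real response from your processor.

```python
def top_classification(response):
    """Return (label, confidence) for the best class, or None if absent."""
    entities = response.get("document", {}).get("entities", [])
    if not entities:
        return None
    best = max(entities, key=lambda entity: entity.get("confidence", 0.0))
    return best.get("type"), best.get("confidence", 0.0)


# Hand-written sample shaped like a classifier response (assumed layout).
sample_response = {
    "document": {
        "entities": [
            {"type": "gas_bill", "confidence": 0.93},
            {"type": "bank_statement", "confidence": 0.07},
        ]
    }
}

result = top_classification(sample_response)
if result is not None:
    label, confidence = result
    print(f"{label}: {confidence:.0%}")
```

In practice you may also want a confidence threshold below which a document is routed to manual review rather than accepted automatically.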

Step [12] Once the model is available you can make API requests and get classification results.

Sample endpoint - https://us-documentai.googleapis.com/v1/projects/75XXXXXXX803/locations/us/processors/7dd4dazzzzzzzz429/processorVersions/21YYYYYY191af7cc:process

Request body format
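A minimal sketch of how the endpoint and the request body could be assembled in Python. It assumes the document is sent inline as a base64-encoded `rawDocument` with a `mimeType`, and it follows the URL pattern of the sample endpoint above. The project, processor and version IDs are placeholders, and the helper name is hypothetical. The sketch only builds the request; sending it also requires an OAuth 2.0 bearer token in the `Authorization` header.

```python
import base64
import json


def build_process_request(project, location, processor, version,
                          content, mime_type):
    """Build the (url, body) pair for a processorVersions ':process' call."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"processors/{processor}/processorVersions/{version}:process"
    )
    body = {
        "rawDocument": {
            # Binary file content must be base64-encoded in the JSON body.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, body


# Placeholder IDs and dummy bytes, for illustration only.
url, body = build_process_request(
    project="my-project-number",
    location="us",
    processor="my-processor-id",
    version="my-version-id",
    content=b"%PDF-1.4 sample bytes",
    mime_type="application/pdf",
)
print(url)
print(json.dumps(body, indent=2))
```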

Summary

In this article, we explored the creation and use of Custom Document Classifiers (CDC) within Google Cloud's Document AI platform. CDCs are well suited to categorizing business-specific documents, which enables the subsequent data extraction step. The workflow involves initializing a custom classifier, preparing a dataset, and importing documents. Users annotate data manually or use auto-labeling (once the first model version is deployed) to reduce annotation effort. We also covered splitting the dataset into training and test sets: the classification model is built from the training set, the evaluation metrics are computed on the test set, and if the scores are poor, the model can be improved by adding more samples and retraining.

Once the model is deployed, it can be tested with production-like documents. This process lets businesses classify document types as the step before automated data capture. Use Document AI custom classifiers to meet specific document classification needs, improving automatic classification accuracy and reducing manual labor.
