Building an Azure OpenAI-Powered PDF Question-Answering System in .NET

Chander D.

CEO of Cazton, Author, Microsoft AI MVP, Microsoft RD & Google Developer Expert Award

Published: May 20, 2023

Introduction: With the increasing amount of information available in PDF documents, it has become essential to find ways to extract meaningful insights quickly and efficiently. One way to achieve this is by leveraging artificial intelligence and natural language processing techniques to create powerful question-answering systems.

In this blog post, we will guide you through the process of building a natural language question-answering system using OpenAI and Azure in .NET. This system will allow you to ask questions about a PDF document in natural language, and the AI will provide relevant answers based on the document's content.

We will cover every necessary step and concept required to make this work, including setting up the environment, configuring the OpenAI API client, loading and extracting text from a PDF document, and asking questions to receive answers.

Prerequisites: Before we start, ensure that you have the following installed and set up on your machine:

.NET 7 SDK
An IDE or text editor of your choice (Visual Studio, Visual Studio Code, or JetBrains Rider)
An Azure subscription with the Azure OpenAI API enabled. Use the following link: Introduction to Azure OpenAI Service
PdfPig NuGet package
Azure.AI.OpenAI NuGet package

Step 1: Set Up the Environment

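First, create a new .NET console app and install the required NuGet packages. The code in this post uses the prerelease 1.0.0-beta client API of Azure.AI.OpenAI, so pin a matching version; later releases changed the client types and method signatures:

dotnet new console -o OpenAI_PDF_QA_Azure
cd OpenAI_PDF_QA_Azure
dotnet add package PdfPig
dotnet add package Azure.AI.OpenAI --version 1.0.0-beta.5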

Next, replace the content of the Program.cs file with the following using directives. The code from the steps below goes after them:

using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Content;
using Azure;
using Azure.AI.OpenAI;
using static System.Environment;

Step 2: Configure the OpenAI API Client

To configure the OpenAI API client, you will need your Azure OpenAI API key, endpoint, and deployment name. Replace the placeholders in the following code snippet with your own values:

// Connection settings: replace the placeholders with your values
string key = "<Add your key>";
string endpoint = "<Add your endpoint>"; // Looks like: "https://Cazton.openai.azure.com/"
string engine = "<Add your deployment name>";

// Configure OpenAI API client
OpenAIClient client = new(new Uri(endpoint), new AzureKeyCredential(key));
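Hardcoded keys are easy to leak through source control. As a minimal sketch of an alternative, the following reads the three values from environment variables. The variable names (AZURE_OPENAI_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT) and the RequireEnvironmentVariable helper are assumptions made for illustration, not names required by Azure or the SDK:

```csharp
using System;

// Hypothetical variable names chosen for this sketch.
const string KeyVariable = "AZURE_OPENAI_KEY";
const string EndpointVariable = "AZURE_OPENAI_ENDPOINT";
const string DeploymentVariable = "AZURE_OPENAI_DEPLOYMENT";

// Returns the value of a required environment variable,
// or throws with a clear message when it is unset or blank.
static string RequireEnvironmentVariable(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException($"Environment variable '{name}' is not set.");
    }
    return value;
}

try
{
    string key = RequireEnvironmentVariable(KeyVariable);
    Uri endpoint = new Uri(RequireEnvironmentVariable(EndpointVariable));
    string engine = RequireEnvironmentVariable(DeploymentVariable);
    Console.WriteLine($"Loaded settings for deployment '{engine}' at {endpoint}.");
}
catch (Exception ex) when (ex is InvalidOperationException or UriFormatException)
{
    // Report the missing or malformed setting instead of crashing the demo.
    Console.WriteLine(ex.Message);
}
```

The key, endpoint, and engine values read this way can be passed to the OpenAIClient constructor in place of the placeholder strings.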

Step 3: Load the PDF File and Extract Text

In this step, we will use the PdfPig library to load a PDF file and extract its text content. Replace the pdfFilePath variable's value with the path to your PDF file:

// Load PDF file and extract text
string pdfFilePath = "gpt.pdf";
StringBuilder fullText = new StringBuilder();

using (PdfDocument pdf = PdfDocument.Open(pdfFilePath))
{
    // PdfPig page numbers start at 1
    for (int i = 0; i < pdf.NumberOfPages; i++)
    {
        Page page = pdf.GetPage(i + 1);
        // AppendLine keeps the last word of one page from fusing with the first word of the next
        fullText.AppendLine(page.Text);
    }
}
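Concatenating page.Text page after page loses the page boundaries and keeps whatever irregular spacing the extraction produced. A minimal sketch of one way to tidy this up follows. JoinPages and NormalizeWhitespace are hypothetical helpers, the [Page N] label format is an arbitrary choice, and the sample strings stand in for the page.Text values from the loop above:

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Collapses every run of whitespace into a single space and trims both ends.
static string NormalizeWhitespace(string text) => Regex.Replace(text, @"\s+", " ").Trim();

// Joins per-page text into one string and labels each page,
// so an answer can be traced back to where it came from.
static string JoinPages(IReadOnlyList<string> pageTexts)
{
    var builder = new StringBuilder();
    for (int i = 0; i < pageTexts.Count; i++)
    {
        builder.Append("[Page ").Append(i + 1).Append("]\n");
        builder.Append(NormalizeWhitespace(pageTexts[i])).Append("\n\n");
    }
    return builder.ToString().TrimEnd();
}

// Placeholder strings standing in for the text of two PDF pages.
var samplePages = new List<string> { "  Title   of the\tdocument ", "Second\n\npage   body" };
Console.WriteLine(JoinPages(samplePages));
```

In the tutorial's loop, the page.Text values would be collected into a list first and then passed to JoinPages instead of being appended directly.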

Step 4: Set Up the GPT-3 Model and Prompt

Now that we have extracted the text from the PDF, we will set up the GPT-3 model and the initial prompt. The prompt includes the full text of the PDF document, which gives GPT-3 the context it needs to answer the questions:

// Set up GPT-3 model and prompt
string modelEngine = engine; // the deployment name from Step 2, e.g. "CaztonDavinci3"
string prompt = $"What is the answer to the following question regarding the PDF document?\n\n{fullText}\n\n";

Step 5: Ask Questions and Get Answers

In this step, we will define an array of questions to ask about the PDF document. Then, we will iterate through these questions, append each one to the prompt from Step 4, send the result to the GPT-3 model, and print the answer to the console:

// Ask questions and get answers
string[] questions = { "What is the document about?", "Who wrote the document?", "What is the main idea of the document?" };

foreach (string question in questions)
{
    // Combine the document context with the current question
    string fullPrompt = prompt + question;

    // Pass the prompt through the options object so that MaxTokens takes effect
    CompletionsOptions completionsOptions = new()
    {
        Prompts = { fullPrompt },
        MaxTokens = 100
    };

    Response<Completions> completionsResponse = await client.GetCompletionsAsync(modelEngine, completionsOptions);
    string answer = completionsResponse.Value.Choices[0].Text;

    Console.WriteLine($"{question}\n{answer}\n");
}
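Because the prompt carries the whole document, a long PDF can exceed the model's token limit. As a rough sketch, the helper below caps the document text at a character budget before building the prompt. BuildPrompt, EstimateTokens, the prompt wording, the 8,000-character budget, and the four-characters-per-token rule of thumb are all assumptions for illustration; an exact count would need the model's tokenizer:

```csharp
using System;

// Builds one prompt from an instruction, the document text, and a question.
// The document text is cut to maxContextChars so the prompt stays within a rough size budget.
static string BuildPrompt(string documentText, string question, int maxContextChars)
{
    if (maxContextChars < 0) throw new ArgumentOutOfRangeException(nameof(maxContextChars));

    string context = documentText.Length <= maxContextChars
        ? documentText
        : documentText.Substring(0, maxContextChars);

    return "Answer the question using only the document below.\n\n" +
           $"Document:\n{context}\n\n" +
           $"Question: {question}\nAnswer:";
}

// Assumption: about 4 characters per token for English text.
// This is a rule of thumb, not the tokenizer's real count.
static int EstimateTokens(string text) => (text.Length + 3) / 4;

// Placeholder text standing in for the fullText extracted from the PDF.
string sampleDocument = new string('x', 10_000);
string samplePrompt = BuildPrompt(sampleDocument, "What is the document about?", maxContextChars: 8_000);
Console.WriteLine($"Prompt length: {samplePrompt.Length} characters, roughly {EstimateTokens(samplePrompt)} tokens");
```

Simple truncation drops everything past the budget, so questions about later pages go unanswered; it is a stopgap, not a substitute for splitting the document.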

Now, combine all the code snippets into the Main method in the Program.cs file:

using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Content;
using Azure;
using Azure.AI.OpenAI;
using static System.Environment;

namespace OpenAI_PDF_QA_Azure
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Connection settings: replace the placeholders with your values
            string key = "<Add your key>";
            string endpoint = "<Add your endpoint>";
            string engine = "<Add your deployment name>";

            // Configure OpenAI API client
            OpenAIClient client = new(new Uri(endpoint), new AzureKeyCredential(key));

            // Load PDF file and extract text
            string pdfFilePath = "gpt.pdf";
            StringBuilder fullText = new StringBuilder();

            using (PdfDocument pdf = PdfDocument.Open(pdfFilePath))
            {
                // PdfPig page numbers start at 1
                for (int i = 0; i < pdf.NumberOfPages; i++)
                {
                    Page page = pdf.GetPage(i + 1);
                    fullText.AppendLine(page.Text);
                }
            }

            // Set up GPT-3 model and prompt
            string modelEngine = engine; // the deployment name, e.g. "CaztonDavinci3"
            string prompt = $"What is the answer to the following question regarding the PDF document?\n\n{fullText}\n\n";

            // Ask questions and get answers
            string[] questions = { "What is the document about?", "Who wrote the document?", "What is the main idea of the document?" };

            foreach (string question in questions)
            {
                // Combine the document context with the current question
                string fullPrompt = prompt + question;

                // Pass the prompt through the options object so that MaxTokens takes effect
                CompletionsOptions completionsOptions = new()
                {
                    Prompts = { fullPrompt },
                    MaxTokens = 100
                };

                Response<Completions> completionsResponse = await client.GetCompletionsAsync(modelEngine, completionsOptions);
                string answer = completionsResponse.Value.Choices[0].Text;

                Console.WriteLine($"{question}\n{answer}\n");
            }
        }
    }
}

Finally, run the console app with the following command:

dotnet run

Conclusion: In this blog post, we have demonstrated how to build a natural language question-answering system using Azure OpenAI in .NET. This system allows you to ask questions about a PDF document in natural language, and the AI will provide relevant answers based on the document's content.

We covered setting up the environment, configuring the OpenAI API client, loading and extracting text from a PDF document, and asking questions to receive answers.

This approach can be adapted to other GPT-3 models, other PDF documents, or even other types of documents. It can be further extended to support additional natural language processing tasks such as summarization, translation, or sentiment analysis.

By leveraging the power of OpenAI and Azure in .NET, you can create intelligent applications that can understand and process human language, unlocking new possibilities and improving efficiency in various domains like education, research, customer support, and more.

In the future, you can explore incorporating this question-answering system into a web application, chatbot, or voice assistant, providing users with an interactive interface to extract information from documents seamlessly.

Improvements to Consider:

Introducing Efficient Chunking: One limitation of OpenAI GPT-3 is its token limit, which restricts how much text can be processed in a single request. To work within this limit, we can break the PDF document into smaller sections and process them one at a time, which lets us extract information from the entire document rather than only the part that fits in one prompt.
Utilizing Better Retrieval Algorithms: Retrieving relevant information from a PDF document can be challenging, especially when dealing with large datasets. To address this, we can employ advanced retrieval algorithms that prioritize and rank the most pertinent sections of the document. These algorithms can consider factors such as keyword relevance, document structure, and semantic understanding to improve the accuracy and speed of information retrieval.
Incorporating Embeddings and Vector Databases: To enhance the search capabilities, we can leverage word embeddings and vector databases. By converting the text in the PDF document and the query into numerical representations, we can measure the semantic similarity between them. This enables us to find answers that may not have an exact keyword match but share a semantic relationship. By integrating vector databases, we can speed up the retrieval process and ensure more accurate results.
Building an Enhanced Query Interface: A crucial aspect of efficient document questioning is a user-friendly and powerful query interface. We can create an interface that allows users to input multiple questions and receive corresponding answers in a structured manner. This interface can support advanced query features like complex boolean operations, filters, and context-aware querying. Such improvements will empower users to extract comprehensive insights from PDF documents in a single interaction.
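The first three improvements can be sketched together in a few lines. The example below splits text into overlapping chunks and ranks them against a question. It scores chunks with cosine similarity over plain word counts, which is only a stand-in for real embeddings: a production system would get vectors from an embeddings model and store them in a vector database. ChunkText, TermCounts, CosineSimilarity, TopChunks, and the chunk sizes are all hypothetical choices for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Splits text into chunks of at most chunkSize characters.
// Consecutive chunks share `overlap` characters so sentences cut at a boundary appear in both.
static List<string> ChunkText(string text, int chunkSize, int overlap)
{
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlap < 0 || overlap >= chunkSize) throw new ArgumentOutOfRangeException(nameof(overlap));

    var chunks = new List<string>();
    int step = chunkSize - overlap;
    for (int start = 0; start < text.Length; start += step)
    {
        int length = Math.Min(chunkSize, text.Length - start);
        chunks.Add(text.Substring(start, length));
        if (start + length >= text.Length) break; // this chunk reached the end of the text
    }
    return chunks;
}

// Counts case-insensitive word occurrences; a crude substitute for an embedding vector.
static Dictionary<string, int> TermCounts(string text)
{
    var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
    char[] separators = { ' ', '\n', '\r', '\t', '.', ',', ';', ':', '!', '?', '(', ')', '"' };
    foreach (string word in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
    {
        counts[word] = counts.TryGetValue(word, out int seen) ? seen + 1 : 1;
    }
    return counts;
}

// Cosine similarity between two word-count vectors; 0 when either vector is empty.
static double CosineSimilarity(Dictionary<string, int> a, Dictionary<string, int> b)
{
    double dot = 0;
    foreach (var pair in a)
    {
        if (b.TryGetValue(pair.Key, out int other)) dot += pair.Value * other;
    }
    double normA = Math.Sqrt(a.Values.Sum(v => (double)v * v));
    double normB = Math.Sqrt(b.Values.Sum(v => (double)v * v));
    return normA == 0 || normB == 0 ? 0 : dot / (normA * normB);
}

// Returns the `count` chunks most similar to the question, best first.
static List<string> TopChunks(IEnumerable<string> candidates, string question, int count)
{
    var questionVector = TermCounts(question);
    return candidates
        .OrderByDescending(candidate => CosineSimilarity(TermCounts(candidate), questionVector))
        .Take(count)
        .ToList();
}

// Placeholder text standing in for the fullText extracted from the PDF.
string sampleText = "PdfPig reads each page of the PDF. " +
                    "The token limit restricts how much text fits in one prompt. " +
                    "Chunking splits the text so every prompt stays small.";
List<string> sampleChunks = ChunkText(sampleText, chunkSize: 60, overlap: 10);
foreach (string best in TopChunks(sampleChunks, "What does the token limit restrict?", count: 2))
{
    Console.WriteLine(best);
}
```

Only the top-ranked chunks would then be placed in the prompt, which keeps each request small regardless of the document's length.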

Mastering OpenAI GPT-4 and Azure OpenAI using Python and .NET

We are thrilled to offer a comprehensive training program for individuals who are eager to expand their knowledge and master the art of PDF document questioning. Through hands-on exercises, real-world examples, and personalized guidance, participants will learn how to implement efficient chunking, apply advanced retrieval algorithms, utilize embeddings and vector databases, and build an enhanced query interface. They will also gain insights into optimizing model performance, handling different document types, and overcoming common challenges encountered during the document questioning process.

Conclusion:

By incorporating these improvements and participating in our training program, you will unlock the full potential of OpenAI GPT-3 for PDF document questioning. With efficient chunking, better retrieval algorithms, embeddings and vector databases, and an enhanced query interface, you will be able to extract valuable information, answer complex queries, and gain deeper insights from PDF documents.

Are you ready to take your skills to the next level? Join our training program and discover the limitless possibilities of OpenAI. Sign up for our online training program and take advantage of a special $1,000 discount by using the code "Cazton" during registration. Mastering OpenAI GPT-4 and Azure OpenAI using Python and .NET Tickets, Thu, Jun 22, 2023 at 9:00 AM | Eventbrite

Mastering OpenAI GPT-4 and Azure OpenAI using Python and .NET

Training Agenda:

Setting Up the Environment and Working with APIs

Module 1: Setting up Python and .NET Environments for GPT-4

Module 2: Mastering the Art of Prompt Engineering

Enterprise development with OpenAI: APIs, Text Generation, and GPT-4 App Development

Module 3: Understanding and Working with OpenAI APIs

Module 4: Text Generation and Completion

Module 5: Building a GPT-4 Powered PDF Query App

Advanced Projects and Techniques

Module 6: Chatbots and Chat Completion

Module 7: GPT-4 Chat Bot for Your Website

Module 8: GPT-4 with Private Data

Module 9: GPT-4 with Live Internet Data

Module 10: Vector Databases

For detailed agenda, click here: Mastering OpenAI GPT-4 and Azure OpenAI using Python and .NET Tickets, Thu, Jun 22, 2023 at 9:00 AM | Eventbrite

Happy coding!
