"Making a PDF OCR Tool with ChatGPT" - Your next big project awaits!

Today, I want to show how we can build an application that can:

  • Open a PDF file,
  • Go to the next or previous page,
  • Rotate the PDF page,
  • OCR selected areas in the PDF,
  • Save the extracted text from the PDF into a table,
  • Finally, export the table to an Excel file.
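
The last two items on this list are mostly bookkeeping: each OCR result becomes a cell in a table, and the table is written out at the end. Here is a minimal sketch of that idea, not the code ChatGPT generated. The `OcrTable` class and the sample drawing data are made up for illustration, and the export uses Python's standard `csv` module (a file Excel opens directly) as a stand-in for a true `.xlsx` export, which would need a third-party library.

```python
import csv
import io


class OcrTable:
    """Hypothetical in-memory table that collects OCR results cell by cell."""

    def __init__(self, headers):
        self.headers = list(headers)
        self.rows = []

    def add_value(self, text):
        """Put text in the next free cell, starting a new row when the last is full."""
        cleaned = " ".join(text.split())  # collapse line breaks and repeated spaces
        if not self.rows or len(self.rows[-1]) == len(self.headers):
            self.rows.append([])
        self.rows[-1].append(cleaned)

    def to_csv(self):
        """Return the table as CSV text, padding short rows with empty cells."""
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow(self.headers)
        for row in self.rows:
            padded = row + [""] * (len(self.headers) - len(row))
            writer.writerow(padded)
        return buffer.getvalue()


# Made-up sample values standing in for OCR output from a title block.
table = OcrTable(["Drawing No", "Title"])
table.add_value("A-101")
table.add_value("GROUND FLOOR\nPLAN")
table.add_value("A-102")
csv_text = table.to_csv()
```

Keeping the table separate from the GUI widget means the same data can feed both the on-screen table and the export.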

I asked ChatGPT to create this application based on the following prompt, which is really a process description:

I want to create a Python program with a graphical user interface, and for this, I want to use PyQt5; 

I will use Windows and VS Code IDE,   

Application layout requirements:  
- One top bar for buttons,  
- Two equal size canvases with proper scrollbars,  
- Application color scheme: Hex:343541, Hex:444654, Hex:bdc3ec, Hex:777b92   

The goal of the application: 
View a PDF file, OCR user selected area and export the value to Excel

First canvas (PDF Canvas): View and navigate PDF files quickly and easily (next, previous page, rotate clockwise, rotate anticlockwise, page/total count pages, zoom in, zoom out, extract text, export to Excel etc. buttons with functions)   

Second Canvas (Excel Canvas): A table like an Excel table that has rows and columns, OCR values from PDF Canvas will be posted on this canvas.   

Your method of explanation of code:  
Always write the whole code; if necessary, write it in parts and explain it step by step. You can start by installing libraries.        
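
Most of the PDF Canvas buttons requested in this prompt only change a few numbers: the current page, the rotation, and the zoom. A minimal sketch of that state in plain Python, independent of PyQt5, could look like this. The class name, method names, zoom step, and zoom limits are assumptions for illustration, not what ChatGPT returned.

```python
from dataclasses import dataclass


@dataclass
class PdfViewState:
    """Hypothetical view state behind the PDF Canvas toolbar buttons."""

    page_count: int
    page_index: int = 0   # zero-based index of the current page
    rotation: int = 0     # clockwise degrees: 0, 90, 180 or 270
    zoom: float = 1.0

    def next_page(self):
        # Stop at the last page instead of wrapping around.
        self.page_index = min(self.page_index + 1, self.page_count - 1)

    def previous_page(self):
        # Stop at the first page.
        self.page_index = max(self.page_index - 1, 0)

    def rotate_clockwise(self):
        self.rotation = (self.rotation + 90) % 360

    def rotate_anticlockwise(self):
        self.rotation = (self.rotation - 90) % 360

    def zoom_in(self, step=1.25, limit=8.0):
        self.zoom = min(self.zoom * step, limit)

    def zoom_out(self, step=1.25, limit=0.25):
        self.zoom = max(self.zoom / step, limit)

    def page_label(self):
        """Text for the page/total counter, shown one-based."""
        return f"{self.page_index + 1}/{self.page_count}"


state = PdfViewState(page_count=3)
state.previous_page()          # already on the first page, so nothing changes
state.next_page()
state.rotate_anticlockwise()   # rotation becomes 270
label = state.page_label()     # "2/3"
```

Each toolbar button would then call one of these methods and redraw the page from the updated state.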

I wanted to demonstrate this process to show how we can use ChatGPT in our daily work, and to highlight its usefulness in construction, particularly for extracting drawing titles from PDF files. As you may know, manually typing everything into an Excel file is cumbersome and error-prone.

Although the application has some limitations with certain PDF files that have unconventional coordinate systems, it has still been a significant improvement for me. I believe that every day we are advancing our knowledge of AI and discovering new ways to utilize it for greater productivity in the construction industry.
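
One common source of coordinate trouble, offered here as an assumption and not as a diagnosis of this app: the rectangle the user drags is in canvas pixels (origin at the top-left, y growing downwards), while PDF pages are measured in points with the origin normally at the bottom-left, and the view may also be zoomed or rotated. The sketch below converts a selection back to page coordinates under those assumptions. The function names are hypothetical, and pages with a shifted origin or a built-in rotation would need extra handling.

```python
def canvas_to_page(px, py, zoom, rotation, page_w, page_h):
    """Map a canvas pixel to unrotated page coordinates (points, top-left origin).

    zoom is pixels per point; rotation is the clockwise view rotation in degrees;
    page_w and page_h are the unrotated page size in points.
    """
    u, v = px / zoom, py / zoom  # undo the zoom first
    rotation %= 360
    if rotation == 0:
        return u, v
    if rotation == 90:
        return v, page_h - u
    if rotation == 180:
        return page_w - u, page_h - v
    if rotation == 270:
        return page_w - v, u
    raise ValueError("rotation must be a multiple of 90 degrees")


def selection_to_pdf_rect(x0, y0, x1, y1, zoom, rotation, page_w, page_h):
    """Return (left, bottom, right, top) in PDF points with a bottom-left origin."""
    ax, ay = canvas_to_page(x0, y0, zoom, rotation, page_w, page_h)
    bx, by = canvas_to_page(x1, y1, zoom, rotation, page_w, page_h)
    left, right = sorted((ax, bx))
    near_top, near_bottom = sorted((ay, by))
    # Flip the y axis: distance from the top becomes distance from the bottom.
    return left, page_h - near_bottom, right, page_h - near_top


# A 600 x 800 point page shown at 2 pixels per point.
upright = selection_to_pdf_rect(100, 200, 300, 400, 2.0, 0, 600, 800)
# (50.0, 600.0, 150.0, 700.0)
rotated = selection_to_pdf_rect(100, 200, 300, 400, 2.0, 90, 600, 800)
# (100.0, 50.0, 200.0, 150.0)
```

The same on-screen rectangle lands in a very different part of the page once the view is rotated, which is why an OCR region can come back empty if the rotation is ignored.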

Of course, this is not a professional-level application, as I am not a professional software developer, but ChatGPT has been very helpful in this process, providing good suggestions and understanding the overall goal of the app. It struggles with some small things because of limits on how much code it can display in a single response, but I am still pleased with it. I believe that asking good questions is the key to receiving good answers from AI. Overall, I think ChatGPT is a valuable tool, and I appreciate its assistance.

Anyway, after a weekend spent at home because of Covid, here is the result:

Application window without any PDF files
PDF loaded
Zoomed in to title block area
OCR text extraction
Text population on the table
Excel export result


