My Weekend Fun with the Google Vision API
My weekend started with a call from an old friend who publishes and reprints Marathi philosophical books. He told me he was looking for a tool to scan copies of old books and convert the text into Marathi in Unicode. He currently uses an optical character recognition (OCR) tool, but it lets him scan only one page at a time and the process is very time consuming, roughly 5 to 6 minutes per page. He wanted a tool that could take multiple images at once and return the output, so that his job would be easier and faster. He also told me that proofreading has to be done after OCR anyway, so the text scanned from the images does not need to be 100% correct.
I told him I could try to help, but that accuracy could not be the criterion for this tool, and that I would call him the next day.
On Saturday I looked at the Google Vision API and found it very easy to implement for the requirement he had described. I decided to give it a try on Google Cloud and to keep technical jargon out of it, so that he would be comfortable operating the tool. I really didn't want to create any compute or service resources. I thought of creating a simple function that would be simple to implement, simple to write and simple to maintain.
While working with clouds recently I had heard about Node.js and how lightweight it is, so I decided to implement this as a Node.js function. At the Google Vision API demo location I found sample code for the Vision API.
Creating a project and implementing the sample code took me about 30 to 40 minutes. I created one storage bucket and a few functions. On the storage bucket I set up a trigger so that whenever an object is created (an image is uploaded), the output is generated and written to a simple text file in the same location, containing the file name, detected language and detected text in plain text. A sample of the record I add to the end of the text file is shown below.
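A minimal sketch of such a trigger handler might look like the following. It is not the exact code from this project: the names `handleUpload`, `visionClient` and `saveRecord` are made up for illustration, and the Vision client is passed in as a parameter so the logic can be tried locally with a stand-in object. In a deployed function, `visionClient` would presumably be an `ImageAnnotatorClient` from the `@google-cloud/vision` package, and `file` would be the storage event payload.

```javascript
// Sketch of a storage-triggered OCR handler (all names are hypothetical).
// `file` mimics the storage event payload: { bucket, name, contentType }.
// `visionClient` is assumed to behave like ImageAnnotatorClient from
// @google-cloud/vision; `saveRecord` is a hypothetical callback that stores the result.
async function handleUpload(file, visionClient, saveRecord) {
  // Skip anything that is not an image. The output text file lives in the
  // same bucket, so without this check the function would fire again on it.
  if (!file.contentType || !file.contentType.startsWith('image/')) {
    return null;
  }

  // Ask Vision to read the image straight from Cloud Storage.
  const [result] = await visionClient.textDetection(`gs://${file.bucket}/${file.name}`);
  const annotations = result.textAnnotations || [];

  // The first annotation carries the whole text block and the detected locale.
  const record = {
    imageName: file.name,
    lang: annotations.length ? annotations[0].locale : 'unknown', // 'unknown' is a placeholder choice
    text: annotations.length ? annotations[0].description : '',
  };

  await saveRecord(record);
  return record;
}
```

The content-type check matters because the results are written back to the same bucket: a text file landing there would otherwise trigger the function again.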
Using some sample Marathi pages found online and through Google Images, I tested the functionality, and it worked very well, beyond my expectations. The next morning I called him and gave him a quick demo, and he found it useful.
--------------------
Image Name: Page1223.jpg
Lang: en
Extracted Text:
Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text, Sample extracted text,
--------------------
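The record layout shown above could be produced by a small helper along these lines. This is only an illustration: the function name `formatRecord` and its field names are invented here and are not taken from the project.

```javascript
// Hypothetical helper that renders one OCR result in the layout shown above.
const SEPARATOR = '--------------------';

function formatRecord({ imageName, lang, text }) {
  return [
    SEPARATOR,
    `Image Name: ${imageName}`,
    `Lang: ${lang}`,
    'Extracted Text:',
    text.trim(), // drop leading/trailing whitespace from the OCR output
    SEPARATOR,
  ].join('\n') + '\n';
}

// Example: print one record to the console.
console.log(formatRecord({
  imageName: 'Page1223.jpg',
  lang: 'en',
  text: 'Sample extracted text',
}));
```

Keeping the formatting in one place makes it easy to change the layout later without touching the OCR logic.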
I never thought the Google Vision API would be so easy to implement, or that its scope would be so broad that it works in most of the world's locales.
#GoogleVisionAPI