Can your OCR software do handwriting?
Scratching the surface of handwriting analysis, by Dan Atkins


What a question. Sometimes we need to correct expectations around handwriting. I have spent a lot of time analysing handwriting and what we can and can't do with it. In this article I am going to share some myth-busting and look at how the misconceptions around OCR/ICR and handwriting come about.

"My phone can do it… actually, so can my tablet and my Surface Pro. Why can't you do it? Surely it must be easy."

Well, let's stop for a moment and think about handwriting and how it is made up. We all have unique features in our handwriting that make it either easy or hard to read to the naked eye. A doctor's handwriting, for example, is notoriously hard to read. My handwriting, I am told, is easy to read. I am going to take a moment and go around my office to collect some handwriting for us to look at. Hang on…

OK, I asked three of my devs to write this sentence:

The rain in Spain stays mainly on the plain.

This is Cat's handwriting. I ran it through a bog-standard Tesseract read and got:

1thiin saju inhh978

I78FYni90lu8hj 9!o

Not good. I asked her to write the same sentence on her Surface Pro using the stylus, and then again with the mouse. Both gave the same result:

The rain in Spain stays mainly on the plain.

Here are two more examples.

Sophie’s handwriting

Ryan’s handwriting.

I asked both of my other developers to do the same using different OCR/ICR tools and different tablets. Same result: rubbish from the OCR read, perfection from the tablet. So when a client asks me if our Smart Reader technology can read handwriting, what should I say? Other readers will let you 'train' them for handwriting, but that isn't much use unless you can get samples up front, or wait until enough samples have accumulated. That really isn't good enough.
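
For reference, the kind of bog-standard Tesseract read described above can be as little as a few lines. Here is a minimal sketch in Python, assuming the pytesseract wrapper and Pillow are installed; the file name is made up for illustration.

import pytesseract
from PIL import Image

# A plain, untrained Tesseract pass over a photo of the handwriting.
# On printed text this works well; on handwriting it produces the kind
# of garbage quoted above.
image = Image.open("cat_handwriting.jpg")
text = pytesseract.image_to_string(image)
print(text)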

Without setting expectations too high, I had an idea.

Why does it read from the tablet?

The reason is that the tablet is not only capturing the finished sentence, it is also capturing the strokes and gestures made while the sentence was being written. That's a hard one for non-techies to follow. In fact, try explaining it to people at an executive level.

Below is a diagram that explains strokes; I typically use it during presentations.

The above is the breakdown of single straight strokes only. Given that we now have the straight-line strokes AND the finished word, we can probably work out the word itself.

However, if we add in the curved strokes as well, collect all of that information into an array of strokes alongside the finished word, then we can identify each letter, and pass the letters that make up a word through a machine learning model to correct the word if it is wrong. If we also know the context, then we are away.
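
To make that concrete, here is a minimal sketch in Python of what such a stroke array and a toy correction step might look like. The field names and the tiny dictionary-based corrector are my own illustrative assumptions, not the actual model described above.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Stroke:
    # One stroke is the pen path between pen-down and pen-up:
    # a sequence of (x, y, time_ms) samples captured by the tablet.
    points: List[Tuple[float, float, int]] = field(default_factory=list)

@dataclass
class HandwrittenWord:
    strokes: List[Stroke]          # the gesture and stroke information
    rendered_pixels: bytes = b""   # the finished word as an image

# A single vertical straight stroke, as in the diagram above.
word = HandwrittenWord(strokes=[Stroke(points=[(0.0, 0.0, 0), (0.0, 9.5, 40)])])

# A toy "correction" step: snap a letter-by-letter guess to the nearest
# known word, standing in for the machine learning model mentioned above.
KNOWN_WORDS = {"the", "rain", "in", "spain", "stays", "mainly", "on", "plain"}

def correct(guess: str) -> str:
    if guess in KNOWN_WORDS:
        return guess
    # Naive nearest match by shared letters; a real model would use context.
    return max(KNOWN_WORDS, key=lambda w: len(set(w) & set(guess)))

print(correct("rajn"))  # -> "rain"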

So, to answer the question in a nutshell, that's how your phone and tablet convert writing into digital text. Before the developers amongst you start arguing with me (yes, it does way more than that), for now let's think of it as a collection of pixels, strokes and gestures.

 

OCR/ICR

Right now the term A.I. is massive in most industries. I think most of you know my feelings about it: I think Alan Turing would be turning in his grave at what some companies are calling A.I.

However, there is an element of intelligence in the above. What about when we just have a scan of handwriting? How will we read that, given we do not have the stroke and gesture information?

The answer is that it is very tricky, and impossible to get right 100% of the time. Let's think about what we are getting: an array of pixels, little on/off switches that make up a picture. These can come in a variety of formats we are all familiar with: JPG, BMP, PNG and so on.
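
As a rough illustration of that array-of-pixels view, here is a minimal sketch in Python, assuming Pillow and NumPy are installed; the file name is made up for illustration.

import numpy as np
from PIL import Image

# Whatever the file format (JPG, BMP, PNG...), once decoded the scan is
# just a grid of pixel values that we can index into.
scan = Image.open("scanned_page.png").convert("L")  # greyscale, one value per pixel
pixels = np.asarray(scan)

print(pixels.shape)    # (height, width)
print(pixels[0, :10])  # the first ten pixel values of the top row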

Firstly, how do we know it is handwriting? I am working on a project right now that is a mixture of handwriting and printed text, so a scan can come in with handwritten notes scattered across it.

The idea is to OCR the printed text and then ICR the handwriting. That is a huge problem:

1. How do we know where the handwriting may be?

2. How do we know if it is handwriting?

3. How do we know if the handwriting is in a straight line?

Those are SOME of the questions, but as you can imagine, there are a lot more! The answer is very simple: you cannot blanket read a document for handwriting.

As I always tell our developers - divide and conquer.

Let me repeat: you cannot blanket read a document for handwriting. However, like all programming problems, you can work from what you know and find a solution.

Context is so important; in fact, it is half the battle. If you know the industry you are reading documents for, you can eliminate a lot of problems.

If you know where to look for the handwriting, then you are away. There are, however, other tricks you can employ to find handwriting, but that is for another discussion.

Let's say you know where it is and you have snipped it programmatically out of your document. We are still missing that lovely gesture and stroke information, so it isn't easy to solve just yet.
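
The snipping step itself is simple once you know the coordinates. Here is a minimal sketch in Python, assuming Pillow is installed and that the bounding box of the handwriting field is known from the document template; the coordinates and file names are made up for illustration.

from PIL import Image

# Load the scanned page and crop out the region where, from the template,
# we know the handwriting lives.
page = Image.open("scanned_form.png")

handwriting_box = (120, 840, 980, 940)  # (left, upper, right, lower) in pixels
snippet = page.crop(handwriting_box)
snippet.save("handwriting_snippet.png")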

We must think about the handwriting as that array of pixels. I described it recently to one of our junior devs as a microscope zooming in and out.

Zoomed in, it is just a binary array:

[0,1,0,0,0,1,1,1,…N]

Zoomed out, it is the picture again.
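
As a rough sketch of that zoomed-in view, the snippet below thresholds the greyscale crop into the 1/0 ink mask described above, assuming Pillow and NumPy are installed; the file name is illustrative.

import numpy as np
from PIL import Image

# Flatten the snipped handwriting to greyscale, then threshold it into
# the on/off view: 1 where there is ink, 0 where there is background.
img = Image.open("handwriting_snippet.png").convert("L")
binary = (np.asarray(img) < 128).astype(np.uint8)

print(binary.shape)    # zoomed out: the whole picture
print(binary[0, :20])  # zoomed in: a run of 1s and 0s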

Without giving away all my secrets, we have devised something I believe is unique: an intelligent way of tracing the 1s and 0s in the same way a person would, so that we can capture what we think of as the gestures and strokes the writer would have used.
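
I am not going to describe that tracing here, but as a rough, publicly available stand-in for the idea, skeletonisation thins the ink down to pen-width paths that approximate the strokes. The sketch below assumes scikit-image is installed and uses a tiny synthetic ink mask; it is purely illustrative, not the Sonix method.

import numpy as np
from skimage.morphology import skeletonize

# A tiny synthetic ink mask (one thick diagonal stroke) standing in for
# the 1/0 array produced from a real snippet.
binary = np.zeros((20, 20), dtype=bool)
for i in range(16):
    binary[i:i + 3, i:i + 3] = True

# Skeletonising thins the blob of ink down to a one-pixel-wide path,
# a rough approximation of the pen stroke that produced it.
skeleton = skeletonize(binary)

# Each connected run of skeleton pixels can then be walked, point by point,
# to recover something resembling a stroke.
stroke_pixels = np.argwhere(skeleton)
print(len(stroke_pixels), "skeleton points to trace")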

Once our tracing is applied, we can begin to understand a little better what the handwriting says.

Here at Sonix we have begun to create a gesture and stroke library to mimic, for scanned images, the handwriting capture a tablet gives you. We will keep you posted!




Dan Atkins

Lead Developer at Sonix Software

5y

How's about I send you some samples to run through?

Nowell Outlaw

VC/PE-Backed CEO | AI/ML Innovator | Revenue Growth | Turnaround Architect | Tech-Driven M&A Strategist

5y

100M pages processed, validated by independent 3rd party. Send me a direct message I'll give you a tour.

Nowell Outlaw

VC/PE-Backed CEO | AI/ML Innovator | Revenue Growth | Turnaround Architect | Tech-Driven M&A Strategist

5y

Dan, we've cracked a lot of this for forms processing. Did 100M page project with 97% accuracy. Vidado.ai
