Monday, September 26, 2011

Using OneNote As an OCR

The other day I was sent a PDF that I was supposed to pull content from for a website I was working on.

I didn't think much of it, since I assumed I could just copy and paste the content into the HTML. Of course, per Murphy's Law, things are always harder than they appear.

Instead of re-typing everything by hand, I looked for a faster alternative: an OCR tool. Wikipedia.org defines optical character recognition (OCR) as:

"OCR is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text."

Most OCR tools I have used are terrible at what they do and usually spit out weird symbols. That is where Microsoft Office OneNote comes in.

The Solution

Here is the step-by-step procedure for converting those ugly PDFs and images into readable text.

Step 1

If the file is a PDF or any other non-image file, you will need to save it as an image file first. Save it at the highest quality you can, since a sharper image makes it easier for the OCR to recognize the characters. Some low-resolution images will not work with OneNote.
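
If you would rather script the PDF-to-image conversion than do it by hand, here is a minimal sketch in Python. It is not part of the OneNote procedure itself. It assumes the Poppler command-line tool pdftoppm is installed and on your PATH, and the function names and the 300 DPI default are my own choices for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Return the argument list that renders each PDF page to a PNG."""
    # -png selects PNG output; -r sets the resolution in DPI.
    # 300 DPI is an assumed default, not a OneNote requirement.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def pdf_to_images(pdf_path, out_prefix, dpi=300):
    """Run pdftoppm and return the PNG files it wrote, sorted by name."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install Poppler first")
    subprocess.run(build_pdftoppm_command(pdf_path, out_prefix, dpi), check=True)
    prefix = Path(out_prefix)
    # pdftoppm names its output <prefix>-<page number>.png
    return sorted(prefix.parent.glob(prefix.name + "-*.png"))
```

Calling pdf_to_images("scan.pdf", "page") should leave one PNG per page next to the script, ready to insert into OneNote in the next step. A higher dpi value gives the OCR more detail to work with, at the cost of larger files.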

Step 2

Open OneNote and insert the picture onto a blank page.

Step 3

Now right-click the image and select "Copy Text from Picture".

Step 4

The recognized text is now on your clipboard. Paste it wherever you need it, whether that is your website or your print material.
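
If the text is headed for a web page, it usually needs a little cleanup first, because pasted OCR output tends to break lines mid-sentence and may contain characters such as & or < that are special in HTML. Below is a rough sketch of that cleanup in Python. The function name is hypothetical, and it assumes blank lines in the pasted text mark paragraph breaks, which may not hold for every scan.

```python
import html
import re


def ocr_text_to_html(text):
    """Turn pasted OCR text into escaped HTML paragraphs."""
    # Assumption: one or more blank lines separate paragraphs.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Rejoin lines that the OCR broke mid-sentence.
        joined = " ".join(
            line.strip() for line in block.splitlines() if line.strip()
        )
        if joined:
            # Escape &, <, > and quotes so the text is safe inside HTML.
            paragraphs.append("<p>" + html.escape(joined) + "</p>")
    return "\n".join(paragraphs)
```

Paste the copied text into a string, pass it to ocr_text_to_html, and drop the result into your page. It is still worth proofreading the output, since no OCR is perfect.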

Article Source: http://EzineArticles.com/5780947