Optical Character Recognition (OCR) provides a nearly automated means of digitizing text from scanned pages, eliminating the need to retype it. Adobe Acrobat Professional includes OCR capabilities that enable you to save scanned results directly into Rich Text Format or Microsoft Word's DOC and DOCX file types. If you open a document in Acrobat Professional but the program refuses to recognize text that's clearly visible on the page, check the source file for some common problems that can cause OCR headaches.
Perhaps the least obvious cause of OCR failures in Acrobat Professional is an attempt to digitize a page that already contains live text, meaning text that you could copy to the clipboard and paste into a word processor or export from Acrobat directly into a word processing format. If you absolutely must run OCR on such a page, you must convert your file's live text to pixels first. Otherwise, you'll see an error message that reports a recognition failure.
Low-resolution scans of less than 150 pixels per inch provide poor source material for Acrobat Professional's OCR capabilities and for other OCR programs as well. Likewise, if your scans come out crooked, the probability of obtaining good results is reduced. Correcting low-resolution problems usually requires rescanning your source at a higher ppi value, preferably 300 ppi. If you are scanning printed pages on a graphic-arts scanner that lacks a document feeder, take a moment to position your paper properly on the scanner glass or open your scans in a program that can help you straighten them, such as Adobe Photoshop.
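If you know a scan's pixel width and the physical width of the page it came from, you can estimate its resolution before running OCR. The following Python sketch is only an illustration of that arithmetic: the function names `effective_ppi` and `ocr_readiness` are hypothetical, the labels it returns are arbitrary, and the only thresholds it uses are the 150 ppi floor and 300 ppi target mentioned above.

```python
def effective_ppi(pixel_width: int, page_width_inches: float) -> float:
    """Return the resolution implied by an image's pixel width and the
    physical width of the scanned page, in pixels per inch."""
    if pixel_width <= 0 or page_width_inches <= 0:
        raise ValueError("dimensions must be positive")
    return pixel_width / page_width_inches


def ocr_readiness(ppi: float) -> str:
    """Classify a resolution against the 150 ppi floor and 300 ppi target.

    The labels are illustrative, not anything Acrobat reports.
    """
    if ppi < 150:
        return "rescan"      # below the floor: poor source material
    if ppi < 300:
        return "usable"      # workable, but short of the preferred value
    return "preferred"       # at or above 300 ppi


if __name__ == "__main__":
    # Example: a US Letter page (8.5 inches wide) scanned 1275 pixels wide.
    ppi = effective_ppi(1275, 8.5)
    print(ppi, ocr_readiness(ppi))
```

A scan that lands in the "rescan" range generally cannot be rescued by enlarging the image afterward, because resampling adds pixels without adding detail. Rescanning at a higher setting is the dependable fix.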
Although high-resolution scans improve OCR results by providing Acrobat Professional with better source material, the old saying "garbage in, garbage out" applies when your original is of poor quality. Scans of faxed material and printouts from a microfilm or microfiche printer can yield inferior OCR results. If sources such as these are your only form of input, plan to invest the time necessary to correct your OCR output or retype the text if it's short.
OCR software does its best work when you present it with clear, uninterrupted runs of text in full-page columns. If your source material includes boxed text, such as the format on a form, or large amounts of graphic material, OCR quality may be reduced as the software struggles to distinguish text from non-textual material. In extreme cases, you may wish to photocopy a form and blank out some of its boxes and lines before you attempt to scan and recognize its contents.