Free OCR Software (Optical Character Recognition)
Convert Scanned Images with Text to Pure Text Documents
Free OCR programs take an image file containing text (words) and generate a text document containing those words. You usually get such images when you scan a document using a scanner. In general, these programs do not do well if the text on your page does not stand out clearly from its background, or if the fonts used are highly stylised.
Some OCR programs can be trained. That is, you can get the program to scan some text, and then teach it what those characters are. In this way, the program is able to learn the shape of each character, even in unusual fonts. Many, if not most, OCR programs also consult a dictionary of words for the language in question when converting.
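To give a rough idea of what consulting a dictionary can mean, here is a minimal sketch in Python. It is not taken from any of the programs listed on this page. The word list, the function names and the similarity cutoff of 0.8 are all assumptions made for illustration, and real OCR software is likely to use much larger dictionaries and more elaborate methods.

```python
import difflib

# Hypothetical word list; a real OCR program would ship a full dictionary.
DICTIONARY = ["character", "optical", "recognition", "scanner", "software"]


def correct_word(raw, dictionary=DICTIONARY, cutoff=0.8):
    """Return the closest dictionary word, or the raw word if none is close."""
    lowered = raw.lower()
    if lowered in dictionary:
        # Already a known word: leave it exactly as it was recognised.
        return raw
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff.
    matches = difflib.get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw


def correct_text(text):
    """Apply correct_word to every whitespace-separated word in the text."""
    return " ".join(correct_word(word) for word in text.split())
```

For example, correct_text("0ptical character rec0gnition") replaces the misread zeros and gives "optical character recognition", while a word with no close match in the list is passed through unchanged.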
Note that OCR software often comes free with your scanner or all-in-one machine (ie, printer, scanner and copier combined), so you may want to see if you already have such a program before rushing out to download one. The ones bundled with your scanner are usually limited versions of commercial software, and can sometimes work better than the free ones listed here (or as well as OCRs can be expected to work given the current state of technology).
If you are looking for full-blown commercial OCR software, probably one of the most well-known is ABBYY FineReader. Another possibility is OmniPage Professional.
Disclaimer
The information provided on this page comes without any warranty whatsoever. Use it at your own risk. Just because a program, book, document or service is listed here or has a good review does not mean that I endorse or approve of the program or of any of its contents. All the other standard disclaimers also apply.
Free OCR Software (Optical Character Recognition)
- TopOCR: Free OCR for Digital Cameras (Windows)
This free OCR program is designed especially for recognizing text from the poorer quality images that come from digital cameras or smartphones, since such images can have variable lighting conditions. Your camera needs to have a resolution of at least 3 megapixels though. It can, of course, also be used for scanned images (ie, obtained from scanners). TopOCR supports 11 languages: English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish. It is able to obtain images directly from your scanner or camera, or, if you wish, you can drag and drop files onto the application window. Supported file formats include JPEG, TIFF, GIF and BMP. The program is able to handle images containing a mixture of text and graphics. This is a Windows program.
- Tesseract OCR (Windows, Linux)
Currently sponsored by Google and originally developed by Hewlett Packard, this open source OCR program works under Windows and Linux. It can recognize 6 languages, is fully UTF-8 capable, is able to detect fixed pitch vs proportional pitch fonts, and can be trained. It takes a TIF image file as input (but if you need to, you can always convert your images from other formats using one of the free image and photo editing programs available). At the time I write this, the program can only handle text in a single column.
- GOCR (Linux, Windows, OS/2)
GOCR is an OCR program that converts scanned images of text into a text file. It is multiplatform and is released under the open source GNU General Public License. Executables (or binaries) are available for Linux, Windows and OS/2. This is a command line program.
- Ocropus (Linux)
Ocropus is a document analysis and OCR system that uses plugins for its character recognition engine. It has layout analysis, statistical natural language modelling and multi-lingual capabilities. The OCR engine uses Tesseract (see elsewhere on this page). It comes in source code form, so you will have to compile it yourself.
- Ocrad: The GNU OCR (Linux)
Ocrad is a command line OCR utility that accepts files in the format of pbm, pgm, or ppm. It is able to handle multi-column texts or blocks of text. The program is available only in source code form.
- Ocre (Linux)
This open source tool runs from the command line, and you're supposed to be able to integrate it with a spell checker. The program accepts pgm and pbm files as input and sends the output to stdout (the terminal window).
- Microsoft Office Document Imaging (Windows, Mac OS X)
If you use Microsoft Office, you will probably already have this tool on your system. (Although it doesn't have a separate free download, it is listed here since many people already have this software on their system, and are not aware of the existence of this utility.) Windows users can find it in "Microsoft Office\Microsoft Office Tools" on the Start menu.
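Several of the command line programs above read pgm files and print the recognized text to stdout. If you want to drive such a program from a script, the following Python sketch shows the two pieces involved: writing greyscale pixel data as a binary pgm ("P5") file, and running a command while capturing whatever it prints. The function names are my own, and the sketch deliberately does not assume the exact arguments of any of the listed programs. You will need to look those up in the documentation of the tool you use.

```python
import subprocess


def write_pgm(path, pixels):
    """Write rows of grey values (0 to 255) as a binary PGM (P5) file.

    `pixels` is a list of equal-length rows, each a list of integers.
    """
    height = len(pixels)
    width = len(pixels[0])
    # PGM header: magic number, width and height, then the maximum grey value.
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    with open(path, "wb") as f:
        f.write(header)
        for row in pixels:
            f.write(bytes(row))  # one byte per pixel


def run_and_capture(command):
    """Run a command line program and return the text it sent to stdout.

    `command` is a list such as ["some-ocr-tool", "page.pgm"]. That name is
    only a placeholder, not the real syntax of any program.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

Since check=True is used, run_and_capture raises an exception if the program exits with an error code instead of silently returning an empty string.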
