From cvance@cfar.umd.edu Sun Jul 3 12:28:17 1994
From: cvance@cfar.umd.edu (Christopher Vance)
Newsgroups: alt.sources
Subject: Re: OCR sources?
Date: 27 Jun 1994 19:58:11 GMT
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
NNTP-Posting-Host: empedocles.cfar.umd.edu
X-Newsreader: TIN [version 1.2 PL2]

Joerg Behrend (aba09@rrz.uni-koeln.de) wrote:
: To learn something about OCR, I am looking for such a program in source code
: (C/C++ preferred, but other standard languages are also welcome). Also parts
: of an OCR program (like segmentation of characters or assignment of an
: unknown bitmap to a character) are of interest.
: Thanks in advance.
: Joerg Behrend, aba09@rrz.uni-koeln.de, I. Physikalisches Institut Uni Koeln

The only public domain OCR software (for Unix) that I have found was
written by some students at CalPoly. You can get the software by ftping to

    polyslo.csc.calpoly:/pub/ocr

Based on that software is an improved version by Jason Molenda,
available at

    ftp.cygnus.com:/pub/gnucr-0.0.tar.gz

If there is additional code available, I would be interested in hearing
about it, and would appreciate it if you could send me some more
information.

OCR is one of the projects I have been working on for a while now. I
have spent a large amount of time going over that software, re-writing
just about everything except the neural net. Please contact me, and let
me know more about what you are looking for.

I am posting here to let everyone else know that such software is
available. Unfortunately, right now it doesn't work all that well, but
people are working to improve that. The software I have re-written isn't
ready for any sort of release (it is still a mess), but with the
improved segmentation routines it works much better than the original
software.

I have also set up a WWW server for the Document Image Understanding
group for the Center for Automation Research (CfAR) at the University of
MD, College Park.
That site is accessible at http://documents.cfar.umd.edu/ and is
currently a repository for Document Image Understanding and Character
Recognition information and resources. You can check there for more
information. I had to put in a plug for the site :)