java - How to load the first 100 characters from a PNG with TrainingImageLoader
I want to recognize, as well as possible, the first 100 characters of a PNG file, but I cannot get any output.
The file is here: http://abatis.org.uk/projects/txt2fig.png
File fff = new File("c:\\users\\lll\\desktop\\txt2fig.png");
OCRScanner scanner = new OCRScanner();
TrainingImageLoader loader = new TrainingImageLoader();
HashMap<Character, ArrayList<TrainingImage>> trainingImageMap = new HashMap<Character, ArrayList<TrainingImage>>();
loader.load(fff.getAbsolutePath(), new CharacterRange('a', 'z'), trainingImageMap);
scanner.addTrainingImages(trainingImageMap);
Image image = ImageIO.read(fff);
PixelImage pixelImage = new PixelImage(image);
pixelImage.toGrayScale(true);
pixelImage.filter();
String text = scanner.scan(image, 0, 0, 0, 0, null);
System.out.println(text);
Exception:
java.io.IOException: Expected to decode 26 characters but actually decoded 911 characters in training: c:\users\lll\desktop\txt2fig.png
    at net.sourceforge.javaocr.ocrPlugins.mseOCR.TrainingImageLoader.load(TrainingImageLoader.java:107)
    at net.sourceforge.javaocr.ocrPlugins.mseOCR.TrainingImageLoader.load(TrainingImageLoader.java:83)
My libraries in the pom:
<dependency>
    <groupId>net.sourceforge.javaocr</groupId>
    <artifactId>javaocr-core</artifactId>
    <version>1.0</version>
</dependency>
<dependency>
    <groupId>net.sourceforge.javaocr.plugins</groupId>
    <artifactId>javaocr-plugin-awt</artifactId>
    <version>1.0</version>
</dependency>
I know that
new CharacterRange('a', 'z')
should cover the first and last characters in the file. Is there some way around this?
You don't understand the concept of the tool. You've used the image of the text to be OCRed as the training image, whereas a training image should contain the training characters corresponding to ASCII codes 0x20 to 0x7E (or part of that range) in their numerical order, at least like below:
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
Please note the space at the beginning of the training image.
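The ordering described above can be sketched in plain Java. This is only an illustration of which characters, and how many, a training image for a given range is expected to contain; it does not call javaocr, and the class name TrainingCharset and its build method are hypothetical helpers, not part of the library.

```java
final class TrainingCharset {

    /** Builds the characters from first to last (inclusive) in ascending code order. */
    static String build(char first, char last) {
        if (first > last) {
            throw new IllegalArgumentException("first must not exceed last");
        }
        StringBuilder sb = new StringBuilder(last - first + 1);
        // Iterate over int so the loop cannot overflow when last is Character.MAX_VALUE.
        for (int c = first; c <= last; c++) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Full printable ASCII range: space (0x20) up to tilde (0x7E).
        String full = build(' ', '~');
        System.out.println(full.length() + " characters: [" + full + "]");

        // The range used in the question covers 26 characters.
        System.out.println(build('a', 'z').length() + " characters for 'a'..'z'");
    }
}
```

The count for 'a'..'z' matches the 26 characters named in the exception message, while the text image in the question yielded 911 decoded characters.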
First, try to analyze the sample images and training images in the javaocr-20100605.zip/ocrtests/ directory, e.g. the file trainingimages/hpljpica.jpg is a training image and the file hpljpicasample.jpg is the image to analyze. Use the tab called Mean Square OCR Recognizer in the Java OCR GUI (started with java -jar javaocr.jar). Later you can try your own training image, composed from the image you want to analyze. For this purpose you can use the tab called Character Extractor in the Java OCR GUI to extract the characters from the image. Arrange the output files with the extracted characters in ASCII code order, then compose them into a training image.
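The last step, composing the extracted characters into one training image, could be sketched with the standard java.awt classes as follows. This is a minimal sketch under assumptions: the class TrainingImageComposer, the white background and the gap width are illustrative choices, and the layout javaocr actually needs (spacing, rows, handling of the space character) should be checked against the bundled training images.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class TrainingImageComposer {

    /** Draws the glyphs left to right in ascending character code order on a white canvas. */
    static BufferedImage compose(Map<Character, BufferedImage> glyphs, int gap) {
        if (glyphs.isEmpty() || gap < 0) {
            throw new IllegalArgumentException("need at least one glyph and a non-negative gap");
        }
        // TreeMap orders Character keys by code, i.e. ASCII order for ASCII characters.
        TreeMap<Character, BufferedImage> sorted = new TreeMap<>(glyphs);

        int width = gap;
        int maxHeight = 0;
        for (BufferedImage glyph : sorted.values()) {
            width += glyph.getWidth() + gap;
            maxHeight = Math.max(maxHeight, glyph.getHeight());
        }

        BufferedImage out = new BufferedImage(width, maxHeight + 2 * gap, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, out.getWidth(), out.getHeight());

        int x = gap;
        for (BufferedImage glyph : sorted.values()) {
            // Align every glyph to the bottom of the tallest glyph.
            g.drawImage(glyph, x, gap + (maxHeight - glyph.getHeight()), null);
            x += glyph.getWidth() + gap;
        }
        g.dispose();
        return out;
    }

    public static void main(String[] args) {
        // Two dummy black glyphs stand in for files produced by the Character Extractor.
        Map<Character, BufferedImage> glyphs = new HashMap<>();
        glyphs.put('B', new BufferedImage(5, 8, BufferedImage.TYPE_INT_RGB));
        glyphs.put('A', new BufferedImage(4, 6, BufferedImage.TYPE_INT_RGB));

        BufferedImage training = compose(glyphs, 3);
        System.out.println(training.getWidth() + "x" + training.getHeight());
    }
}
```

In practice the map would be filled by reading each extracted file with ImageIO.read and keying it by the character it shows, and the result would be saved with ImageIO.write before loading it as a training image.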
The screenshots attached below show how to use the OCR GUI and the results.
Screenshot: OCR with the Java OCR tool, characters from space to ~
Screenshot: OCR results, where you can see the OCR errors
As you can see, at least 2 recognition errors occurred, which is not many.