Android - tess-two OCR not decoding correctly


I have followed the tutorials and have Tesseract, tess-two, and eyes-two installed as part of my Android app.

It runs, but the OCR text returned by baseApi.getUTF8Text() is complete gibberish.

    BitmapFactory.Options options = new BitmapFactory.Options();
    options.inSampleSize = 4;
    Bitmap bmp = BitmapFactory.decodeFile(path, options);
    receipt.setImageBitmap(bmp);

    try {
        // Read the EXIF orientation so the bitmap can be turned upright before OCR
        ExifInterface exif = new ExifInterface(path);
        int exifOrientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
        int rotate = 0;
        switch (exifOrientation) {
            case ExifInterface.ORIENTATION_ROTATE_90:  rotate = 90;  break;
            case ExifInterface.ORIENTATION_ROTATE_180: rotate = 180; break;
            case ExifInterface.ORIENTATION_ROTATE_270: rotate = 270; break;
        }
        if (rotate != 0) {
            int w = bmp.getWidth();
            int h = bmp.getHeight();
            Matrix matrix = new Matrix();
            matrix.preRotate(rotate);
            bmp = Bitmap.createBitmap(bmp, 0, 0, w, h, matrix, false);
        }

        // tess-two expects an ARGB_8888 bitmap
        bmp = bmp.copy(Bitmap.Config.ARGB_8888, true);

        TessBaseAPI baseApi = new TessBaseAPI();
        baseApi.init(DATA_PATH, "eng");
        baseApi.setImage(bmp);
        String ocrText = baseApi.getUTF8Text();
        baseApi.end();

        Log.i("ocr text", "rotate  " + rotate);
        Log.i("ocr text", "ocr   ");
        Log.i("ocr text", ocrText);
        Log.i("ocr text", "=======================================================================================");
    } catch (IOException e) {
        // The ExifInterface constructor throws IOException if the file cannot be read
        Log.e("ocr text", "could not read EXIF data", e);
    }

Photographing a check, which has OCR characters on it, returns:

    05-14 11:01:59.131: I/ocr text(18199): rotate  90
    05-14 11:01:59.131: I/ocr text(18199): ocr
    05-14 11:01:59.131: I/ocr text(18199): 4— ‘ ‘
    05-14 11:01:59.131: I/ocr text(18199): \dxfi ‘
    05-14 11:01:59.131: I/ocr text(18199): w man"! no accounv
    05-14 11:01:59.131: I/ocr text(18199): 1’
    05-14 11:01:59.131: I/ocr text(18199): my... «unblm m. mm.
    05-14 11:01:59.131: I/ocr text(18199): :~a
    05-14 11:01:59.131: I/ocr text(18199): «ln.
    05-14 11:01:59.131: I/ocr text(18199): ‘ “w “in. n “h‘m‘
    05-14 11:01:59.131: I/ocr text(18199): mmnwnmw- .; k. '
    05-14 11:01:59.131: I/ocr text(18199): wilt-run”. uni” nl
    05-14 11:01:59.131: I/ocr text(18199): mam.
    05-14 11:01:59.131: I/ocr text(18199): =======================================================================================

Any advice on how to clean this up and get correct OCR recognition? The device used is a 7" Samsung Galaxy.

You can use something like this to strip the stray symbols from the result:

    ocrText = ocrText.replaceAll("[^a-zA-Z0-9]+", " ");
    ocrText = ocrText.trim();

This is based on the Tesseract implementation found here: SimpleAndroidOCRActivity.java
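To see what that cleanup does to output like the log above, here is a minimal, self-contained sketch that runs on a plain JVM without Android or tess-two. The class name OcrCleanup and the method clean are hypothetical names chosen for illustration, and the null check is an added assumption rather than something from the original answer.

```java
// Hypothetical helper for illustration; not part of tess-two or Tesseract.
final class OcrCleanup {

    // Collapses every run of characters other than ASCII letters and digits
    // into a single space, then trims leading and trailing whitespace.
    static String clean(String ocrText) {
        if (ocrText == null) {
            return ""; // assumption: treat a missing OCR result as empty text
        }
        String cleaned = ocrText.replaceAll("[^a-zA-Z0-9]+", " ");
        return cleaned.trim();
    }

    public static void main(String[] args) {
        // Two lines taken from the log output in the question
        String raw = "w man\"! no accounv\n:~a\n";
        System.out.println(clean(raw)); // prints: w man no accounv a
    }
}
```

Note that this only removes punctuation and symbol noise. Letters and digits that were misrecognized (such as "accounv") pass through unchanged, so it tidies the output rather than improving the recognition itself.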

