How to get better OCR results (without confusing digits for letters)
Hello all,
I'm trying to use OCR to determine the axes scale on a graph:

(I want to be able to extract the numbers "0, 32000, 4000, etc." on the y-axis, and "-50, 50, 150, etc." on the x-axis)
My initial attempt is this code:
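% Run OCR on the axes image, treating the text as a single block
detect = ocr(justAxes, 'TextLayout', "Block");
% Overlay each word's bounding box, labeled with the word and its confidence
Iocr = insertObjectAnnotation(justAxes, 'rectangle', ...
    detect.WordBoundingBoxes, ...
    detect.Words + " " + detect.WordConfidences);
figure; imshow(Iocr);
% Keep the recognized words for later parsing
words_string = detect.Words;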
Which gives me this result:

The results aren't bad, but I'm wondering if there is any preprocessing I can do to keep the OCR from misreading digits as letters (e.g. the '50' as 'so', the '8000' as 'sooo', and the '0' as 'o'). Can I somehow bias the OCR toward digits rather than letters? Or do I have to preprocess the image further in some way?
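One option that may be worth trying before any extra preprocessing is to restrict the characters the recognizer is allowed to return. The sketch below assumes the Computer Vision Toolbox `ocr` function in your release accepts the 'CharacterSet' name-value argument. The synthetic test image, the variable names `testImg`, `digitsOnly`, and `values`, and the choice of allowed characters are illustrative assumptions, not part of the original post.

```matlab
% Sketch: limit OCR output to characters that can occur in numeric tick labels.
% Assumes Computer Vision Toolbox. The synthetic image below is only a
% stand-in for the justAxes image from the question.
canvas  = 255 * ones(120, 520, 3, 'uint8');            % white RGB background
testImg = insertText(canvas, [10 20], '-50 50 150 8000', ...
    'FontSize', 48, 'BoxOpacity', 0, 'TextColor', 'black');

% Allow only digits and the minus sign in the recognized text
digitsOnly = ocr(testImg, 'TextLayout', 'Block', ...
    'CharacterSet', '0123456789-');

% Convert the recognized words to numbers; unparsable tokens become NaN
values = str2double(digitsOnly.Words);

disp(digitsOnly.Words);
disp(values);
```

To apply this to the graph in the question, pass `justAxes` in place of `testImg`. If the tick labels can contain decimal points, add '.' to the character set. Words that still come back with low `WordConfidences` could be filtered out before parsing.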