identify words from a sentence
The image at the link below shows a sentence.

Is it possible to divide the sentence into words? That is, I want to draw a box around each word and display each word separately. How can I identify the words in an image? Any help would be appreciated.
Accepted Answer
More Answers (1)
Walter Roberson
on 26 Mar 2013
Use imdilate() to fuse the letters of each word into a single blob, then regionprops() to find the resulting bounding boxes.
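A minimal sketch of the dilation approach, assuming a single line of text in a binary image and the Image Processing Toolbox. The synthetic image, the structuring element length of 9, and the variable names are illustrative assumptions, not values from the original post; the length would need tuning so that it bridges letter gaps but not word gaps.

```matlab
% Synthetic stand-in for the sentence image: two "words" of three letters each.
% Letters are 8 px wide, with 4 px gaps inside a word and a 28 px gap between words.
BW = false(40, 120);
for x = [10 22 34 70 82 94]
    BW(15:25, x:x+7) = true;
end

% Assumption: a horizontal line of length 9 bridges the 4 px letter gaps
% but is far too short to bridge the 28 px word gap.
se = strel('line', 9, 0);
dilated = imdilate(BW, se);

% Label the fused blobs, then keep only the original text pixels under each
% label so the bounding boxes are tight around the letters, not the dilation.
L = bwlabel(dilated);
L(~BW) = 0;
stats = regionprops(L, 'BoundingBox');
numWords = numel(stats);

% Draw a box around each word.
imshow(BW); hold on;
for k = 1:numWords
    rectangle('Position', stats(k).BoundingBox, 'EdgeColor', 'r');
end
hold off;
```

Each word could then be displayed separately with something like imcrop(BW, stats(k).BoundingBox).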
Alternatively, use regionprops() on the undilated image to find the bounding boxes of the individual letters. Merge any areas whose bounding boxes touch or overlap. Then find the distances between the remaining bounding boxes. You will find that they have an uneven distribution: small distances between adjacent letters, larger distances between words. Merge the areas that are only a small distance apart. You might want to use a ratio of the size of the existing bounding boxes to help determine what "small distance" means.
Strings such as '...' could give you trouble, though.
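The gap-based merging could be sketched as follows, assuming a single horizontal line of text. The synthetic image and the gapRatio value of 0.5 (gap threshold as a fraction of the median letter height) are hypothetical choices for illustration, not values given in the answer.

```matlab
% Synthetic stand-in: two "words" of three letters each (8 px wide letters,
% 4 px gaps inside a word, 28 px gap between words).
BW = false(40, 120);
for x = [10 22 34 70 82 94]
    BW(15:25, x:x+7) = true;
end

% One bounding box [x y width height] per letter, sorted left to right.
stats = regionprops(BW, 'BoundingBox');
boxes = sortrows(vertcat(stats.BoundingBox), 1);

% Horizontal gap between each box and everything to its left. The running
% maximum of the right edges makes overlapping boxes yield a gap <= 0,
% so they are merged automatically.
leftEdges  = boxes(:, 1);
rightEdges = boxes(:, 1) + boxes(:, 3);
gaps = leftEdges(2:end) - cummax(rightEdges(1:end-1));

% Assumption: a gap larger than gapRatio times the median letter height
% marks a word break.
gapRatio = 0.5;
threshold = gapRatio * median(boxes(:, 4));
wordId = cumsum([1; gaps > threshold]);

% Merge the letter boxes of each word into one bounding box.
numWords = wordId(end);
wordBoxes = zeros(numWords, 4);
for k = 1:numWords
    b = boxes(wordId == k, :);
    x1 = min(b(:, 1));           y1 = min(b(:, 2));
    x2 = max(b(:, 1) + b(:, 3)); y2 = max(b(:, 2) + b(:, 4));
    wordBoxes(k, :) = [x1, y1, x2 - x1, y2 - y1];
end
```

A fixed ratio is only one way to split the gap distribution; a histogram of the gaps may suggest a better threshold for a given font.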
5 Comments
Elysi Cochin
on 26 Mar 2013
Edited: Elysi Cochin
on 26 Mar 2013
Image Analyst
on 26 Mar 2013
I see no reason why the dilation method won't work. Did you actually try it?
Elysi Cochin
on 27 Mar 2013
Image Analyst
on 27 Mar 2013
Calculate the area and centroid of all blobs. If a blob's area is about the size of a dot and it is fairly round, combine the bounding box or mask of that dot with the closest word. In pseudocode:
if area < largestDotArea
    % It's small enough to be a dot
    if itIsCircular
        for k = 1 : numberOfOtherBlobs
            distance = hypot(xCentroid1 - xCentroid2(k), yCentroid1 - yCentroid2(k))
            if distance < mergingDistance
                % Merge bounding boxes
                newBoundingBox = f(wordBoundingBox, dotBoundingBox)
                break;
            end
        end
    end
end
Do that in a loop over all blobs to check whether each one is a dot and thus needs to be combined with the closest word. Circularity is Perimeter^2/(4*pi*Area), which is 1 for a perfect circle and grows as the shape becomes less round.
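A runnable sketch of that loop, under stated assumptions: the synthetic image and the thresholds largestDotArea = 50, maxCircularity = 1.5 and mergingDistance = 30 are hypothetical and would need tuning for real text. Unlike the pseudocode, which stops at the first blob within range, this version picks the nearest non-dot blob by centroid distance.

```matlab
% Synthetic stand-in: one rectangular "word" blob followed by a small round dot.
BW = false(40, 80);
BW(10:30, 10:40) = true;
[xx, yy] = meshgrid(1:80, 1:40);
BW((xx - 46).^2 + (yy - 28).^2 <= 9) = true;

stats = regionprops(BW, 'Area', 'Perimeter', 'Centroid', 'BoundingBox');
areas       = [stats.Area];
circularity = [stats.Perimeter].^2 ./ (4 * pi * areas);
centroids   = vertcat(stats.Centroid);
boxes       = vertcat(stats.BoundingBox);

% Hypothetical thresholds, chosen only for this synthetic image.
largestDotArea  = 50;
maxCircularity  = 1.5;
mergingDistance = 30;

% A blob counts as a dot if it is both small and fairly round.
isDot   = areas < largestDotArea & circularity < maxCircularity;
wordIdx = find(~isDot);

for d = find(isDot)
    % Centroid distance from this dot to every non-dot blob.
    dist = hypot(centroids(wordIdx, 1) - centroids(d, 1), ...
                 centroids(wordIdx, 2) - centroids(d, 2));
    [minDist, nearest] = min(dist);
    if minDist < mergingDistance
        % Replace the word's box with the union of the word and dot boxes.
        w  = wordIdx(nearest);
        x1 = min(boxes(w, 1), boxes(d, 1));
        y1 = min(boxes(w, 2), boxes(d, 2));
        x2 = max(boxes(w, 1) + boxes(w, 3), boxes(d, 1) + boxes(d, 3));
        y2 = max(boxes(w, 2) + boxes(w, 4), boxes(d, 2) + boxes(d, 4));
        boxes(w, :) = [x1, y1, x2 - x1, y2 - y1];
    end
end

wordBoxes = boxes(wordIdx, :);   % dots are absorbed into their nearest word
```

Centroid distance grows with word length, so for long words the distance between bounding-box edges may be a better merging criterion.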
Elysi Cochin
on 28 Mar 2013