HOW TO SEGMENT A WORD INTO CHARACTERS USING VERTICAL PROJECTION?

Hello everyone! I am trying to segment a word into characters using the vertical projection histogram. I can compute the histogram and find the threshold value. The histogram looks like this.
As you can see, there are 4 peaks in the histogram, so I am assuming they represent the 4 characters in the figure above.
The code is given in the figure below.
Now I am stuck on how to segment each character after finding the threshold. Can anyone help me with this?

4 Comments

Sir, I have to segment a handwritten text line into words and plot the segmented words. I have obtained the projection profile, but I don't know how to segment and plot the words separately. Thank you.
Threshold the signal and call regionprops() to identify the gaps, that is, runs below the threshold that are at least 20 or so columns long.
% Threshold: true where the profile is low, i.e. in the gaps between words
binarySignal = signal < 500;
% Get rid of runs less than 20
binarySignal = bwareaopen(binarySignal, 20);
% Label and find centroids of remaining big gaps
labeledSignal = bwlabel(binarySignal);
measurements = regionprops(labeledSignal, 'Centroid');
% Extract word 1 as a new image.
% Centroid is [x, y], so take x and round it to get a valid column index.
word1 = binaryImage(:, 1:round(measurements(1).Centroid(1)));
or something like that. If it doesn't work, post your binary image in a new question.
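To pull out every word rather than only the first, one option is to cut the image at the centroid of each large gap and loop over the resulting pieces. The following is a minimal sketch, not a tested solution for any particular image: it assumes white text on a black background, uses a synthetic image in place of a real text line, and the threshold of 1 and minimum gap width of 20 are placeholder values to tune. The names gaps, cutColumns, edges, and words are introduced here for illustration. It needs the Image Processing Toolbox.

```matlab
% Synthetic stand-in for a binarized text line: three "words" as white blocks.
binaryImage = false(10, 100);
binaryImage(3:8, 5:25)  = true;   % word 1
binaryImage(3:8, 50:65) = true;   % word 2
binaryImage(3:8, 90:98) = true;   % word 3

% Projection profile: one value per column.
signal = sum(binaryImage, 1);

% Columns with no text (assumed threshold), keeping only gaps >= 20 columns wide.
gaps = signal < 1;
gaps = bwareaopen(gaps, 20);

% The x coordinate of each gap centroid becomes a cut position.
measurements = regionprops(gaps, 'Centroid');
cutColumns = round(arrayfun(@(m) m.Centroid(1), measurements))';

% Slice the image between consecutive cuts; word k spans edges(k)+1 to edges(k+1).
edges = [0, cutColumns, size(binaryImage, 2)];
words = cell(1, numel(edges) - 1);
for k = 1 : numel(words)
    words{k} = binaryImage(:, edges(k) + 1 : edges(k + 1));
end
```

Each cell of words can then be shown with imshow() in its own subplot. Margins narrower than the minimum gap width are simply left attached to the first and last word.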
Sir, thank you for your code. The above code displays only one word. I tried to display all three words separately using a for loop, but I get some errors. Could you please help me?
Probably. Post your original binary image and code in a new question.


 Accepted Answer

Histogram has nothing to do with it. You don't calculate it and you don't use it. I think you mean "profile" instead of histogram.
If you want to extract the vertical strips of your image where you have characters, then you need to threshold the profile and determine where each region starts and stops.
% 0 where there is background, 1 where there are letters
letterLocations = verticalProjection > 0;
% Find Rising and falling edges
d = diff(letterLocations);
startingColumns = find(d>0);
endingColumns = find(d<0);
% Extract each region
for k = 1 : length(startingColumns)
    % Get sub image of just one character...
    % d(k) compares column k+1 with column k, so a rising edge found at k
    % means the letter itself begins at column k+1.
    subImage = binaryImage(:, startingColumns(k)+1 : endingColumns(k));
    % Now process this subimage of a single character....
end
The above code is untested - just off the top of my head. You may need to debug it or move the starting and ending columns a pixel to the right or left.
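One case where the starting and ending columns stop lining up is when a letter touches the left or right border of the image: the run then has no rising or no falling edge, and the two lists end up with different lengths. A common workaround, sketched here on a made-up profile rather than a real image, is to pad the thresholded profile with one background element on each side before calling diff(). The padding also shifts the indexes so that the rising edge lands directly on the first letter column.

```matlab
% Toy profile: the first and last letters touch the image borders.
verticalProjection = [3 5 0 0 4 6 2 0 7 7];
letterLocations = verticalProjection > 0;

% Pad with background on both sides so every run has both edges.
d = diff([0, letterLocations, 0]);

% With the padding, a rise at padded index k is original column k,
% and a fall at padded index k follows original column k-1.
startingColumns = find(d > 0);
endingColumns = find(d < 0) - 1;

% Both lists now have one entry per letter run.
for k = 1 : numel(startingColumns)
    fprintf('Letter %d: columns %d to %d\n', k, startingColumns(k), endingColumns(k));
end
```

With this form the loop from the answer can index binaryImage(:, startingColumns(k):endingColumns(k)) without any further one-pixel adjustment.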

11 Comments

Thank you again, sir! It's working perfectly after some modification.
@soumyadeep, can you please send me your full MATLAB code?
soumyadeep, can you please provide the full code for the above?
clear all;
cd('C:\Users\IFIM\Desktop\New folder\KANND_HAND_SET');
myFolder = 'C:\Users\IFIM\Desktop\segment';
[filename, pathname] = uigetfile('*.bmp','Select image to be read.');
i = imread(fullfile(pathname,filename));
% Pad 10 background columns on each side so no letter touches the border.
i = padarray(i,[0 10]);
% Sum down each column: one value per image column.
verticalProjection = sum(i, 1);
set(gcf, 'Name', 'DEMO BY SOUMYADEEP', 'NumberTitle', 'Off')
subplot(2, 2, 1); imshow(i);
subplot(2, 2, 3);
plot(verticalProjection, 'b-');
grid on;
% Threshold: the smallest nonzero column sum.
t = verticalProjection;
t(t==0) = inf;
mayukh = min(t)
% 0 where there is background, 1 where there are letters
letterLocations = verticalProjection > mayukh;
% Find Rising and falling edges
d = diff(letterLocations);
startingColumns = find(d>0);
endingColumns = find(d<0);
% Extract each region
y = 1;
for k = 1 : length(startingColumns)
    % Get sub image of just one character...
    % (diff marks the column before each rising edge, hence the +1)
    subImage = i(:, startingColumns(k)+1:endingColumns(k));
    % Split the strip into its connected components.
    [L, num] = bwlabel(subImage);
    for z = 1 : num
        bw = ismember(L, z);
        % Construct filename for this particular image.
        baseFileName = sprintf('curvedimage %d.png', y);
        y = y+1;
        % Prepend the folder to make the full file name.
        fullFileName = fullfile(myFolder, baseFileName);
        % Do the write to disk.
        imwrite(bw, fullFileName);
        % Show the component, then pause so it stays visible.
        subplot(2, 2, 4);
        imshow(bw);
        pause(2);
    end
    % Leave a gap in the file numbering between column regions.
    y = y+1;
end
1. Use a pre-processed image, that is, the complement of the binary image with no noise.
2. My code does much more than just vertical projection. It segments cursive handwritten words, but it will work just as well for plain vertical projection.
3. Do the required editing, such as changing the path name where your image is stored and the format you are using (for example BMP, JPG, PNG).
P.S. It won't work for Hindi, or any other language with matra, because of the matra.
Please may I know what the features of the vertical projection are? How can I use its features for training purposes? Thanks in advance.
In that code, he sets t (which is the horizontal projection, not vertical, despite what he calls it) to infinity wherever it's zero (wherever there are no white pixels in a column). Then he finds the minimum value, which is the pixel count of the thinnest column that contains any white pixels. Then he finds where the projection is greater than that, in other words, the columns where there are letters.
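A small numeric example may make the effect of that threshold easier to see. This is only a sketch on invented numbers, with thinnest standing in for the variable the poster calls mayukh: columns whose sum equals the minimum nonzero value are treated as background, which is what lets the code cut through thin joining strokes between cursive letters.

```matlab
% Made-up projection: zeros are empty columns, the 2s are thin strokes.
verticalProjection = [0 0 6 9 2 8 7 0 0 5 2 4 0];

% Ignore empty columns when searching for the minimum.
t = verticalProjection;
t(t == 0) = inf;
thinnest = min(t);

% Strictly greater than the minimum: the thinnest columns count as background.
letterLocations = verticalProjection > thinnest;
disp(letterLocations);
```

Note that if the image has no white pixels at all, every entry of t is Inf, so min(t) is Inf and no column is marked as a letter.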
I tried your code above on my own image. startingColumn came out as [1 50 313] while endingColumn came out as [49 312]. In the end I got the error "Index exceeds matrix dimensions" at the line that indexes startingColumn(k):endingColumn(k). Can you help me, sir?
Thanks.
Possibly, if you attach your image.
These are the images that I've tried, sir.
Thanks in advance.
@Image Analyst: this is my code file. Hopefully you will be able to help me, sir. Thank you.


More Answers (4)

@Image Analyst, can you please explain the logic behind the code which you have posted?

5 Comments

Kartik, virtually every single line of code has a comment before it. Those explain the logic, or I thought they did. Which line of code do you not understand the comment for?
d = diff(letterLocations);
startingColumns = find(d>0);
endingColumns = find(d<0);
These lines.
These lines of code find the rising and falling edges. diff() subtracts each element from the one after it, so d(k) is the (k+1)st element minus the kth element. If the (k+1)st element is bigger than the kth element, d(k) will be positive, in other words d>0. So find() will return the indexes at which d>0, in other words, the indexes just before the rising edges. Conversely, d<0 means the array values fell, and those indexes mark the falling edges (the last column of each letter).
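Here is the same idea on a tiny hand-made vector, as an illustration only. The names risingIdx and fallingIdx are introduced for this example.

```matlab
% 1 marks letter columns, 0 marks background.
letterLocations = [0 0 1 1 1 0 0 1 1 0];

% d has one element fewer than letterLocations.
d = diff(letterLocations);

risingIdx = find(d > 0);    % index just before each letter begins
fallingIdx = find(d < 0);   % index of the last column of each letter

% First and last column of each letter.
firstColumns = risingIdx + 1;
lastColumns = fallingIdx;
disp([firstColumns; lastColumns]);
```

The two letters occupy columns 3 to 5 and 8 to 9, which is why the start index needs the one-pixel shift while the end index does not.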
I am trying to use the code you have mentioned above to crop an image.
I want to crop the image as follows.
If there are 2 points close to each other and both have y = 0, then I want to crop the part of the image between those 2 points as one image.
Would you be able to help me with this, sir?
Thanks!
There is no zeroth row in an image or a matrix. There is a first row. If you want the first row, you can use indexing:
croppedImage = fullImage(1, :, :);


function se = slag(img)
format long g;
% format compact;
fontSize = 20;
% Read in a standard MATLAB gray scale demo image.
folder = 'C:\Users\vinod\Documents\MATLAB\character';
grayImage = img;
% Get the dimensions of the image.
% numberOfColorBands should be l = 1.
[rows, columns, l] = size(grayImage);
% Display the original gray scale image.
figure, subplot(5, 6, 1);
imshow(grayImage, []);
title('Original Grayscale Image', 'FontSize', fontSize);
% Enlarge figure to full screen.
set(gcf, 'units','normalized','outerposition',[0 0 1 1]);
% Give a name to the title bar.
set(gcf,'name','Segmented objects','numbertitle','on')
%Convert to grayscale
%{
if l > 1
    grayImage = grayImage(:,:,2); % Take green channel
end
disp(l);
%}
%{
% Threshold the image.
binaryImage = grayImage < 175;
% Display the image.
subplot(4, 4, 2);
imshow(binaryImage, []);
title('Binary Image', 'FontSize', fontSize);
% connect all the letters
binaryImage = imdilate(binaryImage, true(7));
% Get rid of blobs less than 200 pixels (the dot of the i).
binaryImage = bwareaopen(binaryImage, 200);
% Display the image.
subplot(4, 4, 3);
imshow(binaryImage, []);
title('Binary Image', 'FontSize', fontSize);
%}
% Find the areas and bounding boxes.
%measurements = regionprops(binaryImage, 'Area', 'BoundingBox');
grayImage = bwareaopen(grayImage, 500);
measurements = regionprops(grayImage, 'Area', 'BoundingBox');
Areas = [measurements.Area]
% Crop out each word
len = length(measurements);
disp(len);
y = 1;
%thisBoundingBox = measurements.BoundingBox;
%figure,imshow(grayImage,thisBoundingBox);
for blob = 1 : len
    % Get the bounding box.
    thisBoundingBox = measurements(blob).BoundingBox;
    % Crop it out of the original gray scale image.
    se = imcrop(grayImage, thisBoundingBox);
    %se = reshape(se,[10, 10]);
    filename = sprintf('segmetimage %d.jpg',y);
    y = y+1;
    file = fullfile(folder,filename);
    imwrite(se,file);
    % Display the cropped image
    subplot(5,6,1+blob); % Switch to proper axes.
    pause(1);
    imshow(se); % Display it.
    % Put a caption above it.
    caption = sprintf('Word #%d', blob);
    title(caption, 'FontSize', fontSize);
end
I want the characters to be separated from the words, but I'm getting word separation instead. I have attached the output image. Could someone please run this and help me? It's urgent.
