How to count the number of words in a sentence?

I have read the text file sentence by sentence using the following code. Now I want to count the number of words in each sentence. How can I do that? I know how to read a text file word by word, but that gives the total number of words in the whole document, and I want the number of words for each sentence separately.
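% textread takes the file name directly, so no fopen/fclose is needed
text = textread('hello.txt','%s','delimiter','.') % split the file at each '.'
[r,c]=size(text) % r is the number of pieces (sentences) read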

3 Comments

Can I also split the sentences into words using regexp? I want to find the length of each word and then compare each letter of the word with all the other letters of the same word, to check how many times a specific letter occurs in that word. I will try to post the algorithm I want to implement so you can get a better idea.
Can you clarify what your desired output actually is? Is it a count of letters per word, or a count of all of the letters used within a single sentence, or within the document?
I think you can do this fairly easily using hist(), but I admit that I am a bit confused as to what your final goal is. Can you post a sample text (maybe only a couple of short sentences), and your desired output?
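The splitting and counting discussed in these comments can be sketched as follows. This is only a minimal illustration, assuming words are separated by whitespace and carry no punctuation. The sample sentence and the variable names are hypothetical, and the letter count uses unique with accumarray rather than hist().

```matlab
% Hypothetical sample sentence; any char vector works here.
sentence = 'text watermarking is difficult than image watermarking';

% Split on runs of whitespace to get the individual words.
words = regexp(strtrim(sentence), '\s+', 'split');

% Length of each word.
wordLengths = cellfun(@numel, words);

% Count how often each distinct letter occurs in one word.
word = lower(words{2});                  % 'watermarking'
[letters, ~, idx] = unique(word);        % distinct letters, plus an index per character
letterCounts = accumarray(idx(:), 1)';   % occurrences of each distinct letter

disp(wordLengths)
disp(letters)
disp(letterCounts)
```

Here letters(k) occurs letterCounts(k) times in the word, so any entry greater than 1 marks a repeated letter.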
First of all, I want to thank you for your help. The algorithm is:
1. Read the text file.
2. Find the number of sentences (NOS) in the text file.
3. For i = 1 to NOS, repeat steps 4 to 15.
4. now = number of words in the ith sentence.
5. For j = 1 to now, repeat steps 6 to 15.
6. low = length of the jth word.
7. For k = 1 to low, repeat steps 8 to 15.
8. sa = letter for comparison/search (the kth letter).
9. count = 0.
10. For m = 1 to word length (low), repeat steps 11 to 13.
11. current letter = mth letter of the word.
12. If (sa == current letter)
    count = count + 1.
13. m = m + 1.
14. If (count > 1)
    word(j) = initial letter of the jth word;
    break out of the inner for loop.
15. k = k + 1.
16. j = j + 1.
17. i = i + 1.
Output: watermark.

In this algorithm, words in which some letter occurs more than once are identified in each sentence, and the initial letters of these words are used to generate a watermark pattern. After the initial letters have been obtained from each sentence, they are concatenated to form the watermark. For example, if the sentence is 'text watermarking is difficult than image watermarking', the desired output is 'twdw'.
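One possible way to write this algorithm in MATLAB is sketched below. It is not the poster's exact loop structure: the inner letter-by-letter comparison is replaced by a check that a word has fewer distinct letters than letters, which holds exactly when some letter repeats. The sample text, the case-insensitive comparison, and the variable names are assumptions made for illustration.

```matlab
% Hypothetical input; in practice txt could come from fileread('hello.txt').
txt = 'text watermarking is difficult than image watermarking. Hello world.';

% Split into sentences on ., ? or ! and drop empty pieces.
sentences = strtrim(regexp(txt, '[.?!]+', 'split'));
sentences = sentences(~cellfun(@isempty, sentences));

watermark = '';
for i = 1:numel(sentences)
    words = regexp(sentences{i}, '\s+', 'split');   % words of the i-th sentence
    for j = 1:numel(words)
        w = lower(words{j});   % compare letters case-insensitively (an assumption)
        % Fewer distinct letters than letters means some letter repeats.
        if numel(unique(w)) < numel(w)
            watermark(end+1) = words{j}(1);   % keep the word's initial letter
        end
    end
end
disp(watermark)
```

With this sample text the first sentence contributes 'twdw' and the second contributes 'H' (from 'Hello'), so the script displays twdwH. Punctuation attached to a word, such as a comma, would be treated as a letter here and may need stripping first.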


 Accepted Answer

Matt Kindig on 18 Jun 2013 (edited 18 Jun 2013)
txt = fileread('hello.txt'); % read in file, assuming it is not too big
sentences = strtrim(regexp(txt, '[\.\?\!]+', 'split')); % split into sentences
% I assume that sentences end with ., ?, or !, which don't appear elsewhere in
% the file (i.e., no quoted strings, abbreviations containing '.', etc.).
% Drop empty pieces, such as the one left after the final terminator.
sentences = sentences(~cellfun(@isempty, sentences));
words = regexp(sentences, '\S+', 'match'); % runs of non-whitespace characters are words
% Counting the words themselves (not the spaces) keeps one-word sentences.
wordsPerSentence = cellfun(@numel, words); % number of words in each sentence
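To sanity-check per-sentence counts without a file, the same kind of split can be run on a short string. This is a sketch under the same assumption as above (sentences end with ., ? or !), and the sample text is hypothetical.

```matlab
% Hypothetical sample text standing in for the contents of hello.txt.
txt = 'Hello. How are you today? I am fine!';

% Split into sentences on ., ? or ! and discard empty pieces
% (the final terminator leaves an empty string behind).
sentences = strtrim(regexp(txt, '[.?!]+', 'split'));
sentences = sentences(~cellfun(@isempty, sentences));

% Count runs of non-whitespace characters in each sentence.
wordsPerSentence = cellfun(@(s) numel(regexp(s, '\S+', 'match')), sentences);

% Print each sentence alongside its word count.
for i = 1:numel(sentences)
    fprintf('%d word(s): %s\n', wordsPerSentence(i), sentences{i});
end
```

For this text wordsPerSentence is [1 4 3]. Counting the words themselves, rather than the gaps between them, keeps one-word sentences such as 'Hello' in the result.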

