Find string that has multiple substrings

I have a cell array, each cell containing a string, and I am trying to find all cells that contain 2 substrings. For example
A = {'Car is fast'; 'Car is slow'; 'Train is fast'; 'Plane is fast'}
I am new to cellfun (that doesn't necessarily have to be the solution) but figured that was the only way to do it.
any(~cellfun('isempty',strfind(A,'Car' 'fast')))
The result should be
[1;0;0;0]

Accepted Answer

John BG on 28 Sep 2016
Tyler
1.- Instead of
any(~cellfun('isempty',strfind(A,'Car' 'fast')))
you should use something like
~cellfun('isempty',strfind(A,'Car'))
for a single pattern, or
and(~cellfun('isempty',strfind(A,'Car')),~cellfun('isempty',strfind(A,'fast')))
for both. Leave out any() here, because it collapses the per-cell results into a single true/false value. If you want to check several patterns (strings to spot), you have to combine the individual results logically: strfind can search a cell array of strings, but it accepts only one pattern per call, and that pattern has to be of type char.
2.- so, to answer your question try this
A = {'Car is fast'; 'Car is slow'; 'Train is fast'; 'Plane is fast'};
B = {'Car'; 'fast'};
nA = numel(A); % number of cells you want to scan
nB = numel(B); % number of patterns you want to look for
marker1 = zeros(nB,nA); % marker1(n,k) = 1 if pattern n is found in cell k
for n = 1:nB
    pat = B{n}; % current pattern
    for k = 1:nA
        str = A{k}; % current string to scan
        if ~isempty(strfind(str,pat)) % strfind returns [] when there is no match
            marker1(n,k) = 1;
        end
    end
end
marker1 now contains all matches; all that is left to do is to AND down each column with
prod(marker1)
ans =
1 0 0 0
This is the result you are after, isn't it? (It is a row vector; transpose it if you need the column [1;0;0;0].)
There are more compact ways to write this answer, without for loops, but testing them takes time.
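One possible loop-free version is sketched below. It is an illustration rather than a tested replacement for the loop above; the names hits and result are made up for this example, and A(:) is used so that the sketch also works if A is a row.

```matlab
A = {'Car is fast'; 'Car is slow'; 'Train is fast'; 'Plane is fast'};
B = {'Car'; 'fast'};

% For each pattern, get one logical column: true where the pattern occurs in A{k}.
hits = cellfun(@(pat) ~cellfun('isempty', strfind(A(:), pat)), ...
               B, 'UniformOutput', false);

hits = [hits{:}];        % numel(A)-by-numel(B) logical matrix, one column per pattern
result = all(hits, 2);   % true only for cells that contain every pattern
```

Here all(hits, 2) plays the role of prod(marker1), but works along rows, so the result is already a column like the one asked for in the question.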
Tyler, would you be so kind as to mark my answer as the accepted answer? Thanks in advance.
To any other reader: if you find my answer helpful, please click on the thumbs-up link. Thanks in advance.
John BG

More Answers (1)

Walter Roberson on 28 Sep 2016
A = {'Car is fast'; 'Car is slow'; 'Train is fast'; 'Plane is fast'};
targets = {'Car', 'fast'};
lit_targets = regexptranslate('escape', targets);
pattern = [sprintf('(?=.*%s)', lit_targets{:}) '.?'];
matches = ~cellfun(@isempty, regexp(A, pattern) );
This is extendable to any number of strings in targets.
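As a rough illustration of that extension, the sketch below uses three targets and shows the pattern that sprintf builds: one lookahead per target, because sprintf reuses the format for every argument. The third target and the cellfun wrapper around regexptranslate are choices made for this example only.

```matlab
A = {'Car is fast'; 'Car is slow'; 'Train is fast'; 'Plane is fast'};
targets = {'Car', 'is', 'fast'};   % three targets instead of two (example values)

% Escape each target separately so it is matched literally.
lit_targets = cellfun(@(t) regexptranslate('escape', t), targets, ...
                      'UniformOutput', false);

% sprintf repeats the format once per target, giving one lookahead each.
pattern = [sprintf('(?=.*%s)', lit_targets{:}) '.?'];
% pattern is now '(?=.*Car)(?=.*is)(?=.*fast).?'

matches = ~cellfun(@isempty, regexp(A, pattern));   % one logical per cell of A
```

Each (?=.*target) only checks that the target occurs somewhere ahead without consuming any characters, so all of the lookaheads have to succeed at the same position for a string to match.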
The step with regexptranslate is to ensure that anything in targets is matched literally. For example, if you had 'Car.' then the period needs to be treated as an exact period. Without this step, regexp would treat the period as meaning "any one character".
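The effect of the escaping step can be seen in a small sketch. The test strings in A2 and the variable names are hypothetical and chosen only to show the difference.

```matlab
A2 = {'Cart is fast'; 'Car. is fast'};         % hypothetical test strings
target = 'Car.';
lit = regexptranslate('escape', target);       % 'Car\.'

% Unescaped: the period matches any one character, so 'Cart' also matches.
raw = ~cellfun(@isempty, regexp(A2, target));

% Escaped: the period must be a literal period, so only 'Car.' matches.
escaped = ~cellfun(@isempty, regexp(A2, lit));
```

With these inputs raw is true for both strings, while escaped is true only for the second one.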
There is another potential approach:
A = {'Car is fast'; 'Car is slow'; 'Train is fast'; 'Plane is fast'};
targets = {'Car', 'fast'};
lit_targets = regexptranslate('escape', targets);
pattern = strjoin(lit_targets, '|');
matches = cellfun(@length, regexp(A, pattern)) >= length(targets);
However, this will have problems if there is a string that contains multiple copies of one of the words. For example, 'This is a test' contains two copies of 'is', so if you were searching for 'is' and 'car', the two matches for 'is' would count as 2 and the code would not notice that 'car' was not there. This approach is therefore not recommended for general use.
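The failure case described above can be reproduced with a short sketch that runs both approaches on the same string. The variable names are made up for this example, and each target is escaped individually through cellfun.

```matlab
S = {'This is a test'};        % example string from the text above
targets = {'is', 'car'};
lit_targets = cellfun(@(t) regexptranslate('escape', t), targets, ...
                      'UniformOutput', false);

% Counting approach: 'is' is found twice (inside 'This' and as 'is'),
% so the count reaches 2 even though 'car' never occurs.
count_pattern = strjoin(lit_targets, '|');
count_match = cellfun(@length, regexp(S, count_pattern)) >= length(targets);

% Lookahead approach: every target must be present, so there is no match.
look_pattern = [sprintf('(?=.*%s)', lit_targets{:}) '.?'];
look_match = ~cellfun(@isempty, regexp(S, look_pattern));
```

Here count_match comes out true (a false positive), while look_match is false, which is the correct answer.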
