Function 'contains' did not work

Alex on 30 Jan 2024
Commented: Alex on 31 Jan 2024
The background is Cody problem 95.
The goal is to check whether the vector s1 contains the vector s2 (that is, whether s1 covers s2).
The problem is that when the vector is very large, the function 'contains' does not give the result I expect.
case 1
clear
s1 = 1:100;
s2 = [50 51];
s1str=num2str(s1);
s2str=num2str(s2);
tf1=contains(s1str,s2str)
tf1 = logical
0
tf2=contains(s2str,s1str)
tf2 = logical
0
tf1 SHOULD be 1, since [50 51] is part of s1, but the result is 0.
case 2
s1 = 40:60;
s2 = [50 51];
s1str=num2str(s1);
s2str=num2str(s2);
tf1=contains(s1str,s2str)
tf1 = logical
1
tf2=contains(s2str,s1str)
tf2 = logical
0
When the vector s1 is smaller, the function works as expected.
  3 Comments
Dyuman Joshi on 30 Jan 2024
"The background is CODY problem 95
The purpose is to confirm whether the vector s1 contains the vector s2. (or s2 cover s1)"
If you are talking about this problem - https://in.mathworks.com/matlabcentral/cody/problems/95, then you have misunderstood what is being asked. I suggest you go through the question and the test cases once again.
Also, if you want to check whether elements of a numeric array are present or not in another, you should use ismember or ismembertol, instead of converting to a char array and using contains().
Alex on 31 Jan 2024
@Stephen23 thank you so much for your support.
Overriding the default behavior is really a good way!


Accepted Answer

Dinesh on 30 Jan 2024
Hi Alex,
The issue arises from the default behavior of "num2str" when converting arrays to strings. When "s1" is converted to a string, every element is padded to the same field width, which is set by the widest number in the array rather than by the number of digits in each individual number. This means '50' and '51' in "s1str" are separated by more spaces than they are in "s2str", where only two spaces are used between '50' and '51'.
The "contains" function checks for exact substrings, and since the spacing differs, it does not consider '50 51' (with two spaces) to be present in "s1str" (where there are 3 spaces between '50' and '51'). To reliably check for the presence of the sequence of numbers, you would need to account for the variable spacing introduced by "num2str".
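The spacing difference described above can be inspected directly. The following is only a small sketch: the variable names gap1 and gap2 are made up for illustration, and it measures the run of spaces between '50' and '51' in each string. According to the explanation above, the first count should come out larger than the second.

```matlab
% Compare the spacing num2str produces for the two vectors in case 1.
s1str = num2str(1:100);     % widest element is 100 (three digits)
s2str = num2str([50 51]);   % widest element has two digits

% Capture the run of spaces between '50' and '51' in each string.
% gap1 and gap2 are illustrative names, not part of the original post.
gap1 = regexp(s1str, '50( +)51', 'tokens', 'once');
gap2 = regexp(s2str, '50( +)51', 'tokens', 'once');

fprintf('spaces between 50 and 51 in s1str: %d\n', numel(gap1{1}));
fprintf('spaces between 50 and 51 in s2str: %d\n', numel(gap2{1}));
```

If the two counts differ, "contains" has no way to match s2str inside s1str, even though the numbers themselves are present.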
To consistently check if a vector contains another, regardless of size, use numerical operations like "ismember":
s1 = 1:100;
s2 = [50 51];
tf1 = all(ismember(s2, s1));
This code will return a logical "1" because "s2" is contained within "s1", which is the correct behavior for your case. When using numerical arrays, it's more reliable to use numerical comparisons rather than string functions. For the CODY problem 95, this method should give you the correct result even for very large vectors.
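One caveat worth checking before relying on this: "ismember" only tests membership, so it does not look at order or adjacency. A short sketch (the variable names are illustrative) shows that all three of these queries return true:

```matlab
s1 = 1:100;

% ismember tests each element independently, so order and adjacency
% are not taken into account. Variable names here are illustrative.
tfOrdered  = all(ismember([50 51], s1));   % adjacent and in order
tfReversed = all(ismember([51 50], s1));   % reversed order
tfGapped   = all(ismember([10 90], s1));   % not adjacent in s1

disp([tfOrdered tfReversed tfGapped])
```

If the task needs s2 to appear as a contiguous run in the same order, a membership test alone is not enough.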
  3 Comments
Dinesh on 30 Jan 2024
@Alex, case 2 returns 1 because the gap between 50 and 51 in s1str is just 2 spaces, which is the same as the gap between 50 and 51 in s2str. In case 1, however, there are 3 spaces between 50 and 51 in s1str. Therefore, string matching in case 1 didn't give the expected result.
The inconsistency arises because "num2str" takes its spacing from the widest number in the vector: every number is padded to the same field width, so a vector that includes a three-digit number (such as 100) gets wider gaps than a vector with only two-digit numbers. Hence the different outcomes in Case 1 and Case 2.
Thanks for pointing out the critical requirement of preserving the order of elements in the vectors. To meet your requirements, we should implement a custom solution that checks for "s2" as a contiguous subsequence within "s1". Here's an adjusted MATLAB function:
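% Returns true if s2 appears as a contiguous, ordered run inside s1.
% Save this function as checkSubsequence.m, or place it at the end of the script.
function isSubsequence = checkSubsequence(s1, s2)
    isSubsequence = false;
    n = length(s2);
    for i = 1:(length(s1) - n + 1)
        % Compare the window of s1 starting at index i with s2
        if isequal(s1(i:i+n-1), s2)
            isSubsequence = true;
            break;
        end
    end
end
s1 = 1:100;
s2 = [50 51];
tf1 = checkSubsequence(s1, s2);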
Hope this helps!
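If a string-based check is still preferred, one possible sketch is to build the strings with "sprintf" so that the spacing is always exactly one space, and to wrap each number in delimiters. The helper name toKey is hypothetical, and the sketch assumes integer-valued vectors (with non-integers, '%d' would not print plain digits).

```matlab
% toKey is a hypothetical helper: it turns a vector of integers into
% ' n1 n2 n3 ', with one space before and after every number.
toKey = @(v) [' ' sprintf('%d ', v)];

tfA = contains(toKey(1:100),   toKey([50 51]));  % contiguous, in order
tfB = contains(toKey(1:100),   toKey([51 50]));  % reversed order
tfC = contains(toKey([15 16]), toKey([5 1]));    % digits overlap, values do not

disp([tfA tfB tfC])
```

The surrounding spaces are the important design choice: without them, '5 1' would match inside '15 16' even though neither 5 nor 1 is an element of the vector.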
Alex on 31 Jan 2024
thank you so much for your support~
this function really helps!


More Answers (1)

VBBV on 30 Jan 2024
If you use the compose function instead of num2str, then contains works correctly:
clear
s1 = 1:100;
s2 = [50 51];
s1str=compose('%d',(s1)); % converts array elements into discrete array elements
s2str=compose('%d',(s2));
tf1=contains(s1str,s2str)
tf1 = 1×100 logical array
Columns 1 through 49
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 50 through 98
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 99 through 100
0 0
tf2=contains(s2str,s1str)
tf2 = 1×2 logical array
1 1
% case 2
s1 = 40:60;
s2 = [50 51];
s1str=compose('%d',(s1));
s2str=compose('%d',(s2));
tf1=contains(s1str,s2str)
tf1 = 1×21 logical array
0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
tf2=contains(s2str,s1str)
tf2 = 1×2 logical array
1 1
num2str converts the entire array into one continuous string, so when checking which elements two different arrays have in common, it's better to avoid num2str.
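One thing to keep in mind with this approach: on a cell array, "contains" still does substring matching inside each element, so '150' counts as containing '50'. A small sketch with made-up values (the variable names are illustrative) compares it with "ismember", which matches whole elements:

```matlab
% Illustrative values chosen so that one element has '50' as a substring.
s1str = compose('%d', [150 5 50]);   % {'150', '5', '50'}
pat   = compose('%d', 50);           % {'50'}

tfSub   = contains(s1str, pat);   % substring match per element -> [1 0 1]
tfExact = ismember(s1str, pat);   % whole-element match         -> [0 0 1]

disp(tfSub)
disp(tfExact)
```

For the vectors in this thread (1:100 and 40:60) the two give the same answer, but they can differ as soon as one number's digits appear inside another's.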
  2 Comments
Stephen23 on 30 Jan 2024
Edited: Stephen23 on 30 Jan 2024
"if you use compose function nstead of num2str then contains works correctly, "
CONTAINS worked correctly in all of the OP's examples.
VBBV on 30 Jan 2024
Edited: VBBV on 30 Jan 2024
Yes, I meant that the contains function then gives the result the OP expected.


Release

R2023a
