Counting the Occurrences of Each Unique Row of Strings

Hello everyone,
I am working on code that counts how many times each row of strings occurs.
For example,
I have a variable named Proejct1 that contains the following data:
ADS µSOIC8
AVX 0603
AVX 0603
AVX 0603
ELN
EPC 0603
EPC 0603
EPC 0603
FAG DO214AA
FAG SOD128
FAG SOD128
ELN
FAG SOD123W
FAG DO214AC
FAG SOD123W
I want to count the occurrences of the unique rows.
For example, AVX 0603 3
ELN 2
ADS µSOIC8 1 and so on.
Note: I have searched a lot but have not found a concrete answer.
Thank you in advance.
Regards,
Waqar Ali Memon

2 Comments

Hello,
What is the type of "Proejct1"? Does it look like this?
Proejct1 = {
'AVX', '0601'
'AVX', '0603'
'AVX', '0603'
'EPC', '0603'
'EPC', '0603'}
Thank you for your reply. Yes, it looks the same as what you have shown.
Regards,
Waqar Ali


 Accepted Answer

>> P = {'ADS','µSOIC8';'AVX','0603';'AVX','0603';'AVX','0603';'ELN','';'EPC','0603';'EPC','0603';'EPC','0603';'FAG','DO214AA';'FAG','SOD128';'FAG','SOD128';'ELN','';'FAG','SOD123W';'FAG','DO214AC';'FAG','SOD123W'}
P =
'ADS' 'µSOIC8'
'AVX' '0603'
'AVX' '0603'
'AVX' '0603'
'ELN' ''
'EPC' '0603'
'EPC' '0603'
'EPC' '0603'
'FAG' 'DO214AA'
'FAG' 'SOD128'
'FAG' 'SOD128'
'ELN' ''
'FAG' 'SOD123W'
'FAG' 'DO214AC'
'FAG' 'SOD123W'
>> [~,~,id1] = unique(P(:,1));
>> [~,~,id2] = unique(P(:,2));
>> [~,X,idx] = unique([id1,id2],'rows');
>> [cnt,bin] = histc(idx,unique(idx));
>> Q = P(X,:);
>> Q(:,3) = num2cell(cnt)
Q =
'ADS' 'µSOIC8' [1]
'AVX' '0603' [3]
'ELN' '' [2]
'EPC' '0603' [3]
'FAG' 'DO214AA' [1]
'FAG' 'DO214AC' [1]
'FAG' 'SOD123W' [2]
'FAG' 'SOD128' [2]
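As a side note, the counting step can also be written with accumarray instead of histc, which newer MATLAB documentation discourages. The following is only a sketch of that variant using the same demo data; it assumes idx is the third output of unique(...,'rows') exactly as in the answer.

```matlab
% Demo data copied from the answer above.
P = {'ADS','µSOIC8';'AVX','0603';'AVX','0603';'AVX','0603';'ELN','';...
     'EPC','0603';'EPC','0603';'EPC','0603';'FAG','DO214AA';'FAG','SOD128';...
     'FAG','SOD128';'ELN','';'FAG','SOD123W';'FAG','DO214AC';'FAG','SOD123W'};
[~,~,id1] = unique(P(:,1));            % integer code for each first-column string
[~,~,id2] = unique(P(:,2));            % integer code for each second-column string
[~,X,idx] = unique([id1,id2],'rows');  % idx maps every row to its unique-row group
cnt = accumarray(idx,1);               % add 1 per row into its group, i.e. the row count
Q = P(X,:);                            % one representative row per group
Q(:,3) = num2cell(cnt);
```

Because idx already numbers the groups 1, 2, 3, ..., it can be passed to accumarray directly, and the counts come out in the same order as the rows of Q.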

10 Comments

Thank you for your time and help. :-)
It is working and I have used it. :-)
Best Regards,
Waqar Ali Memon
This code gives the wrong result when a number is greater than 1.
How can we modify it to work for numbers greater than 1?
Waiting for your assistance. Thank you :-)
I mean, for instance:
Project1 is summed up as follows:
'ADS' 'µSOIC8' 1
'AVX' '0603' 3
'ELN' [] 2
'ADS' 'µSOIC8' 3
'AVX' '0603' 9
'ELN' [] 8
Instead of showing the result:
'ADS' 'µSOIC8' 4
'AVX' '0603' 12
'ELN' [] 10
It is showing as:
'ADS' 'µSOIC8' 2
'AVX' '0603' 2
'ELN' [] 2
Any help would be appreciated :-)
Thank you.
"Project1 is summed up as follows:"
'ADS' 'µSOIC8' 1
That is what I would expect, based on the description in your question; in fact, you even stated that output explicitly in your question: "ADS µSOIC8 1 and so on."
"Instead of showing the result:"
'ADS' 'µSOIC8' 4
I do not know why you expect the output 4, because your example data only has this line once.
"It is showing as:"
'ADS' 'µSOIC8' 2
Not for me. Using the data that you have given us (which I copied into my answer) I still get exactly the same output as when I answered this six days ago.
If you are using different data, then I do not have that data (unless you upload it here in a comment): please click the paperclip button to upload a .mat file with data that shows this behavior. Also show the complete output that you expect for the uploaded data.
I have attached the .mat file.
Or maybe it would be easier to create a structure in which the first and second columns show all unique IDs, the third column shows the number of times each is present in Project1, the fourth column shows the number of times it is present in Project2, and so on. Just an idea?
Waiting for your assistance.
Thank you :-)
I'm surprised that unique(..., 'rows') doesn't work with cell arrays of char vectors. However, it does work with string arrays, so to simplify things:
%demo data
P = {'ADS','µSOIC8';'AVX','0603';'AVX','0603';'AVX','0603';'ELN','';'EPC','0603';'EPC','0603';'EPC','0603';'FAG','DO214AA';'FAG','SOD128';'FAG','SOD128';'ELN','';'FAG','SOD123W';'FAG','DO214AC';'FAG','SOD123W'}
[values, ~, id] = unique(string(P), 'rows');
count = histcounts(id, 'BinMethod', 'integers')
%for display or storage:
table(values, count')
@Waqar Ali Memon : In your uploaded data the row
'ADS' 'µSOIC8'
only occurs once, and this is exactly what my code shows:
>> Q
Q =
'ADS' 'µSOIC8' [1]
'AVX' '0603' [1]
'ELN' '' [1]
'EPC' '' [2]
'EPC' '0603' [1]
'FAG' 'DO214AA' [2]
'FAG' 'DO214AC' [1]
'FAG' 'SOD123W' [2]
... etc
That row does not occur four times, in fact even the 'ADS' string itself only occurs once:
>> nnz(strcmp(P(:,1),'ADS'))
ans =
1
So far you have not given any data or explanation why you expect that output to be 4.
"Or maybe it would be easier to create a structure in which the first and second columns show all unique IDs, the third column shows the number of times each is present in Project1, the fourth column shows the number of times it is present in Project2, and so on. Just an idea?"
It is certainly an idea, but it means nothing to me because nowhere have you explained what Project2 is, or how it relates to your original question or to my answer.
@Guillaume : yes, using strings is a neat solution.
@Stephen Cobeldick, maybe my explanation was not good, so I am explaining again :-)
See the uploaded data:
On row 9: 'FAG' 'SOD128' 2
On row 112: 'FAG' 'SOD128' 3
So instead of showing: 'FAG' 'SOD128' 2
It should show: 'FAG' 'SOD128' 5
Maybe it is clear now? :-(
@Waqar Ali Memon: your original data did not have a third column, did not contain numeric data, and you requested to count the rows... now you want me to guess that you actually want to sum numeric values... luckily for you my magical crystal ball is working this morning:
vec = accumarray(bin,vertcat(P{:,3}));
Q(:,4) = num2cell(vec);
Giving:
>> Q
Q =
'ADS' 'µSOIC8' [1] [ 1]
'AVX' '0603' [1] [ 3]
'ELN' '' [1] [ 2]
'EPC' '' [2] [ 6]
'EPC' '0603' [1] [ 3]
'FAG' 'DO214AA' [2] [ 40]
'FAG' 'DO214AC' [1] [ 1]
'FAG' 'SOD123W' [2] [ 7]
'FAG' 'SOD128' [2] [ 5] % <--- check this
'FSL' 'LQFP-64 ePAD' [1] [ 1]
'FSL' 'LQFP80-ePad' [2] [ 2]
... etc.
Note that your uploaded data still does not give the output that you requested earlier (because the 'ADS' 'µSOIC8' third-column values sum to 1, not 4 as you requested. Very confusing).
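For readers without the uploaded .mat file, here is a minimal self-contained sketch of the same idea: count the rows per group and sum a numeric third column per group. The demo data is hypothetical (it is not the uploaded file), and the grouping uses the string-array approach from the earlier comment rather than the two separate unique calls.

```matlab
% Hypothetical demo data (NOT the uploaded file): the third column holds quantities.
P = {'FAG','SOD128',2; 'ADS','µSOIC8',1; 'FAG','SOD128',3; 'ELN','',2; 'ELN','',8};
[keys,~,idx] = unique(string(P(:,1:2)),'rows');  % one group per unique pair of strings
rowCount = accumarray(idx,1);                    % number of rows in each group
qtySum   = accumarray(idx,vertcat(P{:,3}));      % sum of the third-column values per group
T = table(keys,rowCount,qtySum)                  % for display or storage
```

With this data the 'FAG' 'SOD128' group has a row count of 2 and a summed quantity of 5, which is the kind of result asked for above.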
Thanks, man, it worked. Sorry for the confusion :-).


More Answers (2)

infinity on 4 Jul 2019
Edited: infinity on 4 Jul 2019
Here is a possible solution that you can refer to:
clear
pj1 = {
'AVX', '0601'
'AVX', '0603'
'AVX', '0603'
'EPC', '0603'
'EPC', '0603'
'FAG', 'SOD123W'}
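n = size(pj1,1); % number of rows (length() would return the largest dimension)
pj2 = cell(n,1);
for i = 1:n
    pj2{i} = [pj1{i,1},' ',pj1{i,2}]; % join the two columns with a space
end
res = cell(n,2);
for i = 1:n
    res{i,1} = pj2{i};
    res{i,2} = sum(ismember(pj2,pj2{i})); % number of rows equal to row i
end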
The result will be
res =
6×2 cell array
{'AVX 0601' } {[1]}
{'AVX 0603' } {[2]}
{'AVX 0603' } {[2]}
{'EPC 0603' } {[2]}
{'EPC 0603' } {[2]}
{'FAG SOD123W'} {[1]}

1 Comment

Thank you for your time and help. This code is also working. So, one may use both codes to get required solution. Thank you so much :-)
Best Regards,
Waqar Ali Memon


A solution with fewer calls to unique:
P = {'ADS','µSOIC8';'AVX','0603';'AVX','0603';'AVX','0603';'ELN','';'EPC','0603';'EPC','0603';'EPC','0603';'FAG','DO214AA';'FAG','SOD128';'FAG','SOD128';'ELN','';'FAG','SOD123W';'FAG','DO214AC';'FAG','SOD123W'}
Ptemp = strcat(P(:,1), '_', P(:,2)) ;
[~, i, j] = unique(Ptemp) ;
OUT = P(i,:) ;
OUT(:,3) = num2cell(accumarray(j, 1, [numel(i) 1], @sum, 0))

1 Comment

Although this does use fewer unique calls than my answer, it unfortunately concatenates the data together (which is exactly what I wanted to avoid). The method is not robust when the data includes an underscore (or space, or whatever character is used to separate them, if any). The method cannot distinguish between these two rows of data:
'A_B' 'C'
'A' 'B_C'
An uncritical reliance on such a method could very easily lead to an unexpected output.
Trung VO DUY's answer also suffers from exactly the same limitation.
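The limitation can be reproduced with a small sketch. The two-row cell array below is the example from this comment; the variable names are only illustrative, and the column-wise comparison uses the string-array form of unique(...,'rows') mentioned in an earlier comment.

```matlab
% Two different rows that become identical once they are joined with '_'.
P = {'A_B','C'; 'A','B_C'};
joined  = strcat(P(:,1), '_', P(:,2));        % both rows turn into 'A_B_C'
nJoined = numel(unique(joined));              % 1: the two rows are wrongly merged
nRows   = size(unique(string(P),'rows'),1);   % 2: comparing column by column keeps them apart
```

Comparing the columns separately (either via two unique calls or via a string array with 'rows') avoids having to pick a separator that is guaranteed not to occur in the data.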


Release

R2017b
