How to group string datasets?

I have the following dataset:
ID_Ref SP MD FI
123456 [] [] 'A+'
234567 [] [] 'A'
234567 [] [] 'A'
345678 [] [] 'A+'
345678 [] 'Aa2' []
456789 [] [] 'A+'
456789 [] 'Aa2' []
456789 'AA+' 'Aa2' []
All of the columns are string arrays.
How do I group the dataset above by unique ID_Ref so that it reads:
ID_Ref SP MD FI
123456 [] [] 'A+'
234567 [] [] 'A'
345678 [] 'Aa2' 'A+'
456789 'AA+' 'Aa2' 'A+'
Apologies if you are unable to read this table in this message box.

 Accepted Answer

Azzi Abdelmalek on 17 Apr 2014
Edited: Azzi Abdelmalek on 17 Apr 2014
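% Sample data from the question: ID_Ref, SP, MD, FI
ratings={123456 [] [] 'A+'
234567 [] [] 'A'
234567 [] [] 'A'
345678 [] [] 'A+'
345678 [] 'Aa2' []
456789 [] [] 'A+'
456789 [] 'Aa2' []
456789 'AA+' 'Aa2' []};
% Convert every cell to char so the rating columns can be compared ([] becomes '')
A=cellfun(@num2str,ratings,'UniformOutput',false);
% Numeric IDs, their unique values (ii) and the group index of each row (kk)
c1=cell2mat(ratings(:,1));
[ii,jj,kk]=unique(c1,'stable');
cc=size(A,2);
out=cell(numel(ii),cc);
for k=1:numel(ii)
    idx=find(kk==k);   % rows that belong to the k-th unique ID
    out{k,1}=ii(k,1);
    for mm=2:cc
        a=A(idx,mm);
        b=unique(a);       % sorted, so '' comes first
        out{k,mm}=b{end};  % last entry is a non-empty rating if one exists
    end
end
% Turn the remaining '' placeholders back into []
out(cellfun(@isempty,out))={[]}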

4 Comments

Mattew commented
Hi. I have tried your suggestion, but it does not work, unless I am doing something wrong.
It takes the first entry for each ID and then skips to the next unique record.
This is the code I wrote based on your suggestion:
[~, I1] = unique(Ratings(:, 1));
Ratings1 = Ratings(I1, :);
I need the rows to be merged as well as made unique.
Hi Azzi,
Thanks for the code you sent.
I have tried it out and it doesn't work. Could you please explain what each line of the code is doing?
Thanks
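For anyone following along, the grouping in the answer rests on the three-output form of unique. Here is a minimal sketch of what those outputs contain; the variable names vals, firstRow and group are made up for illustration and are not from the answer:

```matlab
% ID column only, as a numeric vector (illustrative subset of the question's IDs)
ids = [123456; 234567; 234567; 345678];

% vals     : the unique IDs, in order of first appearance ('stable')
% firstRow : the row where each unique ID first occurs
% group    : for every input row, the index of its ID within vals
[vals, firstRow, group] = unique(ids, 'stable');

% Rows that belong to the second unique ID (234567)
rowsOfSecond = find(group == 2);
```

In the answer's loop, kk plays the role of group here: find(kk==k) picks out all rows sharing the k-th ID, and those rows are then merged column by column.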
Hi Azzi,
I have tried your code above again and it works to a certain extent, but I am now getting the following error:
"Index exceeds matrix dimensions."
The error occurs inside the for loop.
Here is the code I have used:
Ratings = dataset2cell(Ratings);
Ratings = cellfun(@num2str,Ratings,'un',0);
[ii,jj,kk] = unique(Ratings,'stable');
cc = size(Ratings,2);
Ratings1 = cell(numel(ii),cc);
for k=1:numel(ii)
idx=find(kk==k);
Ratings1{k,1}=ii(k,1);
for mm=2:cc
a=Ratings(idx,mm);
b=unique(a);
Ratings1{k,mm}=b{end};
end
end
Ratings1(cellfun(@isempty ,out)) = {[]};
Can you help me with the for loop, please? It still produces the final unique table with merged ratings, so how do I bypass or correct it so that this error message does not appear?
Thanks
Okay, I have done a bit more digging, and it is to do with the following function call:
[ii,jj,kk] = unique(Ratings,'stable');
It takes into account the rating column headers (SP, MD and FI) and the values these columns can take, and puts them at the bottom of ii.
Any suggestions how to make the function work properly?
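One possible way around this, offered as a sketch rather than a tested fix for the actual data: call unique on the ID column only, not on the whole cell array, and drop the header row first. The sketch assumes that dataset2cell puts the variable names in the first row of its output; the cell array C below is a hand-built stand-in for that output, and the names data, ids, grp and merged are made up for illustration. It also indexes the final cleanup with the result variable itself, where the code above refers to an undefined out.

```matlab
% Stand-in for the output of dataset2cell(Ratings): header row plus data rows
C = {'ID_Ref' 'SP'  'MD'  'FI'
     123456   []    []    'A+'
     234567   []    []    'A'
     234567   []    []    'A'
     345678   []    []    'A+'
     345678   []    'Aa2' []
     456789   []    []    'A+'
     456789   []    'Aa2' []
     456789   'AA+' 'Aa2' []};

data = C(2:end, :);                                   % drop the header row
A = cellfun(@num2str, data, 'UniformOutput', false);  % [] becomes ''

% Group on the ID column only, not on the whole cell array
[ids, ~, grp] = unique(A(:, 1), 'stable');

nCols = size(A, 2);
merged = cell(numel(ids), nCols);
for k = 1:numel(ids)
    rows = (grp == k);               % rows that share the k-th ID
    merged{k, 1} = ids{k};           % ID is kept as char here
    for c = 2:nCols
        vals = unique(A(rows, c));   % sorted, so '' comes first
        merged{k, c} = vals{end};    % non-empty rating if the group has one
    end
end
merged(cellfun(@isempty, merged)) = {[]};   % index with merged itself
```

Because every cell was passed through num2str, the IDs in merged are character strings; str2double can convert them back if numeric IDs are needed. As in the original answer, if one ID has two different non-empty ratings in the same column, only the one that sorts last is kept.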


Asked: 17 Apr 2014
Commented: 22 Apr 2014
