How do you remove rows that contain a duplicate value within the same column of a cell array?

I need to remove an entire row from a cell array whenever the value in a given column (in this example, column 4, headed d) equals the value in that same column of the following row, so that only one row remains for each value in that column.
For example:
data = {'a','b','c','d';
'z','y', 1, 2;
'z','y', 1, 2;
'z','y', 1, 3;
'z','y', 1, 3;
'z','y', 1, 4;}
where a, b, c, and d are simply column headers,
so that I am left with
newdata = {'a','b','c','d';
'z','y', 1, 2;
'z','y', 1, 3;
'z','y', 1, 4;}
Here, in newdata, the rows holding the duplicate values (2 and 3) in column 4, headed d, have been removed in their entirety.

Answers (1)

I am answering as if you could have multiple consecutive rows with the same value in column 4.
% - Modified data set that also tests runs of more than two equal values.
data = {'a','b','c','d'; 'z','m',1,2; 'z','n',1,2; 'z','o',1,2; ...
'z','y',1,3; 'z','y',1,3; 'z','y',1,4} ;
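% - Build logical index vector of rows to keep based on differences in col 4.
%   The header row and the last row are always kept; any other row is kept
%   only if its value differs from the next row (last row of each run).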
toKeep = [true; diff( cell2mat( data(2:end,4) )) ~= 0; true] ;
% - Extract relevant rows.
newData = data(toKeep,:) ;
This produces
>> newData
newData =
'a' 'b' 'c' 'd'
'z' 'o' [1] [2]
'z' 'y' [1] [3]
'z' 'y' [1] [4]
PS: the same code also works if the data never contains more than two consecutive rows with the same value.
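Note that the output above keeps the last row of each run (the 'o' row). If the first row of each run should be kept instead, or if duplicates can appear on non-consecutive rows, something along these lines might work. This is only a sketch: the names vals, keepFirst, newDataFirst, iFirst and newDataUnique are introduced here for illustration, and it assumes that column 4 holds numeric scalars below the header and that there is at least one data row.

```matlab
% Same test data as in the answer above.
data = {'a','b','c','d'; 'z','m',1,2; 'z','n',1,2; 'z','o',1,2; ...
        'z','y',1,3; 'z','y',1,3; 'z','y',1,4} ;

% Values of column 4 as a numeric column vector (header row excluded).
% Assumption: every entry below the header is a numeric scalar.
vals = cell2mat( data(2:end,4) ) ;

% Variant 1: keep the FIRST row of each run of consecutive equal values.
% The two leading trues keep the header row and the first data row; any
% later row is kept only if its value differs from the previous row.
keepFirst = [true; true; diff( vals ) ~= 0] ;
newDataFirst = data(keepFirst,:) ;

% Variant 2: keep the first occurrence of each value, even when the
% duplicates are not on consecutive rows. 'stable' preserves row order.
[~, iFirst] = unique( vals, 'stable' ) ;
newDataUnique = data([1; iFirst+1],:) ;   % +1 skips the header row
```

For this test data both variants keep the header and the 'm', first 'y' (value 3) and last 'y' (value 4) rows. They differ only when the same value reappears after a different one: variant 1 would keep each run separately, while variant 2 would keep only the first occurrence.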

Asked: 28 Jan 2015
Edited: 28 Jan 2016