Deleting certain rows with partially duplicate values
Show older comments
If two rows have matching values in column A and in column B, I need to delete the row with the value RED. For example:
A B C D
1 March-1st 2 RED
1 March-1st 3 GREEN
3 March-5th 8 GREEN
6 March-8th 4 GREEN
2 April-3rd 5 RED
2 April-3rd 0 GREEN
Here I would need to delete row 1 and 5 beacuse they have the value RED in column D while they share values in column A and B with another row.
Answers (1)
I'll assume that's a table and the text entries are strings.
T = table( ...
[1;1;3;6;2;2], ...
["March-1st";"March-1st";"March-5th";"March-8th";"April-3rd";"April-3rd"], ...
[2;3;8;4;5;0], ...
["RED";"GREEN";"GREEN";"GREEN";"RED";"GREEN"], ...
'VariableNames',["A","B","C","D"])
[G,GID] = findgroups(T.A,T.B);
is_red = strcmp(T.D,"RED");
delete_row = false(size(T,1),1);
for ii = 1:numel(GID)
idx = G == GID(ii);
if nnz(idx) == 1
continue
end
delete_row(idx & is_red) = true;
end
T(delete_row,:) = []
7 Comments
Charles
on 12 Mar 2024
Voss
on 12 Mar 2024
You're welcome!
Please upload your table (save it to a mat file and upload the mat file using the paperclip button), and I'll take a look.
One way you'd get that error is if the date column is a cell array (as opposed to a string array like I've used) and is the first argument passed to findgroups.
T = table( ...
[1;1;3;6;2;2], ...
cellstr(["March-1st";"March-1st";"March-5th";"March-8th";"April-3rd";"April-3rd"]), ...
[2;3;8;4;5;0], ...
["RED";"GREEN";"GREEN";"GREEN";"RED";"GREEN"], ...
'VariableNames',["A","B","C","D"])
[G,GID] = findgroups(T.B,T.A); % B (cell array) first gives the error
is_red = strcmp(T.D,"RED");
delete_row = false(size(T,1),1);
for ii = 1:numel(GID)
idx = G == GID(ii);
if nnz(idx) == 1
continue
end
delete_row(idx & is_red) = true;
end
T(delete_row,:) = []
In that case, make the numeric column be the first one passed to findgroups:
[G,GID] = findgroups(T.A,T.B); % numeric first
Charles
on 12 Mar 2024
Voss
on 12 Mar 2024
Please upload your table (save it to a mat file and upload the mat file using the paperclip button), and I'll take a look.
Voss
on 13 Mar 2024
You should be deleting rows from TUnique instead of T.
Can you upload the xlsx file? Without the data, I can only guess.
Categories
Find more on Data Type Identification in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!