How to remove duplicate groups of rows
I have a matrix of size 20000 x 30. I'm trying to identify groups of rows that are similar (but not identical) to each other. What I did was go through each row and compare it with all 20000 rows of data. Now I have multiple groups of rows that meet my criteria. The problem is that a lot of them are duplicates. For example, if rows 1, 2, and 5 form a group, I end up with three groups: 1,2,5; 2,1,5; and 5,1,2.
Here is my question: how do I remove the duplicate groups so that in the end I only have one group, 1,2,5, in the above example?
Many thanks!
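To make the problem concrete, here is a minimal sketch of the direct approach, assuming the groups are stored as a cell array of row-index vectors (the variable names `groups`, `sortedGroups`, `keys`, and `uniqueGroups` are hypothetical, not from the original post). Sorting the indices inside each group makes permutations of the same group identical, so `unique` can then drop the repeats.

```matlab
% Hypothetical input: each cell holds the row indices of one group,
% in whatever order the row-by-row search produced them.
groups = {[1 2 5], [2 1 5], [5 1 2], [3 7], [7 3]};

% Sort the indices within each group so that permutations become identical.
sortedGroups = cellfun(@sort, groups, 'UniformOutput', false);

% Groups may differ in length, so build a text key per group instead of
% stacking them into a numeric matrix, then keep the first occurrence
% of each key ('stable' preserves the original order).
keys = cellfun(@(g) sprintf('%d,', g), sortedGroups, 'UniformOutput', false);
[~, firstIdx] = unique(keys, 'stable');
uniqueGroups = sortedGroups(firstIdx);
```

If every group happens to have the same number of members, the groups could instead be stacked into a numeric matrix `G` and reduced with `unique(sort(G,2),'rows')`.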
Answers (1)
Andrei Bobrov
on 30 Nov 2017
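Let A be your data as an array of size 2e4 x 30, with columns [Year Month Day Longitude Latitude ... etc.]
% Treat rows as duplicates when longitude and latitude (columns 4:5) each differ by at most 0.5
[~,ii] = uniquetol(A(:,4:5),.5,'ByRow',true,'DataScale',[1 1]);
% Keep one representative row per group
out = A(ii,:);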
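As a small illustration of how this behaves, the sketch below runs the same call on a made-up 5-row matrix (the values are invented for the example, and the column layout follows the assumption above that columns 4 and 5 hold longitude and latitude). The third output of `uniquetol`, which the answer does not use, labels each row with the representative it was matched to, so it can also be used to list which rows fall into each group.

```matlab
% Made-up data: [Year Month Day Longitude Latitude]
A = [2017 11 30 10.0 50.0
     2017 11 30 10.2 50.1
     2017 11 30 20.0 60.0
     2017 11 30 10.3 49.9
     2017 11 30 20.4 60.2];

% With 'DataScale',[1 1] the tolerance 0.5 is absolute in each column:
% two rows match when both longitude and latitude differ by at most 0.5.
[~, ii, ic] = uniquetol(A(:,4:5), 0.5, 'ByRow', true, 'DataScale', [1 1]);

% One representative row per group.
out = A(ii,:);

% Row numbers belonging to each group, collected from the labels in ic.
groupsOfRows = accumarray(ic, (1:size(A,1))', [], @(v) {sort(v).'});
```

Because the points here form two well-separated clusters, `out` should contain two rows and `groupsOfRows` two cells. When clusters are not well separated, which rows get merged can depend on the choice of representatives, so the tolerance is worth checking against the real data.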