Help required with deleting rows of a matrix with similar first 2 elements (after summing up the last elements of rows to be deleted)

[1069 12059 100200 145
1069 12063 100200 471
1073 1001 100100 213
1073 1001 100101 213
1073 1007 100100 633
1073 1007 100101 633]
This is a portion of the matrix. As you can see, the pairs 1073-1001 and 1073-1007 each appear twice (this is not the case for the entire matrix; some pairs may appear three times, e.g. 1073-1008 may occur three times further down). I want there to be only one row each for 1073-1001 and 1073-1007, with the elements in column 4 being 213+213=426 and 633+633=1266, respectively.
So I want the output to be like:
[1069 12059 100200 145
1069 12063 100200 471
1073 1001 100100 426
1073 1007 100100 1266]
The actual matrix has dimensions 10730x4.
Having just started using MATLAB, I am having great difficulty solving this problem. Any help would be much appreciated. Thank you.

7 Comments

No, it is in MATLAB, in matrix form. Sorry, I should have mentioned that. It's a matrix with dimensions 10730x4.
I didn't know how else to show an example on this website.
@faraz: Posting the values directly would be much better than a screenshot of an Excel sheet. Using MATLAB syntax lets others copy and paste the data to test a suggestion before posting.
@Jan, Hi. How do I post the matrix from MATLAB here?
@faraz: A small example is usually enough to explain a problem.
AnExample = [1,2,3; ...
4,5,6; ...
7,8,9];
Just attach the m-file where you create this array. Use the paper clip icon.
Attached the code. Thanks. I'm assuming that by m-file you meant the code I was using, because that's the only m-file I could find.

Answers (2)

Do you mean this:
[dummy, Index] = unique(Data(:, 1:2), 'rows');
Result = Data(Index, :);
[EDITED] Adding up the corresponding elements of the 4th column as well:
[dummy, Index] = unique(Data(:, 1:2), 'rows');
S = accumarray(Index(:), Data(:, 4));
Result = cat(2, Data(Index, 1:3), S);

3 Comments

@Jan, Thank you for the quick response. When I run this (substituting my matrix name for the Data variable), I get the error "Index exceeds matrix dimensions".
Also, does this code add the values in the 4th column? Before deleting the rows that share common 1st and 2nd column values (and leaving a unique one), I want to take the sum of the values in the 4th column of all these rows and put this sum in front of the unique row.
So for example, in the Excel sheet I provided there are 3 rows with common 1st and 2nd column values. For the rows which have 1073 and 1002, I want to sum the values in the 4th column and then delete the extra rows, so that I have just one row with 1073 and 1002 with the sum in front of it.
@faraz: I cannot guess where the error comes from. Please post a small example which reproduces the problem.
I omitted the line that sums the values because I was not sure I understood what you were looking for. See [EDITED].
Hi Jan,
I am sorry. I copied the original (unedited) code you gave incorrectly. Now it is not giving any errors; however, it is not changing my data file at all. The number of elements and everything else remains the same. I assume that the purpose of the unedited code was to remove all rows that have the same elements in the first and second column. Shouldn't that be doing something?
The edited code with accumarray gives the following error:
"Error using accumarray
Second input VAL must be a vector with one element for each row in SUBS, or a scalar.
Error in untitled2task2test (line 80)
S = accumarray(Index(:), pairs3(:, 4));"
Here my matrix is called pairs3, where you called it Data.
Thank you for your patience. It's just that I am new to these forums and to MATLAB in general, and I am facing a lot of difficulty in accomplishing this task.
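Both symptoms reported in this comment have a plausible explanation, sketched below under the assumption that pairs3 is an N-by-4 numeric matrix like the example in the question. The names firstIdx, groupIdx, sums and result are illustrative and do not come from the original code. First, unique returns a new array and does not modify its input, so pairs3 itself stays the same size; the reduced matrix has to be read from the output variable. Second, the accumarray error is consistent with passing the second output of unique as subscripts: that output has one entry per unique row, whereas accumarray needs one subscript per row of the data. The third output of unique has the required length.

```matlab
% Small stand-in for pairs3 (values taken from the question).
pairs3 = [1069 12059 100200 145
          1069 12063 100200 471
          1073  1001 100100 213
          1073  1001 100101 213
          1073  1007 100100 633
          1073  1007 100101 633];

% firstIdx: row of the first occurrence of each unique (col1, col2) pair.
% groupIdx: for every row of pairs3, the number of the group it belongs to.
[~, firstIdx, groupIdx] = unique(pairs3(:, 1:2), 'rows', 'first');

% Sum column 4 within each group; groupIdx has one entry per row of pairs3.
sums = accumarray(groupIdx(:), pairs3(:, 4));

% Build a new matrix; pairs3 itself is not modified.
result = [pairs3(firstIdx, 1:3), sums];

disp(size(pairs3))   % 6 4
disp(size(result))   % 4 4
```

If the reduced matrix should replace the original, it can be assigned back explicitly, e.g. pairs3 = result;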
a = [1069 12059 100200 145
1069 12063 100200 471
1073 1001 100100 213
1073 1001 100101 213
1073 1007 100100 633
1073 1007 100101 633];
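% aa: unique (col1,col2) pairs; b: first row of each pair; c: group number of every row
[aa,b,c] = unique(a(:,1:2),'first','rows');
% keep column 3 of the first occurrence and sum column 4 within each group
out = [aa,a(b,3),accumarray(c,a(:,4))];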
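The question mentions that some pairs, such as 1073-1008, may occur three times. The same two steps appear to handle that case without changes, as the sketch below suggests. The three rows for 1073-1008 are hypothetical and their values are made up for illustration; counts is also a name introduced here.

```matlab
% Example rows from the question plus three HYPOTHETICAL rows for the
% pair 1073-1008 (their values are made up for illustration).
a = [1069 12059 100200 145
     1069 12063 100200 471
     1073  1001 100100 213
     1073  1001 100101 213
     1073  1007 100100 633
     1073  1007 100101 633
     1073  1008 100100  10
     1073  1008 100101  20
     1073  1008 100102  30];

% Same two steps as above: group on columns 1 and 2, then sum column 4.
[aa, b, c] = unique(a(:, 1:2), 'first', 'rows');
out = [aa, a(b, 3), accumarray(c, a(:, 4))];

% counts(k) is the number of rows that were merged into row k of out.
counts = accumarray(c, 1);

disp(out(end, :))   % 1073 1008 100100 60
disp(counts.')      % 1 1 2 2 3
```

Because c assigns a group number to every row, the number of repeats per pair does not need to be known in advance.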
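Another variant, which works only for your example because it groups the rows by the value in the last column instead of by the first two columns: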
[~,b2,c2] = unique(a(:,end),'first');
[~,ii] = sort(b2);
out2 = [a(b2(ii),1:end-1), accumarray(ii(c2),a(:,end))];
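If the rows should stay in their order of first appearance rather than being sorted, one option is to keep grouping on columns 1 and 2 and pass the 'stable' flag to unique, in releases that support it. The sketch below uses a shuffled copy of the example rows; the shuffling and the names aShuffled and out3 are assumptions for illustration, not part of the original answer.

```matlab
% Example rows from the question in a shuffled order (the shuffling is an
% assumption made here to show the effect of 'stable').
aShuffled = [1073  1007 100100 633
             1069 12059 100200 145
             1073  1007 100101 633
             1069 12063 100200 471
             1073  1001 100100 213
             1073  1001 100101 213];

% 'stable' returns the unique pairs in order of first appearance.
% b: row of each pair's first occurrence; c: group number of every row.
[pairsU, b, c] = unique(aShuffled(:, 1:2), 'rows', 'stable');

% Column 3 comes from the first occurrence; column 4 is summed per group.
out3 = [pairsU, aShuffled(b, 3), accumarray(c, aShuffled(:, 4))];

disp(out3)
% 1073  1007 100100 1266
% 1069 12059 100200  145
% 1069 12063 100200  471
% 1073  1001 100100  426
```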

This question is closed.

Asked:

on 30 Sep 2013

Closed:

on 20 Aug 2021
