# reduce matrix data by combining values

Christoph Meier on 1 Mar 2016
Commented: Walter Roberson on 1 Mar 2016
Dear matlab community,
I need to reduce the data in a matrix with two columns and many rows (almost 2 billion, so I would like to find a way to automate it!).
The structure is as follows:
Column 1 has a bunch of discrete values, which occur repeatedly.
Example:
a
b
a
c
b
Column 2 holds values that should be summed for all "a", all "b", and all "c", so that only one row each for "a", "b", and "c" is left in the final matrix.
Example:
Initial matrix:
a 10
b 5
a 1
c 20
b 3
The final matrix should look like this:
a 11
b 8
c 20
Thank you very much! A solution would help me a lot!

Walter Roberson on 1 Mar 2016
In the below code I assume your array is a cell array (because you cannot mix letters and numbers in a numeric array).
[unique_keys, ~, idx] = unique(YourArray(:,1));
vals_to_total = cell2mat( YourArray(:,2) );
totals = accumarray(idx, vals_to_total);
results = [unique_keys, num2cell(totals)];
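As a quick check, the steps above can be tried on the sample data from the question. This is only a minimal sketch: the name YourArray and the 5-by-2 cell-array layout (keys as character vectors in column 1, numbers in column 2) are assumptions made for illustration, not the actual data.

```matlab
% Sample data from the question, stored as a cell array (assumed layout).
YourArray = {'a', 10; 'b', 5; 'a', 1; 'c', 20; 'b', 3};

% idx maps every row to the position of its key in unique_keys.
[unique_keys, ~, idx] = unique(YourArray(:,1));

% Pull the numeric column out of the cell array.
vals_to_total = cell2mat(YourArray(:,2));

% Sum the values that share the same key index.
totals = accumarray(idx, vals_to_total);      % [11; 8; 20]

% One row per key: {'a', 11; 'b', 8; 'c', 20}
results = [unique_keys, num2cell(totals)];
```

unique() returns the keys in sorted order, so the rows of results follow that order rather than the order of first appearance in the input.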

Christoph Meier on 1 Mar 2016
Thank you very much, Walter! First, for editing, second, for your correct assumption: a,b,c was only for simplification, they are actually numerals, and third for the great solution!
Christoph Meier on 1 Mar 2016
I came across a problem:
I replaced YourArray with my actual array "in10", the matrix with 2 columns and about 2 billion rows:
[unique_keys, ~, dx] = unique(in10(:,1));
vals_to_total = cell2mat( in10(:,2));
totals = accumarray(idx,vals_to_total);
results = [unique_keys,num2cell(totals)];
However, I get the following error message:
Error in unique>uniqueR2012a (line 542)
groupsSortA = [true; groupsSortA]; % First element is always a member of unique list.
Error in unique (line 88)
[varargout{1:nlhs}] = uniqueR2012a(varargin{:});
Do you know what the problem may be?
Thank you!
Walter Roberson on 1 Mar 2016
[unique_keys, ~, idx] = unique(YourArray(:,1));
vals_to_total = YourArray(:,2);
totals = accumarray(idx, vals_to_total);
results = [unique_keys, totals];
You did not show enough of the error message for me to see what it is complaining about.
Is it possible that your data is a MATLAB table() or dataset() data type, or something other than numeric or a cell array?
Also watch out: you assigned to "dx" in the unique() line, but you use "idx" in the accumarray() line.
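For purely numeric data, the same idea can be sketched on the sample values with the index variable named consistently. The keys 1, 2, 3 stand in for a, b, c here, and the small in10 matrix is a made-up stand-in for the real one, so treat both as assumptions.

```matlab
% Small numeric stand-in for in10: column 1 = key, column 2 = value.
in10 = [1 10; 2 5; 1 1; 3 20; 2 3];

% The third output must be captured under the same name that is
% passed to accumarray below (idx in both places).
[unique_keys, ~, idx] = unique(in10(:,1));

% Sum column 2 within each group of equal keys.
totals = accumarray(idx, in10(:,2));

% Plain numeric result, one row per key: [1 11; 2 8; 3 20]
results = [unique_keys, totals];
```

No cell2mat or num2cell is needed in this case, because both columns are already numeric.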