Cumulative sum by group

18 views (last 30 days)
zppine
zppine on 27 Jul 2012
Answered: Quoc T. Phan on 19 Jun 2020
Hi, Could someone explain how to compute the running sum by group (without using for loop)? A example is: groupid = [1; 1; 1; 2; 2; 3; 3;]; value=[1;2;3;4;5;6;7]. the desired result is: runsum = [1;3;6; 4;9; 6;13]. Thanks a lot. zppine

Answers (4)

Sean de Wolski
Sean de Wolski on 27 Jul 2012
Edited: Sean de Wolski on 27 Jul 2012
Using a for-loop here may be your best option. However, my unconditional love for accumarray forces me to present you with this:
groupid = [1; 1; 1; 2; 2; 3; 3;];
value=[1;2;3;4;5;6;7];
gcs = cell2mat(accumarray(groupid,value,[],@(x){cumsum(x)}))

zppine
zppine on 27 Jul 2012
Hi, Sean. Very good idea. Thanks a lot.

Daniel
Daniel on 3 Jul 2015
function S = labeledCumsum(X,L)
% e.g. X=[3 5 8 9 1 2 5 8 5 4 9 2]
% L=[0 1 1 2 2 0 0 1 1 1 0 5]
% result=[NaN 5 13 9 10 NaN NaN 8 13 17 NaN 2]
L = L(:);
X = X(:);
if numel(L) ~= numel(X)
error('The two inputs should be vectors of the same length.')
end
% Do the full cumulative sum
X(isnan(X)) = 0;
S = cumsum(X);
mask = logical(L);
% Lookup the cumulative value just before the start of each segment
isStart = mask & [true ; L(1:end-1) ~= L(2:end)];
startInds = find(isStart);
if startInds(1) == 1
S_starts = [0 ; S(startInds(2:end)-1)];
else
S_starts = S(startInds-1);
end
% Subtract off the excess values (i.e. the ones obtained above)
L_safe = cumsum(isStart); % we do this to main case the labels in L were not sorted integers
S(mask) = S(mask) - S_starts(L_safe(mask));
% Put NaNs in the false blocks
S(L==0) = NaN;
end

Quoc T. Phan
Quoc T. Phan on 19 Jun 2020
Hello
When it comes to work with group, these two functions are great tools. They go together
  1. findgroups: --> indexing groups
  2. splitapply: --> split your data by index, then apply @function to it, then combine back to original
Here is a sample code
clear
load patients
% Create a table X
X = table(LastName,Gender,Height,Weight,Age)
% Sort rows by Gender and Last Name (Actually u dont need Last Name, but I
% just put here in case)
X = sortrows(X,{'Gender' 'LastName'},{'ascend' 'ascend'})
% Function 01: Using findgroups --> to return logical index
G = findgroups(X.Gender)
% (*) Note: by default they index by alphabet order. Although 1st obs is male,
% however, F(emale) is before M(ale). Then 1 is Female and 2 is Male.
% Function 02: Using splitapply
% Basically what it does it slit table X by logical index G
% It then apply function cumsum for variable X.Height
cumHeight = splitapply(@(x1){cumsum(x1)},X.Height,G)
% Change cell to matrix and add to table X
X.cumHeight = cell2mat(cumHeight)

Categories

Find more on Data Type Conversion in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!