Cumulative sum of last n entries of column vectors

3 views (last 30 days)
Hi,
I've got the following 2 column vectors:
"Name" (300x1 Categorical) and "Amount" (300x1 Double).
I wanna calculate the cumulative sum for each category of vector "Name" BUT I don't wanna start my cumulative sum at the first entry of each category, I only want to calculate the cumulative sum of the last n entries (for each category of "Name"). The exmaple table below illustrates my plan for the cumulative sum of the last 2 entries. The (desired) result is shown in "CumAmount".
A is not a good example, so please follow example B :
The row in which on can find the first entry for B shows the value 4 ("Amount") so the cumulative sum ("CumAmount") would be 4 in this row.
Since the next "Amount" for B is 3, the cumulative sum is 7 in this row.
The third "Amount" for B is 1. Since this example refers to the last two entries, the cumulative sum is not 8 but rather 4 (3+1) - the first "Amount" of B is irgnored, only the last 2 values should be used for the cumulative sum.
I was not able to solve my problem using the functions find and ismember.
Is anybody able to help?
Thank you so much!
+------+--------+-----------+
| Name | Amount | CumAmount |
+------+--------+-----------+
| A | 1 | 1 |
+------+--------+-----------+
| B | 4 | 4 |
+------+--------+-----------+
| C | 4 | 4 |
+------+--------+-----------+
| D | 2 | 2 |
+------+--------+-----------+
| E | 0 | 0 |
+------+--------+-----------+
| F | 2 | 2 |
+------+--------+-----------+
| G | 3 | 3 |
+------+--------+-----------+
| H | 3 | 3 |
+------+--------+-----------+
| I | 2 | 2 |
+------+--------+-----------+
| C | 0 | 4 |
+------+--------+-----------+
| E | 1 | 1 |
+------+--------+-----------+
| F | 1 | 3 |
+------+--------+-----------+
| I | 3 | 5 |
+------+--------+-----------+
| B | 3 | 7 |
+------+--------+-----------+
| A | 2 | 3 |
+------+--------+-----------+
| H | 4 | 7 |
+------+--------+-----------+
| D | 2 | 4 |
+------+--------+-----------+
| E | 0 | 1 |
+------+--------+-----------+
| B | 1 | 4 |
+------+--------+-----------+
| H | 1 | 5 |
+------+--------+-----------+
| C | 2 | 2 |
+------+--------+-----------+
| D | 1 | 3 |
+------+--------+-----------+
| F | 2 | 3 |
+------+--------+-----------+
| I | 0 | 3 |
+------+--------+-----------+
  1 Comment
Ameer Hamza
Ameer Hamza on 11 May 2020
Can you attach the data as a mat file. That will at least make it easy to test a solution. This table format is not very useful for importing the data to MATLAB.

Sign in to comment.

Accepted Answer

Sean de Wolski
Sean de Wolski on 11 May 2020
Edited: Sean de Wolski on 11 May 2020
Read saved table
T = readtable('data.txt');
head(T)
ans =
C A CA
_____ _ __
{'A'} 1 1
{'B'} 4 4
{'C'} 4 4
{'D'} 2 2
{'E'} 0 0
{'F'} 2 2
{'G'} 3 3
{'H'} 3 3
Sort the table to get the ordering and then unique indices for each group.
[Tsorted, sortidx] = sortrows(T, "C");
[~, ~, uniqueix] = unique(Tsorted.C, 'stable');
Specify n, index. The sum function is the valid convolution with a window size n preceded by 0.
n = 2;
oneToN = (1:numel(uniqueix)).';
sumXminusN = @(x){conv([0;Tsorted.A(sort(x))],ones(n,1),'valid')};
Accumulate by group and take the convolution of each group. The order matches those in sorted.
ccac = accumarray(uniqueix, oneToN, [], sumXminusN);
ccaf = vertcat(ccac{:});
Undo the sorting operation and check that it worked.
CA(sortidx, 1) = ccaf;
assert(isequal(CA,T.CA))
This works for N=2. You may need to be smarter with the number of zeros to prepend if you want it work for other N...
  1 Comment
buhmatlab
buhmatlab on 11 May 2020
Edited: buhmatlab on 11 May 2020
Awesome! I have not really understood your code and it will take some time to retrace your answer but at least I can tell that it just works fine! Thank you so much!

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!