There is no (good) answer to your question, and whatever answer you settle on will subtly change next year too, with the next release of MATLAB, or the next computer you buy. I'm sorry, but that is how it is, even for something as simple as cumsum. It gets complicated. Blame your computer.
Effectively, for small vector lengths, the time complexity of cumsum will be linear, so O(n), where n is the length of the vector. But for those small vectors, there will also be calling overhead. Error checks, setup, etc. The basic algorithm will be...
- Set up. Error checks. initialize everything.
- A simple loop. Just add the numbers sequentially.
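That loop is simple enough to sketch. Here is a Python stand-in for the sequential pass (MATLAB's actual implementation is compiled and not public, so this only illustrates the O(n) part):

```python
def cumsum_loop(x):
    """Sequential cumulative sum: one pass, one addition per element, so O(n)."""
    out = []
    total = 0
    for v in x:           # the O(n) part: one add and one store per element
        total += v
        out.append(total)
    return out
```

For example, `cumsum_loop([1, 2, 3, 4])` gives `[1, 3, 6, 10]`, just as `cumsum` would in MATLAB.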
So the total time taken will be
c1 + c2*n
where c1 is the total of the initial overhead times. c1 will be pretty much fixed of course. And for really small values of n, c1 will dominate the time required. c2 is just the time for a loop to process each element.
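You can see the two regimes directly from the model. The constants below are made-up numbers purely for illustration, not measurements from any real machine; the point is the break-even length n = c1/c2, below which overhead dominates and above which the timing is essentially linear:

```python
def model_time(n, c1=1e-6, c2=1e-9):
    """Two-term run time model: fixed overhead c1 plus per-element cost c2*n.
    c1 and c2 are hypothetical constants, chosen only for illustration."""
    return c1 + c2 * n

# Overhead and loop cost break even at n = c1/c2 (here, 1000 elements).
crossover = 1e-6 / 1e-9

small = model_time(10)          # dominated by c1: nearly constant
large = model_time(10_000_000)  # dominated by c2*n: doubling n ~doubles time
```

Below the crossover, changing n barely moves the total; far above it, the time scales almost exactly linearly.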
The problem is, when n grows large, your computer does things in a smart way. It realizes that for really big problems involving sums, etc., it can sometimes gain by using multiple processors. So I tried out cumsum, on some pretty big vectors. (I've got a 16 core machine, and a fair amount of RAM, so I could go pretty large in these tests.)
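The reason a cumulative sum can use several cores at all is that it parallelizes as a two-pass "prefix sum": each core scans its own chunk independently, then a cheap second pass adds each chunk's running offset. Here is a serial Python sketch of that decomposition (the per-chunk scans in pass 1 are what a multithreaded implementation would farm out to separate cores; this is an illustration of the idea, not MATLAB's actual code):

```python
from itertools import accumulate

def chunked_cumsum(x, nchunks=4):
    """Two-pass prefix sum: the decomposition a multicore cumsum could use.
    Pass 1: cumsum each chunk independently (the parallelizable part).
    Pass 2: add each chunk's running offset (cheap: one value per chunk)."""
    size = max(1, -(-len(x) // nchunks))             # ceiling division
    chunks = [x[i:i + size] for i in range(0, len(x), size)]
    partial = [list(accumulate(c)) for c in chunks]  # pass 1
    out, offset = [], 0
    for p in partial:                                # pass 2
        out.extend(v + offset for v in p)
        if p:
            offset += p[-1]
    return out
```

The result matches a plain single-pass cumsum exactly; only the work schedule changes, which is why the core count affects speed but not the answer.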
I was watching the number of cores my computer used during those runs, though my activity monitor was probably not sampling very finely. It did get more than 2 cores running, though. So about 9 seconds to do a cumsum on a 3-billion-element vector.
Next, I cut the vector length in half, and in half again.
Hmm. Do you see anything interesting? Going from 0.75e9 to 1.5e9 elements, the time doubled. But doubling the vector length once more, the time went way up. Not just double, but by a factor of roughly 6.
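You can run this kind of doubling experiment yourself. In MATLAB you would use timeit on cumsum; here is a Python analogue using itertools.accumulate on much smaller vectors, since the exact numbers (and where the jumps appear) depend entirely on your machine:

```python
import time
from itertools import accumulate

def time_cumsum(n, reps=3):
    """Best-of-reps wall-clock time for a cumulative sum of n elements."""
    x = list(range(n))
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        list(accumulate(x))
        best = min(best, time.perf_counter() - t0)
    return best

sizes = [100_000, 200_000, 400_000]
times = [time_cumsum(n) for n in sizes]
ratios = [b / a for a, b in zip(times, times[1:])]
# In the linear O(n) regime each ratio hovers near 2; once you hit a cache
# or RAM limit, a ratio will jump well above that.
```

Push the sizes up until the ratios stop being roughly 2, and you have found one of your own machine's breaks.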
What happened? Most likely, I was getting hung up, due to something like a RAM limit, or maybe cache size. I never got near making all 16 cores fire up. That takes some serious effort. (Believe me, I've tried. It takes some work to get this beast of mine fully awake.)
But effectively, cumsum will have multiple domains in its run time.
- For very small n, cumsum will be O(1): constant run time, independent of the vector length.
- For intermediate to moderately large n, cumsum will be O(n) in run time.
- For HUGE n, cumsum will be something completely, unpredictably different!
For really small vector lengths, the setup dominates EVERYTHING. For really large vector lengths, it gets complicated. For MOST problems in the intermediate domain, you should expect linear, O(n) timing.
And the large-scale run time behavior of cumsum will differ, in completely different ways, from computer to computer. It will depend on the number of cores in your CPU. It will depend on the amount of RAM you have, on cache sizes, on bus widths. This means we don't even know where the breaks will come, because your computer is different from mine.
And, NO. There are no documented references, papers, etc., on the time complexity of cumsum. It will depend completely on your computer for the end cases, for both small and large values of n. For intermediate-size problems, it will be O(n).

But nobody is going to write a paper on the behavior of cumsum. Feel free, if you want to do so, but you will probably need to get a job at MathWorks to learn the exact code, and that will matter. You will need to learn the depths of tools like the BLAS, to learn when your computer will start to automatically use multiple cores, because these things are not determined by cumsum itself. And you will need to learn a great deal about how your computer interacts with its RAM and cache limits, and how the system bus width impacts how fast data can be passed around internally. It is almost certainly not worth the effort, because every computer is different, and next year's computers will be different yet. But feel free to try.