How to average over different length vectors without excessive for loops?

Question

Ashley Wilkins on 1 Dec 2020

0

https://uk.mathworks.com/matlabcentral/answers/671968-how-to-average-over-different-length-vectors-without-excessive-for-loops

Edited: dpb on 5 Apr 2023

Hi there,

My problem involves running lots of different stochastic simulations (imagine some sort of Brownian motion) and then averaging over all of these different histories to compute quantities such as the mean, variance, etc.

At the moment, each run gives me an output vector that could be, e.g.,

X1 = [0 1 4 6 8]

where each new entry in the vector represents the position of a particle after a standard time increment. Here the vector has 5 elements, so there have been 4 time increments, although in practice the vectors would be much longer. The problem is that each run ends when a certain condition is met (say X = 8), and this generically happens after different times. This means the next run might be something like

X2 = [0 4 8]

which is only 3 elements long and thus covers only 2 time increments. I have done this for R runs. If each Xi vector had the same length, I know I could simply collect them in one object X like so:

X = [X1; X2; ... XR]

and then compute the mean using the mean function along the appropriate dimension. Unfortunately, this doesn't work here because the vectors have different lengths.

For example, if all I had was X1 and X2, I would want some process that calculates the mean at each timestep like so

mean1 = (X1(1)+X2(1))/2; mean2 = (X1(2)+X2(2))/2; mean3 = (X1(3)+X2(3))/2; %data at each timestep for X1 and X2 runs so average over both
mean4 = X1(4); mean5 = X1(5); %no X2 data for these timesteps so only averaging over X1 run
meanX = [mean1 mean2 mean3 mean4 mean5]

But obviously this needs to scale without repeating the process thousands of times in lots of for loops. In my actual code I have several thousand runs, each with several hundred elements, so the approach needs to be reasonably scalable.

Thanks for any help people can offer and I'm obviously happy to try and clarify anything I have poorly explained


Answer 1

Adam Danz on 1 Dec 2020

4

https://uk.mathworks.com/matlabcentral/answers/671968-how-to-average-over-different-length-vectors-without-excessive-for-loops#answer_562153

Edited: Adam Danz on 1 Dec 2020


I suggest collecting all of the variable-length row vectors in a cell array, then arranging them in a matrix, using NaN to pad the missing values. Then you can use the "omitnan" option of mean() to average down each column while ignoring the NaNs.

Demo:

a{1} = [1 2 5];
a{2} = [5 1 3 5];
a{3} = [9 0 2 1 8];
a{4} = [4 2];
% Vertically concatenate, pad with NaNs
maxNumCol = max(cellfun(@(c) size(c,2), a));  % max number of columns
aMat = cell2mat(cellfun(@(c){padarray(c,[0,maxNumCol-size(c,2)],NaN,'Post')}, a)')
aMat = 4×5
     1     2     5   NaN   NaN
     5     1     3     5   NaN
     9     0     2     1     8
     4     2   NaN   NaN   NaN
colMeans = mean(aMat,1,'omitnan')
colMeans = 1×5
    4.7500    1.2500    3.3333    3.0000    8.0000
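
The question also mentions the variance. As a minimal sketch on the same demo data (not part of the original answer), the padded matrix can be reused for other per-timestep statistics. The variable names maxLen, colVar and nRuns are illustrative, and the padding is done with base MATLAB so that padarray is not needed:

```matlab
% Same demo data as above, stored as row vectors in a cell array
a = {[1 2 5], [5 1 3 5], [9 0 2 1 8], [4 2]};

% Pad each row vector with NaN up to the longest length, then stack the rows
maxLen = max(cellfun(@numel, a));
aMat = cell2mat(cellfun(@(c) [c, nan(1, maxLen - numel(c))], a(:), ...
                        'UniformOutput', false));

% Per-timestep variance, ignoring the NaN padding
colVar = var(aMat, 0, 1, 'omitnan');

% Number of runs that actually contribute at each timestep
nRuns = sum(~isnan(aMat), 1);
```

Late timesteps are covered by few runs (here only one run reaches the fifth timestep), so it is probably worth inspecting nRuns before trusting the statistics there.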

5 Comments

Ashfaq Ahmed on 4 Apr 2023

@Adam Danz this is a brilliant approach. Can you please help me to write the code as a function in a way that we only need to input the variables (of different lengths) and it will do the mean of them?

dpb on 4 Apr 2023

Edited: dpb on 5 Apr 2023


What do you want the footprint of the function to be -- any number of vectors of variable length?

If so, then use varargin and you'll have the cell array automagically. All you'll have to do is ensure they're all oriented the same direction first; Adam's solution above assumes they're row vectors--

function colMeans=avgVecs(varargin)
  a=varargin;                                   % use Adam's internal variable; could change a-->varargin
  % Vertically concatenate, pad with NaNs
  maxNumCol = max(cellfun(@(c) size(c,2), a));  % max number of columns
  aMat = cell2mat(cellfun(@(c){[c nan(1,maxNumCol-numel(c))]}, a)');
  colMeans = mean(aMat,1,'omitnan');
end

Locally, calling the above with the same input vectors stored as separate variables gives

>> avgVecs(a,b,c,d)
ans =
4.7500    1.2500    3.3333    3.0000    8.0000
>> 

I don't have Image Processing TB so replaced padarray with base MATLAB code.

In general, I wouldn't recommend going at it this way in creating the multiple named variables; it would be better to use a cell array initially and avoid the need to make the conversion entirely. In that case, you would simply pass in the cell array itself; varargin does the dirty work of creating a cell array out of multiple inputs when used in a function argument as shown. There's no equivalent neat syntax I'm aware of that does this directly at the command line or inside a script or function without the call to the lower-level function. You could, of course, simply have the oneliner function of

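function c=vecs2cell(varargin)
  c=varargin;   % return the inputs as one 1xN cell array (varargout=varargin would hand back only the first input when one output is requested)
end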

The output would be the 1x4 cell array; of course at this point the vectors wouldn't yet be padded to a common length, but an unpadded cell array is exactly what Adam's code expects as input.
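
If the runs are already held in a cell array, a rough sketch of a variant that fills a preallocated NaN matrix may be easier to adapt. It uses one loop over runs (not over timesteps) and accepts row or column vectors. The names n and aMat and the mixed orientation of the demo data are assumptions for illustration, not something from the thread:

```matlab
% Demo data; the second run is deliberately a column vector
a = {[1 2 5], [5 1 3 5].', [9 0 2 1 8], [4 2]};

% Length of each run and a NaN-filled matrix with one row per run
n = cellfun(@numel, a);
aMat = nan(numel(a), max(n));

% Copy each run into its row; (:).' forces a row orientation
for k = 1:numel(a)
    aMat(k, 1:n(k)) = a{k}(:).';
end

% Per-timestep mean over the runs that reached that timestep
colMeans = mean(aMat, 1, 'omitnan');
```

With an existing cell array of row vectors, the avgVecs function above can also be called as avgVecs(a{:}), which expands the cell into separate input arguments.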


Answer 2

dpb on 1 Dec 2020

0

https://uk.mathworks.com/matlabcentral/answers/671968-how-to-average-over-different-length-vectors-without-excessive-for-loops#answer_562148


Use a cell array to store the results of each trial instead of individual named variables; then

means=cellfun(@mean,x);

1 Comment

Adam Danz on 1 Dec 2020

I think she's averaging between vectors, not within, based on mean1 = (X1(1)+X2(1))/2;
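
To make the distinction concrete, here is a small sketch on the demo data from the accepted answer, assuming the runs are row vectors in a cell array. The names withinRun, step and acrossRuns are illustrative. It also shows a padding-free way to get the per-timestep mean by grouping values on their time index with accumarray:

```matlab
a = {[1 2 5], [5 1 3 5], [9 0 2 1 8], [4 2]};

% Mean within each run: one value per run (1x4)
withinRun = cellfun(@mean, a);

% Mean across runs at each timestep: one value per time index (1x5)
vals = [a{:}];                                   % all values in one row
step = cell2mat(cellfun(@(c) 1:numel(c), a, ...  % time index of each value
                        'UniformOutput', false));
acrossRuns = accumarray(step(:), vals(:), [], @mean).';
```

Here acrossRuns matches the colMeans from the accepted answer, while withinRun has one entry per run and says nothing about individual timesteps.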


Answer 3

David Hill on 1 Dec 2020

0

https://uk.mathworks.com/matlabcentral/answers/671968-how-to-average-over-different-length-vectors-without-excessive-for-loops#answer_562163


I would use a cell array.

x=cell(1,100);%preallocate the cell array
for k=1:100
    x{k}=randi(100,1,randi(1000));%simulate your outputs
end
Mean=zeros(1,100);
for k=1:100
    Mean(k)=mean(x{k});%calculate the mean and whatever else you want
end

1 Comment

Adam Danz on 1 Dec 2020

This is the loop version of dpb's answer.
