How do I apply the same operation on vectors of different length but of similar name?

I have imported roughly 50 column vectors that start with the same name. Now I want to apply the same operation (the mean value of each vector), but I don't know how to do it. My idea is to use a loop combined with a 'startsWith'-type condition. However, I am a complete beginner and therefore have no idea how to actually do it. I've read about a similar problem where people suggested to store it in a cell array. But I assume this requires vectors of equal length.
Thank you in advance
  1 Comment
Stephen23 on 3 Dec 2025 at 12:27
Edited: Stephen23 on 3 Dec 2025 at 14:36
"I've read about a similiar problem where people suggested to store it in a cell array."
Because using a cell array is a good approach, perhaps the best one.
"But I assume this requires vectors of equal length."
I do not see anything in the MATLAB documentation that suggests that restriction.
"I have imported roughly 50 column vectors that start with the same name."
And that is exactly the place to fix your code, by importing into one array (e.g. a cell array).
Note that the approach you suggest in your question would be a slow, complex, and very inefficient one.
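A minimal sketch of what importing into one cell array could look like, assuming the vectors are stored in separate CSV files. The temporary folder, the file pattern A_*.csv, and all variable names are hypothetical; the first loop only creates demo files so the example runs on its own.

```matlab
% Hypothetical setup: write three single-column CSV files of different lengths
folder = tempname;
mkdir(folder);
lengths = [5 9 7];
for k = 1:numel(lengths)
    writematrix(randn(lengths(k), 1), fullfile(folder, sprintf('A_%d.csv', k)));
end

% Import every matching file into ONE cell array instead of separate variables
files = dir(fullfile(folder, 'A_*.csv'));
data = cell(numel(files), 1);
for k = 1:numel(files)
    data{k} = readmatrix(fullfile(files(k).folder, files(k).name));
end

% One mean per vector; the vectors may have different lengths
means = cellfun(@mean, data);

% Remove the demo files again
rmdir(folder, 's');
```

With the data in one container, no variable names need to be inspected at all: the loop index k identifies each vector.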


Accepted Answer

Umar on 3 Dec 2025 at 15:00

Hi @Henning,

I wanted to address your question about calculating means for your 50 column vectors and clear up some confusion. First, Stephen23 is correct that cell arrays do NOT require equal-length vectors, so the limitation you assumed does not exist. The MATLAB documentation confirms that each cell can hold a vector of any length, which makes cell arrays a good fit for your situation. However, the real issue Stephen23 is pointing out is that having 50 separate variables like A_1, A_2, A_3, etc. in your workspace is the root problem, and that is where you should focus your fix. The approach you have in mind, a loop with a startsWith condition, will technically work, but as Stephen23 mentions in that tutorial link, dynamically named variables lead to slow, complex, and inefficient code. What you really want to do is go back to where you import the data and change that step so everything is loaded into a single container from the start: a cell array, a structure, or a containers.Map object.

I've put together a complete working example with synthetic sensor data that shows four different solutions. Solution 1 uses a structure: you store each vector as a field (dataStruct.Sensor_1, dataStruct.Sensor_2, etc.), then loop through the field names, filter them with startsWith, and calculate the means. Solution 2 uses a cell array: all your vectors go into sensorData{1}, sensorData{2}, etc., and you can calculate all the means in one line with cellfun(@mean, sensorData), which is clean and fast. Solution 3 shows how to work with your existing scattered variables: who('Sensor_*') returns the matching variable names, and eval() in a loop accesses each one to calculate its mean. Code based on eval is generally the slowest and hardest to debug, so use this only if you absolutely cannot refactor your import code. Solution 4 demonstrates containers.Map, a key-value storage approach that is very flexible for lookups, though it was about 4 times slower than the cell array method in my benchmarking.
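As a small illustration of the cellfun one-liner, here is a sketch with three made-up vectors (the names data, means, and centered are only for this example). It also shows the 'UniformOutput' option, which is needed as soon as the function returns something other than a scalar per cell.

```matlab
% Three column vectors of different lengths in one cell array
data = {randn(5, 1), randn(9, 1), randn(7, 1)};

% mean returns one scalar per cell, so cellfun can return a numeric array
means = cellfun(@mean, data);

% Subtracting the mean returns a vector per cell, so the result must stay a cell array
centered = cellfun(@(v) v - mean(v), data, 'UniformOutput', false);
```

Here means has the same shape as data (1-by-3), and each centered{k} has the same length as data{k}.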

The code I've included generates 50 synthetic temperature sensors with between 5 and 200 readings each, calculates the mean, standard deviation, and sample count for each sensor using all four methods, and creates visualizations showing sensor means with error bars, the distribution of sample counts, a histogram of mean temperatures, and raw data traces from the first three sensors. When I ran the performance test (100 iterations, Solutions 1, 2, and 4 only), the structure approach took 0.0136 seconds, the cell array took 0.0163 seconds, and containers.Map took 0.0634 seconds. In other words, the structure and the cell array were about equally fast, and containers.Map was roughly 4 times slower. The visualizations show an upward temperature trend across sensors from about 22 to 25 degrees Celsius with realistic noise, and the histogram shows most sensor means clustering around 23.5 degrees, which confirms the synthetic data behaves as expected.
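If you want to repeat the timing yourself, timeit is usually a more reliable alternative to a tic/toc loop because it handles warm-up and repetition internally. The following is only a sketch under my own assumptions: it rebuilds similar synthetic data rather than reusing the attached script, and the measured times will differ from machine to machine.

```matlab
rng(42);                                        % reproducible lengths and values
numSensors = 50;
sensorData = cell(numSensors, 1);
dataMap = containers.Map();                     % char keys, values of any type
for i = 1:numSensors
    sensorData{i} = 22 + 2 * randn(randi([5, 200]), 1);
    dataMap(sprintf('Sensor_%d', i)) = sensorData{i};
end
mapKeys = keys(dataMap);                        % 1-by-50 cell array of key names

% timeit calls each function handle several times and returns a typical run time
tCell = timeit(@() cellfun(@mean, sensorData));
tMap  = timeit(@() cellfun(@(k) mean(dataMap(k)), mapKeys));

fprintf('cell array: %.3g s, containers.Map: %.3g s (ratio %.1f)\n', ...
    tCell, tMap, tMap / tCell);
```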

My recommendation is this: if you can modify your import code, do that and use either a cell array or a structure from the start, with cell arrays being slightly preferable for your use case since cellfun makes the calculations so compact. If you are stuck with existing variables and cannot change the import process, then yes, use the who() and eval() approach from Solution 3, but understand that this is a workaround for a bad situation, not a best practice. The key takeaway Stephen23 is emphasizing is that you should never create dynamically named variables in the first place because it makes everything harder down the line, and the time you invest now in fixing your import process will save you hours of frustration later. I've attached the complete code with all four solutions, performance benchmarking, and visualizations so you can see exactly how each approach works on the synthetic data, and you can adapt whichever solution fits your current situation best.

%% BETTER SOLUTIONS FOR CALCULATING MEANS OF MULTIPLE VECTORS
% This script demonstrates improved approaches using realistic synthetic data
%% Generate realistic synthetic sensor data
fprintf('Generating synthetic sensor data...\n');
rng(42); % For reproducibility
% Create synthetic data that mimics real-world scenario
numSensors = 50;
sensorData = cell(numSensors, 1);
sensorNames = cell(numSensors, 1);
sensorLengths = zeros(numSensors, 1);
for i = 1:numSensors
  % Each sensor has different number of readings (5 to 200 samples)
  nSamples = randi([5, 200]);
  % Generate temperature-like data: baseline around 22°C with noise
  baseline = 22;
  trend = (i/numSensors) * 3; % Slight trend across sensors
  noise = randn(nSamples, 1) * 2; % 2°C standard deviation
  sensorData{i} = baseline + trend + noise;
  sensorNames{i} = sprintf('Sensor_%d', i);
  sensorLengths(i) = nSamples;
end
fprintf('Generated %d sensors with %d to %d samples each\n\n', ...
  numSensors, min(sensorLengths), max(sensorLengths));
%% SOLUTION 1: Structure-based approach (VERY GOOD)
fprintf('=== SOLUTION 1: Structure (VERY GOOD) ===\n');
dataStruct = struct();
for i = 1:numSensors
  dataStruct.(sprintf('Sensor_%d', i)) = sensorData{i};
end
fieldNames = fieldnames(dataStruct);
sensorFields = fieldNames(startsWith(fieldNames, 'Sensor_'));
means1 = zeros(length(sensorFields), 1);
stdDevs1 = zeros(length(sensorFields), 1);
counts1 = zeros(length(sensorFields), 1);
for i = 1:length(sensorFields)
  data = dataStruct.(sensorFields{i});
  means1(i) = mean(data);
  stdDevs1(i) = std(data);
  counts1(i) = length(data);
end
results1 = table(sensorFields, means1, stdDevs1, counts1, ...
  'VariableNames', {'Sensor', 'Mean', 'StdDev', 'NumSamples'});
fprintf('First 10 sensors:\n');
disp(results1(1:10,:));
fprintf('Summary: Mean temperature across all sensors: %.2f°C\n', ...
  mean(means1));
fprintf('Temperature range: %.2f°C to %.2f°C\n\n', min(means1), ...
  max(means1));
%% SOLUTION 2: Cell Array (BEST)
fprintf('=== SOLUTION 2: Cell Array (BEST) ===\n');
means2 = cellfun(@mean, sensorData);
stdDevs2 = cellfun(@std, sensorData);
counts2 = cellfun(@length, sensorData);
results2 = table(sensorNames, means2, stdDevs2, counts2, ...
  'VariableNames', {'Sensor', 'Mean', 'StdDev', 'NumSamples'});
fprintf('First 10 sensors:\n');
disp(results2(1:10,:));
fprintf('This approach is clean, fast, and ideal for varying-length vectors\n\n');
%% SOLUTION 3: Working with existing workspace variables (IF NEEDED)
fprintf('=== SOLUTION 3: Workspace Variables (ACCEPTABLE) ===\n');
fprintf('Simulating scattered variables in workspace...\n');
clearvars Sensor_*
for i = 1:numSensors
  eval(sprintf('Sensor_%d = sensorData{%d};', i, i));
end
allVars = who('Sensor_*');
numVars = length(allVars);
means3 = zeros(numVars, 1);
stdDevs3 = zeros(numVars, 1);
counts3 = zeros(numVars, 1);
for i = 1:numVars
  data = eval(allVars{i});
  means3(i) = mean(data);
  stdDevs3(i) = std(data);
  counts3(i) = length(data);
end
results3 = table(allVars, means3, stdDevs3, counts3, ...
  'VariableNames', {'Sensor', 'Mean', 'StdDev', 'NumSamples'});
fprintf('First 10 sensors:\n');
disp(results3(1:10,:));
fprintf('Note: This works but is slower. Use only if variables already exist.\n\n');
%% SOLUTION 4: containers.Map (MODERN)
fprintf('=== SOLUTION 4: containers.Map (MODERN) ===\n');
dataMap = containers.Map();
for i = 1:numSensors
  key = sprintf('Sensor_%d', i);
  dataMap(key) = sensorData{i};
end
allKeys = keys(dataMap);
sensorKeys = allKeys(startsWith(allKeys, 'Sensor_'));
sensorKeys = sort(sensorKeys);
sensorKeysCol = sensorKeys(:);
means4 = cellfun(@(k) mean(dataMap(k)), sensorKeysCol);
stdDevs4 = cellfun(@(k) std(dataMap(k)), sensorKeysCol);
counts4 = cellfun(@(k) length(dataMap(k)), sensorKeysCol);
results4 = table(sensorKeysCol, means4, stdDevs4, counts4, ...
  'VariableNames', {'Sensor', 'Mean', 'StdDev', 'NumSamples'});
fprintf('First 10 sensors:\n');
disp(results4(1:10,:));
fprintf('Great for dynamic key-value storage and lookups\n\n');
%% BONUS: Matrix approach (if equal length)
fprintf('=== BONUS: Matrix Approach (if equal-length vectors) ===\n');
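matrixData = randn(100, numSensors) * 2 + 22 + (1:numSensors)/numSensors * 3;
columnMeans = mean(matrixData, 1);
% std takes the weight as its second argument and the dimension as its third:
% 0 = normalize by N-1 (same as std elsewhere in this script), 1 = along columns
columnStdDevs = std(matrixData, 0, 1);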
fprintf('Calculated means for %d sensors in one line!\n', numSensors);
fprintf('First 10 means: ');
fprintf('%.2f ', columnMeans(1:10));
fprintf('\n');
fprintf('This is the fastest approach when vectors have equal length.\n\n');
%% Visualization
fprintf('=== CREATING VISUALIZATION ===\n');
figure('Position', [100, 100, 1200, 600]);
subplot(2,2,1);
errorbar(1:numSensors, means2, stdDevs2, 'o-', 'LineWidth', 1.5);
xlabel('Sensor Number');
ylabel('Temperature (°C)');
title('Sensor Measurements: Mean ± Std Dev');
grid on;
subplot(2,2,2);
bar(counts2);
xlabel('Sensor Number');
ylabel('Number of Samples');
title('Sample Count per Sensor');
grid on;
subplot(2,2,3);
histogram(means2, 20, 'FaceColor', [0.3 0.6 0.9]);
xlabel('Mean Temperature (°C)');
ylabel('Frequency');
title('Distribution of Sensor Means');
grid on;
subplot(2,2,4);
hold on;
colors = lines(3);
for i = 1:3
  plot(sensorData{i}, 'Color', colors(i,:), 'DisplayName', sensorNames{i});
end
xlabel('Sample Index');
ylabel('Temperature (°C)');
title('Raw Data: First 3 Sensors');
legend('Location', 'best');
grid on;
hold off;
fprintf('Visualization complete!\n\n');
%% Performance comparison
fprintf('=== PERFORMANCE COMPARISON ===\n');
nRuns = 100;
tic;
for run = 1:nRuns
  temp = cellfun(@mean, sensorData);
end
time2 = toc;
tic;
for run = 1:nRuns
  for i = 1:length(sensorFields)
      temp = mean(dataStruct.(sensorFields{i}));
  end
end
time1 = toc;
tic;
for run = 1:nRuns
  temp = cellfun(@(k) mean(dataMap(k)), sensorKeysCol);
end
time4 = toc;
fprintf('Time for %d runs:\n', nRuns);
fprintf('  Cell array:     %.4f sec (baseline)\n', time2);
fprintf('  Structure:      %.4f sec (%.2fx the cell array time)\n', time1, time1/time2);
fprintf('  containers.Map: %.4f sec (%.2fx the cell array time)\n', time4, time4/time2);
fprintf('\n');
%% Recommendations
fprintf('=== FINAL RECOMMENDATIONS ===\n');
fprintf('1. BEST:       Cell array (Solution 2) - fast, clean, handles varying lengths\n');
fprintf('2. VERY GOOD:  Structure (Solution 1) - organized, readable, easy to maintain\n');
fprintf('3. MODERN:     containers.Map (Solution 4) - flexible, good for lookups\n');
fprintf('4. LAST RESORT: Workspace variables (Solution 3) - slow, only if the variables already exist\n');
fprintf('5. IDEAL:      Matrix - if all vectors have equal length, use this!\n');
fprintf('\n');
fprintf('KEY TAKEAWAY: Import your data into a container (cell/struct/map) from the start!\n');
fprintf('              Never create separate variables like A_1, A_2, A_3...\n');
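One detail to watch in Solutions 3 and 4: who() and the sorted Map keys come back in character order, so Sensor_10 is listed before Sensor_2 and the "first 10 sensors" in those tables are not sensors 1 to 10. A possible way to restore numeric order is sketched below; the name list is a small made-up sample, and the 'Sensor_%d' pattern is assumed to match every name.

```matlab
% Names in the character order that who() or sort() would produce
names = {'Sensor_1'; 'Sensor_10'; 'Sensor_11'; 'Sensor_2'; 'Sensor_3'};

% Extract the numeric suffix of each name
idx = cellfun(@(s) sscanf(s, 'Sensor_%d'), names);

% Sort by that number and reorder the names accordingly
[~, order] = sort(idx);
sortedNames = names(order);
```

The same order vector can be applied to the means, standard deviations, and counts so that all columns stay aligned.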

Note: please see attached.

Hope this helps clarify things and gives you a clear path forward. Let me know if you have any questions about implementing any of these solutions.

  1 Comment
Henning 11 minutes ago
Hey Umar,
thank you very much for all the work you have put into your answer, I really appreciate it. I will take a look at it for my future projects when there is a calm moment; right now these are chaotic times for me. Therefore I may come back with some follow-up questions later.
Best regards
Henning


More Answers (1)

Star Strider on 3 Dec 2025 at 11:02
Edited: Star Strider on 3 Dec 2025 at 13:58
Cell arrays do not require vectors of equal length.
I am not certain what your actual problem is, however something like this may work --
A_1 = randn(5,1);
A_2 = randn(9,1);
A_3 = randn(7,1);
A_cell = {A_1, A_2, A_3}
A_cell = 1×3 cell array
{5×1 double} {9×1 double} {7×1 double}
A_mean = cellfun(@mean, A_cell)
A_mean = 1×3
-0.0764 0.3722 0.1208
EDIT -- (3 Dec 2025 at 13:58)
This approach uses the dreaded eval function, however I do not believe you can do anything else, considering what you began with. As @Stephen23 points out, starting with a cell array would be best.
Try something like this --
clear variables
A_1 = randn(5,1);
A_2 = randn(9,1);
A_3 = randn(7,1);
B_1 = randn(10,1);
B_2 = randn(3,1);
v = who
v = 5×1 cell array
    {'A_1'}
    {'A_2'}
    {'A_3'}
    {'B_1'}
    {'B_2'}
A_v = v(cellfun(@(x)startsWith(x,'A'), v))
A_v = 3×1 cell array
    {'A_1'}
    {'A_2'}
    {'A_3'}
for k = 1:numel(A_v)
A_mean(k,:) = mean(eval(A_v{k}));
end
A_mean
A_mean = 3×1
    0.1469
    0.7570
    0.2063
Results = table(A_mean, VariableNames="Mean", RowNames=A_v)
Results = 3×1 table
            Mean
           _______
    A_1    0.14694
    A_2    0.75704
    A_3    0.20634
Results = table(A_v, A_mean, VariableNames=["Variable","Mean"])
Results = 3×2 table
    Variable     Mean
    ________    _______
    {'A_1'}     0.14694
    {'A_2'}     0.75704
    {'A_3'}     0.20634
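If the variables already exist and eval is to be avoided, one possible alternative is to write the matching variables to a temporary MAT-file and load them back as fields of a single structure. This is only a sketch: the pattern '^A_\d+$', the temporary file, and the names S, names, and means are my own assumptions, and I have not compared its speed with the eval loop.

```matlab
% Example variables standing in for the imported data
A_1 = randn(5, 1);
A_2 = randn(9, 1);
A_3 = randn(7, 1);
B_1 = randn(10, 1);

% Save only the variables whose names match the pattern, then load them
% back as fields of one structure (no eval involved)
tmpFile = [tempname '.mat'];
save(tmpFile, '-regexp', '^A_\d+$');
S = load(tmpFile);
delete(tmpFile);

% One mean per field, returned as a column vector in field order
names = fieldnames(S);
means = structfun(@mean, S);
Results = table(names, means, 'VariableNames', {'Variable', 'Mean'});
```

From that point on, everything works on S, so the separate A_* variables can be cleared.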
