Why do I get: Error using sum Invalid data type. First argument must be numeric or logical?

Hi,
I have a cell array called "pre_data" with 1 column and 27 rows. Each element in the column contains a cell with 21 columns and a varying number of rows.
I want to scan the columns in the cells of pre_data. For each separate column, if a value lies more than 3 standard deviations from that column's mean, I want the row containing that value to be removed.
For that I have written the following piece of code:
pre_data_clean = cell(size(pre_data));
% iterate over each cell in pre_data
for i = 1:length(pre_data)
    data_pre = pre_data{i}; % get the data in the current cell
    means_pre = cellfun(@mean, data_pre, 'UniformOutput', false); % calculate the means of each column
    stds_pre = cellfun(@std, data_pre, 'UniformOutput', false); % calculate the standard deviations of each column
    % remove values that are above 3 standard deviations from the mean
    for j = 1:size(data_pre{1}, 2) % iterate over each column
        for k = 1:length(data_pre) % iterate over each row in each cell
            data_pre{k}(:, j) = data_pre{k}(:, j) .* (abs(data_pre{k}(:, j) - means_pre{k}(j)) <= 3*stds_pre{k}(j));
        end
    end
    pre_data_clean{i} = data_pre; % save the cleaned data to pre_data_clean
end
The code seems to work for pre_data. But when I apply it to my other cell arrays (balls_data, between_data, baskets_data or post_data), which are similar to "pre_data", I keep getting error messages like this:
Error using sum
Invalid data type. First argument must be numeric or logical.
Error in mean (line 127)
y = sum(x, dim, flag) ./ mysize(x,dim);
Is this because the cell arrays have empty cells? If so, how can I fix this code?
Thank you!

Answers (1)

You generally have cell arrays that contain cell arrays. However, in some places some of the entries are not cell arrays and are instead [] which is an empty double array.
Example fix:
baskets_data(cellfun(@isempty,baskets_data)) = {{}};
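To see which entries are affected before applying the fix, a quick check along these lines may help. This is only a sketch on a toy stand-in for baskets_data (the values and the helper names notCell and allCell are made up for illustration), assuming the real data has the same cell-in-cell layout:

```matlab
% Toy stand-in for baskets_data (hypothetical values): two good entries and one [].
baskets_data = {{[1 2; 3 4]}; []; {[5 6; 7 8]}};

% Indices of entries that are not cell arrays (here the [] entry).
notCell = find(~cellfun(@iscell, baskets_data));

% Replace every empty entry with an empty cell array, as in the fix above.
baskets_data(cellfun(@isempty, baskets_data)) = {{}};

% After the replacement every entry should be a cell array.
allCell = all(cellfun(@iscell, baskets_data));
```

Note that this only repairs entries that are empty. An entry that is non-empty but still not a cell array would show up in notCell and would need separate handling.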

7 Comments

Thanks! How would I need to integrate that into the existing code above?
Sorry, I am still quite a novice.
You would not incorporate it into the above code that you posted. You would incorporate it into some portion before that, just after you loaded the data.
load baskets_data.mat
baskets_data(cellfun(@isempty,baskets_data)) = {{}};
and similarly for the other data files.
I tried your fix (see below) and now I get a new error.
Index exceeds the number of array elements. Index must not exceed 0.
Error in untitled (line 10)
for j = 1:size(data_balls{1}, 2) % iterate over each column
It also seems that the "balls_data_clean" (see attached) cell array is missing all the cells from row 19 onwards.
Do you have any idea why?
balls_data(cellfun(@isempty,balls_data)) = {{}};
balls_data_clean = cell(size(balls_data));
% iterate over each cell in pre_data
for i = 1:length(balls_data)
    data_balls = balls_data{i}; % get the data in the current cell
    means_balls = cellfun(@mean, data_balls, 'UniformOutput', false); % calculate the means of each column
    stds_balls = cellfun(@std, data_balls, 'UniformOutput', false); % calculate the standard deviations of each column
    % remove values that are above 3 standard deviations from the mean
    for j = 1:size(data_balls{1}, 2) % iterate over each column
        for k = 1:length(data_balls) % iterate over each row in each cell
            data_balls{k}(:, j) = data_balls{k}(:, j) .* (abs(data_balls{k}(:, j) - means_balls{k}(j)) <= 3*stds_balls{k}(j));
        end
    end
    balls_data_clean{i} = data_balls; % save the cleaned data to pre_data_clean
end
balls_data{19} is {} which is size 0 0.
You have
for j = 1:size(data_balls{1}, 2)
but data_balls is empty so data_balls{1} does not exist.
You should be checking isempty(data_balls) before trying to use data_balls{1}
In that case I would want to skip that cell. I thought that was already happening.
How can I do that?
Thanks!
if isempty(data_balls)
balls_data_clean{i} = data_balls;
continue;
end
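The guard goes at the top of the outer loop body, right after data_balls is read. This also explains the missing cells from row 19 on: the error stopped the loop at i = 19, so the remaining entries of the preallocated balls_data_clean were never filled. A minimal sketch of the placement, using a toy stand-in for balls_data (the values are made up) and leaving the outlier handling as a placeholder comment:

```matlab
% Toy stand-in for balls_data (hypothetical values): entry 2 is an empty cell.
balls_data = {{[1 2; 3 4; 5 6]}; {}; {[7 8; 9 10]}};
balls_data_clean = cell(size(balls_data));

for i = 1:numel(balls_data)
    data_balls = balls_data{i};            % get the data in the current cell
    if isempty(data_balls)
        balls_data_clean{i} = data_balls;  % keep the empty placeholder so indices still line up
        continue;                          % skip to the next entry
    end
    % ... outlier handling for non-empty entries goes here ...
    balls_data_clean{i} = data_balls;      % save the (cleaned) data
end
```

With the guard in place the loop runs through every entry instead of stopping at the first empty one.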
Thank you!
I have one last question. I just tried editing the code so that it scans for values above 1 SD, but nothing is removed, which is not what I would expect. Is the code working correctly? It seems odd to me that not a single value is above 1 SD.
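One likely reason, judging only from the code posted in this thread: the line inside the inner loop multiplies each column by a logical mask, so values beyond the threshold are set to 0 while every row stays in place. The number of rows therefore never changes, whatever threshold is used. A minimal sketch of actually removing rows, shown on a toy matrix; the names X, nSD, isOutlier, rowsToDrop and Xclean are illustrative and not from the original code, and the sketch assumes a two-sided threshold around the column mean and MATLAB R2016b or later (for implicit expansion in X - mu):

```matlab
% Toy matrix (hypothetical values): rows are observations, columns are variables.
X = [1 10; 2 11; 3 12; 2 10; 100 11];
nSD = 1;                                % threshold in standard deviations

mu = mean(X, 1);                        % column means (row vector)
sd = std(X, 0, 1);                      % column standard deviations (row vector)

isOutlier = abs(X - mu) > nSD * sd;     % true where a value is beyond the threshold
rowsToDrop = any(isOutlier, 2);         % a row is dropped if any of its columns is flagged
Xclean = X(~rowsToDrop, :);             % keep only the rows without flagged values
```

In the nested layout used above, the same steps would be applied to each data_balls{k} in turn. Because rows are deleted, the cleaned matrices will generally end up with fewer rows than the originals.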

Asked on 14 Mar 2023
Commented on 16 Mar 2023
