Calculate mean of all variables that have a mean

I have a table with mixed types of variables. I want to calculate the mean of any variable that can have a mean without converting to another type, so just numerics, logicals, etc., and leave string types alone. I do not want to specify which variables to average because I will probably add variables to this table eventually. The AI made up this apparently fake option, which is what I am looking for: 'Exclude', {'StringVar'}

 Accepted Answer

Instead of calling varfun twice as the answer posted by @Arun does, call it once with the InputVariables name-value argument. The value of InputVariables can be a function handle that varfun executes on each table variable; if the function handle returns true, varfun processes that variable.
load patients
T = table(LastName,Age,Height,Weight);
T.Over40 = T.Age > 40;
head(T)
      LastName      Age    Height    Weight    Over40
    ____________    ___    ______    ______    ______
    {'Smith'   }    38     71        176       false
    {'Johnson' }    43     69        163       true
    {'Williams'}    38     64        131       false
    {'Jones'   }    40     67        133       false
    {'Brown'   }    49     64        119       true
    {'Davis'   }    46     68        142       true
    {'Miller'  }    33     64        142       false
    {'Wilson'  }    40     68        180       false
T2 = varfun(@mean, T, InputVariables=@(x) isnumeric(x) | islogical(x))
T2 = 1x4 table
    mean_Age    mean_Height    mean_Weight    mean_Over40
    ________    ___________    ___________    ___________
    38.28       67.07          154            0.4
There is no variable mean_LastName in T2 because the LastName variable in T did not satisfy isnumeric | islogical.

6 Comments

Thanks! This actual intelligence answer is definitely better.
Hi Marcus,
"I want to calculate the mean of any variable that can have a mean"
If that's really the case (and limited in scope to built-in types), then you may need to also check for duration, datetime, and even char if those are potentially relevant.
Interestingly, in 2022a the doc page for mean included a "Data Types" entry that showed which data types are mean-able. For some reason, that information is no longer shown in mean.
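A minimal sketch of how the predicate from the accepted answer might be widened to cover duration and datetime as well. The table contents and the name canMean are made up for illustration; isduration and isdatetime are standard MATLAB functions.

```matlab
% Hypothetical table with a string, a numeric, a logical, and a duration variable
Name    = ["a"; "b"; "c"];
Value   = [1; 2; 3];
Flag    = [true; false; true];
Elapsed = minutes([10; 20; 30]);
T = table(Name, Value, Flag, Elapsed);

% canMean is an illustrative name, not a built-in.
% It returns true for the types we are willing to average.
canMean = @(x) isnumeric(x) | islogical(x) | isduration(x) | isdatetime(x);

% varfun applies mean only to the variables for which canMean returns true
M = varfun(@mean, T, 'InputVariables', canMean);
disp(M)
```

char is deliberately left out of the predicate here, since mean of a char array averages the character codes, which is rarely what is wanted.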
Thanks. Good point, and I do mean it. I often calculate the mode of a char, but never tried the mean... I looked at the documentation and I'm still not even sure what the mean of a set of char would be.
Looks like mean of char means the mean of the array after doing the implicit conversion of char to numeric.
c ='abcd'
c = 'abcd'
[mean(c) mean(double(c))]
ans = 1x2
98.5000 98.5000
c =['abcd';'abcd']
c = 2x4 char array
    'abcd'
    'abcd'
mean(c,2)
ans = 2x1
   98.5000
   98.5000
mean(c,1)
ans = 1x4
97 98 99 100
double('abcd')
ans = 1x4
97 98 99 100
Thanks! That's probably what mode is doing as well, but it returns the characters
c =['abcd';'abcd';'defg'];
mode(c)
ans = 'abcd'
Interesting. The doc page Mode - Input Argument does not explicitly list char as an acceptable class for the input array. But the doc page Mode - Output Arguments does say that "If the input A is an array, then the output M is an array of the same class."
On the other hand, the doc page for mean is silent on the list of acceptable input types, but the discussion under the 'outtype' argument clearly allows for char input. That section is kind of hard to follow for char input, though. I guess one could argue from that section that char input should result in a double output.
Seems odd that mean and mode would return different class of output for the same input.
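A short check of the class difference discussed above, reusing the char array from the earlier comment. This is only a sketch; the variable names m and md are arbitrary.

```matlab
c = ['abcd'; 'abcd'; 'defg'];

m  = mean(c);   % column-wise mean of the character codes
md = mode(c);   % column-wise most frequent character

disp(class(m))    % double
disp(class(md))   % char
```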

More Answers (1)

I understand that you wish to calculate the mean of any variable in a table without specifying which ones to target specifically.
To achieve this for variables that naturally support mean calculation (such as numerical and logical types) without needing to explicitly convert types or specify variables, one can programmatically determine the variable types and then calculate the mean for those that are appropriate. This method is flexible and will automatically adjust to new variables added to the table.
The steps are:
  1. Identify numerical and logical variables: use "vartype" or inspect the variable types with "varfun" to find the variables that are either numerical or logical, as these are the types for which a mean is typically meaningful.
  2. Calculate the mean for the identified variables: use the selected variables to calculate their means.
Here is an example script that demonstrates this approach:
% Example table with mixed variable types
T = table([1; 2; 3; 4], {'a'; 'b'; 'c'; 'd'}, [true; false; true; false], 'VariableNames', {'NumericVar', 'StringVar', 'LogicalVar'});
% Identify variables that are either numerical or logical
isNumericOrLogical = varfun(@(x) isnumeric(x) || islogical(x), T, 'OutputFormat', 'uniform');
% Calculate mean for numerical and logical variables only
means = varfun(@mean, T(:, isNumericOrLogical), 'OutputFormat', 'table');
% Display the result
disp(means);
    mean_NumericVar    mean_LogicalVar
    _______________    _______________
    2.5                0.5
This method avoids hard-coding variable names or types and automatically adapts as you add or remove variables from the table, provided you rerun the script or incorporate it into a function that operates on your table.
Please refer to the following documentation links for more information on the related functions:
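Step 1 mentions vartype, which the script above does not actually use. A minimal sketch of that variant, on the same example table, could look like this; the names numericPart and logicalPart are illustrative.

```matlab
% Same example table as above
T = table([1; 2; 3; 4], {'a'; 'b'; 'c'; 'd'}, [true; false; true; false], ...
    'VariableNames', {'NumericVar', 'StringVar', 'LogicalVar'});

% vartype builds a subscript that selects table variables by type
numericPart = T(:, vartype('numeric'));
logicalPart = T(:, vartype('logical'));

% Concatenating places all numeric variables before all logical ones,
% so the original variable order is not necessarily preserved
means = varfun(@mean, [numericPart, logicalPart]);
disp(means)
```

Because the two selections are concatenated, this variant still indexes the table twice; the single varfun call with InputVariables avoids that.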
  1. vartype: https://www.mathworks.com/help/matlab/ref/vartype.html
  2. varfun: https://www.mathworks.com/help/matlab/ref/table.varfun.html
Hope this helps!
Regards
Arun

Categories

Find more on Graphics Performance in Help Center and File Exchange

Release

R2023b

Asked:

on 14 Jun 2024

Commented:

on 25 Jun 2024
