Calculate the Standard Deviation based on multiple conditions

9 views (last 30 days)
Hello everyone,
I am hoping to get some help with calculating the standard deviation for my large data set. I have attached just a small snippet of it here.
There are over 4000 rows and 30 columns. It is broken down into some basic information, such as date, month, and season. The data that I would like to calculate the standard deviation is in columns labeled A-I, again just a 1 year snippet. It is in a text file. Tried to attach it, but Chrome didn't like that idea. So, sorry it is just an image.
Here is an example of what I would like to do. I would like to find the standard deviation of all of the values in the A column from Spring, labeled as NV, and dated between 1987-1997. I have been sorting this data into the different groups, but it is quite a bit. There must be a better way. Is there a way to do this with a formula and also skip all of the blank values?
Using MATLAB Online, so be kind with the code.
  4 Comments
Cg Gc
Cg Gc on 12 Jul 2025
Each column is its own column. A is just a number.
There are blanks across all seasons. There are 4000 rows. The first spring blank occurs at row 120. No, I do not want the blanks included.
I tried to upload a copy of the data, but Chrome didn't like it. Not sure what else to say. It would look exactly the same as above, just more rows and columns.
Stephen23
Stephen23 on 12 Jul 2025
"I tried to upload a copy of the data, but Chrome didn't like it."
Either
  1. change the file extension to .TXT, or
  2. zip it into a ZIP file,
then upload it.

Sign in to comment.

Answers (1)

Star Strider
Star Strider on 9 Jul 2025
Edited: Star Strider on 9 Jul 2025
It would definitely help to have the data, or at lest a representative sample of it. One option would be to use the function, then attach the .zip file, providing it meets the 5 MB size limit.
That aside, one option would be to do something like this --
Seasons = reshape(repmat(["Spring","Summer","Autumn","Winter"], 4,1), [], 1);
Seasons = repmat(Seasons,4,1);
Years = reshape(repmat([1987,1988,1989,1990], 16, 1), [], 1);
A_col = randn(size(Seasons));
T1 = table(Years,Seasons,A_col);
disp(T1)
Years Seasons A_col _____ ________ __________ 1987 "Spring" -0.57128 1987 "Spring" -0.18962 1987 "Spring" 0.04167 1987 "Spring" 1.8298 1987 "Summer" -0.11033 1987 "Summer" -0.026817 1987 "Summer" 0.94144 1987 "Summer" 0.62166 1987 "Autumn" 0.33936 1987 "Autumn" 0.329 1987 "Autumn" 0.33187 1987 "Autumn" -1.19 1987 "Winter" 2.3013 1987 "Winter" -0.29744 1987 "Winter" -0.50834 1987 "Winter" 0.98364 1988 "Spring" -0.12044 1988 "Spring" -1.1003 1988 "Spring" 0.45325 1988 "Spring" 1.3311 1988 "Summer" 1.5602 1988 "Summer" -0.4503 1988 "Summer" 1.2863 1988 "Summer" 1.8512 1988 "Autumn" 0.18766 1988 "Autumn" 0.23527 1988 "Autumn" 1.4489 1988 "Autumn" 0.99098 1988 "Winter" 0.5748 1988 "Winter" -1.401 1988 "Winter" -0.17484 1988 "Winter" -0.10002 1989 "Spring" 0.25363 1989 "Spring" 2.0738 1989 "Spring" -0.3582 1989 "Spring" 0.77048 1989 "Summer" 0.80731 1989 "Summer" -0.062933 1989 "Summer" 0.60588 1989 "Summer" 0.82984 1989 "Autumn" 0.58104 1989 "Autumn" 1.1067 1989 "Autumn" 0.305 1989 "Autumn" 0.032715 1989 "Winter" 1.0284 1989 "Winter" -0.22976 1989 "Winter" -0.50884 1989 "Winter" 0.00057698 1990 "Spring" -0.15528 1990 "Spring" 0.058896 1990 "Spring" -0.15138 1990 "Spring" -1.0657 1990 "Summer" -0.15268 1990 "Summer" 0.31799 1990 "Summer" 0.91269 1990 "Summer" 1.2581 1990 "Autumn" 2.976 1990 "Autumn" -0.44626 1990 "Autumn" -0.13499 1990 "Autumn" -0.14337 1990 "Winter" -0.010632 1990 "Winter" -0.29054 1990 "Winter" -0.49267 1990 "Winter" -0.83724
[G,ID1,ID2] = findgroups(T1.Years, T1.Seasons=="Spring");
Out = accumarray(G, (1:numel(G)).', [], @(x){[T1.Years(x(1)), T1.Seasons(x(1)), std(A_col(x))]});
LvS = cell2mat(cellfun(@(x)strcmp(x(:,2),"Spring"), Out, Unif=0));
Result = array2table(cell2mat(Out(LvS,:)), VariableNames=["Year","Season","StDev"])
Result = 4×3 table
Year Season StDev ______ ________ _________ "1987" "Spring" "1.0652" "1988" "Spring" "1.0203" "1989" "Spring" "1.0345" "1990" "Spring" "0.50162"
Unfortunately, findgroups is failing me here, and returning both "Spring" and "Summer" values. Creating and using 'LvS' solves that problem.
Make appropriate changes with your data to get the results you want.
To select only 1987-1997, I would just use logical indexing to begin with to create a secondary table with only those years, then use it for your calculations.
EDIT -- (9 Jul 2025 at 13:04)
An easier approach would be to simply identify the "Spring" rows and then call accumarray --
T1Spr = T1(T1.Seasons=="Spring",:)
T1Spr = 16×3 table
Years Seasons A_col _____ ________ ________ 1987 "Spring" -0.57128 1987 "Spring" -0.18962 1987 "Spring" 0.04167 1987 "Spring" 1.8298 1988 "Spring" -0.12044 1988 "Spring" -1.1003 1988 "Spring" 0.45325 1988 "Spring" 1.3311 1989 "Spring" 0.25363 1989 "Spring" 2.0738 1989 "Spring" -0.3582 1989 "Spring" 0.77048 1990 "Spring" -0.15528 1990 "Spring" 0.058896 1990 "Spring" -0.15138 1990 "Spring" -1.0657
[Yu,ia,Yidx] = unique(T1Spr.Years,'stable');
Resultc = accumarray(Yidx,(1:numel(Yidx)),[],@(x){[T1Spr.Years(x(1)), std(T1Spr.A_col(x))]});
Result = array2table(cell2mat(Resultc), VariableNames=["Years","std A"])
Result = 4×2 table
Years std A _____ _______ 1987 1.0652 1988 1.0203 1989 1.0345 1990 0.50162
You can use the same sort of approach to isolate the desired year range as the first step, then isolate the "Spring" entries as I did here. The rest should be straightforward.
.
  3 Comments
Star Strider
Star Strider on 9 Jul 2025
Possibly. There are likely several ways to do this. I chose the most basic approach, because it should be the most generally applicable.
Cg Gc
Cg Gc on 12 Jul 2025
If am I reading the code correctly, I will be regrouping my data into their respective seasons and decades and then calculating the standard deviation based on the groups. It really isn't anything different than what I am already doing manually. It also seems like more work, just to get the standard deviation.
Is there a more direct approach? I can manually get rid of all of the blank data rows.

Sign in to comment.

Categories

Find more on Characters and Strings in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!