How can I transform these data into seasonal data?

Hello everyone,
I need to split these data into seasons:
1) December-January; 2) March to October; 3) November; 4) February.
Thank you!

 Accepted Answer

@Pul. Here's a solution that allows you to get summary statistics by season:
load('testdata.mat'); % loads giulia_TT timetable
% create logical vector for each season
s1Logical = month(giulia_TT.Time) == 12 | month(giulia_TT.Time) == 1; % Dec, Jan
s2Logical = month(giulia_TT.Time) >= 3 & month(giulia_TT.Time) <= 10; % Mar to Oct
s3Logical = month(giulia_TT.Time) == 11; % Nov.
s4Logical = month(giulia_TT.Time) == 2; % Feb.
sAll = s1Logical*1 + s2Logical*2 + s3Logical*3 + s4Logical*4;
% add column for season: 1, 2, 3, 4
giulia_TT.Season = sAll;
% convert to table and compute some group stats by season
T = timetable2table(giulia_TT);
data = T(:,{'Var5','Season'});
statarray = grpstats(data, 'Season', {'mean' 'std'}) % grpstats requires the Statistics and Machine Learning Toolbox
% plot var5 vs season with +/-1 sd in error bars
x = statarray.Season;
yMean = statarray.mean_Var5;
ySTD = statarray.std_Var5;
b = bar(x, yMean, 'facecolor', [.8 .8 .9]);
set(gca, 'YLim', [0 120]);
title('Var5 by Season');
xlabel('Season');
ylabel('Mean');
% add error bars showing +/- 1 SD
hold on;
eb = errorbar(x, yMean, ySTD, ySTD);
eb.Color = [.5 .5 .9];
eb.LineStyle = 'none';
eb.LineWidth = 1.5;
% display values in bars
s = sprintf('%.2f,', statarray.mean_Var5);
s(end) = []; % remove trailing comma
s = split(s, ',');
text(b.XEndPoints, b.YData * 0.4, s, 'Color', 'b', 'FontSize', 12, 'HorizontalAlignment', 'center');
Stats table output: (image not reproduced here)
Bar plot output: (image not reproduced here)
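For readers without the Statistics and Machine Learning Toolbox, the same per-season mean and standard deviation can probably be obtained with base MATLAB functions. The sketch below is only an illustration: it builds a small synthetic timetable (TT, with made-up Var5 values) because testdata.mat is not available here, and the lookup vector seasonOfMonth is a hypothetical helper, not part of the original answer.

```matlab
% Synthetic stand-in for giulia_TT (assumption: one daily Var5 value, one NaN)
Time = (datetime(2020,1,1):datetime(2020,12,31))';
Var5 = 50 + 10*sin(2*pi*(1:numel(Time))'/366);
Var5(10) = NaN;
TT = timetable(Time, Var5);

% Lookup: calendar month (1..12) -> season number from the question
% Jan=1, Feb=4, Mar..Oct=2, Nov=3, Dec=1
seasonOfMonth = [1 4 2 2 2 2 2 2 2 2 3 1]';
TT.Season = seasonOfMonth(month(TT.Time));

% Group by season and compute mean and SD, ignoring NaN values
[G, seasonID] = findgroups(TT.Season);
meanVar5 = splitapply(@(v) mean(v, 'omitnan'), TT.Var5, G);
stdVar5  = splitapply(@(v) std(v, 'omitnan'),  TT.Var5, G);

stats = table(seasonID, meanVar5, stdVar5, ...
    'VariableNames', {'Season', 'mean_Var5', 'std_Var5'})
```

With the real data, replace the synthetic block with load('testdata.mat') and use giulia_TT in place of TT; the resulting table has the same column names as the grpstats output used in the plotting code.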

8 Comments

Pul on 12 Aug 2021
Edited: Pul on 12 Aug 2021
Thank you very much for the detailed answer. I don't fully understand the code and, as a consequence, I'm not able to plot. (I had to install a MATLAB product to run the entire code!) My aim is to plot time (seasons) on the x-axis and the seasonal data with their averages on the y-axis.
Thank you.
I added code to my answer to also output a bar plot for Var1 by season with error bars showing +/- 1 SD of the means. Unfortunately, I have no idea what measures correspond to the generic variable names, so the labelling isn't quite right. Edit accordingly.
Thank you.
But how can I tell which season each bar in the plot refers to?
And, to be honest, it's not clear to me how you calculated the average or where the grey rectangles come from.
I would need to correlate the seasonal averages using Var5 in "giulia_TT" (which I can't see in the code).
Thanks.
What are the names of the measures corresponding to Var1, Var2, ... Var6?
Do you have names for the seasons? I'm only asking because of the unusual divisions given in your question (e.g. season 4 is February).
The bar chart is just an example. It is for Var1. The season number is directly under each bar in the chart. The first bar is for season 1, which is Dec+Jan according to your definition. The height of each bar is the mean (average) for the corresponding season and variable. For the first bar (season 1), the mean is 208.51. You can see the exact value in the statarray output table, and you can also read the approximate value off the y-axis.
It's also possible, with a few extra lines of code, to display the mean value inside the bar. Send me names for the variables and seasons and I'll add this to the code.
No, I don't have names for the seasons!
I just need the division I gave you before: 1) Dec-Jan; 2) Mar to Oct; 3) Nov; 4) Feb.
So 1 is Dec-Jan, 2 is for Mar to Oct, 3 is for Nov and 4 is for Feb, right?
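To make the season under each bar explicit, the numeric tick labels can be swapped for text labels. This is a minimal sketch assuming the numbering from the question (1 = Dec-Jan, 2 = Mar-Oct, 3 = Nov, 4 = Feb); the yMean values are placeholders for illustration only, not the real means, and seasonNames is a hypothetical variable.

```matlab
% Placeholder means for the four seasons (illustrative values only)
x = 1:4;
yMean = [60 75 65 62];

% Season labels in the order defined in the question
seasonNames = {'Dec-Jan', 'Mar-Oct', 'Nov', 'Feb'};

figure;
bar(x, yMean, 'FaceColor', [.8 .8 .9]);

% Replace the tick labels 1..4 with the season names
ax = gca;
ax.XTick = x;
ax.XTickLabel = seasonNames;

title('Var5 by Season');
xlabel('Season');
ylabel('Mean');
```

In the answer's code, the same three ax lines could be added after the bar call, with x and yMean taken from statarray.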
Yes, I saw the values in the table, but I was wondering where they come from. Do they come from Var5? (Because I have to take into account only that variable.)
Indeed, I noticed that with my earlier code (the script below) I got different values for the means: 58.125090122566690, 74.331828370156900, 65.709169913095600, 61.315059959507860.
So, for this reason, I was wondering which variable you were considering.
s1_var5_mean = mean(s1.Var5, 'omitnan')
s2_var5_mean = mean(s2.Var5, 'omitnan')
s3_var5_mean = mean(s3.Var5, 'omitnan')
s4_var5_mean = mean(s4.Var5, 'omitnan')
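The script above does not show how s1 to s4 were created. One plausible reading is that they are row subsets of giulia_TT, one per season; the sketch below assumes that and uses a synthetic timetable with made-up Var5 values in place of the real data, so the printed means will not match the numbers quoted in the thread.

```matlab
% Synthetic stand-in for giulia_TT (assumption: daily Var5 values, one NaN)
Time = (datetime(2020,1,1):datetime(2020,12,31))';
Var5 = 50 + 10*sin(2*pi*(1:numel(Time))'/366);
Var5(10) = NaN;
giulia_TT = timetable(Time, Var5);

% Month number of every row, computed once
m = month(giulia_TT.Time);

% One sub-timetable per season (assumed definition of s1..s4)
s1 = giulia_TT(m == 12 | m == 1, :);   % Dec, Jan
s2 = giulia_TT(m >= 3 & m <= 10, :);   % Mar to Oct
s3 = giulia_TT(m == 11, :);            % Nov
s4 = giulia_TT(m == 2, :);             % Feb

% Seasonal means of Var5, skipping NaN values
s1_var5_mean = mean(s1.Var5, 'omitnan')
s2_var5_mean = mean(s2.Var5, 'omitnan')
s3_var5_mean = mean(s3.Var5, 'omitnan')
s4_var5_mean = mean(s4.Var5, 'omitnan')
```

If the subsets are defined this way, these four means should agree with the mean_Var5 column that grpstats reports for Var5, since both use the same season masks.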
As a consequence, the names of other variables don't matter to me!
Thank you.
OK, I adjusted the code again. The analysis is now only for Var5, which is apparently the only variable you are interested in. The means are now displayed in the bars as well.
Thanks for your help.
You're welcome. Glad to help. Good luck.
