Plotting data with min, max, mean value

Hello, I need help changing my output plot. I have computed temperature data for a whole year at one-minute intervals. A plot of the data looks like this:
Is there a way to plot it and get a graph like this one, for example?
If it helps, I'm attaching my simulated data. Thank you for any help.

 Accepted Answer

dpb on 10 Dec 2016
Edited: dpb on 11 Dec 2016
See
doc boxplot % for starters...
ADDENDUM
I admit I've never used Matlab boxplot in anger; let's see...[ponder, piddle, ... ah, ok!]...
Takes a little klutzing around -- augment the shorter months with NaN to the length of the longest to create a rectangular array, then make a categorical grouping array 1:12 for each column.
ADDENDUM(2)
Decided to see if could do with your sample data...turns out not too bad...
g=ordinal(cellstr(datestr(datenum(2000,1:12,1),'mmm')));
l1=(eomday(2015,1:12))*24*60; % days in month for non-leapyear to minutes
l2=cumsum(l1); % cumulative for indexing into 1D vector
t=nan(31*24*60,12); % preallocate NaN
i1=1; % allocate to rectangular array...
for i=1:12
t(1:l1(i),i)=teplotaBojleru(i1:l2(i)).';
i1=l2(i)+1;
end
boxplot(t,g,'grouporder',cellstr(g))
NB: Remember I'm at R2012b; the newer datetime class may be somewhat simpler. Also, if you could generate or retrieve the data in the desired orientation to begin with, you wouldn't have to do the rearranging, although it's not too bad...
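For readers on a release that has datetime (R2014b or later), here is a minimal sketch of what the "somewhat simpler" route might look like. It was not part of the original answer. The synthetic vector x and the assumed start time of midnight on 1 Jan 2015 are placeholders for the poster's data, and boxplot still needs the Statistics Toolbox.

```matlab
% Hypothetical stand-in for the poster's 1-minute temperature vector
% (the real teplotaBojleru is not reproduced here).
nMin = sum(eomday(2015,1:12))*24*60;       % 525600 minutes in 2015
x = 50 + 10*randn(nMin,1);                 % synthetic data, illustration only

% One timestamp per sample, starting at midnight on 1 Jan 2015 (assumed start).
ts = datetime(2015,1,1,0,0,0) + minutes(0:nMin-1).';
m  = month(ts);                            % month number 1..12 for each sample

% Numeric grouping variable of the same length as x: one box per month number.
boxplot(x, m)
```

The boxes are labelled 1 to 12 here; no NaN padding or rearranging is needed because the grouping vector carries the month of every sample.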
ADDENDUM(3):
You don't have to rearrange; if you create a grouping variable of the same length as the vector that is the corresponding month for that column, then (other than perhaps memory issues) boxplot will work for that. I did a very trivial example of just 20 or so elements with three groupings to test the theory but it did work, apparently correctly.
The grouping for the categorical variable then needs must be
boxplot(x,g1,'grouporder',cellstr(getlevels(g1)))
to reflect the levels only since they're no longer unique as in the other method. I don't know how to manipulate the new categorical that seems missing these methods from the old Stat Toolbox implementation.
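As a hedged sketch of the same idea with the newer categorical class (R2013b or later): the categories function appears to play the role that getlevels plays for ordinal. The nine data values and the three month names below are made up for illustration only.

```matlab
% Tiny illustrative data set: 9 values in 3 groups (made-up numbers).
x = [1 2 3 11 12 13 21 22 23].';
names = {'Jan','Feb','Mar'};

% The second argument fixes the level order, the third gives the level names.
g1 = categorical([1 1 1 2 2 2 3 3 3].', 1:3, names);

lv = categories(g1);     % unique levels in their defined order
boxplot(x, g1, 'grouporder', lv)
```

Because categories returns each level only once, it can be passed to 'grouporder' directly even though g1 itself repeats every level many times.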

9 Comments

I'm having a problem boxplotting the separated month data in one plot because the months have different numbers of rows. E.g. January has 44640 rows, February 40320 rows... With "boxplot(January,February)" I get "Error using boxplot>straightenX (line 923) G must be the same length as X or the same length as the number of columns in X."
If you know how many rows there are in each month, and it sounds like you do, then just extract them into 12 variables, one for each month. Anything wrong with that?
LamaObecna on 11 Dec 2016
Edited: LamaObecna on 11 Dec 2016
dpb: I'm trying to use your code as you did, but I get the error "Undefined function or variable 'l'. Error in boxplot (line 8) i1=l(i)+1;" If I change l(i) to i(i) (I thought it might work), I get "Subscripted assignment dimension mismatch. Error in boxplot (line 8) t(1:l1(i),i)=teplotaBojleru(i1:l2(i)).';"
Image Analyst: The problem is that they have different row counts, and since I'm still fighting with some MATLAB basics I don't know how to pad them correctly with NaN values to get the same size. Edit: I solved it.
dpb on 11 Dec 2016
Edited: dpb on 11 Dec 2016
No way, no how, would there be an error referring to boxplot with the code line i1=l(i)+1 in my code. That's before you ever call boxplot creating the array to pass...
I just reran it here...first checked that the two l arrays were still in memory...
>> disp([l1;l2])
Columns 1 through 10
44640 40320 44640 43200 44640 43200 44640 44640 43200 44640
44640 84960 129600 172800 217440 260640 305280 349920 393120 437760
Columns 11 through 12
43200 44640
480960 525600
>>
OK, they're still around so...
>> t=nan(31*24*60,12);
>> i1=1;for i=1:12,t(1:l1(i),i)=teplotaBojleru(i1:l2(i)).';i1=l(i)+1;end
>> whos t
Name Size Bytes Class Attributes
t 44640x12 4285440 double
>> figure,boxplot(t,g,'grouporder',cellstr(g))
>>
regenerated the same figure as expected...
Show the actual code and error in context...
PS: You don't have to create the 2D array or save as separate variables, either. As noted in my follow-up above, you can create a grouping variable that corresponds to month of the same length as the initial temperature vector. l1, l2 would be a very useful bit of info from which to do that, as well...
I'm using R2015b. If I use exactly the same code as you posted the first time, I get that error:
g=ordinal(cellstr(datestr(datenum(2000,1:12,1),'mmm')));
l1=(eomday(2015,1:12))*24*60; % days in month for non-leapyear to minutes
l2=cumsum(l1); % cumulative for indexing into 1D vector
t=nan(31*24*60,12); % preallocate NaN
i1=1; % allocate to rectangular array...
for i=1:12
t(1:l1(i),i)=teplotaBojleru(i1:l2(i)).';
i1=l2(i)+1; % CORRECTION TYPO FIXED HERE MISSING '2'
end
boxplot(t,g,'grouporder',cellstr(g))
Undefined function or variable 'l'.
Why is there this "i1=1;"? Where do you define "l"? I see only l1 and l2.
Right now I created this:
months=ordinal(cellstr(datestr(datenum(2000,1:4,1),'mmm')));
year=nan(31*24*60,4);
year(1:44640,1)=teplotaBojleru(1:44640)';%january
year(1:40320,2)=teplotaBojleru(44641:84960)';%february
year(1:44640,3)=teplotaBojleru(84961:129600)';%march
year(1:43200,4)=teplotaBojleru(129601:172800)';%april
boxplot(year,months,'grouporder',cellstr(months))
It works, but it's a little bit elementary. "As noted in my follow-up above, you can create a grouping variable that corresponds to month of the same length" I don't get it... I'm confused :D
dpb on 11 Dec 2016
Edited: dpb on 11 Dec 2016
" Undefined function or variable 'l'."
Oh, pooh! :( That's a typo on my part; I didn't realize I still had an earlier l hanging around, which I had computed before I introduced the two. It's actually l2 in that use. See later comment on why...
"Why is there this "i1=1;"?"
It's the initialization for the lower index for the first iteration in the indexing expression into the long temperature vector teplotaBojleru(i1:l2(i)). That's "eye-1":"ell-2"; then i1 is updated for the next iteration. l2 is the cumulative end point of each month's subsection; hence the first location in the next section is the last of the previous plus one. The first element in the vector is index 1; hence the initialization for the first pass.
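To make the index bookkeeping concrete, here is a small sketch that tabulates the first and last index of each month without the loop. The variable name first is introduced here only for illustration; l1 and l2 are computed as in the answer.

```matlab
l1 = eomday(2015,1:12)*24*60;    % minutes in each month of 2015
l2 = cumsum(l1);                 % last index of each month in the long vector
first = [1, l2(1:end-1)+1];      % first index of each month (the values i1 takes)
disp([first; l2].')              % one row per month: [first last]
```

The February row reads 44641 to 84960 and the April row 129601 to 172800, which are the same ranges as in the hand-written version posted earlier in the thread.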
..."you can create a grouping variable that corresponds to month of the same length"
Say have a vector of 9 values over three months...then the grouping variable would be 3 levels for three groups. In general terms,
>> N=3; M=3; % levels, number of classes
>> g=reshape(repmat([1:N].',1,M).',1,[])
g =
1 1 1 2 2 2 3 3 3
>>
as a numeric grouping variable. Then convert that to categorical and use the earlier boxplot call to get the categorical labels.
Of course, in the real case you have, the grouping variable would be built from the lengths computed earlier for the corresponding positions in the vector; the above just illustrates the way it would work without building the rectangular array but a secondary grouping vector. Both end up with two copies of that size although you could possibly do without the 1D vector after creating the 2D array in which case it would be more effective. That all depends on what else is to be done, if anything.
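A sketch of how that grouping vector might be built from the month lengths for the real case. The name gnum is hypothetical, and the final boxplot call is left commented out because the poster's teplotaBojleru vector is not reproduced here.

```matlab
l1 = eomday(2015,1:12)*24*60;    % minutes per month, as computed earlier
l2 = cumsum(l1);                 % last index of each month

% Mark the first sample of each month with a 1, then accumulate:
% the running sum is the month number of every sample.
gnum = zeros(l2(end),1);
gnum([1, l2(1:end-1)+1]) = 1;
gnum = cumsum(gnum);

% With R2015a or later, repelem(1:12, l1).' should give the same vector.
% boxplot(teplotaBojleru(:), gnum)   % teplotaBojleru is the poster's data vector
```

This keeps only the original 1D data plus one grouping vector of the same length, instead of the NaN-padded 44640x12 array.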
I just wanted to write that I figured out where the mistake was, so now it's working fine! Last question: what are those thick red lines? Anyway, thank you very much for your time and help!
Glad you did recognize the error...I realized my first response was incorrect when I checked my workspace and discovered I did have an 'l' array I had forgotten about...
" what are those thick red lines?"
I dunno...I presume they're a fignewton of the result of the distribution of points from your artificial data but I don't really know what boxplot does internally or what it indicates, specifically, about that distribution/set of values.
I presume when you have real data it'll look more normal; certainly the time plot is kinda' funky-lookin' with the bounded/truncated distributions shown. Note that they show up on the two ends where you've truncated the bottom side so heavily.
" what are those thick red lines?"
Oh, I know! It's the gazillion "outliers" for those months that are so drastically truncated on one side. Note for Jan how small the 25-75th percentile range is as compared to the actual data range. The default outlier symbol string is 'r+', so what looks solid is actually lots of red crosses on top of each other, making what looks like a solid bar. You can just make out the vertical edge of the uppermost "+" at the top of each of those bars.
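One way to check that explanation is to count the points beyond the whisker limits directly. This is only a sketch: the data are synthetic and clipped on one side to mimic the truncation, it assumes boxplot's default whisker multiplier of 1.5, and the percentiles from prctile may differ slightly from what boxplot computes internally.

```matlab
% Synthetic, one-sided truncated data standing in for one month (illustration only).
rng(0);
v = max(50 + 10*randn(44640,1), 48);     % bottom side clipped at 48

q = prctile(v, [25 75]);                 % box edges (25th and 75th percentiles)
w = 1.5;                                 % boxplot's default 'whisker' value
isOut = v > q(2) + w*(q(2)-q(1)) | v < q(1) - w*(q(2)-q(1));
fprintf('%d of %d points fall outside the whiskers\n', nnz(isOut), numel(v));

% A smaller marker than the default 'r+' makes the stacked outliers easier to read.
boxplot(v, 'symbol', 'r.')
```

With this many samples per month, even a small fraction beyond the whiskers is enough markers to merge into what looks like a solid red bar.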


More Answers (0)


Asked:

on 10 Dec 2016

Commented:

dpb
on 12 Dec 2016