How do I compute mean values of nine variables over Lat x lon x time?
Show older comments
I am attaching my netcdf matlab code to read the data from netcdf file. Input file netcdf is also attached. Please modify the code to compute mean values of nine meteorological variables from netcdf ERA5 hourly data. This file contains 00 and 12 GMT data. I would be highly obliged for your kind help.
I want daily average values from 00 and 12 GMT observations and also average values over lat and lon data. I have the total number of data points in input file 6961, so output file should have 3480 data points with nine variables(3481,9).
Sanchit
2 Comments
Cris LaPierre
on 19 Jul 2023
User has deleted their original question.
How do I compute mean values of nine variables over Lat x lon x time?
I am attaching my netcdf matlab code to read the data from netcdf file. Input file netcdf is also attached. Please modify the code to compute mean values of nine meteorological variables from netcdf ERA5 hourly data. This file contains 00 and 12 GMT data. I would be highly obliged for your kind help.
I want daily average values from 00 and 12 GMT observations and also average values over lat and lon data. I have the total number of data points in input file 6961, so output file should have 3480 data points with nine variables(3480,9).
Sanchit
Rena Berman
on 5 Jun 2024
(Answers Dev) Restored edit
Answers (1)
Cris LaPierre
on 19 Jul 2023
Edited: Cris LaPierre
on 19 Jul 2023
My understanding is that you want to take the mean of all the data for each day and time. This means you would lose the lat/lon and expver information.
I would take a different approach. I would turn the info into a timetable, and then use groupsummary to perform the mean by group.
unzip('matlab.zip','./');
ncfile = 'Lakhimpur_ERA5_daily_00_12_2014_2023.nc';
% Load the grouping data
lat = ncread(ncfile,'latitude');
lon = ncread(ncfile,'longitude');
expver = ncread(ncfile,'expver');
time = ncread(ncfile,'time');
time = datetime(double(time)*60*60,'ConvertFrom','epochtime','Epoch','1901-01-01');
% convert grouping data to 4x4x2x6961 arrays
[Lon,Lat,Expver,Time] = ndgrid(lon,lat,expver,time);
% Load the variables (4x4x2x6961)
var1 = ncread(ncfile,'d2m') ;
var2 = ncread(ncfile,'t2m') ;
var3 = ncread(ncfile,'e') ;
var4 = ncread(ncfile,'pev') ;
var5 = ncread(ncfile,'ssr') ;
var6 = ncread(ncfile,'ssrd') ;
var7 = ncread(ncfile,'tp') ;
d2m = var1 .* 0.00036854 + 288.2675;
d2m = d2m-273.15;
t2m = var2 .* 0.00038766 + 291.3707;
t2m = t2m - 273.15;
e = var3 .* 5.0661e-08 - 0.0030683;
pev = var4 .* 8.5836e-08 - 0.00278;
ssr = var5 .* 183.9069 + 12629668.0466;
ssrd =var6 .* 209.9315 + 14640023.0343;
tp =var7 .* 5.8624e-07 + 0.019209;
% DEWPOINTAND VAPOR PRESSURE DEFICIT EQUATIONS
% From Tetens Formula, the relation between temperature and the partial pressure of water vapor
es = 6.1078 .* exp((17.269 .* t2m) ./ (237.3+t2m));
% where,es is saturated vapor pressure in millibars and T is temperature in degrees C
% and the equation for relative humidity:
% Measure the air temperature T, in °C.
% Find out the dew point temperature Dp, in °C.
% Calculate relative humidity RH using the formula,
rh = 100 .* (exp(17.625 .* d2m ./ (243.04 + d2m)) ./ exp(17.625 .* t2m ./ (243.04 + t2m)));
% Rh=(ea/es)*100
% ea = Rh*es/100
% where, ea is the actual vapor pressure or vapor pressure at dewpoint temperature
% es is the saturation vapor pressure or vapor pressure at air temperature
% And,Vapor Pressure Deficit = es - ea at any instant in millibars
vpd = es - (rh .* es ./ 100);
% Create a timetable of all the data
Lon = Lon(:);
Lat = Lat(:);
Expver = Expver(:);
Time = Time(:);
d2m = d2m(:);
t2m = t2m(:);
e = e(:);
pev= pev(:);
ssr = ssr(:);
ssrd = ssrd(:);
tp = tp(:);
vpd = vpd(:);
rh = rh(:);
dataTbl = timetable(Time,Lon,Lat,Expver,d2m,t2m,e,pev,ssr,ssrd,tp,vpd,rh)
% Calculate the mean over latitude x longitude x time
data = groupsummary(dataTbl,["Time","Time"],["hourofday","day"],"mean",4:12)
You can then go on to process the data as you desire. See the Access Data in Tables page for help on working with table data. If you keep the results in a table, you will want to use writetable instead of writematrix.
17 Comments
Devendra
on 19 Jul 2023
Edited: Rena Berman
on 5 Jun 2024
My understanding is that Sanchit wants daily average of all the data each day. That means daily average from 00 and 12 GMT also. Please modify your code to take daily average of all the data including time also. So the output table will have dimensions of (3480,11). Therefore hourofday_time will also vanish along with lat and Lon and expver information.
Thanks a lot for your kind help.
Devendra
Cris LaPierre
on 19 Jul 2023
The table has the data for both times in it. Please use the links in my post to learn how to continue processing the data, including extracting the data for 00 and 12 GMT from the summary table created by groupsummary (3481 and 00, 3480 at 12).
Torsten
on 19 Jul 2023
No. We may like to restore your question.
Cris LaPierre
on 20 Jul 2023
Edited: Cris LaPierre
on 20 Jul 2023
Bringing the converstaion back here from your duplicate post.The summary table contains both data sets. Since I set the grouping as ["hourofday","day"], the first half of the table is 00 GMT, and the second half is 12 GMT. The only problem you have left to solve is splitting the table.
You can do that using any method of indexing you'd like. Again, from the linked page, in you want an array instead of a table as the result, your syntax would be T{rows,vars}
Make an attempt and share your code here if it doesn't work for more help.
Devendra
on 20 Jul 2023
Edited: Rena Berman
on 5 Jun 2024
Hi Cris,
Is it possible to take mean of 00 and 12 GMT observations from the above code itself rather than splitting the table for 00 and 12 GMT and then computing the mean of 00 and 12GMT?
I also wanted to use your code if it computes the average of all the variables from 00 and 12 GMT daily observations for each day.
Devendra
Sanchit
on 22 Jul 2023
Cris LaPierre
on 22 Jul 2023
Edited: Cris LaPierre
on 29 Jul 2023
What do you mean by truncated? What are you seeing that is unexpected?
Sanchit
on 22 Jul 2023
Edited: Rena Berman
on 5 Jun 2024
Sanchit
on 22 Jul 2023
Edited: Rena Berman
on 5 Jun 2024
Walter Roberson
on 22 Jul 2023
Edited: Cris LaPierre
on 28 Jul 2023
EDIT: Adding deleted posts here
I am using the following line to normalize the data between 0 to 1 directly from your code as follow;
data = groupsummary(dataTbl,["Time","Time"],["hourofday","day"],"mean",4:12)
% Normalize the data matrix
output = normalize(data,"range");
% Write normalized data in csv format
filename = 'c:/dummy.csv' writematrix(output,filename);
but it is giving error. May I request you to kindly suggest me how to get normalized data from above lines? Thank you very much.
Sanchit
Alternatevly how to write the data upto 10 decimal points in a csv file.
Sanchit
____________________________________________________________________________
I am unclear about the datatypes of the variables to be written. If they are all double precision, then see @doc:dlmwrite and use a 'precision' option if you need more precision than writematrix() gives you. writematrix() is defined to use the equivalent of "format long g" which is %.15g . If you need to be able to reproduce double precision values exactly then you need at least %.17g (not sure at the moment if you ever need %.18g)
Cris LaPierre
on 22 Jul 2023
@Sanchit, the error is because output is a table, and you are using writematrix to create the csv. You either need to use writetable (see here and here), or extract your numeric data to a numeric array (see here).
That still won't give you your data to 10 significant figures, but it will correct the error.
Also, please stop deleting your posts.
Sanchit
on 28 Jul 2023
Cris LaPierre
on 28 Jul 2023
The issue is exactly what the error message says: "invalid data variables" in Read_netCDF_file (line 67)
data = groupsummary(dataTbl,["Time","Time"],["hourofday","day"],"mean",4:12)
For reference, here is the syntax that line of code is using:
Sanchit
on 28 Jul 2023
Cris LaPierre
on 28 Jul 2023
Let's start simple. Look at your data. How many variables (columns) are there? Which columns of the table do you want to take the mean of?
Sanchit
on 28 Jul 2023
Cris LaPierre
on 29 Jul 2023
As a technicality, in a timetable, the Time column doesn't count, so technically there are 10.
I'm not sure if you remember, but in the original data file that spawned this question, there were 12 columns, and the code you are using was written for the original data set, not your new one. You need to update the datavars input accordingly.
For reference, here is the syntax my code is using:
Categories
Find more on NetCDF in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!