Mean of Multiple data sets across a time series

I have 5 sets of CO2 concentrations, all taken at different locations. They have different start dates, the earliest beginning in 1957 and the latest in 1981, and all run up to the present. I would like to average all 5 locations into a single line and plot it against time. I believe this would take a for loop, which I am not very experienced with. I also considered keeping the datasets separate, finding the average of each, and using hold on to plot them against each other, but I again run into the issue of figuring out how to code the loop so that the values stay lined up with time. If one of these approaches would be a better way to analyze my data, knowing that would be helpful.

6 Comments

Read about mean. You can arrange your data into a matrix and take the average along a chosen dimension by specifying the dimension argument.
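A minimal sketch of that suggestion, assuming the records have already been placed on a common time axis with NaN wherever a location has no reading. The matrix values and variable names below are made up for illustration and are not taken from the original data:

```matlab
% Hypothetical data: rows are time steps, columns are locations.
% NaN marks times before a location started recording.
co2 = [315 NaN NaN;
       316 320 NaN;
       318 321 319];

% Average across locations (dimension 2), ignoring missing values
meanAcrossSites = mean(co2, 2, 'omitnan');   % one value per time step

% Average over time (dimension 1) for each location
meanPerSite = mean(co2, 1, 'omitnan');       % one value per location
```

The 'omitnan' flag matters here: without it, any time step with a missing location would average to NaN.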
To restate the question, according to my understanding: There are two options under consideration:
1) calculate the average CO2 concentration of the 5 locations, for each time. But each location's data spans a different time period, so it is necessary to "line up" the data properly in time before averaging.
2) calculate the average CO2 concentration for each location over all time, which would produce 5 numbers. (or perhaps calculate a running average of CO2 concentration for each location, which would produce 5 vectors of numbers?)
As far as I understand, the main difficulty is figuring out how to "line up" the data so that only data from the same time are used together. Is that basically right? If so, what format are the data in now?
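One possible way to do the "lining up" in option 1 is to store each location as a timetable and combine them with synchronize. The sketch below assumes monthly readings and uses two hypothetical locations (siteA, siteB) with invented values; the real data format was not shared, so this may need adapting:

```matlab
% Hypothetical monthly records from two locations with different start dates
tA = datetime(1957,1,1) + calmonths(0:3)';   % Jan to Apr 1957
tB = datetime(1957,3,1) + calmonths(0:1)';   % Mar to Apr 1957
siteA = timetable([315; 316; 317; 318], 'RowTimes', tA, 'VariableNames', {'A'});
siteB = timetable([320; 321], 'RowTimes', tB, 'VariableNames', {'B'});

% Default synchronize keeps the union of all timestamps and fills
% times where a location has no reading with NaN
aligned = synchronize(siteA, siteB);

% Mean across locations at each time, skipping the NaN gaps
aligned.MeanCO2 = mean(aligned{:, {'A','B'}}, 2, 'omitnan');

% One line against time
plot(aligned.Properties.RowTimes, aligned.MeanCO2)
```

Note that with this approach the number of locations contributing to the mean changes over time, so the averaged line may shift when a new location starts recording.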
My data is currently in an array. It originally came in a table, so I used table2array. I think the best option for me would be #1 as you listed above. I just have no idea how to go about doing so.
Is it possible to share the table or array of data, say as a .mat file uploaded here? Then I'll look and see what I can figure out.
My data came in a .csv file; would I be able to share it with you that way? Or should I share my entire live script with you? I tried uploading a file here and it did not work, so I am unsure whether I am doing it correctly.
Yeah, if you can upload the .csv file, that would be perfect.
If you're unable to do so, maybe it's too large? Probably just the top however many rows would be sufficient for me to get an idea of what the data is. If all else fails, maybe a screen shot of the .csv open in Excel or whatever would work well enough.

Answers (1)

Hi Fiona,
As I understand it, you want the mean CO2 concentration from readings taken at different locations, where the timestamps of the recordings also differ.
Without the actual data it is difficult to give an exact solution, but I have written a sample in which a table holds readings in ppm taken at different locations in different months. To calculate the mean, I select the readings from the same month and average them. You can use the code below as a reference.
% Sample data: readings (ppm) from several locations in different months
location = ["A";"A";"B";"B";"B";"C";"C";"D"];
month = ["Jan";"Mar";"Jan";"Feb";"Mar";"Feb";"Mar";"Jan"];
ppm = [10;20;30;40;50;60;70;80];
dataTable = table(location,month,ppm);
% Index the table directly so ppm stays numeric
% (table2array on mixed columns would convert the numbers to strings)
uniqueMonths = unique(dataTable.month);
for i = 1:numel(uniqueMonths)
    % Average the readings that share the same month
    monthMean = mean(dataTable.ppm(dataTable.month == uniqueMonths(i)));
    disp("Mean of " + uniqueMonths(i) + " is " + monthMean);
end
Mean of Feb is 50
Mean of Jan is 40
Mean of Mar is 46.6667
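The same per-month averages can also be computed without an explicit loop. This is an alternative sketch using groupsummary on the same sample table, not part of the answer's original method; the output variable name monthlyMean is chosen here for illustration:

```matlab
% Same sample data as above
location = ["A";"A";"B";"B";"B";"C";"C";"D"];
month = ["Jan";"Mar";"Jan";"Feb";"Mar";"Feb";"Mar";"Jan"];
ppm = [10;20;30;40;50;60;70;80];
dataTable = table(location, month, ppm);

% One row per month, with the number of readings (GroupCount)
% and their mean (mean_ppm)
monthlyMean = groupsummary(dataTable, "month", "mean", "ppm");
disp(monthlyMean)
```

With real data, grouping on a datetime column instead of month names would keep the results in chronological rather than alphabetical order.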
Hope it helps!

Release

R2021a

Asked:

on 8 Dec 2021

Answered:

on 18 Apr 2024
