How to match mean and standard deviation of 2 datasets for data that cannot be less than 0
Hi all,
I have two datasets of monthly precipitation sums at one location: one is observed, the other is simulated by a climate model. I want to rescale the model series so that its mean and standard deviation match those of the observations. However, when I apply the formula below, some of the rescaled monthly sums become negative in order to make the standard deviation fit, which is not possible for precipitation. Is there a way to match both statistics while keeping all values at or above 0?
Thanks!
data_mod=[70.74191271	66.54238669	28.60091702	55.56554018	66.04186858	77.06576381	72.57394329	99.62497103	42.51156832	22.81012399	107.3993961	48.45702239	33.71119171	61.09975519	74.43277952	39.14433747	113.1039794	67.93592923	96.95537867	12.99913771	53.6158074	48.05637989	52.27533536	99.27060261	54.73806827	148.462539	17.94473213	93.65016815	32.89454535	52.36015655];
data_obs=[27	38.6	146.1	61.8	44.6	7.5	50.4	23.2	8.1	89.7	23.1	83.3	86.5	46	14.5	27.7	81	30	50.3	165.7	15.5	106.7	56.7	52.5	75.1	100.1	6.9	18.7	93.4	16.6];
data_transformed = mean(data_obs(:)) + (data_mod - mean(data_mod(:)))*(std(data_obs(:))/std(data_mod(:)));
mean(data_transformed(:))
std(data_transformed(:))
Answers (1)
Les Beckham on 7 Feb 2024 (edited 7 Feb 2024)
Your model data doesn't look at all like the observed data, so I'm not surprised that trying to force their statistics to match gives you unexpected results.
data_mod=[70.74191271	66.54238669	28.60091702	55.56554018	66.04186858	77.06576381	72.57394329	99.62497103	42.51156832	22.81012399	107.3993961	48.45702239	33.71119171	61.09975519	74.43277952	39.14433747	113.1039794	67.93592923	96.95537867	12.99913771	53.6158074	48.05637989	52.27533536	99.27060261	54.73806827	148.462539	17.94473213	93.65016815	32.89454535	52.36015655];
data_obs=[27	38.6	146.1	61.8	44.6	7.5	50.4	23.2	8.1	89.7	23.1	83.3	86.5	46	14.5	27.7	81	30	50.3	165.7	15.5	106.7	56.7	52.5	75.1	100.1	6.9	18.7	93.4	16.6];
plot(1:numel(data_obs), data_obs, '.-', 1:numel(data_mod), data_mod, 'o-')
legend('Observed', 'Modeled')
grid on
mean(data_mod) % <<< Model mean is much higher than observed mean!
mean(data_obs)
data_transformed = mean(data_obs(:)) + (data_mod - mean(data_mod(:)))*(std(data_obs(:))/std(data_mod(:)));
mean(data_transformed(:))
std(data_transformed(:))
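To see where the negative values come from, it may help to solve `data_transformed < 0` for the model value. The sketch below reuses the data and the rescaling formula from the thread; the names `scale`, `threshold`, `cv_mod`, `cv_obs` and `is_negative` are introduced here for illustration only. It shows that any model value below `threshold` is mapped to a negative number, and that this threshold is positive whenever the observed coefficient of variation (std divided by mean) is larger than the modelled one.

```matlab
data_mod=[70.74191271	66.54238669	28.60091702	55.56554018	66.04186858	77.06576381	72.57394329	99.62497103	42.51156832	22.81012399	107.3993961	48.45702239	33.71119171	61.09975519	74.43277952	39.14433747	113.1039794	67.93592923	96.95537867	12.99913771	53.6158074	48.05637989	52.27533536	99.27060261	54.73806827	148.462539	17.94473213	93.65016815	32.89454535	52.36015655];
data_obs=[27	38.6	146.1	61.8	44.6	7.5	50.4	23.2	8.1	89.7	23.1	83.3	86.5	46	14.5	27.7	81	30	50.3	165.7	15.5	106.7	56.7	52.5	75.1	100.1	6.9	18.7	93.4	16.6];

% Linear rescaling from the thread: shift and stretch to match mean and std.
scale = std(data_obs) / std(data_mod);
data_transformed = mean(data_obs) + (data_mod - mean(data_mod)) * scale;

% Solving data_transformed < 0 for data_mod gives this threshold:
% every model value below it ends up negative after rescaling.
threshold = mean(data_mod) - mean(data_obs) / scale;

% Coefficient of variation of each series (illustrative helper values).
cv_mod = std(data_mod) / mean(data_mod);
cv_obs = std(data_obs) / mean(data_obs);

% Model values that become negative, and what they are mapped to.
is_negative = data_transformed < 0;
disp([data_mod(is_negative); data_transformed(is_negative)])

fprintf('threshold = %.2f, cv_mod = %.3f, cv_obs = %.3f\n', threshold, cv_mod, cv_obs);
```

Because the observations are relatively more variable than the model output, the stretch factor pushes the smallest model values below zero; no choice of shift and scale can match both statistics and avoid this for these data.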
2 Comments
Les Beckham on 8 Feb 2024
Sorry, but I would have to say no, unless I misunderstand what your goal really is. If you explain your ultimate goal in more detail, someone may be able to offer more suggestions. You say that you "want to match the mean and standard deviation of both datasets, so that the time series from the model matches with those from the observations." However, since the model output clearly looks nothing like the observations, this is going to be very difficult.
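If the goal is only to match the two statistics while keeping every value positive, one option that is sometimes used for precipitation is a power transformation of the form `a * x.^b` instead of a linear shift and scale. This was not proposed in the thread, so treat the following as a minimal sketch under that assumption rather than a recommended method. The names `cv`, `a`, `b` and `data_power` are hypothetical, and the search bracket `[0.1, 5]` is an assumption chosen for this particular data set.

```matlab
data_mod=[70.74191271	66.54238669	28.60091702	55.56554018	66.04186858	77.06576381	72.57394329	99.62497103	42.51156832	22.81012399	107.3993961	48.45702239	33.71119171	61.09975519	74.43277952	39.14433747	113.1039794	67.93592923	96.95537867	12.99913771	53.6158074	48.05637989	52.27533536	99.27060261	54.73806827	148.462539	17.94473213	93.65016815	32.89454535	52.36015655];
data_obs=[27	38.6	146.1	61.8	44.6	7.5	50.4	23.2	8.1	89.7	23.1	83.3	86.5	46	14.5	27.7	81	30	50.3	165.7	15.5	106.7	56.7	52.5	75.1	100.1	6.9	18.7	93.4	16.6];

% Coefficient of variation (std / mean) of a sample.
cv = @(x) std(x) / mean(x);

% Step 1: find the exponent b for which data_mod.^b has the observed CV.
% fzero needs a sign change between the bracket end points; [0.1, 5] is
% an assumption that holds for this data set.
b = fzero(@(p) cv(data_mod .^ p) - cv(data_obs), [0.1, 5]);

% Step 2: a constant factor does not change the CV, so choose a to match
% the mean; the standard deviation then matches as well.
a = mean(data_obs) / mean(data_mod .^ b);

data_power = a * data_mod .^ b;

mean(data_power)   % compare with mean(data_obs)
std(data_power)    % compare with std(data_obs)
min(data_power)    % positive, because a > 0 and data_mod > 0
```

This only matches the first two moments. The month-to-month sequence still comes from the model, so the point above stands: the transformed series will not otherwise resemble the observations.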