cross correlation between two time series data

I have two different normalised atmospheric time series. Their temporal variation shows a mixed pattern of correlation and anti-correlation at different time periods. Apparently there are some lags in these correlations. But when I carry out cross correlation, the plot shows maximum correlation at a lag of 1500, which is the number of data points (plot xcorr-data.png). The wcoherence plot shows strong correlation between 0.5 and 2 days, but the arrows don't show any definite pattern. Can anyone suggest how I can interpret the result in a more meaningful way? Any suggestions about alternative approaches would be great.
Thanking you in advance.

Answers (1)

William Rose on 10 May 2024
I looked at the plots you provided. You say "Their temporal variation shows a mixed pattern of correlation and anti correlation at different time periods." Is that statement based on your visual analysis of the orange and blue traces in variation.png? I recommend that you be cautious when drawing such conclusions. We humans see "patterns" even where none exist. There are plenty of studies involving uncorrelated random sequences that demonstrate this. It is good to state your hypothesis about the relationship between the two signals clearly before doing the analysis. If you do not, there is a significant risk that you will find a spurious relationship between the signals by "data dredging", that is, by doing enough analyses that something pops up as "significant".
The orange trace in the "variation" figure shows a large spike around March 25, 2024. Is this real, or is it some kind of edge effect? The large size of this spike means that it will have a large weight in some kinds of analysis.
The large correlation at lag 1500 occurs because you did not remove the mean value from each sequence before doing the cross correlation, and because you did not specify the unbiased estimate of the cross correlation.
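As a rough illustration of the two points above (removing the mean and requesting the unbiased estimate), here is a minimal sketch on synthetic data, not on the data from the question. The shift `d`, the offset of 10, and the `maxlag` limit are assumptions chosen only for the demonstration; `xcorr` requires the Signal Processing Toolbox.

```matlab
% Synthetic demonstration only; none of these values come from the posted data.
rng(0);                          % reproducible random numbers
N = 1500; d = 5;                 % d: hypothetical shift between the signals, in samples
s = randn(N+d,1);                % common underlying noise sequence
x = s(d+1:end) + 10;             % constant offset mimics a mean that was not removed
y = s(1:N) + 10;                 % same sequence, shifted by d samples relative to x

xd = x - mean(x);                % remove the mean of each sequence first
yd = y - mean(y);

maxlag = 100;                    % restrict to lags where many samples overlap
[rRaw,lags] = xcorr(x, y, maxlag);          % raw estimate: the offset dominates
rFix = xcorr(xd, yd, maxlag, 'unbiased');   % each lag m is divided by N-|m|

[~,k] = max(abs(rFix));          % locate the strongest correlation
fprintf('Largest |r| (mean removed, unbiased) at lag %d samples.\n', lags(k));
fprintf('Raw estimate at zero lag: %.0f\n', rRaw(maxlag+1));
```

In this synthetic case the mean-removed, unbiased estimate should recover the known shift, while the raw estimate is dominated by the constant offset. Limiting `maxlag` is a design choice: the unbiased estimate becomes noisy at large lags because few samples overlap there.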
If you upload the data shown in variation.png, I expect you will receive more specific suggestions for analysis.
  2 Comments
Sandipan on 11 May 2024
@William Rose Thank you for the insight. I understand that visual conclusions can lead to false correlations. Actually, I have computed Spearman rank correlations over different intervals. The value is 0.5-0.7 in different intervals, but in some intervals it is positive and in others it is anti-correlated. I was exploring the best way to represent this. Can wavelet coherence analysis be meaningful here? As you suggested, I have uploaded the raw data for the variation.png file.
William Rose on 13 May 2024
I apologize for saying things you already knew, in my initial post.
I would be uneasy about computing Spearman (or Pearson) correlation separately for different intervals, unless you have a hypothesis for why the correlation would be different. For example, maybe you hypothesize that x and y are anti-correlated when the moon is waxing, and correlated when the moon is waning. Then it makes sense to analyze the data for the different phases of the moon separately.
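To make the moon-phase example concrete, here is a minimal sketch of a hypothesis-driven split, again on synthetic data. The grouping variable `waxing`, the 200-sample cycle, and the noise level are all hypothetical; the point is only that the grouping is defined before looking at the correlations. `corr` with the `'Type','Spearman'` option requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic demonstration only; the grouping and signal model are hypothetical.
rng(1);                                   % reproducible random numbers
N = 1000;
x = randn(N,1);
waxing = mod((0:N-1)', 200) < 100;        % condition fixed in advance of the analysis

% Build y so that it is anti-correlated with x when waxing, correlated otherwise
y = zeros(N,1);
y(waxing)  = -x(waxing)  + 0.5*randn(nnz(waxing),1);
y(~waxing) =  x(~waxing) + 0.5*randn(nnz(~waxing),1);

% Spearman rank correlation within each pre-defined group, and overall
rhoWax = corr(x(waxing),  y(waxing),  'Type', 'Spearman');
rhoWan = corr(x(~waxing), y(~waxing), 'Type', 'Spearman');
rhoAll = corr(x, y, 'Type', 'Spearman');
fprintf('rho(waxing)=%.2f, rho(waning)=%.2f, rho(all)=%.2f\n', rhoWax, rhoWan, rhoAll);
```

Because the two groups have opposite signs by construction, the overall correlation should be much weaker than either group's correlation. Choosing the intervals after inspecting the data, by contrast, is the kind of data dredging cautioned against earlier in the thread.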
The script below plots the data you posted. The raw data seems to be a bit different from the data in the plot variation.png in your original post. The script also computes and plots the cross correlation with two different normalization options. Neither plot looks like the cross correlation plot you provided in your original post. I am not sure how you made that plot.
d=importdata('sampledata.csv');             % load the posted data
x=d.data(:,1); y=d.data(:,2);               % columns 1 and 2 are the two series
N=length(x);
fprintf('Mean(x)=%.3f, mean(y)=%.3f, length=%d samples.\n',mean(x),mean(y),N)
% Output: Mean(x)=0.000, mean(y)=-0.000, length=1596 samples.
t=0:N-1;                                    % sample index used as the time axis
[r1,lags]=xcorr(x,y); % cross corr., default
r2=xcorr(x,y,'unbiased'); % cross corr., unbiased
% Plot the two time series
figure
plot(t,x,'-r.',t,y,'-b.')
xlabel('Time (hours)'); legend('x','y'); grid on
% Plot both cross correlation estimates against lag
figure;
subplot(211), plot(lags,r1,'-g')
title('Cross Correlation (default)'); grid on
subplot(212), plot(lags,r2,'-g')
title('Cross Correlation (unbiased)');
xlabel('Lags'); grid on
You can use wavelet coherence analysis, but I am not up to speed on the literature regarding its susceptibility to type I errors, i.e., finding significance in a dataset where in fact there is none.
Good luck.
