# cross correlation using 'xcorr' in the presence of NaN or missing values

Sagar on 6 Aug 2015
Commented: Adam Danz on 15 Apr 2020
Hi, I am trying to calculate the cross-correlation of two time series at different lags, but my data have a lot of NaN values. When I calculate the cross-correlation as below, corln comes back as all NaNs.
[corln, lags] = xcorr(ave_precp_india(:), aod_all(:, 1), 15);
I want to specify something like 'rows', 'pairwise' when calculating the correlation so that NaNs are ignored. How can I specify the 'rows', 'pairwise' option in xcorr?

#### 1 Comment

Brian Sweis on 15 Apr 2020
Did you ever find an answer to this? I'm in the exact same boat.

Adam Danz on 15 Apr 2020
Edited: Adam Danz on 15 Apr 2020
There isn't a simple solution to this problem. If you have a single NaN value within the window, the correlation for that window will be NaN.
Depending on how many missing values are in the data and how far they are spread apart, you may be able to work around the problem.
If there are relatively few missing values and they are spread apart, you could fill in the NaN values by interpolation or by using MATLAB's fillmissing() function, but you must do so in a responsible and meaningful way. Merely getting rid of the NaN values does not mean the result is a good one. After filling the missing values, plot the data and make sure the filled-in values make sense and are reasonable.
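A minimal sketch of that fill-then-correlate approach follows. The synthetic series, the variable names, and the choice of linear interpolation are illustrative assumptions, not taken from the original question; xcorr requires the Signal Processing Toolbox and fillmissing requires R2016b or later.

```matlab
% Hypothetical stand-in data; replace x and y with your own series.
rng(0);
n = 200;
t = (1:n)';
x = sin(2*pi*t/25) + 0.1*randn(n, 1);
y = circshift(x, 3) + 0.1*randn(n, 1);   % y is x delayed by 3 samples
x([20 75 140]) = NaN;                    % a few isolated gaps
y([50 110])    = NaN;

% Fill gaps by linear interpolation. 'EndValues','nearest' avoids
% extrapolating a line past the first/last valid sample.
xf = fillmissing(x, 'linear', 'EndValues', 'nearest');
yf = fillmissing(y, 'linear', 'EndValues', 'nearest');

% Inspect the filled-in points before trusting them.
figure;
plot(t, xf, '-', t(isnan(x)), xf(isnan(x)), 'o');
legend('filled series', 'filled-in points');

% Remove the means, then normalize so that the zero-lag
% autocorrelations equal 1 ('coeff' scaling).
maxLag = 15;
[c, lags] = xcorr(xf - mean(xf), yf - mean(yf), maxLag, 'coeff');
```

Whether linear interpolation is appropriate depends on the data; other fillmissing methods (for example 'previous' or 'spline') may suit a given series better, and the plot is there to check exactly that.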
If the NaN values are clustered together, interpolation and fillmissing() won't be reasonable solutions. You may have to analyze the data in chunks but even that has problems since the number of data points within the window becomes smaller at the beginning and end of each chunk of data.
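When the gaps are clustered, one possible alternative in the spirit of the 'rows', 'pairwise' option asked about in the question is to compute an ordinary Pearson correlation lag by lag, using only the sample pairs where both values are present. The sketch below is an assumption-laden illustration, not something xcorr does: the data are synthetic, the names are made up, and the number of usable pairs (nPairs) differs from lag to lag, so the values are not directly comparable to xcorr's 'coeff' output.

```matlab
% Hypothetical stand-in data with clustered gaps.
rng(1);
n = 300;
x = randn(n, 1);
y = [zeros(4, 1); x(1:end-4)] + 0.2*randn(n, 1);   % y is x delayed by 4 samples
x(100:130) = NaN;
y(200:215) = NaN;

maxLag = 15;
lags   = (-maxLag:maxLag)';
r      = nan(size(lags));     % pairwise correlation at each lag
nPairs = zeros(size(lags));   % number of valid pairs at each lag

for i = 1:numel(lags)
    k = lags(i);
    % Pair x(t+k) with y(t), the same lag convention xcorr uses.
    if k >= 0
        a = x(1+k:end);   b = y(1:end-k);
    else
        a = x(1:end+k);   b = y(1-k:end);
    end
    ok = ~isnan(a) & ~isnan(b);       % keep pairs with both values present
    nPairs(i) = nnz(ok);
    if nPairs(i) > 2
        R = corrcoef(a(ok), b(ok));
        r(i) = R(1, 2);
    end
end
```

Checking nPairs alongside r is worthwhile: a lag supported by only a handful of pairs can produce a large correlation by chance.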

Brian Sweis on 15 Apr 2020
Thanks Adam, I'm realizing this is certainly not so simple, as I've been browsing the web all day learning about other people's situations with similar issues. I appreciate your (recent) commentary as a lot of people's comments online are from pretty old posts.
To make it a little more complicated, I'm trying to do this in 2 dimensions, much like xcorr2, which also doesn't have an obvious way to return the normalized correlation coefficient like xcorr can.
This function normxcorr2_general seems to work a bit more flexibly than MATLAB's built-in normxcorr2.
And this function nanxcov seems to try to handle NaNs in a way xcorr doesn't and then normalizes with means removed.
I reached out to the author of the normxcorr2_general code above, Dirk Padfield, who just today pointed me in the direction of this paper with corresponding code:
"Masked object registration in the Fourier domain", on how to quickly compute masked correlation, which you can find along with the code at http://www.dirkpadfield.com/papers. This approach/code enables you to specify a mask with any arbitrary pixels turned on that you want, and all pixels that are turned off will be ignored in the computation. The code does not include a "maxLag" parameter, but because the computation is fast you can crop the output to what you need after it processes all lags in multiple dimensions.
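To illustrate the masking idea in one dimension only, here is a rough sketch. It is not the code from the paper, and the data and names are assumptions made for the example. NaNs are replaced by zeros, a 0/1 mask records which samples are valid, and correlating the masks and masked signals with each other gives the per-lag counts and sums needed for a Pearson correlation over the valid pairs. It uses xcorr (Signal Processing Toolbox) for the raw sums.

```matlab
% Hypothetical stand-in data with clustered gaps.
rng(2);
n = 300;
x = randn(n, 1);
y = [zeros(4, 1); x(1:end-4)] + 0.2*randn(n, 1);   % y is x delayed by 4 samples
x(100:130) = NaN;
y(200:215) = NaN;

% Masks (1 = valid sample) and zero-filled copies of the data.
mx = double(~isnan(x));   x0 = x;   x0(mx == 0) = 0;
my = double(~isnan(y));   y0 = y;   y0(my == 0) = 0;

maxLag = 15;
C = @(a, b) xcorr(a, b, maxLag);   % raw sums: sum over t of a(t+k)*b(t)

[N, lags] = xcorr(mx, my, maxLag); % number of valid pairs at each lag
N   = round(N);                    % remove FFT round-off from the counts
Sxy = C(x0, y0);                   % sum of x.*y over valid pairs
Sx  = C(x0, my);      Sy  = C(mx, y0);       % sums of x and of y
Sxx = C(x0.^2, my);   Syy = C(mx, y0.^2);    % sums of squares

% Pearson correlation over the valid pairs at each lag.
covxy = Sxy - Sx.*Sy./N;
varx  = max(Sxx - Sx.^2./N, 0);    % clamp tiny negative round-off
vary  = max(Syy - Sy.^2./N, 0);
r = covxy ./ sqrt(varx .* vary);
r(N < 3) = NaN;                    % too few pairs to be meaningful
```

Because the mask sums are computed for all lags at once, this avoids an explicit loop over lags; the same bookkeeping extends to 2-D in principle, which is the case the paper addresses.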
Adam Danz on 15 Apr 2020
Thanks for sharing what you've found. It will likely be useful to future visitors here.