How to replace the missing value using the correlation between x and y?

Hi,
Let's say I have this data:
T x y
------ ------ ------
1 150 7.2
2 150 7.1
3 210 NaN
4 235 6.1
5 280 5.5
I tried to compute the correlation in MATLAB, but it gives me weird output: a 5x5 matrix in which every value is NaN.
Is there a way to replace this missing value using the correlation between x and y?

Answers (2)

Ayush Singh on 18 Jun 2022
Hi Omar,
I understand from your question that you want the correlation between x and y while omitting the NaN values.
You could use the 'rows', 'complete' name-value pair to skip the rows that contain NaN values.
CorrXY = corr(X, Y, 'rows', 'complete') % Here X and Y are the two variables for which you are finding the correlation
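A minimal sketch of that call on the data from the question. It assumes the Statistics and Machine Learning Toolbox is available, since corr belongs to it. The original call that produced the 5x5 output is not shown, so the row-vector case below is only one possible explanation: corr treats each column as a variable, so two 1x5 row vectors are read as five variables with a single observation each.

```matlab
% Data from the question, stored as column vectors (one observation per row).
x = [150; 150; 210; 235; 280];
y = [7.2; 7.1; NaN; 6.1; 5.5];

% 'rows','complete' drops row 3 (the NaN) before computing the correlation,
% so the result is a single scalar.
r = corr(x, y, 'rows', 'complete');

% Assumed cause of the 5x5 NaN output: the same data passed as row vectors.
% Each column then holds one observation, so no correlation can be computed.
rRow = corr(x', y');

disp(r)
disp(size(rRow))
```

If the data are stored as row vectors, transposing them (or using x(:) and y(:)) before calling corr should give a scalar result.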
  1 Comment
Omar Mfarij on 18 Jun 2022
Edited: Omar Mfarij on 18 Jun 2022
Hi Ayush,
Thanks for answering!
I tried this method before, and it returns NaN for everything. That is why it seemed weird to me!
What I need is not to ignore the missing value but to replace it, using the correlation between x and y. I could simply replace the NaN with the mean value and then correlate x and y.
However, I want the correlation to come first and the replacement of the missing value second.



Jeff Miller on 20 Jun 2022
Omar, if I understand what you are trying to do, I would suggest:
  • Form a reduced dataset in which you drop all rows with NaNs. These rows cannot tell you anything about the relation between X and Y, since you don't have Y.
  • Using regress or fitlm, fit a linear model to predict Y from X within this reduced dataset.
  • Going back to the full dataset, compute the predicted Y for each row with a NaN, based on its X and the linear model just fit.
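The three steps above can be sketched as follows, using the data from the question. This is one possible reading of the suggestion rather than a definitive implementation: it assumes fitlm and predict from the Statistics and Machine Learning Toolbox, and the variable names (ok, mdl) are chosen here for illustration.

```matlab
% Data from the question as column vectors.
x = [150; 150; 210; 235; 280];
y = [7.2; 7.1; NaN; 6.1; 5.5];

% Step 1: logical mask of the complete rows (those where y is known).
ok = ~isnan(y);

% Step 2: fit y = b0 + b1*x using only the complete rows.
mdl = fitlm(x(ok), y(ok));

% Step 3: fill each missing y with the model's prediction at its x.
y(~ok) = predict(mdl, x(~ok));

disp([x y])
```

Without the toolbox, polyfit(x(ok), y(ok), 1) followed by polyval should fit the same straight line. Note that the imputed value lies exactly on the fitted line, so it carries no new information about the relation between x and y.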

Release

R2022a