Rank importance of predictors using ReliefF or RReliefF algorithm
[idx,weights] = relieff(X,y,k) ranks predictors using either the ReliefF or RReliefF algorithm with k nearest neighbors. The input matrix X contains the predictor variables, and the vector y contains the responses. The function returns idx, which contains the indices of the most important predictors, and weights, which contains the weights of the predictors.
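For example, a minimal sketch with randomly generated data (the data and the choice of k = 10 are illustrative assumptions, not taken from the documentation):

    rng(0)                           % for reproducibility
    X = rand(100,4);                 % 100 observations of 4 predictors
    y = X(:,2) + 0.1*randn(100,1);   % response driven mainly by the 2nd predictor

    [idx,weights] = relieff(X,y,10); % rank predictors using 10 nearest neighbors

    idx(1)                           % index of the most important predictor
    weights(idx)                     % weights ordered from most to least important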
If y is numeric, relieff performs RReliefF analysis for regression by default. Otherwise, relieff performs ReliefF analysis for classification using k nearest neighbors per class. For more information on ReliefF and RReliefF, see Algorithms.
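As a sketch of both defaults, assuming the fisheriris and carsmall sample data sets that ship with the product are available:

    % Classification: species is a cell array of class labels, so ReliefF is used
    load fisheriris
    [idxC,weightsC] = relieff(meas,species,10);

    % Regression: MPG is numeric, so RReliefF is used by default
    load carsmall
    X = [Acceleration Horsepower Weight];
    [idxR,weightsR] = relieff(X,MPG,10);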
Predictor ranks and weights usually depend on k. If you set k to 1, then the estimates can be unreliable for noisy data. If you set k to a value comparable with the number of observations (rows) in X, relieff can fail to find important predictors. You can start with k = 10 and investigate the stability and reliability of the relieff ranks and weights for various values of k.
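One way to examine this is to recompute the weights for several values of k and compare them. The following sketch uses the fisheriris sample data, and the particular set of k values is an arbitrary choice:

    load fisheriris
    ks = [1 5 10 20 30];
    W = zeros(numel(ks),size(meas,2));
    for i = 1:numel(ks)
        [~,W(i,:)] = relieff(meas,species,ks(i));   % weights for this value of k
    end
    W   % each row holds the predictor weights for one value of k;
        % similar rows and a consistent ranking suggest a stable choice of k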
relieff removes observations with NaN values.
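A small sketch of this behavior with synthetic data (purely illustrative):

    rng(1)
    X = rand(50,3);
    y = X(:,1) + 0.05*randn(50,1);
    X(5,2) = NaN;                      % the 5th observation contains a NaN,
    [idx,weights] = relieff(X,y,10);   % so relieff ignores that row entirely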
See Also: fscmrmr | fscnca | fsrnca | fsulaplacian | knnsearch | pdist2 | plotPartialDependence | sequentialfs