Load the sample data.

The `robotarm` (pumadyn32nm) dataset is created using a robot arm simulator with 7168 training observations and 1024 test observations, each with 32 features [1], [2]. This is a preprocessed version of the original data set. The data are preprocessed by subtracting off a linear regression fit and then normalizing all features to unit variance.
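A minimal sketch of the loading step. It assumes that `robotarm.mat` is on the MATLAB path and that it defines the variables `Xtrain`, `ytrain`, `Xtest`, and `ytest`; those variable names are an assumption, not stated in the text above.

```matlab
% Load the robot arm data (assumed variables: Xtrain, ytrain, Xtest, ytest).
load('robotarm.mat')

% Check that the sizes match the description above.
size(Xtrain)   % training predictors: observations-by-features
size(Xtest)    % test predictors: observations-by-features
```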

Compute the generalization error without feature selection.
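One way to obtain this baseline is to create the model with `'FitMethod','none'`, which keeps the initial feature weights instead of optimizing them, and then evaluate `loss` on the test set. The sketch below assumes the variable names `Xtrain`, `ytrain`, `Xtest`, and `ytest` from `robotarm.mat`; the `'Standardize'` setting is also an assumption.

```matlab
load('robotarm.mat')   % assumed variables: Xtrain, ytrain, Xtest, ytest

% Create the NCA regression model without fitting the feature weights.
nca = fsrnca(Xtrain, ytrain, 'FitMethod', 'none', 'Standardize', true);

% Prediction loss on the test set, without feature selection.
L = loss(nca, Xtest, ytest)
```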

Now refit the model with feature selection and $$\lambda$$ = 0 (no regularization term), compute the prediction loss, and compare it to the previous loss value to determine whether feature selection seems necessary for this problem. For the settings that you do not change, `refit` uses the settings of the initial model `nca`. For example, it uses the feature weights found in `nca` as the initial feature weights.
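The refit step might look like the following sketch. It rebuilds the unfitted model first so that the snippet runs on its own; the variable names from `robotarm.mat`, the name `nca2`, and the choice of `'FitMethod','exact'` are assumptions.

```matlab
load('robotarm.mat')   % assumed variables: Xtrain, ytrain, Xtest, ytest

% Initial model with no fitting (feature weights are left at their initial values).
nca = fsrnca(Xtrain, ytrain, 'FitMethod', 'none', 'Standardize', true);

% Refit with feature selection turned on and no regularization.
% Settings that are not listed here are inherited from nca.
nca2 = refit(nca, 'FitMethod', 'exact', 'Lambda', 0);

% Prediction loss on the test set, to compare against the loss of nca.
L2 = loss(nca2, Xtest, ytest)
```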

The decrease in the loss suggests that feature selection is necessary.

Plot the feature weights.
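A sketch of the plot for the model refit with $$\lambda$$ = 0. The snippet rebuilds that model so it can run on its own; the variable names and the plot styling are assumptions.

```matlab
load('robotarm.mat')   % assumed variables: Xtrain, ytrain
nca  = fsrnca(Xtrain, ytrain, 'FitMethod', 'none', 'Standardize', true);
nca2 = refit(nca, 'FitMethod', 'exact', 'Lambda', 0);

% One marker per feature; larger weights indicate more relevant features.
figure
plot(nca2.FeatureWeights, 'ro')
xlabel('Feature index')
ylabel('Feature weight')
grid on
```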

Tuning the regularization parameter usually improves the results. Suppose that, after tuning $$\lambda$$ using cross-validation as in Tune Regularization Parameter in NCA for Regression, the best $$\lambda$$ value found is 0.0035. Refit the `nca` model using this $$\lambda$$ value and stochastic gradient descent as the solver. Compute the prediction loss.
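This step could be sketched as follows, refitting the unfitted model `nca` directly as the text describes. The variable names from `robotarm.mat` and the name `nca3` are assumptions. Because the solver is stochastic, the loss can vary slightly between runs.

```matlab
load('robotarm.mat')   % assumed variables: Xtrain, ytrain, Xtest, ytest
nca = fsrnca(Xtrain, ytrain, 'FitMethod', 'none', 'Standardize', true);

% Refit with the tuned regularization parameter and the SGD solver.
nca3 = refit(nca, 'FitMethod', 'exact', 'Lambda', 0.0035, 'Solver', 'sgd');

% Prediction loss on the test set for the regularized model.
L3 = loss(nca3, Xtest, ytest)
```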

Plot the feature weights.
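A sketch of the corresponding plot for the regularized model. It rebuilds the model so the snippet runs on its own; the variable names and the plot styling are assumptions.

```matlab
load('robotarm.mat')   % assumed variables: Xtrain, ytrain
nca  = fsrnca(Xtrain, ytrain, 'FitMethod', 'none', 'Standardize', true);
nca3 = refit(nca, 'FitMethod', 'exact', 'Lambda', 0.0035, 'Solver', 'sgd');

% With regularization, weights of irrelevant features shrink toward zero.
figure
plot(nca3.FeatureWeights, 'ro')
xlabel('Feature index')
ylabel('Feature weight')
grid on
```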

After tuning the regularization parameter, the loss decreased even more and the software identified four of the features as relevant.

**References**

[1] Rasmussen, C. E., R. M. Neal, G. E. Hinton, D. van Camp, M. Revow, Z. Ghahramani, R. Kustra, and R. Tibshirani. The DELVE Manual, 1996, http://mlg.eng.cam.ac.uk/pub/pdf/RasNeaHinetal96.pdf

[2] https://www.cs.toronto.edu/~delve/data/datasets.html