To determine a good lasso-penalty strength for a linear classification model that uses a logistic regression learner, compare k-fold edges.

Load the NLP data set. Preprocess the data as in Estimate k-Fold Cross-Validation Edge.

Create a set of 11 logarithmically spaced regularization strengths from $$10^{-8}$$ through $$10^{1}$$.
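A minimal sketch of this step. It assumes the grid is stored in a variable named `Lambda`, the name the rest of this example uses.

```matlab
% 11 logarithmically spaced values from 10^-8 through 10^1
Lambda = logspace(-8,1,11);
```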

Cross-validate a binary, linear classification model using 5-fold cross-validation, fitting one model per regularization strength in each fold. Optimize the objective function using SpaRSA. Lower the tolerance on the gradient of the objective function to `1e-8`.
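The call for this step is not shown here, so the following is a sketch under stated assumptions rather than the original listing. It assumes that the `nlpdata` sample file is available, that preprocessing means labeling the pages whose class is `'stats'` and putting observations in columns, and that the random seed (chosen arbitrarily here) only serves reproducibility. The name-value arguments are `fitclinear` options that match the description above.

```matlab
% Load the NLP data: X is a sparse predictor matrix, Y is a categorical label vector
load nlpdata
Ystats = Y == 'stats';   % assumed preprocessing: binary response (stats page or not)
X = X';                  % assumed preprocessing: observations in columns

Lambda = logspace(-8,1,11);

rng(10)                  % arbitrary seed, for reproducible fold assignment
% No trailing semicolon, so the model summary is displayed
CVMdl = fitclinear(X,Ystats,'ObservationsIn','columns','KFold',5, ...
    'Learner','logistic','Solver','sparsa','Regularization','lasso', ...
    'Lambda',Lambda,'GradientTolerance',1e-8)
```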

CVMdl =
ClassificationPartitionedLinear
CrossValidatedModel: 'Linear'
ResponseName: 'Y'
NumObservations: 31572
KFold: 5
Partition: [1x1 cvpartition]
ClassNames: [0 1]
ScoreTransform: 'none'

`CVMdl` is a `ClassificationPartitionedLinear` model. Because `fitclinear` implements 5-fold cross-validation, `CVMdl` contains 5 `ClassificationLinear` models that the software trains on each fold.

Estimate the edges for each fold and regularization strength.

eFolds = *5×11*
0.9958 0.9958 0.9958 0.9958 0.9958 0.9924 0.9770 0.9178 0.8452 0.8127 0.8127
0.9991 0.9991 0.9991 0.9991 0.9991 0.9938 0.9780 0.9201 0.8262 0.8128 0.8128
0.9992 0.9992 0.9992 0.9992 0.9992 0.9942 0.9781 0.9135 0.8253 0.8128 0.8128
0.9974 0.9974 0.9974 0.9974 0.9974 0.9931 0.9773 0.9121 0.8410 0.8130 0.8130
0.9976 0.9976 0.9976 0.9976 0.9976 0.9942 0.9782 0.9157 0.8368 0.8127 0.8127

`eFolds` is a 5-by-11 matrix of edges. Rows correspond to folds and columns correspond to regularization strengths in `Lambda`. You can use `eFolds` to identify poorly performing folds, that is, folds with unusually low edges.
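A matrix like this is what `kfoldEdge(CVMdl,'Mode','individual')` returns. The sketch below copies the displayed values so that it runs on its own, then flags folds with unusually low edges. The rule it uses (an edge more than 0.01 below the column median) is a hypothetical threshold chosen for illustration, not part of the original example.

```matlab
% Fold-by-strength edges, copied from the displayed output (rows = folds)
eFolds = [0.9958 0.9958 0.9958 0.9958 0.9958 0.9924 0.9770 0.9178 0.8452 0.8127 0.8127
          0.9991 0.9991 0.9991 0.9991 0.9991 0.9938 0.9780 0.9201 0.8262 0.8128 0.8128
          0.9992 0.9992 0.9992 0.9992 0.9992 0.9942 0.9781 0.9135 0.8253 0.8128 0.8128
          0.9974 0.9974 0.9974 0.9974 0.9974 0.9931 0.9773 0.9121 0.8410 0.8130 0.8130
          0.9976 0.9976 0.9976 0.9976 0.9976 0.9942 0.9782 0.9157 0.8368 0.8127 0.8127];

% Hypothetical rule: flag an entry that falls more than 0.01 below
% the median edge of its column (that is, of its regularization strength)
lowFold = eFolds < median(eFolds,1) - 0.01;

% Row (fold) and column (regularization strength) indices of flagged entries
[foldIdx,lambdaIdx] = find(lowFold);
```

With these values, no fold stands out at the small regularization strengths; the only flagged entries occur at a heavily regularized setting where every fold already performs worse.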

Estimate the average edge over all folds for each regularization strength.

e = *1×11*
0.9978 0.9978 0.9978 0.9978 0.9978 0.9936 0.9777 0.9158 0.8349 0.8128 0.8128

Determine how well the models generalize by plotting the averages of the 5-fold edge for each regularization strength. Identify the regularization strength that maximizes the 5-fold edge over the grid.
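The averaged edges are what `kfoldEdge(CVMdl)` returns by default. As a sketch of the plotting step, the block below reuses the displayed averages so that it runs without the cross-validated model; the log-log axes and the variable names `maxEIdx` and `maxLambda` are illustrative choices.

```matlab
Lambda = logspace(-8,1,11);
% Average 5-fold edges, copied from the displayed output
e = [0.9978 0.9978 0.9978 0.9978 0.9978 0.9936 0.9777 0.9158 0.8349 0.8128 0.8128];

% max returns the first index when several entries tie for the maximum
[~,maxEIdx] = max(e);
maxLambda = Lambda(maxEIdx);

figure
plot(log10(Lambda),log10(e),'-o')                 % edge versus strength
hold on
plot(log10(maxLambda),log10(e(maxEIdx)),'ro')     % mark the maximizer
ylabel('log_{10} 5-fold edge')
xlabel('log_{10} Lambda')
legend('Edge','Max edge')
hold off
```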

Several values of `Lambda` yield similarly high edges. Higher values of `Lambda` lead to predictor variable sparsity, which is a good quality of a classifier.

Choose the regularization strength that occurs just before the edge starts decreasing.
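One way to make "just before the edge starts decreasing" concrete is to take the largest regularization strength whose average edge is still within a small tolerance of the maximum. The tolerance value and the variable names `tol` and `idx` below are assumptions for illustration; the averages are copied from the displayed output.

```matlab
Lambda = logspace(-8,1,11);
e = [0.9978 0.9978 0.9978 0.9978 0.9978 0.9936 0.9777 0.9158 0.8349 0.8128 0.8128];

tol = 1e-4;                                  % hypothetical tolerance on the edge
idx = find(e >= max(e) - tol,1,'last');      % last strength still at the plateau
LambdaFinal = Lambda(idx);
```

For these values the rule selects the fifth element of `Lambda`, the sparsest model that does not give up any edge at the displayed precision.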

Train a linear classification model using the entire data set and specify the regularization strength `LambdaFinal`.

To estimate labels for new observations, pass `MdlFinal` and the new data to `predict`.
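The last two steps can be sketched as follows. The block assumes the same `nlpdata` preprocessing as the cross-validation step (binary `'stats'` response, observations in columns) and assumes that the chosen strength is the fifth element of the grid. `Xnew` is a hypothetical stand-in for new data, built here from the first five training observations only so that the block runs end to end.

```matlab
load nlpdata
Ystats = Y == 'stats';   % assumed preprocessing: binary response
X = X';                  % assumed preprocessing: observations in columns

Lambda = logspace(-8,1,11);
LambdaFinal = Lambda(5); % assumed choice: last strength before the edge drops

% Train on the entire data set with the selected regularization strength
MdlFinal = fitclinear(X,Ystats,'ObservationsIn','columns', ...
    'Learner','logistic','Solver','sparsa','Regularization','lasso', ...
    'Lambda',LambdaFinal);

% Hypothetical "new" data: reuse five training columns as a placeholder
Xnew = X(:,1:5);
label = predict(MdlFinal,Xnew,'ObservationsIn','columns');
```

Because the model was trained with observations in columns, the same orientation is passed to `predict`.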