To determine a good lasso-penalty strength for a linear classification model that uses a logistic regression learner, compare distributions of test-sample margins.

Load the NLP data set. Preprocess the data as in Estimate Test-Sample Margins.
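The preprocessing itself is described in the referenced example rather than here, so the following is only a sketch of what it might look like. It assumes that the `nlpdata` sample file supplies a sparse predictor matrix `X` and a categorical label vector `Y`, that the binary response marks the `'stats'` class, and that 30% of the observations are held out for testing. The variable names `Ystats` and `Partition`, the seed, and the holdout fraction are all assumptions.

```matlab
% Sketch only: class choice, seed, and holdout fraction are assumptions.
load nlpdata                 % assumed to provide sparse X and categorical Y
Ystats = Y == 'stats';       % logical response, so the class names are [0 1]
X = X';                      % put observations in columns

rng(10);                     % arbitrary seed, for reproducibility
Partition = cvpartition(Ystats,'Holdout',0.30);  % assumed 30% holdout
```

Orienting the observations as columns lets later calls pass `'ObservationsIn','columns'`.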

Create a set of 11 logarithmically spaced regularization strengths from $$10^{-8}$$ through $$10^{1}$$.
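One way to build such a grid is with `logspace`, which takes the exponents of the end points and the number of points. The variable name `Lambda` is chosen here to match the model property of the same name.

```matlab
% 11 logarithmically spaced values from 10^-8 through 10^1
Lambda = logspace(-8,1,11);
```

With 11 points spanning 9 decades, adjacent strengths differ by a constant factor of $$10^{0.9}$$.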

Train binary, linear classification models that use each of the regularization strengths. Optimize the objective function using SpaRSA. Lower the tolerance on the gradient of the objective function to `1e-8`.
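A call along the following lines would carry out this step with `fitclinear`. It is a sketch that assumes the workspace already holds the predictors `X` (observations in columns), the logical response `Ystats`, a holdout partition `Partition`, and the regularization grid `Lambda`. Those names are assumptions.

```matlab
% Assumes X, Ystats, Partition, and Lambda already exist in the workspace.
rng(10);  % arbitrary seed, for reproducibility
CVMdl = fitclinear(X,Ystats,'ObservationsIn','columns', ...
    'CVPartition',Partition, ...      % train on the partition's training set
    'Learner','logistic', ...         % logistic regression learner
    'Solver','sparsa', ...            % SpaRSA solver
    'Regularization','lasso', ...     % lasso (L1) penalty
    'Lambda',Lambda, ...              % one model per regularization strength
    'GradientTolerance',1e-8)         % tighter gradient tolerance
```

The final semicolon is omitted so that MATLAB displays the cross-validated model.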

    CVMdl =
      ClassificationPartitionedLinear
        CrossValidatedModel: 'Linear'
               ResponseName: 'Y'
            NumObservations: 31572
                      KFold: 1
                  Partition: [1x1 cvpartition]
                 ClassNames: [0 1]
             ScoreTransform: 'none'

      Properties, Methods

Extract the trained linear classification model.
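Because the partition is a single holdout split, the `Trained` property of the cross-validated model should contain exactly one model. A minimal sketch, assuming the cross-validated model is stored in `CVMdl`:

```matlab
% Trained is a cell array with one entry per training fold;
% a holdout partition has a single fold.
Mdl = CVMdl.Trained{1}
```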

    Mdl =
      ClassificationLinear
          ResponseName: 'Y'
            ClassNames: [0 1]
        ScoreTransform: 'logit'
                  Beta: [34023x11 double]
                  Bias: [1x11 double]
                Lambda: [1x11 double]
               Learner: 'logistic'

      Properties, Methods

`Mdl` is a `ClassificationLinear` model object. Because `Lambda` is a sequence of regularization strengths, you can think of `Mdl` as 11 models, one for each regularization strength in `Lambda`.

Estimate the test-sample margins.
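The margins can be computed with the `margin` function on the held-out observations. This sketch assumes that `X` stores observations in columns, that `Ystats` is the logical response, and that `CVMdl` and `Mdl` are the cross-validated and extracted models. It recovers the test indices from the partition stored in `CVMdl`.

```matlab
% Assumes X, Ystats, CVMdl, and Mdl already exist in the workspace.
testIdx = test(CVMdl.Partition);      % logical index of held-out observations
m = margin(Mdl,X(:,testIdx),Ystats(testIdx), ...
    'ObservationsIn','columns');      % one column per regularization strength
size(m)
```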

Because there are 11 regularization strengths, `m` has 11 columns.

Plot the test-sample margins for each regularization strength. Because logistic regression scores are in [0,1], margins are in [-1,1]. Rescale the margins to help identify the regularization strength that maximizes the margins over the grid.
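One possible rescaling is to exponentiate the margins with a large base, which stretches the region near 1 where most of the margins sit. The sketch below assumes `m` is the matrix of test-sample margins with one column per regularization strength. The base 10000 is taken from the discussion of the plot, and the axis labels are illustrative.

```matlab
% Assumes m holds the test-sample margins, one column per Lambda value.
figure;
boxplot(10000.^m)                     % maps margins from [-1,1] to [1e-4,1e4]
ylabel('Exponentiated test-sample margins')
xlabel('Lambda indices')
```

Because exponentiation is monotonic, the ordering of the margins is preserved; only the spacing changes.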

Several values of `Lambda` yield margin distributions that are compacted near $$10000^{1}$$. Higher values of lambda lead to predictor variable sparsity, which is a good quality of a classifier.

Choose the regularization strength that occurs just before the centers of the margin distributions start decreasing.

Train a linear classification model using the entire data set and specify the desired regularization strength.
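The selection and the final fit might look like the sketch below. The index `idx` is a hypothetical placeholder: the right value has to be read off the box plot, and 5 is used here only for illustration. The sketch also assumes `X`, `Ystats`, and `Lambda` exist in the workspace and reuses the learner, solver, and penalty from the cross-validated fit.

```matlab
% idx is a hypothetical placeholder; choose it by inspecting the box plot.
idx = 5;
LambdaFinal = Lambda(idx);

% Refit on all observations with the single chosen strength.
MdlFinal = fitclinear(X,Ystats,'ObservationsIn','columns', ...
    'Learner','logistic','Solver','sparsa', ...
    'Regularization','lasso','Lambda',LambdaFinal);
```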

To estimate labels for new observations, pass `MdlFinal` and the new data to `predict`.
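As a brief illustration, the sketch below uses a few columns of the training predictors as a stand-in for new data. `XNew` is a hypothetical name, and real new observations would need the same predictors in the same order as `X`.

```matlab
% XNew is a stand-in for new data; observations are stored in columns.
XNew = X(:,1:5);
labels = predict(MdlFinal,XNew,'ObservationsIn','columns')
```

If the new data stores observations in rows instead, omit the `'ObservationsIn'` argument.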