class lightning.classification.CDClassifier(loss='squared_hinge', penalty='l2', multiclass=False, C=1.0, alpha=1.0, max_iter=50, tol=0.001, termination='violation_sum', shrinking=True, max_steps='auto', sigma=0.01, beta=0.5, warm_start=False, debiasing=False, Cd=1.0, warm_debiasing=False, selection='cyclic', permute=True, callback=None, n_calls=100, random_state=None, verbose=0, n_jobs=1)

Estimator for learning linear classifiers by (block) coordinate descent.

The objective functions considered take the form

minimize F(W) = C * L(W) + alpha * R(W),

where L(W) is a loss term and R(W) is a penalty term.
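As a rough illustration of how the two terms combine, the sketch below evaluates F for a single weight vector in the binary case, using a squared hinge loss and an l2 penalty. The helper names, the labels in {-1, +1}, and the 1/2 factor in the penalty are assumptions of this sketch and are not taken from the library.

```python
def squared_hinge_loss(w, X, y):
    """Sum over samples of max(0, 1 - y_i * <w, x_i>)^2, with y_i in {-1, +1}."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(w_j * x_ij for w_j, x_ij in zip(w, x_i))
        total += max(0.0, 1.0 - margin) ** 2
    return total


def l2_penalty(w):
    """Half the squared Euclidean norm (the 1/2 factor is an assumption)."""
    return 0.5 * sum(w_j ** 2 for w_j in w)


def objective(w, X, y, C=1.0, alpha=1.0):
    """F(w) = C * L(w) + alpha * R(w)."""
    return C * squared_hinge_loss(w, X, y) + alpha * l2_penalty(w)


# Two toy samples with two features each.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, -1]
w = [0.5, 0.5]
value = objective(w, X, y, C=1.0, alpha=1.0)
```

Raising C puts more weight on fitting the training data, while raising alpha puts more weight on the penalty.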

  • loss (str, 'squared_hinge', 'log', 'modified_huber', 'squared') – The loss function to be used.

  • penalty (str, 'l2', 'l1', 'l1/l2') –

    The penalty to be used.

    • l2: ridge

    • l1: lasso

    • l1/l2: group lasso

  • multiclass (bool) – Whether to use a direct multiclass formulation (True) or one-vs-rest (False). Direct formulations are only available for loss='squared_hinge' and loss='log'.

  • C (float) – Weight of the loss term.

  • alpha (float) – Weight of the penalty term.

  • max_iter (int) – Maximum number of iterations to perform.

  • tol (float) – Tolerance of the stopping criterion.

  • termination (str, 'violation_sum', 'violation_max') – Stopping criterion to use.

  • shrinking (bool) – Whether to activate shrinking or not.

  • max_steps (int or "auto") – Maximum number of steps to use during the line search. Use max_steps=0 to use a constant step size instead of the line search. Use max_steps="auto" to let CDClassifier choose the best value.

  • sigma (float) – Constant used in the line search sufficient decrease condition.

  • beta (float) – Multiplicative constant used in the backtracking line search.

  • warm_start (bool) – Whether to activate warm-start or not.

  • debiasing (bool) – Whether to refit the model using l2 penalty (only useful if penalty='l1' or penalty='l1/l2').

  • Cd (float) – Value of C when doing debiasing.

  • warm_debiasing (bool) – Whether to warm-start the model or not when doing debiasing.

  • selection (str, 'cyclic', 'uniform') – Strategy to use for selecting coordinates.

  • permute (bool) – Whether to permute coordinates or not before cycling (only when selection='cyclic').

  • callback (callable) – Callback function.

  • n_calls (int) – Frequency with which callback must be called.

  • random_state (RandomState or int) – The seed of the pseudo-random number generator to use.

  • verbose (int) – Verbosity level.

  • n_jobs (int) – Number of CPUs to use when multiclass=False and the penalty is not a group-lasso penalty. Defaults to one CPU. If set to -1, all CPUs are used.
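To make the roles of sigma, beta and max_steps more concrete, here is a minimal sketch of a generic backtracking line search with a sufficient decrease (Armijo) condition. It is a textbook version under stated assumptions, not necessarily the exact rule CDClassifier applies internally; the function name and its arguments are hypothetical.

```python
def backtracking_step(f, x, direction, slope, sigma=0.01, beta=0.5, max_steps=30):
    """Shrink the step size until the sufficient decrease condition holds.

    slope is the directional derivative f'(x) * direction, which is
    negative for a descent direction. With max_steps=0 the loop is
    skipped and the initial (constant) step size of 1.0 is returned.
    """
    step = 1.0
    f_x = f(x)
    for _ in range(max_steps):
        # Sufficient decrease: the new value must improve on f(x) by at
        # least a fraction sigma of the decrease predicted by the slope.
        if f(x + step * direction) <= f_x + sigma * step * slope:
            break
        step *= beta  # otherwise, multiply the step by beta and retry
    return step


def f(x):
    return x ** 2


x = 1.0
gradient = 2.0 * x
direction = -gradient          # steepest descent direction
slope = gradient * direction   # directional derivative, here -4.0

step = backtracking_step(f, x, direction, slope)
constant_step = backtracking_step(f, x, direction, slope, max_steps=0)
```

In this toy problem the full step overshoots the minimum, so the search halves it once; a smaller beta shrinks the step more aggressively, and a larger sigma demands a larger decrease before a step is accepted.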


The following example demonstrates how to learn a classification model with a multiclass squared hinge loss and an l1/l2 penalty.

>>> from sklearn.datasets import fetch_20newsgroups_vectorized
>>> from lightning.classification import CDClassifier
>>> bunch = fetch_20newsgroups_vectorized(subset="all")
>>> X, y = bunch.data, bunch.target
>>> clf = CDClassifier(penalty="l1/l2",
...                    loss="squared_hinge",
...                    multiclass=True,
...                    C=1.0 / X.shape[0],
...                    random_state=0).fit(X, y)
>>> accuracy = clf.score(X, y)


Block Coordinate Descent Algorithms for Large-scale Sparse Multiclass Classification. Mathieu Blondel, Kazuhiro Seki, and Kuniaki Uehara. Machine Learning, May 2013.

fit(X, y)

Fit model according to X and y.

  • X (array-like, shape = [n_samples, n_features]) – Training vectors, where n_samples is the number of samples and n_features is the number of features.

  • y (array-like, shape = [n_samples]) – Target values.


self – Returns self.

Return type



get_params(deep=True)

Get parameters for this estimator.


deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.


params – Parameter names mapped to their values.

Return type

dict


property predict_proba

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, a harsh metric because a sample counts as correct only if its entire label set is predicted correctly.

  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.

  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.


score – Mean accuracy of self.predict(X) with respect to y.

Return type

float
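The quantity returned by score can be sketched in a few lines of plain Python. The helper below is a hypothetical stand-in written for illustration, not the library's code; it assumes single-label targets and treats a missing sample_weight as uniform weights.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of samples whose predicted label equals the true label."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    # Sum the weights of the correctly classified samples.
    correct = sum(
        w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p
    )
    return correct / sum(sample_weight)


y_true = [0, 1, 2, 1]
y_pred = [0, 1, 1, 1]

unweighted = mean_accuracy(y_true, y_pred)
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 0.0])
```

Here the third sample is misclassified, so giving it a larger weight lowers the score, while a zero weight on the last sample removes it from the average.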



set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.


**params (dict) – Estimator parameters.


self – Estimator instance.

Return type

estimator instance
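The <component>__<parameter> convention can be illustrated with a small scikit-learn pipeline. This sketch uses LogisticRegression as a stand-in final step so that it runs with scikit-learn alone; the step names "scale" and "clf" are arbitrary choices made for this example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: each step is addressable by its name.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(C=1.0)),
])

# Update the C parameter of the "clf" step through the pipeline.
returned = pipe.set_params(clf__C=0.1)

# get_params(deep=True) exposes nested parameters under the same names.
params = pipe.get_params(deep=True)
nested_C = params["clf__C"]
```

Because set_params returns the estimator itself, calls can be chained, for example pipe.set_params(clf__C=0.1).fit(X, y).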