Classification
Conformal prediction methods for classification tasks.
mapie.classification.SplitConformalClassifier
SplitConformalClassifier(
estimator: ClassifierMixin = LogisticRegression(),
confidence_level: Union[float, Iterable[float]] = 0.9,
conformity_score: Union[
str, BaseClassificationScore
] = "lac",
prefit: bool = True,
n_jobs: Optional[int] = None,
verbose: int = 0,
random_state: Optional[Union[int, RandomState]] = None,
)
Computes prediction sets using the split conformal classification technique:
- The `fit` method (optional) fits the base classifier to the training data.
- The `conformalize` method estimates the uncertainty of the base classifier by computing conformity scores on the conformalization set.
- The `predict_set` method predicts labels and sets of labels.
| PARAMETER | DESCRIPTION |
|---|---|
| `estimator` | The base classifier used to predict labels. TYPE: `ClassifierMixin` DEFAULT: `LogisticRegression()` |
| `confidence_level` | The confidence level(s) for the prediction sets, indicating the desired coverage probability of the prediction sets. If a float is provided, it represents a single confidence level. If a list, multiple prediction sets are returned, one for each specified confidence level. TYPE: `Union[float, Iterable[float]]` DEFAULT: `0.9` |
| `conformity_score` | The method used to compute conformity scores. Valid options: `"lac"`, `"top_k"`, `"aps"`, `"raps"`. A custom score function inheriting from `BaseClassificationScore` may also be provided. TYPE: `Union[str, BaseClassificationScore]` DEFAULT: `"lac"` |
| `prefit` | If True, the base classifier must already be fitted, and the `fit` method must be skipped. If False, the base classifier will be fitted during the `fit` method. TYPE: `bool` DEFAULT: `True` |
| `n_jobs` | The number of jobs to run in parallel when applicable. TYPE: `Optional[int]` DEFAULT: `None` |
| `verbose` | Controls the verbosity level. Higher values increase the output details. TYPE: `int` DEFAULT: `0` |
| `random_state` | A seed or random state instance to ensure reproducibility in any random operations within the classifier. TYPE: `Optional[Union[int, RandomState]]` DEFAULT: `None` |
Examples:
>>> from mapie.classification import SplitConformalClassifier
>>> from mapie.utils import train_conformalize_test_split
>>> from sklearn.datasets import make_classification
>>> from sklearn.neighbors import KNeighborsClassifier
>>> X, y = make_classification(n_samples=500)
>>> (
... X_train, X_conformalize, X_test,
... y_train, y_conformalize, y_test
... ) = train_conformalize_test_split(
... X, y, train_size=0.6, conformalize_size=0.2, test_size=0.2, random_state=1
... )
>>> mapie_classifier = SplitConformalClassifier(
... estimator=KNeighborsClassifier(),
... confidence_level=0.95,
... prefit=False,
... ).fit(X_train, y_train).conformalize(X_conformalize, y_conformalize)
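The mechanics behind these calls can be sketched without MAPIE, using plain scikit-learn and the default LAC conformity score (one minus the predicted probability of the true class). This is an illustrative sketch, not MAPIE's implementation; names such as `X_cal` and the 0.9 confidence level are assumptions for the example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_fit, X_rest, y_fit, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

clf = LogisticRegression().fit(X_fit, y_fit)  # "fit" step

# "conformalize" step: LAC score = 1 - predicted probability of the true class
cal_proba = clf.predict_proba(X_cal)
scores = 1 - cal_proba[np.arange(len(y_cal)), y_cal]

# Conformal quantile at confidence_level = 0.9
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * 0.9) / n, method="higher")

# "predict_set" step: include every class whose score is below the threshold
test_proba = clf.predict_proba(X_test)
pred_sets = (1 - test_proba) <= q  # boolean mask, shape (n_test, n_classes)

coverage = pred_sets[np.arange(len(y_test)), y_test].mean()
```

With enough calibration data, `coverage` lands at or above the requested 0.9 level, which is the split conformal guarantee.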
Source code in mapie/classification.py
fit
fit(
X_train: ArrayLike,
y_train: ArrayLike,
fit_params: Optional[dict] = None,
) -> SplitConformalClassifier
Fits the base classifier to the training data.
| PARAMETER | DESCRIPTION |
|---|---|
| `X_train` | Training data features. TYPE: `ArrayLike` |
| `y_train` | Training data targets. TYPE: `ArrayLike` |
| `fit_params` | Parameters to pass to the `fit` method of the base classifier. TYPE: `Optional[dict]` DEFAULT: `None` |
| RETURNS | DESCRIPTION |
|---|---|
| `Self` | The fitted SplitConformalClassifier instance. |
conformalize
conformalize(
X_conformalize: ArrayLike,
y_conformalize: ArrayLike,
predict_params: Optional[dict] = None,
) -> SplitConformalClassifier
Estimates the uncertainty of the base classifier by computing conformity scores on the conformalization set.
| PARAMETER | DESCRIPTION |
|---|---|
| `X_conformalize` | Features of the conformalization set. TYPE: `ArrayLike` |
| `y_conformalize` | Targets of the conformalization set. TYPE: `ArrayLike` |
| `predict_params` | Parameters to pass to the `predict` method of the base classifier. TYPE: `Optional[dict]` DEFAULT: `None` |
| RETURNS | DESCRIPTION |
|---|---|
| `Self` | The conformalized SplitConformalClassifier instance. |
predict_set
predict_set(
X: ArrayLike,
conformity_score_params: Optional[dict] = None,
) -> Tuple[NDArray, NDArray]
For each sample in X, predicts a label (using the base classifier), and a set of labels.
If several confidence levels were provided during initialisation, several sets will be predicted for each sample. See the return signature.
| PARAMETER | DESCRIPTION |
|---|---|
| `X` | Features. TYPE: `ArrayLike` |
| `conformity_score_params` | Parameters specific to conformity scores, used at prediction time. The only example for now is `include_last_label`. TYPE: `Optional[dict]` DEFAULT: `None` |
| RETURNS | DESCRIPTION |
|---|---|
| `Tuple[NDArray, NDArray]` | Two arrays: the predicted labels, of shape `(n_samples,)`, and the prediction sets, a boolean mask of shape `(n_samples, n_classes, n_confidence_levels)`. |
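Because the second returned array is a boolean mask, common diagnostics such as set size and empirical coverage reduce to simple indexing. The mask below is hand-built for illustration, shaped like `predict_set`'s output with a single confidence level:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 6, 3
y_true = rng.integers(0, n_classes, size=n_samples)

# Hand-built mask shaped like predict_set's second output:
# (n_samples, n_classes, n_confidence_levels), with one confidence level here
pred_sets = np.zeros((n_samples, n_classes, 1), dtype=bool)
pred_sets[np.arange(n_samples), y_true, 0] = True  # true label included everywhere
pred_sets[0, :, 0] = True                          # one ambiguous sample: full set

set_sizes = pred_sets[:, :, 0].sum(axis=1)  # number of labels per prediction set
coverage = pred_sets[np.arange(n_samples), y_true, 0].mean()
```

Larger sets signal samples the classifier is less sure about; coverage is the fraction of samples whose true label falls inside the set.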
predict
predict(
X: ArrayLike,
) -> NDArray
For each sample in X, returns the label predicted by the base classifier.
| PARAMETER | DESCRIPTION |
|---|---|
| `X` | Features. TYPE: `ArrayLike` |
| RETURNS | DESCRIPTION |
|---|---|
| `NDArray` | Array of predicted labels, with shape `(n_samples,)`. |
mapie.classification.CrossConformalClassifier
CrossConformalClassifier(
estimator: ClassifierMixin = LogisticRegression(),
confidence_level: Union[float, Iterable[float]] = 0.9,
conformity_score: Union[
str, BaseClassificationScore
] = "lac",
cv: Union[int, BaseCrossValidator] = 5,
n_jobs: Optional[int] = None,
verbose: int = 0,
random_state: Optional[Union[int, RandomState]] = None,
)
Computes prediction sets using the cross conformal classification technique:
- The `fit_conformalize` method estimates the uncertainty of the base classifier in a cross-validation style. It fits the base classifier on folds of the dataset and computes conformity scores on the out-of-fold data.
- The `predict_set` method predicts labels and sets of labels.
| PARAMETER | DESCRIPTION |
|---|---|
| `estimator` | The base classifier used to predict labels. TYPE: `ClassifierMixin` DEFAULT: `LogisticRegression()` |
| `confidence_level` | The confidence level(s) for the prediction sets, indicating the desired coverage probability of the prediction sets. If a float is provided, it represents a single confidence level. If a list, multiple prediction sets are returned, one for each specified confidence level. TYPE: `Union[float, Iterable[float]]` DEFAULT: `0.9` |
| `conformity_score` | The method used to compute conformity scores. Valid options: `"lac"`, `"aps"`. A custom score function inheriting from `BaseClassificationScore` may also be provided. TYPE: `Union[str, BaseClassificationScore]` DEFAULT: `"lac"` |
| `cv` | The cross-validator used to compute conformity scores. Valid options: an integer, to use `KFold` with that number of folds; any scikit-learn `BaseCrossValidator` instance. Main variants in the cross conformal setting are: `cv=5` or `cv=10` for the standard cross conformal method, and leave-one-out cross-validation for the jackknife method. TYPE: `Union[int, BaseCrossValidator]` DEFAULT: `5` |
| `n_jobs` | The number of jobs to run in parallel when applicable. TYPE: `Optional[int]` DEFAULT: `None` |
| `verbose` | Controls the verbosity level. Higher values increase the output details. TYPE: `int` DEFAULT: `0` |
| `random_state` | A seed or random state instance to ensure reproducibility in any random operations within the classifier. TYPE: `Optional[Union[int, RandomState]]` DEFAULT: `None` |
Examples:
>>> from mapie.classification import CrossConformalClassifier
>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.neighbors import KNeighborsClassifier
>>> X_full, y_full = make_classification(n_samples=500)
>>> X, X_test, y, y_test = train_test_split(X_full, y_full)
>>> mapie_classifier = CrossConformalClassifier(
... estimator=KNeighborsClassifier(),
... confidence_level=0.95,
... cv=10
... ).fit_conformalize(X, y)
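The cross-validation logic behind `fit_conformalize` can be approximated with plain scikit-learn: score each training sample with the fold model that never saw it, then threshold test scores against the conformal quantile. This is a simplified sketch, not MAPIE's implementation; in particular, MAPIE keeps the per-fold models and aggregates their test scores, whereas this sketch refits one model on all the data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

X_full, y_full = make_classification(n_samples=500, random_state=0)
X, X_test, y, y_test = train_test_split(X_full, y_full, random_state=0)

# Out-of-fold probabilities: each sample is scored by a model that never saw it
oof_proba = cross_val_predict(
    LogisticRegression(), X, y, cv=5, method="predict_proba"
)
scores = 1 - oof_proba[np.arange(len(y)), y]  # LAC conformity scores

# Conformal quantile at confidence_level = 0.95
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * 0.95) / n, method="higher")

# Simplification: one model refit on all the data, instead of per-fold models
clf = LogisticRegression().fit(X, y)
pred_sets = (1 - clf.predict_proba(X_test)) <= q
coverage = pred_sets[np.arange(len(y_test)), y_test].mean()
```

Unlike the split method, no separate conformalization set is held out: every training sample contributes a conformity score, at the cost of fitting the base classifier once per fold.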
fit_conformalize
fit_conformalize(
X: ArrayLike,
y: ArrayLike,
groups: Optional[ArrayLike] = None,
fit_params: Optional[dict] = None,
predict_params: Optional[dict] = None,
) -> CrossConformalClassifier
Estimates the uncertainty of the base classifier in a cross-validation style: fits the base classifier on different folds of the dataset and computes conformity scores on the corresponding out-of-fold data.
| PARAMETER | DESCRIPTION |
|---|---|
| `X` | Features. TYPE: `ArrayLike` |
| `y` | Targets. TYPE: `ArrayLike` |
| `groups` | Groups to pass to the cross-validator. TYPE: `Optional[ArrayLike]` DEFAULT: `None` |
| `fit_params` | Parameters to pass to the `fit` method of the base classifier. TYPE: `Optional[dict]` DEFAULT: `None` |
| `predict_params` | Parameters to pass to the `predict` method of the base classifier. TYPE: `Optional[dict]` DEFAULT: `None` |
| RETURNS | DESCRIPTION |
|---|---|
| `Self` | This CrossConformalClassifier instance, fitted and conformalized. |
predict_set
predict_set(
X: ArrayLike,
conformity_score_params: Optional[dict] = None,
agg_scores: str = "mean",
) -> Tuple[NDArray, NDArray]
For each sample in X, predicts a label (using the base classifier), and a set of labels.
If several confidence levels were provided during initialisation, several sets will be predicted for each sample. See the return signature.
| PARAMETER | DESCRIPTION |
|---|---|
| `X` | Features. TYPE: `ArrayLike` |
| `conformity_score_params` | Parameters specific to conformity scores, used at prediction time. The only example for now is `include_last_label`. TYPE: `Optional[dict]` DEFAULT: `None` |
| `agg_scores` | How to aggregate conformity scores. Each classifier fitted on a different fold of the dataset is used to produce conformity scores on the test data, and `agg_scores` controls how those scores are aggregated. Valid options: `"mean"`, `"crossval"`. TYPE: `str` DEFAULT: `"mean"` |
| RETURNS | DESCRIPTION |
|---|---|
| `Tuple[NDArray, NDArray]` | Two arrays: the predicted labels, of shape `(n_samples,)`, and the prediction sets, a boolean mask of shape `(n_samples, n_classes, n_confidence_levels)`. |
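The effect of `agg_scores="mean"` can be illustrated on hypothetical numbers: each fold model produces a conformity score per test sample, and the scores are averaged before being compared to the threshold. The score values and the threshold below are made up for illustration:

```python
import numpy as np

# Hypothetical conformity scores for 3 test samples from 5 fold models
fold_scores = np.array([
    [0.10, 0.40, 0.80],
    [0.12, 0.38, 0.75],
    [0.08, 0.45, 0.82],
    [0.11, 0.41, 0.79],
    [0.09, 0.39, 0.81],
])

# agg_scores="mean": average across fold models, then apply one threshold
mean_scores = fold_scores.mean(axis=0)
threshold = 0.5                       # stands in for the conformal quantile
in_set = mean_scores <= threshold     # label-inclusion decision per sample
```

Averaging smooths disagreement between fold models; the `"crossval"` option instead lets each fold model vote with its own scores.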
predict
predict(
X: ArrayLike,
) -> NDArray
For each sample in X, returns the label predicted by the base classifier.
| PARAMETER | DESCRIPTION |
|---|---|
| `X` | Features. TYPE: `ArrayLike` |
| RETURNS | DESCRIPTION |
|---|---|
| `NDArray` | Array of predicted labels, with shape `(n_samples,)`. |