Skip to content

Conformity Scores

Conformity score classes for regression and classification.

Regression

mapie.conformity_scores.BaseRegressionScore

BaseRegressionScore(
    sym: bool,
    consistency_check: bool = True,
    eps: float = float(EPSILON),
)

Bases: BaseConformityScore

Base conformity score class for regression task.

This class should not be used directly. Use derived classes instead.

PARAMETER DESCRIPTION
sym

Whether to consider the conformity score as symmetrical or not.

TYPE: bool

consistency_check

Whether to check the consistency between the methods get_estimation_distribution and get_conformity_scores. If True, the following equality must be verified::

y == self.get_estimation_distribution(
    y_pred,
    self.get_conformity_scores(y, y_pred, **kwargs),
    **kwargs)

By default True.

TYPE: bool DEFAULT: True

eps

Threshold to consider when checking the consistency between get_estimation_distribution and get_conformity_scores. It should be specified if consistency_check==True.

By default, it is defined by the default precision.

TYPE: float DEFAULT: float(EPSILON)

Source code in mapie/conformity_scores/regression.py
def __init__(
    self,
    sym: bool,
    consistency_check: bool = True,
    eps: float = float(EPSILON),
):
    """
    Initialize the regression conformity score.

    Parameters
    ----------
    sym: bool
        Whether to consider the conformity score as symmetrical or not.

    consistency_check: bool
        Whether to check the consistency between the methods
        `get_estimation_distribution` and `get_conformity_scores`.

        By default True.

    eps: float
        Tolerance used when checking the consistency between
        `get_estimation_distribution` and `get_conformity_scores`.

        By default, the default precision (EPSILON).
    """
    super().__init__()
    self.sym = sym
    self.consistency_check = consistency_check
    self.eps = eps

get_signed_conformity_scores abstractmethod

get_signed_conformity_scores(
    y: NDArray, y_pred: NDArray, **kwargs
) -> NDArray

Placeholder for get_conformity_scores. Subclasses should implement this method!

Compute the sample conformity scores given the predicted and observed targets.

PARAMETER DESCRIPTION
y

Observed target values.

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Signed conformity scores.

Source code in mapie/conformity_scores/regression.py
@abstractmethod
def get_signed_conformity_scores(
    self, y: NDArray, y_pred: NDArray, **kwargs
) -> NDArray:
    """
    Placeholder for `get_signed_conformity_scores`.
    Subclasses should implement this method!

    Compute the sample conformity scores given the predicted and
    observed targets.

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values.

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    Returns
    -------
    NDArray of shape (n_samples,)
        Signed conformity scores.
    """

get_conformity_scores

get_conformity_scores(
    y: NDArray, y_pred: NDArray, **kwargs
) -> NDArray

Get the conformity score considering the symmetrical property if so.

PARAMETER DESCRIPTION
y

Observed target values.

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Conformity scores.

Source code in mapie/conformity_scores/regression.py
def get_conformity_scores(self, y: NDArray, y_pred: NDArray, **kwargs) -> NDArray:
    """
    Compute the conformity scores, applying the symmetry property if set.

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values.

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    Returns
    -------
    NDArray of shape (n_samples,)
        Conformity scores.
    """
    scores = self.get_signed_conformity_scores(y, y_pred, **kwargs)

    # Optionally verify the round-trip with the estimation distribution
    # before any symmetrisation is applied.
    if self.consistency_check:
        self.check_consistency(y, y_pred, scores, **kwargs)

    # A symmetrical score discards the sign information.
    return np.abs(scores) if self.sym else scores

check_consistency

check_consistency(
    y: NDArray,
    y_pred: NDArray,
    conformity_scores: NDArray,
    **kwargs,
) -> None

Check consistency between the following methods: get_estimation_distribution and get_signed_conformity_scores

The following equality should be verified::

y == self.get_estimation_distribution(
    y_pred,
    self.get_conformity_scores(y, y_pred, **kwargs),
    **kwargs)
PARAMETER DESCRIPTION
y

Observed target values.

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

conformity_scores

Conformity scores.

TYPE: NDArray

RAISES DESCRIPTION
ValueError

If the two methods are not consistent.

Source code in mapie/conformity_scores/regression.py
def check_consistency(
    self, y: NDArray, y_pred: NDArray, conformity_scores: NDArray, **kwargs
) -> None:
    """
    Verify that `get_estimation_distribution` inverts
    `get_signed_conformity_scores`.

    The following equality should be verified::

        y == self.get_estimation_distribution(
            y_pred,
            self.get_conformity_scores(y, y_pred, **kwargs),
            **kwargs)

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values.

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores.

    Raises
    ------
    ValueError
        If the two methods are not consistent.
    """
    reconstructed = self.get_estimation_distribution(
        y_pred, conformity_scores, **kwargs
    )
    # Largest absolute reconstruction error over all samples.
    max_conf_score: float = np.max(np.abs(np.subtract(reconstructed, y)))
    if max_conf_score > self.eps:
        raise ValueError(
            "The two functions get_conformity_scores and "
            "get_estimation_distribution of the BaseRegressionScore class "
            "are not consistent. "
            "The following equation must be verified: "
            "self.get_estimation_distribution(y_pred, "
            "self.get_conformity_scores(y, y_pred)) == y. "
            f"The maximum conformity score is {max_conf_score}. "
            "The eps attribute may need to be increased if you are "
            "sure that the two methods are consistent."
        )

get_estimation_distribution abstractmethod

get_estimation_distribution(
    y_pred: NDArray, conformity_scores: NDArray, **kwargs
) -> NDArray

Placeholder for get_estimation_distribution. Subclasses should implement this method!

Compute samples of the estimation distribution given the predicted targets and the conformity scores.

PARAMETER DESCRIPTION
y_pred

Predicted target values.

TYPE: NDArray

conformity_scores

Conformity scores.

TYPE: NDArray

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Observed values.

Source code in mapie/conformity_scores/regression.py
@abstractmethod
def get_estimation_distribution(
    self, y_pred: NDArray, conformity_scores: NDArray, **kwargs
) -> NDArray:
    """
    Placeholder for `get_estimation_distribution`.
    Subclasses should implement this method!

    Compute samples of the estimation distribution given the predicted
    targets and the conformity scores (the inverse mapping of
    `get_signed_conformity_scores`).

    Parameters
    ----------
    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores.

    Returns
    -------
    NDArray of shape (n_samples,)
        Estimation of the observed values.
    """

get_bounds

get_bounds(
    X: NDArray,
    alpha_np: NDArray,
    estimator: EnsembleRegressor,
    conformity_scores: NDArray,
    ensemble: bool = False,
    method: str = "base",
    optimize_beta: bool = False,
    allow_infinite_bounds: bool = False,
) -> Tuple[NDArray, NDArray, NDArray]

Compute bounds of the prediction intervals from the observed values, the estimator of type EnsembleRegressor and the conformity scores.

PARAMETER DESCRIPTION
X

Observed feature values.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, represents the uncertainty of the confidence interval.

TYPE: NDArray

estimator

Estimator that is fitted to predict y from X.

TYPE: EnsembleRegressor

conformity_scores

Conformity scores.

TYPE: NDArray

ensemble

Boolean determining whether the predictions are ensembled or not.

By default False.

TYPE: bool DEFAULT: False

method

Method to choose for prediction interval estimates. The "plus" method implies that the quantile is calculated after estimating the bounds, whereas the other methods (among the "naive", "base" or "minmax" methods, for example) do the opposite.

By default base.

TYPE: str DEFAULT: 'base'

optimize_beta

Whether to optimize the PIs' width or not.

By default False.

TYPE: bool DEFAULT: False

allow_infinite_bounds

Allow infinite prediction intervals to be produced.

By default False.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
Tuple[NDArray, NDArray, NDArray]
  • The predictions themselves (y_pred), of shape (n_samples,).
  • The lower bounds of the prediction intervals of shape (n_samples, n_alpha).
  • The upper bounds of the prediction intervals of shape (n_samples, n_alpha).
RAISES DESCRIPTION
ValueError

If beta optimisation with symmetrical conformity score function.

Source code in mapie/conformity_scores/regression.py
def get_bounds(
    self,
    X: NDArray,
    alpha_np: NDArray,
    estimator: EnsembleRegressor,
    conformity_scores: NDArray,
    ensemble: bool = False,
    method: str = "base",
    optimize_beta: bool = False,
    allow_infinite_bounds: bool = False,
) -> Tuple[NDArray, NDArray, NDArray]:
    """
    Compute bounds of the prediction intervals from the observed values,
    the estimator of type `EnsembleRegressor` and the conformity scores.

    Parameters
    ----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between `0` and `1`, represents the
        uncertainty of the confidence interval.

    estimator: EnsembleRegressor
        Estimator that is fitted to predict y from X.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores.

    ensemble: bool
        Boolean determining whether the predictions are ensembled or not.

        By default `False`.

    method: str
        Method to choose for prediction interval estimates.
        The `"plus"` method implies that the quantile is calculated
        after estimating the bounds, whereas the other methods
        (among the `"naive"`, `"base"` or `"minmax"` methods,
        for example) do the opposite.

        By default `base`.

    optimize_beta: bool
        Whether to optimize the PIs' width or not.

        By default `False`.

    allow_infinite_bounds: bool
        Allow infinite prediction intervals to be produced.

        By default `False`.

    Returns
    -------
    Tuple[NDArray, NDArray, NDArray]
        - The predictions themselves (y_pred) of shape (n_samples,).
        - The lower bounds of the prediction intervals of shape
          (n_samples, n_alpha).
        - The upper bounds of the prediction intervals of shape
          (n_samples, n_alpha).

    Raises
    ------
    ValueError
        If beta optimisation with symmetrical conformity score function.
    """
    # Width optimization splits alpha unevenly between two tails; a
    # symmetrical score has a single (absolute) tail, so the combination
    # is rejected up front.
    if self.sym and optimize_beta:
        raise ValueError(
            "Interval width minimization cannot be used with a "
            + "symmetrical conformity score function."
        )

    y_pred, y_pred_low, y_pred_up = estimator.predict(X, ensemble)
    # Symmetrical scores are non-negative (see `get_conformity_scores`),
    # so the lower tail is obtained by negating them.
    signed = -1 if self.sym else 1

    if optimize_beta:
        # Search for the tail split minimizing the interval width.
        beta_np = self._beta_optimize(
            alpha_np,
            conformity_scores.reshape(1, -1),
            conformity_scores.reshape(1, -1),
        )
    else:
        # Even split of the miscoverage level between the two tails.
        beta_np = alpha_np / 2

    if method == "plus":
        # "plus": build per-sample candidate bounds first, then take
        # quantiles across the sample axis (axis=1).
        alpha_low = alpha_np if self.sym else beta_np
        alpha_up = 1 - alpha_np if self.sym else 1 - alpha_np + beta_np

        conformity_scores_low = self.get_estimation_distribution(
            y_pred_low, signed * conformity_scores, X=X
        )
        conformity_scores_up = self.get_estimation_distribution(
            y_pred_up, conformity_scores, X=X
        )
        bound_low = self.get_quantile(
            conformity_scores_low,
            alpha_low,
            axis=1,
            reversed=True,
            unbounded=allow_infinite_bounds,
        )
        bound_up = self.get_quantile(
            conformity_scores_up, alpha_up, axis=1, unbounded=allow_infinite_bounds
        )

    else:
        # Other methods: take quantiles of the conformity scores first,
        # then map them through the estimation distribution.
        if self.sym:
            # One shared quantile yields symmetric lower/upper offsets.
            alpha_ref = 1 - alpha_np
            quantile_ref = self.get_quantile(
                conformity_scores[..., np.newaxis], alpha_ref, axis=0
            )
            quantile_low, quantile_up = -quantile_ref, quantile_ref

        else:
            alpha_low, alpha_up = beta_np, 1 - alpha_np + beta_np

            quantile_low = self.get_quantile(
                conformity_scores[..., np.newaxis],
                alpha_low,
                axis=0,
                reversed=True,
                unbounded=allow_infinite_bounds,
            )
            quantile_up = self.get_quantile(
                conformity_scores[..., np.newaxis],
                alpha_up,
                axis=0,
                unbounded=allow_infinite_bounds,
            )

        bound_low = self.get_estimation_distribution(y_pred_low, quantile_low, X=X)
        bound_up = self.get_estimation_distribution(y_pred_up, quantile_up, X=X)

    return y_pred, bound_low, bound_up

predict_set

predict_set(X: NDArray, alpha_np: NDArray, **kwargs)

Compute the prediction sets on new samples based on the uncertainty of the target confidence set.

PARAMETER DESCRIPTION
X

The input data or samples for prediction.

TYPE: NDArray

alpha_np

Represents the uncertainty of the confidence set to produce.

TYPE: NDArray

**kwargs

Additional keyword arguments.

DEFAULT: {}

RETURNS DESCRIPTION
result

The prediction sets for each sample and each alpha level. The output structure depends on the get_bounds method.

Source code in mapie/conformity_scores/regression.py
def predict_set(self, X: NDArray, alpha_np: NDArray, **kwargs):
    """
    Build the prediction sets for new samples at the requested
    uncertainty levels.

    Parameters
    ----------
    X: NDArray of shape (n_samples,)
        The input data or samples for prediction.

    alpha_np: NDArray of shape (n_alpha, )
        Represents the uncertainty of the confidence set to produce.

    **kwargs: dict
        Additional keyword arguments forwarded to `get_bounds`.

    Returns
    -------
    result
        The prediction sets for each sample and each alpha level.
        The output structure depends on the `get_bounds` method.
    """
    # For regression, a prediction "set" is the interval computed
    # by `get_bounds`; delegate directly.
    return self.get_bounds(X=X, alpha_np=alpha_np, **kwargs)

get_effective_calibration_samples

get_effective_calibration_samples(scores: NDArray)

Calculates the effective number of calibration samples.

PARAMETER DESCRIPTION
scores

An array of scores.

TYPE: NDArray

RETURNS DESCRIPTION
n

The effective number of calibration samples.

TYPE: int

Source code in mapie/conformity_scores/regression.py
def get_effective_calibration_samples(self, scores: NDArray):
    """
    Calculates the effective number of calibration samples.

    Parameters
    ----------
    scores: NDArray
        An array of scores.

    Returns
    -------
    n: int
        The effective number of calibration samples.
    """
    # NaN entries are placeholders, not usable calibration points.
    count: int = np.sum(~np.isnan(scores))
    # An asymmetrical score spends its samples across two tails,
    # halving the effective count.
    return count if self.sym else count // 2

mapie.conformity_scores.AbsoluteConformityScore

AbsoluteConformityScore(sym: bool = True)

Bases: BaseRegressionScore

Absolute conformity score.

The signed conformity score = y - y_pred. The conformity score is symmetrical.

This is appropriate when the confidence interval is symmetrical and its range is approximately the same over the range of predicted values.

References

[1] Lei, J., G'Sell, M., Rinaldo, A., Tibshirani, R. J. & Wasserman, L.. "Distribution-Free Predictive Inference for Regression." Journal of the American Statistical Association 2018.

Source code in mapie/conformity_scores/bounds/absolute.py
def __init__(
    self,
    sym: bool = True,
) -> None:
    """
    Initialize the absolute conformity score.

    Parameters
    ----------
    sym: bool
        Whether to consider the conformity score as symmetrical or not.

        By default True.
    """
    # The add/subtract round-trip is exact, so the consistency check
    # is always enabled here.
    super().__init__(sym=sym, consistency_check=True)

get_signed_conformity_scores

get_signed_conformity_scores(
    y: ArrayLike, y_pred: ArrayLike, **kwargs
) -> NDArray

Compute the signed conformity scores from the predicted values and the observed ones, from the following formula: signed conformity score = y - y_pred

Source code in mapie/conformity_scores/bounds/absolute.py
def get_signed_conformity_scores(
    self, y: ArrayLike, y_pred: ArrayLike, **kwargs
) -> NDArray:
    """
    Compute the signed conformity scores from the predicted values
    and the observed ones, from the following formula:
    signed conformity score = y - y_pred
    """
    # Plain residual; the sign is preserved for downstream use.
    return np.asarray(y) - np.asarray(y_pred)

get_estimation_distribution

get_estimation_distribution(
    y_pred: ArrayLike,
    conformity_scores: ArrayLike,
    **kwargs,
) -> NDArray

Compute samples of the estimation distribution from the predicted values and the conformity scores, from the following formula: signed conformity score = y - y_pred <=> y = y_pred + signed conformity score

conformity_scores can be either the conformity scores or the quantile of the conformity scores.

Source code in mapie/conformity_scores/bounds/absolute.py
def get_estimation_distribution(
    self, y_pred: ArrayLike, conformity_scores: ArrayLike, **kwargs
) -> NDArray:
    """
    Compute samples of the estimation distribution from the predicted
    values and the conformity scores, from the following formula:
    signed conformity score = y - y_pred
    <=> y = y_pred + signed conformity score

    `conformity_scores` can be either the conformity scores or
    the quantile of the conformity scores.
    """
    # Inverse of the score definition: add the scores back.
    return np.asarray(y_pred) + np.asarray(conformity_scores)

mapie.conformity_scores.GammaConformityScore

GammaConformityScore(sym: bool = False)

Bases: BaseRegressionScore

Gamma conformity score.

The signed conformity score = (y - y_pred) / y_pred. The conformity score is not symmetrical.

This is appropriate when the confidence interval is not symmetrical and its range depends on the predicted values. Like the Gamma distribution, its support is limited to strictly positive reals.

References

[1] Cordier, T., Blot, V., Lacombe, L., Morzadec, T., Capitaine, A. & Brunel, N.. "Flexible and Systematic Uncertainty Estimation with Conformal Prediction via the MAPIE library." Proceedings of Machine Learning Research 2023.

Source code in mapie/conformity_scores/bounds/gamma.py
def __init__(
    self,
    sym: bool = False,
) -> None:
    """
    Initialize the Gamma conformity score.

    Parameters
    ----------
    sym: bool
        Whether to consider the conformity score as symmetrical or not.

        By default False (the Gamma score is asymmetrical).
    """
    # Consistency check disabled for this score.
    # NOTE(review): presumably because the multiplicative round-trip is
    # not exact in floating point — confirm.
    super().__init__(sym=sym, consistency_check=False)

get_signed_conformity_scores

get_signed_conformity_scores(
    y: ArrayLike, y_pred: ArrayLike, **kwargs
) -> NDArray

Compute the signed conformity scores from the observed values and the predicted ones, from the following formula: signed conformity score = (y - y_pred) / y_pred

Source code in mapie/conformity_scores/bounds/gamma.py
def get_signed_conformity_scores(
    self, y: ArrayLike, y_pred: ArrayLike, **kwargs
) -> NDArray:
    """
    Compute the signed conformity scores from the observed values
    and the predicted ones, from the following formula:
    signed conformity score = (y - y_pred) / y_pred
    """
    # Inputs are validated first (the score requires valid observed
    # and predicted data — see the `_check_*` helpers).
    self._check_observed_data(y)
    self._check_predicted_data(y_pred)
    residual = np.subtract(y, y_pred)
    # Relative residual with respect to the prediction.
    return np.divide(residual, y_pred)

get_estimation_distribution

get_estimation_distribution(
    y_pred: ArrayLike,
    conformity_scores: ArrayLike,
    **kwargs,
) -> NDArray

Compute samples of the estimation distribution from the predicted values and the conformity scores, from the following formula: signed conformity score = (y - y_pred) / y_pred <=> y = y_pred * (1 + signed conformity score)

conformity_scores can be either the conformity scores or the quantile of the conformity scores.

Source code in mapie/conformity_scores/bounds/gamma.py
def get_estimation_distribution(
    self, y_pred: ArrayLike, conformity_scores: ArrayLike, **kwargs
) -> NDArray:
    """
    Compute samples of the estimation distribution from the predicted
    values and the conformity scores, from the following formula:
    signed conformity score = (y - y_pred) / y_pred
    <=> y = y_pred * (1 + signed conformity score)

    `conformity_scores` can be either the conformity scores or
    the quantile of the conformity scores.
    """
    self._check_predicted_data(y_pred)
    # Invert the score definition: y = y_pred * (1 + score).
    return np.multiply(np.add(1, conformity_scores), y_pred)

mapie.conformity_scores.ResidualNormalisedScore

ResidualNormalisedScore(
    residual_estimator: Optional[RegressorMixin] = None,
    prefit: bool = False,
    split_size: Optional[Union[int, float]] = None,
    random_state: Optional[Union[int, RandomState]] = None,
    sym: bool = True,
    consistency_check: bool = False,
)

Bases: BaseRegressionScore

Residual Normalised score.

The signed conformity score = (y - y_pred) / r_pred, where r_pred is the predicted absolute residual abs(y - y_pred) of the base estimator. It is calculated by a model that learns to predict these residuals. The learning is done with the log of the residual, and we use the exponential of the prediction to avoid negative values.

The conformity score is symmetrical and allows the calculation of adaptive prediction intervals (taking X into account). It is possible to use it only with split and prefit methods (not with cross methods).

Warning : if the estimator provided is not fitted a subset of the calibration data will be used to fit the model (20% by default).

References

[1] Lei, J., G'Sell, M., Rinaldo, A., Tibshirani, R. J. & Wasserman, L.. "Distribution-Free Predictive Inference for Regression." Journal of the American Statistical Association 2018.

PARAMETER DESCRIPTION
residual_estimator

The model that learns to predict the residuals of the base estimator. It can be any regressor with scikit-learn API (i.e. with fit and predict methods). If None, estimator defaults to a LinearRegression instance.

TYPE: Optional[RegressorMixin] DEFAULT: None

prefit

Specify if the residual_estimator is already fitted or not. By default False.

TYPE: bool DEFAULT: False

split_size

The proportion of data that is used to fit the residual_estimator. By default it is the default value of sklearn.model_selection.train_test_split ie 0.25.

TYPE: Optional[Union[int, float]] DEFAULT: None

random_state

Pseudo random number used for random sampling. Pass an int for reproducible output across multiple function calls. By default None.

TYPE: Optional[Union[int, RandomState]] DEFAULT: None

Source code in mapie/conformity_scores/bounds/residuals.py
def __init__(
    self,
    residual_estimator: Optional[RegressorMixin] = None,
    prefit: bool = False,
    split_size: Optional[Union[int, float]] = None,
    random_state: Optional[Union[int, np.random.RandomState]] = None,
    sym: bool = True,
    consistency_check: bool = False,
) -> None:
    """
    Initialize the residual-normalised conformity score.

    Parameters
    ----------
    residual_estimator: Optional[RegressorMixin]
        Model that learns to predict the residuals of the base
        estimator. If None, it defaults to a LinearRegression instance.

    prefit: bool
        Whether `residual_estimator` is already fitted.
        By default False.

    split_size: Optional[Union[int, float]]
        Proportion of the calibration data used to fit the residual
        estimator. By default None (train_test_split's default).

    random_state: Optional[Union[int, np.random.RandomState]]
        Pseudo random number state used for the data split.
        By default None.

    sym: bool
        Whether the conformity score is symmetrical. By default True.

    consistency_check: bool
        Whether to run the consistency check. By default False.
    """
    super().__init__(sym=sym, consistency_check=consistency_check)
    self.prefit = prefit
    self.residual_estimator = residual_estimator
    self.split_size = split_size
    self.random_state = random_state

get_signed_conformity_scores

get_signed_conformity_scores(
    y: ArrayLike,
    y_pred: ArrayLike,
    X: Optional[ArrayLike] = None,
    **kwargs,
) -> NDArray

Computes the signed conformity score = (y - y_pred) / r_pred. r_pred being the predicted residual abs(y - y_pred) of the estimator. It is calculated by a model (residual_estimator_) that learns to predict this residual.

The learning is done with the log of the residual and later we use the exponential of the prediction to avoid negative values.

Source code in mapie/conformity_scores/bounds/residuals.py
def get_signed_conformity_scores(
    self, y: ArrayLike, y_pred: ArrayLike, X: Optional[ArrayLike] = None, **kwargs
) -> NDArray:
    """
    Computes the signed conformity score = (y - y_pred) / r_pred.
    r_pred being the predicted residual abs(y - y_pred) of the estimator.
    It is calculated by a model (`residual_estimator_`) that learns
    to predict this residual.

    The learning is done with the log of the residual and later we
    use the exponential of the prediction to avoid negative values.
    """
    # `X` is mandatory: the residual model predicts per-sample residuals.
    if X is None:
        raise ValueError(
            "Additional parameters must be provided for the method to "
            + "work (here `X` is missing)."
        )
    X = cast(ArrayLike, X)

    (X, y, y_pred, self.residual_estimator_, random_state) = self._check_parameters(
        X, y, y_pred
    )

    # Indices of samples with a defined (non-NaN) prediction.
    full_indexes = np.argwhere(np.logical_not(np.isnan(y_pred))).reshape((-1,))

    if not self.prefit:
        # Split the calibration data: one part fits the residual model,
        # the other produces the conformity scores.
        cal_indexes, res_indexes = train_test_split(
            full_indexes,
            test_size=self.split_size,
            random_state=random_state,
        )
        self.residual_estimator_ = self._fit_residual_estimator(
            clone(self.residual_estimator_),
            _safe_indexing(X, res_indexes),
            _safe_indexing(y, res_indexes),
            _safe_indexing(y_pred, res_indexes),
        )
        # The model is fitted on log-residuals: exponentiate, then
        # floor at `eps` to avoid division by (near) zero below.
        residuals_pred = np.maximum(
            np.exp(
                self._predict_residual_estimator(_safe_indexing(X, cal_indexes))
            ),
            self.eps,
        )
    else:
        # Prefit estimator: use every valid sample; its output is used
        # as-is (no exp) — NOTE(review): it is presumably expected to
        # predict residuals on the original scale; confirm.
        cal_indexes = full_indexes
        residuals_pred = np.maximum(
            self._predict_residual_estimator(_safe_indexing(X, cal_indexes)),
            self.eps,
        )

    # Normalised signed residuals on the calibration subset.
    signed_conformity_scores = np.divide(
        np.subtract(
            _safe_indexing(y, cal_indexes), _safe_indexing(y_pred, cal_indexes)
        ),
        residuals_pred,
    )

    # reconstruct array with nan and conformity scores
    complete_signed_cs = np.full(y_pred.shape, fill_value=np.nan, dtype=float)
    complete_signed_cs[cal_indexes] = signed_conformity_scores

    return complete_signed_cs

get_estimation_distribution

get_estimation_distribution(
    y_pred: ArrayLike,
    conformity_scores: ArrayLike,
    X: Optional[ArrayLike] = None,
    **kwargs,
) -> NDArray

Compute samples of the estimation distribution from the predicted values and the conformity scores, from the following formula: y_pred + conformity_scores * r_pred.

The learning has been done with the log of the residual so we use the exponential of the prediction to avoid negative values.

conformity_scores can be either the conformity scores or the quantile of the conformity scores.

Source code in mapie/conformity_scores/bounds/residuals.py
def get_estimation_distribution(
    self,
    y_pred: ArrayLike,
    conformity_scores: ArrayLike,
    X: Optional[ArrayLike] = None,
    **kwargs,
) -> NDArray:
    """
    Compute samples of the estimation distribution from the predicted
    values and the conformity scores, from the following formula:
    `y_pred + conformity_scores * r_pred`.

    The learning has been done with the log of the residual so we use the
    exponential of the prediction to avoid negative values.

    `conformity_scores` can be either the conformity scores or
    the quantile of the conformity scores.
    """
    # `X` is mandatory: the residual model predicts per-sample residuals.
    if X is None:
        raise ValueError(
            "Additional parameters must be provided for the method to "
            + "work (here `X` is missing)."
        )
    X = cast(ArrayLike, X)

    # Column vector of predicted residuals, one per sample.
    r_pred = self._predict_residual_estimator(X).reshape((-1, 1))
    if self.prefit:
        # A prefit estimator's output is used on the residual scale.
        return np.add(y_pred, np.multiply(conformity_scores, r_pred))
    # The internally-fitted model predicts log-residuals: exponentiate
    # to recover positive residual predictions.
    return np.add(y_pred, np.multiply(conformity_scores, np.exp(r_pred)))

Classification

mapie.conformity_scores.BaseClassificationScore

BaseClassificationScore()

Bases: BaseConformityScore

Base conformity score class for classification task.

This class should not be used directly. Use derived classes instead.

ATTRIBUTE DESCRIPTION
classes

Names of the classes.

TYPE: Optional[ArrayLike]

random_state

Pseudo random number generator state.

TYPE: Optional[Union[int, RandomState]]

quantiles_

The quantiles estimated from get_sets method.

TYPE: ArrayLike of shape (n_alpha)

Source code in mapie/conformity_scores/classification.py
def __init__(self) -> None:
    """Initialize the base classification conformity score."""
    super().__init__()

set_external_attributes

set_external_attributes(
    *,
    classes: Optional[ArrayLike] = None,
    random_state: Optional[Union[int, RandomState]] = None,
    **kwargs,
) -> None

Set attributes that are not provided by the user.

PARAMETER DESCRIPTION
classes

Names of the classes.

By default None.

TYPE: Optional[ArrayLike] DEFAULT: None

random_state

Pseudo random number generator state.

TYPE: Optional[Union[int, RandomState]] DEFAULT: None

Source code in mapie/conformity_scores/classification.py
def set_external_attributes(
    self,
    *,
    classes: Optional[ArrayLike] = None,
    random_state: Optional[Union[int, np.random.RandomState]] = None,
    **kwargs,
) -> None:
    """
    Set attributes that are not provided by the user.

    Parameters
    ----------
    classes: Optional[ArrayLike]
        Names of the classes.

        By default `None`.

    random_state: Optional[Union[int, np.random.RandomState]]
        Pseudo random number generator state.

        By default `None`.
    """
    # Let the parent class consume any remaining external attributes.
    super().set_external_attributes(**kwargs)
    self.classes = classes
    self.random_state = random_state

get_predictions abstractmethod

get_predictions(
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Abstract method to get predictions from an EnsembleClassifier.

This method should be implemented by any subclass of the current class.

PARAMETER DESCRIPTION
X

Observed feature values.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, represents the uncertainty of the confidence set.

TYPE: NDArray

y_pred_proba

Predicted probabilities from the estimator.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Array of predictions.

Source code in mapie/conformity_scores/classification.py
@abstractmethod
def get_predictions(
    self,
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Abstract method to get predictions from an EnsembleClassifier.

    This method should be implemented by any subclass of the current class.

    Parameters
    ----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between `0` and `1`, represents the
        uncertainty of the confidence set.

    y_pred_proba: NDArray
        Predicted probabilities from the estimator.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    Returns
    -------
    NDArray
        Array of predictions.
    """

get_conformity_score_quantiles abstractmethod

get_conformity_score_quantiles(
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Abstract method to get quantiles of the conformity scores.

This method should be implemented by any subclass of the current class.

PARAMETER DESCRIPTION
conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence set.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Array of quantiles with respect to alpha_np.

Source code in mapie/conformity_scores/classification.py
@abstractmethod
def get_conformity_score_quantiles(
    self,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Abstract method to get quantiles of the conformity scores.

    This method should be implemented by any subclass of the current class.

    Parameters
    ----------
    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence set.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    Returns
    -------
    NDArray
        Array of quantiles with respect to alpha_np.
    """

get_prediction_sets abstractmethod

get_prediction_sets(
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Abstract method to generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.

This method should be implemented by any subclass of the current class.

PARAMETER DESCRIPTION
y_pred_proba

Target prediction.

TYPE: NDArray

conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence set.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Prediction sets (Booleans indicate whether classes are included).

Source code in mapie/conformity_scores/classification.py
@abstractmethod
def get_prediction_sets(
    self,
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Abstract method to generate prediction sets based on the probability
    predictions, the conformity scores and the uncertainty level.

    This method should be implemented by any subclass of the current
    class. It is the last of the three steps performed by ``get_sets``.

    Parameters
    -----------
    y_pred_proba: NDArray of shape (n_samples, n_classes)
        Target prediction.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence set.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    Returns
    --------
    NDArray
        Prediction sets (Booleans indicate whether classes are included).
    """

get_sets

get_sets(
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    conformity_scores: NDArray,
    **kwargs,
) -> NDArray

Compute classes of the prediction sets from the observed values, the predicted probabilities and the conformity scores.

PARAMETER DESCRIPTION
X

Observed feature values.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence set.

TYPE: NDArray

y_pred_proba

Predicted probabilities from the estimator.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

conformity_scores

Conformity scores.

TYPE: NDArray

RETURNS DESCRIPTION
NDArray of shape (n_samples, n_classes, n_alpha)

Prediction sets (Booleans indicate whether classes are included).

Source code in mapie/conformity_scores/classification.py
def get_sets(
    self,
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    conformity_scores: NDArray,
    **kwargs,
) -> NDArray:
    """
    Build the prediction sets for each sample and each alpha level.

    The work is delegated to the three template methods
    ``get_predictions``, ``get_conformity_score_quantiles`` and
    ``get_prediction_sets``, in that order. The estimated quantiles
    are stored in ``self.quantiles_`` as a side effect.

    Parameters
    ----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence set.

    y_pred_proba: NDArray
        Predicted probabilities from the estimator.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores.

    Returns
    -------
    NDArray of shape (n_samples, n_classes, n_alpha)
        Prediction sets (Booleans indicate whether classes are included).
    """
    # Step 1: post-process the probability predictions.
    proba = self.get_predictions(X, alpha_np, y_pred_proba, cv, **kwargs)

    # Step 2: estimate the quantiles; kept on the instance so that
    # ``get_prediction_sets`` implementations can reuse them.
    self.quantiles_ = self.get_conformity_score_quantiles(
        conformity_scores, alpha_np, cv, **kwargs
    )

    # Step 3: derive the boolean prediction sets.
    return self.get_prediction_sets(
        proba, conformity_scores, alpha_np, cv, **kwargs
    )

predict_set

predict_set(X: NDArray, alpha_np: NDArray, **kwargs)

Compute the prediction sets on new samples based on the uncertainty of the target confidence set.

PARAMETER DESCRIPTION
X

The input data or samples for prediction.

TYPE: NDArray

alpha_np

Represents the uncertainty of the confidence set to produce.

TYPE: NDArray

**kwargs

Additional keyword arguments.

DEFAULT: {}

RETURNS DESCRIPTION
result

The prediction sets for each sample and each alpha level. The output structure depends on the get_sets method.

Source code in mapie/conformity_scores/classification.py
def predict_set(self, X: NDArray, alpha_np: NDArray, **kwargs):
    """
    Compute the prediction sets for new samples at the requested
    uncertainty levels.

    Thin wrapper that forwards every argument to ``get_sets``.

    Parameters
    ----------
    X: NDArray of shape (n_samples,)
        The input data or samples for prediction.

    alpha_np: NDArray of shape (n_alpha,)
        Represents the uncertainty of the confidence set to produce.

    **kwargs: dict
        Additional keyword arguments forwarded to ``get_sets``.

    Returns
    -------
    result
        The prediction sets for each sample and each alpha level.
        The output structure depends on the ``get_sets`` method.
    """
    # All the logic lives in ``get_sets``; simply delegate.
    prediction_sets = self.get_sets(X=X, alpha_np=alpha_np, **kwargs)
    return prediction_sets

mapie.conformity_scores.NaiveConformityScore

NaiveConformityScore()

Bases: BaseClassificationScore

Naive classification non-conformity score method that is based on the cumulative sum of probabilities until the 1-alpha threshold.

ATTRIBUTE DESCRIPTION
classes

Names of the classes.

TYPE: Optional[ArrayLike]

random_state

Pseudo random number generator state.

TYPE: Optional[Union[int, RandomState]]

quantiles_

The quantiles estimated from get_sets method.

TYPE: ArrayLike of shape (n_alpha,)

Source code in mapie/conformity_scores/sets/naive.py
def __init__(self) -> None:
    """Initialize the naive score with the base-class defaults only."""
    super().__init__()

get_conformity_scores

get_conformity_scores(
    y: NDArray, y_pred: NDArray, **kwargs
) -> NDArray

Get the conformity score.

PARAMETER DESCRIPTION
y

Observed target values (not used here).

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Conformity scores.

Source code in mapie/conformity_scores/sets/naive.py
def get_conformity_scores(self, y: NDArray, y_pred: NDArray, **kwargs) -> NDArray:
    """
    Get the conformity score.

    The naive method does not rely on calibrated conformity scores
    (its quantiles are simply ``1 - alpha``), so this returns a
    placeholder array of the right shape.

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values (not used here).

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    Returns
    -------
    NDArray of shape (n_samples,)
        Placeholder conformity scores (all zeros).
    """
    # ``np.empty`` would expose uninitialized (nondeterministic) memory;
    # zeros give a reproducible placeholder at the same cost.
    conformity_scores = np.zeros(y_pred.shape, dtype="float")
    return conformity_scores

get_predictions

get_predictions(
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Just processes the passed y_pred_proba.

PARAMETER DESCRIPTION
X

Observed feature values (not used since predictions are passed).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, represents the uncertainty of the confidence interval.

TYPE: NDArray

y_pred_proba

Predicted probabilities from the estimator.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Array of predictions.

Source code in mapie/conformity_scores/sets/naive.py
def get_predictions(
    self,
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Just processes the passed y_pred_proba.

    Parameters
    -----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values (not used since predictions are passed).

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between `0` and `1`, represents the
        uncertainty of the confidence interval.

    y_pred_proba: NDArray
        Predicted probabilities from the estimator.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    Returns
    --------
    NDArray
        Array of predictions of shape (n_samples, n_classes, n_alpha).
    """
    # Duplicate the (n_samples, n_classes) probabilities along a third
    # axis: one copy per alpha level.
    n_alpha = len(alpha_np)
    expanded = np.stack([y_pred_proba] * n_alpha, axis=2)
    return expanded

get_conformity_score_quantiles

get_conformity_score_quantiles(
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Get the quantiles of the conformity scores for each uncertainty level.

PARAMETER DESCRIPTION
conformity_scores

Conformity scores for each sample (not used here).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval (not used here).

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Array of quantiles with respect to alpha_np.

Source code in mapie/conformity_scores/sets/naive.py
def get_conformity_score_quantiles(
    self,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Return the quantile associated with each uncertainty level.

    For the naive method, the quantile is simply the nominal coverage
    ``1 - alpha``; the conformity scores are ignored.

    Parameters
    -----------
    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample (not used here).

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence interval (not used here).

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    Returns
    --------
    NDArray
        Array of quantiles with respect to alpha_np.
    """
    # The naive threshold is the target coverage level itself.
    return 1 - alpha_np

get_prediction_sets

get_prediction_sets(
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.

PARAMETER DESCRIPTION
y_pred_proba

Target prediction.

TYPE: NDArray

conformity_scores

Conformity scores for each sample (not used here).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval (not used here).

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Prediction sets (Booleans indicate whether classes are included).

Source code in mapie/conformity_scores/sets/naive.py
def get_prediction_sets(
    self,
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Generate prediction sets from the probability predictions and the
    stored quantiles (``self.quantiles_``).

    Parameters
    -----------
    y_pred_proba: NDArray of shape (n_samples, n_classes)
        Target prediction.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample (not used here).

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence interval (not used here).

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    Returns
    --------
    NDArray
        Prediction sets (Booleans indicate whether classes are included).
    """
    # Sort labels by decreasing probability and locate, per sample and
    # alpha, the probability of the last label to include.
    _, _, proba_last = self._get_last_included_proba(
        y_pred_proba, thresholds=self.quantiles_, include_last_label=True
    )
    # Keep every label whose probability is (numerically) at least as
    # large as the last included one.
    included = (y_pred_proba - proba_last) >= -EPSILON

    return cast(NDArray, included)

mapie.conformity_scores.LACConformityScore

LACConformityScore()

Bases: BaseClassificationScore

Least Ambiguous set-valued Classifier (LAC) method-based non conformity score (also formerly called "score").

It is based on the scores (i.e. 1 minus the softmax score of the true label) on the conformalization set.

References

[1] Mauricio Sadinle, Jing Lei, and Larry Wasserman. "Least Ambiguous Set-Valued Classifiers with Bounded Error Levels.", Journal of the American Statistical Association, 114, 2019.

ATTRIBUTE DESCRIPTION
classes

Names of the classes.

TYPE: Optional[ArrayLike]

random_state

Pseudo random number generator state.

TYPE: Optional[Union[int, RandomState]]

quantiles_

The quantiles estimated from get_sets method.

TYPE: ArrayLike of shape (n_alpha)

Source code in mapie/conformity_scores/sets/lac.py
def __init__(self) -> None:
    """Initialize the LAC score with the base-class defaults only."""
    super().__init__()

get_conformity_scores

get_conformity_scores(
    y: NDArray,
    y_pred: NDArray,
    y_enc: Optional[NDArray] = None,
    **kwargs,
) -> NDArray

Get the conformity score.

PARAMETER DESCRIPTION
y

Observed target values (not used here).

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

y_enc

Target values as normalized encodings.

TYPE: Optional[NDArray] DEFAULT: None

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Conformity scores.

Source code in mapie/conformity_scores/sets/lac.py
def get_conformity_scores(
    self, y: NDArray, y_pred: NDArray, y_enc: Optional[NDArray] = None, **kwargs
) -> NDArray:
    """
    Get the conformity score.

    The LAC score of a sample is one minus the probability assigned by
    the model to its true label.

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values (not used here).

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    y_enc: NDArray of shape (n_samples,)
        Target values as normalized encodings.

    Returns
    -------
    NDArray of shape (n_samples,)
        Conformity scores.
    """
    # Narrow the optional type for mypy.
    encoded = cast(NDArray, y_enc)

    # Pick, row by row, the probability of the true label and take
    # its complement.
    label_idx = encoded.reshape(-1, 1)
    true_label_proba = np.take_along_axis(y_pred, label_idx, axis=1)

    return 1 - true_label_proba

get_predictions

get_predictions(
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray

Just processes the passed y_pred_proba.

PARAMETER DESCRIPTION
X

Observed feature values (not used since predictions are passed).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, represents the uncertainty of the confidence interval.

TYPE: NDArray

y_pred_proba

Predicted probabilities from the estimator.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation.

By default "mean".

TYPE: Optional[str] DEFAULT: 'mean'

RETURNS DESCRIPTION
NDArray

Array of predictions.

Source code in mapie/conformity_scores/sets/lac.py
def get_predictions(
    self,
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray:
    """
    Just processes the passed y_pred_proba.

    Parameters
    -----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values (not used since predictions are passed).

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between `0` and `1`, represents the
        uncertainty of the confidence interval.

    y_pred_proba: NDArray
        Predicted probabilities from the estimator.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    agg_scores: Optional[str]
        Method to aggregate the scores from the base estimators.
        If "mean", the scores are averaged. If "crossval", the scores are
        obtained from cross-validation.

        By default `"mean"`.

    Returns
    --------
    NDArray
        Array of predictions.
    """
    if agg_scores == "crossval":
        # Cross-validated probabilities are used as-is.
        return y_pred_proba
    # Otherwise duplicate the probabilities once per alpha level.
    return np.repeat(y_pred_proba[:, :, None], len(alpha_np), axis=2)

get_conformity_score_quantiles

get_conformity_score_quantiles(
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray

Get the quantiles of the conformity scores for each uncertainty level.

PARAMETER DESCRIPTION
conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation.

By default "mean".

TYPE: Optional[str] DEFAULT: 'mean'

RETURNS DESCRIPTION
NDArray

Array of quantiles with respect to alpha_np.

Source code in mapie/conformity_scores/sets/lac.py
def get_conformity_score_quantiles(
    self,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray:
    """
    Get the quantiles of the conformity scores for each uncertainty level.

    Parameters
    -----------
    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence interval.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    agg_scores: Optional[str]
        Method to aggregate the scores from the base estimators.
        If "mean", the scores are averaged. If "crossval", the scores are
        obtained from cross-validation.

        By default `"mean"`.

    Returns
    --------
    NDArray
        Array of quantiles with respect to alpha_np.
    """
    n_samples = len(conformity_scores)

    # With prefit models or averaged scores, use empirical quantiles;
    # otherwise return the (n + 1)(1 - alpha) counts used by the
    # cross-validation voting scheme.
    use_empirical = (cv == "prefit") or (agg_scores == "mean")
    if use_empirical:
        return _compute_quantiles(conformity_scores, alpha_np)
    return (n_samples + 1) * (1 - alpha_np)

get_prediction_sets

get_prediction_sets(
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray

Generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.

PARAMETER DESCRIPTION
y_pred_proba

Target prediction.

TYPE: NDArray

conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation.

By default "mean".

TYPE: Optional[str] DEFAULT: 'mean'

RETURNS DESCRIPTION
NDArray

Prediction sets (Booleans indicate whether classes are included).

Source code in mapie/conformity_scores/sets/lac.py
def get_prediction_sets(
    self,
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray:
    """
    Generate prediction sets based on the probability predictions,
    the conformity scores and the uncertainty level.

    Parameters
    -----------
    y_pred_proba: NDArray of shape (n_samples, n_classes)
        Target prediction.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence interval.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    agg_scores: Optional[str]
        Method to aggregate the scores from the base estimators.
        If "mean", the scores are averaged. If "crossval", the scores are
        obtained from cross-validation.

        By default `"mean"`.

    Returns
    --------
    NDArray
        Prediction sets (Booleans indicate whether classes are included).
    """
    n_samples = len(conformity_scores)

    # LAC score of every candidate label.
    scores = 1 - y_pred_proba

    if cv == "prefit" or agg_scores == "mean":
        # One calibrated threshold per alpha: include a label whenever
        # its score does not (numerically) exceed the stored quantile.
        sets = scores - self.quantiles_ <= EPSILON
    else:
        # Cross-validation voting: count how many calibration scores
        # each label's score is below, then require enough votes for
        # each alpha level.
        votes = (scores - conformity_scores.ravel() <= EPSILON).sum(axis=2)
        per_alpha = [
            votes - level * (n_samples - 1) >= -EPSILON for level in alpha_np
        ]
        sets = np.stack(per_alpha, axis=2)

    return cast(NDArray, sets)

mapie.conformity_scores.APSConformityScore

APSConformityScore()

Bases: NaiveConformityScore

Adaptive Prediction Sets (APS) method-based non-conformity score. It is based on the sum of the softmax outputs of the labels until the true label is reached, on the conformalization set. See [1] for more details.

References

[1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. "Classification with Valid and Adaptive Coverage." NeurIPS 2020 (spotlight).

ATTRIBUTE DESCRIPTION
classes

Names of the classes.

TYPE: Optional[ArrayLike]

random_state

Pseudo random number generator state.

TYPE: Optional[Union[int, RandomState]]

quantiles_

The quantiles estimated from get_sets method.

TYPE: ArrayLike of shape (n_alpha)

Source code in mapie/conformity_scores/sets/aps.py
def __init__(self) -> None:
    """Initialize the APS score with the parent-class defaults only."""
    super().__init__()

get_predictions

get_predictions(
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray

Just processes the passed y_pred_proba.

PARAMETER DESCRIPTION
X

Observed feature values (not used since predictions are passed).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, represents the uncertainty of the confidence interval.

TYPE: NDArray

y_pred_proba

Predicted probabilities from the estimator.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation.

By default "mean".

TYPE: Optional[str] DEFAULT: 'mean'

RETURNS DESCRIPTION
NDArray

Array of predictions.

Source code in mapie/conformity_scores/sets/aps.py
def get_predictions(
    self,
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray:
    """
    Just processes the passed y_pred_proba.

    Parameters
    -----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values (not used since predictions are passed).

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between `0` and `1`, represents the
        uncertainty of the confidence interval.

    y_pred_proba: NDArray
        Predicted probabilities from the estimator.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    agg_scores: Optional[str]
        Method to aggregate the scores from the base estimators.
        If "mean", the scores are averaged. If "crossval", the scores are
        obtained from cross-validation.

        By default `"mean"`.

    Returns
    --------
    NDArray
        Array of predictions.
    """
    n_alpha = len(alpha_np)
    if agg_scores != "crossval":
        # One copy of the probability matrix per alpha level.
        y_pred_proba = np.tile(y_pred_proba[:, :, np.newaxis], (1, 1, n_alpha))
    return y_pred_proba

get_true_label_cumsum_proba staticmethod

get_true_label_cumsum_proba(
    y: ArrayLike, y_pred_proba: NDArray, classes: ArrayLike
) -> Tuple[NDArray, NDArray]

Compute the cumsumed probability of the true label.

PARAMETER DESCRIPTION
y

Array with the labels.

TYPE: ArrayLike

y_pred_proba

Predictions of the model.

TYPE: NDArray

classes

Array with the classes.

TYPE: ArrayLike

RETURNS DESCRIPTION
Tuple[NDArray, NDArray] of shapes (n_samples, 1) and (n_samples, ).

The first element is the cumsum probability of the true label. The second is the 1-based rank of the true label in the sorted probabilities.

Source code in mapie/conformity_scores/sets/aps.py
@staticmethod
def get_true_label_cumsum_proba(
    y: ArrayLike, y_pred_proba: NDArray, classes: ArrayLike
) -> Tuple[NDArray, NDArray]:
    """
    Compute the cumsumed probability of the true label.

    Labels are sorted row-wise by decreasing predicted probability, and
    the probabilities are accumulated up to (and including) the true
    label of each sample.

    Parameters
    ----------
    y: ArrayLike of shape (n_samples, )
        Array with the labels.

    y_pred_proba: NDArray of shape (n_samples, n_classes)
        Predictions of the model.

    classes: ArrayLike of shape (n_classes, )
        Array with the classes.

    Returns
    -------
    Tuple[NDArray, NDArray] of shapes (n_samples, 1) and (n_samples, ).
        The first element is the cumsum probability of the true label.
        The second is the 1-based rank of the true label in the sorted
        probabilities.
    """
    one_hot = label_binarize(y=y, classes=classes)
    # Indices sorting each row by decreasing probability.
    order = np.fliplr(np.argsort(y_pred_proba, axis=1))
    proba_desc = np.take_along_axis(y_pred_proba, order, axis=1)
    one_hot_desc = np.take_along_axis(one_hot, order, axis=1)
    cumulated = np.cumsum(proba_desc, axis=1)
    # 0-based position of the true label within each sorted row.
    true_pos = np.argmax(one_hot_desc, axis=1)
    cumsum_at_true = np.take_along_axis(
        cumulated, true_pos.reshape(-1, 1), axis=1
    )
    # Convert the position to a 1-based rank.
    rank = true_pos + 1

    return cumsum_at_true, rank

get_conformity_scores

get_conformity_scores(
    y: NDArray,
    y_pred: NDArray,
    y_enc: Optional[NDArray] = None,
    **kwargs,
) -> NDArray

Get the conformity score.

PARAMETER DESCRIPTION
y

Observed target values.

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

y_enc

Target values as normalized encodings.

TYPE: Optional[NDArray] DEFAULT: None

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Conformity scores.

Source code in mapie/conformity_scores/sets/aps.py
def get_conformity_scores(
    self, y: NDArray, y_pred: NDArray, y_enc: Optional[NDArray] = None, **kwargs
) -> NDArray:
    """
    Get the conformity score.

    The APS score is the cumulated probability mass of the labels up to
    the true label, randomized by a uniform fraction of the true
    label's probability (tie-breaking).

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values.

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    y_enc: Optional[NDArray] of shape (n_samples,)
        Target values as normalized encodings.

    Returns
    -------
    NDArray of shape (n_samples,)
        Conformity scores.
    """
    # Narrow the optional types for mypy.
    encoded = cast(NDArray, y_enc)
    class_labels = cast(NDArray, self.classes)

    # Cumulated probability up to the true label; the true label's
    # 1-based rank is kept on the instance for later use.
    cumsum_proba, self.cutoff = self.get_true_label_cumsum_proba(
        y, y_pred, class_labels
    )

    # Randomization term: subtract a uniform fraction of the true
    # label's probability.
    true_proba = np.take_along_axis(y_pred, encoded.reshape(-1, 1), axis=1)
    rng = check_random_state(self.random_state)
    noise = rng.uniform(size=len(y_pred)).reshape(-1, 1)

    return cumsum_proba - noise * true_proba

get_conformity_score_quantiles

get_conformity_score_quantiles(
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray

Get the quantiles of the conformity scores for each uncertainty level.

PARAMETER DESCRIPTION
conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation.

By default "mean".

TYPE: Optional[str] DEFAULT: 'mean'

RETURNS DESCRIPTION
NDArray

Array of quantiles with respect to alpha_np.

Source code in mapie/conformity_scores/sets/aps.py
def get_conformity_score_quantiles(
    self,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    **kwargs,
) -> NDArray:
    """
    Compute, for every uncertainty level, the threshold applied to the
    conformity scores.

    Parameters
    ----------
    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        Floats between 0 and 1, the uncertainty of the confidence
        interval.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    agg_scores: Optional[str]
        How the scores of the base estimators are aggregated: "mean"
        averages them, "crossval" keeps the per-fold scores.

        By default `"mean"`.

    Returns
    -------
    NDArray
        Array of quantiles with respect to alpha_np.
    """
    if cv == "prefit" or agg_scores == "mean":
        # Aggregated scores: empirical quantiles of the scores.
        return _compute_quantiles(conformity_scores, alpha_np)

    # Cross-validation: return the (n + 1) * (1 - alpha) counts that
    # are later compared against per-sample vote counts.
    n_samples = len(conformity_scores)
    return (n_samples + 1) * (1 - alpha_np)

get_prediction_sets

get_prediction_sets(
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    include_last_label: Optional[Union[bool, str]] = True,
    **kwargs,
) -> NDArray

Generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.

PARAMETER DESCRIPTION
y_pred_proba

Target prediction.

TYPE: NDArray

conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval (not used here).

TYPE: NDArray

cv

Cross-validation strategy used by the estimator.

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation.

By default "mean".

TYPE: Optional[str] DEFAULT: 'mean'

include_last_label

Whether or not to include last label in prediction sets for the "aps" method. Choose among:

  • False, does not include label whose cumulated score is just over the quantile.
  • True, includes label whose cumulated score is just over the quantile, unless there is only one label in the prediction set.
  • "randomized", randomly includes label whose cumulated score is just over the quantile based on the comparison of a uniform number and the difference between the cumulated score of the last label and the quantile.

When set to True or False, it may result in a coverage higher than 1 - alpha (because contrary to the "randomized" setting, none of these methods create empty prediction sets). See [1] and [2] for more details.

By default True.

TYPE: Optional[Union[bool, str]] DEFAULT: True

RETURNS DESCRIPTION
NDArray

Boolean array indicating, for each sample and each uncertainty level, which labels belong to the prediction set.

References

[1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. "Classification with Valid and Adaptive Coverage." NeurIPS 2020 (spotlight).

[2] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan and Jitendra Malik. "Uncertainty Sets for Image Classifiers using Conformal Prediction." International Conference on Learning Representations 2021.

Source code in mapie/conformity_scores/sets/aps.py
def get_prediction_sets(
    self,
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    include_last_label: Optional[Union[bool, str]] = True,
    **kwargs,
) -> NDArray:
    """
    Generate prediction sets based on the probability predictions,
    the conformity scores and the uncertainty level.

    Parameters
    -----------
    y_pred_proba: NDArray of shape (n_samples, n_classes)
        Target prediction.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence interval (not used here).

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator.

    agg_scores: Optional[str]
        Method to aggregate the scores from the base estimators.
        If "mean", the scores are averaged. If "crossval", the scores are
        obtained from cross-validation.

        By default `"mean"`.

    include_last_label: Optional[Union[bool, str]]
        Whether or not to include last label in
        prediction sets for the "aps" method. Choose among:

        - False, does not include label whose cumulated score is just over
          the quantile.
        - True, includes label whose cumulated score is just over the
          quantile, unless there is only one label in the prediction set.
        - "randomized", randomly includes label whose cumulated score is
          just over the quantile based on the comparison of a uniform
          number and the difference between the cumulated score of
          the last label and the quantile.

        When set to `True` or `False`, it may result in a coverage
        higher than `1 - alpha` (because contrary to the "randomized"
        setting, none of these methods create empty prediction sets). See
        [1] and [2] for more details.

        By default `True`.

    Returns
    --------
    NDArray
        Boolean mask indicating, for each sample and each alpha, which
        labels belong to the prediction set.

    References
    ----------
    [1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès.
    "Classification with Valid and Adaptive Coverage."
    NeurIPS 2020 (spotlight).

    [2] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan
    and Jitendra Malik.
    "Uncertainty Sets for Image Classifiers using Conformal Prediction."
    International Conference on Learning Representations 2021.
    """
    include_last_label = check_include_last_label(include_last_label)

    # specify which thresholds will be used
    if cv == "prefit" or agg_scores in ["mean"]:
        # Aggregated case: compare against the estimated quantiles.
        thresholds = self.quantiles_
    else:
        # Cross-validation case: every calibration score acts as a
        # threshold; membership is decided by vote counting below.
        thresholds = conformity_scores.ravel()

    # sort labels by decreasing probability
    y_pred_proba_cumsum, y_pred_index_last, y_pred_proba_last = (
        self._get_last_included_proba(
            y_pred_proba,
            thresholds,
            include_last_label,
            prediction_phase=True,
            **kwargs,
        )
    )
    # get the prediction set by taking all probabilities above the last one
    if cv == "prefit" or agg_scores in ["mean"]:
        # EPSILON absorbs float round-off at the inclusion boundary.
        y_pred_included = np.greater_equal(
            y_pred_proba - y_pred_proba_last, -EPSILON
        )
    else:
        y_pred_included = np.less_equal(y_pred_proba - y_pred_proba_last, EPSILON)
    # remove last label randomly
    if include_last_label == "randomized":
        y_pred_included = self._add_random_tie_breaking(
            y_pred_included,
            y_pred_index_last,
            y_pred_proba_cumsum,
            y_pred_proba_last,
            thresholds,
            **kwargs,
        )
    if cv == "prefit" or agg_scores in ["mean"]:
        prediction_sets = y_pred_included
    else:
        # compute the number of times the inequality is verified
        prediction_sets_summed = y_pred_included.sum(axis=2)
        # A label is kept when its vote count does not exceed the
        # (n + 1) * (1 - alpha) counts stored in self.quantiles_.
        prediction_sets = np.less_equal(
            prediction_sets_summed[:, :, np.newaxis]
            - self.quantiles_[np.newaxis, np.newaxis, :],
            EPSILON,
        )

    return cast(NDArray, prediction_sets)

mapie.conformity_scores.RAPSConformityScore

RAPSConformityScore(size_raps: Optional[float] = 0.2)

Bases: APSConformityScore

Regularized Adaptive Prediction Sets (RAPS) method-based non-conformity score. It uses the same technique as APSConformityScore class but with a penalty term to reduce the size of prediction sets. See [1] for more details. For now, this method only works with "prefit" and "split" strategies.

References

[1] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan and Jitendra Malik. "Uncertainty Sets for Image Classifiers using Conformal Prediction." International Conference on Learning Representations 2021.

PARAMETER DESCRIPTION
size_raps

Percentage of the data to be used for choosing lambda_star and k_star for the RAPS method.

TYPE: Optional[float] DEFAULT: 0.2

ATTRIBUTE DESCRIPTION
classes

Names of the classes.

TYPE: ArrayLike

random_state

Pseudo random number generator state.

TYPE: Union[int, RandomState]

quantiles_

The quantiles estimated from get_sets method.

TYPE: ArrayLike of shape (n_alpha)

label_encoder

The label encoder used to encode the labels.

TYPE: LabelEncoder

size_raps

Percentage of the data to be used for choosing lambda_star and k_star for the RAPS method.

TYPE: float

Source code in mapie/conformity_scores/sets/raps.py
def __init__(self, size_raps: Optional[float] = 0.2) -> None:
    """
    Initialize the RAPS conformity score.

    Parameters
    ----------
    size_raps: Optional[float]
        Percentage of the data to be used for choosing lambda_star and
        k_star for the RAPS method.

        By default `0.2`.
    """
    super().__init__()
    self.size_raps = size_raps

set_external_attributes

set_external_attributes(
    *,
    label_encoder: Optional[LabelEncoder] = None,
    size_raps: Optional[float] = None,
    **kwargs,
) -> None

Set attributes that are not provided by the user.

PARAMETER DESCRIPTION
label_encoder

The label encoder used to encode the labels.

By default None.

TYPE: Optional[LabelEncoder] DEFAULT: None

size_raps

Percentage of the data to be used for choosing lambda_star and k_star for the RAPS method.

By default None.

TYPE: Optional[float] DEFAULT: None

Source code in mapie/conformity_scores/sets/raps.py
def set_external_attributes(
    self,
    *,
    label_encoder: Optional[LabelEncoder] = None,
    size_raps: Optional[float] = None,
    **kwargs,
) -> None:
    """
    Set attributes that are not provided by the user.

    Parameters
    ----------
    label_encoder: Optional[LabelEncoder]
        Encoder used to map labels to integer codes.

        By default `None`.

    size_raps: Optional[float]
        Percentage of the data to be used for choosing lambda_star and
        k_star for the RAPS method.

        By default `None`.
    """
    # Let the parent class consume the remaining attributes first.
    super().set_external_attributes(**kwargs)

    # Narrow the optional encoder for mypy before storing it.
    encoder = cast(LabelEncoder, label_encoder)
    self.label_encoder_ = encoder
    self.size_raps = size_raps

split_data

split_data(
    X: NDArray,
    y: NDArray,
    y_enc: NDArray,
    sample_weight: Optional[NDArray] = None,
    groups: Optional[NDArray] = None,
)

Split data. Keeps part of the data for the calibration estimator (separate from the calibration data).

PARAMETER DESCRIPTION
X

Observed values.

TYPE: NDArray

y

Target values.

TYPE: NDArray

y_enc

Target values as normalized encodings.

TYPE: NDArray

sample_weight

Non-null sample weights.

TYPE: Optional[NDArray] DEFAULT: None

groups

Group labels for the samples used while splitting the dataset into train/test set. By default None.

TYPE: Optional[NDArray] DEFAULT: None

RETURNS DESCRIPTION
Tuple[NDArray, NDArray, NDArray, NDArray, Optional[NDArray],
Optional[NDArray]]
  • X: NDArray of shape (n_samples, n_features)
  • y: NDArray of shape (n_samples,)
  • y_enc: NDArray of shape (n_samples,)
  • sample_weight: Optional[NDArray] of shape (n_samples,)
  • groups: Optional[NDArray] of shape (n_samples,)
Source code in mapie/conformity_scores/sets/raps.py
def split_data(
    self,
    X: NDArray,
    y: NDArray,
    y_enc: NDArray,
    sample_weight: Optional[NDArray] = None,
    groups: Optional[NDArray] = None,
):
    """
    Split the data, holding part of it out for the RAPS calibration
    (kept separate from the conformity calibration data).

    Parameters
    ----------
    X: NDArray
        Observed values.

    y: NDArray
        Target values.

    y_enc: NDArray
        Target values as normalized encodings.

    sample_weight: Optional[NDArray] of shape (n_samples,)
        Non-null sample weights.

    groups: Optional[NDArray] of shape (n_samples,)
        Group labels for the samples used while splitting the dataset
        into train/test set.

        By default `None`.

    Returns
    -------
    Tuple[NDArray, NDArray, NDArray, Optional[NDArray], Optional[NDArray]]
        - X: NDArray of shape (n_samples, n_features)
        - y: NDArray of shape (n_samples,)
        - y_enc: NDArray of shape (n_samples,)
        - sample_weight: Optional[NDArray] of shape (n_samples,)
        - groups: Optional[NDArray] of shape (n_samples,)
    """
    # Stratified split so the RAPS hold-out keeps the class balance.
    splitter = StratifiedShuffleSplit(
        n_splits=1, test_size=self.size_raps, random_state=self.random_state
    )
    fit_idx, raps_idx = next(splitter.split(X, y_enc))

    # Store the hold-out on the instance; take the hold-out subsets
    # before rebinding X / y_enc to the remaining part.
    self.X_raps = _safe_indexing(X, raps_idx)
    self.y_raps = _safe_indexing(y_enc, raps_idx)
    X = _safe_indexing(X, fit_idx)
    y_enc = _safe_indexing(y_enc, fit_idx)

    # Decode encoded targets back to the original label space.
    self.y_raps_no_enc = self.label_encoder_.inverse_transform(self.y_raps)
    y = self.label_encoder_.inverse_transform(y_enc)

    # Narrow types for mypy and subset the optional arrays.
    y_enc = cast(NDArray, y_enc)
    if sample_weight is not None:
        sample_weight = cast(NDArray, sample_weight)[fit_idx]
    if groups is not None:
        groups = cast(NDArray, groups)[fit_idx]

    # Number of samples actually kept for calibration.
    self.n_samples_ = _num_samples(y_enc)

    return X, y, y_enc, sample_weight, groups

get_conformity_scores

get_conformity_scores(
    y: NDArray,
    y_pred: NDArray,
    y_enc: Optional[NDArray] = None,
    **kwargs,
) -> NDArray

Get the conformity score.

PARAMETER DESCRIPTION
y

Observed target values.

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

y_enc

Target values as normalized encodings.

TYPE: Optional[NDArray] DEFAULT: None

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Conformity scores.

Source code in mapie/conformity_scores/sets/raps.py
def get_conformity_scores(
    self, y: NDArray, y_pred: NDArray, y_enc: Optional[NDArray] = None, **kwargs
) -> NDArray:
    """
    Compute the conformity scores and prepare the RAPS hold-out.

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values.

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    y_enc: Optional[NDArray] of shape (n_samples,)
        Target values as normalized encodings.

    Returns
    -------
    NDArray of shape (n_samples,)
        Conformity scores.
    """
    # Score the RAPS hold-out with the fitted estimator and record the
    # rank of each true label; both are reused to tune lambda and k.
    predict_params = kwargs.pop("predict_params", {})
    proba_raps = self.predictor.single_estimator_.predict_proba(
        self.X_raps, **predict_params
    )
    self.y_pred_proba_raps = proba_raps
    self.position_raps = get_true_label_position(proba_raps, self.y_raps)

    # The scores themselves are the plain APS scores.
    return super().get_conformity_scores(y, y_pred, y_enc=y_enc, **kwargs)

get_conformity_score_quantiles

get_conformity_score_quantiles(
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    include_last_label: Optional[Union[bool, str]] = True,
    **kwargs,
) -> NDArray

Get the quantiles of the conformity scores for each uncertainty level.

PARAMETER DESCRIPTION
conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

agg_scores

Method to aggregate the scores from the base estimators. If "mean", the scores are averaged. If "crossval", the scores are obtained from cross-validation (not used here).

By default, "mean".

TYPE: Optional[str] DEFAULT: 'mean'

include_last_label

Whether or not to include last label in prediction sets. Choose among False, True or "randomized".

By default, True.

See the docstring of APSConformityScore.get_prediction_sets for more details.

TYPE: Optional[Union[bool, str]] DEFAULT: True

RETURNS DESCRIPTION
NDArray

Array of quantiles with respect to alpha_np.

Source code in mapie/conformity_scores/sets/raps.py
def get_conformity_score_quantiles(
    self,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    agg_scores: Optional[str] = "mean",
    include_last_label: Optional[Union[bool, str]] = True,
    **kwargs,
) -> NDArray:
    """
    Compute the quantiles of the regularized conformity scores for
    every uncertainty level.

    Parameters
    ----------
    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        Floats between 0 and 1, the uncertainty of the confidence
        interval.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    agg_scores: Optional[str]
        Aggregation method for the base estimators' scores (not used
        here).

        By default `"mean"`.

    include_last_label: Optional[Union[bool, str]]
        Whether or not to include the last label in prediction sets;
        one of `False`, `True` or `"randomized"`. See the docstring of
        `APSConformityScore.get_prediction_sets` for details.

        By default `True`.

    Returns
    -------
    NDArray
        Array of quantiles with respect to alpha_np.
    """
    _check_alpha_and_n_samples(alpha_np, self.X_raps.shape[0])

    # k_star: per-alpha rank threshold (1-based), from the quantiles of
    # the true-label positions on the RAPS hold-out.
    self.k_star = _compute_quantiles(self.position_raps, alpha_np) + 1

    # Duplicate the hold-out probabilities along the alpha axis.
    proba_per_alpha = np.repeat(
        self.y_pred_proba_raps[:, :, np.newaxis], len(alpha_np), axis=2
    )

    # lambda_star: regularization strength tuned on the hold-out.
    self.lambda_star = self._find_lambda_star(
        self.y_raps_no_enc,
        proba_per_alpha,
        alpha_np,
        include_last_label,
        self.k_star,
    )

    # Quantiles of the regularized scores.
    regularized_scores = self._regularize_conformity_score(
        self.k_star, self.lambda_star, conformity_scores, self.cutoff
    )
    return _compute_quantiles(regularized_scores, alpha_np)

mapie.conformity_scores.TopKConformityScore

TopKConformityScore()

Bases: BaseClassificationScore

Top-K method-based non-conformity score.

It is based on the sorted index of the probability of the true label in the softmax outputs, on the conformalization set. In case two probabilities are equal, both are taken, thus, the size of some prediction sets may be different from the others.

References

[1] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan and Jitendra Malik. "Uncertainty Sets for Image Classifiers using Conformal Prediction." International Conference on Learning Representations 2021.

ATTRIBUTE DESCRIPTION
classes

Names of the classes.

TYPE: Optional[ArrayLike]

random_state

Pseudo random number generator state.

TYPE: Optional[Union[int, RandomState]]

quantiles_

The quantiles estimated from get_sets method.

TYPE: ArrayLike of shape (n_alpha)

Source code in mapie/conformity_scores/sets/topk.py
def __init__(self) -> None:
    """Initialize the Top-K conformity score with base-class defaults."""
    super().__init__()

get_conformity_scores

get_conformity_scores(
    y: NDArray,
    y_pred: NDArray,
    y_enc: Optional[NDArray] = None,
    **kwargs,
) -> NDArray

Get the conformity score.

PARAMETER DESCRIPTION
y

Observed target values.

TYPE: NDArray

y_pred

Predicted target values.

TYPE: NDArray

y_enc

Target values as normalized encodings.

TYPE: Optional[NDArray] DEFAULT: None

RETURNS DESCRIPTION
NDArray of shape (n_samples,)

Conformity scores.

Source code in mapie/conformity_scores/sets/topk.py
def get_conformity_scores(
    self, y: NDArray, y_pred: NDArray, y_enc: Optional[NDArray] = None, **kwargs
) -> NDArray:
    """
    Compute the Top-K conformity scores.

    Parameters
    ----------
    y: NDArray of shape (n_samples,)
        Observed target values.

    y_pred: NDArray of shape (n_samples,)
        Predicted target values.

    y_enc: Optional[NDArray] of shape (n_samples,)
        Target values as normalized encodings.

    Returns
    -------
    NDArray of shape (n_samples,)
        Conformity scores.
    """
    # Narrow the optional argument for mypy.
    encoded_y = cast(NDArray, y_enc)

    # The score is the position of the true label once the classes are
    # ordered by decreasing predicted probability.
    return get_true_label_position(y_pred, encoded_y)

get_predictions

get_predictions(
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Just processes the passed y_pred_proba.

PARAMETER DESCRIPTION
X

Observed feature values (not used since predictions are passed).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, represents the uncertainty of the confidence interval.

TYPE: NDArray

y_pred_proba

Predicted probabilities from the estimator.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Array of predictions.

Source code in mapie/conformity_scores/sets/topk.py
def get_predictions(
    self,
    X: NDArray,
    alpha_np: NDArray,
    y_pred_proba: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Just processes the passed y_pred_proba.

    Parameters
    ----------
    X: NDArray of shape (n_samples, n_features)
        Observed feature values (not used since predictions are passed).

    alpha_np: NDArray of shape (n_alpha,)
        Floats between 0 and 1, the uncertainty of the confidence
        interval.

    y_pred_proba: NDArray
        Predicted probabilities from the estimator.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    Returns
    -------
    NDArray
        The probabilities duplicated once per alpha along a new third
        axis.
    """
    # One identical slice per alpha so downstream code can index the
    # prediction by uncertainty level.
    n_alpha = len(alpha_np)
    return np.stack([y_pred_proba] * n_alpha, axis=2)

get_conformity_score_quantiles

get_conformity_score_quantiles(
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Get the quantiles of the conformity scores for each uncertainty level.

PARAMETER DESCRIPTION
conformity_scores

Conformity scores for each sample.

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Array of quantiles with respect to alpha_np.

Source code in mapie/conformity_scores/sets/topk.py
def get_conformity_score_quantiles(
    self,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Compute the quantiles of the conformity scores for every
    uncertainty level.

    Parameters
    ----------
    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample.

    alpha_np: NDArray of shape (n_alpha,)
        Floats between 0 and 1, the uncertainty of the confidence
        interval.

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    Returns
    -------
    NDArray
        Array of quantiles with respect to alpha_np.
    """
    # Top-K needs no strategy-specific handling: empirical quantiles.
    quantiles = _compute_quantiles(conformity_scores, alpha_np)
    return quantiles

get_prediction_sets

get_prediction_sets(
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray

Generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.

PARAMETER DESCRIPTION
y_pred_proba

Target prediction.

TYPE: NDArray

conformity_scores

Conformity scores for each sample (not used here).

TYPE: NDArray

alpha_np

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval (not used here).

TYPE: NDArray

cv

Cross-validation strategy used by the estimator (not used here).

TYPE: Optional[Union[int, str, BaseCrossValidator]]

RETURNS DESCRIPTION
NDArray

Boolean array indicating, for each sample and each uncertainty level, which labels belong to the prediction set.

Source code in mapie/conformity_scores/sets/topk.py
def get_prediction_sets(
    self,
    y_pred_proba: NDArray,
    conformity_scores: NDArray,
    alpha_np: NDArray,
    cv: Optional[Union[int, str, BaseCrossValidator]],
    **kwargs,
) -> NDArray:
    """
    Generate prediction sets based on the probability predictions,
    the conformity scores and the uncertainty level.

    Parameters
    -----------
    y_pred_proba: NDArray of shape (n_samples, n_classes)
        Target prediction.

    conformity_scores: NDArray of shape (n_samples,)
        Conformity scores for each sample (not used here).

    alpha_np: NDArray of shape (n_alpha,)
        NDArray of floats between 0 and 1, representing the uncertainty
        of the confidence interval (not used here).

    cv: Optional[Union[int, str, BaseCrossValidator]]
        Cross-validation strategy used by the estimator (not used here).

    Returns
    --------
    NDArray
        Boolean mask indicating, for each sample and each alpha, which
        labels belong to the prediction set.
    """
    # Drop the alpha axis added by `get_predictions`: every slice is an
    # identical copy of the per-class probabilities.
    y_pred_proba = y_pred_proba[:, :, 0]
    # Column indices of the classes sorted by decreasing probability.
    index_sorted = np.fliplr(np.argsort(y_pred_proba, axis=1))
    # For each alpha, index of the k-th most probable class, where k is
    # the corresponding rank quantile stored in self.quantiles_.
    y_pred_index_last = np.stack(
        [index_sorted[:, quantile] for quantile in self.quantiles_], axis=1
    )
    # Probability of that k-th class, per sample and per alpha.
    y_pred_proba_last = np.stack(
        [
            np.take_along_axis(
                y_pred_proba, y_pred_index_last[:, iq].reshape(-1, 1), axis=1
            )
            for iq, _ in enumerate(self.quantiles_)
        ],
        axis=2,
    )
    # Keep every class at least as probable as the k-th one; EPSILON
    # absorbs float round-off so exact ties stay included.
    prediction_sets = np.greater_equal(
        y_pred_proba[:, :, np.newaxis] - y_pred_proba_last, -EPSILON
    )

    return cast(NDArray, prediction_sets)