Regression
Conformal prediction methods for regression tasks.
mapie.regression.SplitConformalRegressor
SplitConformalRegressor(
estimator: RegressorMixin = LinearRegression(),
confidence_level: Union[float, Iterable[float]] = 0.9,
conformity_score: Union[
str, BaseRegressionScore
] = "absolute",
prefit: bool = True,
n_jobs: Optional[int] = None,
verbose: int = 0,
)
Computes prediction intervals using the split conformal regression technique:
- The fit method (optional) fits the base regressor to the training data.
- The conformalize method estimates the uncertainty of the base regressor by computing conformity scores on the conformalization set.
- The predict_interval method predicts points and intervals.
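The three steps above can be sketched without the library. The following is a minimal, standard-library-only illustration of split conformal prediction with the absolute residual score; it is not MAPIE's implementation. The helper names (fit_line, conformal_quantile), the one-feature least-squares model standing in for the base regressor, and the synthetic data are all assumptions made for this example.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in base regressor)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def conformal_quantile(scores, confidence_level):
    """Empirical score quantile with the usual (n + 1) finite-sample correction."""
    n = len(scores)
    rank = math.ceil((n + 1) * confidence_level)
    if rank > n:
        return math.inf  # too few scores to reach this confidence level
    return sorted(scores)[rank - 1]


# Synthetic data (assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(300)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
x_train, y_train = xs[:200], ys[:200]
x_conf, y_conf = xs[200:], ys[200:]

# Step 1 (fit): train the base regressor on the training set only.
slope, intercept = fit_line(x_train, y_train)


def predict(x):
    return slope * x + intercept


# Step 2 (conformalize): absolute residuals on the held-out conformalization set.
scores = [abs(y - predict(x)) for x, y in zip(x_conf, y_conf)]
half_width = conformal_quantile(scores, confidence_level=0.9)


# Step 3 (predict_interval): point prediction plus or minus the score quantile.
def predict_interval(x):
    point = predict(x)
    return point, (point - half_width, point + half_width)


point, (lower, upper) = predict_interval(5.0)
print(f"point={point:.2f}, interval=[{lower:.2f}, {upper:.2f}]")
```

Keeping the conformalization set separate from the training set is what makes the score quantile a valid estimate of out-of-sample error in this sketch.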
| PARAMETER | DESCRIPTION |
|---|---|
| estimator | The base regressor used to predict points. TYPE: RegressorMixin |
| confidence_level | The confidence level(s) for the prediction intervals, indicating the desired coverage probability. If a float is provided, it represents a single confidence level. If a list is provided, one prediction interval is returned for each specified confidence level. TYPE: Union[float, Iterable[float]] |
| conformity_score | The method used to compute conformity scores, given as a string (default "absolute"). A custom score function inheriting from BaseRegressionScore may also be provided. TYPE: Union[str, BaseRegressionScore] |
| prefit | If True, the base regressor must already be fitted, and the fit method must be skipped. If False, the base regressor will be fitted during the fit method. TYPE: bool |
| n_jobs | The number of jobs to run in parallel when applicable. TYPE: Optional[int] |
| verbose | Controls the verbosity level. Higher values increase the output details. TYPE: int |
Examples:
>>> from mapie.regression import SplitConformalRegressor
>>> from mapie.utils import train_conformalize_test_split
>>> from sklearn.datasets import make_regression
>>> from sklearn.linear_model import Ridge
>>> X, y = make_regression(n_samples=500, n_features=2, noise=1.0)
>>> (
... X_train, X_conformalize, X_test,
... y_train, y_conformalize, y_test
... ) = train_conformalize_test_split(
... X, y, train_size=0.6, conformalize_size=0.2, test_size=0.2, random_state=1
... )
>>> mapie_regressor = SplitConformalRegressor(
... estimator=Ridge(),
... confidence_level=0.95,
... prefit=False,
... ).fit(X_train, y_train).conformalize(X_conformalize, y_conformalize)
Source code in mapie/regression/regression.py
fit
fit(
X_train: ArrayLike,
y_train: ArrayLike,
fit_params: Optional[dict] = None,
) -> SplitConformalRegressor
Fits the base regressor to the training data.
| PARAMETER | DESCRIPTION |
|---|---|
| X_train | Training data features. TYPE: ArrayLike |
| y_train | Training data targets. TYPE: ArrayLike |
| fit_params | Parameters to pass to the fit method of the base regressor. TYPE: Optional[dict] |

| RETURNS | DESCRIPTION |
|---|---|
| Self | The fitted SplitConformalRegressor instance. |
Source code in mapie/regression/regression.py
conformalize
conformalize(
X_conformalize: ArrayLike,
y_conformalize: ArrayLike,
predict_params: Optional[dict] = None,
) -> SplitConformalRegressor
Estimates the uncertainty of the base regressor by computing conformity scores on the conformalization set.
| PARAMETER | DESCRIPTION |
|---|---|
| X_conformalize | Features of the conformalization set. TYPE: ArrayLike |
| y_conformalize | Targets of the conformalization set. TYPE: ArrayLike |
| predict_params | Parameters to pass to the predict method of the base regressor. TYPE: Optional[dict] |

| RETURNS | DESCRIPTION |
|---|---|
| Self | The conformalized SplitConformalRegressor instance. |
Source code in mapie/regression/regression.py
predict_interval
predict_interval(
X: ArrayLike,
minimize_interval_width: bool = False,
allow_infinite_bounds: bool = False,
) -> Tuple[NDArray, NDArray]
Predicts points (using the base regressor) and intervals.
If several confidence levels were provided during initialisation, several intervals will be predicted for each sample. See the return signature.
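To illustrate what "several intervals per sample" can look like, here is a small standard-library sketch. The nested-list layout intervals[sample][bound][level] is an assumption chosen for the example, as are the hypothetical scores and point predictions; it is not taken from the library's return value.

```python
import math


def conformal_quantile(scores, confidence_level):
    """Empirical score quantile with the (n + 1) finite-sample correction."""
    n = len(scores)
    rank = math.ceil((n + 1) * confidence_level)
    return math.inf if rank > n else sorted(scores)[rank - 1]


# Hypothetical conformity scores and point predictions.
scores = [0.1 * k for k in range(1, 101)]
points = [3.0, 7.5]
confidence_levels = [0.8, 0.9]

# One score quantile per requested confidence level.
quantiles = [conformal_quantile(scores, level) for level in confidence_levels]

# intervals[i][0][k] is the lower bound and intervals[i][1][k] the upper bound
# of sample i at confidence level k.
intervals = [
    [[p - q for q in quantiles], [p + q for q in quantiles]]
    for p in points
]

for level_index, level in enumerate(confidence_levels):
    lower = intervals[0][0][level_index]
    upper = intervals[0][1][level_index]
    print(f"level={level}: [{lower:.2f}, {upper:.2f}]")
```

A higher confidence level selects a larger score quantile, so the corresponding interval is at least as wide.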
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. TYPE: ArrayLike |
| minimize_interval_width | If True, attempts to minimize the interval width. TYPE: bool |
| allow_infinite_bounds | If True, allows prediction intervals with infinite bounds. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| Tuple[NDArray, NDArray] | Two arrays: the point predictions and the prediction intervals. |
Source code in mapie/regression/regression.py
predict
Predicts points.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. |

| RETURNS | DESCRIPTION |
|---|---|
| NDArray | Array of point predictions, with shape (n_samples,). |
Source code in mapie/regression/regression.py
mapie.regression.CrossConformalRegressor
CrossConformalRegressor(
estimator: RegressorMixin = LinearRegression(),
confidence_level: Union[float, Iterable[float]] = 0.9,
conformity_score: Union[
str, BaseRegressionScore
] = "absolute",
method: str = "plus",
cv: Union[int, BaseCrossValidator] = 5,
n_jobs: Optional[int] = None,
verbose: int = 0,
random_state: Optional[Union[int, RandomState]] = None,
)
Computes prediction intervals using the cross conformal regression technique:
- The fit_conformalize method estimates the uncertainty of the base regressor in a cross-validation style. It fits the base regressor on folds of the dataset and computes conformity scores on the out-of-fold data.
- The predict_interval method computes prediction points and intervals.
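The out-of-fold scoring described above can be sketched with the standard library alone. This is an illustration, not MAPIE's code: the interval rule follows the CV+ construction from the jackknife+ literature, with ranks clipped to the available scores instead of returning infinite bounds, and the fold assignment, fit_line helper, and synthetic data are assumptions.

```python
import math
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in base regressor)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
n = len(xs)
n_folds = 5
fold_of = [i % n_folds for i in range(n)]  # interleaved fold assignment (assumption)

# Fit one model per fold, each trained on the remaining folds.
fold_models = []
for fold in range(n_folds):
    train = [i for i in range(n) if fold_of[i] != fold]
    fold_models.append(fit_line([xs[i] for i in train], [ys[i] for i in train]))


def out_of_fold_predict(i, x):
    """Prediction at x from the model that never saw sample i."""
    slope, intercept = fold_models[fold_of[i]]
    return slope * x + intercept


# Conformity scores: absolute residuals on out-of-fold data.
scores = [abs(ys[i] - out_of_fold_predict(i, xs[i])) for i in range(n)]


def predict_interval(x, confidence_level=0.9):
    alpha = 1.0 - confidence_level
    lowers = sorted(out_of_fold_predict(i, x) - scores[i] for i in range(n))
    uppers = sorted(out_of_fold_predict(i, x) + scores[i] for i in range(n))
    lower = lowers[max(math.floor(alpha * (n + 1)), 1) - 1]
    upper = uppers[min(math.ceil((1.0 - alpha) * (n + 1)), n) - 1]
    # Point prediction: mean over the fold models.
    point = mean(slope * x + intercept for slope, intercept in fold_models)
    return point, (lower, upper)


point, (lower, upper) = predict_interval(5.0)
print(f"point={point:.2f}, interval=[{lower:.2f}, {upper:.2f}]")
```

Unlike the split approach, every sample contributes a conformity score here, at the cost of fitting one model per fold.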
| PARAMETER | DESCRIPTION |
|---|---|
| estimator | The base regressor used to predict points. TYPE: RegressorMixin |
| confidence_level | The confidence level(s) for the prediction intervals, indicating the desired coverage probability. If a float is provided, it represents a single confidence level. If a list is provided, one prediction interval is returned for each specified confidence level. TYPE: Union[float, Iterable[float]] |
| conformity_score | The method used to compute conformity scores, given as a string (default "absolute"). A custom score function inheriting from BaseRegressionScore may also be provided. TYPE: Union[str, BaseRegressionScore] |
| method | The method used to compute prediction intervals (default "plus"). TYPE: str |
| cv | The cross-validator used to compute conformity scores, given as an integer or a BaseCrossValidator instance (default 5). TYPE: Union[int, BaseCrossValidator] |
| n_jobs | The number of jobs to run in parallel when applicable. TYPE: Optional[int] |
| verbose | Controls the verbosity level. Higher values increase the output details. TYPE: int |
| random_state | A seed or random state instance to ensure reproducibility in any random operations within the regressor. TYPE: Optional[Union[int, RandomState]] |
Examples:
>>> from mapie.regression import CrossConformalRegressor
>>> from sklearn.datasets import make_regression
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.linear_model import Ridge
>>> X_full, y_full = make_regression(n_samples=500, n_features=2, noise=1.0)
>>> X, X_test, y, y_test = train_test_split(X_full, y_full)
>>> mapie_regressor = CrossConformalRegressor(
... estimator=Ridge(),
... confidence_level=0.95,
... cv=10
... ).fit_conformalize(X, y)
Source code in mapie/regression/regression.py
fit_conformalize
fit_conformalize(
X: ArrayLike,
y: ArrayLike,
groups: Optional[ArrayLike] = None,
fit_params: Optional[dict] = None,
predict_params: Optional[dict] = None,
) -> CrossConformalRegressor
Estimates the uncertainty of the base regressor in a cross-validation style: fits the base regressor on different folds of the dataset and computes conformity scores on the corresponding out-of-fold data.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. TYPE: ArrayLike |
| y | Targets. TYPE: ArrayLike |
| groups | Groups to pass to the cross-validator. TYPE: Optional[ArrayLike] |
| fit_params | Parameters to pass to the fit method of the base regressor. TYPE: Optional[dict] |
| predict_params | Parameters to pass to the predict method of the base regressor. TYPE: Optional[dict] |

| RETURNS | DESCRIPTION |
|---|---|
| Self | This CrossConformalRegressor instance, fitted and conformalized. |
Source code in mapie/regression/regression.py
predict_interval
predict_interval(
X: ArrayLike,
aggregate_predictions: Optional[str] = "mean",
minimize_interval_width: bool = False,
allow_infinite_bounds: bool = False,
) -> Tuple[NDArray, NDArray]
Predicts points and intervals.
If several confidence levels were provided during initialisation, several intervals will be predicted for each sample. See the return signature.
By default, points are predicted using an aggregation. See the aggregate_predictions parameter.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. TYPE: ArrayLike |
| aggregate_predictions | The method used to predict a point (default "mean"). TYPE: Optional[str] |
| minimize_interval_width | If True, attempts to minimize the interval width. TYPE: bool |
| allow_infinite_bounds | If True, allows prediction intervals with infinite bounds. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| Tuple[NDArray, NDArray] | Two arrays: the point predictions and the prediction intervals. |
Source code in mapie/regression/regression.py
predict
Predicts points.
By default, points are predicted using an aggregation. See the aggregate_predictions parameter.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. |
| aggregate_predictions | The method used to predict a point. |

| RETURNS | DESCRIPTION |
|---|---|
| NDArray | Array of point predictions, with shape (n_samples,). |
Source code in mapie/regression/regression.py
mapie.regression.JackknifeAfterBootstrapRegressor
JackknifeAfterBootstrapRegressor(
estimator: RegressorMixin = LinearRegression(),
confidence_level: Union[float, Iterable[float]] = 0.9,
conformity_score: Union[
str, BaseRegressionScore
] = "absolute",
method: str = "plus",
resampling: Union[int, Subsample] = 30,
aggregation_method: str = "mean",
n_jobs: Optional[int] = None,
verbose: int = 0,
random_state: Optional[Union[int, RandomState]] = None,
)
Computes prediction intervals using the jackknife-after-bootstrap technique:
- The fit_conformalize method estimates the uncertainty of the base regressor using bootstrap sampling. It fits the base regressor on samples of the dataset and computes conformity scores on the out-of-sample data.
- The predict_interval method computes prediction points and intervals.
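The bootstrap-based scoring described above can be sketched as follows, using only the standard library. It is an illustration under stated assumptions rather than the library's implementation: the fit_line helper, the synthetic data, and the choice of a mean aggregation over out-of-sample models are all hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in base regressor)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


rng = random.Random(0)
n = 80
xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
n_resamples = 30

# Fit one model per bootstrap resample and remember which samples it saw.
models, bags = [], []
for _ in range(n_resamples):
    bag = [rng.randrange(n) for _ in range(n)]  # sample indices with replacement
    models.append(fit_line([xs[i] for i in bag], [ys[i] for i in bag]))
    bags.append(set(bag))

# Score each sample using only the models whose resample did not contain it.
scores = []
for i in range(n):
    out_of_sample = [
        slope * xs[i] + intercept
        for (slope, intercept), bag in zip(models, bags)
        if i not in bag
    ]
    if out_of_sample:  # skip samples that appear in every resample
        scores.append(abs(ys[i] - mean(out_of_sample)))


def predict(x):
    """Aggregated point prediction: mean over all bootstrap models."""
    return mean(slope * x + intercept for slope, intercept in models)


print(f"{len(scores)} out-of-sample scores, prediction at 5.0: {predict(5.0):.2f}")
```

Because each resample leaves out roughly a third of the data, the models fitted once can be reused both for prediction and for honest residuals.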
| PARAMETER | DESCRIPTION |
|---|---|
| estimator | The base regressor used to predict points. TYPE: RegressorMixin |
| confidence_level | The confidence level(s) for the prediction intervals, indicating the desired coverage probability. If a float is provided, it represents a single confidence level. If a list is provided, one prediction interval is returned for each specified confidence level. TYPE: Union[float, Iterable[float]] |
| conformity_score | The method used to compute conformity scores, given as a string (default "absolute"). A custom score function inheriting from BaseRegressionScore may also be provided. TYPE: Union[str, BaseRegressionScore] |
| method | The method used to compute prediction intervals (default "plus"). Note: The "base" method is not mentioned in the conformal inference literature for jackknife-after-bootstrap strategies, hence not provided here. TYPE: str |
| resampling | Number of bootstrap resamples or an instance of Subsample. TYPE: Union[int, Subsample] |
| aggregation_method | Aggregation method for predictions across bootstrap samples (default "mean"). TYPE: str |
| n_jobs | The number of jobs to run in parallel when applicable. TYPE: Optional[int] |
| verbose | Controls the verbosity level. Higher values increase the output details. TYPE: int |
| random_state | A seed or random state instance to ensure reproducibility in any random operations within the regressor. TYPE: Optional[Union[int, RandomState]] |
Examples:
>>> from mapie.regression import JackknifeAfterBootstrapRegressor
>>> from sklearn.datasets import make_regression
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.linear_model import Ridge
>>> X_full, y_full = make_regression(n_samples=500, n_features=2, noise=1.0)
>>> X, X_test, y, y_test = train_test_split(X_full, y_full)
>>> mapie_regressor = JackknifeAfterBootstrapRegressor(
... estimator=Ridge(),
... confidence_level=0.95,
... resampling=25,
... ).fit_conformalize(X, y)
Source code in mapie/regression/regression.py
fit_conformalize
fit_conformalize(
X: ArrayLike,
y: ArrayLike,
fit_params: Optional[dict] = None,
predict_params: Optional[dict] = None,
) -> JackknifeAfterBootstrapRegressor
Estimates the uncertainty of the base regressor using bootstrap sampling: fits the base regressor on (potentially overlapping) samples of the dataset, and computes conformity scores on the corresponding out-of-sample data.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. Must be the same X used in .fit. TYPE: ArrayLike |
| y | Targets. Must be the same y used in .fit. TYPE: ArrayLike |
| fit_params | Parameters to pass to the fit method of the base regressor. TYPE: Optional[dict] |
| predict_params | Parameters to pass to the predict method of the base regressor. TYPE: Optional[dict] |

| RETURNS | DESCRIPTION |
|---|---|
| Self | This JackknifeAfterBootstrapRegressor instance, fitted and conformalized. |
Source code in mapie/regression/regression.py
predict_interval
predict_interval(
X: ArrayLike,
ensemble: bool = True,
minimize_interval_width: bool = False,
allow_infinite_bounds: bool = False,
) -> Tuple[NDArray, NDArray]
Predicts points and intervals.
If several confidence levels were provided during initialisation, several intervals will be predicted for each sample. See the return signature.
By default, points are predicted using an aggregation.
See the ensemble parameter.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Test data for prediction intervals. TYPE: ArrayLike |
| ensemble | If True, a predicted point is an aggregation of the predictions of the regressors trained on each bootstrap sample. This aggregation depends on the aggregation_method provided during initialisation. If False, a point is predicted using the regressor trained on the entire data. TYPE: bool |
| minimize_interval_width | If True, attempts to minimize the interval width. TYPE: bool |
| allow_infinite_bounds | If True, allows prediction intervals with infinite bounds. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| Tuple[NDArray, NDArray] | Two arrays: the point predictions and the prediction intervals. |
Source code in mapie/regression/regression.py
predict
Predicts points.
By default, points are predicted using an aggregation.
See the ensemble parameter.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Data features for generating point predictions. |
| ensemble | If True, a predicted point is an aggregation of the predictions of the regressors trained on each bootstrap sample. This aggregation depends on the aggregation_method provided during initialisation. |

| RETURNS | DESCRIPTION |
|---|---|
| NDArray | Array of point predictions, with shape (n_samples,). |
Source code in mapie/regression/regression.py
mapie.regression.ConformalizedQuantileRegressor
ConformalizedQuantileRegressor(
estimator: Optional[
Union[
RegressorMixin,
Pipeline,
List[Union[RegressorMixin, Pipeline]],
]
] = None,
confidence_level: float = 0.9,
prefit: bool = False,
)
Computes prediction intervals using the conformalized quantile regression technique:
- The fit method fits three models to the training data using the provided regressor: a model to predict the target, and models to predict upper and lower quantiles around the target.
- The conformalize method estimates the uncertainty of the quantile models using the conformalization set.
- The predict_interval method computes prediction points and intervals.
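The conformalization step for quantile models can be sketched with the standard library. This follows the single-constant score from the conformalized quantile regression literature and is not MAPIE's implementation. The two fixed functions lower_quantile and upper_quantile are hypothetical stand-ins for fitted quantile regressors, and the data are synthetic.

```python
import math
import random


def conformal_quantile(scores, confidence_level):
    """Empirical score quantile with the (n + 1) finite-sample correction."""
    n = len(scores)
    rank = math.ceil((n + 1) * confidence_level)
    return math.inf if rank > n else sorted(scores)[rank - 1]


# Hypothetical stand-ins for fitted lower and upper quantile regressors.
def lower_quantile(x):
    return 2.0 * x - 1.0


def upper_quantile(x):
    return 2.0 * x + 1.0


# Synthetic conformalization set (assumption): y = 2x plus Gaussian noise.
rng = random.Random(0)
x_conf = [rng.uniform(0.0, 10.0) for _ in range(100)]
y_conf = [2.0 * x + rng.gauss(0.0, 1.0) for x in x_conf]

# Score: how far the target falls outside the quantile band.
# Negative values mean the target is inside the band.
scores = [
    max(lower_quantile(x) - y, y - upper_quantile(x))
    for x, y in zip(x_conf, y_conf)
]
correction = conformal_quantile(scores, confidence_level=0.9)


def predict_interval(x):
    """Shift both quantile predictions outward by the same constant."""
    return lower_quantile(x) - correction, upper_quantile(x) + correction


lower, upper = predict_interval(5.0)
print(f"correction={correction:.2f}, interval=[{lower:.2f}, {upper:.2f}]")
```

Here one shared constant widens both bounds. Using separate constants for the lower and upper bounds is a common variant that this sketch does not cover.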
| PARAMETER | DESCRIPTION |
|---|---|
| estimator | The regressor used to predict points and quantiles, given as a single regressor or as a list of regressors (see prefit). TYPE: Optional[Union[RegressorMixin, Pipeline, List[Union[RegressorMixin, Pipeline]]]] |
| confidence_level | The confidence level for the prediction intervals, indicating the desired coverage probability. TYPE: float |
| prefit | If True, three fitted quantile regressors must be provided, and the fit method must be skipped. If False, the three regressors will be fitted during the fit method. TYPE: bool |
Examples:
>>> from mapie.regression import ConformalizedQuantileRegressor
>>> from mapie.utils import train_conformalize_test_split
>>> from sklearn.datasets import make_regression
>>> from sklearn.linear_model import QuantileRegressor
>>> X, y = make_regression(n_samples=500, n_features=2, noise=1.0)
>>> (
... X_train, X_conformalize, X_test,
... y_train, y_conformalize, y_test
... ) = train_conformalize_test_split(
... X, y, train_size=0.6, conformalize_size=0.2, test_size=0.2, random_state=1
... )
>>> mapie_regressor = ConformalizedQuantileRegressor(
... estimator=QuantileRegressor(),
... confidence_level=0.95,
... ).fit(X_train, y_train).conformalize(X_conformalize, y_conformalize)
Source code in mapie/regression/quantile_regression.py
fit
fit(
X_train: ArrayLike,
y_train: ArrayLike,
fit_params: Optional[dict] = None,
) -> ConformalizedQuantileRegressor
Fits three models using the regressor provided at initialisation:
- a model to predict the target
- a model to predict the upper quantile of the target
- a model to predict the lower quantile of the target
| PARAMETER | DESCRIPTION |
|---|---|
| X_train | Training data features. TYPE: ArrayLike |
| y_train | Training data targets. TYPE: ArrayLike |
| fit_params | Parameters to pass to the fit method of the regressors. TYPE: Optional[dict] |

| RETURNS | DESCRIPTION |
|---|---|
| Self | The fitted ConformalizedQuantileRegressor instance. |
Source code in mapie/regression/quantile_regression.py
conformalize
conformalize(
X_conformalize: ArrayLike,
y_conformalize: ArrayLike,
predict_params: Optional[dict] = None,
) -> ConformalizedQuantileRegressor
Estimates the uncertainty of the quantile regressors by computing conformity scores on the conformalization set.
| PARAMETER | DESCRIPTION |
|---|---|
| X_conformalize | Features of the conformalization set. TYPE: ArrayLike |
| y_conformalize | Targets of the conformalization set. TYPE: ArrayLike |
| predict_params | Parameters to pass to the predict method of the regressors. TYPE: Optional[dict] |

| RETURNS | DESCRIPTION |
|---|---|
| Self | The ConformalizedQuantileRegressor instance. |
Source code in mapie/regression/quantile_regression.py
predict_interval
predict_interval(
X: ArrayLike,
minimize_interval_width: bool = False,
allow_infinite_bounds: bool = False,
symmetric_correction: bool = False,
) -> Tuple[NDArray, NDArray]
Predicts points (using the base regressor) and intervals.
The returned NDArray containing the prediction intervals is of shape (n_samples, 2, 1). The third dimension is unnecessary, but kept for consistency with the other conformal regression methods available in MAPIE.
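As a small illustration of the (n_samples, 2, 1) layout described above, the sketch below uses plain nested lists with hypothetical values in place of an NDArray and shows how the trailing dimension can be dropped to obtain flat lower and upper bounds.

```python
# Hypothetical intervals for three samples, laid out as (n_samples, 2, 1):
# intervals[i][0][0] is the lower bound, intervals[i][1][0] the upper bound.
intervals = [
    [[1.0], [3.0]],
    [[2.5], [4.5]],
    [[0.0], [2.0]],
]

# Drop the trailing length-1 dimension to get one (lower, upper) pair per sample.
lower_bounds = [sample[0][0] for sample in intervals]
upper_bounds = [sample[1][0] for sample in intervals]
widths = [upper - lower for lower, upper in zip(lower_bounds, upper_bounds)]

print(lower_bounds, upper_bounds, widths)
```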
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. TYPE: ArrayLike |
| minimize_interval_width | If True, attempts to minimize the interval width. TYPE: bool |
| allow_infinite_bounds | If True, allows prediction intervals with infinite bounds. TYPE: bool |
| symmetric_correction | To produce prediction intervals, the conformalized quantile regression technique corrects the predictions of the upper and lower quantile regressors by adding a constant. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| Tuple[NDArray, NDArray] | Two arrays: the point predictions and the prediction intervals. |
Source code in mapie/regression/quantile_regression.py
predict
Predicts points.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Features. |

| RETURNS | DESCRIPTION |
|---|---|
| NDArray | Array of point predictions, with shape (n_samples,). |
Source code in mapie/regression/quantile_regression.py
mapie.regression.TimeSeriesRegressor
TimeSeriesRegressor(
estimator: Optional[RegressorMixin] = None,
method: str = "enbpi",
cv: Optional[
Union[int, str, BaseCrossValidator]
] = None,
n_jobs: Optional[int] = None,
agg_function: Optional[str] = "mean",
verbose: int = 0,
conformity_score: Optional[BaseRegressionScore] = None,
random_state: Optional[Union[int, RandomState]] = None,
)
Bases: _MapieRegressor
Prediction intervals with out-of-fold residuals for time series.
This class has only two valid values for method: "enbpi" or "aci".
The prediction intervals are calibrated on a split of the training data. Both strategies estimate prediction intervals on single-output time series.
EnbPI allows you to update conformal scores using the update function. It replaces the oldest scores with the newest ones, keeping the total number of scores constant.
Strictly speaking, TimeSeriesRegressor only corresponds to EnbPI if the cv argument is of type BlockBootstrap.
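The score-replacement behaviour described for EnbPI can be pictured as a fixed-size sliding window. The sketch below uses a standard-library deque with hypothetical score values; it illustrates the idea only and does not reproduce the library's update method.

```python
from collections import deque

# Hypothetical conformity scores, oldest first. maxlen fixes the window size.
scores = deque([0.5, 1.2, 0.8, 0.3], maxlen=4)

# Scores computed on newly observed data.
new_scores = [2.0, 0.1]

# Appending to a full deque silently drops the oldest entries,
# so the total number of scores stays the same.
scores.extend(new_scores)

print(list(scores))  # [0.8, 0.3, 2.0, 0.1]
```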
The ACI strategy allows you to adapt the conformal inference (i.e., the quantile). If the real values fall outside the prediction intervals, the size of the intervals will grow. Conversely, if the real values fall inside the prediction intervals, the size of the intervals will decrease. You can use a gamma coefficient to adjust the strength of the correction. If the quantile is equal to zero, the method will produce an infinite set size.
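The adaptation described above can be sketched with the update rule from the adaptive conformal inference literature, alpha_next = alpha_t + gamma * (target_alpha - error), where error is 1 when the observed value fell outside the interval and 0 otherwise. The function names and the score values below are hypothetical, and this is not the library's code.

```python
import math


def aci_update(alpha_t, target_alpha, gamma, covered):
    """One ACI step: lower alpha_t after a miss, raise it after a hit."""
    error = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - error)


def half_width(scores, alpha_t):
    """Interval half-width taken as the (1 - alpha_t) empirical score quantile."""
    if alpha_t <= 0.0:
        return math.inf  # nothing may be excluded: infinite interval
    ordered = sorted(scores)
    rank = math.ceil((1.0 - alpha_t) * len(ordered))
    rank = min(max(rank, 1), len(ordered))  # clip to the available scores
    return ordered[rank - 1]


scores = [0.1 * k for k in range(1, 101)]  # hypothetical conformity scores
target_alpha = 0.1
gamma = 0.05

alpha_after_miss = aci_update(target_alpha, target_alpha, gamma, covered=False)
alpha_after_hit = aci_update(target_alpha, target_alpha, gamma, covered=True)

print(f"after a miss: alpha={alpha_after_miss:.3f}, "
      f"half-width={half_width(scores, alpha_after_miss):.1f}")
print(f"after a hit:  alpha={alpha_after_hit:.3f}, "
      f"half-width={half_width(scores, alpha_after_hit):.1f}")
```

A larger gamma reacts faster to misses but makes the interval width more volatile. With gamma set to 0 the level never moves.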
References
Chen Xu and Yao Xie. "Conformal prediction for dynamic time-series." https://arxiv.org/abs/2010.09107
Isaac Gibbs and Emmanuel Candes. "Adaptive conformal inference under distribution shift." https://proceedings.neurips.cc/paper/2021/file/0d441de75945e5acbc865406fc9a2559-Paper.pdf
Margaux Zaffran et al. "Adaptive Conformal Predictions for Time Series." https://arxiv.org/pdf/2202.07282.pdf
Source code in mapie/regression/time_series_regression.py
adapt_conformal_inference
adapt_conformal_inference(
X: ArrayLike,
y: ArrayLike,
gamma: float,
confidence_level: Optional[
Union[float, Iterable[float]]
] = None,
ensemble: bool = False,
optimize_beta: bool = False,
) -> TimeSeriesRegressor
Adapt the alpha_t attribute when new data with known labels are available.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Input data. TYPE: ArrayLike |
| y | Input labels. TYPE: ArrayLike |
| ensemble | Boolean determining whether the predictions are ensembled or not. By default False. TYPE: bool |
| gamma | Coefficient that decides the correction of the conformal inference. If it equals 0, there are no corrections. TYPE: float |
| confidence_level | Between 0 and 1. By default None. TYPE: Optional[Union[float, Iterable[float]]] |
| optimize_beta | Whether to optimize the PIs' width or not. By default False. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| TimeSeriesRegressor | The model itself. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the length of |
Source code in mapie/regression/time_series_regression.py
update
update(
X: ArrayLike,
y: ArrayLike,
ensemble: bool = False,
confidence_level: Optional[
Union[float, Iterable[float]]
] = None,
gamma: float = 0.0,
optimize_beta: bool = False,
) -> TimeSeriesRegressor
Update conformity scores
| PARAMETER | DESCRIPTION |
|---|---|
| X | Input data. TYPE: ArrayLike |
| y | Input labels. TYPE: ArrayLike |
| ensemble | Boolean determining whether the predictions are ensembled or not. By default False. TYPE: bool |
| confidence_level | (deprecated) Between 0 and 1. By default None. TYPE: Optional[Union[float, Iterable[float]]] |
| gamma | (deprecated) Coefficient that decides the correction of the conformal inference. If it equals 0, there are no corrections. By default 0.0. TYPE: float |
| optimize_beta | (deprecated) Whether to optimize the PIs' width or not. By default False. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| TimeSeriesRegressor | The model itself. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the length of |
Source code in mapie/regression/time_series_regression.py
predict
predict(
X: ArrayLike,
ensemble: bool = False,
confidence_level: Optional[
Union[float, Iterable[float]]
] = None,
optimize_beta: bool = False,
allow_infinite_bounds: bool = False,
**predict_params,
) -> Union[NDArray, Tuple[NDArray, NDArray]]
Predict target on new samples with confidence intervals.
| PARAMETER | DESCRIPTION |
|---|---|
| X | Test data. TYPE: ArrayLike |
| ensemble | Boolean determining whether the predictions are ensembled or not. By default False. TYPE: bool |
| confidence_level | Between 0 and 1. By default None. TYPE: Optional[Union[float, Iterable[float]]] |
| optimize_beta | Whether to optimize the PIs' width or not. By default False. TYPE: bool |
| allow_infinite_bounds | Allow infinite prediction intervals to be produced. TYPE: bool |
| predict_params | Additional predict parameters. |

| RETURNS | DESCRIPTION |
|---|---|
| Union[NDArray, Tuple[NDArray, NDArray]] | |