imbalanced-learn API

This is the full API documentation of the imbalanced-learn toolbox.

imblearn.under_sampling: Under-sampling methods

The imblearn.under_sampling module provides methods to under-sample a dataset.

Prototype generation

The imblearn.under_sampling.prototype_generation submodule contains methods that generate new samples in order to balance the dataset.

under_sampling.ClusterCentroids([ratio, …]) Perform under-sampling by generating centroids based on clustering methods.

Prototype selection

The imblearn.under_sampling.prototype_selection submodule contains methods that select samples in order to balance the dataset.

under_sampling.CondensedNearestNeighbour([…]) Class to perform under-sampling based on the condensed nearest neighbour method.
under_sampling.EditedNearestNeighbours([…]) Class to perform under-sampling based on the edited nearest neighbour method.
under_sampling.RepeatedEditedNearestNeighbours([…]) Class to perform under-sampling based on the repeated edited nearest neighbour method.
under_sampling.AllKNN([ratio, …]) Class to perform under-sampling based on the AllKNN method.
under_sampling.InstanceHardnessThreshold([…]) Class to perform under-sampling based on the instance hardness threshold.
under_sampling.NearMiss([ratio, …]) Class to perform under-sampling based on NearMiss methods.
under_sampling.NeighbourhoodCleaningRule([…]) Class performing under-sampling based on the neighbourhood cleaning rule.
under_sampling.OneSidedSelection([ratio, …]) Class to perform under-sampling based on the one-sided selection method.
under_sampling.RandomUnderSampler([ratio, …]) Class to perform random under-sampling.
under_sampling.TomekLinks([ratio, …]) Class to perform under-sampling by removing Tomek’s links.
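
To make the prototype-selection idea concrete, here is a minimal pure-Python sketch of Tomek link removal. It is an illustration only, not the library's implementation: the helper names `nearest_neighbour` and `tomek_links` are hypothetical, distances are Euclidean, and the sketch assumes one common variant in which only the majority-class member of each link is dropped. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour.

```python
import math


def nearest_neighbour(i, X):
    """Index of the sample closest to X[i], excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def tomek_links(X, y):
    """Return pairs (i, j), i < j, that are mutual nearest neighbours
    and carry different class labels."""
    links = []
    for i in range(len(X)):
        j = nearest_neighbour(i, X)
        if i < j and y[i] != y[j] and nearest_neighbour(j, X) == i:
            links.append((i, j))
    return links


# Toy one-dimensional data: class 0 is the majority class.
X = [[0.0], [1.0], [2.0], [3.9], [4.0], [6.0]]
y = [0, 0, 0, 0, 1, 1]

links = tomek_links(X, y)

# Assumed variant: drop only the majority-class member of each link.
majority = max(set(y), key=y.count)
to_drop = {k for pair in links for k in pair if y[k] == majority}
X_res = [x for k, x in enumerate(X) if k not in to_drop]
y_res = [label for k, label in enumerate(y) if k not in to_drop]
```

In this toy data the samples at 3.9 and 4.0 form the only link, so the sketch removes the class-0 sample at 3.9 and leaves the rest untouched.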

imblearn.over_sampling: Over-sampling methods

The imblearn.over_sampling module provides a set of methods to perform over-sampling.

over_sampling.ADASYN([ratio, random_state, …]) Perform over-sampling using ADASYN.
over_sampling.RandomOverSampler([ratio, …]) Class to perform random over-sampling.
over_sampling.SMOTE([ratio, random_state, …]) Class to perform over-sampling using SMOTE.
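
As a rough illustration of how SMOTE-style over-sampling creates new samples, the sketch below interpolates between a minority sample and one of its nearest minority neighbours. It is a simplified stand-in under stated assumptions, not the library's code: the function name `smote_like_samples` and its parameters `n_new`, `k`, and `seed` are hypothetical, and the neighbour search is a brute-force Euclidean scan.

```python
import math
import random


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Brute-force search for the k nearest minority neighbours of base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Four minority samples on the corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_samples(minority, n_new=5)
```

Every synthetic point lies on a segment joining two existing minority samples, which is why the new points in this example stay inside the unit square.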

imblearn.combine: Combination of over- and under-sampling methods

The imblearn.combine module provides methods that combine over-sampling and under-sampling.

combine.SMOTEENN([ratio, random_state, …]) Class to perform over-sampling using SMOTE and cleaning using ENN.
combine.SMOTETomek([ratio, random_state, …]) Class to perform over-sampling using SMOTE and cleaning using Tomek links.

imblearn.ensemble: Ensemble methods

The imblearn.ensemble module includes methods that generate under-sampled subsets and combine them inside an ensemble.

ensemble.BalanceCascade([ratio, …]) Create an ensemble of balanced sets by iteratively under-sampling the imbalanced dataset using an estimator.
ensemble.BalancedBaggingClassifier([…]) A Bagging classifier with additional balancing.
ensemble.EasyEnsemble([ratio, …]) Create an ensemble of sets by iteratively applying random under-sampling.
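
The idea of building an ensemble of sets by repeated random under-sampling can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not the library's implementation: the function `easy_ensemble_subsets` and its parameters `n_subsets` and `seed` are hypothetical, and each subset is balanced by sampling every class down to the size of the rarest class.

```python
import random
from collections import Counter


def easy_ensemble_subsets(y, n_subsets=3, seed=0):
    """Return n_subsets index lists, each a balanced random under-sample."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())  # size of the rarest class
    subsets = []
    for _ in range(n_subsets):
        indices = []
        for label in counts:
            members = [i for i, lab in enumerate(y) if lab == label]
            # Draw n_min samples without replacement from this class.
            indices.extend(rng.sample(members, n_min))
        subsets.append(sorted(indices))
    return subsets


y = ["maj"] * 8 + ["min"] * 2
subsets = easy_ensemble_subsets(y)
for subset in subsets:
    print(subset, Counter(y[i] for i in subset))
```

Each subset keeps all minority samples and a different random draw of majority samples, so a separate estimator could be fitted on each one and their predictions combined.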

imblearn.pipeline: Pipeline

The imblearn.pipeline module implements utilities to build a composite estimator as a chain of transforms, samplers and estimators.

pipeline.Pipeline(steps[, memory]) Pipeline of transforms and resamples with a final estimator.
pipeline.make_pipeline(*steps) Construct a Pipeline from the given estimators.

imblearn.metrics: Metrics

The imblearn.metrics module includes score functions, performance metrics, pairwise metrics and distance computations.

metrics.classification_report_imbalanced(…) Build a classification report based on metrics used with imbalanced datasets
metrics.sensitivity_specificity_support(…) Compute sensitivity, specificity, and support for each class
metrics.sensitivity_score(y_true, y_pred[, …]) Compute the sensitivity
metrics.specificity_score(y_true, y_pred[, …]) Compute the specificity
metrics.geometric_mean_score(y_true, y_pred) Compute the geometric mean
metrics.make_index_balanced_accuracy([…]) Balance any scoring function using the index balanced accuracy
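
For reference, the quantities behind these scores can be computed by hand in the binary case. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), geometric mean = square root of their product). The helper `sensitivity_specificity` is a hypothetical name for illustration and not the library's function; it assumes both classes appear in `y_true`.

```python
import math


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Binary sensitivity and specificity from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
g_mean = math.sqrt(sens * spec)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} g-mean={g_mean:.3f}")
# prints: sensitivity=0.750 specificity=0.667 g-mean=0.707
```

Unlike plain accuracy, the geometric mean drops towards zero when either class is poorly recognised, which makes it a useful summary on imbalanced data.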

imblearn.datasets: Datasets

The imblearn.datasets module provides methods to generate imbalanced data.

datasets.make_imbalance(X, y, ratio[, …]) Turn a dataset into an imbalanced dataset at a specific ratio.
datasets.fetch_datasets([data_home, …]) Load the benchmark datasets from Zenodo, downloading them if necessary.

imblearn.utils: Utilities

The imblearn.utils module includes various utilities.

utils.check_neighbors_object(nn_name, nn_object) Check that the object is consistent to be a NN.
utils.check_ratio(ratio, y, sampling_type, …) Ratio validation for samplers.
utils.hash_X_y(X, y[, n_samples]) Compute hash of the input arrays.