forestci.random_forest_error

random_forest_error(forest, X_train, X_test, inbag=None, calibrate=True, memory_constrained=False, memory_limit=None)

Calculate error bars from scikit-learn RandomForest estimators.

RandomForest is a regressor or classifier object. The resulting variance can be used to plot error bars for RandomForest objects.

Parameters
forest: RandomForest

Regressor or Classifier object.

X_train: ndarray

An array with shape (n_train_sample, n_features). The design matrix for training data.

X_test: ndarray

An array with shape (n_test_sample, n_features). The design matrix for testing data.

inbag: ndarray, optional

The inbag matrix that fit the data. If set to None (default), it will be inferred from the forest. However, this only works for forests for which bootstrapping was set to True, that is, if sampling was done with replacement. Otherwise, users need to provide their own inbag matrix.

calibrate: boolean, optional

Whether to apply calibration to mitigate Monte Carlo noise. Some variance estimates may be negative due to Monte Carlo effects if the number of trees in the forest is too small. Default: True.

memory_constrained: boolean, optional

Whether or not there is a restriction on memory. If False, it is assumed that an ndarray of shape (n_train_sample, n_test_sample) fits in main memory. Setting to True can actually provide a speed-up if memory_limit is tuned to the optimal range.

memory_limit: int, optional

An upper bound, in megabytes, for how much memory the intermediate matrices will take up. This must be provided if memory_constrained=True.
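To get a feel for when memory_constrained might matter, the sketch below estimates the size of a dense array of shape (n_train_sample, n_test_sample) and how many test columns would fit under a given limit. The helper names `intermediate_matrix_mb` and `columns_per_chunk` are hypothetical, float64 storage is an assumption, and this is only back-of-the-envelope arithmetic, not a description of how forestci chunks its work internally.

```python
import numpy as np


def intermediate_matrix_mb(n_train, n_test, dtype=np.float64):
    """Size in megabytes of one dense (n_train, n_test) array of `dtype`."""
    bytes_needed = n_train * n_test * np.dtype(dtype).itemsize
    return bytes_needed / 2**20


def columns_per_chunk(n_train, memory_limit_mb, dtype=np.float64):
    """Largest number of test columns whose (n_train, chunk) block fits the limit."""
    bytes_per_column = n_train * np.dtype(dtype).itemsize
    return max(1, int(memory_limit_mb * 2**20 // bytes_per_column))


# Hypothetical problem size: 100,000 training rows and 20,000 test rows.
full_mb = intermediate_matrix_mb(n_train=100_000, n_test=20_000)
chunk = columns_per_chunk(n_train=100_000, memory_limit_mb=512)

print(f"full matrix: {full_mb:.1f} MB")
print(f"test columns per 512 MB chunk: {chunk}")
```

If the full matrix is far larger than the available memory, setting memory_constrained=True with a memory_limit close to what the machine can spare is the natural thing to try.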

Returns
An array with the unbiased sampling variance (V_IJ_unbiased) for a RandomForest object.
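Since the returned values are variances, one plausible way to turn them into error bars is to take the square root. The sketch below uses made-up numbers in place of real output. The arrays `variance` and `y_pred` are hypothetical, and clipping negative values at zero is an assumption of this sketch, not something the function does for you.

```python
import numpy as np

# Hypothetical output of random_forest_error: one variance per test sample.
variance = np.array([0.25, 1.0, -0.01, 4.0])

# Monte Carlo noise can push an estimate slightly below zero, so clip at
# zero before taking the square root (an assumption of this sketch).
std_error = np.sqrt(np.clip(variance, 0.0, None))

# Hypothetical forest predictions for the same test samples.
y_pred = np.array([3.0, 5.0, 2.0, 7.0])
lower = y_pred - std_error
upper = y_pred + std_error

print(std_error)
print(lower, upper)
```

The `std_error` array can then be passed as the `yerr` argument of matplotlib's `errorbar` to draw one-standard-error bars around each prediction.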

See also

calc_inbag()
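When the forest was not fit with bootstrapping, the inbag matrix has to be supplied by hand. As a rough illustration of what such a matrix contains, the sketch below builds one with NumPy, assuming a layout of shape (n_train_sample, n_trees) in which each entry counts how often a training sample was drawn for a tree. The random draws here are purely illustrative. A real inbag matrix must record the samples each tree was actually trained on.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_trees = 6, 4

# One column per tree: how many times each training sample was drawn.
inbag = np.zeros((n_train, n_trees), dtype=int)
for b in range(n_trees):
    # Bootstrap: draw n_train indices with replacement.
    drawn = rng.integers(0, n_train, size=n_train)
    inbag[:, b] = np.bincount(drawn, minlength=n_train)

print(inbag)
print(inbag.sum(axis=0))  # each column sums to n_train
```

A count of zero means the sample was out-of-bag for that tree, and a count above one means it was drawn more than once.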

Notes

The calculation of error is based on the infinitesimal jackknife variance, as described in [Wager2014], and is a Python implementation of the R code provided at: https://github.com/swager/randomForestCI

Wager2014

S. Wager, T. Hastie, B. Efron. “Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife”, Journal of Machine Learning Research vol. 15, pp. 1625-1651, 2014.
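To make the note above more concrete, here is a minimal NumPy sketch of the bias-corrected infinitesimal jackknife estimate as given in [Wager2014]: for each test point, the squared covariances between the in-bag counts and the per-tree predictions are summed over training samples, then a Monte Carlo bias term is subtracted. The function name `ij_variance` and the array layouts are assumptions made for illustration. The sketch assumes each bootstrap sample has size n_train, omits calibration, and is not forestci's actual code.

```python
import numpy as np


def ij_variance(inbag, tree_preds):
    """Bias-corrected infinitesimal jackknife variance (illustrative sketch).

    inbag:      (n_train, n_trees) in-bag counts N_bi
    tree_preds: (n_test, n_trees) per-tree predictions t_b(x)
    """
    n_train, n_trees = inbag.shape
    centered = tree_preds - tree_preds.mean(axis=1, keepdims=True)
    # Cov_i(x) = (1/B) * sum_b (N_bi - 1) * (t_b(x) - mean_t(x)); shape (n_train, n_test)
    cov = (inbag - 1) @ centered.T / n_trees
    v_ij = np.sum(cov**2, axis=0)
    # Monte Carlo bias correction: (n / B^2) * sum_b (t_b(x) - mean_t(x))^2
    correction = n_train * np.sum(centered**2, axis=1) / n_trees**2
    return v_ij - correction


# Synthetic inputs, for illustration only.
rng = np.random.default_rng(0)
n_train, n_test, n_trees = 50, 3, 200
inbag = rng.multinomial(n_train, np.full(n_train, 1 / n_train), size=n_trees).T
tree_preds = rng.normal(size=(n_test, n_trees))

print(ij_variance(inbag, tree_preds))
```

Because the correction term is subtracted, the result can be negative when the number of trees is small relative to the number of training samples, which is the situation the calibrate option is meant to mitigate.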

Examples using forestci.random_forest_error