metric_learn.RCA

class metric_learn.RCA(n_components=None, preprocessor=None)
Relevant Components Analysis (RCA)
RCA learns a full-rank Mahalanobis distance metric based on a weighted sum of in-chunklet covariance matrices. It applies a global linear transformation that assigns large weights to relevant dimensions and low weights to irrelevant dimensions. The relevant dimensions are estimated using “chunklets”, subsets of points that are known to belong to the same class.
Read more in the User Guide.
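The idea described above can be sketched with plain NumPy. This is a minimal illustration and not the library's implementation: the helper name `rca_sketch`, the demo data, and the choice to treat negative chunk labels as "no chunklet" are assumptions made for the example. It pools the covariance of points around their own chunklet mean, then uses the inverse square root of that matrix as the linear transformation.

```python
import numpy as np

def rca_sketch(X, chunks):
    """Illustrative RCA: whiten data with the pooled in-chunklet covariance."""
    X = np.asarray(X, dtype=float)
    chunks = np.asarray(chunks)
    mask = chunks >= 0                  # assumption: negative label = no chunklet
    centered = X[mask].copy()
    labels = chunks[mask]
    for c in np.unique(labels):
        idx = labels == c
        centered[idx] -= centered[idx].mean(axis=0)   # center each chunklet
    cov = centered.T @ centered / centered.shape[0]   # pooled in-chunklet covariance
    # Inverse square root via eigendecomposition (cov is assumed full rank).
    vals, vecs = np.linalg.eigh(cov)
    L = (vecs / np.sqrt(vals)) @ vecs.T
    return L, cov

# Hypothetical demo data: three chunklets of three points, three unassigned points.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 2))
chunks_demo = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, -1, -1, -1])
L_demo, cov_demo = rca_sketch(X_demo, chunks_demo)
```

After this transformation the in-chunklet covariance becomes the identity, which is one way to read "low weights to irrelevant dimensions": directions with large within-chunklet variance are shrunk.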
Parameters:
n_components : int or None, optional (default=None)
    Dimensionality of reduced space (if None, defaults to dimension of X).
preprocessor : array-like, shape=(n_samples, n_features) or callable
    The preprocessor to call to get tuples from indices. If array-like, tuples will be formed like this: X[indices].
References
[1] Noam Shental, et al. Adjustment learning and relevant component analysis. ECCV 2002.
Examples
>>> from metric_learn import RCA
>>> X = [[0.05, 3.0],[0.05, 3.0],
...      [0.1, 3.55],[0.1, 3.55],
...      [0.95, 0.05],[0.95, 0.05],
...      [0.4, 0.05],[0.4, 0.05]]
>>> chunks = [0, 0, 1, 1, 2, 2, 3, 3]
>>> rca = RCA()
>>> rca.fit(X, chunks)
Attributes:
components_ : numpy.ndarray, shape=(n_components, n_features)
    The learned linear transformation L.
Methods
fit(X, chunks)              Learn the RCA model.
fit_transform(X[, y])       Fit to data, then transform it.
get_mahalanobis_matrix()    Returns a copy of the Mahalanobis matrix learned by the metric learner.
get_metric()                Returns a function that takes as input two 1D arrays and outputs the learned metric score on these two points.
get_params([deep])          Get parameters for this estimator.
score_pairs(pairs)          Returns the learned Mahalanobis distance between pairs.
set_params(**params)        Set the parameters of this estimator.
transform(X)                Embeds data points in the learned linear embedding space.
__init__(n_components=None, preprocessor=None)
Initialize self. See help(type(self)) for accurate signature.

fit(X, chunks)
Learn the RCA model.
Parameters:
X : (n x d) data matrix
    Each row corresponds to a single instance.
chunks : (n,) array of ints
    When chunks[i] == -1, point i doesn’t belong to any chunklet. When chunks[i] == j, point i belongs to chunklet j.

fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
y : ndarray of shape (n_samples,), default=None
    Target values.
**fit_params : dict
    Additional fit parameters.
Returns:
X_new : ndarray of shape (n_samples, n_features_new)
    Transformed array.

get_mahalanobis_matrix()
Returns a copy of the Mahalanobis matrix learned by the metric learner.
Returns:
M : numpy.ndarray, shape=(n_features, n_features)
    The copy of the learned Mahalanobis matrix.

get_metric()
Returns a function that takes as input two 1D arrays and outputs the learned metric score on these two points.
This function will be independent from the metric learner that learned it (it will not be modified if the initial metric learner is modified), and it can be directly plugged into the metric argument of scikit-learn’s estimators.
Returns:
metric_fun : function
    The function described above.
See also
score_pairs
    a method that returns the metric score between several pairs of points. Unlike get_metric, this is a method of the metric learner and therefore can change if the metric learner changes. Besides, it can use the metric learner’s preprocessor, and works on concatenated arrays.
Examples
>>> from metric_learn import NCA
>>> from sklearn.datasets import make_classification
>>> from sklearn.neighbors import KNeighborsClassifier
>>> nca = NCA()
>>> X, y = make_classification()
>>> nca.fit(X, y)
>>> knn = KNeighborsClassifier(metric=nca.get_metric())
>>> knn.fit(X, y)
KNeighborsClassifier(algorithm='auto', leaf_size=30,
  metric=<function MahalanobisMixin.get_metric.<locals>.metric_fun at 0x...>,
  metric_params=None, n_jobs=None, n_neighbors=5, p=2,
  weights='uniform')

get_params(deep=True)
Get parameters for this estimator.
Parameters:
deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:
params : mapping of string to any
    Parameter names mapped to their values.

score_pairs(pairs)
Returns the learned Mahalanobis distance between pairs.
This distance is defined as: \(d_M(x, x') = \sqrt{(x - x')^T M (x - x')}\) where M is the learned Mahalanobis matrix, for every pair of points x and x'. This corresponds to the Euclidean distance between embeddings of the points in a new space, obtained through a linear transformation. Indeed, we have also: \(d_M(x, x') = \sqrt{(x_e - x_e')^T (x_e - x_e')}\), with \(x_e = L x\) (See MahalanobisMixin).
Parameters:
pairs : array-like, shape=(n_pairs, 2, n_features) or (n_pairs, 2)
    3D array of pairs to score, with each row corresponding to two points, or 2D array of indices of pairs if the metric learner uses a preprocessor.
Returns:
scores : numpy.ndarray of shape=(n_pairs,)
    The learned Mahalanobis distance for every pair.
See also
get_metric
    a method that returns a function to compute the metric between two points. The difference with score_pairs is that it works on two 1D arrays and cannot use a preprocessor. Besides, the returned function is independent of the metric learner and hence is not modified if the metric learner is.
Mahalanobis Distances
    The section of the project documentation that describes Mahalanobis Distances.
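The equivalence between the two distance formulas can be checked numerically. The sketch below uses only NumPy and does not call metric_learn. The matrix `L` is a random stand-in for a learned transformation, and `M = L.T @ L` is the Mahalanobis matrix implied by \(x_e = L x\); both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned transformation, shape (n_components, n_features).
L = rng.normal(size=(2, 3))
M = L.T @ L                       # Mahalanobis matrix implied by x_e = L x

# Five pairs of 3-dimensional points, shape (n_pairs, 2, n_features).
pairs = rng.normal(size=(5, 2, 3))
diff = pairs[:, 0] - pairs[:, 1]  # x - x' for every pair

# d_M(x, x') = sqrt((x - x')^T M (x - x')), computed pair by pair.
d_mahalanobis = np.sqrt(np.einsum("ij,jk,ik->i", diff, M, diff))

# Euclidean distance between the embedded points L x and L x'.
d_embedded = np.linalg.norm(diff @ L.T, axis=1)
```

Both arrays hold one distance per pair and agree up to floating-point error.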

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Parameters:
**params : dict
    Estimator parameters.
Returns:
self : object
    Estimator instance.

transform(X)
Embeds data points in the learned linear embedding space.
Transforms samples in X into X_embedded, samples inside a new embedding space such that: X_embedded = X.dot(L.T), where L is the learned linear transformation (See MahalanobisMixin).
Parameters:
X : numpy.ndarray, shape=(n_samples, n_features)
    The data points to embed.
Returns:
X_embedded : numpy.ndarray, shape=(n_samples, n_components)
    The embedded data points.
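As a small worked illustration of the formula X_embedded = X.dot(L.T), the NumPy sketch below applies a hand-written matrix `L` to two samples. The values of `L` and `X` are made up for the example and are not produced by fitting RCA.

```python
import numpy as np

# Hypothetical transformation with n_components=2 and n_features=3:
# keep the first feature, double the second, drop the third.
L = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])

# Two samples with three features each, shape (n_samples, n_features).
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Each row of X is mapped to L @ x, giving shape (n_samples, n_components).
X_embedded = X.dot(L.T)
print(X_embedded)   # [[ 1.  4.]
                    #  [ 4. 10.]]
```

The output has one row per sample and one column per component, matching the shapes listed for transform.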