polylearn.FactorizationMachineClassifier

class polylearn.FactorizationMachineClassifier(degree=2, loss='squared_hinge', n_components=2, alpha=1, beta=1, tol=1e-06, fit_lower='explicit', fit_linear=True, warm_start=False, init_lambdas='ones', max_iter=10000, verbose=False, random_state=None)

Factorization machine for classification.

Parameters:

degree : int >= 2, default: 2

Degree of the polynomial. Corresponds to the order of feature interactions captured by the model. Currently only supports degrees up to 3.

loss : {'logistic'|'squared_hinge'|'squared'}, default: 'squared_hinge'

Which loss function to use.

  • logistic: L(y, p) = log(1 + exp(-yp))
  • squared_hinge: L(y, p) = max(1 - yp, 0)²
  • squared: L(y, p) = 0.5 * (y - p)²
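
For reference, these losses are straightforward to reproduce in NumPy (a standalone sketch, not polylearn's internal implementation), where y is a ±1 label and p is the decision score:

>>> import numpy as np
>>> y, p = 1.0, 0.5
>>> logistic = np.log(1 + np.exp(-y * p))       # ~0.4741
>>> sq_hinge = np.maximum(1 - y * p, 0) ** 2    # 0.25
>>> squared = 0.5 * (y - p) ** 2                # 0.125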

n_components : int, default: 2

Number of basis vectors to learn, a.k.a. the dimension of the low-rank parametrization.

alpha : float, default: 1

Regularization amount for linear term (if fit_linear=True).

beta : float, default: 1

Regularization amount for higher-order weights.

tol : float, default: 1e-6

Tolerance for the stopping condition.

fit_lower : {'explicit'|'augment'|None}, default: 'explicit'

Whether and how to fit lower-order, non-homogeneous terms.

  • 'explicit': fits a separate P directly for each lower order.
  • 'augment': adds the required number of dummy columns (columns that are 1 everywhere) in order to capture lower-order terms; see the sketch after this list. Adds degree - 2 columns if fit_linear is True, or degree - 1 columns otherwise, to account for the linear term.
  • None: only learns weights for the given degree. If degree == 3, for example, the model will only have weights for third-order feature interactions.
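
Conceptually, 'augment' amounts to training on an input matrix with extra all-ones columns. A minimal NumPy sketch of the augmentation for degree=3 with fit_linear=True (this mirrors the idea, not polylearn's internal code):

>>> import numpy as np
>>> X = np.array([[1., 2.],
...               [3., 4.]])
>>> n_dummy = 3 - 2   # degree - 2, since fit_linear=True handles the linear term
>>> X_aug = np.hstack([X, np.ones((X.shape[0], n_dummy))])
>>> X_aug
array([[1., 2., 1.],
       [3., 4., 1.]])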

fit_linear : {True|False}, default: True

Whether to fit an explicit linear term <w, x> to the model, using coordinate descent. If False, the model can still capture linear effects if fit_lower == 'augment'.

warm_start : boolean, optional, default: False

Whether to use the existing solution, if available. Useful for computing regularization paths or pre-initializing the model.
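
For example, a regularization path can reuse each solution as the starting point for the next fit (a sketch assuming X_train and y_train are already defined):

>>> from polylearn import FactorizationMachineClassifier
>>> clf = FactorizationMachineClassifier(warm_start=True)
>>> for beta in [10., 1., 0.1]:                 # doctest: +SKIP
...     _ = clf.set_params(beta=beta)
...     _ = clf.fit(X_train, y_train)           # warm-started from the previous solution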

init_lambdas : {'ones'|'random_signs'}, default: 'ones'

How to initialize the predictive weights of each learned basis. The lambdas are not trained; using alternate signs can theoretically improve performance if the kernel degree is even. The default value of 'ones' matches the original formulation of factorization machines (Rendle, 2010).

To use custom values for the lambdas, warm_start may be used, as sketched below.
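
One possible recipe (an assumption based on the lams_ attribute documented below, not an officially documented API): fit once, overwrite lams_, then refit with warm_start=True so the custom values are kept:

>>> import numpy as np
>>> clf = FactorizationMachineClassifier(n_components=2, warm_start=True)
>>> _ = clf.fit(X_train, y_train)      # doctest: +SKIP
>>> clf.lams_ = np.array([1., -1.])    # hypothetical: custom predictive weights
>>> _ = clf.fit(X_train, y_train)      # doctest: +SKIP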

max_iter : int, optional, default: 10000

Maximum number of passes over the dataset to perform.

verbose : boolean, optional, default: False

Whether to print debugging information.

random_state : int seed, RandomState instance, or None (default)

The seed of the pseudo random number generator to use for initializing the parameters.

Attributes:

self.P_ : array, shape [n_orders, n_components, n_features]

The learned basis functions.

self.P_[0, :, :] is always available, and corresponds to interactions of order self.degree.

self.P_[i, :, :] for i > 0 corresponds to interactions of order self.degree - i, and is only available if self.fit_lower == 'explicit'.
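
For instance, with degree=3 and fit_lower='explicit' (a sketch assuming a fitted model):

>>> clf = FactorizationMachineClassifier(degree=3, fit_lower='explicit')
>>> _ = clf.fit(X_train, y_train)   # doctest: +SKIP
>>> P3 = clf.P_[0]   # order-3 interactions, shape (n_components, n_features)
>>> P2 = clf.P_[1]   # order-2 interactions, present because fit_lower='explicit'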

self.w_ : array, shape [n_features]

The learned linear model, completing the FM.

Only present if self.fit_linear is True.

self.lams_ : array, shape [n_components]

The predictive weights.

References

Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms. Mathieu Blondel, Masakazu Ishihata, Akinori Fujino, Naonori Ueda. In: Proceedings of ICML 2016. http://mblondel.org/publications/mblondel-icml2016.pdf

Factorization Machines. Steffen Rendle. In: Proceedings of IEEE ICDM 2010.

Methods

decision_function(X) Compute the output of the factorization machine before thresholding.
fit(X, y) Fit factorization machine to training data.
get_params([deep]) Get parameters for this estimator.
predict(X) Predict using the factorization machine.
predict_proba(X) Compute probability estimates for the test samples.
score(X, y[, sample_weight]) Returns the mean accuracy on the given test data and labels.
set_params(**params) Set the parameters of this estimator.
__init__(degree=2, loss='squared_hinge', n_components=2, alpha=1, beta=1, tol=1e-06, fit_lower='explicit', fit_linear=True, warm_start=False, init_lambdas='ones', max_iter=10000, verbose=False, random_state=None)
decision_function(X)

Compute the output of the factorization machine before thresholding.

Parameters:

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Samples.

Returns:

y_scores : array, shape = [n_samples]

The decision scores for each sample, before thresholding.
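
predict() corresponds to thresholding these scores at zero (a sketch; the classifier also maps the signs back to the original class labels):

>>> import numpy as np
>>> scores = clf.decision_function(X_test)   # doctest: +SKIP
>>> signs = np.sign(scores)                  # same decision boundary as predict()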

fit(X, y)

Fit factorization machine to training data.

Parameters:

X : array-like or sparse, shape = [n_samples, n_features]

Training vectors, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples]

Target values.

Returns:

self : Estimator

Returns self.
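
A minimal end-to-end sketch (the toy data and hyperparameters are illustrative only):

>>> import numpy as np
>>> from polylearn import FactorizationMachineClassifier
>>> rng = np.random.RandomState(0)
>>> X = rng.randn(20, 5)
>>> y = np.sign(X[:, 0] * X[:, 1] + X[:, 2])   # labels driven by a feature interaction
>>> clf = FactorizationMachineClassifier(degree=2, n_components=2, random_state=0)
>>> _ = clf.fit(X, y)          # doctest: +SKIP
>>> clf.predict(X[:3])         # doctest: +SKIP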

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

predict(X)

Predict using the factorization machine.

Parameters:

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Samples.

Returns:

y_pred : array, shape = [n_samples]

Predicted class labels.

predict_proba(X)

Compute probability estimates for the test samples.

Only available if loss='logistic'.

Parameters:

X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Samples.

Returns:

y_scores : array, shape = [n_samples]

Probability estimates that the samples are from the positive class.
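
For example (a sketch assuming training and test data are already defined):

>>> clf = FactorizationMachineClassifier(loss='logistic')
>>> _ = clf.fit(X_train, y_train)      # doctest: +SKIP
>>> proba = clf.predict_proba(X_test)  # doctest: +SKIP
>>> # each entry of proba estimates P(y = +1) for the corresponding sample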

score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set for each sample be predicted correctly.

Parameters:

X : array-like, shape = (n_samples, n_features)

Test samples.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
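
For example, inside a pipeline (the step name 'fm' is illustrative):

>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from polylearn import FactorizationMachineClassifier
>>> pipe = Pipeline([('scale', StandardScaler()),
...                  ('fm', FactorizationMachineClassifier())])
>>> pipe = pipe.set_params(fm__alpha=0.1, fm__beta=0.5)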

Returns:

self : Estimator

Returns self.