AMICA

Note

The AMICA class is Scikit-Learn compatible, so most methods available to sklearn.decomposition.FastICA should also be available to amica.AMICA.

class amica.AMICA(n_components=None, *, n_mixtures=3, n_models=1, mean_center=True, whiten='zca', max_iter=500, tol=1e-07, lrate=0.05, pdftype=0, do_newton=True, newt_start=50, w_init=None, sbeta_init=None, mu_init=None, random_state=None)

Bases: TransformerMixin, BaseEstimator

AMICA: Adaptive Mixture algorithm for Independent Component Analysis.

Parameters:
n_componentsint, default=None

Number of components to use. If None, then n_components == n_features.

Note

If the data are rank deficient, then the effective number of components will be lower than n_components, as the number of components will be set to the data rank.
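As a rough illustration of the note above, the sketch below estimates the rank of a small synthetic dataset with NumPy and caps the requested number of components at that rank. The data, the variable names, and the `min(...)` rule are assumptions made for illustration; they are not taken from the amica source code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 1000 samples, 5 features. The last feature is an exact
# linear combination of the first two, so the data rank is 4 rather than 5.
X = rng.standard_normal((1000, 4))
X = np.column_stack([X, X[:, 0] + X[:, 1]])

n_features = X.shape[1]
data_rank = np.linalg.matrix_rank(X - X.mean(axis=0))

# Assumed rule, following the note above: the effective number of
# components cannot exceed the data rank.
requested_components = n_features
effective_components = min(requested_components, data_rank)
print(n_features, data_rank, effective_components)
```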

n_mixturesint, default=3

Number of mixture components to use in the Gaussian Mixture Model (GMM) for each component’s source density.

n_modelsint, default=1

Number of ICA decompositions to run. Only 1 is supported currently.

batch_sizeint, optional

Batch size for processing data in chunks along the samples axis. If None, the batch size is chosen automatically to keep peak memory under ~1.5 GB, and a warning is issued if the chosen batch size falls below ~8k samples. If the input data is small enough to process in one pass, no batching is used. To disable batching even when the data is very large, set batch_size to X.shape[0] so that all samples are processed at once; note that this may lead to high memory usage for large datasets.
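To make "chunks along the samples axis" concrete, here is a minimal NumPy sketch that accumulates a per-feature mean batch by batch and arrives at the same result as a single pass. The batch size, the data, and the accumulated statistic are hypothetical; this does not reproduce how AMICA selects or uses batches internally.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 8))  # hypothetical (n_samples, n_features)
batch_size = 4_096                    # hypothetical value

n_samples = X.shape[0]
total = np.zeros(X.shape[1])
n_batches = 0

# Walk over the samples axis in contiguous chunks; the last chunk may be
# shorter than batch_size.
for start in range(0, n_samples, batch_size):
    batch = X[start:start + batch_size]
    total += batch.sum(axis=0)
    n_batches += 1

batched_mean = total / n_samples
```

Only one batch is held in working memory at a time, which is the usual reason for trading a single large array operation for a loop.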

mean_centerbool, default=True

If True, the data is mean-centered before whitening and fitting. This is equivalent to do_mean=1 in the Fortran AMICA program.

whitenstr {"zca", "pca", "variance"}, default="zca"

Whitening strategy.

  • "zca": Data is whitened with the inverse of the symmetric square root of the covariance matrix. If n_components < n_features, then approximate sphering is done by multiplying by the eigenvectors of a reduced-dimensional subset of the principal component (eigenvector) subspace. This is equivalent to do_sphere=1 + do_approx_sphere=1 in the Fortran AMICA program. In EEGLAB’s AMICA GUI, this is called “Symmetric sphering”.

  • "pca": Data is whitened using only eigenvector projection and scaling to do sphering (not symmetric or approximately symmetric). This is equivalent to do_sphere=1 + do_approx_sphere=0 in the Fortran AMICA program. In EEGLAB’s AMICA GUI, this is called “Principle Components (Eigenvectors)”.

  • "variance": Diagonal normalization. Each feature is scaled by the variance across features. This is equivalent to do_sphere=0 in the Fortran AMICA program. In EEGLAB’s AMICA GUI, this is called “No sphering transformation”.

max_iterint, default=500

Maximum number of iterations during fit.

tolfloat, default=1e-7

Tolerance for the stopping criteria. A positive scalar giving the tolerance at which the un-mixing matrix is considered to have converged. The Fortran AMICA program has separate tunable tolerance parameters for two different stopping criteria, min_dll and min_grad_norm. Only one parameter is exposed here, and it is applied to both criteria.

lratefloat, default=0.05

Initial learning rate for the optimization algorithm. The Fortran AMICA program exposed 2 tunable learning rate parameters, lrate and rholrate, but we expose only one for simplicity, which is applied to both.

pdftypeint, default=0

Type of source density model to use. Currently only 0 is supported, which corresponds to the Gaussian Mixture Model (GMM) density.

do_newtonbool, default=True

If True, the optimization method will switch from Stochastic Gradient Descent (SGD) to Newton updates after newt_start iterations. If False, only SGD updates are used.

newt_startint, default=50

Number of iterations before switching to Newton updates if do_newton is True.

w_initndarray of shape (n_components, n_components), default=None

Initial un-mixing array. If None, then an array of values drawn from a normal distribution is used.

sbeta_initndarray of shape (n_components, n_mixtures), default=None

Initial scale parameters for the mixture components. If None, then an array of values drawn from a uniform distribution is used.

mu_initndarray of shape (n_components, n_mixtures), default=None

Initial location parameters for the mixture components. If None, then an array of values drawn from a normal distribution is used.

random_stateint or None, default=None

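Seed used to initialize w_init from a normal distribution when w_init is not specified. Pass an int for reproducible results across multiple function calls. Note that, unlike scikit-learn’s FastICA, which also accepts a RandomState instance, only an int or None is accepted here; a generator object such as the one returned by numpy.random.default_rng() cannot be passed.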

Attributes:
components_ndarray of shape (n_components, n_features)

The linear operator to apply to the data to get the independent sources. This is equal to np.matmul(unmixing_matrix, self.whitening_) when whiten is "zca" or "pca".

mixing_ndarray of shape (n_features, n_components)

The pseudo-inverse of components_. It is the linear operator that maps independent sources to the data (in feature space).
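The relationship between these two attributes can be sketched with NumPy. The arrays below are random stand-ins for a fitted model's internals (the names `whitening` and `unmixing_matrix` simply mirror the descriptions above), so this shows only the shapes and the pseudo-inverse relationship, not values a real fit would produce.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_components = 6, 3

# Hypothetical stand-ins for the whitening and un-mixing matrices.
whitening = rng.standard_normal((n_components, n_features))
unmixing_matrix = rng.standard_normal((n_components, n_components))

# Linear operator from feature space to source space.
components = np.matmul(unmixing_matrix, whitening)  # (n_components, n_features)

# Linear operator from source space back to feature space.
mixing = np.linalg.pinv(components)                 # (n_features, n_components)

# With full row rank, the pseudo-inverse is a right inverse of components.
identity_check = components @ mixing
```

When n_components < n_features, `mixing @ components` is a projection rather than the identity, so mapping data to sources and back discards whatever lies outside the retained subspace.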

mean_ndarray of shape (n_features,)

The mean of each feature, computed over samples. Only set if self.whiten is True.

whitening_ndarray of shape (n_components, n_features)

Only set if whiten is True. This is the pre-whitening matrix that projects data onto the first n_components principal components.

n_features_in_int

Number of features seen during fit().

n_iter_int

Number of iterations taken to converge during fit.

Methods

fit(X[, y, verbose])

Fit the AMICA model to the data X.

fit_transform(X[, y])

Fit the model to the data and transform it.

inverse_transform(X)

Reconstruct data from its independent components.

transform(X[, copy])

Recover the sources from X (apply the unmixing matrix).

Examples

>>> from sklearn.datasets import load_digits
>>> from amica import AMICA
>>> X, _ = load_digits(return_X_y=True)
>>> transformer = AMICA(n_components=7, random_state=0)
>>> X_transformed = transformer.fit_transform(X)
>>> X_transformed.shape
(1797, 7)

fit(X, y=None, verbose=None)

Fit the AMICA model to the data X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

yIgnored

Not used, present here for API consistency by convention.

verbosebool or str or int or None, default=None

Control verbosity of the logging output. If a str, it can be either "DEBUG", "INFO", "WARNING", "ERROR", or "CRITICAL". Note that these are for convenience and are equivalent to passing in logging.DEBUG, etc. For bool, True is the same as "INFO", False is the same as "WARNING". If None, defaults to "INFO".
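The mapping described above can be written as a small helper using the standard logging module. The function name `resolve_verbose` is hypothetical and is not part of the amica API; it only restates the documented rules in code.

```python
import logging

# Level names accepted as strings, per the description above.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_verbose(verbose=None):
    """Hypothetical helper: map a `verbose` argument to a logging level."""
    if verbose is None:
        return logging.INFO
    # Check bool before int, because bool is a subclass of int in Python.
    if isinstance(verbose, bool):
        return logging.INFO if verbose else logging.WARNING
    if isinstance(verbose, str):
        try:
            return _LEVELS[verbose.upper()]
        except KeyError:
            raise ValueError(f"Unknown verbosity level: {verbose!r}") from None
    return int(verbose)


print(resolve_verbose("DEBUG"), resolve_verbose(True), resolve_verbose(False))
```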

Returns:
selfobject

Fitted estimator.

fit_transform(X, y=None)

Fit the model to the data and transform it.

Parameters:
Xarray-like of shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

yIgnored

Not used, present here for API consistency by convention.

Returns:
X_newndarray of shape (n_samples, n_components)

Estimated sources obtained by transforming the data with the estimated unmixing matrix.

inverse_transform(X)

Reconstruct data from its independent components.

Parameters:
Xarray-like of shape (n_samples, n_components)

Independent components to invert.

Returns:
X_reconstructedndarray of shape (n_samples, n_features)

Reconstructed data.

Examples

>>> from sklearn.datasets import load_digits
>>> from amica import AMICA
>>> X, _ = load_digits(return_X_y=True)
>>> transformer = AMICA(n_components=7, random_state=0)
>>> X_transformed = transformer.fit_transform(X)
>>> X_reconstructed = transformer.inverse_transform(X_transformed)
>>> X_reconstructed.shape
(1797, 64)

transform(X, copy=True)

Recover the sources from X (apply the unmixing matrix).

Parameters:
Xarray-like of shape (n_samples, n_features)

Data to transform, where n_samples is the number of samples and n_features is the number of features.

copybool, default=True

If False, data passed to fit can be overwritten. Defaults to True.

Returns:
X_newndarray of shape (n_samples, n_components)

Estimated sources obtained by transforming the data with the estimated unmixing matrix.
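The sketch below shows how transform and inverse_transform relate in terms of the fitted attributes. It assumes the same convention as scikit-learn's FastICA: subtract mean_, multiply by the transpose of components_, and reverse the process with mixing_. Whether AMICA follows exactly this convention is an assumption, and the arrays are random stand-ins rather than the output of a real fit.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))  # hypothetical (n_samples, n_features)

# Hypothetical stand-ins for fitted attributes, with
# n_components == n_features so that the round trip is exact.
mean = X.mean(axis=0)
components = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
mixing = np.linalg.pinv(components)

# Assumed transform: center, then apply the unmixing operator.
sources = (X - mean) @ components.T

# Assumed inverse_transform: apply the mixing operator, then restore the mean.
X_reconstructed = sources @ mixing.T + mean
```

With fewer components than features, the same two lines still run, but the reconstruction is only an approximation of the original data.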