manifold_compression

Bases: transformation

The manifold-based data compression function.

This class reduces the dimensionality of input features by applying a manifold learning technique such as Isomap, Locally Linear Embedding (LLE), or other manifold methods.

Notes

Formally, given an input data instance \(\mathbf{x} \in \mathbb{R}^m\), we can represent the manifold-based data compression function as follows:

\[
    \begin{equation}
    \kappa(\mathbf{x}) = \text{manifold}(\mathbf{x}) \in \mathbb{R}^d.
    \end{equation}
\]

The output dimension \(d\) is set manually through the required hyper-parameter \(D\).
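At the batch level, the shapes documented for `forward` and the bound checked by `calculate_D` can be summarized as follows. The symbols \(\mathbf{X}\) (the stacked batch) and \(b\) (the batch size) are notation introduced here for illustration and do not appear in the original formula.

```latex
\mathbf{X} \in \mathbb{R}^{b \times m}
\;\longmapsto\;
\kappa(\mathbf{X}) \in \mathbb{R}^{b \times D},
\qquad D \le m.
```

For instance, with \(b = 100\), \(m = 64\), and \(D = 2\), a batch of shape \((100, 64)\) would be compressed to shape \((100, 2)\).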

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `D` | int | Number of dimensions to retain after compression. |
| `n_neighbors` | int | Number of neighbors used in the manifold learning algorithm. |
| `name` | str | Name of the transformation. |
| `manifold_function` | manifold | The manifold learning function used for compression. |

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `D` | int | Number of dimensions to retain after compression. | required |
| `n_neighbors` | int | Number of neighbors to use for manifold learning. Defaults to 1. | `1` |
| `name` | str | Name of the transformation. Defaults to 'dimension_reduction_compression'. | `'dimension_reduction_compression'` |
| `manifold_function` | manifold | A pre-configured manifold function. Defaults to None. | `None` |
| `manifold_function_configs` | dict | Configuration for initializing the manifold function. Must contain the key 'function_class' and may contain 'function_parameters'. Defaults to None. | `None` |
| `*args` | tuple | Additional positional arguments for the parent transformation class. | `()` |
| `**kwargs` | dict | Additional keyword arguments for the parent transformation class. | `{}` |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If neither manifold_function nor manifold_function_configs is specified. |

Methods:

| Name | Description |
| --- | --- |
| `__init__` | Initializes the manifold-based dimensionality reduction instance. |
| `calculate_D` | Returns the number of dimensions to retain (D). |
| `forward` | Applies the manifold function to the input tensor and reduces its dimensionality. |

Source code in tinybig/compression/manifold_compression.py
class manifold_compression(transformation):
    r"""
        The manifold-based data compression function.

        This class reduces the dimensionality of input features by applying a manifold learning technique
        such as Isomap, Locally Linear Embedding (LLE), or other manifold methods.

        Notes
        ----------
        Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, we can represent the manifold-based data compression function as follows:

        $$
            \begin{equation}
            \kappa(\mathbf{x}) = \text{manifold}(\mathbf{x}) \in \mathbb{R}^d.
            \end{equation}
        $$

        The output dimension $d$ is set manually through the required hyper-parameter $D$.

        Attributes
        ----------
        D : int
            Number of dimensions to retain after compression.
        n_neighbors : int
            Number of neighbors used in the manifold learning algorithm.
        name : str
            Name of the transformation.
        manifold_function : manifold
            The manifold learning function used for compression.

        Parameters
        ----------
        D : int
            Number of dimensions to retain after compression.
        n_neighbors : int, optional
            Number of neighbors to use for manifold learning. Defaults to 1.
        name : str, optional
            Name of the transformation. Defaults to 'dimension_reduction_compression'.
        manifold_function : manifold, optional
            A pre-configured manifold function. Defaults to None.
        manifold_function_configs : dict, optional
            Configuration for initializing the manifold function. Should include the class name
            and optional parameters. Defaults to None.
        *args : tuple
            Additional positional arguments for the parent `transformation` class.
        **kwargs : dict
            Additional keyword arguments for the parent `transformation` class.

        Raises
        ------
        ValueError
            If neither `manifold_function` nor `manifold_function_configs` is specified.

        Methods
        -------
        __init__(D, n_neighbors=1, name='dimension_reduction_compression', manifold_function=None, manifold_function_configs=None, *args, **kwargs)
            Initializes the manifold-based dimensionality reduction instance.
        calculate_D(m: int)
            Returns the number of dimensions to retain (`D`).
        forward(x: torch.Tensor, device='cpu', *args, **kwargs)
            Applies the manifold function to the input tensor and reduces its dimensionality.
    """
    def __init__(self, D: int, n_neighbors: int = 1, name='dimension_reduction_compression', manifold_function: manifold = None, manifold_function_configs: dict = None, *args, **kwargs):
        """
            The initialization method of the manifold-based compression function.

            It initializes the compression function based on the provided manifold function or its configs.

            Parameters
            ----------
            D : int
                Number of dimensions to retain after compression.
            n_neighbors : int, optional
                Number of neighbors to use for manifold learning. Defaults to 1.
            name : str, optional
                Name of the transformation. Defaults to 'dimension_reduction_compression'.
            manifold_function : manifold, optional
                A pre-configured manifold function. Defaults to None.
            manifold_function_configs : dict, optional
                Configuration for initializing the manifold function. Should include the class name
                and optional parameters. Defaults to None.
            *args : tuple
                Additional positional arguments for the parent `transformation` class.
            **kwargs : dict
                Additional keyword arguments for the parent `transformation` class.

            Raises
            ------
            ValueError
                If neither `manifold_function` nor `manifold_function_configs` is specified.
        """
        super().__init__(name=name, *args, **kwargs)
        self.D = D
        self.n_neighbors = n_neighbors

        if manifold_function is not None:
            self.manifold_function = manifold_function
        elif manifold_function_configs is not None:
            function_class = manifold_function_configs['function_class']
            function_parameters = manifold_function_configs['function_parameters'] if 'function_parameters' in manifold_function_configs else {}
            if 'n_components' in function_parameters:
                assert function_parameters['n_components'] == D
            else:
                function_parameters['n_components'] = D
            self.manifold_function = config.get_obj_from_str(function_class)(**function_parameters)
        else:
            raise ValueError('You must specify either manifold_function or manifold_function_configs...')

    def calculate_D(self, m: int):
        """
            The compression dimension calculation method.

            It returns the compression output dimension `D`. The input dimension `m` is used only for validation:
            `D` must be set and must not exceed `m`.

            Parameters
            ----------
            m : int
                Total number of features in the input.

            Returns
            -------
            int
                The number of dimensions to retain (`D`).

            Raises
            ------
            AssertionError
                If `D` is not set or is greater than `m`.
        """
        assert self.D is not None and self.D <= m, 'You must specify a D that is no greater than m!'
        return self.D

    def forward(self, x: torch.Tensor, device: str = 'cpu', *args, **kwargs):
        r"""
            The forward method of the manifold-based compression function.

            It applies the manifold function to the input tensor and reduces its dimensionality.

            Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, we can represent the manifold-based data compression function as follows:

            $$
                \begin{equation}
                \kappa(\mathbf{x}) = \text{manifold}(\mathbf{x}) \in \mathbb{R}^d.
                \end{equation}
            $$

            Parameters
            ----------
            x : torch.Tensor
                Input tensor of shape `(batch_size, num_features)`.
            device : str, optional
                Device for computation (e.g., 'cpu' or 'cuda'). Defaults to 'cpu'.
            *args : tuple
                Additional positional arguments for pre- and post-processing.
            **kwargs : dict
                Additional keyword arguments for pre- and post-processing.

            Returns
            -------
            torch.Tensor
                Compressed tensor of shape `(batch_size, D)`.

            Raises
            ------
            AssertionError
                If the output tensor shape does not match the expected `(batch_size, D)`.
        """
        b, m = x.shape
        x = self.pre_process(x=x, device=device)

        # Detach and move to CPU first so that GPU tensors and tensors tracking gradients are accepted
        compression = self.manifold_function(x.detach().cpu()).to(device)

        assert compression.shape == (b, self.calculate_D(m=m))
        return self.post_process(x=compression, device=device)

__init__(D, n_neighbors=1, name='dimension_reduction_compression', manifold_function=None, manifold_function_configs=None, *args, **kwargs)

The initialization method of the manifold-based compression function.

It initializes the compression function based on the provided manifold function or its configs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `D` | int | Number of dimensions to retain after compression. | required |
| `n_neighbors` | int | Number of neighbors to use for manifold learning. Defaults to 1. | `1` |
| `name` | str | Name of the transformation. Defaults to 'dimension_reduction_compression'. | `'dimension_reduction_compression'` |
| `manifold_function` | manifold | A pre-configured manifold function. Defaults to None. | `None` |
| `manifold_function_configs` | dict | Configuration for initializing the manifold function. Must contain the key 'function_class' and may contain 'function_parameters'. Defaults to None. | `None` |
| `*args` | tuple | Additional positional arguments for the parent transformation class. | `()` |
| `**kwargs` | dict | Additional keyword arguments for the parent transformation class. | `{}` |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If neither manifold_function nor manifold_function_configs is specified. |
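The three initialization branches described above (use a ready-made function, build one from a config dict, or fail) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `resolve_manifold_function` is a hypothetical helper, `get_obj_from_str` is a stand-in for `config.get_obj_from_str` written with `importlib`, and `types.SimpleNamespace` plays the role of a manifold class only because it accepts arbitrary keyword arguments. Unlike the source, the sketch raises `ValueError` rather than using `assert` when `n_components` disagrees with `D`.

```python
import importlib


def get_obj_from_str(path: str):
    """Hypothetical stand-in for config.get_obj_from_str: resolve 'pkg.module.Name'."""
    module_name, _, attr = path.rpartition('.')
    return getattr(importlib.import_module(module_name), attr)


def resolve_manifold_function(D, manifold_function=None, manifold_function_configs=None):
    """Mirror the three branches of __init__ (illustrative helper, not part of the library)."""
    # Branch 1: a ready-made object always wins over a config dict.
    if manifold_function is not None:
        return manifold_function
    # Branch 2: build the object from its dotted class path and parameters.
    if manifold_function_configs is not None:
        function_class = manifold_function_configs['function_class']
        # Copy so the caller's dict is left untouched (the source updates it in place).
        params = dict(manifold_function_configs.get('function_parameters', {}))
        # n_components defaults to D; an explicit value must agree with D.
        if params.setdefault('n_components', D) != D:
            raise ValueError(f"n_components={params['n_components']} conflicts with D={D}")
        return get_obj_from_str(function_class)(**params)
    # Branch 3: nothing to build from.
    raise ValueError('You must specify either manifold_function or manifold_function_configs...')


# types.SimpleNamespace is only a placeholder class; it performs no manifold learning.
fn = resolve_manifold_function(
    D=2, manifold_function_configs={'function_class': 'types.SimpleNamespace'}
)
print(fn.n_components)  # 2
```

Because `n_components` is filled in from `D` when it is absent, a config dict only needs to name the class; passing a different `n_components` is treated as an error, not silently overridden.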

Source code in tinybig/compression/manifold_compression.py
def __init__(self, D: int, n_neighbors: int = 1, name='dimension_reduction_compression', manifold_function: manifold = None, manifold_function_configs: dict = None, *args, **kwargs):
    """
        The initialization method of the manifold-based compression function.

        It initializes the compression function based on the provided manifold function or its configs.

        Parameters
        ----------
        D : int
            Number of dimensions to retain after compression.
        n_neighbors : int, optional
            Number of neighbors to use for manifold learning. Defaults to 1.
        name : str, optional
            Name of the transformation. Defaults to 'dimension_reduction_compression'.
        manifold_function : manifold, optional
            A pre-configured manifold function. Defaults to None.
        manifold_function_configs : dict, optional
            Configuration for initializing the manifold function. Should include the class name
            and optional parameters. Defaults to None.
        *args : tuple
            Additional positional arguments for the parent `transformation` class.
        **kwargs : dict
            Additional keyword arguments for the parent `transformation` class.

        Raises
        ------
        ValueError
            If neither `manifold_function` nor `manifold_function_configs` is specified.
    """
    super().__init__(name=name, *args, **kwargs)
    self.D = D
    self.n_neighbors = n_neighbors

    if manifold_function is not None:
        self.manifold_function = manifold_function
    elif manifold_function_configs is not None:
        function_class = manifold_function_configs['function_class']
        function_parameters = manifold_function_configs['function_parameters'] if 'function_parameters' in manifold_function_configs else {}
        if 'n_components' in function_parameters:
            assert function_parameters['n_components'] == D
        else:
            function_parameters['n_components'] = D
        self.manifold_function = config.get_obj_from_str(function_class)(**function_parameters)
    else:
        raise ValueError('You must specify either manifold_function or manifold_function_configs...')

calculate_D(m)

The compression dimension calculation method.

It returns the compression output dimension D. The input dimension m is used only for validation: D must be set and must not exceed m.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `m` | int | Total number of features in the input. | required |

Returns:

| Type | Description |
| --- | --- |
| int | The number of dimensions to retain (D). |

Raises:

| Type | Description |
| --- | --- |
| AssertionError | If D is not set or is greater than m. |
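The validation rule can be illustrated with a standalone function. This is a sketch, not the method itself: it takes `D` as an argument instead of reading `self.D`, and it raises `ValueError` instead of relying on `assert`, because assertions are skipped when Python runs with the `-O` flag.

```python
from typing import Optional


def calculate_D(D: Optional[int], m: int) -> int:
    """Return D after checking that it is set and does not exceed m."""
    if D is None or D > m:
        raise ValueError(f'D must be set and satisfy D <= m (got D={D}, m={m})')
    return D


print(calculate_D(D=2, m=5))  # 2
print(calculate_D(D=5, m=5))  # 5, since D == m is allowed
```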

Source code in tinybig/compression/manifold_compression.py
def calculate_D(self, m: int):
    """
        The compression dimension calculation method.

        It returns the compression output dimension `D`. The input dimension `m` is used only for validation:
        `D` must be set and must not exceed `m`.

        Parameters
        ----------
        m : int
            Total number of features in the input.

        Returns
        -------
        int
            The number of dimensions to retain (`D`).

        Raises
        ------
        AssertionError
            If `D` is not set or is greater than `m`.
    """
    assert self.D is not None and self.D <= m, 'You must specify a D that is no greater than m!'
    return self.D

forward(x, device='cpu', *args, **kwargs)

The forward method of the manifold-based compression function.

It applies the manifold function to the input tensor and reduces its dimensionality.

Formally, given an input data instance \(\mathbf{x} \in \mathbb{R}^m\), we can represent the manifold-based data compression function as follows:

\[
    \begin{equation}
    \kappa(\mathbf{x}) = \text{manifold}(\mathbf{x}) \in \mathbb{R}^d.
    \end{equation}
\]

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | Tensor | Input tensor of shape (batch_size, num_features). | required |
| `device` | str | Device for computation (e.g., 'cpu' or 'cuda'). Defaults to 'cpu'. | `'cpu'` |
| `*args` | tuple | Additional positional arguments for pre- and post-processing. | `()` |
| `**kwargs` | dict | Additional keyword arguments for pre- and post-processing. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| Tensor | Compressed tensor of shape (batch_size, D). |

Raises:

| Type | Description |
| --- | --- |
| AssertionError | If the output tensor shape does not match the expected (batch_size, D). |
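The shape contract of `forward` (a `(batch_size, num_features)` input yields a `(batch_size, D)` output, otherwise an error) can be sketched without any tensor library. Everything below is illustrative: `compress` and `keep_first_two` are hypothetical names, nested lists stand in for tensors, pre- and post-processing are omitted, and the toy reducer simply keeps the first two columns. It is not a manifold learning method and is there only to exercise the shape check.

```python
from typing import Callable, List

Matrix = List[List[float]]


def compress(x: Matrix, D: int, reducer: Callable[[Matrix], Matrix]) -> Matrix:
    """Apply `reducer` and enforce the (batch_size, D) output shape."""
    if not x:
        raise ValueError('x must contain at least one row')
    b, m = len(x), len(x[0])
    # Same bound as calculate_D: the target dimension cannot exceed the input dimension.
    if D > m:
        raise ValueError(f'D={D} exceeds the input dimension m={m}')
    out = reducer(x)
    shape = (len(out), len(out[0]) if out else 0)
    # Mirrors the final assertion in forward, raised here as an explicit exception.
    if shape != (b, D):
        raise ValueError(f'expected shape {(b, D)}, got {shape}')
    return out


def keep_first_two(x: Matrix) -> Matrix:
    """Toy reducer that keeps the first two columns. NOT a manifold method."""
    return [row[:2] for row in x]


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(compress(x, D=2, reducer=keep_first_two))  # [[1.0, 2.0], [4.0, 5.0]]
```

A reducer configured for a different output width than `D` fails the shape check, which is the situation the documented `AssertionError` guards against.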

Source code in tinybig/compression/manifold_compression.py
def forward(self, x: torch.Tensor, device: str = 'cpu', *args, **kwargs):
    r"""
        The forward method of the manifold-based compression function.

        It applies the manifold function to the input tensor and reduces its dimensionality.

        Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, we can represent the manifold-based data compression function as follows:

        $$
            \begin{equation}
            \kappa(\mathbf{x}) = \text{manifold}(\mathbf{x}) \in \mathbb{R}^d.
            \end{equation}
        $$

        Parameters
        ----------
        x : torch.Tensor
            Input tensor of shape `(batch_size, num_features)`.
        device : str, optional
            Device for computation (e.g., 'cpu' or 'cuda'). Defaults to 'cpu'.
        *args : tuple
            Additional positional arguments for pre- and post-processing.
        **kwargs : dict
            Additional keyword arguments for pre- and post-processing.

        Returns
        -------
        torch.Tensor
            Compressed tensor of shape `(batch_size, D)`.

        Raises
        ------
        AssertionError
            If the output tensor shape does not match the expected `(batch_size, D)`.
    """
    b, m = x.shape
    x = self.pre_process(x=x, device=device)

    # Detach and move to CPU first so that GPU tensors and tensors tracking gradients are accepted
    compression = self.manifold_function(x.detach().cpu()).to(device)

    assert compression.shape == (b, self.calculate_D(m=m))
    return self.post_process(x=compression, device=device)