dimension_reduction_compression

Bases: transformation

The dimension reduction based data compression function.

This class reduces the dimensionality of input features by applying a dimension reduction function that is either passed in directly or built from the provided configurations.

Notes

Formally, given an input data instance \(\mathbf{x} \in \mathbb{R}^m\), the dimension reduction-based data compression function can be written as:

\[
    \begin{equation}
    \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
    \end{equation}
\]

In this class the output dimension \(d\) is set manually through the required hyper-parameter \(D\), with \(D \le m\).
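
As one concrete instance, offered only as an illustration (the class itself does not prescribe a reduction method), a linear reduction such as PCA or a random projection could be written with a hypothetical projection matrix \(\mathbf{W}\) and offset \(\boldsymbol{\mu}\), neither of which appears in the class:

\[
    \kappa(\mathbf{x}) = \mathbf{W}^\top (\mathbf{x} - \boldsymbol{\mu}) \in \mathbb{R}^d, \quad \mathbf{W} \in \mathbb{R}^{m \times d}, \quad d = D \le m.
\]

With \(m = 4\) and \(D = 2\), for example, each four-feature instance is mapped to a two-dimensional vector.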

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `D` | `int` | Number of dimensions to retain after compression. |
| `name` | `str` | Name of the transformation. |
| `dr_function` | `incremental_dimension_reduction` | The dimension reduction function used for compression. |

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `D` | `int` | Number of dimensions to retain after compression. | required |
| `name` | `str` | Name of the transformation. | `'dimension_reduction_compression'` |
| `dr_function` | `incremental_dimension_reduction` | A pre-configured dimension reduction function. | `None` |
| `dr_function_configs` | `dict` | Configuration for initializing the dimension reduction function. Should include the class name under `'function_class'` and, optionally, its parameters under `'function_parameters'`. | `None` |
| `*args` | `tuple` | Additional positional arguments for the parent transformation class. | `()` |
| `**kwargs` | `dict` | Additional keyword arguments for the parent transformation class. | `{}` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If neither `dr_function` nor `dr_function_configs` is specified. |

Methods:

| Name | Description |
| --- | --- |
| `__init__` | Initializes the dimension reduction and compression instance. |
| `calculate_D` | Validates and returns the specified number of dimensions (`D`). |
| `forward` | Applies the dimension reduction and compression function to the input tensor. |

Source code in tinybig/compression/dimension_reduction_compression.py
class dimension_reduction_compression(transformation):
    r"""
        The dimension reduction based data compression function.

        This class reduces the dimensionality of input features by applying a dimension reduction
        function that is either passed in directly or built from the provided configurations.

        Notes
        ----------
        Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, the dimension reduction-based data compression function can be written as:

        $$
            \begin{equation}
            \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
            \end{equation}
        $$

        In this class the output dimension $d$ is set manually through the required hyper-parameter $D$, with $D \le m$.

        Attributes
        ----------
        D : int
            Number of dimensions to retain after compression.
        name : str
            Name of the transformation.
        dr_function : incremental_dimension_reduction
            The dimension reduction function used for compression.

        Parameters
        ----------
        D : int
            Number of dimensions to retain after compression.
        name : str, optional
            Name of the transformation. Defaults to 'dimension_reduction_compression'.
        dr_function : incremental_dimension_reduction, optional
            A pre-configured dimension reduction function. Defaults to None.
        dr_function_configs : dict, optional
            Configuration for initializing the dimension reduction function. Should include the class name
            and optional parameters. Defaults to None.
        *args : tuple
            Additional positional arguments for the parent `transformation` class.
        **kwargs : dict
            Additional keyword arguments for the parent `transformation` class.

        Raises
        ------
        ValueError
            If neither `dr_function` nor `dr_function_configs` are specified.

        Methods
        -------
        __init__(D, name='dimension_reduction_compression', dr_function=None, dr_function_configs=None, *args, **kwargs)
            Initializes the dimension reduction and compression instance.
        calculate_D(m: int)
            Validates and returns the specified number of dimensions (`D`).
        forward(x: torch.Tensor, device='cpu', *args, **kwargs)
            Applies the dimension reduction and compression function to the input tensor.
    """
    def __init__(self, D: int, name='dimension_reduction_compression', dr_function: incremental_dimension_reduction = None, dr_function_configs: dict = None, *args, **kwargs):
        """
            Initializes the dimension reduction and compression instance.

            This method sets the number of dimensions (`D`) to retain and initializes the dimension reduction
            function using either a direct function or a configuration.

            Parameters
            ----------
            D : int
                Number of dimensions to retain after compression.
            name : str, optional
                Name of the transformation. Defaults to 'dimension_reduction_compression'.
            dr_function : incremental_dimension_reduction, optional
                A pre-configured dimension reduction function. Defaults to None.
            dr_function_configs : dict, optional
                Configuration for initializing the dimension reduction function. Should include the class name
                and optional parameters. Defaults to None.
            *args : tuple
                Additional positional arguments for the parent `transformation` class.
            **kwargs : dict
                Additional keyword arguments for the parent `transformation` class.

            Raises
            ------
            ValueError
                If neither `dr_function` nor `dr_function_configs` are specified.

            Returns
            -------
            None
                The constructor initializes the instance in place and returns nothing.
        """
        super().__init__(name=name, *args, **kwargs)
        self.D = D

        if dr_function is not None:
            self.dr_function = dr_function
        elif dr_function_configs is not None:
            function_class = dr_function_configs['function_class']
            function_parameters = dr_function_configs['function_parameters'] if 'function_parameters' in dr_function_configs else {}
            if 'n_feature' in function_parameters:
                assert function_parameters['n_feature'] == D
            else:
                function_parameters['n_feature'] = D
            self.dr_function = config.get_obj_from_str(function_class)(**function_parameters)
        else:
            raise ValueError('You must specify either dr_function or dr_function_configs...')

    def calculate_D(self, m: int):
        """
            The compression dimension calculation method.

            It returns the configured compression dimension `D`; the input dimension `m` is used only
            to validate that `D` is set and is less than or equal to `m`.

            Parameters
            ----------
            m : int
                Total number of features in the input.

            Returns
            -------
            int
                The number of dimensions to retain (`D`).

            Raises
            ------
            AssertionError
                If `D` is not set or is greater than `m`.
        """
        assert self.D is not None and self.D <= m, 'You must specify a D that is no larger than m!'
        return self.D

    def forward(self, x: torch.Tensor, device: str = 'cpu', *args, **kwargs):
        r"""
            The forward method of the dimension reduction based compression function.

            It applies the dimension reduction and compression function to the input tensor.

            Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, the dimension reduction-based data compression function can be written as:

            $$
                \begin{equation}
                \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
                \end{equation}
            $$

            Parameters
            ----------
            x : torch.Tensor
                Input tensor of shape `(batch_size, num_features)`.
            device : str, optional
                Device for computation (e.g., 'cpu', 'cuda' or 'mps'). Defaults to 'cpu'.
            *args : tuple
                Additional positional arguments for pre- and post-processing.
            **kwargs : dict
                Additional keyword arguments for pre- and post-processing.

            Returns
            -------
            torch.Tensor
                Compressed tensor of shape `(batch_size, D)`.

            Raises
            ------
            AssertionError
                If the output tensor shape does not match the expected `(batch_size, D)`.
        """
        b, m = x.shape
        x = self.pre_process(x=x, device=device)

        compression = self.dr_function(torch.from_numpy(x.numpy())).to(device)

        assert compression.shape == (b, self.calculate_D(m=m))
        return self.post_process(x=compression, device=device)

__init__(D, name='dimension_reduction_compression', dr_function=None, dr_function_configs=None, *args, **kwargs)

Initializes the dimension reduction and compression instance.

This method sets the number of dimensions (D) to retain and initializes the dimension reduction function using either a direct function or a configuration.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `D` | `int` | Number of dimensions to retain after compression. | required |
| `name` | `str` | Name of the transformation. | `'dimension_reduction_compression'` |
| `dr_function` | `incremental_dimension_reduction` | A pre-configured dimension reduction function. | `None` |
| `dr_function_configs` | `dict` | Configuration for initializing the dimension reduction function. Should include the class name under `'function_class'` and, optionally, its parameters under `'function_parameters'`. | `None` |
| `*args` | `tuple` | Additional positional arguments for the parent transformation class. | `()` |
| `**kwargs` | `dict` | Additional keyword arguments for the parent transformation class. | `{}` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If neither `dr_function` nor `dr_function_configs` is specified. |

Returns:

| Type | Description |
| --- | --- |
| `None` | The constructor initializes the instance in place and returns nothing. |
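
The three initialization branches described above (a ready-made function, a configuration dictionary, or an error) can be sketched in isolation. This is a minimal sketch under stated assumptions, not the library's implementation: `get_obj_from_str` here is a hypothetical stand-in for `config.get_obj_from_str`, `resolve_dr_function` is a hypothetical helper name, and `types.SimpleNamespace` is only a placeholder class that happens to accept an `n_feature` keyword. Unlike the constructor, the sketch copies the parameter dictionary and raises `ValueError` on a mismatch instead of using `assert`.

```python
import importlib


def get_obj_from_str(path: str):
    """Hypothetical stand-in for `config.get_obj_from_str`: resolve a dotted path."""
    module_name, _, attr = path.rpartition('.')
    return getattr(importlib.import_module(module_name), attr)


def resolve_dr_function(D, dr_function=None, dr_function_configs=None):
    """Hypothetical helper mirroring the three branches of `__init__`."""
    if dr_function is not None:
        # Branch 1: a pre-configured function is used as-is.
        return dr_function
    if dr_function_configs is not None:
        # Branch 2: build the function from its class path and parameters.
        function_class = dr_function_configs['function_class']
        # Copy so the caller's dictionary is not mutated (a choice of this sketch).
        function_parameters = dict(dr_function_configs.get('function_parameters', {}))
        # `n_feature` is filled in with D when omitted and must agree with D otherwise.
        if function_parameters.setdefault('n_feature', D) != D:
            raise ValueError('n_feature must equal D')
        return get_obj_from_str(function_class)(**function_parameters)
    # Branch 3: nothing to build from.
    raise ValueError('You must specify either dr_function or dr_function_configs...')


# Placeholder class path; a real configuration would name a reducer class instead.
configs = {'function_class': 'types.SimpleNamespace'}
reducer = resolve_dr_function(D=2, dr_function_configs=configs)
print(reducer.n_feature)
```

Passing both `dr_function` and `dr_function_configs` is not an error in this flow: the pre-configured function takes precedence and the configuration is ignored.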

Source code in tinybig/compression/dimension_reduction_compression.py
def __init__(self, D: int, name='dimension_reduction_compression', dr_function: incremental_dimension_reduction = None, dr_function_configs: dict = None, *args, **kwargs):
    """
        Initializes the dimension reduction and compression instance.

        This method sets the number of dimensions (`D`) to retain and initializes the dimension reduction
        function using either a direct function or a configuration.

        Parameters
        ----------
        D : int
            Number of dimensions to retain after compression.
        name : str, optional
            Name of the transformation. Defaults to 'dimension_reduction_compression'.
        dr_function : incremental_dimension_reduction, optional
            A pre-configured dimension reduction function. Defaults to None.
        dr_function_configs : dict, optional
            Configuration for initializing the dimension reduction function. Should include the class name
            and optional parameters. Defaults to None.
        *args : tuple
            Additional positional arguments for the parent `transformation` class.
        **kwargs : dict
            Additional keyword arguments for the parent `transformation` class.

        Raises
        ------
        ValueError
            If neither `dr_function` nor `dr_function_configs` are specified.

        Returns
        -------
        None
            The constructor initializes the instance in place and returns nothing.
    """
    super().__init__(name=name, *args, **kwargs)
    self.D = D

    if dr_function is not None:
        self.dr_function = dr_function
    elif dr_function_configs is not None:
        function_class = dr_function_configs['function_class']
        function_parameters = dr_function_configs['function_parameters'] if 'function_parameters' in dr_function_configs else {}
        if 'n_feature' in function_parameters:
            assert function_parameters['n_feature'] == D
        else:
            function_parameters['n_feature'] = D
        self.dr_function = config.get_obj_from_str(function_class)(**function_parameters)
    else:
        raise ValueError('You must specify either dr_function or dr_function_configs...')

calculate_D(m)

The compression dimension calculation method.

It returns the configured compression dimension D; the input dimension m is used only to validate that D is set and is less than or equal to m.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `m` | `int` | Total number of features in the input. | required |

Returns:

| Type | Description |
| --- | --- |
| `int` | The number of dimensions to retain (`D`). |

Raises:

| Type | Description |
| --- | --- |
| `AssertionError` | If `D` is not set or is greater than `m`. |
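
The validation rule can be illustrated with a standalone function. This is a sketch only: `validated_D` is a hypothetical name, and it takes `D` as an argument where the method reads `self.D`.

```python
def validated_D(D, m):
    """Return D unchanged after checking that it is set and does not exceed m."""
    assert D is not None and D <= m, 'D must be set and no larger than m'
    return D


print(validated_D(D=2, m=4))  # 2
print(validated_D(D=4, m=4))  # 4, because D == m is allowed
```

The check permits `D == m`, in which case no dimensions are removed; only `D > m` or an unset `D` fails.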

Source code in tinybig/compression/dimension_reduction_compression.py
def calculate_D(self, m: int):
    """
        The compression dimension calculation method.

        It returns the configured compression dimension `D`; the input dimension `m` is used only
        to validate that `D` is set and is less than or equal to `m`.

        Parameters
        ----------
        m : int
            Total number of features in the input.

        Returns
        -------
        int
            The number of dimensions to retain (`D`).

        Raises
        ------
        AssertionError
            If `D` is not set or is greater than `m`.
    """
    assert self.D is not None and self.D <= m, 'You must specify a D that is no larger than m!'
    return self.D

forward(x, device='cpu', *args, **kwargs)

The forward method of the dimension reduction based compression function.

It applies the dimension reduction and compression function to the input tensor.

Formally, given an input data instance \(\mathbf{x} \in \mathbb{R}^m\), the dimension reduction-based data compression function can be written as:

\[
    \begin{equation}
    \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
    \end{equation}
\]

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input tensor of shape `(batch_size, num_features)`. | required |
| `device` | `str` | Device for computation (e.g., `'cpu'`, `'cuda'` or `'mps'`). | `'cpu'` |
| `*args` | `tuple` | Additional positional arguments for pre- and post-processing. | `()` |
| `**kwargs` | `dict` | Additional keyword arguments for pre- and post-processing. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Compressed tensor of shape `(batch_size, D)`. |

Raises:

| Type | Description |
| --- | --- |
| `AssertionError` | If the output tensor shape does not match the expected `(batch_size, D)`. |
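
The shape contract of `forward`, a `(batch_size, num_features)` input mapped to a `(batch_size, D)` output, can be sketched without any dependencies. Everything below is an assumption made for illustration: `random_projection` is a hypothetical stand-in reducer (a fixed Gaussian projection), plain nested lists replace tensors, and the pre- and post-processing steps are omitted.

```python
import random


class random_projection:
    """Hypothetical stand-in reducer: projects each row with a fixed Gaussian matrix."""

    def __init__(self, n_feature: int, seed: int = 0):
        self.n_feature = n_feature
        self.rng = random.Random(seed)
        self.W = None  # created lazily once the input width m is known

    def __call__(self, rows):
        m = len(rows[0])
        if self.W is None:
            # m x n_feature projection matrix with standard normal entries
            self.W = [[self.rng.gauss(0.0, 1.0) for _ in range(self.n_feature)]
                      for _ in range(m)]
        # Row-by-matrix product: each output row has n_feature entries.
        return [[sum(row[i] * self.W[i][j] for i in range(m))
                 for j in range(self.n_feature)]
                for row in rows]


def forward_sketch(rows, reducer, D):
    """Reduce a (b, m) batch and check the (b, D) output contract."""
    b, m = len(rows), len(rows[0])
    assert D is not None and D <= m
    out = reducer(rows)
    assert (len(out), len(out[0])) == (b, D)
    return out


batch = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.0, -1.0, 2.0], [3.0, 1.0, 0.0, 0.0]]
compressed = forward_sketch(batch, random_projection(n_feature=2), D=2)
print(len(compressed), len(compressed[0]))  # 3 2
```

A reducer whose output width differs from `D` trips the shape assertion, which is why the constructor insists that `n_feature` equals `D`.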

Source code in tinybig/compression/dimension_reduction_compression.py
def forward(self, x: torch.Tensor, device: str = 'cpu', *args, **kwargs):
    r"""
        The forward method of the dimension reduction based compression function.

        It applies the dimension reduction and compression function to the input tensor.

        Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, the dimension reduction-based data compression function can be written as:

        $$
            \begin{equation}
            \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
            \end{equation}
        $$

        Parameters
        ----------
        x : torch.Tensor
            Input tensor of shape `(batch_size, num_features)`.
        device : str, optional
            Device for computation (e.g., 'cpu', 'cuda' or 'mps'). Defaults to 'cpu'.
        *args : tuple
            Additional positional arguments for pre- and post-processing.
        **kwargs : dict
            Additional keyword arguments for pre- and post-processing.

        Returns
        -------
        torch.Tensor
            Compressed tensor of shape `(batch_size, D)`.

        Raises
        ------
        AssertionError
            If the output tensor shape does not match the expected `(batch_size, D)`.
    """
    b, m = x.shape
    x = self.pre_process(x=x, device=device)

    compression = self.dr_function(torch.from_numpy(x.numpy())).to(device)

    assert compression.shape == (b, self.calculate_D(m=m))
    return self.post_process(x=compression, device=device)