dimension_reduction_compression

Bases: transformation

The dimension reduction based data compression function.

This class reduces the dimensionality of input features by applying a specified dimension reduction function or initializing it based on provided configurations.

Notes

Formally, given an input data instance \(\mathbf{x} \in \mathbb{R}^m\), we can represent the dimension reduction-based data compression function as follows:

\[
    \begin{equation}
    \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
    \end{equation}
\]

The output dimension \(d\) may require manual setup, e.g., as a hyper-parameter \(D\).
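
As an illustration of the mapping \(\kappa: \mathbb{R}^m \to \mathbb{R}^d\) described above, here is a minimal sketch that uses PCA via a thin SVD as the reduction. The helper `pca_reduce` is a hypothetical name chosen for this example and is not part of tinybig; the class itself delegates to whatever `dr_function` it is given.

```python
import torch


def pca_reduce(x: torch.Tensor, d: int) -> torch.Tensor:
    """Project the rows of x (shape (b, m)) onto the top-d principal directions."""
    # Center each feature so the SVD captures variance rather than the mean.
    x_centered = x - x.mean(dim=0, keepdim=True)
    # Rows of vh are the principal directions, ordered by singular value.
    _, _, vh = torch.linalg.svd(x_centered, full_matrices=False)
    return x_centered @ vh[:d].T


torch.manual_seed(0)
x = torch.randn(8, 5)          # b = 8 instances, m = 5 features
kappa_x = pca_reduce(x, d=2)   # d = 2 plays the role of the hyper-parameter D
print(kappa_x.shape)           # torch.Size([8, 2])
```

The batch dimension is preserved and only the feature dimension shrinks from \(m\) to \(d\), which is the shape contract `forward` asserts.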

Attributes:

Name	Type	Description
`D`	`int`	Number of dimensions to retain after compression.
`name`	`str`	Name of the transformation.
`dr_function`	`incremental_dimension_reduction`	The dimension reduction function used for compression.

Parameters:

Name	Type	Description	Default
`D`	`int`	Number of dimensions to retain after compression.	required
`name`	`str`	Name of the transformation. Defaults to 'dimension_reduction_compression'.	`'dimension_reduction_compression'`
`dr_function`	`incremental_dimension_reduction`	A pre-configured dimension reduction function. Defaults to None.	`None`
`dr_function_configs`	`dict`	Configuration for initializing the dimension reduction function. Should include the class name and optional parameters. Defaults to None.	`None`
`*args`	`tuple`	Additional positional arguments for the parent `transformation` class.	`()`
`**kwargs`	`dict`	Additional keyword arguments for the parent `transformation` class.	`{}`

Raises:

Type	Description
`ValueError`	If neither `dr_function` nor `dr_function_configs` is specified.

Methods:

Name	Description
`__init__`	Initializes the dimension reduction and compression instance.
`calculate_D`	Validates and returns the specified number of dimensions (`D`).
`forward`	Applies the dimension reduction and compression function to the input tensor.

Source code in tinybig/compression/dimension_reduction_compression.py

class dimension_reduction_compression(transformation):
    r"""
        The dimension reduction based data compression function.

        This class reduces the dimensionality of input features by applying a specified dimension reduction
        function or initializing it based on provided configurations.

        Notes
        ----------
        Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, we can represent the dimension reduction-based data compression function as follows:

        $$
            \begin{equation}
            \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
            \end{equation}
        $$

        The output dimension $d$ may require manual setup, e.g., as a hyper-parameter $D$.

        Attributes
        ----------
        D : int
            Number of dimensions to retain after compression.
        name : str
            Name of the transformation.
        dr_function : incremental_dimension_reduction
            The dimension reduction function used for compression.

        Parameters
        ----------
        D : int
            Number of dimensions to retain after compression.
        name : str, optional
            Name of the transformation. Defaults to 'dimension_reduction_compression'.
        dr_function : incremental_dimension_reduction, optional
            A pre-configured dimension reduction function. Defaults to None.
        dr_function_configs : dict, optional
            Configuration for initializing the dimension reduction function. Should include the class name
            and optional parameters. Defaults to None.
        *args : tuple
            Additional positional arguments for the parent `transformation` class.
        **kwargs : dict
            Additional keyword arguments for the parent `transformation` class.

        Raises
        ------
        ValueError
            If neither `dr_function` nor `dr_function_configs` is specified.

        Methods
        -------
        __init__(D, name='dimension_reduction_compression', dr_function=None, dr_function_configs=None, *args, **kwargs)
            Initializes the dimension reduction and compression instance.
        calculate_D(m: int)
            Validates and returns the specified number of dimensions (`D`).
        forward(x: torch.Tensor, device='cpu', *args, **kwargs)
            Applies the dimension reduction and compression function to the input tensor.
    """
    def __init__(self, D: int, name='dimension_reduction_compression', dr_function: incremental_dimension_reduction = None, dr_function_configs: dict = None, *args, **kwargs):
        """
            Initializes the dimension reduction and compression instance.

            This method sets the number of dimensions (`D`) to retain and initializes the dimension reduction
            function using either a direct function or a configuration.

            Parameters
            ----------
            D : int
                Number of dimensions to retain after compression.
            name : str, optional
                Name of the transformation. Defaults to 'dimension_reduction_compression'.
            dr_function : incremental_dimension_reduction, optional
                A pre-configured dimension reduction function. Defaults to None.
            dr_function_configs : dict, optional
                Configuration for initializing the dimension reduction function. Should include the class name
                and optional parameters. Defaults to None.
            *args : tuple
                Additional positional arguments for the parent `transformation` class.
            **kwargs : dict
                Additional keyword arguments for the parent `transformation` class.

            Raises
            ------
            ValueError
                If neither `dr_function` nor `dr_function_configs` is specified.

            Returns
            -------
            transformation
                The dimension reduction based compression function.
        """
        super().__init__(name=name, *args, **kwargs)
        self.D = D

        if dr_function is not None:
            self.dr_function = dr_function
        elif dr_function_configs is not None:
            function_class = dr_function_configs['function_class']
            function_parameters = dr_function_configs['function_parameters'] if 'function_parameters' in dr_function_configs else {}
            if 'n_feature' in function_parameters:
                assert function_parameters['n_feature'] == D
            else:
                function_parameters['n_feature'] = D
            self.dr_function = config.get_obj_from_str(function_class)(**function_parameters)
        else:
            raise ValueError('You must specify either dr_function or dr_function_configs...')

    def calculate_D(self, m: int):
        """
            The compression dimension calculation method.

            It calculates the intermediate compression space dimension based on the input dimension parameter m.
            This method also validates the specified number of dimensions (`D`) and ensures it is less than or equal to `m`.

            Parameters
            ----------
            m : int
                Total number of features in the input.

            Returns
            -------
            int
                The number of dimensions to retain (`D`).

            Raises
            ------
            AssertionError
                If `D` is not set or is greater than `m`.
        """
        assert self.D is not None and self.D <= m, 'You must specify a D that is no greater than m!'
        return self.D

    def forward(self, x: torch.Tensor, device: str = 'cpu', *args, **kwargs):
        r"""
            The forward method of the dimension reduction based compression function.

            It applies the dimension reduction and compression function to the input tensor.

            Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, we can represent the dimension reduction-based data compression function as follows:

            $$
                \begin{equation}
                \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
                \end{equation}
            $$

            Parameters
            ----------
            x : torch.Tensor
                Input tensor of shape `(batch_size, num_features)`.
            device : str, optional
                Device for computation (e.g., 'cpu', 'cuda' or 'mps'). Defaults to 'cpu'.
            *args : tuple
                Additional positional arguments for pre- and post-processing.
            **kwargs : dict
                Additional keyword arguments for pre- and post-processing.

            Returns
            -------
            torch.Tensor
                Compressed tensor of shape `(batch_size, D)`.

            Raises
            ------
            AssertionError
                If the output tensor shape does not match the expected `(batch_size, D)`.
        """
        b, m = x.shape
        x = self.pre_process(x=x, device=device)

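        # detach() and cpu() keep the NumPy round trip valid for tensors that require grad or live on an accelerator.
        compression = self.dr_function(torch.from_numpy(x.detach().cpu().numpy())).to(device)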

        assert compression.shape == (b, self.calculate_D(m=m))
        return self.post_process(x=compression, device=device)

`__init__(D, name='dimension_reduction_compression', dr_function=None, dr_function_configs=None, *args, **kwargs)`

Initializes the dimension reduction and compression instance.

This method sets the number of dimensions (`D`) to retain and initializes the dimension reduction function using either a direct function or a configuration.

Parameters:

Name	Type	Description	Default
`D`	`int`	Number of dimensions to retain after compression.	required
`name`	`str`	Name of the transformation. Defaults to 'dimension_reduction_compression'.	`'dimension_reduction_compression'`
`dr_function`	`incremental_dimension_reduction`	A pre-configured dimension reduction function. Defaults to None.	`None`
`dr_function_configs`	`dict`	Configuration for initializing the dimension reduction function. Should include the class name and optional parameters. Defaults to None.	`None`
`*args`	`tuple`	Additional positional arguments for the parent `transformation` class.	`()`
`**kwargs`	`dict`	Additional keyword arguments for the parent `transformation` class.	`{}`

Raises:

Type	Description
`ValueError`	If neither `dr_function` nor `dr_function_configs` is specified.

Returns:

Type	Description
`transformation`	The dimension reduction based compression function.

Source code in tinybig/compression/dimension_reduction_compression.py

def __init__(self, D: int, name='dimension_reduction_compression', dr_function: incremental_dimension_reduction = None, dr_function_configs: dict = None, *args, **kwargs):
    """
        Initializes the dimension reduction and compression instance.

        This method sets the number of dimensions (`D`) to retain and initializes the dimension reduction
        function using either a direct function or a configuration.

        Parameters
        ----------
        D : int
            Number of dimensions to retain after compression.
        name : str, optional
            Name of the transformation. Defaults to 'dimension_reduction_compression'.
        dr_function : incremental_dimension_reduction, optional
            A pre-configured dimension reduction function. Defaults to None.
        dr_function_configs : dict, optional
            Configuration for initializing the dimension reduction function. Should include the class name
            and optional parameters. Defaults to None.
        *args : tuple
            Additional positional arguments for the parent `transformation` class.
        **kwargs : dict
            Additional keyword arguments for the parent `transformation` class.

        Raises
        ------
        ValueError
            If neither `dr_function` nor `dr_function_configs` is specified.

        Returns
        -------
        transformation
            The dimension reduction based compression function.
    """
    super().__init__(name=name, *args, **kwargs)
    self.D = D

    if dr_function is not None:
        self.dr_function = dr_function
    elif dr_function_configs is not None:
        function_class = dr_function_configs['function_class']
        function_parameters = dr_function_configs['function_parameters'] if 'function_parameters' in dr_function_configs else {}
        if 'n_feature' in function_parameters:
            assert function_parameters['n_feature'] == D
        else:
            function_parameters['n_feature'] = D
        self.dr_function = config.get_obj_from_str(function_class)(**function_parameters)
    else:
        raise ValueError('You must specify either dr_function or dr_function_configs...')
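
The configuration branch of the constructor reads a `function_class` entry and an optional `function_parameters` entry, then forces `n_feature` to agree with `D`. The sketch below isolates that merging step. The helper name `resolve_dr_configs` and the class path string are hypothetical, and unlike the source it copies the parameter dict and raises `ValueError` instead of using `assert`.

```python
def resolve_dr_configs(dr_function_configs: dict, D: int):
    """Return (function_class, function_parameters) with n_feature tied to D."""
    function_class = dr_function_configs['function_class']
    # Copy the parameters so the caller's configuration is left untouched.
    function_parameters = dict(dr_function_configs.get('function_parameters', {}))
    # Fill in n_feature when absent; reject a value that disagrees with D.
    if function_parameters.setdefault('n_feature', D) != D:
        raise ValueError("function_parameters['n_feature'] must equal D")
    return function_class, function_parameters


# 'my_package.my_reducer' is a placeholder path, not a real tinybig class.
configs = {'function_class': 'my_package.my_reducer'}
print(resolve_dr_configs(configs, D=3))
# ('my_package.my_reducer', {'n_feature': 3})
```

In the class, the resolved class path is then passed to `config.get_obj_from_str` and instantiated with the merged parameters.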

`calculate_D(m)`

The compression dimension calculation method.

It calculates the intermediate compression space dimension based on the input dimension parameter `m`. This method also validates the specified number of dimensions (`D`) and ensures it is less than or equal to `m`.

Parameters:

Name	Type	Description	Default
`m`	`int`	Total number of features in the input.	required

Returns:

Type	Description
`int`	The number of dimensions to retain (`D`).

Raises:

Type	Description
`AssertionError`	If `D` is not set or is greater than `m`.

Source code in tinybig/compression/dimension_reduction_compression.py

def calculate_D(self, m: int):
    """
        The compression dimension calculation method.

        It calculates the intermediate compression space dimension based on the input dimension parameter m.
        This method also validates the specified number of dimensions (`D`) and ensures it is less than or equal to `m`.

        Parameters
        ----------
        m : int
            Total number of features in the input.

        Returns
        -------
        int
            The number of dimensions to retain (`D`).

        Raises
        ------
        AssertionError
            If `D` is not set or is greater than `m`.
    """
    assert self.D is not None and self.D <= m, 'You must specify a D that is no greater than m!'
    return self.D
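
The check above permits `D == m` and rejects only `D > m` or an unset `D`. A minimal standalone sketch of the same rule follows, assuming a hypothetical helper `check_D` that raises `ValueError` explicitly, since `assert` statements are skipped when Python runs with `-O`.

```python
def check_D(D, m: int) -> int:
    """Return D if it is set and does not exceed the input dimension m."""
    if D is None:
        raise ValueError('D must be specified')
    if D > m:
        raise ValueError(f'D={D} must not exceed the input dimension m={m}')
    return D


print(check_D(3, m=5))   # 3
print(check_D(5, m=5))   # 5, because D == m is allowed
```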

`forward(x, device='cpu', *args, **kwargs)`

The forward method of the dimension reduction based compression function.

It applies the dimension reduction and compression function to the input tensor.

Formally, given an input data instance \(\mathbf{x} \in \mathbb{R}^m\), we can represent the dimension reduction-based data compression function as follows:

\[
    \begin{equation}
    \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
    \end{equation}
\]

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Input tensor of shape `(batch_size, num_features)`.	required
`device`	`str`	Device for computation (e.g., 'cpu', 'cuda' or 'mps'). Defaults to 'cpu'.	`'cpu'`
`*args`	`tuple`	Additional positional arguments for pre- and post-processing.	`()`
`**kwargs`	`dict`	Additional keyword arguments for pre- and post-processing.	`{}`

Returns:

Type	Description
`Tensor`	Compressed tensor of shape `(batch_size, D)`.

Raises:

Type	Description
`AssertionError`	If the output tensor shape does not match the expected `(batch_size, D)`.

Source code in tinybig/compression/dimension_reduction_compression.py

def forward(self, x: torch.Tensor, device: str = 'cpu', *args, **kwargs):
    r"""
        The forward method of the dimension reduction based compression function.

        It applies the dimension reduction and compression function to the input tensor.

        Formally, given an input data instance $\mathbf{x} \in \mathbb{R}^m$, we can represent the dimension reduction-based data compression function as follows:

        $$
            \begin{equation}
            \kappa(\mathbf{x}) = \text{dimension-reduction}(\mathbf{x}) \in \mathbb{R}^d.
            \end{equation}
        $$

        Parameters
        ----------
        x : torch.Tensor
            Input tensor of shape `(batch_size, num_features)`.
        device : str, optional
            Device for computation (e.g., 'cpu', 'cuda' or 'mps'). Defaults to 'cpu'.
        *args : tuple
            Additional positional arguments for pre- and post-processing.
        **kwargs : dict
            Additional keyword arguments for pre- and post-processing.

        Returns
        -------
        torch.Tensor
            Compressed tensor of shape `(batch_size, D)`.

        Raises
        ------
        AssertionError
            If the output tensor shape does not match the expected `(batch_size, D)`.
    """
    b, m = x.shape
    x = self.pre_process(x=x, device=device)

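    # detach() and cpu() keep the NumPy round trip valid for tensors that require grad or live on an accelerator.
    compression = self.dr_function(torch.from_numpy(x.detach().cpu().numpy())).to(device)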

    assert compression.shape == (b, self.calculate_D(m=m))
    return self.post_process(x=compression, device=device)
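
The forward pass routes the input through NumPy before calling the reducer and then checks the `(batch_size, D)` shape. The sketch below reproduces that flow outside the class. `toy_reducer` and `compress` are hypothetical stand-ins (the reducer simply keeps the first `d` columns), and pre- and post-processing are omitted.

```python
import torch


def toy_reducer(x: torch.Tensor, d: int) -> torch.Tensor:
    # Hypothetical stand-in for a dr_function: keep the first d features.
    return x[:, :d]


def compress(x: torch.Tensor, d: int, device: str = 'cpu') -> torch.Tensor:
    b, m = x.shape
    # A bare x.numpy() raises for tensors that require grad or live on CUDA,
    # so drop autograd tracking and move to host memory first.
    x_cpu = torch.from_numpy(x.detach().cpu().numpy())
    out = toy_reducer(x_cpu, d).to(device)
    # Same shape contract as forward: batch size kept, features reduced to d.
    assert out.shape == (b, d)
    return out


x = torch.randn(4, 6, requires_grad=True)
print(compress(x, d=3).shape)   # torch.Size([4, 3])
```

Because the data passes through NumPy, the result carries no gradient history, so a reducer used this way acts as a fixed, non-trainable step.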