model

Bases: Module, function

The base model class of the RPN model in the tinyBIG toolkit.

It inherits from the torch.nn.Module class and therefore also provides the "state_dict" and "load_state_dict" methods of that base class.

...

Notes

RPN Model Architecture

Formally, given the underlying data distribution mapping \(f: {R}^m \to {R}^n\), the RPN model proposes to approximate function \(f\) as follows: $$ \begin{equation} g(\mathbf{x} | \mathbf{w}) = \left \langle \kappa_{\xi} (\mathbf{x}), \psi(\mathbf{w}) \right \rangle + \pi(\mathbf{x}), \end{equation} $$

The RPN model disentangles input data from model parameters through the expansion function \(\kappa\) and the reconciliation function \(\psi\), whose inner product is subsequently summed with the remainder function \(\pi\), where

  • \(\kappa_{\xi}: {R}^m \to {R}^{D}\) is named as the data interdependent transformation function. It is a composite function of the data transformation function \(\kappa\) and the data interdependence function \(\xi\). Notation \(D\) is the target expansion space dimension.

  • \(\psi: {R}^l \to {R}^{n \times D}\) is named as the parameter reconciliation function (or parameter fabrication function to be general), which is defined only on the parameters without any input data.

  • \(\pi: {R}^m \to {R}^n\) is named as the remainder function.

  • \(\xi_a: {R}^{b \times m} \to {R}^{m \times m'}\) and \(\xi_i: {R}^{b \times m} \to {R}^{b \times b'}\) defined on the input data batch \(\mathbf{X} \in R^{b \times m}\) are named as the attribute and instance data interdependence functions, respectively.
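
The single-layer formula above can be sketched in plain Python for one input instance. This is only an illustration under stated assumptions, not the tinyBIG implementation: the expansion `kappa` (the input concatenated with its squares), the reshape-only reconciliation `psi`, and the zero remainder `pi` are hypothetical choices, and the interdependence function \(\xi\) is omitted entirely.

```python
from typing import List


def kappa(x: List[float]) -> List[float]:
    """Hypothetical expansion: concatenate x with its element-wise squares (D = 2m)."""
    return x + [v * v for v in x]


def psi(w: List[float], n: int, D: int) -> List[List[float]]:
    """Hypothetical reconciliation: reshape l = n * D parameters into an n x D matrix."""
    if len(w) != n * D:
        raise ValueError(f"expected {n * D} parameters, got {len(w)}")
    return [w[r * D:(r + 1) * D] for r in range(n)]


def pi(x: List[float], n: int) -> List[float]:
    """Hypothetical zero remainder: contributes nothing to the output."""
    return [0.0] * n


def rpn_layer(x: List[float], w: List[float], n: int) -> List[float]:
    """Compute g(x | w) = <kappa(x), psi(w)> + pi(x) for a single instance x."""
    expanded = kappa(x)                   # length D
    W = psi(w, n, len(expanded))          # n rows of length D
    rem = pi(x, n)                        # length n
    return [sum(a * b for a, b in zip(row, expanded)) + r
            for row, r in zip(W, rem)]


x = [1.0, 2.0]                            # m = 2, so D = 4
w = [1.0, 0.0, 0.0, 0.0,                  # output 0 selects x[0]
     0.0, 0.0, 0.0, 1.0]                  # output 1 selects x[1] ** 2
print(rpn_layer(x, w, n=2))               # [1.0, 4.0]
```

Note that `psi` never sees `x` and `kappa` never sees `w`, which is the separation of data and parameters that the formula describes.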

Deep RPN Model with Multiple Layers

The multi-head multi-channel RPN layer provides RPN with greater capabilities for approximating functions with diverse expansions concurrently. However, such shallow architectures can be insufficient for modeling complex functions. The RPN model can also be designed with a deep architecture by stacking multiple RPN layers on top of each other.

Formally, we can represent the deep RPN model with multiple layers as follows:

\[
    \begin{equation}
        \begin{cases}
            \text{Input: } & \mathbf{H}_0  = \mathbf{X},\\\\
            \text{Layer 1: } & \mathbf{H}_1 = \left\langle \kappa_{\xi, 1}(\mathbf{H}_0), \psi_1(\mathbf{w}_1) \right\rangle + \pi_1(\mathbf{H}_0),\\\\
            \text{Layer 2: } & \mathbf{H}_2 = \left\langle \kappa_{\xi, 2}(\mathbf{H}_1), \psi_2(\mathbf{w}_2) \right\rangle + \pi_2(\mathbf{H}_1),\\\\
            \cdots & \cdots \ \cdots\\\\
            \text{Layer K: } & \mathbf{H}_K = \left\langle \kappa_{\xi, K}(\mathbf{H}_{K-1}), \psi_K(\mathbf{w}_K) \right\rangle + \pi_K(\mathbf{H}_{K-1}),\\\\
            \text{Output: } & \mathbf{Z}  = \mathbf{H}_K.
        \end{cases}
    \end{equation}
\]

In the above equation, the subscripts denote the layer index. The dimensions of the outputs at each layer can be represented as a list \([d_0, d_1, \cdots, d_{K-1}, d_K]\), where \(d_0 = m\) and \(d_K = n\) denote the input and the desired output dimensions, respectively. Therefore, if the component functions at each layer of the model have been predetermined, the dimension list \([d_0, d_1, \cdots, d_{K-1}, d_K]\) alone suffices to represent the architecture of the RPN model.
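
As a small illustration of how a dimension list determines the layer shapes, the sketch below pairs consecutive entries into per-layer input and output sizes. The helper name `layer_shapes` and the example dimensions are hypothetical and are not part of the tinyBIG API.

```python
from typing import List, Tuple


def layer_shapes(dims: List[int]) -> List[Tuple[int, int]]:
    """Pair consecutive entries of [d_0, ..., d_K] into (input, output) sizes per layer."""
    if len(dims) < 2:
        raise ValueError("need at least an input and an output dimension")
    return list(zip(dims[:-1], dims[1:]))


dims = [784, 64, 64, 10]                  # d_0 = m = 784, d_K = n = 10, K = 3 layers
for k, (d_in, d_out) in enumerate(layer_shapes(dims), start=1):
    print(f"Layer {k}: H_{k - 1} has {d_in} columns, H_{k} has {d_out} columns")
```

A list of length K + 1 therefore describes K layers, and the output size of each layer is the input size of the next.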

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| name | str, default = 'model_name' | Name of the model. |

Methods:

| Name | Description |
| --- | --- |
| `__init__` | It performs the initialization of the model. |
| `save_ckpt` | It saves the model state as a checkpoint to a file. |
| `load_ckpt` | It loads the model state from a file. |
| `__call__` | It reimplements the built-in callable method. |
| `forward` | The forward method of the model. |

Source code in tinybig/module/base_model.py
class model(Module, function):
    r"""
    The base model class of the RPN model in the tinyBIG toolkit.

    It inherits from the torch.nn.Module class and therefore also provides the
    "state_dict" and "load_state_dict" methods of that base class.

    ...

    Notes
    ---------

    ## RPN Model Architecture

    Formally, given the underlying data distribution mapping $f: {R}^m \to {R}^n$,
    the RPN model proposes to approximate function $f$ as follows:
    $$
        \begin{equation}
            g(\mathbf{x} | \mathbf{w}) = \left \langle \kappa_{\xi} (\mathbf{x}), \psi(\mathbf{w}) \right \rangle + \pi(\mathbf{x}),
        \end{equation}
    $$

    The RPN model disentangles input data from model parameters through the expansion function $\kappa$ and
    the reconciliation function $\psi$, whose inner product is subsequently summed with the remainder function $\pi$, where

    * $\kappa_{\xi}: {R}^m \to {R}^{D}$ is named as the **data interdependent transformation function**. It is a composite function of the **data transformation function** $\kappa$ and the **data interdependence function** $\xi$. Notation $D$ is the target expansion space dimension.

    * $\psi: {R}^l \to {R}^{n \times D}$ is named as the **parameter reconciliation function** (or **parameter fabrication function** to be general), which is defined only on the parameters without any input data.

    * $\pi: {R}^m \to {R}^n$ is named as the **remainder function**.

    * $\xi_a: {R}^{b \times m} \to {R}^{m \times m'}$ and $\xi_i: {R}^{b \times m} \to {R}^{b \times b'}$ defined on the input data batch $\mathbf{X} \in R^{b \times m}$ are named as the **attribute** and **instance data interdependence functions**, respectively.

    ## Deep RPN Model with Multiple Layers

    The multi-head multi-channel RPN layer provides RPN with greater capabilities
    for approximating functions with diverse expansions concurrently.
    However, such shallow architectures can be insufficient for modeling complex functions.
    The RPN model can also be designed with a deep architecture by stacking multiple RPN layers on top of each other.

    Formally, we can represent the deep RPN model with multiple layers as follows:

    $$
        \begin{equation}
            \begin{cases}
                \text{Input: } & \mathbf{H}_0  = \mathbf{X},\\\\
                \text{Layer 1: } & \mathbf{H}_1 = \left\langle \kappa_{\xi, 1}(\mathbf{H}_0), \psi_1(\mathbf{w}_1) \right\rangle + \pi_1(\mathbf{H}_0),\\\\
                \text{Layer 2: } & \mathbf{H}_2 = \left\langle \kappa_{\xi, 2}(\mathbf{H}_1), \psi_2(\mathbf{w}_2) \right\rangle + \pi_2(\mathbf{H}_1),\\\\
                \cdots & \cdots \ \cdots\\\\
                \text{Layer K: } & \mathbf{H}_K = \left\langle \kappa_{\xi, K}(\mathbf{H}_{K-1}), \psi_K(\mathbf{w}_K) \right\rangle + \pi_K(\mathbf{H}_{K-1}),\\\\
                \text{Output: } & \mathbf{Z}  = \mathbf{H}_K.
            \end{cases}
        \end{equation}
    $$

    In the above equation, the subscripts denote the layer index. The dimensions of the outputs at each layer
    can be represented as a list $[d_0, d_1, \cdots, d_{K-1}, d_K]$, where $d_0 = m$ and $d_K = n$
    denote the input and the desired output dimensions, respectively.
    Therefore, if the component functions at each layer of the model have been predetermined, the dimension
    list $[d_0, d_1, \cdots, d_{K-1}, d_K]$ alone suffices to represent the architecture of the RPN model.

    Attributes
    ----------
    name: str, default = 'model_name'
        Name of the model.

    Methods
    ----------
    __init__
        It performs the initialization of the model.

    save_ckpt
        It saves the model state as a checkpoint to a file.

    load_ckpt
        It loads the model state from a file.

    __call__
        It reimplements the built-in callable method.

    forward
        The forward method of the model.
    """
    def __init__(self, name: str = 'model_name', device: str = 'cpu', *args, **kwargs):
        """
        The initialization method of the base model class.

        It initializes a model object based on the provided model parameters.

        Parameters
        ----------
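        name: str, default = 'model_name'
            The name of the model, with default value "model_name".
        device: str, default = 'cpu'
            The device identifier passed on to the `function` base class initializer.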

        Returns
        ----------
        object
            The initialized model object.
        """
        Module.__init__(self)
        function.__init__(self, name=name, device=device)

    def save_ckpt(self, cache_dir='./ckpt', checkpoint_file='checkpoint'):
        """
        The model state checkpoint saving method.

        It saves the current model state to a checkpoint file.

        Parameters
        ----------
        cache_dir: str, default = './ckpt'
            The cache directory of the model checkpoint file.
        checkpoint_file: str, default = 'checkpoint'
            The checkpoint file name.

        Returns
        -------
        None
            This method doesn't have return values.
        """
        create_directory_if_not_exists(f'{cache_dir}/{checkpoint_file}')
        torch.save(self.state_dict(), f'{cache_dir}/{checkpoint_file}')
        print("model checkpoint saving to {}/{}...".format(cache_dir, checkpoint_file))

    def load_ckpt(self, cache_dir: str = './ckpt', checkpoint_file: str = 'checkpoint', strict: bool = True):
        """
        The model state checkpoint loading method.

        It loads the model state from the provided checkpoint file.

        Parameters
        ----------
        cache_dir: str, default = './ckpt'
            The cache directory of the model checkpoint file.
        checkpoint_file: str, default = 'checkpoint'
            The checkpoint file name.
        strict: bool, default = True
            The boolean tag of whether the model state loading follows the strict configuration checking.

        Returns
        -------
        None
            This method doesn't have return values.
        """
        self.load_state_dict(torch.load(f'{cache_dir}/{checkpoint_file}'), strict=strict)
        print("model checkpoint loading from {}/{}...".format(cache_dir, checkpoint_file))

    @abstractmethod
    def to_config(self, *args, **kwargs):
        """
        Abstract method to convert the `model` instance into a configuration dictionary.

        This method is intended to be implemented by subclasses. It should generate a dictionary
        that encapsulates the essential configuration of the model, allowing for reconstruction
        or serialization of the instance. The specific structure and content of the configuration
        dictionary are determined by the implementing model.

        Parameters
        ----------
        *args : tuple
            Additional positional arguments that might be required by the implementation.
        **kwargs : dict
            Additional keyword arguments that might be required by the implementation.

        Returns
        -------
        dict
            A dictionary representing the configuration of the instance. The exact structure and keys
            depend on the subclass implementation.

        Raises
        ------
        NotImplementedError
            If the method is not implemented in a subclass and is called directly.

        See Also
        --------
        BaseClass : The base class where this method is defined.
        """
        pass

    @abstractmethod
    def forward(self, *args, **kwargs):
        """
        The forward method of the model.

        It is declared as an abstract method and needs to be implemented in the inherited RPN model classes.
        This callable method accepts the data instances as the input and generates the desired outputs.

        Returns
        ----------
        torch.Tensor
            The model generated outputs.
        """
        pass

__init__(name='model_name', device='cpu', *args, **kwargs)

The initialization method of the base model class.

It initializes a model object based on the provided model parameters.

Parameters:

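| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | The name of the model, with default value "model_name". | 'model_name' |
| device | str | The device identifier passed on to the `function` base class initializer. | 'cpu' |

Returns:

| Type | Description |
| --- | --- |
| object | The initialized model object. |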

Source code in tinybig/module/base_model.py
def __init__(self, name: str = 'model_name', device: str = 'cpu', *args, **kwargs):
    """
    The initialization method of the base model class.

    It initializes a model object based on the provided model parameters.

    Parameters
    ----------
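    name: str, default = 'model_name'
        The name of the model, with default value "model_name".
    device: str, default = 'cpu'
        The device identifier passed on to the `function` base class initializer.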

    Returns
    ----------
    object
        The initialized model object.
    """
    Module.__init__(self)
    function.__init__(self, name=name, device=device)

forward(*args, **kwargs) abstractmethod

The forward method of the model.

It is declared as an abstract method and needs to be implemented in the inherited RPN model classes. This callable method accepts the data instances as the input and generates the desired outputs.

Returns:

| Type | Description |
| --- | --- |
| Tensor | The model generated outputs. |

Source code in tinybig/module/base_model.py
@abstractmethod
def forward(self, *args, **kwargs):
    """
    The forward method of the model.

    It is declared as an abstract method and needs to be implemented in the inherited RPN model classes.
    This callable method accepts the data instances as the input and generates the desired outputs.

    Returns
    ----------
    torch.Tensor
        The model generated outputs.
    """
    pass

load_ckpt(cache_dir='./ckpt', checkpoint_file='checkpoint', strict=True)

The model state checkpoint loading method.

It loads the model state from the provided checkpoint file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cache_dir | str | The cache directory of the model checkpoint file. | './ckpt' |
| checkpoint_file | str | The checkpoint file name. | 'checkpoint' |
| strict | bool | The boolean tag of whether the model state loading follows the strict configuration checking. | True |

Returns:

| Type | Description |
| --- | --- |
| None | This method doesn't have return values. |

Source code in tinybig/module/base_model.py
def load_ckpt(self, cache_dir: str = './ckpt', checkpoint_file: str = 'checkpoint', strict: bool = True):
    """
    The model state checkpoint loading method.

    It loads the model state from the provided checkpoint file.

    Parameters
    ----------
    cache_dir: str, default = './ckpt'
        The cache directory of the model checkpoint file.
    checkpoint_file: str, default = 'checkpoint'
        The checkpoint file name.
    strict: bool, default = True
        The boolean tag of whether the model state loading follows the strict configuration checking.

    Returns
    -------
    None
        This method doesn't have return values.
    """
    self.load_state_dict(torch.load(f'{cache_dir}/{checkpoint_file}'), strict=strict)
    print("model checkpoint loading from {}/{}...".format(cache_dir, checkpoint_file))

save_ckpt(cache_dir='./ckpt', checkpoint_file='checkpoint')

The model state checkpoint saving method.

It saves the current model state to a checkpoint file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cache_dir | str | The cache directory of the model checkpoint file. | './ckpt' |
| checkpoint_file | str | The checkpoint file name. | 'checkpoint' |

Returns:

| Type | Description |
| --- | --- |
| None | This method doesn't have return values. |

Source code in tinybig/module/base_model.py
def save_ckpt(self, cache_dir='./ckpt', checkpoint_file='checkpoint'):
    """
    The model state checkpoint saving method.

    It saves the current model state to a checkpoint file.

    Parameters
    ----------
    cache_dir: str, default = './ckpt'
        The cache directory of the model checkpoint file.
    checkpoint_file: str, default = 'checkpoint'
        The checkpoint file name.

    Returns
    -------
    None
        This method doesn't have return values.
    """
    create_directory_if_not_exists(f'{cache_dir}/{checkpoint_file}')
    torch.save(self.state_dict(), f'{cache_dir}/{checkpoint_file}')
    print("model checkpoint saving to {}/{}...".format(cache_dir, checkpoint_file))
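
The save and load round trip described by `save_ckpt` and `load_ckpt` can be sketched without PyTorch. Everything in this block is a stand-in: `TinyModel` is a hypothetical class, a plain dict plays the role of `state_dict()`, `pickle` replaces `torch.save` and `torch.load`, and the `strict` check only mimics the exact key matching that `load_state_dict(..., strict=True)` enforces. It also assumes that the parent directory of the checkpoint file has to exist before writing.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in for a `model` subclass; a dict plays the role of state_dict."""

    def __init__(self, weights):
        self.state = {"weight": list(weights)}

    def save_ckpt(self, cache_dir="./ckpt", checkpoint_file="checkpoint"):
        path = f"{cache_dir}/{checkpoint_file}"
        # Assumption: make sure the parent directory exists before writing.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(self.state, f)    # stand-in for torch.save(self.state_dict(), path)

    def load_ckpt(self, cache_dir="./ckpt", checkpoint_file="checkpoint", strict=True):
        with open(f"{cache_dir}/{checkpoint_file}", "rb") as f:
            loaded = pickle.load(f)       # stand-in for torch.load(path)
        # Mimic strict loading: saved keys must match the model's keys exactly.
        if strict and set(loaded) != set(self.state):
            raise RuntimeError("checkpoint keys do not match the model state")
        # With strict=False, only the keys known to the model are restored.
        self.state.update({k: v for k, v in loaded.items() if k in self.state})


with tempfile.TemporaryDirectory() as tmp:
    TinyModel([1.0, 2.0]).save_ckpt(cache_dir=tmp, checkpoint_file="demo.ckpt")
    restored = TinyModel([0.0, 0.0])
    restored.load_ckpt(cache_dir=tmp, checkpoint_file="demo.ckpt")
    print(restored.state)                 # {'weight': [1.0, 2.0]}
```

Both methods build the file location as `cache_dir/checkpoint_file`, so the same two arguments must be passed to each call for the round trip to find the file.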

to_config(*args, **kwargs) abstractmethod

Abstract method to convert the model instance into a configuration dictionary.

This method is intended to be implemented by subclasses. It should generate a dictionary that encapsulates the essential configuration of the model, allowing for reconstruction or serialization of the instance. The specific structure and content of the configuration dictionary are determined by the implementing model.

Parameters:

Name Type Description Default
*args tuple

Additional positional arguments that might be required by the implementation.

()
**kwargs dict

Additional keyword arguments that might be required by the implementation.

{}

Returns:

Type Description
dict

A dictionary representing the configuration of the instance. The exact structure and keys depend on the subclass implementation.

Raises:

Type Description
NotImplementedError

If the method is not implemented in a subclass and is called directly.

See Also

BaseClass : The base class where this method is defined.

Source code in tinybig/module/base_model.py
@abstractmethod
def to_config(self, *args, **kwargs):
    """
    Abstract method to convert the `model` instance into a configuration dictionary.

    This method is intended to be implemented by subclasses. It should generate a dictionary
    that encapsulates the essential configuration of the model, allowing for reconstruction
    or serialization of the instance. The specific structure and content of the configuration
    dictionary are determined by the implementing model.

    Parameters
    ----------
    *args : tuple
        Additional positional arguments that might be required by the implementation.
    **kwargs : dict
        Additional keyword arguments that might be required by the implementation.

    Returns
    -------
    dict
        A dictionary representing the configuration of the instance. The exact structure and keys
        depend on the subclass implementation.

    Raises
    ------
    NotImplementedError
        If the method is not implemented in a subclass and is called directly.

    See Also
    --------
    BaseClass : The base class where this method is defined.
    """
    pass
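
One way a subclass might implement `to_config` is sketched below. The classes are hypothetical stand-ins built on the standard `abc` module rather than on the tinyBIG `model` class. The dictionary keys and the `from_config` helper are likewise assumptions made for illustration, since the source leaves the configuration structure to each subclass.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical stand-in for the `model` base class (no torch dependency)."""

    def __init__(self, name: str = "model_name"):
        self.name = name

    @abstractmethod
    def to_config(self) -> dict:
        """Return a dictionary from which the instance can be rebuilt."""


class DimListModel(BaseModel):
    """Hypothetical subclass described entirely by its name and dimension list."""

    def __init__(self, dims, name: str = "dim_list_model"):
        super().__init__(name=name)
        self.dims = list(dims)

    def to_config(self) -> dict:
        # Illustrative keys only; the structure is chosen by the subclass.
        return {"class": type(self).__name__,
                "parameters": {"name": self.name, "dims": self.dims}}

    @classmethod
    def from_config(cls, config: dict) -> "DimListModel":
        # Rebuild an instance from the dictionary produced by to_config.
        return cls(**config["parameters"])


original = DimListModel(dims=[4, 8, 2])
clone = DimListModel.from_config(original.to_config())
print(clone.to_config() == original.to_config())   # True
```

Keeping only constructor arguments in the dictionary means the same configuration can be serialized and later used to reconstruct an equivalent instance.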