

Bases: Module, function

The base model class of the RPN model in the tinyBIG toolkit.

It inherits from the torch.nn.Module class and therefore also inherits the "state_dict" and "load_state_dict" methods from that base class.



RPN Model Architecture

Formally, given the underlying data distribution mapping \(f: {R}^m \to {R}^n\), the RPN model proposes to approximate function \(f\) as follows: $$ \begin{equation} g(\mathbf{x} | \mathbf{w}) = \left \langle \kappa_{\xi} (\mathbf{x}), \psi(\mathbf{w}) \right \rangle + \pi(\mathbf{x}), \end{equation} $$

The RPN model disentangles input data from model parameters through the data expansion function \(\kappa\) and the parameter reconciliation function \(\psi\); their inner product is then summed with the remainder function \(\pi\), where

  • \(\kappa_{\xi}: {R}^m \to {R}^{D}\) is called the data interdependent transformation function. It is the composition of the data transformation function \(\kappa\) and the data interdependence function \(\xi\). \(D\) denotes the dimension of the target expansion space.

  • \(\psi: {R}^l \to {R}^{n \times D}\) is called the parameter reconciliation function (or, more generally, the parameter fabrication function). It is defined on the parameters only and does not involve any input data.

  • \(\pi: {R}^m \to {R}^n\) is called the remainder function.

  • \(\xi_a: {R}^{b \times m} \to {R}^{m \times m'}\) and \(\xi_i: {R}^{b \times m} \to {R}^{b \times b'}\), both defined on the input data batch \(\mathbf{X} \in R^{b \times m}\), are called the attribute and instance data interdependence functions, respectively.
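As a rough illustration of the formula above, the following PyTorch sketch evaluates \(g(\mathbf{x} | \mathbf{w})\) on a small batch. The component choices (a second-order Taylor-style expansion for \(\kappa\), a plain reshape for \(\psi\), a zero remainder for \(\pi\), and an identity interdependence function \(\xi\)) and all function names are assumptions made for illustration. They are not the tinyBIG implementations.

```python
import torch


def taylor_expansion(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical kappa: concatenate x with all pairwise products x_i * x_j."""
    b, m = x.shape
    second_order = torch.einsum('bi,bj->bij', x, x).reshape(b, m * m)
    return torch.cat([x, second_order], dim=1)  # shape (b, D), D = m + m * m


def identity_reconciliation(w: torch.Tensor, n: int, D: int) -> torch.Tensor:
    """Hypothetical psi: reshape a flat parameter vector of length n * D to (n, D)."""
    return w.reshape(n, D)


def zero_remainder(x: torch.Tensor, n: int) -> torch.Tensor:
    """Hypothetical pi: contribute nothing to the output."""
    return torch.zeros(x.shape[0], n)


def rpn_layer(x: torch.Tensor, w: torch.Tensor, n: int) -> torch.Tensor:
    """Evaluate <kappa(x), psi(w)> + pi(x) for a batch x of shape (b, m)."""
    kx = taylor_expansion(x)                    # (b, D)
    W = identity_reconciliation(w, n, kx.shape[1])  # (n, D)
    return kx @ W.T + zero_remainder(x, n)      # (b, n)


torch.manual_seed(0)
b, m, n = 4, 3, 2
D = m + m * m
x = torch.randn(b, m)
w = torch.randn(n * D)  # here l = n * D because psi is a plain reshape
out = rpn_layer(x, w, n)
```

With these choices the number of learnable parameters is \(l = n \times D\). A different reconciliation function could produce the same \(n \times D\) matrix from fewer parameters.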

Deep RPN Model with Multiple Layers

The multi-head multi-channel RPN layer provides RPN with greater capabilities for approximating functions with diverse expansions concurrently. However, such shallow architectures can be insufficient for modeling complex functions. The RPN model can also be designed with a deep architecture by stacking multiple RPN layers on top of each other.

Formally, we can represent the deep RPN model with multiple layers as follows:

$$
\begin{equation}
\begin{aligned}
\text{Input: } & \mathbf{H}_0 = \mathbf{X},\\
\text{Layer 1: } & \mathbf{H}_1 = \left\langle \kappa_{\xi, 1}(\mathbf{H}_0), \psi_1(\mathbf{w}_1) \right\rangle + \pi_1(\mathbf{H}_0),\\
\text{Layer 2: } & \mathbf{H}_2 = \left\langle \kappa_{\xi, 2}(\mathbf{H}_1), \psi_2(\mathbf{w}_2) \right\rangle + \pi_2(\mathbf{H}_1),\\
\cdots & \cdots \ \cdots\\
\text{Layer K: } & \mathbf{H}_K = \left\langle \kappa_{\xi, K}(\mathbf{H}_{K-1}), \psi_K(\mathbf{w}_K) \right\rangle + \pi_K(\mathbf{H}_{K-1}),\\
\text{Output: } & \mathbf{Z} = \mathbf{H}_K.
\end{aligned}
\end{equation}
$$

In the equations above, the subscripts denote the layer index. The output dimensions of the layers can be represented as a list \([d_0, d_1, \cdots, d_{K-1}, d_K]\), where \(d_0 = m\) and \(d_K = n\) denote the input and the desired output dimensions, respectively. Therefore, once the component functions at each layer have been fixed, the dimension list \([d_0, d_1, \cdots, d_{K-1}, d_K]\) alone is enough to specify the architecture of the RPN model.
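To make the dimension-list convention concrete, here is a minimal sketch that turns a list \([d_0, \cdots, d_K]\) into the (input, output) dimensions of each of the \(K\) layers. The helper name `layer_shapes` and the example dimensions are hypothetical and are not part of tinyBIG.

```python
def layer_shapes(dims):
    """Return the (input_dim, output_dim) pair of each layer for a dimension list."""
    if len(dims) < 2:
        raise ValueError("a dimension list needs at least an input and an output size")
    # Layer k maps H_{k-1} (d_{k-1} features) to H_k (d_k features).
    return list(zip(dims[:-1], dims[1:]))


dims = [784, 64, 64, 10]  # d_0 = m = 784, d_K = n = 10, K = 3 layers
shapes = layer_shapes(dims)
for k, (d_in, d_out) in enumerate(shapes, start=1):
    print(f"Layer {k}: {d_in} -> {d_out}")
```

A list with \(K + 1\) entries therefore describes a model with \(K\) stacked layers.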


Attributes:

| Name | Type | Description |
|------|------|-------------|
| name | str, default = 'model_name' | Name of the model. |


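Methods:

| Name | Description |
|------|-------------|
| `__init__` | It performs the initialization of the model. |
| `save_ckpt` | It saves the model state as a checkpoint to a file. |
| `load_ckpt` | It loads the model state from a file. |
| `__call__` | It reimplements the built-in callable method. |
| `forward` | The forward method of the model. |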

Source code in tinybig/module/
class model(Module, function):
    r"""
    The base model class of the RPN model in the tinyBIG toolkit.

    It inherits from the torch.nn.Module class and therefore also inherits the
    "state_dict" and "load_state_dict" methods from that base class.

    ## RPN Model Architecture

    Formally, given the underlying data distribution mapping $f: {R}^m \to {R}^n$,
    the RPN model proposes to approximate function $f$ as follows:
    $$
        \begin{equation}
            g(\mathbf{x} | \mathbf{w}) = \left \langle \kappa_{\xi} (\mathbf{x}), \psi(\mathbf{w}) \right \rangle + \pi(\mathbf{x}),
        \end{equation}
    $$

    The RPN model disentangles input data from model parameters through the data expansion function $\kappa$ and
    the parameter reconciliation function $\psi$; their inner product is then summed with the remainder function $\pi$, where

    * $\kappa_{\xi}: {R}^m \to {R}^{D}$ is called the **data interdependent transformation function**. It is the composition of the **data transformation function** $\kappa$ and the **data interdependence function** $\xi$. $D$ denotes the dimension of the target expansion space.

    * $\psi: {R}^l \to {R}^{n \times D}$ is called the **parameter reconciliation function** (or, more generally, the **parameter fabrication function**). It is defined on the parameters only and does not involve any input data.

    * $\pi: {R}^m \to {R}^n$ is called the **remainder function**.

    * $\xi_a: {R}^{b \times m} \to {R}^{m \times m'}$ and $\xi_i: {R}^{b \times m} \to {R}^{b \times b'}$, both defined on the input data batch $\mathbf{X} \in R^{b \times m}$, are called the **attribute** and **instance data interdependence functions**, respectively.

    ## Deep RPN Model with Multiple Layers

    The multi-head multi-channel RPN layer provides RPN with greater capabilities
    for approximating functions with diverse expansions concurrently.
    However, such shallow architectures can be insufficient for modeling complex functions.
    The RPN model can also be designed with a deep architecture by stacking multiple RPN layers on top of each other.

    Formally, we can represent the deep RPN model with multiple layers as follows:

    $$
        \begin{equation}
            \begin{aligned}
                \text{Input: } & \mathbf{H}_0 = \mathbf{X},\\
                \text{Layer 1: } & \mathbf{H}_1 = \left\langle \kappa_{\xi, 1}(\mathbf{H}_0), \psi_1(\mathbf{w}_1) \right\rangle + \pi_1(\mathbf{H}_0),\\
                \text{Layer 2: } & \mathbf{H}_2 = \left\langle \kappa_{\xi, 2}(\mathbf{H}_1), \psi_2(\mathbf{w}_2) \right\rangle + \pi_2(\mathbf{H}_1),\\
                \cdots & \cdots \ \cdots\\
                \text{Layer K: } & \mathbf{H}_K = \left\langle \kappa_{\xi, K}(\mathbf{H}_{K-1}), \psi_K(\mathbf{w}_K) \right\rangle + \pi_K(\mathbf{H}_{K-1}),\\
                \text{Output: } & \mathbf{Z} = \mathbf{H}_K.
            \end{aligned}
        \end{equation}
    $$

    In the equations above, the subscripts denote the layer index. The output dimensions of the layers
    can be represented as a list $[d_0, d_1, \cdots, d_{K-1}, d_K]$, where $d_0 = m$ and $d_K = n$
    denote the input and the desired output dimensions, respectively.
    Therefore, once the component functions at each layer have been fixed, the dimension
    list $[d_0, d_1, \cdots, d_{K-1}, d_K]$ alone is enough to specify the architecture of the RPN model.

    Attributes
    ----------
    name: str, default = 'model_name'
        Name of the model.

    Methods
    ----------
    __init__
        It performs the initialization of the model.

    save_ckpt
        It saves the model state as a checkpoint to a file.

    load_ckpt
        It loads the model state from a file.

    __call__
        It reimplements the built-in callable method.

    forward
        The forward method of the model.
    """
    def __init__(self, name: str = 'model_name', device: str = 'cpu', *args, **kwargs):
        """
        The initialization method of the base model class.

        It initializes a model object based on the provided model parameters.

        Parameters
        ----------
        name: str, default = 'model_name'
            The name of the model, with default value "model_name".

        Returns
        -------
        The initialized model object.
        """
        function.__init__(self, name=name, device=device)

    def save_ckpt(self, cache_dir='./ckpt', checkpoint_file='checkpoint'):
        """
        The model state checkpoint saving method.

        It saves the current model state to a checkpoint file.

        Parameters
        ----------
        cache_dir: str, default = './ckpt'
            The cache directory of the model checkpoint file.
        checkpoint_file: str, default = 'checkpoint'
            The checkpoint file name.

        Returns
        -------
        None
            This method doesn't have return values.
        """
        # Make sure the target directory exists, then write the state dict to the file.
        create_directory_if_not_exists(f'{cache_dir}/{checkpoint_file}')
        torch.save(self.state_dict(), f'{cache_dir}/{checkpoint_file}')
        print("model checkpoint saving to {}/{}...".format(cache_dir, checkpoint_file))

    def load_ckpt(self, cache_dir: str = './ckpt', checkpoint_file: str = 'checkpoint', strict: bool = True):
        """
        The model state checkpoint loading method.

        It loads the model state from the provided checkpoint file.

        Parameters
        ----------
        cache_dir: str, default = './ckpt'
            The cache directory of the model checkpoint file.
        checkpoint_file: str, default = 'checkpoint'
            The checkpoint file name.
        strict: bool, default = True
            Whether the keys in the checkpoint must exactly match the keys of the model state
            (passed through to load_state_dict).

        Returns
        -------
        None
            This method doesn't have return values.
        """
        self.load_state_dict(torch.load(f'{cache_dir}/{checkpoint_file}'), strict=strict)
        print("model checkpoint loading from {}/{}...".format(cache_dir, checkpoint_file))

    def to_config(self, *args, **kwargs):
        """
        Abstract method to convert the `model` instance into a configuration dictionary.

        This method is intended to be implemented by subclasses. It should generate a dictionary
        that encapsulates the essential configuration of the model, allowing for reconstruction
        or serialization of the instance. The specific structure and content of the configuration
        dictionary are determined by the implementing model.

        Parameters
        ----------
        *args : tuple
            Additional positional arguments that might be required by the implementation.
        **kwargs : dict
            Additional keyword arguments that might be required by the implementation.

        Returns
        -------
        dict
            A dictionary representing the configuration of the instance. The exact structure and keys
            depend on the subclass implementation.

        Raises
        ------
        NotImplementedError
            If the method is not implemented in a subclass and is called directly.

        See Also
        --------
        BaseClass : The base class where this method is defined.
        """

    def forward(self, *args, **kwargs):
        """
        The forward method of the model.

        It is declared as an abstract method and must be implemented by the inheriting RPN model classes.
        This callable method accepts the data instances as the input and generates the desired outputs.

        Returns
        -------
        The model generated outputs.
        """

__init__(name='model_name', device='cpu', *args, **kwargs)

The initialization method of the base model class.

It initializes a model object based on the provided model parameters.


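Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the model, with default value "model_name". | 'model_name' |

Returns:

| Type | Description |
|------|-------------|
|  | The initialized model object. |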

Source code in tinybig/module/
def __init__(self, name: str = 'model_name', device: str = 'cpu', *args, **kwargs):
    """
    The initialization method of the base model class.

    It initializes a model object based on the provided model parameters.

    Parameters
    ----------
    name: str, default = 'model_name'
        The name of the model, with default value "model_name".

    Returns
    -------
    The initialized model object.
    """
    function.__init__(self, name=name, device=device)

forward(*args, **kwargs) abstractmethod

The forward method of the model.

It is declared as an abstract method and must be implemented by the inheriting RPN model classes. This callable method accepts the data instances as the input and generates the desired outputs.


Returns:

| Type | Description |
|------|-------------|
|  | The model generated outputs. |

Source code in tinybig/module/
def forward(self, *args, **kwargs):
    """
    The forward method of the model.

    It is declared as an abstract method and must be implemented by the inheriting RPN model classes.
    This callable method accepts the data instances as the input and generates the desired outputs.

    Returns
    -------
    The model generated outputs.
    """
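The contract described above (an abstract `forward` that concrete models must override) can be sketched as follows. `BaseModel` and `LinearModel` are hypothetical stand-ins built directly on torch.nn.Module and Python's `abc` module. They only mimic the pattern and are not the tinyBIG classes.

```python
from abc import ABC, abstractmethod

import torch


class BaseModel(torch.nn.Module, ABC):
    """Hypothetical stand-in for an abstract base model."""

    def __init__(self, name: str = 'model_name'):
        super().__init__()
        self.name = name

    @abstractmethod
    def forward(self, *args, **kwargs):
        """Subclasses must map input data instances to outputs."""


class LinearModel(BaseModel):
    """Hypothetical concrete model: a single linear map from m to n features."""

    def __init__(self, m: int, n: int, name: str = 'linear_model'):
        super().__init__(name=name)
        self.linear = torch.nn.Linear(m, n)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


model = LinearModel(m=3, n=2)
z = model(torch.randn(5, 3))  # calling the model dispatches to forward

try:
    BaseModel()  # abstract: cannot be instantiated without a forward implementation
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False
```

Calling the model object rather than `forward` directly is the usual torch.nn.Module convention, since the call path also runs any registered hooks.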

load_ckpt(cache_dir='./ckpt', checkpoint_file='checkpoint', strict=True)

The model state checkpoint loading method.

It loads the model state from the provided checkpoint file.


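Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| cache_dir | str | The cache directory of the model checkpoint file. | './ckpt' |
| checkpoint_file | str | The checkpoint file name. | 'checkpoint' |
| strict | bool | Whether the keys in the checkpoint must exactly match the keys of the model state (passed through to load_state_dict). | True |

Returns:

| Type | Description |
|------|-------------|
| None | This method doesn't have return values. |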

Source code in tinybig/module/
def load_ckpt(self, cache_dir: str = './ckpt', checkpoint_file: str = 'checkpoint', strict: bool = True):
    """
    The model state checkpoint loading method.

    It loads the model state from the provided checkpoint file.

    Parameters
    ----------
    cache_dir: str, default = './ckpt'
        The cache directory of the model checkpoint file.
    checkpoint_file: str, default = 'checkpoint'
        The checkpoint file name.
    strict: bool, default = True
        Whether the keys in the checkpoint must exactly match the keys of the model state
        (passed through to load_state_dict).

    Returns
    -------
    None
        This method doesn't have return values.
    """
    self.load_state_dict(torch.load(f'{cache_dir}/{checkpoint_file}'), strict=strict)
    print("model checkpoint loading from {}/{}...".format(cache_dir, checkpoint_file))

save_ckpt(cache_dir='./ckpt', checkpoint_file='checkpoint')

The model state checkpoint saving method.

It saves the current model state to a checkpoint file.


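Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| cache_dir | str | The cache directory of the model checkpoint file. | './ckpt' |
| checkpoint_file | str | The checkpoint file name. | 'checkpoint' |

Returns:

| Type | Description |
|------|-------------|
| None | This method doesn't have return values. |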

Source code in tinybig/module/
def save_ckpt(self, cache_dir='./ckpt', checkpoint_file='checkpoint'):
    """
    The model state checkpoint saving method.

    It saves the current model state to a checkpoint file.

    Parameters
    ----------
    cache_dir: str, default = './ckpt'
        The cache directory of the model checkpoint file.
    checkpoint_file: str, default = 'checkpoint'
        The checkpoint file name.

    Returns
    -------
    None
        This method doesn't have return values.
    """
    # Make sure the target directory exists, then write the state dict to the file.
    create_directory_if_not_exists(f'{cache_dir}/{checkpoint_file}')
    torch.save(self.state_dict(), f'{cache_dir}/{checkpoint_file}')
    print("model checkpoint saving to {}/{}...".format(cache_dir, checkpoint_file))
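The save and load methods above amount to a state-dict round trip. The sketch below reproduces that pattern with plain PyTorch calls on a torch.nn.Linear module. The free functions `save_ckpt` and `load_ckpt` are hypothetical helpers that mirror the method signatures, and `os.makedirs` stands in for the toolkit's directory helper, whose exact behavior is not shown here.

```python
import os
import tempfile

import torch


def save_ckpt(model, cache_dir='./ckpt', checkpoint_file='checkpoint'):
    """Write the model's state dict to cache_dir/checkpoint_file."""
    os.makedirs(cache_dir, exist_ok=True)  # assumption: create the directory if missing
    path = os.path.join(cache_dir, checkpoint_file)
    torch.save(model.state_dict(), path)
    return path


def load_ckpt(model, cache_dir='./ckpt', checkpoint_file='checkpoint', strict=True):
    """Read a state dict from cache_dir/checkpoint_file into the model."""
    path = os.path.join(cache_dir, checkpoint_file)
    model.load_state_dict(torch.load(path), strict=strict)


source = torch.nn.Linear(3, 2)
target = torch.nn.Linear(3, 2)  # same architecture, different random weights

with tempfile.TemporaryDirectory() as tmp:
    save_ckpt(source, cache_dir=tmp)
    load_ckpt(target, cache_dir=tmp)

# After loading, every tensor in the two state dicts should match.
same = all(
    torch.equal(a, b)
    for a, b in zip(source.state_dict().values(), target.state_dict().values())
)
```

With `strict=True`, load_state_dict raises an error when the checkpoint keys and the model keys differ. Passing `strict=False` loads the matching keys and skips the rest.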

to_config(*args, **kwargs) abstractmethod

Abstract method to convert the model instance into a configuration dictionary.

This method is intended to be implemented by subclasses. It should generate a dictionary that encapsulates the essential configuration of the model, allowing for reconstruction or serialization of the instance. The specific structure and content of the configuration dictionary are determined by the implementing model.


Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `*args` | tuple | Additional positional arguments that might be required by the implementation. | () |
| `**kwargs` | dict | Additional keyword arguments that might be required by the implementation. | {} |

Returns:

| Type | Description |
|------|-------------|
| dict | A dictionary representing the configuration of the instance. The exact structure and keys depend on the subclass implementation. |

Raises:

| Type | Description |
|------|-------------|
| NotImplementedError | If the method is not implemented in a subclass and is called directly. |

See Also

BaseClass : The base class where this method is defined.

Source code in tinybig/module/
def to_config(self, *args, **kwargs):
    """
    Abstract method to convert the `model` instance into a configuration dictionary.

    This method is intended to be implemented by subclasses. It should generate a dictionary
    that encapsulates the essential configuration of the model, allowing for reconstruction
    or serialization of the instance. The specific structure and content of the configuration
    dictionary are determined by the implementing model.

    Parameters
    ----------
    *args : tuple
        Additional positional arguments that might be required by the implementation.
    **kwargs : dict
        Additional keyword arguments that might be required by the implementation.

    Returns
    -------
    dict
        A dictionary representing the configuration of the instance. The exact structure and keys
        depend on the subclass implementation.

    Raises
    ------
    NotImplementedError
        If the method is not implemented in a subclass and is called directly.

    See Also
    --------
    BaseClass : The base class where this method is defined.
    """
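One way a subclass might satisfy the `to_config` contract is sketched below. The class `ConfigurableModel` and the configuration keys `name` and `dims` are assumptions chosen for illustration. The documentation above leaves the dictionary structure to each implementing model, so this is not the tinyBIG configuration format.

```python
import torch


class ConfigurableModel(torch.nn.Module):
    """Hypothetical model whose architecture is fully described by a dimension list."""

    def __init__(self, dims, name: str = 'model_name'):
        super().__init__()
        self.name = name
        self.dims = list(dims)
        # One linear layer per consecutive pair (d_{k-1}, d_k) in the dimension list.
        self.layers = torch.nn.ModuleList(
            [torch.nn.Linear(d_in, d_out) for d_in, d_out in zip(self.dims[:-1], self.dims[1:])]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x

    def to_config(self, *args, **kwargs) -> dict:
        """Return the constructor arguments needed to rebuild this model."""
        return {'name': self.name, 'dims': list(self.dims)}


model = ConfigurableModel(dims=[4, 8, 2])
config = model.to_config()
rebuilt = ConfigurableModel(**config)  # same architecture, freshly initialized weights
y = rebuilt(torch.zeros(1, 4))
```

The configuration captures only the architecture. Restoring trained weights would still require loading a checkpoint into the rebuilt model.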