model
Bases: Module, function
The base model class of the RPN model in the tinyBIG toolkit.
It inherits from the torch.nn.Module class and therefore also inherits the "state_dict" and "load_state_dict" methods of that base class.
...
Notes
RPN Model Architecture
Formally, given the underlying data distribution mapping \(f: {R}^m \to {R}^n\), the RPN model proposes to approximate function \(f\) as follows: $$ \begin{equation} g(\mathbf{x} | \mathbf{w}) = \left \langle \kappa_{\xi} (\mathbf{x}), \psi(\mathbf{w}) \right \rangle + \pi(\mathbf{x}), \end{equation} $$
The RPN model disentangles the input data from the model parameters through the data transformation function \(\kappa\) and the parameter reconciliation function \(\psi\); their inner product is then summed with the remainder function \(\pi\), where
- \(\kappa_{\xi}: {R}^m \to {R}^{D}\) is called the data interdependent transformation function. It is a composite of the data transformation function \(\kappa\) and the data interdependence function \(\xi\). The notation \(D\) denotes the dimension of the target expansion space.
- \(\psi: {R}^l \to {R}^{n \times D}\) is called the parameter reconciliation function (or, more generally, the parameter fabrication function). It is defined only on the parameters and takes no input data.
- \(\pi: {R}^m \to {R}^n\) is called the remainder function.
- \(\xi_a: {R}^{b \times m} \to {R}^{m \times m'}\) and \(\xi_i: {R}^{b \times m} \to {R}^{b \times b'}\), defined on the input data batch \(\mathbf{X} \in R^{b \times m}\), are called the attribute and instance data interdependence functions, respectively.
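To make the formula concrete, the following is a minimal pure-Python sketch of a single RPN head, not the tinyBIG implementation. It assumes a second-order Taylor-style expansion for \(\kappa\), an identity (reshape-only) reconciliation for \(\psi\), a zero remainder for \(\pi\), and it treats the interdependence function \(\xi\) as the identity. The function names `kappa`, `psi`, `pi` and `rpn_head` are illustrative only.

```python
from typing import List


def kappa(x: List[float]) -> List[float]:
    """Assumed expansion: [x, x (outer) x], so D = m + m * m."""
    return list(x) + [a * b for a in x for b in x]


def psi(w: List[float], n: int, D: int) -> List[List[float]]:
    """Assumed identity reconciliation: reshape l = n * D parameters to n x D."""
    assert len(w) == n * D, "identity reconciliation needs l = n * D parameters"
    return [w[i * D:(i + 1) * D] for i in range(n)]


def pi(x: List[float], n: int) -> List[float]:
    """Assumed zero remainder: contributes nothing to the output."""
    return [0.0] * n


def rpn_head(x: List[float], w: List[float], n: int) -> List[float]:
    """g(x | w) = <kappa(x), psi(w)> + pi(x), with xi taken as the identity."""
    expanded = kappa(x)                      # vector in R^D
    W = psi(w, n, len(expanded))             # matrix in R^{n x D}
    inner = [sum(wi * ki for wi, ki in zip(row, expanded)) for row in W]
    return [a + b for a, b in zip(inner, pi(x, n))]


x = [1.0, 2.0]                               # m = 2, so D = 2 + 4 = 6
w = [0.5, -1.0, 0.0, 0.0, 0.0, 0.25]         # l = n * D = 6 parameters
out = rpn_head(x, w, n=1)                    # 0.5*1 - 1.0*2 + 0.25*4 = -0.5
```

The data only passes through `kappa` and `pi`, while the parameters only pass through `psi`, which is the disentanglement described above.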
Deep RPN Model with Multiple Layers
The multi-head multi-channel RPN layer provides RPN with greater capabilities for approximating functions with diverse expansions concurrently. However, such shallow architectures can be insufficient for modeling complex functions. The RPN model can also be designed with a deep architecture by stacking multiple RPN layers on top of each other.
Formally, we can represent the deep RPN model with multi-layers as follows:
\[ \begin{equation} \begin{cases} \text{Input: } & \mathbf{H}_0 = \mathbf{X},\\\\ \text{Layer 1: } & \mathbf{H}_1 = \left\langle \kappa_{\xi, 1}(\mathbf{H}_0), \psi_1(\mathbf{w}_1) \right\rangle + \pi_1(\mathbf{H}_0),\\\\ \text{Layer 2: } & \mathbf{H}_2 = \left\langle \kappa_{\xi, 2}(\mathbf{H}_1), \psi_2(\mathbf{w}_2) \right\rangle + \pi_2(\mathbf{H}_1),\\\\ \cdots & \cdots \ \cdots\\\\ \text{Layer K: } & \mathbf{H}_K = \left\langle \kappa_{\xi, K}(\mathbf{H}_{K-1}), \psi_K(\mathbf{w}_K) \right\rangle + \pi_K(\mathbf{H}_{K-1}),\\\\ \text{Output: } & \mathbf{Z} = \mathbf{H}_K. \end{cases} \end{equation} \]
In the above equation, the subscripts denote the layer index. The output dimensions of the layers can be represented as a list \([d_0, d_1, \cdots, d_{K-1}, d_K]\), where \(d_0 = m\) and \(d_K = n\) denote the input and the desired output dimensions, respectively. Therefore, if the component functions at each layer of the model have been predetermined, the dimension list \([d_0, d_1, \cdots, d_{K-1}, d_K]\) alone is enough to represent the architecture of the RPN model.
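The relation between the dimension list and the layer stack can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not the toolkit's code: each layer is a stand-in with identity expansion, identity reconciliation and zero remainder (that is, a plain linear map), and the helper names `layer_shapes`, `make_linear_layer` and `deep_forward` are hypothetical.

```python
from typing import Callable, List, Tuple

Layer = Callable[[List[float]], List[float]]


def layer_shapes(dims: List[int]) -> List[Tuple[int, int]]:
    """Turn [d_0, ..., d_K] into the (input, output) sizes of the K layers."""
    return list(zip(dims[:-1], dims[1:]))


def make_linear_layer(weights: List[List[float]]) -> Layer:
    """Stand-in layer: H_k = <H_{k-1}, W_k>, with no expansion and no remainder."""
    def layer(h: List[float]) -> List[float]:
        return [sum(w * v for w, v in zip(row, h)) for row in weights]
    return layer


def deep_forward(x: List[float], layers: List[Layer]) -> List[float]:
    """H_0 = X; H_k = layer_k(H_{k-1}); Z = H_K."""
    h = x
    for layer in layers:
        h = layer(h)
    return h


dims = [3, 2, 1]                             # d_0 = m = 3, d_K = n = 1
shapes = layer_shapes(dims)                  # one (d_in, d_out) pair per layer
layers = [
    make_linear_layer([[1.0] * d_in for _ in range(d_out)])  # all-ones weights
    for d_in, d_out in shapes
]
z = deep_forward([1.0, 2.0, 3.0], layers)    # [6, 6] after layer 1, then [12]
```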
Attributes:
Name | Type | Description |
---|---|---|
name | str, default = 'model_name' | Name of the model. |
Methods:
Name | Description |
---|---|
__init__ | It performs the initialization of the model. |
save_ckpt | It saves the model state as a checkpoint to a file. |
load_ckpt | It loads the model state from a file. |
__call__ | It reimplements the built-in callable method. |
forward | The forward method of the model. |
Source code in tinybig/module/base_model.py
__init__(name='model_name', device='cpu', *args, **kwargs)
The initialization method of the base model class.
It initializes a model object based on the provided model parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The name of the model, with default value "model_name". | 'model_name' |
Returns:
Type | Description |
---|---|
object | The initialized model object. |
forward(*args, **kwargs)
abstractmethod
The forward method of the model.
It is declared as an abstract method and must be implemented by the RPN model classes that inherit from this class. The method accepts data instances as input and generates the desired outputs.
Returns:
Type | Description |
---|---|
Tensor | The model generated outputs. |
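The abstract-method pattern described here can be sketched with the standard library alone. The classes below are hypothetical stand-ins and do not reproduce the tinyBIG code: the real class derives from torch.nn.Module, whereas this sketch uses `abc.ABC` and assumes that `__call__` simply delegates to `forward`.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical stand-in for the base model class (no torch dependency)."""

    def __init__(self, name: str = 'model_name'):
        self.name = name

    def __call__(self, *args, **kwargs):
        # Assumption: calling the model is the same as calling forward().
        return self.forward(*args, **kwargs)

    @abstractmethod
    def forward(self, *args, **kwargs):
        """Subclasses must provide the actual computation."""


class DoubleModel(BaseModel):
    """Toy subclass: the 'model' just doubles every input value."""

    def forward(self, x):
        return [2 * v for v in x]


model = DoubleModel(name='double')
outputs = model([1, 2, 3])   # dispatched to DoubleModel.forward
```

With `abc.ABC`, instantiating `BaseModel` directly raises a `TypeError`, which makes a missing `forward` implementation visible early.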
load_ckpt(cache_dir='./ckpt', checkpoint_file='checkpoint', strict=True)
The model state checkpoint loading method.
It loads the model state from the provided checkpoint file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cache_dir | str | The cache directory of the model checkpoint file. | './ckpt' |
checkpoint_file | str | The checkpoint file name. | 'checkpoint' |
strict | bool | Boolean flag indicating whether the model state is loaded with strict configuration checking. | True |
Returns:
Type | Description |
---|---|
None | This method doesn't have return values. |
save_ckpt(cache_dir='./ckpt', checkpoint_file='checkpoint')
The model state checkpoint saving method.
It saves the current model state to a checkpoint file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cache_dir | str | The cache directory of the model checkpoint file. | './ckpt' |
checkpoint_file | str | The checkpoint file name. | 'checkpoint' |
Returns:
Type | Description |
---|---|
None | This method doesn't have return values. |
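The save/load round trip and the effect of the `strict` flag can be illustrated with a small stand-in that needs no PyTorch. Everything below is an assumption made for illustration: `ToyModel` is hypothetical, the state is stored as JSON rather than in a torch checkpoint format, and the strict check mirrors the usual torch.nn.Module behaviour of rejecting missing or unexpected keys. Only the method names and default arguments follow the signatures documented above.

```python
import json
import os
import tempfile


class ToyModel:
    """Hypothetical stand-in; the real class relies on torch.nn.Module state."""

    def __init__(self, weights=None):
        self.weights = dict(weights or {})

    def state_dict(self):
        return dict(self.weights)

    def load_state_dict(self, state, strict=True):
        missing = set(self.weights) - set(state)
        unexpected = set(state) - set(self.weights)
        if strict and (missing or unexpected):
            raise RuntimeError(
                f"missing keys: {sorted(missing)}, unexpected keys: {sorted(unexpected)}"
            )
        for key in self.weights:          # copy only the keys this model owns
            if key in state:
                self.weights[key] = state[key]

    def save_ckpt(self, cache_dir='./ckpt', checkpoint_file='checkpoint'):
        os.makedirs(cache_dir, exist_ok=True)
        with open(os.path.join(cache_dir, checkpoint_file), 'w') as f:
            json.dump(self.state_dict(), f)

    def load_ckpt(self, cache_dir='./ckpt', checkpoint_file='checkpoint', strict=True):
        with open(os.path.join(cache_dir, checkpoint_file)) as f:
            self.load_state_dict(json.load(f), strict=strict)


with tempfile.TemporaryDirectory() as tmp:
    trained = ToyModel({'w': [1.0, 2.0], 'b': [0.5]})
    trained.save_ckpt(cache_dir=tmp)

    restored = ToyModel({'w': [0.0, 0.0], 'b': [0.0]})
    restored.load_ckpt(cache_dir=tmp)                 # same keys: strict load works

    partial = ToyModel({'w': [0.0, 0.0]})
    partial.load_ckpt(cache_dir=tmp, strict=False)    # extra key 'b' is ignored

    try:
        ToyModel({'w': [0.0, 0.0]}).load_ckpt(cache_dir=tmp)  # strict=True
        strict_failed = False
    except RuntimeError:
        strict_failed = True
```

In this sketch, `strict=False` is what allows a checkpoint to be loaded into a model whose parameters only partly match the saved state.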
to_config(*args, **kwargs)
abstractmethod
Abstract method to convert the model instance into a configuration dictionary.
This method is intended to be implemented by subclasses. It should generate a dictionary that encapsulates the essential configuration of the model, allowing for reconstruction or serialization of the instance. The specific structure and content of the configuration dictionary are determined by the implementing model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | tuple | Additional positional arguments that might be required by the implementation. | () |
**kwargs | dict | Additional keyword arguments that might be required by the implementation. | {} |
Returns:
Type | Description |
---|---|
dict | A dictionary representing the configuration of the instance. The exact structure and keys depend on the subclass implementation. |
Raises:
Type | Description |
---|---|
NotImplementedError | If the method is not implemented in a subclass and is called directly. |
See Also
BaseClass : The base class where this method is defined.
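As a rough illustration of the contract described for to_config, the sketch below shows a subclass that returns a dictionary from which an equivalent instance can be rebuilt. The class names and the keys 'model_class' and 'model_parameters' are hypothetical choices for this example; the actual structure is left to each implementing model.

```python
class BaseModel:
    """Hypothetical stand-in for the base class (no torch dependency)."""

    def __init__(self, name='model_name'):
        self.name = name

    def to_config(self, *args, **kwargs):
        raise NotImplementedError("to_config must be implemented by subclasses")


class ToyRPN(BaseModel):
    """Toy subclass described entirely by its name and dimension list."""

    def __init__(self, name='toy_rpn', dims=(4, 8, 2)):
        super().__init__(name=name)
        self.dims = list(dims)

    def to_config(self, *args, **kwargs):
        # Assumed layout: class name plus the constructor arguments.
        return {
            'model_class': type(self).__name__,
            'model_parameters': {'name': self.name, 'dims': list(self.dims)},
        }


original = ToyRPN(name='demo', dims=[4, 8, 2])
config = original.to_config()
rebuilt = ToyRPN(**config['model_parameters'])   # reconstruct from the dictionary
```

Keeping the constructor arguments under a single key makes reconstruction a one-line call, which is one way to satisfy the "reconstruction or serialization" goal stated above.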