hypernet_reconciliation
Bases: fabrication
The hypernet-based parameter reconciliation function.
It performs hypernet-based parameter reconciliation and returns the reconciled parameter matrix of shape (n, D). This class inherits from the reconciliation base class (i.e., the fabrication class in the module directory).
...
Notes
Formally, given the input parameter vector \(\mathbf{w} \in {R}^l\) of length \(l\), the hypernet-based parameter reconciliation function projects it to a high-dimensional parameter matrix of shape (n, D) via a hypernet model, e.g., an MLP, as follows: $$ \begin{equation} \psi(\mathbf{w}) = \text{HyperNet}(\mathbf{w}) = \mathbf{W} \in {R}^{n \times D}, \end{equation} $$ where \(\text{HyperNet}(\cdot)\) denotes a randomly initialized MLP model with frozen parameters.
For the hypernet-based parameter reconciliation function, the parameter length \(l\) must be assigned manually in the initialization method; it cannot be calculated from the dimension parameters \(n\) and \(D\).
In the current project, a frozen MLP with one hidden layer serves as the hypernet for parameter reconciliation. The current implementation also supports a dynamic MLP with learnable parameters: setting the "static" parameter to True (the default) keeps the hypernet frozen, and setting it to False makes the hypernet learnable.
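The mapping described in the notes above can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not the tinybig implementation: the helper names `build_frozen_hypernet` and `reconcile` are hypothetical, the ReLU activation is an assumption (the notes do not name one), and \(\mathbf{w}\) is treated as a flat vector of length \(l\).

```python
import torch
import torch.nn as nn


def build_frozen_hypernet(l: int, n: int, D: int, hidden_dim: int = 128) -> nn.Module:
    """Randomly initialized MLP [l] -> [hidden_dim] -> [n * D] with frozen weights."""
    net = nn.Sequential(
        nn.Linear(l, hidden_dim),
        nn.ReLU(),  # assumed activation; the documentation does not specify one
        nn.Linear(hidden_dim, n * D),
    )
    # Freeze the hypernet so that only w remains learnable.
    for p in net.parameters():
        p.requires_grad_(False)
    return net


def reconcile(w: torch.Tensor, net: nn.Module, n: int, D: int) -> torch.Tensor:
    """Map a length-l parameter vector w to a matrix of shape (n, D)."""
    return net(w.reshape(1, -1)).reshape(n, D)


l, n, D = 64, 3, 10
torch.manual_seed(0)
hypernet = build_frozen_hypernet(l, n, D)
w = nn.Parameter(torch.randn(l))   # the only learnable parameters
W = reconcile(w, hypernet, n, D)   # reconciled matrix of shape (n, D)
W.sum().backward()                 # gradients reach w, not the frozen hypernet
```

Because the hypernet is frozen, the number of learnable parameters stays at \(l\) regardless of how large \(n \times D\) becomes.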
Attributes:
Name | Type | Description |
---|---|---|
name | str, default = 'hypernet_reconciliation' | Name of the hypernet parameter reconciliation function. |
r | int, default = 2 | Submatrix rank parameter. |
Methods:
Name | Description |
---|---|
__init__ | It initializes the hypernet parameter reconciliation function. |
calculate_l | It calculates the length of required parameters for the reconciliation function. |
forward | It implements the abstract forward method declared in the base reconciliation class. |
Source code in tinybig/reconciliation/hypernet_reconciliation.py
__init__(name='hypernet_reconciliation', l=64, hidden_dim=128, static=True, net=None, *args, **kwargs)
The initialization method of the hypernet parameter reconciliation function.
It initializes a hypernet parameter reconciliation function object. This method also calls the initialization method of the base class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the hypernet-based parameter reconciliation function. | 'hypernet_reconciliation' |
l | int | The learnable parameter length, which needs to be assigned manually. | 64 |
hidden_dim | int | The hidden layer dimension of the hypernet MLP. | 128 |
static | bool | The static hypernet indicator. If static=True, the hypernet MLP is frozen; if static=False, the hypernet MLP is dynamic and contains learnable parameters as well. | True |
net | | The hypernet MLP model. | None |
Returns:
Type | Description |
---|---|
fabrication | The hypernet parameter reconciliation function object. |
calculate_l(n=None, D=None)
The required parameter number calculation method.
It calculates the number of required learnable parameters, i.e., \(l\), of the parameter reconciliation function.
Notes
For the hypernet-based parameter reconciliation function, the parameter length \(l\) must be assigned manually in the initialization method; it cannot be calculated from the dimension parameters \(n\) and \(D\).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | The dimension of the output space. | None |
D | int | The dimension of the intermediate expansion space. | None |
Returns:
Type | Description |
---|---|
int | The number of required learnable parameters. |
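The behaviour described for calculate_l can be illustrated with a small stand-in. This is a hedged sketch only: `HypernetLengthSketch` and `direct_l` are hypothetical names introduced for illustration, and `direct_l` merely shows, for contrast, the count needed if every entry of the (n, D) matrix were learned directly.

```python
from typing import Optional


def direct_l(n: int, D: int) -> int:
    """Parameter count when all n * D matrix entries are learned directly (for contrast)."""
    return n * D


class HypernetLengthSketch:
    """Hypothetical stand-in mirroring only the calculate_l behaviour described above."""

    def __init__(self, l: int = 64):
        # l is assigned manually at construction time.
        self.l = l

    def calculate_l(self, n: Optional[int] = None, D: Optional[int] = None) -> int:
        # n and D are accepted for interface compatibility but do not affect l.
        return self.l


sketch = HypernetLengthSketch(l=64)
fixed_length = sketch.calculate_l(n=10, D=784)
direct_length = direct_l(10, 784)
```

In this sketch the returned length is the same whether or not \(n\) and \(D\) are supplied, which is the point of assigning \(l\) manually.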
forward(n, D, w, device='cpu', *args, **kwargs)
The forward method of the parameter reconciliation function.
It applies the hypernet-based parameter reconciliation operation to the input parameter vector \(\mathbf{w}\) and returns the reconciled parameter matrix of shape (n, D) as follows: $$ \begin{equation} \psi(\mathbf{w}) = \text{HyperNet}(\mathbf{w}) = \mathbf{W} \in {R}^{n \times D}, \end{equation} $$ where \(\text{HyperNet}(\cdot)\) denotes a randomly initialized MLP model with frozen parameters (when static=True).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | The dimension of the output space. | required |
D | int | The dimension of the intermediate expansion space. | required |
w | Parameter | The learnable parameters of the model. | required |
device | | Device to perform the parameter reconciliation. | 'cpu' |
Returns:
Type | Description |
---|---|
Tensor | The reconciled parameter matrix of shape (n, D). |
initialize_hypernet(l, n, D, hidden_dim, static=True, device='cpu')
The hypernet MLP initialization method.
It initializes the hypernet MLP model based on the provided parameters, whose architecture dimensions can be denoted as follows: $$ \begin{equation} [l] \to [hidden\_dim] \to [n \times D], \end{equation} $$ which can project any input of length \(l\) to the desired output of length \(n \times D\).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
l | int | The input dimension of the hypernet MLP, which equals the parameter length \(l\). | required |
n | int | The output space dimension, which together with the expansion dimension \(D\) defines the output dimension of the hypernet MLP as \(n \times D\). | required |
D | int | The expansion space dimension, which together with the output space dimension \(n\) defines the output dimension of the hypernet MLP as \(n \times D\). | required |
hidden_dim | int | The hidden layer dimension of the hypernet MLP. | required |
static | bool | The static hypernet indicator. If static=True, the hypernet MLP is frozen; if static=False, the hypernet MLP is dynamic and contains learnable parameters as well. | True |
device | str | The device to host the hypernet and perform the parameter reconciliation. | 'cpu' |
Returns:
Type | Description |
---|---|
None | This method initializes the self.net attribute and does not return a value. |
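The effect of the static flag on an MLP with the \([l] \to [hidden\_dim] \to [n \times D]\) layout can be sketched as follows. This is an illustrative approximation, not the library's code: `initialize_hypernet_sketch` and `count_trainable` are hypothetical helpers, the ReLU activation is assumed, and the sketch returns the network for convenience whereas the documented method stores it in self.net.

```python
import torch.nn as nn


def initialize_hypernet_sketch(l, n, D, hidden_dim, static=True, device="cpu"):
    """Build an MLP [l] -> [hidden_dim] -> [n * D]; freeze it when static is True."""
    net = nn.Sequential(
        nn.Linear(l, hidden_dim),
        nn.ReLU(),  # assumed activation; not stated in the documentation
        nn.Linear(hidden_dim, n * D),
    ).to(device)
    # static=True freezes every weight and bias; static=False leaves them learnable.
    for p in net.parameters():
        p.requires_grad_(not static)
    return net


def count_trainable(net):
    """Number of parameters that would receive gradient updates."""
    return sum(p.numel() for p in net.parameters() if p.requires_grad)


frozen = initialize_hypernet_sketch(l=64, n=3, D=10, hidden_dim=128, static=True)
dynamic = initialize_hypernet_sketch(l=64, n=3, D=10, hidden_dim=128, static=False)
frozen_count = count_trainable(frozen)
dynamic_count = count_trainable(dynamic)
```

With these sizes the dynamic variant exposes 64·128 + 128 + 128·30 + 30 = 12190 trainable hypernet parameters, while the frozen variant exposes none, so in the static case only the length-\(l\) vector \(\mathbf{w}\) is learned.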