dual_lphm_reconciliation

Bases: fabrication

The dual low-rank parameterized hypercomplex multiplication (Dual-LPHM) based parameter reconciliation function.

It performs the Dual-LPHM parameter reconciliation, and returns the Dual-LPHM reconciled parameter matrix of shape (n, D). This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).

The dual low-rank parameterized hypercomplex multiplication based parameter reconciliation can be viewed as a more aggressive version of the LPHM based parameter reconciliation function. It replaces both \(\mathbf{A}\) and \(\mathbf{B}\) in the hypercomplex multiplication based parameter reconciliation with the products of two low-rank sub-matrices, respectively.

...

Notes

Formally, given the parameter vector \(\mathbf{w} \in {R}^{l}\) and a rank hyper-parameter \(r\), together with the parameter sub-matrix dimension parameters \(p\) and \(q\), the Dual-LPHM reconciliation function partitions \(\mathbf{w}\) into four sub-vectors and subsequently reshapes them into four matrices \(\mathbf{P} \in {R}^{p \times r}\), \(\mathbf{Q} \in {R}^{q \times r}\), \(\mathbf{S} \in {R}^{\frac{n}{p} \times r}\) and \(\mathbf{T} \in {R}^{\frac{D}{q} \times r}\). These sub-matrices define the Dual-LPHM reconciliation function as follows:
$$ \begin{equation} \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}. \end{equation} $$
Here \(\mathbf{A} = \mathbf{P} \mathbf{Q}^\top \in {R}^{p \times q}\) and \(\mathbf{B} = \mathbf{S} \mathbf{T}^\top \in {R}^{\frac{n}{p} \times \frac{D}{q}}\), so their Kronecker product has the target shape \(n \times D\). The parameter vector length \(l\) is the total number of entries in the four sub-matrices:
$$ \begin{equation} l = r( p + q + \frac{n}{p} + \frac{D}{q} ). \end{equation} $$
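
As a small worked instance of the formulas above (the values are chosen here purely for illustration and do not come from the library), take \(n = 4\), \(D = 6\), \(p = 2\), \(q = 3\) and \(r = 1\). Then \(\mathbf{P} \in {R}^{2 \times 1}\), \(\mathbf{Q} \in {R}^{3 \times 1}\), \(\mathbf{S} \in {R}^{2 \times 1}\) and \(\mathbf{T} \in {R}^{2 \times 1}\), so that
$$ \begin{equation} \mathbf{A} = \mathbf{P} \mathbf{Q}^\top \in {R}^{2 \times 3}, \quad \mathbf{B} = \mathbf{S} \mathbf{T}^\top \in {R}^{2 \times 2}, \quad \mathbf{A} \otimes \mathbf{B} \in {R}^{4 \times 6}, \end{equation} $$
and the parameter vector length is
$$ \begin{equation} l = 1 \cdot ( 2 + 3 + 2 + 2 ) = 9, \end{equation} $$
compared with the \(4 \times 6 = 24\) entries of the full parameter matrix.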

The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters \(p\) and \(q\), which must be divisors of the target dimensions \(n\) and \(D\), respectively, i.e.,
$$ \begin{equation} n \% p = 0 \text{, and } D \% q = 0. \end{equation} $$

Attributes:

| Name | Type | Description |
|------|------|-------------|
| name | str, default = 'dual_lphm_reconciliation' | Name of the Dual-LPHM parameter reconciliation function. |
| p | int, default = 2 | Parameter sub-matrix row dimension. |
| q | int, default = None | Parameter sub-matrix column dimension. If q is not provided, it defaults to p. |
| r | int, default = 2 | Submatrix rank parameter. |

Methods:

| Name | Description |
|------|-------------|
| `__init__` | It initializes the Dual-LPHM parameter reconciliation function. |
| `calculate_l` | It calculates the length of required parameters for the reconciliation function. |
| `forward` | It implements the abstract forward method declared in the base reconciliation class. |

Source code in tinybig/reconciliation/lowrank_reconciliation.py
class dual_lphm_reconciliation(fabrication):
    r"""
    The dual low-rank parameterized hypercomplex multiplication (Dual-LPHM) based parameter reconciliation function.

    It performs the Dual-LPHM parameter reconciliation, and returns the Dual-LPHM reconciled parameter matrix of shape (n, D).
    This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).

    The dual low-rank parameterized hypercomplex multiplication based parameter reconciliation can be viewed as a more
    aggressive version of the LPHM based parameter reconciliation function.
    It replaces both $\mathbf{A}$ and $\mathbf{B}$ in the hypercomplex multiplication based parameter reconciliation
    with the products of two low-rank sub-matrices, respectively.

    ...

    Notes
    ----------
    Formally, given the parameter vector $\mathbf{w} \in {R}^{l}$ and a rank hyper-parameter $r$, together with the
    parameter sub-matrix dimension parameters $p$ and $q$, the Dual-LPHM reconciliation function partitions $\mathbf{w}$
    into four sub-vectors and subsequently reshapes them into four matrices $\mathbf{P} \in {R}^{p \times r}$,
    $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$.
    These sub-matrices $\mathbf{P}$, $\mathbf{Q}$, $\mathbf{S}$ and $\mathbf{T}$ help define the Dual-LPHM reconciliation function as follows:
    $$
        \begin{equation}
            \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
        \end{equation}
    $$
    This necessitates imposing certain limitations on these dimension and rank parameters, and the parameter vector
    length $l$ can be calculated as follows:
    $$
        \begin{equation}
            l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
        \end{equation}
    $$

    The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
    must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
    $$
        \begin{equation}
            n \\% p = 0 \text{, and } D \\% q = 0.
        \end{equation}
    $$

    Attributes
    ----------
    name: str, default = 'dual_lphm_reconciliation'
        Name of the Dual-LPHM parameter reconciliation function
    p: int, default = 2
        Parameter sub-matrix row dimension.
    q: int, default = None
        Parameter sub-matrix column dimension.
        If q is not provided, it defaults to p.
    r: int, default = 2
        Submatrix rank parameter.

    Methods
    ----------
    __init__
        It initializes the Dual-LPHM parameter reconciliation function.

    calculate_l
        It calculates the length of required parameters for the reconciliation function.

    forward
        It implements the abstract forward method declared in the base reconciliation class.
    """
    def __init__(self, name='dual_lphm_reconciliation', p=2, q=None, r=2, *args, **kwargs):
        """
        The initialization method of the Dual-LPHM parameter reconciliation function.

        It initializes a Dual-LPHM parameter reconciliation function object.
        This method also calls the initialization method of the base class.

        Parameters
        ----------
        name: str, default = 'dual_lphm_reconciliation'
            Name of the Dual-LPHM parameter reconciliation function.
        p: int, default = 2
            Parameter sub-matrix row dimension.
        q: int, default = None
            Parameter sub-matrix column dimension.
            If q is not provided, it defaults to p.
        r: int, default = 2
            Submatrix rank parameter.

        Returns
        ----------
        object
            The Dual-LPHM parameter reconciliation function object.
        """
        super().__init__(name=name, *args, **kwargs)
        self.p = p
        self.q = q if q is not None else p
        self.r = r

    def calculate_l(self, n: int, D: int):
        r"""
        The required parameter number calculation method.

        It calculates the number of required learnable parameters, i.e., $l$, of the parameter reconciliation function
        based on the output and intermediate expansion space dimensions, $n$ and $D$, and the dimension and rank parameters
        $p$, $q$ and $r$, which can be represented as follows:
        $$
            \begin{equation}
                l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
            \end{equation}
        $$

        Notes
        ----------
        The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
        must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
        $$
            \begin{equation}
                n \\% p = 0 \text{, and } D \\% q = 0.
            \end{equation}
        $$

        Parameters
        ----------
        n: int
            The dimension of the output space.
        D: int
            The dimension of the intermediate expansion space.

        Returns
        -------
        int
            The number of required learnable parameters.
        """
        if n % self.p != 0 or D % self.q != 0:
            raise ValueError('The input dimensions {} and {} cannot be divided by parameter p {} and q {}'.format(n, D, self.p, self.q))
        s, t = n // self.p, D // self.q
        assert (self.p * self.q * s * t == n * D)
        return self.p * self.r + self.q * self.r + s * self.r + t * self.r

    def forward(self, n: int, D: int, w: torch.nn.Parameter, device='cpu', *args, **kwargs):
        r"""
        The forward method of the parameter reconciliation function.

        It applies the Dual-LPHM parameter reconciliation operation to the input parameter vector $\mathbf{w}$,
        and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters
        $p$, $q$ and $r$ as follows:
        $$
            \begin{equation}
                \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
            \end{equation}
        $$
        where $\mathbf{P} \in {R}^{p \times r}$, $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and
        $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$ are all obtained by partitioning $\mathbf{w}$ into sub-vectors
        and subsequently reshaping them into matrices.

        Parameters
        ----------
        n: int
            The dimension of the output space.
        D: int
            The dimension of the intermediate expansion space.
        w: torch.nn.Parameter
            The learnable parameters of the model.
        device: str, default = 'cpu'
            Device to perform the parameter reconciliation.

        Returns
        ----------
        torch.Tensor
            The reconciled parameter matrix of shape (n, D).
        """
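        assert w.dim() == 2 and w.size(1) == self.calculate_l(n=n, D=D)
        s, t = n // self.p, D // self.q
        P, Q, S, T = torch.split(w, [self.p*self.r, self.q*self.r, s*self.r, t*self.r], dim=1)
        # F.linear(X, Y) computes X @ Y^T, so A = P Q^T has shape (p, q) and B = S T^T has shape (s, t).
        A = F.linear(P.view(self.p, -1), Q.view(self.q, -1))
        B = F.linear(S.view(s, -1), T.view(t, -1))
        # A and B stay two-dimensional so that the einsum below yields the Kronecker product
        # of shape (p*s, q*t) = (n, D) given in the formula above.
        return torch.einsum('pq,st->psqt', A, B).reshape(self.p*s, self.q*t)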

__init__(name='dual_lphm_reconciliation', p=2, q=None, r=2, *args, **kwargs)

The initialization method of the Dual-LPHM parameter reconciliation function.

It initializes a Dual-LPHM parameter reconciliation function object. This method also calls the initialization method of the base class.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | Name of the Dual-LPHM parameter reconciliation function. | 'dual_lphm_reconciliation' |
| p | int | Parameter sub-matrix row dimension. | 2 |
| q | int | Parameter sub-matrix column dimension. If q is not provided, it defaults to p. | None |
| r | int | Submatrix rank parameter. | 2 |

Returns:

| Type | Description |
|------|-------------|
| object | The Dual-LPHM parameter reconciliation function object. |

Source code in tinybig/reconciliation/lowrank_reconciliation.py
def __init__(self, name='dual_lphm_reconciliation', p=2, q=None, r=2, *args, **kwargs):
    """
    The initialization method of the Dual-LPHM parameter reconciliation function.

    It initializes a Dual-LPHM parameter reconciliation function object.
    This method also calls the initialization method of the base class.

    Parameters
    ----------
    name: str, default = 'dual_lphm_reconciliation'
        Name of the Dual-LPHM parameter reconciliation function.
    p: int, default = 2
        Parameter sub-matrix row dimension.
    q: int, default = None
        Parameter sub-matrix column dimension.
        If q is not provided, it defaults to p.
    r: int, default = 2
        Submatrix rank parameter.

    Returns
    ----------
    object
        The Dual-LPHM parameter reconciliation function object.
    """
    super().__init__(name=name, *args, **kwargs)
    self.p = p
    self.q = q if q is not None else p
    self.r = r

calculate_l(n, D)

The required parameter number calculation method.

It calculates the number of required learnable parameters, i.e., \(l\), of the parameter reconciliation function based on the output and intermediate expansion space dimensions, \(n\) and \(D\), and the dimension and rank parameters \(p\), \(q\) and \(r\), as follows:
$$ \begin{equation} l = r( p + q + \frac{n}{p} + \frac{D}{q} ). \end{equation} $$

Notes

The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters \(p\) and \(q\), which must be divisors of the target dimensions \(n\) and \(D\), respectively, i.e.,
$$ \begin{equation} n \% p = 0 \text{, and } D \% q = 0. \end{equation} $$

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| n | int | The dimension of the output space. | required |
| D | int | The dimension of the intermediate expansion space. | required |

Returns:

| Type | Description |
|------|-------------|
| int | The number of required learnable parameters. |
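
The parameter count and the divisibility check can be illustrated without the library. The following is a minimal standalone sketch, not the tinybig implementation: the function name `dual_lphm_param_count` is hypothetical, and it only mirrors the formula \(l = r( p + q + \frac{n}{p} + \frac{D}{q} )\) and the default \(q = p\) described on this page.

```python
from typing import Optional


def dual_lphm_param_count(n: int, D: int, p: int = 2, q: Optional[int] = None, r: int = 2) -> int:
    """Hypothetical helper: number of parameters l = r * (p + q + n/p + D/q)."""
    # Mirror the documented default: q falls back to p when it is not given.
    q = p if q is None else q
    # p must divide n and q must divide D, otherwise the sub-matrices S and T
    # would not have integer shapes.
    if n % p != 0 or D % q != 0:
        raise ValueError(f"n={n} and D={D} must be divisible by p={p} and q={q}")
    return r * (p + q + n // p + D // q)


if __name__ == "__main__":
    n, D = 64, 128
    l = dual_lphm_param_count(n, D)  # p = q = r = 2
    print(l)      # 200
    print(n * D)  # 8192 entries in the full (n, D) matrix
```

With the default settings, the reconciled matrix of shape (64, 128) is generated from 200 parameters rather than 8192, which is the saving the low-rank factorization is meant to provide.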

Source code in tinybig/reconciliation/lowrank_reconciliation.py
def calculate_l(self, n: int, D: int):
    r"""
    The required parameter number calculation method.

    It calculates the number of required learnable parameters, i.e., $l$, of the parameter reconciliation function
    based on the output and intermediate expansion space dimensions, $n$ and $D$, and the dimension and rank parameters
    $p$, $q$ and $r$, which can be represented as follows:
    $$
        \begin{equation}
            l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
        \end{equation}
    $$

    Notes
    ----------
    The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
    must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
    $$
        \begin{equation}
            n \\% p = 0 \text{, and } D \\% q = 0.
        \end{equation}
    $$

    Parameters
    ----------
    n: int
        The dimension of the output space.
    D: int
        The dimension of the intermediate expansion space.

    Returns
    -------
    int
        The number of required learnable parameters.
    """
    if n % self.p != 0 or D % self.q != 0:
        raise ValueError('The input dimensions {} and {} cannot be divided by parameter p {} and q {}'.format(n, D, self.p, self.q))
    s, t = n // self.p, D // self.q
    assert (self.p * self.q * s * t == n * D)
    return self.p * self.r + self.q * self.r + s * self.r + t * self.r

forward(n, D, w, device='cpu', *args, **kwargs)

The forward method of the parameter reconciliation function.

It applies the Dual-LPHM parameter reconciliation operation to the input parameter vector \(\mathbf{w}\), and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters \(p\), \(q\) and \(r\) as follows:
$$ \begin{equation} \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}, \end{equation} $$
where \(\mathbf{P} \in {R}^{p \times r}\), \(\mathbf{Q} \in {R}^{q \times r}\), \(\mathbf{S} \in {R}^{\frac{n}{p} \times r}\) and \(\mathbf{T} \in {R}^{\frac{D}{q} \times r}\) are all obtained by partitioning \(\mathbf{w}\) into sub-vectors and subsequently reshaping them into matrices.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| n | int | The dimension of the output space. | required |
| D | int | The dimension of the intermediate expansion space. | required |
| w | Parameter | The learnable parameters of the model. | required |
| device | str | Device to perform the parameter reconciliation. | 'cpu' |

Returns:

| Type | Description |
|------|-------------|
| Tensor | The reconciled parameter matrix of shape (n, D). |
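
The partition, reshape and Kronecker steps described above can be traced by hand on a small input. The sketch below is a dependency-free illustration using nested lists rather than tensors. It is not the library code: the names `dual_lphm_sketch`, `reshape`, `matmul_transposed` and `kron` are hypothetical, and it assumes a flat parameter list and row-major reshaping.

```python
from typing import List

Matrix = List[List[float]]


def reshape(flat: List[float], rows: int, cols: int) -> Matrix:
    """Row-major reshape of a flat list into a rows x cols matrix."""
    assert len(flat) == rows * cols
    return [flat[i * cols:(i + 1) * cols] for i in range(rows)]


def matmul_transposed(X: Matrix, Y: Matrix) -> Matrix:
    """Return X @ Y^T for X of shape (a, r) and Y of shape (b, r)."""
    return [[sum(x * y for x, y in zip(rx, ry)) for ry in Y] for rx in X]


def kron(A: Matrix, B: Matrix) -> Matrix:
    """Kronecker product: entry (a*s + c, b*t + d) equals A[a][b] * B[c][d]."""
    s, t = len(B), len(B[0])
    return [
        [A[i // s][j // t] * B[i % s][j % t] for j in range(len(A[0]) * t)]
        for i in range(len(A) * s)
    ]


def dual_lphm_sketch(w: List[float], n: int, D: int, p: int, q: int, r: int) -> Matrix:
    """Hypothetical sketch of psi(w) = (P Q^T) kron (S T^T) with shape (n, D)."""
    if n % p != 0 or D % q != 0:
        raise ValueError("p must divide n and q must divide D")
    s, t = n // p, D // q
    sizes = [p * r, q * r, s * r, t * r]
    assert len(w) == sum(sizes)
    # Partition w into four consecutive sub-vectors.
    chunks, start = [], 0
    for size in sizes:
        chunks.append(w[start:start + size])
        start += size
    P, Q = reshape(chunks[0], p, r), reshape(chunks[1], q, r)
    S, T = reshape(chunks[2], s, r), reshape(chunks[3], t, r)
    A = matmul_transposed(P, Q)  # shape (p, q)
    B = matmul_transposed(S, T)  # shape (n/p, D/q)
    return kron(A, B)


if __name__ == "__main__":
    W = dual_lphm_sketch([1, 2, 3, 4, 5, 6, 7, 8], n=4, D=4, p=2, q=2, r=1)
    print(len(W), len(W[0]))  # 4 4
    print(W[0])               # [105, 120, 140, 160]
```

In this example \(\mathbf{A} = [[3, 4], [6, 8]]\) and \(\mathbf{B} = [[35, 40], [42, 48]]\), so the first row of the result is \(3 \cdot [35, 40]\) followed by \(4 \cdot [35, 40]\).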

Source code in tinybig/reconciliation/lowrank_reconciliation.py
def forward(self, n: int, D: int, w: torch.nn.Parameter, device='cpu', *args, **kwargs):
    r"""
    The forward method of the parameter reconciliation function.

    It applies the Dual-LPHM parameter reconciliation operation to the input parameter vector $\mathbf{w}$,
    and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters
    $p$, $q$ and $r$ as follows:
    $$
        \begin{equation}
            \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
        \end{equation}
    $$
    where $\mathbf{P} \in {R}^{p \times r}$, $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and
    $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$ are all obtained by partitioning $\mathbf{w}$ into sub-vectors
    and subsequently reshaping them into matrices.

    Parameters
    ----------
    n: int
        The dimension of the output space.
    D: int
        The dimension of the intermediate expansion space.
    w: torch.nn.Parameter
        The learnable parameters of the model.
    device: str, default = 'cpu'
        Device to perform the parameter reconciliation.

    Returns
    ----------
    torch.Tensor
        The reconciled parameter matrix of shape (n, D).
    """
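    assert w.dim() == 2 and w.size(1) == self.calculate_l(n=n, D=D)
    s, t = n // self.p, D // self.q
    P, Q, S, T = torch.split(w, [self.p*self.r, self.q*self.r, s*self.r, t*self.r], dim=1)
    # F.linear(X, Y) computes X @ Y^T, so A = P Q^T has shape (p, q) and B = S T^T has shape (s, t).
    A = F.linear(P.view(self.p, -1), Q.view(self.q, -1))
    B = F.linear(S.view(s, -1), T.view(t, -1))
    # A and B stay two-dimensional so that the einsum below yields the Kronecker product
    # of shape (p*s, q*t) = (n, D) given in the formula above.
    return torch.einsum('pq,st->psqt', A, B).reshape(self.p*s, self.q*t)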