
lphm_reconciliation

Bases: fabrication

The low-rank parameterized hypercomplex multiplication (LPHM) based parameter reconciliation function.

It performs the LPHM parameter reconciliation, and returns the LPHM reconciled parameter matrix of shape (n, D). This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).

LPHM parameter reconciliation combines the low-rank parameter reconciliation with the hypercomplex multiplication based parameter reconciliation: the matrix \(\mathbf{B}\) used in the hypercomplex multiplication based parameter reconciliation is replaced with the product of two low-rank sub-matrices.

...

Notes

Formally, given the parameter vector \(\mathbf{w} \in {R}^{l}\), a rank hyper-parameter \(r\), and the parameter sub-matrix dimension parameters \(p\) and \(q\), the LPHM reconciliation function partitions \(\mathbf{w}\) into three sub-vectors and reshapes them into three matrices \(\mathbf{A} \in {R}^{p \times q}\), \(\mathbf{S} \in {R}^{\frac{n}{p} \times r}\) and \(\mathbf{T} \in {R}^{\frac{D}{q} \times r}\). These sub-matrices define the LPHM reconciliation function as follows:

$$ \begin{equation} \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = \mathbf{A} \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}. \end{equation} $$

Here \(\mathbf{B} = \mathbf{S} \mathbf{T}^\top \in {R}^{\frac{n}{p} \times \frac{D}{q}}\) has rank at most \(r\), so the product \(\mathbf{A} \otimes \mathbf{B}\) has the target shape \(n \times D\). Summing the sizes of \(\mathbf{A}\), \(\mathbf{S}\) and \(\mathbf{T}\) gives the required parameter vector length \(l\):

$$ \begin{equation} l = p \times q + r( \frac{n}{p} + \frac{D}{q} ). \end{equation} $$
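The definition can be traced by hand on a toy case. The following is a minimal, dependency-free sketch that reads \(\otimes\) as the textbook Kronecker product; the helper names `matmul_transposed` and `kron` and the toy sizes are assumptions made for illustration and are not part of the library. The library's own `forward` uses `torch.einsum` followed by `view`, and its element ordering is not checked here.

```python
def matmul_transposed(S, T):
    """Return S @ T^T for S of shape (s, r) and T of shape (t, r)."""
    return [
        [sum(s_row[k] * t_row[k] for k in range(len(s_row))) for t_row in T]
        for s_row in S
    ]


def kron(A, B):
    """Return the Kronecker product of A (p x q) and B (s x t) as nested lists."""
    p, q = len(A), len(A[0])
    s, t = len(B), len(B[0])
    return [
        [A[i // s][j // t] * B[i % s][j % t] for j in range(q * t)]
        for i in range(p * s)
    ]


# Toy sizes (hypothetical): n = 4, D = 6, p = 2, q = 3, r = 1.
A = [[1, 2, 3], [4, 5, 6]]   # p x q = 2 x 3
S = [[1], [2]]               # (n/p) x r = 2 x 1
T = [[1], [10]]              # (D/q) x r = 2 x 1

B = matmul_transposed(S, T)  # (n/p) x (D/q) = 2 x 2, rank at most r
W = kron(A, B)               # n x D = 4 x 6

print(B)     # [[1, 10], [2, 20]]
print(W[0])  # [1, 10, 2, 20, 3, 30]
```

Only \(6 + 2 + 2 = 10\) numbers are stored here, while the reconciled matrix has \(4 \times 6 = 24\) entries.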

The LPHM parameter reconciliation function imposes strict constraints on the parameters \(p\) and \(q\): they must be divisors of the target dimensions \(n\) and \(D\), respectively, i.e.,

$$ \begin{equation} n \% p = 0 \text{, and } D \% q = 0. \end{equation} $$
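To see which values of \(p\) and \(q\) satisfy the constraint for a given target shape, the divisors can simply be enumerated. This is a small sketch; `divisors` and `is_valid_lphm_shape` are hypothetical helper names and do not exist in the library.

```python
def divisors(m):
    """Return all positive divisors of m in ascending order."""
    return [d for d in range(1, m + 1) if m % d == 0]


def is_valid_lphm_shape(n, D, p, q):
    """Check the LPHM constraint: p must divide n and q must divide D."""
    return n % p == 0 and D % q == 0


print(divisors(12))                      # [1, 2, 3, 4, 6, 12]
print(is_valid_lphm_shape(12, 8, 3, 4))  # True
print(is_valid_lphm_shape(12, 8, 5, 4))  # False, since 12 % 5 != 0
```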

Attributes:

name: str, default = 'lphm_reconciliation'
    Name of the LPHM parameter reconciliation function.
p: int, default = None
    Parameter sub-matrix row dimension. If p is None, it is set to find_close_factors(n) when calculate_l or forward is first called.
q: int, default = None
    Parameter sub-matrix column dimension. If q is None, it is set to find_close_factors(D) when calculate_l or forward is first called.
r: int, default = 2
    Submatrix rank parameter.

Methods:

__init__
    It initializes the LPHM parameter reconciliation function.
calculate_l
    It calculates the length of required parameters for the reconciliation function.
forward
    It implements the abstract forward method declared in the base reconciliation class.

Source code in tinybig/reconciliation/lowrank_reconciliation.py
class lphm_reconciliation(fabrication):
    r"""
    The low-rank parameterized hypercomplex multiplication (LPHM) based parameter reconciliation function.

    It performs the LPHM parameter reconciliation, and returns the LPHM reconciled parameter matrix of shape (n, D).
    This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).

    The low-rank parameterized hypercomplex multiplication based parameter reconciliation can be viewed as a combination
    of the low-rank parameter reconciliation with the hypercomplex multiplication based parameter reconciliation, where
    the matrix $\mathbf{B}$ in the hypercomplex multiplication based parameter reconciliation is replaced with the
    product of two low-rank sub-matrices instead.

    ...

    Notes
    ----------
    Formally, given the parameter vector $\mathbf{w} \in {R}^{l}$ and a rank hyper-parameter $r$, together with the
    parameter sub-matrix dimension parameters $p$ and $q$, the LPHM reconciliation function partitions $\mathbf{w}$
    into three sub-vectors and subsequently reshapes them into three matrices $\mathbf{A} \in {R}^{p \times q}$,
    $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$.
    These sub-matrices $\mathbf{A}$, $\mathbf{S}$ and $\mathbf{T}$ help define the LPHM reconciliation function as follows:
    $$
        \begin{equation}
            \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = \mathbf{A} \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
        \end{equation}
    $$
    Summing the sizes of $\mathbf{A}$, $\mathbf{S}$ and $\mathbf{T}$ gives the required parameter vector
    length $l$:
    $$
        \begin{equation}
            l = p \times q + r( \frac{n}{p} + \frac{D}{q} ).
        \end{equation}
    $$

    The LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
    must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
    $$
        \begin{equation}
            n \\% p = 0 \text{, and } D \\% q = 0.
        \end{equation}
    $$

    Attributes
    ----------
    name: str, default = 'lphm_reconciliation'
        Name of the LPHM parameter reconciliation function
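    p: int, default = None
        Parameter sub-matrix row dimension.
        If p is None, it is set to find_close_factors(n) when calculate_l or forward is first called.
    q: int, default = None
        Parameter sub-matrix column dimension.
        If q is None, it is set to find_close_factors(D) when calculate_l or forward is first called.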
    r: int, default = 2
        Submatrix rank parameter.

    Methods
    ----------
    __init__
        It initializes the LPHM parameter reconciliation function.

    calculate_l
        It calculates the length of required parameters for the reconciliation function.

    forward
        It implements the abstract forward method declared in the base reconciliation class.
    """
    def __init__(self, name='lphm_reconciliation', p: int = None, q: int = None, r: int = 2, *args, **kwargs):
        """
        The initialization method of the LPHM parameter reconciliation function.

        It initializes an LPHM parameter reconciliation function object.
        This method also calls the initialization method of the base class.

        Parameters
        ----------
        name: str, default = 'lphm_reconciliation'
            Name of the LPHM parameter reconciliation function.
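        p: int, default = None
            Parameter sub-matrix row dimension.
            If p is None, it is set to find_close_factors(n) when calculate_l or forward is first called.
        q: int, default = None
            Parameter sub-matrix column dimension.
            If q is None, it is set to find_close_factors(D) when calculate_l or forward is first called.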
        r: int, default = 2
            Submatrix rank parameter.

        Returns
        ----------
        fabrication
            The LPHM parameter reconciliation function object.
        """
        super().__init__(name=name, *args, **kwargs)
        self.p = p
        self.q = q
        self.r = r

    def calculate_l(self, n: int, D: int):
        r"""
        The required parameter number calculation method.

        It calculates the number of required learnable parameters, i.e., $l$, of the parameter reconciliation function
        based on the output and intermediate expansion space dimensions, $n$ and $D$, and the dimension and rank
        parameters $p$, $q$ and $r$, which can be represented as follows:
        $$
            \begin{equation}
                l = p \times q + r( \frac{n}{p} + \frac{D}{q} ).
            \end{equation}
        $$

        Notes
        ----------
        The LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
        must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
        $$
            \begin{equation}
                n \\% p = 0 \text{, and } D \\% q = 0.
            \end{equation}
        $$

        Parameters
        ----------
        n: int
            The dimension of the output space.
        D: int
            The dimension of the intermediate expansion space.

        Returns
        -------
        int
            The number of required learnable parameters.
        """

        if self.p is None:
            self.p = find_close_factors(n)
        if self.q is None:
            self.q = find_close_factors(D)

        if n % self.p != 0 or D % self.q != 0:
            raise ValueError('The input dimensions {} and {} cannot be divided by parameter p {} and q {}'.format(n, D, self.p, self.q))
        s, t = int(n / self.p), int(D / self.q)
        assert (self.p * self.q * s * t == n * D)
        return self.p * self.q + s * self.r + t * self.r

    def forward(self, n: int, D: int, w: torch.nn.Parameter, device='cpu', *args, **kwargs):
        r"""
        The forward method of the parameter reconciliation function.

        It applies the LPHM parameter reconciliation operation to the input parameter vector $\mathbf{w}$,
        and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters
        $p$, $q$ and $r$ as follows:
        $$
            \begin{equation}
                \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = \mathbf{A} \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
            \end{equation}
        $$
        where $\mathbf{A} \in {R}^{p \times q}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and
        $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$ are all obtained by partitioning $\mathbf{w}$ into sub-vectors
        and subsequently reshaping them into matrices.

        Parameters
        ----------
        n: int
            The dimension of the output space.
        D: int
            The dimension of the intermediate expansion space.
        w: torch.nn.Parameter
            The learnable parameters of the model.
        device: str, default = 'cpu'
            Device to perform the parameter reconciliation.

        Returns
        ----------
        torch.Tensor
            The reconciled parameter matrix of shape (n, D).
        """
        if self.p is None:
            self.p = find_close_factors(n)
        if self.q is None:
            self.q = find_close_factors(D)

        assert w.ndim == 2 and w.size(1) == self.calculate_l(n=n, D=D)
        s, t = int(n/self.p), int(D/self.q)
        A, S, T = torch.split(w, [self.p*self.q, s*self.r, t*self.r], dim=1)
        B = torch.matmul(S.view(s, -1), T.view(-1, t)).view(1, -1)
        return torch.einsum('pq,st->psqt', A, B).view(self.p*s, self.q*t)

__init__(name='lphm_reconciliation', p=None, q=None, r=2, *args, **kwargs)

The initialization method of the LPHM parameter reconciliation function.

It initializes an LPHM parameter reconciliation function object. This method also calls the initialization method of the base class.

Parameters:

name: str, default = 'lphm_reconciliation'
    Name of the LPHM parameter reconciliation function.
p: int, default = None
    Parameter sub-matrix row dimension. If p is None, it is set to find_close_factors(n) when calculate_l or forward is first called.
q: int, default = None
    Parameter sub-matrix column dimension. If q is None, it is set to find_close_factors(D) when calculate_l or forward is first called.
r: int, default = 2
    Submatrix rank parameter.

Returns:

fabrication
    The LPHM parameter reconciliation function object.

Source code in tinybig/reconciliation/lowrank_reconciliation.py
def __init__(self, name='lphm_reconciliation', p: int = None, q: int = None, r: int = 2, *args, **kwargs):
    """
    The initialization method of the LPHM parameter reconciliation function.

    It initializes an LPHM parameter reconciliation function object.
    This method also calls the initialization method of the base class.

    Parameters
    ----------
    name: str, default = 'lphm_reconciliation'
        Name of the LPHM parameter reconciliation function.
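    p: int, default = None
        Parameter sub-matrix row dimension.
        If p is None, it is set to find_close_factors(n) when calculate_l or forward is first called.
    q: int, default = None
        Parameter sub-matrix column dimension.
        If q is None, it is set to find_close_factors(D) when calculate_l or forward is first called.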
    r: int, default = 2
        Submatrix rank parameter.

    Returns
    ----------
    fabrication
        The LPHM parameter reconciliation function object.
    """
    super().__init__(name=name, *args, **kwargs)
    self.p = p
    self.q = q
    self.r = r

calculate_l(n, D)

The required parameter number calculation method.

It calculates the number of required learnable parameters, i.e., \(l\), of the parameter reconciliation function based on the output and intermediate expansion space dimensions, \(n\) and \(D\), and the dimension and rank parameters \(p\), \(q\) and \(r\), which can be represented as follows:

$$ \begin{equation} l = p \times q + r( \frac{n}{p} + \frac{D}{q} ). \end{equation} $$

Notes

The LPHM parameter reconciliation function imposes strict constraints on the parameters \(p\) and \(q\): they must be divisors of the target dimensions \(n\) and \(D\), respectively, i.e.,

$$ \begin{equation} n \% p = 0 \text{, and } D \% q = 0. \end{equation} $$

If either constraint is violated, calculate_l raises a ValueError.

Parameters:

n: int, required
    The dimension of the output space.
D: int, required
    The dimension of the intermediate expansion space.

Returns:

int
    The number of required learnable parameters.
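The formula is easy to check by hand. The sketch below mirrors the arithmetic of calculate_l as a standalone function; `lphm_param_count` is a hypothetical name, and the sizes are chosen only for illustration. It also compares the result with the \(n \times D\) entries of a dense parameter matrix.

```python
def lphm_param_count(n, D, p, q, r):
    """Return l = p*q + r*(n/p + D/q), requiring p | n and q | D."""
    if n % p != 0 or D % q != 0:
        raise ValueError(
            f"n={n} and D={D} must be divisible by p={p} and q={q}, respectively"
        )
    s, t = n // p, D // q          # row and column counts of B = S T^T
    return p * q + r * (s + t)     # sizes of A, S and T added together


n, D = 64, 128
l = lphm_param_count(n, D, p=8, q=8, r=2)
dense = n * D

print(l)      # 112, i.e. 8*8 + 2*(8 + 16)
print(dense)  # 8192
```

With these illustrative values the LPHM parameterization needs 112 learnable parameters instead of 8192.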

Source code in tinybig/reconciliation/lowrank_reconciliation.py
def calculate_l(self, n: int, D: int):
    r"""
    The required parameter number calculation method.

    It calculates the number of required learnable parameters, i.e., $l$, of the parameter reconciliation function
    based on the output and intermediate expansion space dimensions, $n$ and $D$, and the dimension and rank
    parameters $p$, $q$ and $r$, which can be represented as follows:
    $$
        \begin{equation}
            l = p \times q + r( \frac{n}{p} + \frac{D}{q} ).
        \end{equation}
    $$

    Notes
    ----------
    The LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
    must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
    $$
        \begin{equation}
            n \\% p = 0 \text{, and } D \\% q = 0.
        \end{equation}
    $$

    Parameters
    ----------
    n: int
        The dimension of the output space.
    D: int
        The dimension of the intermediate expansion space.

    Returns
    -------
    int
        The number of required learnable parameters.
    """

    if self.p is None:
        self.p = find_close_factors(n)
    if self.q is None:
        self.q = find_close_factors(D)

    if n % self.p != 0 or D % self.q != 0:
        raise ValueError('The input dimensions {} and {} cannot be divided by parameter p {} and q {}'.format(n, D, self.p, self.q))
    s, t = int(n / self.p), int(D / self.q)
    assert (self.p * self.q * s * t == n * D)
    return self.p * self.q + s * self.r + t * self.r

forward(n, D, w, device='cpu', *args, **kwargs)

The forward method of the parameter reconciliation function.

It applies the LPHM parameter reconciliation operation to the input parameter vector \(\mathbf{w}\), and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters \(p\), \(q\) and \(r\) as follows:

$$ \begin{equation} \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = \mathbf{A} \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}. \end{equation} $$

where \(\mathbf{A} \in {R}^{p \times q}\), \(\mathbf{S} \in {R}^{\frac{n}{p} \times r}\) and \(\mathbf{T} \in {R}^{\frac{D}{q} \times r}\) are all obtained by partitioning \(\mathbf{w}\) into sub-vectors and subsequently reshaping them into matrices.

Parameters:

n: int, required
    The dimension of the output space.
D: int, required
    The dimension of the intermediate expansion space.
w: torch.nn.Parameter, required
    The learnable parameters of the model, given as a 2-dimensional tensor whose second dimension has length calculate_l(n, D).
device: str, default = 'cpu'
    Device to perform the parameter reconciliation.

Returns:

torch.Tensor
    The reconciled parameter matrix of shape (n, D).
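The partitioning step can be illustrated without PyTorch. The following sketch splits a flat parameter list into segments of lengths \(p q\), \(\frac{n}{p} r\) and \(\frac{D}{q} r\), in the same order as the `torch.split` call in the source listing, and reshapes them row by row. `split_lphm_params` is a hypothetical helper, and plain Python lists stand in for tensors. The third segment is reshaped to \(r \times \frac{D}{q}\), following the `T.view(-1, t)` call in the listing, so it plays the role of \(\mathbf{T}^\top\).

```python
def split_lphm_params(w, n, D, p, q, r):
    """Split a flat parameter list into A (p x q), S (s x r) and T^T (r x t)."""
    s, t = n // p, D // q
    sizes = [p * q, s * r, t * r]
    if len(w) != sum(sizes):
        raise ValueError(f"expected {sum(sizes)} parameters, got {len(w)}")

    a_flat = w[:sizes[0]]
    s_flat = w[sizes[0]:sizes[0] + sizes[1]]
    t_flat = w[sizes[0] + sizes[1]:]

    # Row-major reshapes, analogous to tensor.view(rows, cols).
    A = [a_flat[i * q:(i + 1) * q] for i in range(p)]
    S = [s_flat[i * r:(i + 1) * r] for i in range(s)]
    T_t = [t_flat[i * t:(i + 1) * t] for i in range(r)]
    return A, S, T_t


# Toy sizes (hypothetical): n = 4, D = 6, p = 2, q = 3, r = 1, so l = 6 + 2 + 2 = 10.
A, S, T_t = split_lphm_params(list(range(10)), n=4, D=6, p=2, q=3, r=1)

print(A)    # [[0, 1, 2], [3, 4, 5]]
print(S)    # [[6], [7]]
print(T_t)  # [[8, 9]]
```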

Source code in tinybig/reconciliation/lowrank_reconciliation.py
def forward(self, n: int, D: int, w: torch.nn.Parameter, device='cpu', *args, **kwargs):
    r"""
    The forward method of the parameter reconciliation function.

    It applies the LPHM parameter reconciliation operation to the input parameter vector $\mathbf{w}$,
    and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters
    $p$, $q$ and $r$ as follows:
    $$
        \begin{equation}
            \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = \mathbf{A} \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
        \end{equation}
    $$
    where $\mathbf{A} \in {R}^{p \times q}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and
    $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$ are all obtained by partitioning $\mathbf{w}$ into sub-vectors
    and subsequently reshaping them into matrices.

    Parameters
    ----------
    n: int
        The dimension of the output space.
    D: int
        The dimension of the intermediate expansion space.
    w: torch.nn.Parameter
        The learnable parameters of the model.
    device: str, default = 'cpu'
        Device to perform the parameter reconciliation.

    Returns
    ----------
    torch.Tensor
        The reconciled parameter matrix of shape (n, D).
    """
    if self.p is None:
        self.p = find_close_factors(n)
    if self.q is None:
        self.q = find_close_factors(D)

    assert w.ndim == 2 and w.size(1) == self.calculate_l(n=n, D=D)
    s, t = int(n/self.p), int(D/self.q)
    A, S, T = torch.split(w, [self.p*self.q, s*self.r, t*self.r], dim=1)
    B = torch.matmul(S.view(s, -1), T.view(-1, t)).view(1, -1)
    return torch.einsum('pq,st->psqt', A, B).view(self.p*s, self.q*t)