dual_lphm_reconciliation

Bases: fabrication

The dual low-rank parameterized hypercomplex multiplication (Dual-LPHM) based parameter reconciliation function.

It performs the Dual-LPHM parameter reconciliation, and returns the Dual-LPHM reconciled parameter matrix of shape (n, D). This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).

The dual low-rank parameterized hypercomplex multiplication based parameter reconciliation can be viewed as a more aggressive version of the LPHM based parameter reconciliation function. It replaces both A and B in the hypercomplex multiplication based parameter reconciliation with the products of two low-rank sub-matrices, respectively.

...

Notes

Formally, given the parameter vector $\mathbf{w} \in {R}^{l}$ and a rank hyper-parameter $r$, together with the parameter sub-matrix dimension parameters $p$ and $q$, the Dual-LPHM reconciliation function partitions $\mathbf{w}$ into four sub-vectors and subsequently reshapes them into four matrices $\mathbf{P} \in {R}^{p \times r}$, $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$. These sub-matrices $\mathbf{P}$, $\mathbf{Q}$, $\mathbf{S}$ and $\mathbf{T}$ define the Dual-LPHM reconciliation function as follows:
$$
\begin{equation}
    \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
\end{equation}
$$
For these shapes to be well defined, $p$ must divide $n$ and $q$ must divide $D$; the parameter vector length $l$ can then be calculated as follows:
$$
\begin{equation}
    l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
\end{equation}
$$

The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
$$
\begin{equation}
    n \% p = 0 \text{, and } D \% q = 0.
\end{equation}
$$
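The shapes and the parameter count in the Notes above can be checked numerically. The following is a minimal sketch, not the library's implementation: it assumes illustrative values for n, D, p, q and r, and it uses `torch.kron` as the textbook Kronecker product, so the entry layout may differ from what the library's `forward` method produces.

```python
import torch

# Illustrative values (assumptions, not library defaults).
n, D = 6, 8        # target shape of the reconciled matrix
p, q, r = 2, 4, 3  # p divides n, q divides D; r is the rank

# Four low-rank factors with the shapes given in the Notes.
P = torch.randn(p, r)
Q = torch.randn(q, r)
S = torch.randn(n // p, r)
T = torch.randn(D // q, r)

A = P @ Q.T           # shape (p, q)
B = S @ T.T           # shape (n/p, D/q)
W = torch.kron(A, B)  # shape (n, D)

# l = r(p + q + n/p + D/q) equals the total number of factor entries.
l = r * (p + q + n // p + D // q)
num_factor_entries = P.numel() + Q.numel() + S.numel() + T.numel()
```

With these values, W has shape (6, 8) and both l and num_factor_entries equal 33, compared with 48 entries in a dense (n, D) matrix.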

Attributes:

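| Name | Type | Description |
|------|------|-------------|
| name | str, default = 'dual_lphm_reconciliation' | Name of the Dual-LPHM parameter reconciliation function. |
| p | int, default = None | Parameter sub-matrix row dimension, which must divide n. If p is None, it is set to find_close_factors(n) on the first call to calculate_l or forward. |
| q | int, default = None | Parameter sub-matrix column dimension, which must divide D. If q is None, it is set to find_close_factors(D) on the first call to calculate_l or forward. |
| r | int, default = 2 | Submatrix rank parameter. |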

Methods:

| Name | Description |
|------|-------------|
| `__init__` | It initializes the Dual-LPHM parameter reconciliation function. |
| `calculate_l` | It calculates the length of required parameters for the reconciliation function. |
| `forward` | It implements the abstract forward method declared in the base reconciliation class. |

Source code in tinybig/reconciliation/lowrank_reconciliation.py
class dual_lphm_reconciliation(fabrication):
    r"""
    The dual low-rank parameterized hypercomplex multiplication (Dual-LPHM) based parameter reconciliation function.

    It performs the Dual-LPHM parameter reconciliation, and returns the Dual-LPHM reconciled parameter matrix of shape (n, D).
    This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).

    The dual low-rank parameterized hypercomplex multiplication based parameter reconciliation can be viewed as a more
    aggressive version of the LPHM based parameter reconciliation function.
    It replaces both $\mathbf{A}$ and $\mathbf{B}$ in the hypercomplex multiplication based parameter reconciliation
    with the products of two low-rank sub-matrices, respectively.

    ...

    Notes
    ----------
    Formally, given the parameter vector $\mathbf{w} \in {R}^{l}$ and a rank hyper-parameter $r$, together with the
    parameter sub-matrix dimension parameters $p$ and $q$, the Dual-LPHM reconciliation function partitions $\mathbf{w}$
    into four sub-vectors and subsequently reshapes them into four matrices $\mathbf{P} \in {R}^{p \times r}$,
    $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$.
    These sub-matrices $\mathbf{P}$, $\mathbf{Q}$, $\mathbf{S}$ and $\mathbf{T}$ help define the Dual-LPHM reconciliation function as follows:
    $$
        \begin{equation}
            \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
        \end{equation}
    $$
    For these shapes to be well defined, $p$ must divide $n$ and $q$ must divide $D$; the parameter vector
    length $l$ can then be calculated as follows:
    $$
        \begin{equation}
            l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
        \end{equation}
    $$

    The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
    must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
    $$
        \begin{equation}
            n \\% p = 0 \text{, and } D \\% q = 0.
        \end{equation}
    $$

    Attributes
    ----------
    name: str, default = 'dual_lphm_reconciliation'
        Name of the Dual-LPHM parameter reconciliation function
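    p: int, default = None
        Parameter sub-matrix row dimension, which must divide n.
        If p is None, it is set to find_close_factors(n) on the first call to calculate_l or forward.
    q: int, default = None
        Parameter sub-matrix column dimension, which must divide D.
        If q is None, it is set to find_close_factors(D) on the first call to calculate_l or forward.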
    r: int, default = 2
        Submatrix rank parameter.

    Methods
    ----------
    __init__
        It initializes the Dual-LPHM parameter reconciliation function.

    calculate_l
        It calculates the length of required parameters for the reconciliation function.

    forward
        It implements the abstract forward method declared in the base reconciliation class.
    """
    def __init__(self, name='dual_lphm_reconciliation', p: int = None, q: int = None, r=2, *args, **kwargs):
        """
        The initialization method of the Dual-LPHM parameter reconciliation function.

        It initializes a Dual-LPHM parameter reconciliation function object.
        This method also calls the initialization method of the base class.

        Parameters
        ----------
        name: str, default = 'dual_lphm_reconciliation'
            Name of the Dual-LPHM parameter reconciliation function.
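        p: int, default = None
            Parameter sub-matrix row dimension, which must divide n.
            If p is None, it is set to find_close_factors(n) on the first call to calculate_l or forward.
        q: int, default = None
            Parameter sub-matrix column dimension, which must divide D.
            If q is None, it is set to find_close_factors(D) on the first call to calculate_l or forward.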
        r: int, default = 2
            Submatrix rank parameter.

        Returns
        ----------
        fabrication
            The Dual-LPHM parameter reconciliation function object.
        """
        super().__init__(name=name, *args, **kwargs)
        self.p = p
        self.q = q
        self.r = r

    def calculate_l(self, n: int, D: int):
        r"""
        The required parameter number calculation method.

        It calculates the number of required learnable parameters, i.e., $l$, of the parameter reconciliation function
        based on the output and intermediate expansion space dimensions, $n$ and $D$, and the dimension and rank parameters
        $p$, $q$ and $r$, which can be represented as follows:
        $$
            \begin{equation}
                l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
            \end{equation}
        $$

        Notes
        ----------
        The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which
        must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
        $$
            \begin{equation}
                n \\% p = 0 \text{, and } D \\% q = 0.
            \end{equation}
        $$

        Parameters
        ----------
        n: int
            The dimension of the output space.
        D: int
            The dimension of the intermediate expansion space.

        Returns
        -------
        int
            The number of required learnable parameters.
        """
        if self.p is None:
            self.p = find_close_factors(n)
        if self.q is None:
            self.q = find_close_factors(D)

        if n % self.p != 0 or D % self.q != 0:
            raise ValueError('The input dimensions {} and {} cannot be divided by parameter p {} and q {}'.format(n, D, self.p, self.q))
        s, t = n // self.p, D // self.q
        assert (self.p * self.q * s * t == n * D)
        return self.p * self.r + self.q * self.r + s * self.r + t * self.r

    def forward(self, n: int, D: int, w: torch.nn.Parameter, device='cpu', *args, **kwargs):
        r"""
        The forward method of the parameter reconciliation function.

        It applies the Dual-LPHM parameter reconciliation operation to the input parameter vector $\mathbf{w}$,
        and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters
        $p$, $q$ and $r$ as follows:
        $$
            \begin{equation}
                \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
            \end{equation}
        $$
        where $\mathbf{P} \in {R}^{p \times r}$, $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and
        $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$ are all obtained by partitioning $\mathbf{w}$ into sub-vectors
        and subsequently reshaping them into matrices.

        Parameters
        ----------
        n: int
            The dimension of the output space.
        D: int
            The dimension of the intermediate expansion space.
        w: torch.nn.Parameter
            The learnable parameters of the model, of shape (1, l).
        device: str, default = 'cpu'
            Device to perform the parameter reconciliation.

        Returns
        ----------
        torch.Tensor
            The reconciled parameter matrix of shape (n, D).
        """
        if self.p is None:
            self.p = find_close_factors(n)
        if self.q is None:
            self.q = find_close_factors(D)

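        # w is expected to be a row vector of shape (1, l).
        assert w.ndim == 2 and w.size(1) == self.calculate_l(n=n, D=D)
        s, t = n // self.p, D // self.q
        # Split w into the four flattened factors P, Q, S and T.
        P, Q, S, T = torch.split(w, [self.p*self.r, self.q*self.r, s*self.r, t*self.r], dim=1)
        # A has p*q entries and B has s*t entries; both are flattened to row vectors.
        A = torch.matmul(P.view(self.p, -1), Q.view(-1, self.q)).view(1, -1)
        B = torch.matmul(S.view(s, -1), T.view(-1, t)).view(1, -1)
        # Combine A and B and reshape the result to (n, D).
        return torch.einsum('pq,st->psqt', A, B).view(self.p*s, self.q*t)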

__init__(name='dual_lphm_reconciliation', p=None, q=None, r=2, *args, **kwargs)

The initialization method of the Dual-LPHM parameter reconciliation function.

It initializes a Dual-LPHM parameter reconciliation function object. This method also calls the initialization method of the base class.

Parameters:

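| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | Name of the Dual-LPHM parameter reconciliation function. | 'dual_lphm_reconciliation' |
| p | int | Parameter sub-matrix row dimension, which must divide n. If p is None, it is set to find_close_factors(n) on the first call to calculate_l or forward. | None |
| q | int | Parameter sub-matrix column dimension, which must divide D. If q is None, it is set to find_close_factors(D) on the first call to calculate_l or forward. | None |
| r | int | Submatrix rank parameter. | 2 |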

Returns:

| Type | Description |
|------|-------------|
| fabrication | The Dual-LPHM parameter reconciliation function object. |

calculate_l(n, D)

The required parameter number calculation method.

It calculates the number of required learnable parameters, i.e., $l$, of the parameter reconciliation function based on the output and intermediate expansion space dimensions, $n$ and $D$, and the dimension and rank parameters $p$, $q$ and $r$, which can be represented as follows:
$$
\begin{equation}
    l = r( p + q + \frac{n}{p} + \frac{D}{q} ).
\end{equation}
$$

Notes

The Dual-LPHM parameter reconciliation function imposes strict constraints on the parameters $p$ and $q$, which must be divisors of the target dimensions $n$ and $D$, respectively, i.e.,
$$
\begin{equation}
    n \% p = 0 \text{, and } D \% q = 0.
\end{equation}
$$

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| n | int | The dimension of the output space. | required |
| D | int | The dimension of the intermediate expansion space. | required |

Returns:

| Type | Description |
|------|-------------|
| int | The number of required learnable parameters. |
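The parameter count and the divisibility check can be worked through by hand. The sketch below is a hypothetical standalone helper (`dual_lphm_param_count` is not part of the library) that mirrors the formula and the ValueError raised when p or q is not a divisor; the values of n, D, p, q and r are illustrative assumptions.

```python
def dual_lphm_param_count(n: int, D: int, p: int, q: int, r: int) -> int:
    """Hypothetical helper mirroring l = r(p + q + n/p + D/q)."""
    if n % p != 0 or D % q != 0:
        raise ValueError(f"p={p} must divide n={n} and q={q} must divide D={D}")
    return r * (p + q + n // p + D // q)


# Illustrative values: a 64 x 128 target matrix.
l = dual_lphm_param_count(n=64, D=128, p=8, q=16, r=2)  # 2 * (8 + 16 + 8 + 8) = 80
dense = 64 * 128                                        # 8192 entries in a dense matrix
```

Here 80 learnable parameters stand in for the 8192 entries of a dense matrix. Choosing p=4 with n=10 would raise a ValueError, since 4 does not divide 10.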

forward(n, D, w, device='cpu', *args, **kwargs)

The forward method of the parameter reconciliation function.

It applies the Dual-LPHM parameter reconciliation operation to the input parameter vector $\mathbf{w}$, and returns the reconciled parameter matrix of shape (n, D) subject to the dimension and rank parameters $p$, $q$ and $r$ as follows:
$$
\begin{equation}
    \psi(\mathbf{w}) = \mathbf{A} \otimes \mathbf{B} = ( \mathbf{P} \mathbf{Q}^\top) \otimes ( \mathbf{S} \mathbf{T}^\top) \in {R}^{n \times D}.
\end{equation}
$$
where $\mathbf{P} \in {R}^{p \times r}$, $\mathbf{Q} \in {R}^{q \times r}$, $\mathbf{S} \in {R}^{\frac{n}{p} \times r}$ and $\mathbf{T} \in {R}^{\frac{D}{q} \times r}$ are all obtained by partitioning $\mathbf{w}$ into sub-vectors and subsequently reshaping them into matrices.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| n | int | The dimension of the output space. | required |
| D | int | The dimension of the intermediate expansion space. | required |
| w | Parameter | The learnable parameters of the model, of shape (1, l). | required |
| device | str | Device to perform the parameter reconciliation. | 'cpu' |

Returns:

| Type | Description |
|------|-------------|
| Tensor | The reconciled parameter matrix of shape (n, D). |
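The partition-and-reshape step described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the values of n, D, p, q and r are made up, each block is reshaped row-major into a (rows, r) matrix to follow the notation in the formula, and `torch.kron` stands in for the Kronecker product, so the entry layout may differ from the library's `forward` output.

```python
import torch

# Illustrative values (assumptions): p divides n and q divides D.
n, D = 6, 8
p, q, r = 2, 4, 3
s, t = n // p, D // q
l = r * (p + q + s + t)

# A row vector of learnable parameters with shape (1, l).
w = torch.nn.Parameter(torch.randn(1, l))

# Partition w into four contiguous blocks, one per factor matrix.
sizes = [p * r, q * r, s * r, t * r]
P_flat, Q_flat, S_flat, T_flat = torch.split(w, sizes, dim=1)

# Reshape each block into its factor matrix (row-major layout is an assumption).
P = P_flat.reshape(p, r)
Q = Q_flat.reshape(q, r)
S = S_flat.reshape(s, r)
T = T_flat.reshape(t, r)

# Textbook Kronecker product of the two low-rank products.
W = torch.kron(P @ Q.T, S @ T.T)

# Gradients flow back to every entry of w through the factors.
W.sum().backward()
```

The block sizes sum to l, W has shape (n, D), and w.grad has the same shape as w, which is what allows the (1, l) vector to be trained in place of a dense (n, D) matrix.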
