bilinear_interdependence_layer

Bases: layer

A bilinear interdependence layer for processing data with interdependencies.

This layer incorporates bilinear interdependence heads with optional features such as Taylor expansion for data transformation, parameter reconciliation, and various output processing functions. The outputs of its heads are fused using parameterized concatenation.
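
A minimal usage sketch is given below. The import path is inferred from the "Source code in tinybig/layer/bilinear_layers.py" note on this page, and the forward call assumes the standard torch.nn.Module call interface on inputs of shape (batch_size, m); both are assumptions rather than details documented here, and all dimensions are illustrative.

import torch
from tinybig.layer.bilinear_layers import bilinear_interdependence_layer

# illustrative dimensions
layer = bilinear_interdependence_layer(
    m=16, n=8,        # input and output dimensions of the layer
    batch_num=32,     # number of instances used for instance interdependence
    width=2,          # two bilinear interdependence heads
    device='cpu',
)

x = torch.randn(32, 16)   # (batch_size, m)
y = layer(x)              # assumed nn.Module-style forward call; expected shape (32, 8)
print(y.shape)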

Attributes:

m : int
    The input dimension of the layer.
n : int
    The output dimension of the layer.
name : str
    The name of the layer.
batch_num : int
    The number of batches for instance interdependence.
channel_num : int
    The number of channels in the layer.
width : int
    The number of bilinear interdependence heads in the layer.
with_dual_lphm_interdependence : bool
    Whether to use dual LPHM interdependence.
with_lorr_interdependence : bool
    Whether to use LORR interdependence.
r_interdependence : int
    The rank for bilinear interdependence.
with_taylor : bool
    Whether to use Taylor expansion for data transformation.
d : int
    The degree of the Taylor expansion.
with_dual_lphm : bool
    Whether to use dual LPHM reconciliation for parameters.
with_lorr : bool
    Whether to use LORR reconciliation for parameters.
r : int
    The rank for parameter reconciliation.
enable_bias : bool
    Whether to enable bias in parameter reconciliation.
with_residual : bool
    Whether to include a residual connection.
with_batch_norm : bool
    Whether to apply batch normalization to the output.
with_relu : bool
    Whether to apply ReLU activation to the output.
with_softmax : bool
    Whether to apply softmax activation to the output.
with_dropout : bool
    Whether to apply dropout to the output.
p : float
    Dropout probability.
parameters_init_method : str
    The initialization method for parameters.
device : str
    The device to run the layer on ('cpu' or 'cuda').
head_fusion : parameterized_concatenation_fusion
    The fusion method for combining outputs from multiple heads.
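
Reading the width and head_fusion attributes together, the layer output can be interpreted as the width head outputs being concatenated and then projected by the parameterized concatenation fusion. The following is an interpretive sketch of that fusion step, not a formula taken from the source:

$$
\mathbf{z} \;=\; \mathrm{head\_fusion}\Big( \big[\, \mathrm{head}_1(\mathbf{x}) \,\|\, \mathrm{head}_2(\mathbf{x}) \,\|\, \cdots \,\|\, \mathrm{head}_{w}(\mathbf{x}) \,\big] \Big),
\qquad \mathrm{head}_i(\mathbf{x}) \in \mathbb{R}^{n}, \quad w = \texttt{width},
$$

where $\|$ denotes concatenation and the parameterized concatenation fusion presumably maps the concatenated $w \cdot n$-dimensional vector back to the layer's $n$-dimensional output.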

Source code in tinybig/layer/bilinear_layers.py
class bilinear_interdependence_layer(layer):
    """
    A bilinear interdependence layer for processing data with interdependencies.

    This layer incorporates bilinear interdependence heads with optional features such as Taylor expansions,
    parameter reconciliation, and various output processing functions. It supports channel fusion using
    parameterized concatenation.

    Attributes
    ----------
    m : int
        The input dimension of the layer.
    n : int
        The output dimension of the layer.
    name : str
        The name of the layer.
    batch_num : int
        The number of batches for instance interdependence.
    channel_num : int
        The number of channels in the layer.
    width : int
        The number of bilinear interdependence heads in the layer.
    with_dual_lphm_interdependence : bool
        Whether to use dual LPHM interdependence.
    with_lorr_interdependence : bool
        Whether to use LORR interdependence.
    r_interdependence : int
        The rank for bilinear interdependence.
    with_taylor : bool
        Whether to use Taylor expansion for data transformation.
    d : int
        The degree of the Taylor expansion.
    with_dual_lphm : bool
        Whether to use dual LPHM reconciliation for parameters.
    with_lorr : bool
        Whether to use LORR reconciliation for parameters.
    r : int
        The rank for parameter reconciliation.
    enable_bias : bool
        Whether to enable bias in parameter reconciliation.
    with_residual : bool
        Whether to include a residual connection.
    with_batch_norm : bool
        Whether to apply batch normalization to the output.
    with_relu : bool
        Whether to apply ReLU activation to the output.
    with_softmax : bool
        Whether to apply softmax activation to the output.
    with_dropout : bool
        Whether to apply dropout to the output.
    p : float
        Dropout probability.
    parameters_init_method : str
        The initialization method for parameters.
    device : str
        The device to run the layer on ('cpu' or 'cuda').
    head_fusion : parameterized_concatenation_fusion
        The fusion method for combining outputs from multiple heads.
    """
    def __init__(
        self,
        m: int, n: int,
        name: str = 'attention_layer',
        batch_num: int = None,
        channel_num: int = 1, width: int = 1,
        # interdependence function parameters
        with_dual_lphm_interdependence: bool = False,
        with_lorr_interdependence: bool = False, r_interdependence: int = 3,
        # data transformation function parameters
        with_taylor: bool = False, d: int = 2,
        # parameter reconciliation function parameters
        with_dual_lphm: bool = False,
        with_lorr: bool = False, r: int = 3,
        enable_bias: bool = False,
        # remainder function parameters
        with_residual: bool = False,
        # output processing parameters
        with_batch_norm: bool = False,
        with_relu: bool = True,
        with_softmax: bool = True,
        with_dropout: bool = False, p: float = 0.25,
        # other parameters
        parameters_init_method: str = 'xavier_normal',
        device: str = 'cpu', *args, **kwargs
    ):
        """
        Initialize a bilinear interdependence layer.

        Parameters
        ----------
        m : int
            The input dimension of the layer.
        n : int
            The output dimension of the layer.
        name : str, default='attention_layer'
            The name of the layer.
        batch_num : int, optional
            The number of batches for instance interdependence.
        channel_num : int, default=1
            The number of channels in the layer.
        width : int, default=1
            The number of bilinear interdependence heads in the layer.
        with_dual_lphm_interdependence : bool, default=False
            Whether to use dual LPHM interdependence.
        with_lorr_interdependence : bool, default=False
            Whether to use LORR interdependence.
        r_interdependence : int, default=3
            The rank for bilinear interdependence.
        with_taylor : bool, default=False
            Whether to use Taylor expansion for data transformation.
        d : int, default=2
            The degree of the Taylor expansion.
        with_dual_lphm : bool, default=False
            Whether to use dual LPHM reconciliation for parameters.
        with_lorr : bool, default=False
            Whether to use LORR reconciliation for parameters.
        r : int, default=3
            The rank for parameter reconciliation.
        enable_bias : bool, default=False
            Whether to enable bias in parameter reconciliation.
        with_residual : bool, default=False
            Whether to include a residual connection.
        with_batch_norm : bool, default=False
            Whether to apply batch normalization to the output.
        with_relu : bool, default=True
            Whether to apply ReLU activation to the output.
        with_softmax : bool, default=True
            Whether to apply softmax activation to the output.
        with_dropout : bool, default=False
            Whether to apply dropout to the output.
        p : float, default=0.25
            Dropout probability.
        parameters_init_method : str, default='xavier_normal'
            The initialization method for parameters.
        device : str, default='cpu'
            The device to run the layer on ('cpu' or 'cuda').

        Returns
        -------
        None
        """
        print('* bilinear_interdependence_layer, width:', width)
        heads = [
            bilinear_interdependence_head(
                m=m, n=n,
                batch_num=batch_num,
                channel_num=channel_num,
                # --------------------------
                with_dual_lphm_interdependence=with_dual_lphm_interdependence,
                with_lorr_interdependence=with_lorr_interdependence, r_interdependence=r_interdependence,
                # --------------------------
                with_taylor=with_taylor, d=d,
                # --------------------------
                with_dual_lphm=with_dual_lphm,
                with_lorr=with_lorr, r=r,
                enable_bias=enable_bias,
                # --------------------------
                with_residual=with_residual,
                # --------------------------
                with_batch_norm=with_batch_norm,
                with_relu=with_relu,
                with_softmax=with_softmax,
                with_dropout=with_dropout, p=p,
                # --------------------------
                parameters_init_method=parameters_init_method,
                device=device, *args, **kwargs
            )
        ] * width
        head_fusion = parameterized_concatenation_fusion(
            dims=[n]*width
        )
        print('--------------------------')
        super().__init__(name=name, m=m, n=n, heads=heads, head_fusion=head_fusion, device=device, *args, **kwargs)

__init__(m, n, name='attention_layer', batch_num=None, channel_num=1, width=1, with_dual_lphm_interdependence=False, with_lorr_interdependence=False, r_interdependence=3, with_taylor=False, d=2, with_dual_lphm=False, with_lorr=False, r=3, enable_bias=False, with_residual=False, with_batch_norm=False, with_relu=True, with_softmax=True, with_dropout=False, p=0.25, parameters_init_method='xavier_normal', device='cpu', *args, **kwargs)

Initialize a bilinear interdependence layer.

Parameters:

m : int, required
    The input dimension of the layer.
n : int, required
    The output dimension of the layer.
name : str, default='attention_layer'
    The name of the layer.
batch_num : int, default=None
    The number of batches for instance interdependence.
channel_num : int, default=1
    The number of channels in the layer.
width : int, default=1
    The number of bilinear interdependence heads in the layer.
with_dual_lphm_interdependence : bool, default=False
    Whether to use dual LPHM interdependence.
with_lorr_interdependence : bool, default=False
    Whether to use LORR interdependence.
r_interdependence : int, default=3
    The rank for bilinear interdependence.
with_taylor : bool, default=False
    Whether to use Taylor expansion for data transformation.
d : int, default=2
    The degree of the Taylor expansion.
with_dual_lphm : bool, default=False
    Whether to use dual LPHM reconciliation for parameters.
with_lorr : bool, default=False
    Whether to use LORR reconciliation for parameters.
r : int, default=3
    The rank for parameter reconciliation.
enable_bias : bool, default=False
    Whether to enable bias in parameter reconciliation.
with_residual : bool, default=False
    Whether to include a residual connection.
with_batch_norm : bool, default=False
    Whether to apply batch normalization to the output.
with_relu : bool, default=True
    Whether to apply ReLU activation to the output.
with_softmax : bool, default=True
    Whether to apply softmax activation to the output.
with_dropout : bool, default=False
    Whether to apply dropout to the output.
p : float, default=0.25
    Dropout probability.
parameters_init_method : str, default='xavier_normal'
    The initialization method for parameters.
device : str, default='cpu'
    The device to run the layer on ('cpu' or 'cuda').

Returns:

None
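
As a configuration sketch, the low-rank options above might be combined as follows. The import path is assumed from the source location shown below, and all values (dimensions, ranks, dropout rate) are illustrative rather than recommended defaults.

from tinybig.layer.bilinear_layers import bilinear_interdependence_layer

# a wider layer with low-rank (LORR) interdependence and reconciliation;
# the numbers here are placeholders for illustration only
layer = bilinear_interdependence_layer(
    m=64, n=32,
    batch_num=128,
    width=4,                          # four bilinear interdependence heads
    with_lorr_interdependence=True,   # low-rank bilinear interdependence
    r_interdependence=5,              # interdependence rank
    with_lorr=True, r=3,              # low-rank parameter reconciliation
    enable_bias=True,
    with_residual=True,
    with_dropout=True, p=0.1,
    device='cpu',
)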
Source code in tinybig/layer/bilinear_layers.py
def __init__(
    self,
    m: int, n: int,
    name: str = 'attention_layer',
    batch_num: int = None,
    channel_num: int = 1, width: int = 1,
    # interdependence function parameters
    with_dual_lphm_interdependence: bool = False,
    with_lorr_interdependence: bool = False, r_interdependence: int = 3,
    # data transformation function parameters
    with_taylor: bool = False, d: int = 2,
    # parameter reconciliation function parameters
    with_dual_lphm: bool = False,
    with_lorr: bool = False, r: int = 3,
    enable_bias: bool = False,
    # remainder function parameters
    with_residual: bool = False,
    # output processing parameters
    with_batch_norm: bool = False,
    with_relu: bool = True,
    with_softmax: bool = True,
    with_dropout: bool = False, p: float = 0.25,
    # other parameters
    parameters_init_method: str = 'xavier_normal',
    device: str = 'cpu', *args, **kwargs
):
    """
    Initialize a bilinear interdependence layer.

    Parameters
    ----------
    m : int
        The input dimension of the layer.
    n : int
        The output dimension of the layer.
    name : str, default='attention_layer'
        The name of the layer.
    batch_num : int, optional
        The number of batches for instance interdependence.
    channel_num : int, default=1
        The number of channels in the layer.
    width : int, default=1
        The number of bilinear interdependence heads in the layer.
    with_dual_lphm_interdependence : bool, default=False
        Whether to use dual LPHM interdependence.
    with_lorr_interdependence : bool, default=False
        Whether to use LORR interdependence.
    r_interdependence : int, default=3
        The rank for bilinear interdependence.
    with_taylor : bool, default=False
        Whether to use Taylor expansion for data transformation.
    d : int, default=2
        The degree of the Taylor expansion.
    with_dual_lphm : bool, default=False
        Whether to use dual LPHM reconciliation for parameters.
    with_lorr : bool, default=False
        Whether to use LORR reconciliation for parameters.
    r : int, default=3
        The rank for parameter reconciliation.
    enable_bias : bool, default=False
        Whether to enable bias in parameter reconciliation.
    with_residual : bool, default=False
        Whether to include a residual connection.
    with_batch_norm : bool, default=False
        Whether to apply batch normalization to the output.
    with_relu : bool, default=True
        Whether to apply ReLU activation to the output.
    with_softmax : bool, default=True
        Whether to apply softmax activation to the output.
    with_dropout : bool, default=False
        Whether to apply dropout to the output.
    p : float, default=0.25
        Dropout probability.
    parameters_init_method : str, default='xavier_normal'
        The initialization method for parameters.
    device : str, default='cpu'
        The device to run the layer on ('cpu' or 'cuda').

    Returns
    -------
    None
    """
    print('* bilinear_interdependence_layer, width:', width)
    heads = [
        bilinear_interdependence_head(
            m=m, n=n,
            batch_num=batch_num,
            channel_num=channel_num,
            # --------------------------
            with_dual_lphm_interdependence=with_dual_lphm_interdependence,
            with_lorr_interdependence=with_lorr_interdependence, r_interdependence=r_interdependence,
            # --------------------------
            with_taylor=with_taylor, d=d,
            # --------------------------
            with_dual_lphm=with_dual_lphm,
            with_lorr=with_lorr, r=r,
            enable_bias=enable_bias,
            # --------------------------
            with_residual=with_residual,
            # --------------------------
            with_batch_norm=with_batch_norm,
            with_relu=with_relu,
            with_softmax=with_softmax,
            with_dropout=with_dropout, p=p,
            # --------------------------
            parameters_init_method=parameters_init_method,
            device=device, *args, **kwargs
        )
    ] * width
    head_fusion = parameterized_concatenation_fusion(
        dims=[n]*width
    )
    print('--------------------------')
    super().__init__(name=name, m=m, n=n, heads=heads, head_fusion=head_fusion, device=device, *args, **kwargs)