citeseer

Bases: graph_dataloader

A dataloader class for the Citeseer dataset.

This class extends graph_dataloader to handle the Citeseer citation-network dataset, a standard benchmark for graph-based machine learning tasks such as node classification.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `data_profile` | dict | Metadata and download links specific to the Citeseer dataset. |
| `graph` | graph | The loaded graph structure for the Citeseer dataset. |

Methods:

| Name | Description |
| --- | --- |
| `__init__` | Initializes the dataloader for the Citeseer dataset. |
| `get_train_test_idx` | Generates train and test indices for the Citeseer dataset. |

Source code in tinybig/data/graph_dataloader.py
class citeseer(graph_dataloader):
    """
    A dataloader class for the Citeseer dataset.

    This class extends `graph_dataloader` to handle the Citeseer graph dataset,
    which is another benchmark dataset for graph-based machine learning tasks.

    Attributes
    ----------
    data_profile: dict
        Metadata and download links specific to the Citeseer dataset.
    graph: graph_class
        The loaded graph structure for the Citeseer dataset.

    Methods
    -------
    __init__(name: str = 'citeseer', train_batch_size: int = 64, test_batch_size: int = 64, ...)
        Initializes the dataloader for the Citeseer dataset.
    get_train_test_idx(X: torch.Tensor = None, y: torch.Tensor = None, ...)
        Generates train and test indices for the Citeseer dataset.
    """
    def __init__(self, name: str = 'citeseer', train_batch_size: int = 64, test_batch_size: int = 64, *args, **kwargs):
        """
        Initializes the dataloader for the Citeseer dataset.

        Parameters
        ----------
        name: str, default = 'citeseer'
            The name of the dataset.
        train_batch_size: int, default = 64
            Batch size for the training dataset.
        test_batch_size: int, default = 64
            Batch size for the testing dataset.

        Returns
        -------
        None
        """
        super().__init__(data_profile=CITESEER_DATA_PROFILE, name=name, train_batch_size=train_batch_size, test_batch_size=test_batch_size)

    def get_train_test_idx(self, X: torch.Tensor = None, y: torch.Tensor = None, *args, **kwargs):
        """
        Generates train and test indices for the Citeseer dataset.

        Parameters
        ----------
        X: torch.Tensor, optional
            Node features (not used in this method).
        y: torch.Tensor, optional
            Labels (not used in this method).

        Returns
        -------
        tuple
            Train indices (`torch.LongTensor`) and test indices (`torch.LongTensor`).

        Notes
        -----
        The train indices are predefined as the first 120 nodes.
        The test indices are predefined as nodes 200 to 1199.
        """
        train_idx = torch.LongTensor(range(120))
        test_idx = torch.LongTensor(range(200, 1200))
        return train_idx, test_idx
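A fixed index split like the one returned by `get_train_test_idx` is typically used to restrict a loss or metric to the selected nodes. The following is a minimal sketch of that idea in plain Python, so it runs without `torch` or `tinybig`. The helper `accuracy_on` and the toy labels and predictions are hypothetical illustrations, not part of the library.

```python
def accuracy_on(indices, predictions, labels):
    """Fraction of correct predictions among the nodes listed in `indices`."""
    if not indices:
        raise ValueError("indices must be non-empty")
    correct = sum(1 for i in indices if predictions[i] == labels[i])
    return correct / len(indices)


# Same ranges as get_train_test_idx, as plain lists instead of LongTensors.
train_idx = list(range(120))
test_idx = list(range(200, 1200))

# Toy stand-in data (assumption): 1200 nodes with made-up labels, and
# predictions that are deliberately wrong on every fourth node.
num_nodes = 1200
labels = [i % 6 for i in range(num_nodes)]
predictions = [
    (label + 1) % 6 if i % 4 == 0 else label
    for i, label in enumerate(labels)
]

# Each metric only looks at the nodes selected by its index list.
train_acc = accuracy_on(train_idx, predictions, labels)
test_acc = accuracy_on(test_idx, predictions, labels)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

With tensors, the same selection would usually be written as an index expression such as `y[test_idx]`; the list-based version above only mirrors that behavior.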

__init__(name='citeseer', train_batch_size=64, test_batch_size=64, *args, **kwargs)

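Initializes the dataloader for the Citeseer dataset. As the source listing for this method shows, extra positional and keyword arguments (*args, **kwargs) are accepted but not forwarded to the graph_dataloader constructor.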

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `name` | str | The name of the dataset. | 'citeseer' |
| `train_batch_size` | int | Batch size for the training dataset. | 64 |
| `test_batch_size` | int | Batch size for the testing dataset. | 64 |

Returns:

| Type | Description |
| --- | --- |
| None | |

Source code in tinybig/data/graph_dataloader.py
def __init__(self, name: str = 'citeseer', train_batch_size: int = 64, test_batch_size: int = 64, *args, **kwargs):
    """
    Initializes the dataloader for the Citeseer dataset.

    Parameters
    ----------
    name: str, default = 'citeseer'
        The name of the dataset.
    train_batch_size: int, default = 64
        Batch size for the training dataset.
    test_batch_size: int, default = 64
        Batch size for the testing dataset.

    Returns
    -------
    None
    """
    super().__init__(data_profile=CITESEER_DATA_PROFILE, name=name, train_batch_size=train_batch_size, test_batch_size=test_batch_size)

get_train_test_idx(X=None, y=None, *args, **kwargs)

Generates train and test indices for the Citeseer dataset.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | Tensor | Node features (not used in this method). | None |
| `y` | Tensor | Labels (not used in this method). | None |

Returns:

| Type | Description |
| --- | --- |
| tuple | Train indices (torch.LongTensor) and test indices (torch.LongTensor). |

Notes

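The train indices are fixed as the first 120 nodes (indices 0 to 119). The test indices are fixed as nodes 200 to 1199 (1,000 nodes). Nodes 120 to 199 belong to neither split, and the arguments X and y do not affect the result.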
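The arithmetic behind this split can be checked with a short sketch. It mirrors the two ranges with plain Python lists rather than `torch.LongTensor`, so it runs without `torch`; the names `unused_gap` and `split_summary` are introduced here for illustration only.

```python
# Plain-Python mirror of the ranges used by get_train_test_idx.
train_idx = list(range(120))        # nodes 0..119
test_idx = list(range(200, 1200))   # nodes 200..1199

# Nodes between the end of the train range and the start of the test range
# are assigned to neither split.
unused_gap = list(range(train_idx[-1] + 1, test_idx[0]))

split_summary = {
    "train_size": len(train_idx),
    "test_size": len(test_idx),
    "gap_size": len(unused_gap),
    "overlap": len(set(train_idx) & set(test_idx)),
}
print(split_summary)
# {'train_size': 120, 'test_size': 1000, 'gap_size': 80, 'overlap': 0}
```

Because the ranges are hard-coded, the split is deterministic and does not depend on any random seed.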

Source code in tinybig/data/graph_dataloader.py
def get_train_test_idx(self, X: torch.Tensor = None, y: torch.Tensor = None, *args, **kwargs):
    """
    Generates train and test indices for the Citeseer dataset.

    Parameters
    ----------
    X: torch.Tensor, optional
        Node features (not used in this method).
    y: torch.Tensor, optional
        Labels (not used in this method).

    Returns
    -------
    tuple
        Train indices (`torch.LongTensor`) and test indices (`torch.LongTensor`).

    Notes
    -----
    The train indices are predefined as the first 120 nodes.
    The test indices are predefined as nodes 200 to 1199.
    """
    train_idx = torch.LongTensor(range(120))
    test_idx = torch.LongTensor(range(200, 1200))
    return train_idx, test_idx