Tutorial on Data Expansion Functions
In this tutorial, you will learn
- what a data expansion function is,
- how to perform data expansion in `tinybig`,
- the optional data processing functions used in expansions,
- and how to visualize the expansion outputs.
Much of the material in this tutorial is based on Section 5.1 of [1]; we recommend referring to that section of the paper for more detailed technical descriptions as you work through this tutorial.
References:
[1] Jiawei Zhang. RPN: Reconciled Polynomial Network. Towards Unifying PGMs, Kernel SVMs, MLP and KAN.
1. What is a Data Expansion Function?
Formally, a data expansion function is a component function used in the RPN model to expand input data vectors from the input space to a high-dimensional intermediate space:
\[\begin{equation} \kappa: R^m \to R^D, \end{equation}\]
where \(m\) and \(D\) denote the input space and expansion space dimensions, respectively.
Data expansion provides high-order signals and features about the input that cannot be directly learned from the original input space. For data instances that cannot be separated in the input space, such high-dimensional features allow the RPN model to learn better parameters for approximating the underlying distributions, making separation easier.
The currently released `tinybig` package only implements data expansion functions. Meanwhile, the upcoming versions of `tinybig` now in development will support both "data expansion functions" and "data compression functions", both of which can be referred to as data transformation functions. In many places in this tutorial, we will use the names "data transformation function" and "data expansion function" interchangeably without distinguishing between them.
2. Examples of Data Expansion Functions
In tinyBIG, several different families of data expansion functions have been implemented; their detailed information is also available on the expansion function documentation pages. In the following figure, we illustrate some examples of them, including their names, formulas, and the corresponding expansion dimension calculations. In the remaining parts of this tutorial, we will walk you through some of them to help you get familiar with their usage.
3. Taylor's Expansion
Formally, given a vector \(\mathbf{x} = [x_1, x_2, \cdots, x_m] \in R^m\) of dimension \(m\), its Taylor's expansion polynomials with orders no greater than \(d\) can be represented as follows:
\[\begin{equation} \kappa (\mathbf{x} | d) = [P_1(\mathbf{x}), P_2(\mathbf{x}), \cdots, P_d(\mathbf{x}) ] \in R^D, \end{equation}\]
where the output dimension \(D = \sum_{i=1}^d m^i\).
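For example, with input dimension \(m = 3\) and order \(d = 2\), the expansion dimension is \(D = 3 + 3^2 = 12\), which matches the 12-dimensional expansion outputs shown later in this tutorial.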
In the above Taylor's expansion, the notation \(P_d(\mathbf{x})\) represents the list of polynomials composed of products of the vector elements \(x_1\), \(x_2\), \(\cdots\), \(x_m\) whose degrees sum to \(d\), i.e.,
\[\begin{equation} P_d(\mathbf{x}) = [x_1^{d_1} x_2^{d_2} \cdots x_m^{d_m}]_{d_1, d_2, \cdots, d_m \in \{0, 1, \cdots, d\} \land \sum_{i=1}^m d_i = d}. \end{equation}\]
Some examples of the multivariate polynomials are provided as follows:
\[\begin{equation} \begin{aligned} P_0(\mathbf{x}) &= [1] \in R^{1},\\ P_1(\mathbf{x}) &= [x_1, x_2, \cdots, x_m] \in R^{m},\\ P_2(\mathbf{x}) &= [x_1^2, x_1 x_2, x_1 x_3, \cdots, x_1 x_m, x_2 x_1, x_2^2, x_2 x_3, \cdots, x_m^2] \in R^{m^2}. \end{aligned} \end{equation}\]
3.1 Taylor's Expansion Function
Taylor's expansion has been implemented in `tinybig`, and the function can be defined and applied to inputs as follows.
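The snippet below is a minimal sketch of this usage. It assumes the `taylor_expansion` class from `tinybig.expansion`, with an order parameter `d` and a `calculate_D` method as used in the Quickstart tutorial; exact names may vary across package versions.

```python
import torch
from tinybig.expansion import taylor_expansion

# define a Taylor's expansion function with maximum order d = 2
exp_func = taylor_expansion(name='taylor_expansion', d=2)

# a batch with one input vector, i.e., B = 1 and m = 3
x = torch.Tensor([[1.0, 2.0, 3.0]])

# expansion dimension: D = m + m^2 = 3 + 9 = 12
D = exp_func.calculate_D(m=x.size(1))

# apply the expansion; the output shape should be (1, D)
kappa_x = exp_func(x)
print('D:', D)
print(kappa_x)
# expected: tensor([[1., 2., 3., 1., 2., 3., 2., 4., 6., 3., 6., 9.]])
```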
As noted in the earlier Quickstart tutorial, the current expansion functions in `tinybig` only accept 2D tensors of shape \((B, m)\) as inputs, where \(B\) denotes the batch size and \(m\) denotes the input dimension.
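For a single 1D vector, you can add a batch dimension first; a quick sketch reusing `exp_func` from above:

```python
x_1d = torch.Tensor([1.0, 2.0, 3.0])  # shape (3,)
x_2d = x_1d.unsqueeze(0)              # shape (1, 3), i.e., (B, m) with B = 1
kappa_x = exp_func(x_2d)
```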
3.2 Taylor's Expansion Function Instantiation from Configs
Besides the manual definition of the expansion functions, `tinybig` also allows function instantiation from configurations. For instance, the data expansion function defined in the previous subsection can also be instantiated from its configs `data_transformation_configs` as follows.
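As a sketch of the mechanism, the snippet below builds the config dictionary in the same format as the one printed later in this tutorial and resolves the class path with plain `importlib`; the `instantiate_from_configs` helper is hypothetical, and `tinybig` ships its own config utilities whose names may differ.

```python
import importlib
import torch

# config dict in the same format as the one printed later in this tutorial
data_transformation_configs = {
    'data_transformation_class': 'tinybig.expansion.taylor_expansion',
    'data_transformation_parameters': {
        'name': 'taylor_expansion',
        'd': 2,
    },
}

def instantiate_from_configs(configs, class_key, parameter_key):
    # hypothetical helper: import the module from the class path and
    # instantiate the class with the given parameters
    module_path, class_name = configs[class_key].rsplit('.', 1)
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**configs.get(parameter_key, {}))

exp_func = instantiate_from_configs(
    data_transformation_configs,
    class_key='data_transformation_class',
    parameter_key='data_transformation_parameters',
)
print(exp_func(torch.Tensor([[1.0, 2.0, 3.0]])))
```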
In the following tutorials, to keep the code cleaner, we will simply use configuration files to instantiate the other functions, modules, and models implemented in the tinyBIG toolkit.
4. Optional Processing Functions for Expansions
Besides performing the expansions, `tinybig` also allows the data expansion functions to apply optional pre- and post-processing functions to the inputs and outputs of the expansion functions, respectively. These optional pre- and post-processing functions provide RPN and tinyBIG with great flexibility in model design and implementation. In this part, we will illustrate how to add these processing functions to the data expansion functions.
4.1 Pre-processing Functions
The pre-processing functions used in data expansion can be quite diverse, including various activation and normalization functions. You can also define your own customized pre-processing functions and use them for data expansion.
Below, we provide an example that adds `layer-norm` as a pre-processing function to the Taylor's expansion function.
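A minimal sketch, assuming the expansion classes accept a `preprocess_functions` argument (mirroring the `postprocess_function_configs` key in the config format shown later in this tutorial); the exact argument name may differ:

```python
import torch
from tinybig.expansion import taylor_expansion

# layer-norm over the input vector: normalized_shape equals the input dimension m = 3
exp_func = taylor_expansion(
    name='taylor_expansion_with_preprocessing', d=2,
    preprocess_functions=[torch.nn.LayerNorm(normalized_shape=3)],
)

x = torch.Tensor([[1.0, 2.0, 3.0]])
print(x, exp_func(x))
```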
What's more, `tinybig` allows you to add multiple pre-processing functions to the data expansion function definition. Below, we show the Taylor's expansion with both sigmoid and layer-norm as pre-processing functions.
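Multiple pre-processing functions can be passed as a list and are applied in order, here sigmoid first and then layer-norm (same assumptions as in the previous sketch):

```python
exp_func = taylor_expansion(
    name='taylor_expansion_with_preprocessing', d=2,
    preprocess_functions=[
        torch.nn.Sigmoid(),
        torch.nn.LayerNorm(normalized_shape=3),
    ],
)
print(exp_func(torch.Tensor([[1.0, 2.0, 3.0]])))
```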
4.2 Post-processing Functions
Below, we will define the Taylor's expansion function with post-processing functions. Slightly differently from the manual function definitions above, we will define the function configuration in a separate file and load it with `tinybig` for the expansion function instantiation.
Please save the following `expansion_function_postprocessing.yaml` file to a directory `./configs/` that your code can access.
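The file content below is reconstructed from the configuration dictionary printed in the output further down, so the keys mirror that dictionary exactly:

```yaml
data_transformation_configs:
  data_transformation_class: tinybig.expansion.taylor_expansion
  data_transformation_parameters:
    name: taylor_expansion_with_preprocessing
    d: 2
    postprocess_function_configs:
      - function_class: torch.nn.Sigmoid
      - function_class: torch.nn.LayerNorm
        function_parameters:
          normalized_shape: 12
```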
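To load the file and instantiate the function, here is a sketch using the standard `yaml` package together with the hypothetical `instantiate_from_configs` helper defined earlier (the toolkit's own config loader may differ):

```python
import torch
import yaml

with open('./configs/expansion_function_postprocessing.yaml', 'r') as f:
    configs = yaml.safe_load(f)
print(configs)

exp_func = instantiate_from_configs(
    configs['data_transformation_configs'],
    class_key='data_transformation_class',
    parameter_key='data_transformation_parameters',
)

x = torch.Tensor([[1.0, 2.0, 3.0]])
print(x, exp_func(x))
```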
```
{'data_transformation_configs': {'data_transformation_class': 'tinybig.expansion.taylor_expansion', 'data_transformation_parameters': {'name': 'taylor_expansion_with_preprocessing', 'd': 2, 'postprocess_function_configs': [{'function_class': 'torch.nn.Sigmoid'}, {'function_class': 'torch.nn.LayerNorm', 'function_parameters': {'normalized_shape': 12}}]}}}
tensor([[1., 2., 3.]]) tensor([[-1.9707, -0.3362,  0.4473, -1.9707, -0.3362,  0.4473, -0.3362,  0.7686,
          0.9380,  0.4473,  0.9380,  0.9636]], grad_fn=<NativeLayerNormBackward0>)
```
Careful readers may have already noticed that the `normalized_shape` parameters of the `layer-norm` functions in the pre-processing and post-processing function lists are different, since they are applied to the input and output vectors of the expansion functions, respectively. Also, as the parameter `d` of the Taylor's expansion function changes, the `normalized_shape` of the `layer-norm` post-processing function needs to be adjusted accordingly.
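As a quick check of the required shapes (a self-contained helper, not part of `tinybig`):

```python
# expansion dimension for input dimension m and order d: D = sum_{i=1}^{d} m^i
def expansion_dim(m: int, d: int) -> int:
    return sum(m ** i for i in range(1, d + 1))

print(expansion_dim(3, 2))  # 12 -> post-processing LayerNorm(normalized_shape=12)
print(expansion_dim(3, 3))  # 39 -> post-processing LayerNorm(normalized_shape=39)
```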
`torch.nn.LayerNorm` vs. `torch.nn.functional.layer_norm` vs. the function string name `'layer_norm'`
The current `tinybig` allows you to define these functions in different forms, such as `torch.nn.LayerNorm`, `torch.nn.functional.layer_norm`, or even just the function string name `'layer_norm'` (which is used in the previous Quickstart tutorial). All the pre- and post-processing functions (as well as the output-processing and activation functions to be introduced later) are handled by `tinybig.util.process_function_list`, `tinybig.util.func_x`, and `tinybig.util.str_func_x`.
We recommend defining these functions as objects, like `torch.nn.LayerNorm` together with its parameters as shown above; these are first instantiated by `tinybig.util.process_function_list` into callable objects and then executed by `tinybig.util.func_x`, without entering `tinybig.util.str_func_x`. The current `tinybig.util.str_func_x` can only handle the string names of a few frequently used functions, and it may fail for functions whose names or classes have not been recorded yet.
5. Expansion Visualization
We have defined the Taylor's expansion function above and introduced how to add different pre- and post-processing functions to process the inputs and outputs of the expansions. Below, we will illustrate the obtained expansion results on real-world MNIST image data and visualize them, so that we can compare the data vectors before and after the expansion.
We first define an image display function with `matplotlib` (please install `matplotlib` before running the following code).
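A minimal sketch of such a function; the name `show_image` and its signature are assumptions reconstructed from how it is used below:

```python
import matplotlib.pyplot as plt

def show_image(image, height: int = 28, width: int = 28):
    # reshape the flattened image vector into a 2D grid and display it in grayscale
    plt.figure(figsize=(4, 4))
    plt.imshow(image.detach().reshape(height, width).numpy(), cmap='gray')
    plt.axis('off')
    plt.show()
```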
As introduced in the previous Quickstart tutorial, `tinybig` has a built-in class for loading the MNIST dataset (with flattening and normalization applied).
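A sketch of the loading step, assuming the `mnist` data class from `tinybig.data` with a `load` method as used in the Quickstart tutorial; the parameter names and the returned loader keys are assumptions:

```python
from tinybig.data import mnist

mnist_data = mnist(name='mnist', train_batch_size=64, test_batch_size=64)
mnist_loaders = mnist_data.load(cache_dir='./data/')

# take one batch and keep the first flattened image vector, shape (784,)
x, y = next(iter(mnist_loaders['train_loader']))
raw_image = x[0]
```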
By feeding the image data `x` to the `taylor_expansion` function, we can obtain the Taylor's expansion results.
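Continuing the sketch with the names introduced above; keeping only the order-2 part of the expansion (so that it can be viewed as a \(784 \times 784\) image) is an assumption made for the visualization below:

```python
from tinybig.expansion import taylor_expansion

exp_func = taylor_expansion(name='taylor_expansion', d=2)

# expand the flattened 784-dimensional image
expansion = exp_func(raw_image.unsqueeze(0))

# drop the order-1 part (first 784 entries); the remaining 784 * 784 entries
# form the expansion image visualized below
expansion_image = expansion[0, 784:]
```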
By feeding the image data `raw_image` and `expansion_image` to the `show_image` function, we can display the images before and after the expansion as follows.
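Sketched with the `show_image` function defined above:

```python
show_image(raw_image)                               # the 28 x 28 raw image
show_image(expansion_image, height=784, width=784)  # the 784 x 784 expansion image
```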
(Figure: MNIST raw image display.)
(Figure: MNIST expansion image display.)
Compared with the raw image, the above expansion image visualization is less readable, which makes it hard to interpret the expansion results.
Below, we will use the `reshape_expansion` function to re-organize the expansion image of size \(784 \times 784\) into a \(28 \times 28\) grid of small images, where each small image has a size of \(28 \times 28\).
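A sketch of such a function; the name `reshape_expansion` comes from the text above, while the implementation here is an assumption:

```python
def reshape_expansion(expansion_image, h: int = 28, w: int = 28):
    # view the (784 * 784,) order-2 expansion as a (784, 784) matrix:
    # row i holds the products of pixel i with every pixel of the raw image,
    # i.e., one 28 x 28 sub-image per pixel
    m = h * w
    matrix = expansion_image.reshape(m, m)
    # tile the m sub-images (each h x w) into an h x w grid of sub-images
    grid = matrix.reshape(h, w, h, w).permute(0, 2, 1, 3).reshape(h * h, w * w)
    return grid
```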
With the above reshape function, we can process and display the expansion image as follows:
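Sketched with the helpers defined above:

```python
reshaped_expansion_image = reshape_expansion(expansion_image)
show_image(reshaped_expansion_image, height=784, width=784)
```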
(Figure: MNIST reshaped expansion image display.)
The above visualization illustrates the expansion effect of each pixel in the image on the whole raw image: there are \(28 \times 28\) sub-images in the expansion results, one per pixel. These high-order expansions actually provide some important features about the input data; we will discuss this further in the following tutorials on the RPN model and parameter reconciliation.
6. Conclusion
In this tutorial, we discussed the data expansion functions in the tinyBIG toolkit. We introduced different ways to define the expansion functions, including both manual function definition and configuration-file-based function instantiation. Moreover, we introduced the optional pre- and post-processing functions that can be used in the data expansion functions for input and output processing. Finally, we visualized the expansion results on the MNIST image data, which also helps interpret the expansion functions and their performance.