Core aspects of building a `SAINT` model

Helper Functions

exists[source]

exists(val)

Tests if val is not None

Function Arguments:

  • val: Any value

default[source]

default(val, d)

Returns val if it exists, otherwise the default d

Function Arguments:

  • val: Any value
  • d: Some default for val
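
As a quick illustration, here is a minimal sketch of how these two helpers are typically written and used together (the bodies are assumptions based on the descriptions above, not the library's exact source):

def exists(val):
    # True when `val` is not None
    return val is not None

def default(val, d):
    # Fall back to the default `d` when `val` is None
    return val if exists(val) else d

dropout = default(None, 0.0)   # -> 0.0
assert exists(dropout)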

ff_encodings[source]

ff_encodings(x:tensor, B:tensor)

Returns the sine and cosine projections of x @ B

Function Arguments:

  • x (torch.tensor): Input
  • B (torch.tensor): Projection matrix
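
The description matches a standard Fourier-feature encoding; here is a hedged sketch of that idea. The 2*pi scaling and the exact shapes are assumptions, not taken from the library's source:

import math
import torch

def ff_encodings_sketch(x, B):
    # Project the input with the (random) projection matrix B, then
    # return the concatenated sine and cosine of the projection.
    x_proj = (2.0 * math.pi * x) @ B.t()
    return torch.cat([torch.sin(x_proj), torch.cos(x_proj)], dim=-1)

x = torch.randn(8, 16)    # (batch, features)
B = torch.randn(32, 16)   # (num_frequencies, features)
print(ff_encodings_sketch(x, B).shape)   # torch.Size([8, 64])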

Basic Layers

class Residual[source]

Residual(fn:callable) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • fn (callable): A function to generate a residual

Residual.forward[source]

Residual.forward(x:tensor, **kwargs)

Applies self.fn to x and adds x back to the result (the residual connection)

Function Arguments:

  • x (torch.tensor): An input
  • **kwargs: kwargs for self.fn
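
A minimal sketch of a residual wrapper consistent with the description above; the real class may differ in details:

import torch
from torch import nn

class ResidualSketch(nn.Module):
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x, **kwargs):
        # Apply the wrapped function, then add the input back (the residual connection)
        return self.fn(x, **kwargs) + x

block = ResidualSketch(nn.Linear(32, 32))
print(block(torch.randn(4, 32)).shape)   # torch.Size([4, 32])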

class PreNorm[source]

PreNorm(dim:int, fn:callable) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • dim (int): LayerNorm dimension
  • fn (callable): Residual function

PreNorm.forward[source]

PreNorm.forward(x:tensor, **kwargs)

Applies self.fn to the output of self.norm(x)

Function Arguments:

  • x (torch.tensor): An input
  • **kwargs: kwargs for self.fn
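
A minimal sketch of a pre-normalization wrapper matching the description above (assumed implementation):

import torch
from torch import nn

class PreNormSketch(nn.Module):
    def __init__(self, dim, fn):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.fn = fn

    def forward(self, x, **kwargs):
        # Normalize first, then apply the wrapped function
        return self.fn(self.norm(x), **kwargs)

layer = PreNormSketch(32, nn.Linear(32, 32))
print(layer(torch.randn(4, 10, 32)).shape)   # torch.Size([4, 10, 32])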

Attention Layers

class GEGLU[source]

GEGLU() :: Module

GLU variation introduced in GLU Variants Improve Transformer

GEGLU.forward[source]

GEGLU.forward(x:tensor)

Chunks x into two halves along the last dimension, applies F.gelu to the gate half, and returns its product with the other half

Function Arguments:

  • x (torch.tensor): An input
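
A sketch of the GEGLU gating described above; the split along the last dimension is an assumption consistent with the usual GEGLU formulation:

import torch
from torch import nn
import torch.nn.functional as F

class GEGLUSketch(nn.Module):
    def forward(self, x):
        # Split the last dimension into a value half and a gate half,
        # then gate the values with GELU(gates)
        x, gates = x.chunk(2, dim=-1)
        return x * F.gelu(gates)

out = GEGLUSketch()(torch.randn(4, 10, 64))
print(out.shape)   # torch.Size([4, 10, 32]) -- the last dimension is halved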

class FeedForward[source]

FeedForward(dim:int, mult:int=4, dropout:(int, float)=0.0) :: Sequential

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.

To make it easier to understand, here is a small example:

# Example of using Sequential
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

Function Arguments:

  • dim (int): Linear layer dimension
  • mult (int): Multiplier for dim
  • dropout (int, float): Dropout probability

FeedForward.forward[source]

FeedForward.forward(x:tensor)

Passes x through a Linear -> GEGLU -> Dropout -> Linear block

Function Arguments:

  • x (torch.tensor): An input
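
A hedged sketch of such a block; the doubled inner width (so the GEGLU has something to split) is an assumption that matches the chunking described for GEGLU:

import torch
from torch import nn
import torch.nn.functional as F

class _GEGLU(nn.Module):
    def forward(self, x):
        x, gates = x.chunk(2, dim=-1)
        return x * F.gelu(gates)

def feed_forward_sketch(dim, mult=4, dropout=0.0):
    # Widen to dim * mult (doubled so the GEGLU can split it), gate, drop, project back
    return nn.Sequential(
        nn.Linear(dim, dim * mult * 2),
        _GEGLU(),
        nn.Dropout(dropout),
        nn.Linear(dim * mult, dim),
    )

ff = feed_forward_sketch(dim=32)
print(ff(torch.randn(4, 10, 32)).shape)   # torch.Size([4, 10, 32])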

class Attention[source]

Attention(dim:int, heads:int=8, dim_head:int=16, dropout:float=0.0) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • dim (int): Dimension for the Linear groups
  • heads (int): Number of attention heads
  • dim_head (int): Dimension of the attention heads
  • dropout (float): Dropout probability

Attention.forward[source]

Attention.forward(x:tensor)

Applies attention to x

Function Arguments:

  • x (torch.tensor): An input
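
A hedged sketch of a standard scaled-dot-product multi-head attention block with the same constructor arguments; it uses einops and assumes a (batch, tokens, dim) input layout, and is not the library's exact implementation:

import torch
from torch import nn
from einops import rearrange

class AttentionSketch(nn.Module):
    # Minimal multi-head self-attention in the same spirit as the class above
    def __init__(self, dim, heads=8, dim_head=16, dropout=0.0):
        super().__init__()
        inner = heads * dim_head
        self.heads, self.scale = heads, dim_head ** -0.5
        self.to_qkv = nn.Linear(dim, inner * 3, bias=False)
        self.to_out = nn.Linear(inner, dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        q, k, v = (rearrange(t, 'b n (h d) -> b h n d', h=self.heads) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = rearrange(self.dropout(attn) @ v, 'b h n d -> b n (h d)')
        return self.to_out(out)

x = torch.randn(4, 10, 32)                 # assumed layout: (batch, tokens, dim)
print(AttentionSketch(dim=32)(x).shape)    # torch.Size([4, 10, 32])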

class AttentionType[source]

AttentionType(*args, **kwargs)

All possible attention types, with typo-proofing and auto-complete

class RowColAttention[source]

RowColAttention(num_tokens:int, dim:int, nfeats:int, depth:int, heads:int, dim_head:int=16, attn_dropout:float=0.0, ff_dropout:float=0.0, style:AttentionType='col') :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • num_tokens (int): Size of the categorical embeddings
  • dim (int): Dimension of the two Embedding layers
  • nfeats (int): Number of continuous features
  • depth (int): The number of attention modules to generate
  • heads (int): Number of attention heads
  • dim_head (int): Dimension of the attention heads
  • attn_dropout (float): Dropout probability in the attention module
  • ff_dropout (float): Dropout probability for the feed forward layers
  • style (AttentionType): Attention style

RowColAttention.forward[source]

RowColAttention.forward(x, x_cont=None, mask=None)

Applies an attention mechanism on inputs

Function Arguments:

  • x: Categorical inputs
  • x_cont: Continuous inputs
  • mask: Optional attention mask
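
A construction-only usage sketch; the argument values are illustrative, the class is assumed to already be imported, and the forward-pass input shapes are not shown because they depend on how the inputs are embedded upstream:

# Assumes RowColAttention is already imported from the library; values are illustrative.
attn = RowColAttention(num_tokens=100, dim=32, nfeats=12, depth=6,
                       heads=8, dim_head=16, attn_dropout=0.1,
                       ff_dropout=0.1, style='col')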

class Transformer[source]

Transformer(num_tokens:int, dim:int, depth:int, heads:int, dim_head:int=16, attn_dropout:float=0.0, ff_dropout:float=0.0) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • num_tokens (int): Size of the categorical embeddings in the Attention layer
  • dim (int): Dimension of the two Embedding layers in the Attention layer
  • depth (int): The number of attention modules to generate in the Attention layer
  • heads (int): Number of attention heads in the Attention layer
  • dim_head (int): Dimension of the attention heads in the Attention layer
  • attn_dropout (float): Dropout probability in the Attention layer
  • ff_dropout (float): Dropout probability for the feed forward layers

Transformer.forward[source]

Transformer.forward(x, x_cont=None)

Applies attention to inputs

Function Arguments:

  • x: Categorical inputs
  • x_cont: Continuous inputs
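
A hedged architectural sketch that composes the pieces sketched above (ResidualSketch, PreNormSketch, AttentionSketch, and feed_forward_sketch from the earlier blocks). It assumes the forward pass receives already-embedded tokens of shape (batch, n, dim) and concatenates the continuous tokens along the sequence axis; the documented num_tokens embedding is omitted for that reason. Verify against the source before relying on these details:

import torch
from torch import nn

class TransformerSketch(nn.Module):
    def __init__(self, dim, depth, heads, dim_head=16, attn_dropout=0.0, ff_dropout=0.0):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.ModuleList([
                ResidualSketch(PreNormSketch(dim, AttentionSketch(dim, heads, dim_head, attn_dropout))),
                ResidualSketch(PreNormSketch(dim, feed_forward_sketch(dim, dropout=ff_dropout))),
            ]) for _ in range(depth)
        ])

    def forward(self, x, x_cont=None):
        # Assumed: continuous-feature tokens are concatenated with the
        # categorical tokens along the sequence axis before attention.
        if x_cont is not None:
            x = torch.cat((x, x_cont), dim=1)
        for attn, ff in self.layers:
            x = attn(x)
            x = ff(x)
        return x

x_categ = torch.randn(4, 10, 32)   # assumed: embedded categorical tokens (batch, n_cat, dim)
x_cont  = torch.randn(4, 5, 32)    # assumed: embedded continuous tokens (batch, n_cont, dim)
print(TransformerSketch(dim=32, depth=2, heads=8)(x_categ, x_cont).shape)   # torch.Size([4, 15, 32])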

class MLP[source]

MLP(dims:list, act=None) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • dims (list): A list of dimensions for the module
  • act: Optional activation function

MLP.forward[source]

MLP.forward(x:tensor)

Applies the MLP to x

Function Arguments:

  • x (torch.tensor): An input
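
A hedged sketch of an MLP built from a list of layer sizes with an optional activation between the hidden layers; the exact activation placement is an assumption:

import torch
from torch import nn

class MLPSketch(nn.Module):
    def __init__(self, dims, act=None):
        super().__init__()
        layers = []
        for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
            layers.append(nn.Linear(d_in, d_out))
            # Activation between hidden layers only (assumed); none after the output layer
            if act is not None and i < len(dims) - 2:
                layers.append(act)
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):
        return self.mlp(x)

mlp = MLPSketch([64, 32, 1], act=nn.ReLU())
print(mlp(torch.randn(8, 64)).shape)   # torch.Size([8, 1])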

class SimpleMLP[source]

SimpleMLP(dims:list) :: Sequential

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.

To make it easier to understand, here is a small example:

# Example of using Sequential
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

Function Arguments:

  • dims (list): A list of three dimensions for our MLP module

SimpleMLP.forward[source]

SimpleMLP.forward(x:tensor)

Applies the simplified MLP to x

Function Arguments:

  • x (torch.tensor): A tensor input
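
A sketch of the three-layer variant implied by the three-element dims list; the ReLU in the middle is an assumption:

import torch
from torch import nn

def simple_mlp_sketch(dims):
    # dims is assumed to be [input_dim, hidden_dim, output_dim]
    return nn.Sequential(
        nn.Linear(dims[0], dims[1]),
        nn.ReLU(),
        nn.Linear(dims[1], dims[2]),
    )

net = simple_mlp_sketch([64, 32, 1])
print(net(torch.randn(8, 64)).shape)   # torch.Size([8, 1])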

class TabAttention[source]

TabAttention(categories:list, num_continuous:int, dim:int, depth:int, heads:int, dim_head:int=16, dim_out:int=1, mlp_hidden_mults:Tuple[int]=(4, 2), mlp_act=None, num_special_tokens:int=1, attn_dropout:float=0.0, ff_dropout:float=0.0, lastmlp_dropout:float=0.0, cont_embeddings:str='MLP', attention_style:AttentionType='col') :: Module

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call .to(), etc.

training (bool): Whether this module is in training or evaluation mode.

Function Arguments:

  • categories (list): List of categorical cardinalities
  • num_continuous (int): Number of continuous variables
  • dim (int): MLP dimension
  • depth (int): Transformer depth
  • heads (int): Number of attention heads
  • dim_head (int): Size of the attention head
  • dim_out (int): Size of the last linear layer
  • mlp_hidden_mults (Tuple[int]): Multipliers for the MLP hidden layers
  • mlp_act: Optional activation for the MLP
  • num_special_tokens (int): Number of special tokens for the categories
  • attn_dropout (float): Dropout probability for the attention module
  • ff_dropout (float): Dropout probability for the feed forward layers
  • lastmlp_dropout (float): Dropout probability for the final MLP group
  • cont_embeddings (str): Type of embeddings for the continuous variables; only 'MLP' is available. If None, no attention is applied to the continuous features
  • attention_style (AttentionType): Attention style

TabAttention.forward[source]

TabAttention.forward(x_categ:tensor, x_cont:tensor, x_categ_enc:tensor, x_cont_enc:tensor)

Feeds the inputs through the tabular attention model

Function Arguments:

  • x_categ (torch.tensor): Categorical inputs
  • x_cont (torch.tensor): Continuous inputs
  • x_categ_enc (torch.tensor): Encoded categorical inputs via embed_data_mask
  • x_cont_enc (torch.tensor): Encoded continuous inputs via embed_data_mask
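
Finally, a construction-only usage sketch for TabAttention. The argument values are illustrative and the class is assumed to be imported already; the forward pass additionally needs the encoded inputs produced by embed_data_mask, so that step is not shown here:

import torch
# Assumes TabAttention is already imported from the library.

model = TabAttention(
    categories=[5, 10, 3],     # cardinality of each categorical column (illustrative)
    num_continuous=4,          # number of continuous columns
    dim=32,
    depth=6,
    heads=8,
    dim_head=16,
    dim_out=1,                 # e.g. a single regression target or binary logit
    mlp_hidden_mults=(4, 2),
    attn_dropout=0.1,
    ff_dropout=0.1,
    cont_embeddings='MLP',
    attention_style='col',
)

# The forward pass expects both the raw categorical/continuous tensors and their
# encoded versions from embed_data_mask, so the encoding step is not shown here.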