Core aspects of building a `SAINT` model

Helper Functions

exists[source]

exists(val)

Tests if val is not None

Function Arguments:

  • val: Any value

default[source]

default(val, d)

Returns val if it exists, otherwise the default d

Function Arguments:

  • val: Any value
  • d: Some default for val
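
As a quick illustration, here is a minimal sketch of how these two helpers are typically written and used together (the bodies are assumptions based on the descriptions above, not the library's exact source):

def exists(val):
    # True when `val` is not None
    return val is not None

def default(val, d):
    # Fall back to the default `d` when `val` is None
    return val if exists(val) else d

dropout = default(None, 0.0)   # -> 0.0
assert exists(dropout)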

ff_encodings[source]

ff_encodings(x:tensor, B:tensor)

Returns the sine and cosine projections of x @ B

Function Arguments:

  • x (torch.tensor): Input
  • B (torch.tensor): Projection matrix
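
The description matches a standard Fourier-feature encoding; here is a hedged sketch of that idea. The 2*pi scaling and the exact shapes are assumptions, not taken from the library's source:

import math
import torch

def ff_encodings_sketch(x, B):
    # Project the input with the (random) projection matrix B, then
    # return the concatenated sine and cosine of the projection.
    x_proj = (2.0 * math.pi * x) @ B.t()
    return torch.cat([torch.sin(x_proj), torch.cos(x_proj)], dim=-1)

x = torch.randn(8, 16)    # (batch, features)
B = torch.randn(32, 16)   # (num_frequencies, features)
print(ff_encodings_sketch(x, B).shape)   # torch.Size([8, 64])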

Basic Layers

class Residual[source]

Residual(fn:callable) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • fn (callable): A function to generate a residual

Residual.forward[source]

Residual.forward(x:tensor, **kwargs)

Applies self.fn to x and adds x back to the result (the residual connection)

Function Arguments:

  • x (torch.tensor): An input
  • **kwargs: kwargs for self.fn
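
A minimal sketch of a residual wrapper consistent with the description above; the real class may differ in details:

import torch
from torch import nn

class ResidualSketch(nn.Module):
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x, **kwargs):
        # Apply the wrapped function, then add the input back (the residual connection)
        return self.fn(x, **kwargs) + x

block = ResidualSketch(nn.Linear(32, 32))
print(block(torch.randn(4, 32)).shape)   # torch.Size([4, 32])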

class PreNorm[source]

PreNorm(dim:int, fn:callable) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • dim (int): LayerNorm dimension
  • fn (callable): Residual function

PreNorm.forward[source]

PreNorm.forward(x:tensor, **kwargs)

Applies self.fn to the output of self.norm(x)

Function Arguments:

  • x (torch.tensor): An input
  • **kwargs: kwargs for self.fn
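
A minimal sketch of a pre-normalization wrapper matching the description above (assumed implementation):

import torch
from torch import nn

class PreNormSketch(nn.Module):
    def __init__(self, dim, fn):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.fn = fn

    def forward(self, x, **kwargs):
        # Normalize first, then apply the wrapped function
        return self.fn(self.norm(x), **kwargs)

layer = PreNormSketch(32, nn.Linear(32, 32))
print(layer(torch.randn(4, 10, 32)).shape)   # torch.Size([4, 10, 32])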

Attention Layers

class GEGLU[source]

GEGLU() :: Module

GLU variation introduced in GLU Variants Improve Transformer

GEGLU.forward[source]

GEGLU.forward(x:tensor)

Chunks x into two halves along the last dimension, applies F.gelu to the gate half, and returns its product with the other half

Function Arguments:

  • x (torch.tensor): An input
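
A sketch of the GEGLU gating described above; the split along the last dimension is an assumption consistent with the usual GEGLU formulation:

import torch
from torch import nn
import torch.nn.functional as F

class GEGLUSketch(nn.Module):
    def forward(self, x):
        # Split the last dimension into a value half and a gate half,
        # then gate the values with GELU(gates)
        x, gates = x.chunk(2, dim=-1)
        return x * F.gelu(gates)

out = GEGLUSketch()(torch.randn(4, 10, 64))
print(out.shape)   # torch.Size([4, 10, 32]) -- the last dimension is halved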

class FeedForward[source]

FeedForward(dim:int, mult:int=4, dropout:(int, float)=0.0) :: Sequential

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.

To make it easier to understand, here is a small example:

# Example of using Sequential
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

Function Arguments:

  • dim (int): Linear layer dimension
  • mult (int): Multiplier for dim
  • dropout (int, float): Dropout probability

FeedForward.forward[source]

FeedForward.forward(x:tensor)

Passes x through a Linear -> GEGLU -> Dropout -> Linear block

Function Arguments:

  • x (torch.tensor): An input
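
A hedged sketch of such a block; the doubled inner width (so the GEGLU has something to split) is an assumption that matches the chunking described for GEGLU:

import torch
from torch import nn
import torch.nn.functional as F

class _GEGLU(nn.Module):
    def forward(self, x):
        x, gates = x.chunk(2, dim=-1)
        return x * F.gelu(gates)

def feed_forward_sketch(dim, mult=4, dropout=0.0):
    # Widen to dim * mult (doubled so the GEGLU can split it), gate, drop, project back
    return nn.Sequential(
        nn.Linear(dim, dim * mult * 2),
        _GEGLU(),
        nn.Dropout(dropout),
        nn.Linear(dim * mult, dim),
    )

ff = feed_forward_sketch(dim=32)
print(ff(torch.randn(4, 10, 32)).shape)   # torch.Size([4, 10, 32])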

class Attention[source]

Attention(dim:int, heads:int=8, dim_head:int=16, dropout:float=0.0) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • dim (int): Dimension for the Linear groups
  • heads (int): Number of attention heads
  • dim_head (int): Dimension of the attention heads
  • dropout (float): Dropout probability

Attention.forward[source]

Attention.forward(x:tensor)

Applies attention to x

Function Arguments:

  • x (torch.tensor): An input
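
A hedged sketch of a standard scaled-dot-product multi-head attention block with the same constructor arguments; it uses einops and assumes a (batch, tokens, dim) input layout, and is not the library's exact implementation:

import torch
from torch import nn
from einops import rearrange

class AttentionSketch(nn.Module):
    # Minimal multi-head self-attention in the same spirit as the class above
    def __init__(self, dim, heads=8, dim_head=16, dropout=0.0):
        super().__init__()
        inner = heads * dim_head
        self.heads, self.scale = heads, dim_head ** -0.5
        self.to_qkv = nn.Linear(dim, inner * 3, bias=False)
        self.to_out = nn.Linear(inner, dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        q, k, v = (rearrange(t, 'b n (h d) -> b h n d', h=self.heads) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = rearrange(self.dropout(attn) @ v, 'b h n d -> b n (h d)')
        return self.to_out(out)

x = torch.randn(4, 10, 32)                 # assumed layout: (batch, tokens, dim)
print(AttentionSketch(dim=32)(x).shape)    # torch.Size([4, 10, 32])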

class AttentionType[source]

AttentionType(*args, **kwargs)

All possible attention types, with typo-proofing and auto-complete

class RowColAttention[source]

RowColAttention(num_tokens:int, dim:int, nfeats:int, depth:int, heads:int, dim_head:int=16, attn_dropout:float=0.0, ff_dropout:float=0.0, style:AttentionType='col') :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • num_tokens (int): Size of the categorical embeddings
  • dim (int): Dimension of the two Embedding layers
  • nfeats (int): Number of continuous features
  • depth (int): The number of attention modules to generate
  • heads (int): Number of attention heads
  • dim_head (int): Dimension of the attention heads
  • attn_dropout (float): Dropout probability in the attention module
  • ff_dropout (float): Dropout probability for the feed forward layers
  • style (AttentionType): Attention style

RowColAttention.forward[source]

RowColAttention.forward(x, x_cont=None, mask=None)

Applies an attention mechanism on inputs

Function Arguments:

  • x: Categorical inputs
  • x_cont: Continuous inputs
  • mask: Optional attention mask
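
A construction-only usage sketch; the argument values are illustrative, the class is assumed to already be imported, and the forward-pass input shapes are not shown because they depend on how the inputs are embedded upstream:

# Assumes RowColAttention is already imported from the library; values are illustrative.
attn = RowColAttention(num_tokens=100, dim=32, nfeats=12, depth=6,
                       heads=8, dim_head=16, attn_dropout=0.1,
                       ff_dropout=0.1, style='col')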

class Transformer[source]

Transformer(num_tokens:int, dim:int, depth:int, heads:int, dim_head:int=16, attn_dropout:float=0.0, ff_dropout:float=0.0) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • num_tokens (int): Size of the categorical embeddings in the Attention layer
  • dim (int): Dimension of the two Embedding layers in the Attention layer
  • depth (int): The number of attention modules to generate in the Attention layer
  • heads (int): Number of attention heads in the Attention layer
  • dim_head (int): Dimension of the attention heads in the Attention layer
  • attn_dropout (float): Dropout probability in the Attention layer
  • ff_dropout (float): Dropout probability for the feed forward layers

Transformer.forward[source]

Transformer.forward(x, x_cont=None)

Applies attention to inputs

Function Arguments:

  • x: Categorical inputs
  • x_cont: Continuous inputs
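
A hedged architectural sketch that composes the pieces sketched above (ResidualSketch, PreNormSketch, AttentionSketch, and feed_forward_sketch from the earlier blocks). It assumes the forward pass receives already-embedded tokens of shape (batch, n, dim) and concatenates the continuous tokens along the sequence axis; the documented num_tokens embedding is omitted for that reason. Verify against the source before relying on these details:

import torch
from torch import nn

class TransformerSketch(nn.Module):
    def __init__(self, dim, depth, heads, dim_head=16, attn_dropout=0.0, ff_dropout=0.0):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.ModuleList([
                ResidualSketch(PreNormSketch(dim, AttentionSketch(dim, heads, dim_head, attn_dropout))),
                ResidualSketch(PreNormSketch(dim, feed_forward_sketch(dim, dropout=ff_dropout))),
            ]) for _ in range(depth)
        ])

    def forward(self, x, x_cont=None):
        # Assumed: continuous-feature tokens are concatenated with the
        # categorical tokens along the sequence axis before attention.
        if x_cont is not None:
            x = torch.cat((x, x_cont), dim=1)
        for attn, ff in self.layers:
            x = attn(x)
            x = ff(x)
        return x

x_categ = torch.randn(4, 10, 32)   # assumed: embedded categorical tokens (batch, n_cat, dim)
x_cont  = torch.randn(4, 5, 32)    # assumed: embedded continuous tokens (batch, n_cont, dim)
print(TransformerSketch(dim=32, depth=2, heads=8)(x_categ, x_cont).shape)   # torch.Size([4, 15, 32])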

class MLP[source]

MLP(dims:list, act=None) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

Function Arguments:

  • dims (list): A list of dimensions for the module
  • act: Optional activation function

MLP.forward[source]

MLP.forward(x:tensor)

Applies the MLP to x

Function Arguments:

  • x (torch.tensor): An input
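
A hedged sketch of an MLP built from a list of layer sizes with an optional activation between the hidden layers; the exact activation placement is an assumption:

import torch
from torch import nn

class MLPSketch(nn.Module):
    def __init__(self, dims, act=None):
        super().__init__()
        layers = []
        for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
            layers.append(nn.Linear(d_in, d_out))
            # Activation between hidden layers only (assumed); none after the output layer
            if act is not None and i < len(dims) - 2:
                layers.append(act)
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):
        return self.mlp(x)

mlp = MLPSketch([64, 32, 1], act=nn.ReLU())
print(mlp(torch.randn(8, 64)).shape)   # torch.Size([8, 1])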

class SimpleMLP[source]

SimpleMLP(dims:list) :: Sequential

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.

To make it easier to understand, here is a small example:

# Example of using Sequential
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

Function Arguments:

  • dims (list): A list of three dimensions for our MLP module

SimpleMLP.forward[source]

SimpleMLP.forward(x:tensor)

Applies the simplified MLP to x

Function Arguments:

  • x (torch.tensor): A tensor input
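
A sketch of the three-layer variant implied by the three-element dims list; the ReLU in the middle is an assumption:

import torch
from torch import nn

def simple_mlp_sketch(dims):
    # dims is assumed to be [input_dim, hidden_dim, output_dim]
    return nn.Sequential(
        nn.Linear(dims[0], dims[1]),
        nn.ReLU(),
        nn.Linear(dims[1], dims[2]),
    )

net = simple_mlp_sketch([64, 32, 1])
print(net(torch.randn(8, 64)).shape)   # torch.Size([8, 1])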

class TabAttention[source]

TabAttention(categories:list, num_continuous:int, dim:int, depth:int, heads:int, dim_head:int=16, dim_out:int=1, mlp_hidden_mults:Tuple[int]=(4, 2), mlp_act=None, num_special_tokens:int=1, attn_dropout:float=0.0, ff_dropout:float=0.0, lastmlp_dropout:float=0.0, cont_embeddings:str='MLP', attention_style:AttentionType='col') :: Module

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call .to(), etc.

training (bool): Whether this module is in training or evaluation mode.

Function Arguments:

  • categories (list): List of categorical cardinalities
  • num_continuous (int): Number of continuous variables
  • dim (int): MLP dimension
  • depth (int): Transformer depth
  • heads (int): Number of attention heads
  • dim_head (int): Size of the attention head
  • dim_out (int): Size of the last linear layer
  • mlp_hidden_mults (Tuple[int]): Multipliers for the MLP hidden layers
  • mlp_act: Optional activation for the MLP
  • num_special_tokens (int): Number of special tokens for the categories
  • attn_dropout (float): Dropout probability for the attention module
  • ff_dropout (float): Dropout probability for the feed forward layers
  • lastmlp_dropout (float): Dropout probability for the final MLP group
  • cont_embeddings (str): Type of embeddings for the continuous variables; only 'MLP' is available. If None, no attention is applied to the continuous features
  • attention_style (AttentionType): Attention style

TabAttention.forward[source]

TabAttention.forward(x_categ:tensor, x_cont:tensor, x_categ_enc:tensor, x_cont_enc:tensor)

Feeds the inputs through the tabular attention model

Function Arguments:

  • x_categ (torch.tensor): Categorical inputs
  • x_cont (torch.tensor): Continuous inputs
  • x_categ_enc (torch.tensor): Encoded categorical inputs via embed_data_mask
  • x_cont_enc (torch.tensor): Encoded continuous inputs via embed_data_mask
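
Finally, a construction-only usage sketch for TabAttention. The argument values are illustrative and the class is assumed to be imported already; the forward pass additionally needs the encoded inputs produced by embed_data_mask, so that step is not shown here:

import torch
# Assumes TabAttention is already imported from the library.

model = TabAttention(
    categories=[5, 10, 3],     # cardinality of each categorical column (illustrative)
    num_continuous=4,          # number of continuous columns
    dim=32,
    depth=6,
    heads=8,
    dim_head=16,
    dim_out=1,                 # e.g. a single regression target or binary logit
    mlp_hidden_mults=(4, 2),
    attn_dropout=0.1,
    ff_dropout=0.1,
    cont_embeddings='MLP',
    attention_style='col',
)

# The forward pass expects both the raw categorical/continuous tensors and their
# encoded versions from embed_data_mask, so the encoding step is not shown here.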