botiverse.models.BERT package

Submodules

botiverse.models.BERT.BERT module

This module contains the BERT model architecture.

class botiverse.models.BERT.BERT.Embeddings(config)

Bases: Module

Embedding layer for BERT.

This layer takes input_ids and token_type_ids and produces the input embeddings by combining three kinds of embeddings: word, position, and token-type.

Parameters:

config (BERTConfig) – BERT configuration.

forward(input_ids, token_type_ids)

Forward pass of the Embeddings layer.

Parameters:
  • input_ids (torch.Tensor) – The input token IDs.

  • token_type_ids (torch.Tensor) – The token type IDs.

Returns:

The generated embeddings.

Return type:

torch.Tensor
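
As a rough illustration of how three embedding tables can be combined, the sketch below sums a word, a position, and a token-type vector for each token, as the original BERT does. It uses plain Python lists rather than torch tensors so that it runs without dependencies. The toy tables and the embed helper are hypothetical, and whether this layer sums the vectors in exactly this way (or also applies layer normalization and dropout) is an assumption.

```python
# Toy lookup tables with hidden size 2 (hypothetical values, not real weights).
word_table = {0: [0.0, 0.0], 5: [0.1, 0.2], 7: [0.3, 0.4]}
position_table = [[0.01, 0.02], [0.03, 0.04], [0.05, 0.06]]
token_type_table = [[0.5, 0.5], [-0.5, -0.5]]


def embed(input_ids, token_type_ids):
    """Sum word, position, and token-type vectors for one sequence."""
    output = []
    for position, (token_id, type_id) in enumerate(zip(input_ids, token_type_ids)):
        vectors = (
            word_table[token_id],
            position_table[position],
            token_type_table[type_id],
        )
        # Element-wise sum of the three vectors for this token.
        output.append([sum(components) for components in zip(*vectors)])
    return output


embeddings = embed(input_ids=[5, 7], token_type_ids=[0, 1])
```

The result holds one vector of the hidden size per input token, which matches the shape the encoder layers expect.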

training: bool

class botiverse.models.BERT.BERT.EncoderLayer(config)

Bases: Module

Encoder layer for BERT.

This layer contains self-attention, layer normalization, and a position-wise feed-forward network.

Parameters:

config (BERTConfig) – BERT configuration.

forward(input, attention_mask)

Forward pass of the EncoderLayer.

Parameters:
  • input (torch.Tensor) – The input tensor.

  • attention_mask (torch.Tensor) – The attention mask.

Returns:

The output tensor.

Return type:

torch.Tensor
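
The sketch below shows one common way to wire a sublayer together with layer normalization: the post-layer-norm arrangement of the original BERT, where each sublayer's output is added to its input and the sum is then normalized. Whether EncoderLayer uses exactly this ordering is an assumption. layer_norm and residual_block are hypothetical helpers that work on a single token vector in plain Python, and the learned scale and bias of layer normalization are left out.

```python
import math


def layer_norm(vector, eps=1e-12):
    """Normalize one vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(vector) / len(vector)
    variance = sum((x - mean) ** 2 for x in vector) / len(vector)
    return [(x - mean) / math.sqrt(variance + eps) for x in vector]


def residual_block(vector, sublayer):
    """Apply a sublayer, add the residual connection, then normalize."""
    return layer_norm([x + y for x, y in zip(vector, sublayer(vector))])


# Stand-in sublayer that doubles every component; a real one would be
# self-attention or the position-wise feed-forward network.
hidden = residual_block([1.0, 2.0, 3.0, 4.0], lambda v: [2.0 * x for x in v])
```

The eps default mirrors the layer_norm_eps default of BERTConfig and only guards against division by zero when all components are equal.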

training: bool

class botiverse.models.BERT.BERT.MultiHeadAttention(config)

Bases: Module

Multi-head attention layer for BERT.

This layer performs multi-head self-attention and returns the output context.

Parameters:

config (BERTConfig) – BERT configuration.

forward(query, key, value, attention_mask)

Forward pass of the MultiHeadAttention.

Parameters:
  • query (torch.Tensor) – The query tensor.

  • key (torch.Tensor) – The key tensor.

  • value (torch.Tensor) – The value tensor.

  • attention_mask (torch.Tensor) – The attention mask.

Returns:

The output context.

Return type:

torch.Tensor
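
To make the roles of query, key, value, and attention_mask concrete, here is a minimal single-head sketch of scaled dot-product attention in plain Python. It is not the library's implementation: the attention helper is hypothetical, the learned projections are omitted, and the mask convention (1 for real tokens, 0 for padding) is an assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, key, value, attention_mask):
    """Single-head scaled dot-product attention.

    query, key, value: lists of vectors, one per token.
    attention_mask: 1 for real tokens, 0 for padding (assumed convention).
    """
    scale = math.sqrt(len(query[0]))
    width = len(value[0])
    context = []
    for q in query:
        scores = []
        for k, keep in zip(key, attention_mask):
            dot = sum(a * b for a, b in zip(q, k)) / scale
            # Masked positions get a large negative score, so softmax
            # assigns them a weight of effectively zero.
            scores.append(dot if keep else -1e9)
        weights = softmax(scores)
        context.append(
            [sum(w * v[i] for w, v in zip(weights, value)) for i in range(width)]
        )
    return context


tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]  # the third token is padding
context = attention(tokens, tokens, tokens, attention_mask=[1, 1, 0])
```

With the BERTConfig defaults, a multi-head version would run this computation 12 times on 64-dimensional slices (768 / 12) and concatenate the results. That split is assumed from standard multi-head attention rather than taken from the source.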

training: bool

class botiverse.models.BERT.BERT.PositionWiseFeedForward(config)

Bases: Module

Position-wise feed-forward network layer for BERT.

This layer applies two linear transformations with a GELU activation function.

Parameters:

config (BERTConfig) – BERT configuration.

forward(input)

Forward pass of the PositionWiseFeedForward layer.

Parameters:

input (torch.Tensor) – The input tensor.

Returns:

The output tensor.

Return type:

torch.Tensor
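
The two linear transformations and the GELU activation can be sketched for a single token vector as follows. The sketch assumes that GELU sits between the two linear maps and uses the exact erf-based form; the layer itself may use an approximation. The helper names and the toy weights are hypothetical.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(vector, weight, bias):
    """Compute weight @ vector + bias for one vector (weight is a list of rows)."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weight, bias)]


def feed_forward(vector, w1, b1, w2, b2):
    """Expand to the inner size, apply GELU, then project back."""
    return linear([gelu(h) for h in linear(vector, w1, b1)], w2, b2)


# Toy weights: hidden size 2, inner size 3 (hypothetical values).
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
b2 = [0.0, 0.0]
output = feed_forward([1.0, -1.0], w1, b1, w2, b2)
```

With the BERTConfig defaults the inner size would be ff_size=3072 and the outer size hidden_size=768, and the same weights would be applied independently at every position.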

training: bool

class botiverse.models.BERT.BERT.Bert(config)

Bases: Module

BERT model implementation.

This model combines the Embeddings layer, EncoderLayers, and linear transformation layers to perform BERT-based processing.

Parameters:

config (BERTConfig) – BERT configuration.

forward(input_ids, token_type_ids, attention_mask, return_dict=False)

Forward pass of the Bert model.

Parameters:
  • input_ids (torch.Tensor) – The input token IDs.

  • token_type_ids (torch.Tensor) – The token type IDs.

  • attention_mask (torch.Tensor) – The attention mask.

  • return_dict (bool) – Whether to return the outputs as a dictionary; defaults to False.

Returns:

The sequence output and pooled output.

Return type:

torch.Tensor, torch.Tensor
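
The three inputs of forward have to agree in shape. The following sketch shows one way to prepare them from variable-length token ID lists. It is written with plain lists, pad_batch is a hypothetical helper, the token IDs are illustrative, and the mask convention (1 for real tokens, 0 for padding) is an assumption rather than something this page states.

```python
PADDING_IDX = 0  # matches the padding_idx default of BERTConfig


def pad_batch(sequences, max_len):
    """Pad token ID lists to max_len and build the matching attention mask."""
    input_ids, attention_mask = [], []
    for ids in sequences:
        ids = ids[:max_len]  # truncate sequences that are too long
        padding = max_len - len(ids)
        input_ids.append(ids + [PADDING_IDX] * padding)
        attention_mask.append([1] * len(ids) + [0] * padding)
    return input_ids, attention_mask


batch = [[101, 7592, 2088, 102], [101, 7592, 102]]  # illustrative token IDs
input_ids, attention_mask = pad_batch(batch, max_len=5)
# Single-segment inputs use token type 0 at every position.
token_type_ids = [[0] * 5 for _ in batch]
```

In practice these nested lists would be converted to tensors, for example with torch.tensor, before being passed to forward.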

training: bool

botiverse.models.BERT.config module

This module contains the configuration class for BERT.

class botiverse.models.BERT.config.BERTConfig(vocab_size=30522, hidden_size=768, encoder_layers=12, heads=12, ff_size=3072, token_types=2, max_seq=512, padding_idx=0, layer_norm_eps=1e-12, dropout=0.1)

Bases: object

Configuration class for BERT.

This class holds the configuration parameters for the BERT model.

Parameters:
  • vocab_size (int) – The size of the vocabulary, defaults to 30522.

  • hidden_size (int) – The hidden size of the BERT model, defaults to 768.

  • encoder_layers (int) – The number of encoder layers in the BERT model, defaults to 12.

  • heads (int) – The number of attention heads in the BERT model, defaults to 12.

  • ff_size (int) – The size of the feed-forward layer in the BERT model, defaults to 3072.

  • token_types (int) – The number of token types in the BERT model, defaults to 2.

  • max_seq (int) – The maximum sequence length in the BERT model, defaults to 512.

  • padding_idx (int) – The padding index used in the BERT model, defaults to 0.

  • layer_norm_eps (float) – The epsilon value for layer normalization in the BERT model, defaults to 1e-12.

  • dropout (float) – The dropout rate in the BERT model, defaults to 0.1.
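
To give a feel for what the default values imply, the sketch below derives two quantities from them. The defaults are copied from the signature above; head_size and embedding_parameters are hypothetical helpers, and the even split of the hidden size across heads is assumed from standard multi-head attention rather than documented here.

```python
# Default values copied from the BERTConfig signature.
defaults = dict(
    vocab_size=30522,
    hidden_size=768,
    encoder_layers=12,
    heads=12,
    ff_size=3072,
    token_types=2,
    max_seq=512,
)


def head_size(hidden_size, heads):
    """Per-head width; the hidden size must split evenly across the heads."""
    if hidden_size % heads != 0:
        raise ValueError("hidden_size must be divisible by heads")
    return hidden_size // heads


def embedding_parameters(vocab_size, hidden_size, token_types, max_seq, **_):
    """Weights in the word, position, and token-type tables (layer norm excluded)."""
    return (vocab_size + max_seq + token_types) * hidden_size


per_head = head_size(defaults["hidden_size"], defaults["heads"])  # 64
table_weights = embedding_parameters(**defaults)  # 23835648
```

A custom configuration should keep hidden_size a multiple of heads if the attention layer splits the hidden dimension in this way.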

botiverse.models.BERT.utils module

This module contains utility functions for the BERT model.

botiverse.models.BERT.utils.LoadPretrainedWeights(model)

Load pre-trained weights from the transformers library.

This function loads the pre-trained weights from the transformers library and updates the model’s state_dict accordingly.

Parameters:

model (Bert) – The BERT model.
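
Updating a state_dict from another model's weights usually comes down to renaming keys. The sketch below illustrates that idea with plain dictionaries. It is not the function's actual implementation: remap_state_dict is a hypothetical helper, and the target key names are invented for illustration because this page does not list the Bert parameter names.

```python
def remap_state_dict(source_state, key_map):
    """Return a new state dict whose keys follow the target model's naming."""
    missing = [key for key in key_map if key not in source_state]
    if missing:
        raise KeyError(f"source weights lack keys: {missing}")
    return {target: source_state[source] for source, target in key_map.items()}


# Stand-in for pre-trained weights; real values would be tensors.
pretrained = {
    "embeddings.word_embeddings.weight": [[0.1, 0.2]],
    "embeddings.position_embeddings.weight": [[0.3, 0.4]],
}
# Hypothetical target names; the real Bert parameter names may differ.
key_map = {
    "embeddings.word_embeddings.weight": "embeddings.word.weight",
    "embeddings.position_embeddings.weight": "embeddings.position.weight",
}
remapped = remap_state_dict(pretrained, key_map)
```

A dictionary remapped like this is the kind of object a PyTorch module accepts through load_state_dict.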

botiverse.models.BERT.utils.Example()

Example comparing the outputs of the from-scratch model to those of the pre-trained model from the transformers library.

Module contents