botiverse.models.BERT package
Submodules
botiverse.models.BERT.BERT module
This module contains the BERT model architecture.
- class botiverse.models.BERT.BERT.Embeddings(config)
Bases: Module
Embedding layer for BERT.
This layer takes input_ids and token_type_ids as inputs and generates the input embeddings from three embedding types: word, position, and token type.
- Parameters:
config (Config) – BERT configuration.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input_ids, token_type_ids)
Forward pass of the Embeddings layer.
- Parameters:
input_ids (torch.Tensor) – The input token IDs.
token_type_ids (torch.Tensor) – The token type IDs.
- Returns:
The generated embeddings.
- Return type:
torch.Tensor
- training: bool
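The description of Embeddings can be illustrated with a minimal sketch. It assumes the three embeddings are summed and then passed through layer normalization and dropout, as in the original BERT design; the botiverse source may differ. The class name `EmbeddingsSketch` and its attribute names are hypothetical, and the constructor takes the documented configuration values as plain keyword arguments rather than a config object.

```python
import torch
from torch import nn


class EmbeddingsSketch(nn.Module):
    """Illustrative only: sum of word, position, and token-type embeddings."""

    def __init__(self, vocab_size=30522, hidden_size=768, max_seq=512,
                 token_types=2, padding_idx=0, layer_norm_eps=1e-12, dropout=0.1):
        super().__init__()
        self.word = nn.Embedding(vocab_size, hidden_size, padding_idx=padding_idx)
        self.position = nn.Embedding(max_seq, hidden_size)
        self.token_type = nn.Embedding(token_types, hidden_size)
        self.norm = nn.LayerNorm(hidden_size, eps=layer_norm_eps)
        self.dropout = nn.Dropout(dropout)

    def forward(self, input_ids, token_type_ids):
        # Position IDs are simply 0..seq_len-1, shared by every row in the batch.
        seq_len = input_ids.size(1)
        positions = torch.arange(seq_len, device=input_ids.device).unsqueeze(0)
        summed = (self.word(input_ids)
                  + self.position(positions)
                  + self.token_type(token_type_ids))
        return self.dropout(self.norm(summed))
```

For a batch of shape (batch, seq), the result has shape (batch, seq, hidden_size).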
- class botiverse.models.BERT.BERT.EncoderLayer(config)
Bases: Module
Encoder layer for BERT.
This layer contains self-attention, layer normalization, and a position-wise feed-forward network.
- Parameters:
config (Config) – BERT configuration.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, attention_mask)
Forward pass of the EncoderLayer.
- Parameters:
input (torch.Tensor) – The input tensor.
attention_mask (torch.Tensor) – The attention mask.
- Returns:
The output tensor.
- Return type:
torch.Tensor
- training: bool
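How the three parts of an encoder layer fit together can be sketched as follows. This is not the botiverse implementation: it swaps in PyTorch's built-in `nn.MultiheadAttention` for the attention step and assumes the post-layer-norm residual arrangement of the original BERT. The class name `EncoderLayerSketch` is hypothetical, and the mask is assumed to hold 1 for real tokens and 0 for padding.

```python
import torch
from torch import nn


class EncoderLayerSketch(nn.Module):
    """Illustrative only: attention and feed-forward, each with residual + LayerNorm."""

    def __init__(self, hidden_size=768, heads=12, ff_size=3072,
                 layer_norm_eps=1e-12, dropout=0.1):
        super().__init__()
        self.attention = nn.MultiheadAttention(
            hidden_size, heads, dropout=dropout, batch_first=True)
        self.attention_norm = nn.LayerNorm(hidden_size, eps=layer_norm_eps)
        self.feed_forward = nn.Sequential(
            nn.Linear(hidden_size, ff_size),
            nn.GELU(),
            nn.Linear(ff_size, hidden_size),
        )
        self.output_norm = nn.LayerNorm(hidden_size, eps=layer_norm_eps)
        self.dropout = nn.Dropout(dropout)

    def forward(self, input, attention_mask):
        # nn.MultiheadAttention expects True where a key position should be ignored.
        padding = attention_mask == 0
        attended, _ = self.attention(
            input, input, input, key_padding_mask=padding, need_weights=False)
        hidden = self.attention_norm(input + self.dropout(attended))
        return self.output_norm(hidden + self.dropout(self.feed_forward(hidden)))
```

The output has the same shape as the input, which is what allows several such layers to be stacked.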
- class botiverse.models.BERT.BERT.MultiHeadAttention(config)
Bases: Module
Multi-head attention layer for BERT.
This layer performs multi-head self-attention and returns the output context.
- Parameters:
config (Config) – BERT configuration.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(query, key, value, attention_mask)
Forward pass of the MultiHeadAttention.
- Parameters:
query (torch.Tensor) – The query tensor.
key (torch.Tensor) – The key tensor.
value (torch.Tensor) – The value tensor.
attention_mask (torch.Tensor) – The attention mask.
- Returns:
The output context.
- Return type:
torch.Tensor
- training: bool
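The core computation behind multi-head self-attention can be sketched as a standalone function. This is a sketch under stated assumptions rather than the botiverse code: the learned query, key, value, and output projections are omitted, the mask is assumed to be a (batch, seq) tensor with 1 for real tokens and 0 for padding, and the function name `multi_head_attention` and its `heads` argument are hypothetical.

```python
import math
import torch


def multi_head_attention(query, key, value, attention_mask, heads):
    """Scaled dot-product attention over `heads` heads (projections omitted).

    query, key, value: (batch, seq, hidden); attention_mask: (batch, seq).
    """
    batch, seq, hidden = query.shape
    head_dim = hidden // heads

    def split(x):
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        return x.reshape(batch, -1, heads, head_dim).transpose(1, 2)

    q, k, v = split(query), split(key), split(value)
    scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)
    # Padded key positions get -inf so their softmax weight is exactly zero.
    keep = attention_mask[:, None, None, :].bool()
    scores = scores.masked_fill(~keep, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    context = weights @ v  # (batch, heads, seq, head_dim)
    # Merge the heads back into a single hidden dimension.
    return context.transpose(1, 2).reshape(batch, seq, hidden)
```

With the documented defaults (hidden_size=768, heads=12), each head would operate on 64 dimensions. Every row needs at least one unmasked key; otherwise the softmax over a row of -inf values yields NaN.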
- class botiverse.models.BERT.BERT.PositionWiseFeedForward(config)
Bases: Module
Position-wise feed-forward network layer for BERT.
This layer applies two linear transformations with a GELU activation function.
- Parameters:
config (Config) – BERT configuration.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input)
Forward pass of the PositionWiseFeedForward layer.
- Parameters:
input (torch.Tensor) – The input tensor.
- Returns:
The output tensor.
- Return type:
torch.Tensor
- training: bool
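The "two linear transformations with a GELU activation" can be sketched as below. The class name `FeedForwardSketch` and its attribute names are hypothetical, and any dropout the botiverse layer might apply is left out.

```python
import torch
from torch import nn


class FeedForwardSketch(nn.Module):
    """Illustrative only: hidden_size -> ff_size -> hidden_size with GELU in between."""

    def __init__(self, hidden_size=768, ff_size=3072):
        super().__init__()
        self.expand = nn.Linear(hidden_size, ff_size)
        self.activation = nn.GELU()
        self.project = nn.Linear(ff_size, hidden_size)

    def forward(self, input):
        # nn.Linear acts on the last dimension only, so every sequence
        # position is transformed independently with the same weights.
        return self.project(self.activation(self.expand(input)))
```

"Position-wise" here means that applying the network to a whole (batch, seq, hidden) tensor gives the same result as applying it to each position separately.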
- class botiverse.models.BERT.BERT.Bert(config)
Bases: Module
BERT model implementation.
This model combines the Embeddings layer, EncoderLayers, and linear transformation layers to perform BERT-based processing.
- Parameters:
config (Config) – BERT configuration.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input_ids, token_type_ids, attention_mask, return_dict=False)
Forward pass of the Bert model.
- Parameters:
input_ids (torch.Tensor) – The input token IDs.
token_type_ids (torch.Tensor) – The token type IDs.
attention_mask (torch.Tensor) – The attention mask.
return_dict (bool) – Whether to return a dictionary or not, defaults to False.
- Returns:
The sequence output and pooled output.
- Return type:
torch.Tensor, torch.Tensor
- training: bool
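The forward signature takes three tensors of the same (batch, seq) shape. A minimal sketch of building them from variable-length token ID lists follows. The helper `build_inputs` is hypothetical, the token IDs in the usage note are arbitrary, and the 1/0 mask convention is an assumption; check the botiverse source for the mask format it actually expects.

```python
import torch

PADDING_IDX = 0  # matches the documented default for padding_idx


def build_inputs(batch):
    """Pad single-segment token-ID lists and derive the companion tensors."""
    max_len = max(len(ids) for ids in batch)
    input_ids = torch.full((len(batch), max_len), PADDING_IDX, dtype=torch.long)
    for row, ids in enumerate(batch):
        input_ids[row, :len(ids)] = torch.tensor(ids, dtype=torch.long)
    token_type_ids = torch.zeros_like(input_ids)        # every token in segment 0
    attention_mask = (input_ids != PADDING_IDX).long()  # 1 = real token, 0 = padding
    return input_ids, token_type_ids, attention_mask


input_ids, token_type_ids, attention_mask = build_inputs([[101, 7592, 102], [101, 102]])
# With `model` being a Bert instance, the documented call would then be:
# sequence_output, pooled_output = model(input_ids, token_type_ids, attention_mask)
```

Deriving the mask from the padding index only works when no real token shares that ID.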
botiverse.models.BERT.config module
This module contains the configuration class for BERT.
- class botiverse.models.BERT.config.BERTConfig(vocab_size=30522, hidden_size=768, encoder_layers=12, heads=12, ff_size=3072, token_types=2, max_seq=512, padding_idx=0, layer_norm_eps=1e-12, dropout=0.1)
Bases: object
Configuration class for BERT.
This class holds the configuration parameters for the BERT model.
- Parameters:
vocab_size (int) – The size of the vocabulary, defaults to 30522.
hidden_size (int) – The hidden size of the BERT model, defaults to 768.
encoder_layers (int) – The number of encoder layers in the BERT model, defaults to 12.
heads (int) – The number of attention heads in the BERT model, defaults to 12.
ff_size (int) – The size of the feed-forward layer in the BERT model, defaults to 3072.
token_types (int) – The number of token types in the BERT model, defaults to 2.
max_seq (int) – The maximum sequence length in the BERT model, defaults to 512.
padding_idx (int) – The padding index used in the BERT model, defaults to 0.
layer_norm_eps (float) – The epsilon value for layer normalization in the BERT model, defaults to 1e-12.
dropout (float) – The dropout rate in the BERT model, defaults to 0.1.
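To show how these parameters relate to one another, here is a standalone mirror of the documented defaults. `ConfigSketch` is a hypothetical stand-in, not the botiverse class; the divisibility check and the `head_dim` property are additions for illustration and are not documented features of BERTConfig.

```python
from dataclasses import dataclass


@dataclass
class ConfigSketch:
    """Illustrative stand-in with the same fields and defaults as documented above."""
    vocab_size: int = 30522
    hidden_size: int = 768
    encoder_layers: int = 12
    heads: int = 12
    ff_size: int = 3072
    token_types: int = 2
    max_seq: int = 512
    padding_idx: int = 0
    layer_norm_eps: float = 1e-12
    dropout: float = 0.1

    def __post_init__(self):
        # Multi-head attention splits the hidden vector evenly across heads.
        if self.hidden_size % self.heads != 0:
            raise ValueError("hidden_size must be divisible by heads")

    @property
    def head_dim(self):
        return self.hidden_size // self.heads


base = ConfigSketch()
small = ConfigSketch(hidden_size=256, encoder_layers=4, heads=4, ff_size=1024)
```

With the defaults, each attention head works on 768 / 12 = 64 dimensions, and the feed-forward size is four times the hidden size.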
botiverse.models.BERT.utils module
This module contains utility functions for the BERT model.