botiverse.preprocessors.Special.WhizBot_BERT_Preprocessor package
Submodules
botiverse.preprocessors.Special.WhizBot_BERT_Preprocessor.WhizBot_BERT_Preprocessor module
- class botiverse.preprocessors.Special.WhizBot_BERT_Preprocessor.WhizBot_BERT_Preprocessor.WhizBot_BERT_Preprocessor(file_path)
Bases: object
An interface that provides the required preprocessing for the WhizBot_BERT bot.
Initializes a WhizBot_BERT_Preprocessor instance with the file path of the dataset and the BERT model parameters.
- Parameters:
file_path (str) – Path to the .json file to be read.
- Returns:
None
- process()
Applies preprocessing steps to the loaded data.
- Returns:
Processed data.
- Return type:
DataFrame
- clean_string(string)
Cleans the given text string by removing emojis.
- Parameters:
string (str) – The string to process.
- Returns:
The processed string.
- Return type:
str
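The emoji removal described for clean_string can be illustrated with a regular expression over common emoji code-point ranges. This is only a minimal sketch: the function name remove_emojis and the chosen ranges are assumptions made for illustration, and the library's own implementation may use a different pattern or a dedicated package.

```python
import re

# Assumed emoji ranges; the pattern used by the library itself may differ.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # symbols and pictographs
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F680-\U0001F6FF"  # transport and map symbols
    "\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]+"
)


def remove_emojis(text: str) -> str:
    """Return text with characters in the assumed emoji ranges removed."""
    return EMOJI_PATTERN.sub("", text)


print(remove_emojis("Hello \U0001F600 world \U0001F680!"))
```

Only the matched characters are deleted, so whitespace that surrounded an emoji is left in place; a caller that wants single spaces would need to normalize whitespace afterwards.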
- tokenize_string(string)
Tokenizes a given text string using the BERT tokenizer.
- Parameters:
string (str) – The string to tokenize.
- Returns:
A dictionary containing the tokenized version of the input text string, i.e., the ids and attention_masks.
- Return type:
dict
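To make the shape of the returned dictionary concrete, the sketch below builds token ids and a matching attention mask with a toy whitespace tokenizer. It is a stand-in, not the BERT WordPiece tokenizer that tokenize_string is documented to use: the vocabulary, the function name toy_tokenize, the max_len parameter, and the key names "ids" and "attention_mask" are all assumptions chosen for illustration.

```python
from typing import Dict, List

# Hypothetical toy vocabulary; a real BERT tokenizer uses a WordPiece vocabulary.
VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "hello": 4, "world": 5}


def toy_tokenize(text: str, max_len: int = 8) -> Dict[str, List[int]]:
    """Map text to padded ids and an attention mask (whitespace split, not WordPiece)."""
    # Wrap the (possibly truncated) tokens in the special start and end markers.
    tokens = ["[CLS]"] + text.lower().split()[: max_len - 2] + ["[SEP]"]
    # Words outside the toy vocabulary fall back to the unknown-token id.
    ids = [VOCAB.get(tok, VOCAB["[UNK]"]) for tok in tokens]
    padding = max_len - len(ids)
    return {
        "ids": ids + [VOCAB["[PAD]"]] * padding,
        # 1 marks a real token, 0 marks padding the model should ignore.
        "attention_mask": [1] * len(ids) + [0] * padding,
    }


print(toy_tokenize("Hello world there"))
```

The attention mask has the same length as the id list, which is what lets a model distinguish real tokens from padding when sequences are batched to a fixed length.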