botiverse.preprocessors.TF_IDF_GLOVE package
Submodules
botiverse.preprocessors.TF_IDF_GLOVE.TF_IDF_GLOVE module
- class botiverse.preprocessors.TF_IDF_GLOVE.TF_IDF_GLOVE.TF_IDF_GLOVE(force_download=False)
Bases: object
An interface for transforming sentences into idf-GloVe vectors by weighting each word's GloVe vector by its tf-idf.
Initialize the GloVe and TF-IDF transformer and download the embeddings if needed.
- Parameters:
force_download (bool) – If True, download the embeddings even if they already exist.
- transform_list(sentence_list, all_words)
Given a list of tokenized sentences, return a table of idf-GloVe vectors (one for each sentence) in the form of a numpy array. This also initializes the tf and idf tables of the class for use in the transform() method.
- Parameters:
sentence_list (list) – A list of tokenized sentences
all_words (list) – A list of all the words in the corpus
- Returns:
A 2D numpy array of idf-GloVe vectors
- Return type:
numpy.ndarray
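The weighting described for transform_list() can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the toy 3-dimensional embeddings, the helper names build_idf and idf_glove_vector, and the idf definition log(N / df) are all assumptions made for the example, and plain lists stand in for the numpy array the method returns.

```python
import math

# Toy 3-dimensional embeddings standing in for real GloVe vectors
# (hypothetical values, chosen only to make the arithmetic easy to follow).
TOY_GLOVE = {
    "hello": [1.0, 0.0, 0.0],
    "world": [0.0, 1.0, 0.0],
    "there": [0.0, 0.0, 1.0],
}


def build_idf(sentence_list, all_words):
    """Build an idf table, assuming idf = log(N / df).

    The exact formula used by the library is not stated in this page.
    """
    n_docs = len(sentence_list)
    idf = {}
    for word in set(all_words):
        # Document frequency: number of sentences containing the word.
        df = sum(1 for sentence in sentence_list if word in sentence)
        idf[word] = math.log(n_docs / df) if df else 0.0
    return idf


def idf_glove_vector(tokens, idf, embeddings, dim=3):
    """Average the embeddings of known tokens, each scaled by its idf."""
    total = [0.0] * dim
    count = 0
    for token in tokens:
        if token in embeddings:
            weight = idf.get(token, 0.0)
            total = [t + weight * x for t, x in zip(total, embeddings[token])]
            count += 1
    return [t / count for t in total] if count else total


sentences = [["hello", "world"], ["hello", "there"]]
vocabulary = ["hello", "world", "there"]

idf_table = build_idf(sentences, vocabulary)
# One vector per sentence, mirroring the 2D table described above.
table = [idf_glove_vector(s, idf_table, TOY_GLOVE) for s in sentences]
```

Under this assumed formula, "hello" appears in every sentence and gets an idf of zero, so each sentence vector is driven entirely by its rarer word.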
- transform(sentence)
Given a sentence, return its idf-GloVe vector as a numpy array: the GloVe vectors of the words in the sentence are weighted by their idf and then averaged.
- Parameters:
sentence (str) – A string of words
- Returns:
A numpy array of the idf-GloVe vector
- Return type:
numpy.ndarray
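For a single sentence string, the same idea might look like the sketch below. It is an illustration under stated assumptions, not the library's code: the function name transform_sentence, the lowercase whitespace tokenization, the rule of skipping words missing from either table, and the zero vector returned when no word is known are all hypothetical choices, and the idf and embedding values are made up.

```python
def transform_sentence(sentence, idf, embeddings, dim):
    """Return the idf-weighted average embedding of a sentence string."""
    # Assumed tokenization: lowercase and split on whitespace.
    tokens = sentence.lower().split()

    weighted = []
    for token in tokens:
        # Assumed policy: ignore words without an embedding or an idf entry.
        if token in embeddings and token in idf:
            weighted.append([idf[token] * x for x in embeddings[token]])

    if not weighted:
        # No known words: fall back to a zero vector (an assumption).
        return [0.0] * dim

    # Column-wise mean over the weighted word vectors.
    return [sum(column) / len(weighted) for column in zip(*weighted)]


# Hypothetical idf table and 2-dimensional embeddings.
idf = {"good": 0.5, "morning": 2.0}
emb = {"good": [1.0, 1.0], "morning": [0.0, 2.0]}

vec = transform_sentence("Good morning everyone", idf, emb, dim=2)
print(vec)  # [0.25, 2.25]
```

Here "everyone" has no entry in either table and is skipped, so the result is the mean of the two weighted vectors [0.5, 0.5] and [0.0, 4.0].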