botiverse.preprocessors.BoW package

Submodules

botiverse.preprocessors.BoW.BoW module

class botiverse.preprocessors.BoW.BoW.BoW(binary=False)

Bases: object

An interface for transforming sentences into bag-of-words vectors.

Initialize the BoW transformer.

Parameters:

binary (bool) – If True, produce binary BoW vectors (word present or absent) instead of frequency BoW vectors (word counts). Defaults to False.
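The difference between the two vector types can be sketched in plain Python. This is only an illustration, not the library's implementation: `bow_vector` is a hypothetical helper, and it returns a list rather than the numpy array that the class produces.

```python
from collections import Counter


def bow_vector(tokens, vocabulary, binary=False):
    """Hypothetical helper (not part of botiverse): build one BoW vector as a list."""
    counts = Counter(tokens)
    if binary:
        # Binary BoW: 1 if the word occurs at least once, else 0.
        return [1 if counts[word] > 0 else 0 for word in vocabulary]
    # Frequency BoW: number of occurrences of each vocabulary word.
    return [counts[word] for word in vocabulary]


vocabulary = ["hello", "how", "are", "you"]
tokens = ["hello", "hello", "you"]

frequency_vector = bow_vector(tokens, vocabulary)
binary_vector = bow_vector(tokens, vocabulary, binary=True)

print(frequency_vector)  # [2, 0, 0, 1]
print(binary_vector)     # [1, 0, 0, 1]
```

The repeated word "hello" contributes 2 to the frequency vector but only 1 to the binary vector.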

transform_list(sentence_list, all_words)

Given a list of tokenized sentences, return a table of BoW vectors (one for each sentence) in the form of a numpy array.

Parameters:
  • sentence_list (list) – A list of tokenized sentences.

  • all_words (list) – A list of all the words in the vocabulary.

Returns:

A table of BoW vectors (one for each sentence) in the form of a numpy array.

Return type:

numpy.ndarray
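As a rough sketch of what such a table looks like, the following plain-Python stand-in builds one row per sentence and one column per vocabulary word. The function `bow_table` is hypothetical, and two details are assumptions rather than documented behavior: tokens outside the vocabulary are skipped, and columns follow the order of `all_words`. The actual method returns a numpy.ndarray rather than nested lists.

```python
def bow_table(sentence_list, all_words):
    """Hypothetical stand-in for illustration; not botiverse's implementation."""
    # Map each vocabulary word to its column index.
    column = {word: i for i, word in enumerate(all_words)}
    table = []
    for tokens in sentence_list:
        row = [0] * len(all_words)
        for token in tokens:
            # Assumption: tokens outside the vocabulary are skipped.
            if token in column:
                row[column[token]] += 1
        table.append(row)
    return table


all_words = ["good", "morning", "night", "bye"]
sentence_list = [["good", "morning"], ["good", "night", "night"], ["hi"]]

table = bow_table(sentence_list, all_words)
for row in table:
    print(row)
```

Under these assumptions the table has `len(sentence_list)` rows and `len(all_words)` columns, and a sentence with no vocabulary words yields an all-zero row.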

transform(sentence)

Given a sentence, return its BoW vector as a numpy array.

Parameters:

sentence (str) – A string of words.

Returns:

A BoW vector for the given sentence.

Return type:

numpy.ndarray
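Because this method takes a raw string, the sentence has to be split into tokens before counting. The sketch below shows one way that could work. Everything in it is an assumption for illustration: `bow_from_string` is a hypothetical function, tokenization is lowercasing plus whitespace splitting, and the vocabulary is passed in explicitly, whereas the documented signature takes only the sentence.

```python
def bow_from_string(sentence, vocabulary, binary=False):
    """Hypothetical sketch (not botiverse's code): BoW vector for a raw string."""
    # Assumption: lowercase and split on whitespace to obtain tokens.
    tokens = sentence.lower().split()
    vector = [0] * len(vocabulary)
    for position, word in enumerate(vocabulary):
        count = tokens.count(word)
        # Binary mode caps every count at 1.
        vector[position] = min(count, 1) if binary else count
    return vector


vocabulary = ["book", "a", "flight", "hotel"]
sentence = "Book a flight and a hotel"

vector = bow_from_string(sentence, vocabulary)
binary_vector = bow_from_string(sentence, vocabulary, binary=True)

print(vector)         # [1, 2, 1, 1]
print(binary_vector)  # [1, 1, 1, 1]
```

The word "and" is not in the vocabulary, so it does not appear in either vector.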

Module contents