Scripts in Python used during the course
tokenizer
n-grams
stopwords tagger
trigrams to matrix, hash with non-zero values
PMI, Pointwise Mutual Information
context filtering
word pairs selection
word pairs, all word pairs from the matrix
Cosine similarity
ranking, the top most similar words
compare, compares two words and lists their shared contexts
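
How these steps could fit together can be sketched in a few lines of Python. This is not the course code: every function name here (tokenize, cooccurrence, pmi, cosine, rank, shared_contexts) is hypothetical, and the sketch assumes that the middle word of each trigram is the target word and that its left and right neighbours are its contexts. The matrix is kept as a hash (a dict of Counters) that stores only non-zero cells.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase, words and single punctuation marks as tokens.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def cooccurrence(tokens):
    # Sparse word-by-context counts built from trigrams.
    # Assumption: the middle word is the target, neighbours are contexts.
    counts = defaultdict(Counter)
    for left, word, right in zip(tokens, tokens[1:], tokens[2:]):
        counts[word]["L:" + left] += 1
        counts[word]["R:" + right] += 1
    return counts


def pmi(counts):
    # PMI(w, c) = log2( P(w, c) / (P(w) * P(c)) ), for non-zero cells only.
    total = sum(sum(ctx.values()) for ctx in counts.values())
    word_tot = {w: sum(ctx.values()) for w, ctx in counts.items()}
    ctx_tot = Counter()
    for ctx in counts.values():
        ctx_tot.update(ctx)
    return {
        w: {c: math.log2(n * total / (word_tot[w] * ctx_tot[c]))
            for c, n in ctx.items()}
        for w, ctx in counts.items()
    }


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(x * v[c] for c, x in u.items() if c in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(word, vectors, top=5):
    # The `top` words most similar to `word`, best first.
    scores = [(cosine(vectors[word], vec), other)
              for other, vec in vectors.items() if other != word]
    return sorted(scores, reverse=True)[:top]


def shared_contexts(w1, w2, vectors):
    # Contexts that occur with both words.
    return sorted(set(vectors[w1]) & set(vectors[w2]))


tokens = tokenize("the cat eats fish. the dog eats meat.")
vectors = pmi(cooccurrence(tokens))
print(rank("cat", vectors, top=3))
print(shared_contexts("cat", "dog", vectors))  # ['L:the', 'R:eats']
```

On this toy text, "cat" and "dog" occur in exactly the same contexts, so "dog" comes first in the ranking for "cat". Stopword tagging and context filtering are left out of the sketch; they would fit between the counting step and the PMI step.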