Shorter lexicons for clustering, based on the Gutenberg corpus.

File names follow the pattern Gut_16_xx_yy_text.csv:
- Gut_16 - Gutenberg corpus, sentences 16+ words long
- xx - minimum word frequency
- yy - minimum word pair frequency
- text - file contents: lexicon (word frequencies) or word_pairs (word pair frequencies)
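
The naming scheme above can be unpacked programmatically. The following is a minimal sketch, not part of the dataset itself: it assumes that xx and yy appear in file names as plain decimal integers, and the names `parse_lexicon_filename` and `LexiconFile`, as well as the example file name, are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumption: xx and yy are written as plain decimal integers,
# and the last field is either "lexicon" or "word_pairs".
_NAME_RE = re.compile(r"^Gut_16_(\d+)_(\d+)_(lexicon|word_pairs)\.csv$")


class LexiconFile(NamedTuple):
    """Fields recovered from a Gut_16_xx_yy_text.csv file name (hypothetical helper)."""
    min_word_freq: int   # xx - minimum word frequency
    min_pair_freq: int   # yy - minimum word pair frequency
    kind: str            # "lexicon" or "word_pairs"


def parse_lexicon_filename(name: str) -> Optional[LexiconFile]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    return LexiconFile(int(match.group(1)), int(match.group(2)), match.group(3))


# Hypothetical file name, used only to illustrate the pattern.
info = parse_lexicon_filename("Gut_16_10_5_word_pairs.csv")
print(info)
```

Returning None for non-matching names lets a caller filter a directory listing without special-casing unrelated files.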