Work type: paper, specified "long paper"
Keywords: programming, punctuation, python, tokenization, word frequency lists
Topics: english, natural language processing, programming, software design and development, standards and interoperability, text analysis
Languages: English
Work type: paper
Keywords: corpus building, lemmatization, Old Church Slavonic, pos tagging, preprocessing, stemming, tokenization
Languages: English