Carleton College
Probabilistic topic modeling is a promising and increasingly popular method of text analysis that makes it possible to identify patterns of change within very large corpora of documents. However, although tools for exploring and analyzing topic models are increasingly common, building a topic model remains something of an art, given the challenges of pre-processing and model training. To make one stage of pre-processing more transparent, we have created Stopifu, a web-based tool designed to give researchers more direct control over stopword removal and to help them anticipate the effect that excluding different words will have on their analysis. We present our design for this tool, along with the categorization of stopword types that motivated it.
In review
Hosted at Carleton University, Université d'Ottawa (University of Ottawa)
Ottawa, Ontario, Canada
July 20, 2020 - July 25, 2020
475 works by 1078 authors indexed
Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.
Conference website: https://dh2020.adho.org/
References: https://dh2020.adho.org/abstracts/
Series: ADHO (15)
Organizers: ADHO