Stopifu: Supporting Task-Specific Stoplisting for Topic Models

paper, specified "short paper"
Authorship
  1. Malcolm Mitchell

    Carleton College

  2. Eric Carlson Alexander

    Carleton College

Work text

Probabilistic topic modeling is a promising and increasingly popular method of text analysis, affording the identification of patterns of change within tremendously large corpora of documents. However, though tools for exploring and analyzing topic models are increasingly common, the process of building a topic model remains something of an art, given the challenges of pre-processing and model training. To help make one stage of pre-processing more transparent, we have created Stopifu, a web-based tool designed to give researchers more direct control of stopword removal and help them anticipate the effect that excluding different words will have on their analysis. We present our design for this tool, along with our categorization of different types of stopwords that motivated its design.

Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.

Conference website: https://dh2020.adho.org/

References: https://dh2020.adho.org/abstracts/

Series: ADHO (15)

Organizers: ADHO