Jan Potocki’s Manuscript Found in Saragossa has been described as a classic of world literature and is arguably the most complex novel ever written in terms of its nested, interconnected narrative structure. Because of this complexity, several aids have been devised to help readers follow the overall structure, but none is interactive, and some are based on the better-known 1965 film of the same name, which differs from the novel. The current project produces an open data set based on the only public-domain version of the novel, Edmund Chojecki’s Polish translation. The data set is a database of every narrative segment of the novel, recording the name of the story each segment belongs to, its location within the nested structure, and its full text. This supports a variety of analyses, including word counts, topic modeling, and other narratological techniques.
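
As an illustration only, one way a record in such a data set might be represented is sketched below in Python. The field names (`story`, `path`, `text`), the choice to encode a segment's location as the chain of enclosing story names, the helper `words_by_depth`, and the sample rows are all assumptions made for this sketch; they are not the project's actual schema, and the sample text is placeholder wording, not text from the novel.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One narrative segment (hypothetical schema, for illustration)."""
    story: str    # name of the story the segment belongs to
    path: tuple   # assumed encoding: enclosing story names, outermost first
    text: str     # full text of the segment

    @property
    def depth(self) -> int:
        # Nesting level: 1 for the outermost story, 2 for a story told within it, etc.
        return len(self.path)

    def word_count(self) -> int:
        # Naive whitespace tokenization; real analyses may need a proper tokenizer.
        return len(self.text.split())


def words_by_depth(segments):
    """Total word count at each nesting depth."""
    totals = Counter()
    for seg in segments:
        totals[seg.depth] += seg.word_count()
    return dict(totals)


# Placeholder rows; story names and text are invented for the example.
segments = [
    Segment("Story A", ("Story A",), "one two three"),
    Segment("Story B", ("Story A", "Story B"), "four five"),
    Segment("Story C", ("Story A", "Story B", "Story C"), "six"),
    Segment("Story A", ("Story A",), "seven eight"),
]

print(words_by_depth(segments))
```

Storing the location as a path of story names, rather than a single depth number, would keep enough information to rebuild the nesting tree and to tell when the narration returns to an interrupted story, as in the last placeholder row.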

