Texas A&M University
I introduce the Victorian400 dataset for colorizing black-and-white nineteenth-century illustrations using deep learning. While there has been progress in colorizing photos and videos with Conditional Generative Adversarial Networks (cGANs), there have been no attempts to colorize nineteenth-century illustrations using deep learning. In addition, no datasets of nineteenth-century illustrations have been provided for deep learning colorization. I therefore created the Victorian400 dataset, a collection of color illustrations painted with nineteenth-century palettes. The dataset also gives those studying deep learning an opportunity to run code easily without high-performance devices. I have created, curated, and publicly shared the Victorian400 dataset for the deep learning colorization of illustrations from the Victorian era.
I examine the process of creating and curating the Victorian400 dataset and demonstrate its validity by examining test-set results from a model trained on it. I tested the dataset with the pix2pix model introduced by Isola et al., which learns automatic image-to-image operations from datasets using cGANs. pix2pix builds on the GAN model introduced by Goodfellow et al., in which a generator and a discriminator learn by competing with each other to produce the best outputs. Similarly, the generator and the discriminator in cGANs can be used to predict colors for input images through the use of encoders and decoders. The pix2pix model uses U-Net, a convolutional network for image segmentation with skip connections, which makes it possible to capture both local and contextual information quickly. After showing the results on the test set from the Victorian400 dataset, I colorize black-and-white illustrations from Charles Dickens’s Bleak House (serialized 1852–1853), for which the plates were created by Hablot Knight Browne (Phiz). Through these experiments, I show that illustrations created with the dark plate technique are well suited to deep learning colorization because of the distinct contrast between darkness and highlights. I discuss the limits of colorizing black-and-white illustrations with the Victorian400 dataset, such as the lack of color in backgrounds and the possible distortion of the original illustrations. Ultimately, I claim that colorized illustrations stimulate the imagination of modern readers and add to their enjoyment when reading fiction.
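The competition between generator and discriminator described above can be made concrete with a small numeric sketch. It follows the objective published by Isola et al. (an adversarial term plus a weighted L1 term) and is not the training code used for the Victorian400 experiments. The function names, the flat pixel lists, and the default weight of 100 are assumptions chosen for illustration.

```python
import math

EPS = 1e-12  # guards against log(0)


def l1_distance(predicted, target):
    """Mean absolute difference between two equally sized flat pixel lists."""
    if len(predicted) != len(target):
        raise ValueError("images must have the same number of values")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def generator_loss(disc_prob_fake, predicted, target, l1_weight=100.0):
    """Generator objective for one example: fool the discriminator and
    stay close to the ground-truth colors.

    disc_prob_fake is the discriminator's probability (0..1) that the
    generated colorization is real, given the grayscale input.
    l1_weight is an assumed value, not one reported in the abstract.
    """
    adversarial = -math.log(max(disc_prob_fake, EPS))
    return adversarial + l1_weight * l1_distance(predicted, target)


def discriminator_loss(disc_prob_real, disc_prob_fake):
    """Discriminator objective: score real pairs high and generated pairs low."""
    real_term = -math.log(max(disc_prob_real, EPS))
    fake_term = -math.log(max(1.0 - disc_prob_fake, EPS))
    return real_term + fake_term


if __name__ == "__main__":
    target = [0.8, 0.2, 0.1, 0.6]      # hypothetical ground-truth color values
    predicted = [0.7, 0.3, 0.1, 0.5]   # hypothetical generator output
    print("generator loss:", generator_loss(0.4, predicted, target))
    print("discriminator loss:", discriminator_loss(0.9, 0.4))
```

In this sketch the generator's loss falls when the discriminator is fooled (a higher `disc_prob_fake`) and when the predicted colors move closer to the target, which is the trade-off the two networks negotiate during training.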
The Victorian400 dataset was created for data scientists and digital humanists who create, train, and test deep learning colorization models. It will not only save a tremendous amount of time for digital humanists who experiment with Victorian illustrations, but will also contribute to the development of deep learning-based research in the digital humanities. As a digital humanist, I believe that we should create, curate, and share humanities datasets for deep learning, like the Victorian400 dataset, and perform exploratory data analysis on them, since humanities datasets created by digital humanists are credible enough to be deployed for deep learning.
In review
Tokyo, Japan
July 25, 2022 - July 29, 2022
361 works by 945 authors indexed
Held in Tokyo and remote (hybrid) on account of COVID-19
Conference website: https://dh2022.adho.org/
Contributors: Scott B. Weingart, James Cummings
Series: ADHO (16)
Organizers: ADHO