Instituto de Matemática Pura e Aplicada
This poster focuses on OBSERVATORIO2016, a web-based platform for collecting, structuring and visualizing the online response to Rio-2016 from content shared on Twitter. The project was developed at the Vision and Computer Graphics Laboratory and is based on two interrelated research lines. First, the conception and design of the OBSERVATORIO2016 website, which provides a space to explore comments and images about the Olympics through structured visualizations. Second, the ongoing research on applying digital methods to explore large sets of images. The goal is to highlight how we use Deep Learning applications, such as automatic image classification, together with visualization techniques to enhance the discoverability and expression of subject features within an extensive image collection.
Method: Deep Learning and mosaic visualization
OBSERVATORIO2016 collected around 1 million tweets from April 18th to August 25th, 2016. In order to gather different perspectives on the debate about the Olympics, we created seven Twitter search queries. The data was presented in eight interactive visualizations.
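As an illustration of this collection step, below is a minimal sketch using the Tweepy library against the Twitter Search API. The credentials and query strings are placeholders: the poster does not list the seven actual queries, and the project's tooling is not specified.

```python
# Hedged sketch of tweet collection with Tweepy (Twitter Search API, v3.x).
# Credentials and queries below are placeholders, not the project's values.
import tweepy

auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')
api = tweepy.API(auth, wait_on_rate_limit=True)

QUERIES = ['#Rio2016', '#TochaOlimpica']  # placeholder search queries

for query in QUERIES:
    # Page through search results, respecting rate limits.
    for status in tweepy.Cursor(api.search, q=query, count=100).items(500):
        media = status.entities.get('media', [])
        photo_urls = [m['media_url'] for m in media if m.get('type') == 'photo']
        print(status.id, status.created_at, photo_urls)
```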
Approximately 180,000 of the collected tweets included unique images. The production of large sets of digital images inaugurates new avenues for researchers interested in human creative practice. In this sense, the investigation of Lev Manovich (2012) on digital methods to study visual culture is quite relevant. With this in mind, we explored Rio-2016 images through Deep Learning approaches. As the investigation is currently ongoing, we will report the research process of a single task, which resulted in the Torch Mosaic visualization.
During the pre-Olympic period, it became evident that many of the images gathered by our query scripts were related to the Olympic torch relay and depicted the iconic object. Some of these images were accompanied by texts that mentioned the torch, but not all. In addition, some tweets mentioning the torch relay incorporated images that did not depict the object. In other words, text analysis alone was not sufficient to detect the set of images containing the torch.
Thus, we turned to a Deep Learning approach to recognize the Olympic torch in our database. The field of visual pattern recognition has recently been advanced by the efficient performance of Convolutional Neural Networks (CNNs). In 2012, the work of Krizhevsky et al. on training a deep convolutional neural network to classify the 1.2 million images of the ImageNet LSVRC-2010 contest into 1,000 different classes had a substantial impact on the computer vision community.
More recently, computer vision tasks such as image classification have become more accessible and applicable, thanks largely to Google's release of the open-source software library TensorFlow in 2015. The library runs image classification on the Inception-v3 CNN model (Szegedy et al., 2015), which can be retrained on a distinct image classification task (a capability referred to as transfer learning). By creating a set of training images, it is possible to update the parameters of the model and use it to recognize a new image category. We therefore retrained the network by showing it a sample of 100 manually labeled images containing the torch. Finally, the retrained network ran over our database and returned, for each image, a confidence score for the new category.
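As a minimal sketch of the scoring step, the following Python code loads a graph produced by TensorFlow's standard image-retraining example and scores one image. The file names ('output_graph.pb', 'output_labels.txt') and tensor names ('DecodeJpeg/contents:0', 'final_result:0') follow that example's defaults and are assumptions; the poster does not document the project's exact pipeline.

```python
# Scoring images with a retrained Inception-v3 graph (TensorFlow 1.x API).
import tensorflow as tf

GRAPH_PATH = 'output_graph.pb'     # graph exported by the retraining script
LABELS_PATH = 'output_labels.txt'  # one label per line, e.g. 'torch'

def load_graph(path):
    """Load a frozen GraphDef into a new Graph."""
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
    return graph

def score_image(graph, labels, image_path):
    """Return (label, confidence) pairs for one JPEG image, best first."""
    image_data = tf.gfile.GFile(image_path, 'rb').read()
    with tf.Session(graph=graph) as sess:
        scores = sess.run('final_result:0',
                          {'DecodeJpeg/contents:0': image_data})
    return sorted(zip(labels, scores[0]), key=lambda p: -p[1])

labels = [line.strip() for line in open(LABELS_PATH)]
graph = load_graph(GRAPH_PATH)
print(score_image(graph, labels, 'tweet_image_0001.jpg'))
```

Running the retrained network over the whole database then amounts to calling such a scoring function per image and keeping those whose confidence for the torch category exceeds a chosen threshold.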
Figure 2. Confidence score over 83%
Figure 5. Tile images
Until June 25th, around 1,500 images with a confidence score over 85% for the Olympic torch category had been classified by our network. We used them to create a mosaic visualization that can be zoomed and panned. The idea of the mosaic is that, given one image (the target image), another image (the mosaic) is automatically built up from several smaller images (the tile images). To implement the mosaic, we used OpenSeadragon, a web-based viewer for high-resolution zoomable images.
Figure 3. The target image
Figure 4. The mosaic
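One common way to construct such a photomosaic is to match each cell of the target image to the tile image with the closest average color. The sketch below, in Python with Pillow and NumPy, illustrates that generic approach; the poster does not specify the exact algorithm used for the Torch Mosaic, so this is an assumption for illustration.

```python
# Generic photomosaic sketch: match target-image cells to tile images
# by average RGB color (an assumed method, for illustration only).
from PIL import Image
import numpy as np

def avg_color(img):
    """Mean RGB of an image as a length-3 float vector."""
    return np.asarray(img.convert('RGB'), dtype=float).reshape(-1, 3).mean(axis=0)

def build_mosaic(target_path, tile_paths, cell=24):
    target = Image.open(target_path).convert('RGB')
    cols, rows = target.width // cell, target.height // cell
    # Pre-resize every tile image and cache its average color.
    tiles = [Image.open(p).convert('RGB').resize((cell, cell)) for p in tile_paths]
    tile_colors = np.stack([avg_color(t) for t in tiles])
    mosaic = Image.new('RGB', (cols * cell, rows * cell))
    for r in range(rows):
        for c in range(cols):
            box = (c * cell, r * cell, (c + 1) * cell, (r + 1) * cell)
            # Pick the tile whose average color is closest to this cell's.
            best = np.linalg.norm(tile_colors - avg_color(target.crop(box)),
                                  axis=1).argmin()
            mosaic.paste(tiles[int(best)], box[:2])
    return mosaic
```

The finished mosaic can then be cut into a deep-zoom image pyramid and served to OpenSeadragon for smooth zooming and panning.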
The organizing principle of the mosaic visualization is mainly aesthetic. Nevertheless, zooming and panning the mosaic allows the user to explore a wide variety of views and to discover image details and surprises, such as a spoof picture of Fofão, a Brazilian fictional character, carrying the torch.
For the DH2017 poster session, we expect to present results for another subject feature, the sporting disciplines, and another visualization technique, the videosphere, both of which we are currently working on. In addition, we plan to discuss with participants possible scenarios in which Deep Learning models could be applied to help image collections become more discoverable and expressive.
Bibliography
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet classification with deep convolutional neural networks". Advances in Neural Information Processing Systems, pp. 1097-1105.
Manovich, L. (2012). "How to Compare One Million Images?". In Berry, D. (ed.), Understanding Digital Humanities. Palgrave, pp. 249-278.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). "Going deeper with convolutions". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9.