Personal Video Manager: A Tool for Navigating in Video Archives

Matti Hosio; Mika Rautiainen; Ilkka Juuso; Ikka Hanski; Jukka Kortelainen; Matti Varanka; Tapio Seppänen; Timo Ojala

Authorship

1. Matti Hosio, University of Oulu
2. Mika Rautiainen, University of Oulu
3. Ilkka Juuso, University of Oulu
4. Ikka Hanski, University of Oulu
5. Jukka Kortelainen, University of Oulu
6. Matti Varanka, University of Oulu
7. Tapio Seppänen, University of Oulu
8. Timo Ojala, University of Oulu

Work text

The amount of digital information is growing rapidly, and digital set-top-boxes are becoming common in homes. Until recently, people recorded their favourite television shows on video tapes and stored them on bookshelves. Now they can record them in digital format on the hard disk of a digital receiver. This leads to growing repositories of digital video content with no annotation except the title of the recording. As the amount of data grows, finding a particular video becomes more and more difficult. The problem is not new: it was much the same before the digital era. Poorly labelled piles of VHS cassettes on bookshelves did not provide much help for people searching for a favourite episode of a television series recorded years ago.
The traditional way to search for a particular scene in a video stored in a large archive is to select a video and start searching from the beginning. The search stops when the relevant position is found or the end of the video is reached. This method may take a long time, especially if the user does not know the content of the video well. Moreover, generic searches, such as finding video clips with a certain theme, are even more demanding with this approach.
What we propose here is a new computer-aided way of searching and browsing a video repository. The video collections are processed and analysed automatically by the computer that controls the data. Based on the analysis, the computer produces additional data structures that later support searches of the archive. With this approach, time-consuming manual searching of videos for something specific is no longer necessary. In fact, one does not even have to remember the title of the video one is looking for. The only information required for creating queries consists of keywords and mental imagery about the subject of interest. This kind of information is easier to remember than categorical information, such as titles of videos. In addition, the search takes only a small fraction of the time needed with more conventional methods. The search concept described above is called content-based video retrieval.
A prototype of this search concept, Personal Video Manager, was developed at the MediaTeam Oulu research group [1] of the Department of Electrical and Information Engineering at the University of Oulu, Finland. It is a video search and browsing application developed especially for digital set-top-boxes: it combines sophisticated video analysis and search software with an easy-to-use user interface.
Analysis of the video material is fully automatic and thus no manual annotation is required. The user interface was designed to be simple and effective, so that the system would be usable without special training.
Research on automatic video content analysis and retrieval has a long tradition at the MediaTeam Oulu research group. The technologies developed include methods that measure visual similarities between video clips using novel relevance metrics [2][3], and efficient content-based indexing structures that utilise both automatically constructed speech transcripts and specific visual cues extracted from the videos [4]. The Personal Video Manager application makes use of these and other technologies developed during the past years.
The video material must be indexed before the search system can be used. Indexes are created by analysing the video content automatically. First, the software segments long videos into shots. Each shot represents one continuous camera run bounded by visual transitions. The shots are then analysed by extracting numerical descriptions from the visual content, and these are indexed for content-based retrieval by the search system. Text transcripts are pre-processed by simple stemming and stop word removal, and inverted document indexes are created with a temporal expansion. The resulting text indexes are matched with the extracted shot segments. A search collects and returns the shots matching the query definition. Users can therefore locate relevant spots within seconds, even when the collections contain several hours of video.
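The shot segmentation step can be illustrated with a minimal sketch. The abstract does not say how the prototype detects visual transitions, so the approach here (thresholding the difference between grey-level histograms of consecutive frames) is an assumption, as are the function names, the bin count and the threshold value. Frames are represented as flat lists of 8-bit grey levels to keep the example self-contained.

```python
from typing import List, Sequence, Tuple


def grey_histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalised histogram of 8-bit grey levels (0..255)."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = float(len(frame)) or 1.0  # avoid division by zero for empty frames
    return [c / total for c in counts]


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two normalised histograms (range 0..2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def segment_shots(frames: Sequence[Sequence[int]],
                  threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) index pairs, one per detected shot.

    A new shot starts wherever consecutive frames differ by more than
    `threshold` (a hypothetical value chosen for illustration only).
    """
    if not frames:
        return []
    shots = []
    start = 0
    previous = grey_histogram(frames[0])
    for i in range(1, len(frames)):
        current = grey_histogram(frames[i])
        if histogram_distance(previous, current) > threshold:
            shots.append((start, i - 1))
            start = i
        previous = current
    shots.append((start, len(frames) - 1))
    return shots


# Three dark frames followed by two bright frames form two shots.
dark, bright = [10] * 16, [240] * 16
shots = segment_shots([dark, dark, dark, bright, bright])
```

A hard threshold of this kind only catches abrupt cuts; gradual transitions such as fades and dissolves would need a more elaborate detector.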
Because of the demanding usability requirements of our prototype, a large proportion of the system design time was devoted to the user interface. The final version of the prototype provides users with two complementary navigational methods. The principal method of access, which is also the more traditional one, is a simple keyword search. User-defined query terms are compared to the text indexes of the whole video database, and the most relevant video shots are returned. The system ranks the results by relevance and displays them to the user as still images accompanied by a short piece of descriptive text. In this representation each image corresponds to a single shot stored in the archive. The displayed text is an extract of the text content of the shot. From this information the user can quickly get an impression of the shot content.
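The text indexing and ranked keyword search can be sketched as follows. This is an illustrative reading, not the prototype's implementation: the stop word list, the crude suffix-stripping stemmer, the additive scoring, and the interpretation of temporal expansion (a term spoken in one shot is also credited, with reduced weight, to neighbouring shots) are all assumptions, as are the names `build_index`, `search`, `window` and `decay`.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}  # tiny illustrative list


def simple_stem(word: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> List[str]:
    """Lower-case, strip punctuation, drop stop words, then stem."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [simple_stem(w) for w in words if w and w not in STOP_WORDS]


def build_index(transcripts: List[str], window: int = 1,
                decay: float = 0.5) -> Dict[str, Dict[int, float]]:
    """Inverted index: term -> {shot_id: weight}.

    Assumed temporal expansion: each term also contributes to shots up to
    `window` positions away, weighted by decay ** distance.
    """
    index: Dict[str, Dict[int, float]] = defaultdict(lambda: defaultdict(float))
    for shot_id, text in enumerate(transcripts):
        for term in tokenize(text):
            for offset in range(-window, window + 1):
                neighbour = shot_id + offset
                if 0 <= neighbour < len(transcripts):
                    index[term][neighbour] += decay ** abs(offset)
    return index


def search(index: Dict[str, Dict[int, float]], query: str,
           limit: int = 10) -> List[Tuple[int, float]]:
    """Return (shot_id, score) pairs, best first; ties broken by shot_id."""
    scores: Dict[int, float] = defaultdict(float)
    for term in tokenize(query):
        for shot_id, weight in index.get(term, {}).items():
            scores[shot_id] += weight
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# One transcript string per shot, in temporal order (made-up sample text).
transcripts = ["The president visits Oulu", "Weather in the north",
               "Ice hockey results", "Elections and voting"]
index = build_index(transcripts)
results = search(index, "visiting Oulu")
```

In this sample, shot 0 matches both query terms directly, while shot 1 appears in the results only through the temporal expansion, which reflects the idea that speech about a topic often continues across adjacent shots.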
The other navigational tool is based on visual similarities between video clips. For each result returned by the keyword-based search, the system provides visually similar video shots based on computational similarity measurements. Once a shot that visually matches the search need is found, several other relevant candidates can be found among its visually similar results. This functionality provides an alternative way of finding relevant content, even when no textual content descriptions are available.
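Browsing by visual similarity amounts to a nearest-neighbour query over per-shot feature vectors. The sketch below is a simplified stand-in: the cited work [2][3] uses colour correlograms, whereas this example assumes short, hand-made feature vectors and a plain L1 distance. The shot identifiers and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def most_similar(query_id: str, features: Dict[str, Sequence[float]],
                 limit: int = 5) -> List[Tuple[str, float]]:
    """Return the `limit` shots closest to `query_id`, nearest first."""
    query = features[query_id]
    distances = [(shot_id, l1_distance(query, vector))
                 for shot_id, vector in features.items()
                 if shot_id != query_id]
    distances.sort(key=lambda item: (item[1], item[0]))
    return distances[:limit]


# Made-up three-bin colour features for three shots.
features = {
    "studio_1": [0.5, 0.25, 0.25],
    "studio_2": [0.5, 0.5, 0.0],
    "forest": [0.0, 0.25, 0.75],
}
neighbours = most_similar("studio_1", features, limit=2)
```

A linear scan like this is adequate for a collection of a few thousand shots; a larger archive would call for an indexing structure over the feature vectors.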
By using the search methods presented above, relevant results can be obtained relatively quickly. The better the user can describe what he/she is looking for, the faster the search process is. When a relevant video shot is found, it can easily be played using the video player software provided. The user can also view the video in its entirety using conventional video playback options.
The prototype system was developed for the Oulu Expo exhibition [5], a showcase for the technological know-how of Oulu, where the public can try it out. The system contains over sixteen hours of news material provided by the Finnish public service broadcasting company YLE [6]. Detailed user instructions are provided so that all visitors to the exhibition are able to use the system. The system has been configured to collect information about how visitors actually use it. This will help to assess which of the offered navigational methods users prefer.
We strongly believe that the problem imposed by growing digital video archives will create the need for applications such as the Personal Video Manager. Such systems will probably be a standard part of future digital set-top-boxes when the required technological maturity is reached.
References
[1] MediaTeam Oulu research group. http://www.mediateam.oulu.fi/?lang=en
[2] Ojala T, Rautiainen M, Matinmikko E & Aittola M. (2001). Semantic image retrieval with HSV correlograms. Proc. 12th Scandinavian Conference on Image Analysis, Bergen, Norway, 621-627.
[3] Rautiainen M & Doermann D. (2002). Temporal Color Correlograms for Video Retrieval. Proc. 16th International Conference on Pattern Recognition, Quebec, Canada, 1:267-270.
[4] Rautiainen M, Ojala T & Seppänen T. (2004). Analysing the performance of visual, concept and text features in content-based video retrieval. Proc. 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, New York, NY, 197-205.
[5] Oulu Expo. http://www.tietomaa.fi/eng/nayttelyt/ouluexpo.html
[6] YLE. http://www.yle.fi/fbc/thisyle.shtml

Full text license: This text is republished here with permission from the original rights holder.

Conference Info

Complete

ACH/ALLC / ACH/ICCH / ADHO / ALLC/EADH - 2006

Hosted at Université Paris-Sorbonne, Paris IV (Paris-Sorbonne University)

Paris, France

July 5, 2006 - July 9, 2006

151 works by 245 authors indexed

The effort to establish ADHO began in Tuebingen, at the ALLC/ACH conference in 2002: a Steering Committee was appointed at the ALLC/ACH meeting in 2004, in Gothenburg, Sweden. At the 2005 meeting in Victoria, the executive committees of the ACH and ALLC approved the governance and conference protocols and nominated their first representatives to the ‘official’ ADHO Steering Committee and various ADHO standing committees. The 2006 conference was the first Digital Humanities conference.

Conference website: http://www.allc-ach2006.colloques.paris-sorbonne.fr/

Series: ACH/ICCH (26), ACH/ALLC (18), ALLC/EADH (33), ADHO (1)

Organizers: ACH, ADHO, ALLC