Customized Video Playback Using a Standard Metadata Format

poster / demo / art installation
  1. Michael Bush

    Center for Language Studies - Brigham Young University

  2. Alan K. Melby

    Department of Linguistics - Brigham Young University

Work text

A key mantra in the early days of the digital cellular telephone revolution was “anything, anytime, anywhere.” Nicholas Negroponte, director of MIT’s Media Lab, modified the thrust of this declaration, coining a phrase that some would say should be the slogan of the Information Age: “nothing, nowhere, never unless it is timely, important, amusing, relevant, or capable of engaging my imagination.” It is growing increasingly difficult to meet the challenge posed by this high principle, given the explosive growth of digital media that the world faces today. Sheer volume is making it ever harder to identify accurately, in time and space, where digital assets of interest are to be found. To find what we need (or want!) when we need it, techniques are needed not only to represent the assets but also to describe them in a way that facilitates storage, search, retrieval, and even playback.
The Text Encoding Initiative (TEI) and the associated international and interdisciplinary standard have addressed this challenge and made it possible for a wide variety of individuals and organizations to encode texts in such a way as to facilitate sharing of encoded texts and processing tools, thus enabling important improvements in research and teaching. Yet the Information Age is such that it is no longer possible to rely primarily on corpora created from the written word alone. Because cultural and historical artifacts of today’s society are often based on digital media other than text, it is necessary to devise standard solutions to ensure that all resources, in whatever digital medium they exist, are accessible, retrievable, and usable in a wide variety of settings.
Similar principles were important for TEI, and they are important today for digital media such as DVDs and audiovisual files transmitted over the Internet. With this wide array of distribution solutions, the problem of access now takes on new forms that exist on two levels: macro (global) and micro (local). At the macro level, it can be a challenge to find a particular type of video asset that conforms to some list of desired characteristics. Once a video “document” (that is, an asset) is located, however, access takes on a micro-level dimension, as it becomes necessary to show only those segments of interest or to avoid showing portions that are not pertinent or appropriate, for whatever reason, to the need at hand. Such selective use can be problematic from many standpoints, some technical and some legal, depending on the circumstances.
Access at both the macro (global) level and the micro (local) level depends on the availability of a descriptive mechanism that is sufficiently powerful to find the right portion of the right digital video asset. Such a descriptive mechanism, or Video Asset Description (VAD), should provide clear and searchable metadata that can be combined with display systems that are based on specialized DVD players driven by lists of video clips selected from a VAD. In either case, such systems provide access at both levels, to the video materials themselves and to specific portions of the video.
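To make the two levels of access concrete, here is a minimal sketch in Python. It is not the authors' VAD format: the class names, the fields (language, keywords), and the sample library are all hypothetical, chosen only to show a macro-level filter on whole assets followed by a micro-level filter on segments, yielding a clip list of the kind a player could be driven by.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float          # seconds from the beginning of the asset
    end: float            # seconds from the beginning of the asset
    keywords: frozenset   # searchable descriptors (hypothetical field)


@dataclass
class VideoAsset:
    title: str
    language: str         # asset-level metadata (hypothetical field)
    segments: list = field(default_factory=list)


def find_clips(assets, language, keyword):
    """Macro step: keep assets in the requested language.
    Micro step: keep only the segments tagged with the keyword."""
    clips = []
    for asset in assets:
        if asset.language != language:
            continue
        for seg in asset.segments:
            if keyword in seg.keywords:
                clips.append((asset.title, seg.start, seg.end))
    return clips


# Invented sample data, for illustration only.
library = [
    VideoAsset("Film A", "fr", [
        Segment(0.0, 95.0, frozenset({"greeting", "market"})),
        Segment(95.0, 240.0, frozenset({"argument"})),
    ]),
    VideoAsset("Film B", "de", [
        Segment(10.0, 60.0, frozenset({"greeting"})),
    ]),
]

clip_list = find_clips(library, language="fr", keyword="greeting")
print(clip_list)  # [('Film A', 0.0, 95.0)]
```

The resulting list of (title, start, end) tuples stands in for the "list of video clips selected from a VAD"; a real description would carry far richer metadata than a set of keywords.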
For such systems to work, it is necessary to describe digital media efficiently, a need addressed by the Moving Picture Experts Group (MPEG), which has developed a standard for describing multimedia content data known as MPEG-7. In contrast with the committee’s previous efforts, which resulted in standards for compressing and/or streaming digital video, this effort produced an XML schema, somewhat comparable in purpose to TEI. This standard, formally named “Multimedia Content Description Interface,” supports some degree of semantic interpretation that is accessible by compatible software and has been adopted by the International Organization for Standardization (ISO).
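As a rough illustration of what an XML-based description makes possible, the Python sketch below reads segment boundaries and annotations from a small XML fragment. The element and attribute names (VideoDescription, Segment, Annotation, start, duration) are invented for this example and are deliberately much simpler than the real MPEG-7 schema; the fragment is not valid MPEG-7 and only shows the general idea of software consuming a structured description.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified description; NOT schema-valid MPEG-7.
DESCRIPTION = """
<VideoDescription title="Film A">
  <Segment id="s1" start="0" duration="95">
    <Annotation>Two characters greet each other at a market.</Annotation>
  </Segment>
  <Segment id="s2" start="95" duration="145">
    <Annotation>An argument over prices.</Annotation>
  </Segment>
</VideoDescription>
"""


def load_segments(xml_text):
    """Parse the simplified description into a list of segment dicts."""
    root = ET.fromstring(xml_text.strip())
    segments = []
    for node in root.findall("Segment"):
        start = float(node.get("start"))
        duration = float(node.get("duration"))
        segments.append({
            "id": node.get("id"),
            "start": start,
            "end": start + duration,  # end time derived from start + duration
            "annotation": node.findtext("Annotation", default="").strip(),
        })
    return segments


segments = load_segments(DESCRIPTION)
for seg in segments:
    print(seg["id"], seg["start"], seg["end"], seg["annotation"])
```

Because the description is plain XML, any compatible tool can extract the same segment information without touching the video data itself.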
Our group at Brigham Young University participated in the latter phases of the MPEG-7 development process, working specifically with representatives of Motorola and NHK (the Japan Broadcasting Corporation). This group developed an MPEG-7 “profile” (a subset of the MPEG-7 standard) that has been incorporated into Part 9 of the recently published MPEG-7 standard. Given the complexity of MPEG-7, it was necessary to develop these profiles as subsets of MPEG-7 “tools” (data element types) that cover certain functionalities.
Our interest in MPEG-7 grew out of our long-held desire to give teachers, researchers, and learners easy access to video clips in a wide variety of settings: homes, learning centers, offices, classrooms, and libraries. Indeed, we have based our work on the desire to make available what is needed, when it is needed. This principle is clearly important not only for consumers or users of digital media but also for producers of the media, as well as for libraries or online repositories where the digital media are stored. The end result is “customized video playback” (CVP), made possible by technologies built on descriptions created using standard metadata formats. CVP includes playback of a video asset under the control of a list of commands that define which segments of the video are played, in what order, and with which annotations the viewer can interact.
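The notion of playback "under the control of a list of commands" can be sketched as follows in Python. This is an assumption-laden illustration, not the authors' command format: the PlayCommand class, its fields, and the seek_and_play callback (standing in for whatever a real player exposes) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayCommand:
    start: float           # seconds into the asset
    end: float             # seconds into the asset
    annotation: str = ""   # text offered to the viewer during the segment


def run_playlist(commands, seek_and_play):
    """Play each command in list order; return the total seconds played.

    seek_and_play is a hypothetical player hook taking (start, end, annotation).
    """
    total = 0.0
    for cmd in commands:
        if cmd.end <= cmd.start:
            raise ValueError(f"empty or reversed segment: {cmd}")
        seek_and_play(cmd.start, cmd.end, cmd.annotation)
        total += cmd.end - cmd.start
    return total


# The list order, not the source order, controls playback:
# here a later scene is shown before an earlier one.
playlist = [
    PlayCommand(300.0, 330.0, "Note the formal greeting."),
    PlayCommand(45.0, 60.0),
]

log = []  # records what a real player would have been asked to do
total_seconds = run_playlist(playlist, lambda s, e, note: log.append((s, e, note)))
print(total_seconds)  # 45.0
```

Because the command list is separate from the video itself, the same asset can be replayed identically on another occasion, or with a different list for a different audience, without re-encoding any clips.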
Our approach to Customized Video Playback helps achieve repeatable, customized viewing of a video program. There are basically two ways to achieve digital-technology-based Customized Video Playback. One is a file-based approach that requires clips to be digitized and encoded for playback, raising several challenges that have to be addressed. Particular drawbacks of this approach are that (1) it is time-consuming, (2) it requires specialized skills, (3) it requires specialized, expensive software and hardware tools, and (4) it can violate copyright law. A second approach, one for which we have developed important techniques and technologies, involves the creation of a content data model that very accurately describes the video asset of interest.

Conference Info



Hosted at Université Paris-Sorbonne, Paris IV (Paris-Sorbonne University)

Paris, France

July 5, 2006 - July 9, 2006

151 works by 245 authors indexed

The effort to establish ADHO began in Tuebingen, at the ALLC/ACH conference in 2002: a Steering Committee was appointed at the ALLC/ACH meeting in 2004, in Gothenburg, Sweden. At the 2005 meeting in Victoria, the executive committees of the ACH and ALLC approved the governance and conference protocols and nominated their first representatives to the ‘official’ ADHO Steering Committee and various ADHO standing committees. The 2006 conference was the first Digital Humanities conference.

Series: ACH/ICCH (26), ACH/ALLC (18), ALLC/EADH (33), ADHO (1)

Organizers: ACH, ADHO, ALLC

  • Keywords: None
  • Language: English
  • Topics: None