The Creation of Music Query Documents: Framework and Implications of the HUMIRS Project

poster / demo / art installation
Authorship
  1. J. Stephen Downie

    Graduate School of Library and Information Science (GSLIS) - University of Illinois, Urbana-Champaign

Music Information Retrieval (MIR) and Music Digital Library (MDL) are inter-related multidisciplinary research endeavors that strive to develop innovative content-based searching schemes, novel interfaces, and evolving networked delivery mechanisms in an effort to make the world’s vast store of music accessible to all. Some teams are developing “Query-by-Singing” systems (e.g., [1,2]), some “Query-by-Note” systems (e.g., [3,4]), some “Query-by-Example” systems (e.g., [5,6]), some comprehensive music recommendation and distribution systems (e.g., [7,8]), some musical analysis systems (e.g., [9,10]), and so on. Good overviews of MIR/MDL’s interdisciplinary research areas can be found in [6,11,12].
This poster outlines an ambitious music query analytic model that promises to have significant implications for the future of humanities computing. It specifically details the framework of the "Human Use of Music Information Retrieval Systems" (HUMIRS) project being initiated at the University of Illinois at Urbana-Champaign (UIUC). HUMIRS is an integral sub-component of the author's Music Information Retrieval / Music Digital Library Evaluation Project. The "Evaluation Project" has an initial four-year time-span (1 October 2003 to 30 September 2007) and is being supported via significant funding from both the Andrew W. Mellon Foundation and the National Science Foundation. The project is a co-operative undertaking involving researchers from the Graduate School of Library and Information Science, the National Center for Supercomputing Applications (NCSA), the Faculty of Music, the Music Library, Electrical Engineering and the Computer Science department of UIUC.
Current Problem
The MIR/MDL communities have long recognized the need for a scientific evaluation paradigm. A formal resolution expressing this need was passed on 16 October 2001 by the attendees of the Second International Symposium on Music Information Retrieval (ISMIR 2001). (See http://music-ir.org/mirbib2/resolution for the list of signatories.) The resolution highlights the fact that MIR/MDL research requires access to music materials consisting of inter-connected collections of symbolic (e.g., scores, MIDI, MusicXML, etc.), textual (e.g., lyrics, libretti, reviews, analyses, etc.), audio (e.g., MP3, WAV, etc.) and metadata (e.g., bibliographic records, etc.) information. In conjunction with the requirements for the material infrastructure, the resolution emphasizes the need for a common set of formal and standardized tasks that each team can use to evaluate its system(s). This notion of standardized collections and tasks is derived from the Text Retrieval Conference (TREC) paradigm. TREC, developed over a decade ago by the National Institute of Standards and Technology (NIST), is a testing and evaluation paradigm for the text retrieval community (see http://trec.nist.gov/overview.html). Under this paradigm, each text retrieval team is given access to:
1. a standardized, large-scale test collection of text;
2. a standardized set of test queries; and,
3. a standardized evaluation of the results each team generates.
It is Item #2, the creation of the set of test queries specifically designed to address the needs of the MIR/MDL research communities, that is the HUMIRS project's central concern. We must emphasize here the distinction between "queries" and "search statements". "Queries" are the verbalized expressions of a user's information need, whereas "search statements" are the expression(s) of the queries in the syntax of particular retrieval engines [13]. This distinction is important, for it highlights that music queries, the objects of interest in the HUMIRS project, are:
semantically-rich;
syntactically-undetermined;
structurally independent of any particular search system(s); and,
content variable (i.e., can contain singing, text, recorded examples, notation, etc.).
Understanding the semantic richness, the possible syntactic options, the structural implications and the kinds of information contained in real-world music queries is the sine qua non of the project. The example provided in Appendix A is a perfect illustration of the semantic and structural complexities to be found in real-world music queries [14]. Note how the example contains information about the structure of the desired song, some lyric fragments presented simultaneously with the harmonic accompaniment, some rather specific tempo information, some suggestions of genre and style information, some suggestions of similarity, some indications of the time frame in which the piece is situated, and so on.
Human Use of Music Information Retrieval Systems (HUMIRS)
As we iteratively create the TREC-like evaluation programme for the MIR/MDL communities, it is very important that our TREC-like evaluation tasks be grounded in real-world requirements. That is, we must ensure that the test tasks developed are realistic proxies for the kinds of uses that MIR/MDL systems might expect to encounter. Synthesizing from the suggestions made by the expert participants of the preceding “MIR/MDL Evaluation Frameworks Project,” (see http://music-ir.org/evaluation) it appears that a minimal TREC-like query record needs to include the following basic elements:
High quality audio representation(s)
Verbose Metadata:
i. About the “user”
ii. About the “need”
iii. About the “use”
Symbolic representation(s) of the music presented
One is struck by how these requirements are less like a traditional TREC query or "topic statement" (Fig. 1) and more like the kind of information garnered in a traditional, well-conducted, reference interview [15, 16]. This suggests that the involvement of professional music librarians in the development of the TREC-like music query records is very important — perhaps even critical. Thus, the UIUC Music Library and its staff will be playing a pivotal role in the project, as will other members of the international music library world.
<num> Number: 409
<title> legal, Pan Am, 103
<desc> Description:
What legal actions have resulted from the destruction of Pan Am Flight 103 over Lockerbie, Scotland,
on December 21, 1988?
<narr> Narrative:
Documents describing any charges, claims, or fines presented to or imposed by any court or tribunal
are relevant, but documents that discuss charges made in diplomatic jousting are not relevant.
Figure 1. A TREC "topic statement" as found in [17].
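To make the tagged structure of such a topic statement concrete, the following is a minimal Python sketch that splits a topic into its fields. The sample input is an abbreviated copy of Figure 1 (number, title, and narrative only). The function name `parse_topic` and the `LABELS` table are illustrative assumptions and are not part of any TREC tooling.

```python
import re

# Abbreviated copy of the Figure 1 topic (description field omitted).
TOPIC = """<num> Number: 409
<title> legal, Pan Am, 103
<narr> Narrative:
Documents describing any charges, claims, or fines presented to or imposed by any court or tribunal
are relevant, but documents that discuss charges made in diplomatic jousting are not relevant.
"""

# Human-readable labels that follow some of the tags (assumed from Figure 1).
LABELS = {"num": "Number:", "desc": "Description:", "narr": "Narrative:"}


def parse_topic(text):
    """Split a topic statement into a dict keyed by tag name."""
    fields = {}
    # With a capture group, re.split keeps the tag names:
    # ['', 'num', ' Number: 409\n', 'title', ' legal, ...', ...]
    parts = re.split(r"<(num|title|desc|narr)>", text)
    for tag, body in zip(parts[1::2], parts[2::2]):
        body = " ".join(body.split())  # collapse line breaks and extra spaces
        label = LABELS.get(tag)
        if label and body.startswith(label):
            body = body[len(label):].strip()
        fields[tag] = body
    return fields


topic = parse_topic(TOPIC)
print(topic["num"], "|", topic["title"])
```

A flat record of this kind has no place for audio, symbolic notation, or information about the user, which is the gap the music query records are meant to fill.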
Developing a database of standardized MIR query records
Over the course of the project we will be setting up, and running, the first TREC-like evaluation “contests” whereby each MIR/MDL research team will be given the opportunity to test their system(s) using the set of standardized test queries and results that we will be developing. This will be an iterative process as feedback from the research community and the project’s advisory panel will be solicited to ensure that the test queries and the evaluation measures used are truly reflective of system performance (i.e., the evaluations give us a true picture of the strengths and weaknesses of each participating system). Based upon the data collected, we will develop a database of well-formed, standardized query record prototype(s). These records will be developed to reflect both the complexity of music queries and the principle of retrieval neutrality (i.e., will not favour one retrieval approach over another). They will incorporate, in some form, textual, metadata, audio, and symbolic information. The two central production and research facets of this complex problem are outlined below.

Facet #1: Creation of the formal specifications for the query record content.

This includes the specifications for the type(s) of audio formats to be used (e.g., WAV, MP3, etc.) and the selection of necessary sampling rates, etc. Assessment of transcription protocols (e.g., audio-to-symbolic, symbolic-to-audio, audio-to-text, etc.) will be conducted to establish the mechanisms for completing the query records (e.g., making sure that an aural query also has a symbolic representation, etc.). The requisite types of textual, needs-and-uses, user, and metadata information will also be delineated. This is the aspect of the project that will draw most heavily upon the prior work of the humanities computing community. We will need to develop means of analyzing the semantic categories present in the real-world queries we capture. Once the categories have been uncovered we will then develop the necessary mark up languages to fully delineate the variety of semantic types present. We hope that we will be able to integrate many pre-existing categories from other encoding initiatives. We further hope to find formal mechanisms for making whatever new (i.e., music specific) categories hospitable for inclusion in a broader (i.e., generalizable) conceptualization of document mark up schemes.
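As a rough illustration of what identifying semantic categories in a captured query could involve, the sketch below applies a few regular expressions to three lines taken from the Appendix A query. The category names (`date_range`, `tempo_bpm`, `similar_to`) and the patterns are hypothetical. They are not categories defined by the project, and a real analysis would need far richer methods than pattern matching.

```python
import re

# Three lines from the Appendix A query, used as sample input.
QUERY = """This song MUST be from the period 1979-1984, most likely 1981 or 1982.
Tempo: about 120 bpm
Sounds VERY close to a SAGA or Asia tune (maybe it is SAGA even! ;)
"""

# Hypothetical category names mapped to illustrative patterns.
PATTERNS = {
    "date_range": re.compile(r"\b(1[89]\d{2}|20\d{2})-(1[89]\d{2}|20\d{2})\b"),
    "tempo_bpm": re.compile(r"\b(\d{2,3})\s*bpm\b", re.IGNORECASE),
    "similar_to": re.compile(r"close to an? ([A-Za-z]+) or ([A-Za-z]+)"),
}


def tag_categories(text):
    """Return the first match found for each category, as a tuple of groups."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.groups()
    return found


tags = tag_categories(QUERY)
for name, value in tags.items():
    print(name, value)
```

Even this toy version shows why hand-built patterns will not scale: the lyric fragments, chord symbols, and hedged guesses in the same query fall outside all three patterns.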

Facet #2: Creation of the formal specifications for the query record structure.

As we gain experience with the types of information needed to properly represent a query, we will be in a better position to ascertain their structural implications. We are currently experimenting with various metadata standards. One, based upon Indiana University’s Variations2 research (http://www.dml.indiana.edu/metadata/index.html), uses XML as the structural wrapper. Our final specifications will be biased toward extensibility and possible integration into the overall database system and the proposed Music GRID distribution mechanisms [18]. The construction and dissemination of basic tools to build, edit and interact with the query records will be an important outcome of this part of the project. The query record specifications and the tools developed in association with them will outlive the project. This will allow others in the community to contribute to the ongoing success of future MIR/MDL evaluation experiments and the Music GRID network. It will also allow others to investigate the implications of formal, domain-specific and cross-domain, document mark up schemes for queries involving other media of cultural expression including text, speech, video, film, dance, etc.
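The idea of XML as a structural wrapper can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names below (`queryRecord`, `metadata`, `user`, `need`, `use`, `audio`, `symbolic`, `file`, `format`) are assumptions made for illustration, loosely following the basic elements listed earlier. They do not reproduce the Variations2 metadata model or any specification produced by the project, and the file names and descriptions are placeholders.

```python
import xml.etree.ElementTree as ET


def build_query_record(record_id, user, need, use, audio_files, symbolic_files):
    """Wrap the basic query elements in a single XML record (hypothetical schema)."""
    root = ET.Element("queryRecord", id=record_id)

    # Verbose metadata about the user, the need, and the use.
    metadata = ET.SubElement(root, "metadata")
    for tag, value in (("user", user), ("need", need), ("use", use)):
        ET.SubElement(metadata, tag).text = value

    # Audio representation(s), each a (path, format) pair.
    audio = ET.SubElement(root, "audio")
    for path, fmt in audio_files:
        ET.SubElement(audio, "file", format=fmt).text = path

    # Symbolic representation(s) of the same music.
    symbolic = ET.SubElement(root, "symbolic")
    for path, fmt in symbolic_files:
        ET.SubElement(symbolic, "file", format=fmt).text = path

    return root


# Placeholder values only; none of these come from project data.
record = build_query_record(
    "q0001",
    user="placeholder description of the user",
    need="placeholder description of the information need",
    use="placeholder description of the intended use",
    audio_files=[("q0001.wav", "WAV")],
    symbolic_files=[("q0001.mid", "MIDI")],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping each representation under its own container element is one way to leave room for extension, since new representation types can be added without disturbing existing records.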

Acknowledgements

Drs. Don Waters and Suzanne Lodato, both of the Andrew W. Mellon Foundation, are thanked for their moral and financial support. Karen Medina, Joe Futrelle and Mike Welge are also thanked for their valuable contributions and suggestions throughout the project. This project is also supported by the National Science Foundation (NSF) under Grant Nos. NSF IIS-0340597 and NSF IIS-0327371.

Appendix A. An Exemplar Real-World Music Query (as presented by [14])

From: XXXXXXXXX

Subject: Early 80's - Please identify this song! (it's *very* difficult, though)

Newsgroups: alt.music.lyrics

Date: 2000-12-14 09:42:24 PST

Hi, thiis is so difficult because I only remember those damn FRAGMENTS of it,

which can (in combination with possible errors) make it VERY difficult to identify this song!

But I'll try my best to make myself clear as possible.

This song MUST be from the period 1979-1984, most likely 1981 or 1982.

Tempo: about 120 bpm

Sounds VERY close to a SAGA or Asia tune (maybe it is SAGA even! ;)

OK here I go...(gonna add the chords for you guitarists out there ;)

[verse 1]

F C Bb Bb C

Crazy ................ onto the ..... café

F C Bb

I'm drinking coffee, she came away

F C Bb Bb C
She ordered .............. precious sum of money ???
F C Bb
deedeedeedeedeedeedeedee....
C
Ohohohoo
[(instrumental) F C Bb Bb C F C Bb]
[verse 2] [...]
[chorus]
Dm Bb
Another da-------y, in the afternoon
Dm Bb(7)
my delight (?),
F/C C11
another da-------y, in the afternoon
F
my secret delight (?).
[verse 3] [...]
[BRIDGE] (b b b b)
<MATERIAL DELETED>
???
[repeat bridge]
Ab Db Eb (b) F
[2nd time] ???? fantasy, yeah
Hope that's enough to identify this tune ... :)
References
[1] Haus, G. and Pollastri, E., "An Audio Front End for Query-by-Humming Systems." Second International Symposium on Music Information Retrieval, Bloomington, IN, USA, pp. 65-72, 2001.
[2] Birmingham, W., Dannenberg, R. B., Wakefield, G. H., Bartsch, M., Bykowski, D., Mazzoni, D., Meek, C., Mellody, M., and Rand, W., "MUSART: Music Retrieval Via Aural Queries." Second International Symposium on Music Information Retrieval, Bloomington, IN, USA, pp. 73-81, 2001.
[3] Doraisamy, S. and Rüger, S. M., "A Comparative and Fault-tolerance Study of the Use of N-grams with Polyphonic Music." Third International Conference on Music Information Retrieval, Paris, France, pp. 101-106, 2002.
[4] Pickens, J., "A Comparison of Language Modeling and Probabilistic Text Information Retrieval Approaches to Monophonic Music Retrieval." International Symposium on Music Information Retrieval, Plymouth, MA, USA, 2000.
[5] Haitsma, J. and Kalker, T., "A Highly Robust Audio Fingerprinting System." Third International Conference on Music Information Retrieval, Paris, France, pp. 107-115, 2002.
[6] Downie, J. S., "Music Information Retrieval." Annual Review of Information Science and Technology, vol. 37, pp. 295-340, 2003.
[7] Pauws, S. and Eggen, B., "PATS: Realization and User Evaluation of an Automatic Playlist Generator." Third International Conference on Music Information Retrieval, Paris, France, pp. 222-230, 2002.
[8] Logan, B., "Content-Based Playlist Generation: Exploratory Experiments." Third International Conference on Music Information Retrieval, Paris, France, pp. 295-296, 2002.
[9] Kornstädt, A., "The JRing System for Computer-Assisted Musicological Analysis." Second International Symposium on Music Information Retrieval, Bloomington, IN, USA, pp. 93-98, 2001.
[10] Barthélemy, J. and Bonardi, A., "Figured Bass and Tonality Recognition." Second International Symposium on Music Information Retrieval, Bloomington, IN, USA, pp. 129-136, 2001.
[11] Byrd, D. and Crawford, T. C., "Problems of Music Information Retrieval in the Real World." Information Processing and Management, vol. 38, pp. 249-272, 2002.
[12] Futrelle, J. and Downie, J. S., "Interdisciplinary Communities and Research Issues in Music Information Retrieval." Third International Conference on Music Information Retrieval, Paris, France, pp. 215-221, 2002.
[13] Tague-Sutcliffe, J., "The Pragmatics of Information Retrieval Experimentation, Revisited." Information Processing and Management, vol. 28, pp. 467-490, 1992.
[14] Cunningham, S. J., "User Studies: A First Step in Designing an MIR Testbed." The MIR/MDL Evaluation Project White Paper Collection, Champaign, IL: GSLIS, pp. 17-19, 2002.
[15] Dewdney, P. and Michell, G., "Asking 'Why' Questions in the Reference Interview: A Theoretical Justification." Library Quarterly, vol. 67, pp. 50-71, 1997.
[16] Smith, L. C., "The Reference Interview." Reference and Information Services: An Introduction, eds. Bopp, R. E. and Smith, L. C. Englewood, CO: Libraries Unlimited, pp. 47-68, 2001.
[17] Voorhees, E. M., "Whither Music IR Evaluation Infrastructure: Lessons to be Learned from TREC." The MIR/MDL Evaluation Project White Paper Collection, Champaign, IL: GSLIS, pp. 7-13, 2002.
[18] Dovey, M. J., "Music GRID: A Collaborative Virtual Organization for Music Information Retrieval Collaboration and Evaluation." The MIR/MDL Evaluation Project White Paper Collection, Champaign, IL: GSLIS, pp. 50-52, 2002.

Conference Info

Complete

ACH/ALLC / ACH/ICCH / ALLC/EADH - 2004

Hosted at Göteborg University (Gothenburg)

Gothenburg, Sweden

June 11, 2004 - June 16, 2004

Series: ACH/ICCH (24), ALLC/EADH (31), ACH/ALLC (16)

Organizers: ACH, ALLC
