Knowledge and Reasoning: Connecting Scientific Data and Cultural Heritage

Authorship
  1. Fenella G. France
     Library of Congress
  2. Michael B. Toth
     R.B. Toth Associates

France, Fenella G., Library of Congress, United States of America, frfr@loc.gov
Toth, Michael B., R.B. Toth Associates, mbt.rbtoth@gmail.com
Uncovering and detecting patterns of information in library and museum collection items requires the integration of scholarly and scientific data analyses. Research into the substrate (paper, parchment, etc.) and media (ink, pigment, colorant) that make up a historic object informs the scholar about the provenance of a particular document. This may come through characterization and identification of the media and of the substrate on which the information is recorded, or through watermark identification that can date the document to a specific time period. Linking disparate materials characterization and identification databases with scientific analyses is a critical research challenge, one that requires the development of a knowledge representation (KR) to facilitate the interpretation that enables this humanities research. With cultural heritage objects, the KR must maintain the linkage between the original document, which contains a wealth of knowledge stored contextually, and the digital surrogates that represent it. These links will be explored in reference to cultural treasures of South America, where the integration of scientific analyses and historic scholarly information generated knowledge that enriched the contextual interpretation of the original object. Scientific analyses of textile treasures from Llullaillaco provided an understanding of their use and purpose, while environmental information on Pre-classic Mayan structures in the Mirador Basin allowed an assessment of preservation requirements. Investigations of maps with specific links to South America yielded information on the source of pigments and the geographical location of their original creation.

Historic documents and cultural heritage objects do not generally lend themselves to ease of context analysis, since documentation about the creation of the document is not readily available. Discovery of data with more detail about the context and circumstances that surround the creation of the artifact allows researchers to visualize information previously not detected. This visualization is achieved in Library of Congress studies through advanced spectral imaging techniques that incorporate data from both visible and non-visible regions of the spectrum to create an integrated “digital cultural object” (DCO). The additional contextual information is not apparent in conventional digitization techniques for these objects, so the integration of the spectral data assists in mining the layers of data stored within the objects. The DCO thus provides a range of information that allows a shift from interpretive virtual heritage applications that focus on the artistic rather than the investigative and inferential, towards the development of interdisciplinary scientific data analyses as part of cultural heritage humanities scholarly studies. In this way, cultural heritage, science and technology are intertwined, advancing the capacity to mine and analyze historic data from multiple viewpoints. Interaction with and interpretation of these additional dimensions allow the description of new relationships between constituent elements, connecting patterns, mining the data for trends, and correlating formerly disparate components. For example, the new identification of pigments in an object that come from a different geographical location can suggest trade routes and the exchange of materials and artistic techniques.
The UNESCO Charter on the Preservation of Digital Heritage has recognized the importance of digital versions of cultural materials, referring to digital heritage as “resources of information and creative expression” being “increasingly produced, distributed, accessed and maintained in digital form.”

The composite of images of the maps and textiles that forms the new DCO is related to, but distinct from, the originals. This digital object is not a surrogate for the original, but provides new knowledge through the integration of scientific analysis of these cultural heritage objects within libraries, museums and archives. The range of data this new digital object contains enhances interaction between a range of professions, allowing multidisciplinary collaboration for the integration of preservation, sociological and cultural information. Digitally generated and accessed data for these maps and textiles balance the opposing goals of libraries and museums, optimizing preservation of the original objects while increasing access to information from them. Hyperspectral imaging allows the DCO to create a new interpretation of the original object, as is apparent from the 3-dimensional reconstruction of the original woodcut of the Waldseemüller 1507 world map. Since these original manufacturing materials no longer exist, the DCO allows the representation of this new scientific knowledge to link with geographic and map curatorial knowledge of the era, and assists a greater understanding of the printing techniques and of where these materials may have originated. This map, with its unique perspective of South America, was the first, printed or manuscript, on which America was referred to as America.

Figure 1. LHS: Section of Printed Sheet of South America, RHS: 3D Rendering of Woodcut

Image capture and processing of these and other objects are important for the interpretation of cultural heritage because they allow layers of data to be analyzed and linked. This offers an archeological examination of the information strata, the materials, the inter-connections between the materials used, discovered text and information, and the relative associations between the components of the artifact. The creation of an image data-cube deconstructs layers of data into discrete components, while conversely integrating them and applying scientific methods to the recognition of areas of interest within the artifact. The explanations generated from these processes expand the associations and extracted knowledge of the original cultural heritage objects.
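As a rough illustration of the data-cube idea, the sketch below stores one image layer per spectral band, reads a single pixel's values across all layers, and compares that spectrum with reference spectra using a spectral angle. This is a minimal sketch under stated assumptions: the band wavelengths, the reference names and all reflectance values are invented for illustration, and the spectral angle is one generic similarity measure, not necessarily the method used in the studies described here.

```python
import math

# Assumed band wavelengths in nanometres (illustrative only).
BANDS_NM = [365, 450, 550, 650, 940]

# Hypothetical reference library: name -> reflectance per band (invented values).
REFERENCE_SPECTRA = {
    "reference pigment A": [0.60, 0.20, 0.10, 0.55, 0.70],
    "reference pigment B": [0.10, 0.15, 0.40, 0.45, 0.50],
    "reference ink C": [0.10, 0.12, 0.15, 0.20, 0.85],
}


def make_cube(bands, rows, cols, fill=0.0):
    """Create a cube indexed as cube[band][row][col]."""
    return [[[fill for _ in range(cols)] for _ in range(rows)] for _ in range(bands)]


def pixel_spectrum(cube, row, col):
    """Read one pixel's value in every band, i.e. slice across the layers."""
    return [layer[row][col] for layer in cube]


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors (0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectrum has zero magnitude")
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding error
    return math.acos(cosine)


def best_match(pixel, references):
    """Return (name, angle) for the reference closest in shape to the pixel."""
    name = min(references, key=lambda n: spectral_angle(pixel, references[n]))
    return name, spectral_angle(pixel, references[name])


# Build a tiny cube and write one made-up pixel spectrum into it.
cube = make_cube(len(BANDS_NM), rows=2, cols=3)
for band_index, value in enumerate([0.11, 0.13, 0.16, 0.22, 0.80]):
    cube[band_index][1][2] = value

spectrum = pixel_spectrum(cube, 1, 2)
match, angle = best_match(spectrum, REFERENCE_SPECTRA)
```

Keeping each band as its own layer mirrors the "deconstruction" into discrete components, while `pixel_spectrum` is the complementary integrating step that reads across layers at one location.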

There is dynamic interaction between re-examination of the original or source materials and the DCO, with raw and processed data that can enhance obscured and specific features. Establishing inter-connections and relationships between the source and the interpretations generated from analysis of the data is an iterative process. The process of interpretation relies upon implicit assumptions, inferences or internal filtering. For both scientific and scholarly researchers, these processes are often based upon prior knowledge and experience. Additional filtering for scientific analyses is introduced by reference databases that match known reference materials with “pixel” samples taken from the spectral images to non-invasively characterize materials. These categorizations are objective in nature and reduce the potential error sometimes introduced by subjective assumptions. However, the essential element that should not be overlooked in this iterative process is the power of strong collaborations between a range of disciplines.

In 1999 on Llullaillaco's summit, an Argentine-Peruvian expedition co-directed by Johan Reinhard and Argentine archaeologist Constanza Ceruti found the perfectly preserved bodies of three Inca children, sacrificed approximately 500 years earlier. These were accompanied by a range of textiles, figurines and other ceremonial sacrificial materials. A close collaboration between the American Museum of Natural History, preservation scientists, engineers from Argentina and the USA, conservators and curators was critical to ensuring the preservation of these unique materials. Assessment of the current condition of the cultural artifacts, such as their strength and chemical degradation, was interwoven with scholarly analysis of their unique patterns of construction, linking the knowledge of both in a rare collaboration on the forensic-type recovery of the history of these materials.
This is the highest Inca burial so far discovered and the world's highest archaeological site. There was intense pressure for exhibition of the materials, so close collaboration between all parties to gain as much scientific and cultural information as possible about the origins of the materials and the mummies was a unique component in ensuring their longevity while on exhibit. This sharing of data aided the control of exhibit conditions in South America and the long-term management of the materials to allow further studies.

The critical component in the generation of knowledge in these studies is recognizing the importance of skilled people and work processes that efficiently add value and meaning. In order to analyze the original source material, high resolution spectral imaging creates the DCO, and data processing and further scientific analyses revolve around the interactions of the people involved: preservation scientist, curator, scholar and technology specialist. This iterative process relies not only upon the effective use of technology to assimilate, process and disseminate the information with standardized processing, metadata and data management, but also upon the quality of the collaborative interaction between professionals from different fields.

While the relationship between the original and the DCO is provided through metadata that maintains the spatial links between the scientific data, the original material, and the new knowledge generated from these linkages, the integration of diverse opinions and perspectives is an integral component of this new knowledge generation. Standardization of file formats and structures across these different fields helps ensure continued access to, and integration of, the information by creating and maintaining effective associations while new knowledge is generated. The above examples illustrate the requirement for this standardization of both scientific and scholarly files, to enable true international collaboration and sharing of resources between countries and disciplines. These protocols can support effective data exchange with conventions that provide a local structure for a scientific data network. This sustains diversity in scientific research and scholarly studies. Such a structural framework for cultural heritage institutions allows both user access to, and functional usability of, information to support research.

Effective visualization of these data connections is essential for further associations, with access to both spatial and temporal data for the maps, textiles and other objects directly linked back to the original source material through the DCO. Visualization tools and interfaces offer potential for open dialogue between multidisciplinary fields and ease of navigation through layers of data, since knowledge generation relies on cohesive interaction and collaboration between science and humanities researchers. The knowledge representation and underlying interpretation system need to be appropriate for the application and for the types of analyses needed to integrate and expand cultural heritage research. XML-based knowledge representation languages and standards include the Resource Description Framework (RDF). A major benefit of RDF is that it facilitates the merging of data even if the underlying schemas differ. A further advantage for this heritage science application is its capacity to support the evolution of schemas over time, allowing both structured and semi-structured data to be combined and shared across different areas within the application. The large volumes of digital data generated require a repository that can cope with a diverse range of object types, as is true for collections of data used for scientific analyses. Data files from these objects include images, spectra, reports, and other extant files, requiring additional metadata mechanisms to accommodate a range of data while retaining access and associations. An RDF model offers greater flexibility and opportunity for integration across various disciplines, since it is designed to accommodate multiple ontologies with a structured approach. The model or ontology being exploited within this structure allows us to coordinate a wide range of data and file types with extensive metadata and multiple data formats.
For example, metadata from the scientific instrument is included so that the measurement can be reproduced, and in addition the data can be imported in multiple formats: as spectra for visual representation and as a standardized, sharable .csv data file.
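The merging property described above can be sketched without any RDF library by treating a graph as a set of (subject, predicate, object) triples. This is only a toy model under stated assumptions: the URIs, the `sci:` and `cur:` vocabulary prefixes and the instrument metadata are hypothetical, and a real system would use a proper RDF toolkit and published ontologies. The second function shows one way a spectrum might be written out as a shareable .csv file.

```python
import csv
import io

# A graph is a set of (subject, predicate, object) triples.
# Every URI and vocabulary term below is hypothetical.
OBJ = "http://example.org/object/map-sheet-1"
SPECTRUM = "http://example.org/spectrum/42"

science_graph = {
    (OBJ, "sci:hasSpectrum", SPECTRUM),
    (SPECTRUM, "sci:instrument", "hypothetical spectrometer"),
    (SPECTRUM, "sci:integrationTimeMs", "100"),
}

curatorial_graph = {
    (OBJ, "cur:title", "Printed sheet of South America"),
    (OBJ, "cur:dateCreated", "1507"),
}

# Merging graphs that use different vocabularies is a plain set union:
# neither side has to adopt the other's schema first.
merged = science_graph | curatorial_graph


def describe(graph, subject):
    """Collect predicate -> object for one subject.

    Simplification: a repeated predicate would overwrite the earlier value.
    """
    return {p: o for s, p, o in graph if s == subject}


def spectrum_to_csv(wavelengths_nm, reflectance):
    """Serialize one spectrum as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["wavelength_nm", "reflectance"])
    writer.writerows(zip(wavelengths_nm, reflectance))
    return buffer.getvalue()


object_record = describe(merged, OBJ)
csv_text = spectrum_to_csv([450, 550], [0.12, 0.15])
```

After the union, `describe(merged, OBJ)` returns scientific and curatorial statements about the same object side by side, which is the kind of cross-disciplinary association the text attributes to an RDF model.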

Figure 2. Visual Representation of Layers of Digital Object Data

Scriptospatial mapping of the textiles and maps involves an accurate coordinate system that links scientific and scholarly analyses to the DCO, and allows inferences to be drawn to generate new knowledge. This approach to viewing the DCO in relation to multiple dimensions applies an essentially archeological methodology toward uncovering and interconnecting information strata of cultural heritage artifacts. Utilizing an object-oriented approach in conjunction with the data layer allows the mapping of spatial and temporal data with increasing complexity. Examining and explaining the physical, spectral and chemical properties of the maps and textiles permits the humanities scholar to link these scientific analyses to the social aspects of how they were created. These links therefore create meaningful scientific outcomes from the content: obscured or faded text can be retrieved; inks and pigments can be characterized and traced to specific geographical locations; analysis of the intensity of handwriting imparts understanding of the author’s original intent; and the provenance and source of paper are gleaned through the capture and analysis of the watermark in the paper.
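One way to picture a shared coordinate system with an object-oriented data layer is sketched below: each annotation, whether scientific or scholarly, is attached to a rectangular region in the DCO's pixel coordinates, so a single location can be queried for everything known about it. The class names, fields and example annotations are hypothetical and chosen only for illustration; they do not describe the authors' system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in DCO pixel coordinates (origin at top left)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        """True if the point lies inside the rectangle (right/bottom edges excluded)."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class Annotation:
    region: Region
    discipline: str  # e.g. "science" or "scholarship"
    note: str


@dataclass
class DigitalCulturalObject:
    name: str
    annotations: list = field(default_factory=list)

    def annotate(self, region, discipline, note):
        """Attach an observation to a region of the object."""
        self.annotations.append(Annotation(region, discipline, note))

    def at(self, px, py):
        """Return every annotation, from any discipline, covering the point."""
        return [a for a in self.annotations if a.region.contains(px, py)]


dco = DigitalCulturalObject("hypothetical map sheet")
dco.annotate(Region(100, 200, 50, 40), "science", "pigment spectrum sampled here")
dco.annotate(Region(90, 190, 200, 100), "scholarship", "place-name cartouche")

hits = dco.at(120, 210)
```

Because both annotations share one coordinate system, the query at (120, 210) returns the scientific and the scholarly observation together, which is where inferences linking the two can begin.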

A continued focus on collaboration between people, data and processes is a major factor in promoting access to and integration of scientific and humanities research, emphasizing the importance of linking the original artifact with digital tools and techniques for visualizing and disseminating new knowledge in the arts and humanities. Concentrating on generating new knowledge from content derived from these maps, textiles and other objects enhances the importance of the DCO. This allows improved access to, and interpretation and preservation of, fragile items of significant cultural heritage. However, the extracted information is only as important as the strength of the collaborative partnerships set up to create a constant iterative loop for access to and interpretation of new scholarly and scientific information. This requires a strong and committed association between previously disparate fields to incorporate and share the generation of new knowledge by mining additional data and forging new advances in humanities research. Together, humanities scholars, scientists, researchers, and technology and data management specialists from these related but previously disparate disciplines form an open yet interconnected digital exchange of humanities research.

References:
Cameron, F. and Kenderdine, S. (eds) 2007 Theorizing Digital Cultural Heritage: A Critical Discourse, Massachusetts Institute of Technology Press, Massachusetts, USA

Emery, D., France, F.G. and Toth, M.B. 2009 “Management of Spectral Imaging Archives for Scientific Preservation Studies,” Archiving 2009, Society for Imaging Science and Technology, Arlington, VA, May 4-7, 137-141

Esteva, M., Trelogan, J., Rabinowitz, A., Walling, D. and Pipkin, S. 2010 “From the Site to Long-term Preservation: A Reflexive System to Manage and Archive Digital Archeological Data,” Archiving 2010, Society for Imaging Science and Technology, The Hague, June 1-4

France, F.G. 2010 “Spectral Imaging and Non-Invasive Characterization of Manuscripts,” Eikonopoiia: Symposium on Digital Imaging of Ancient Textual Heritage, Helsinki, Finland, October 28-29, 2010, 51-64

France, F.G., Christens-Barry, W., Toth, M.B. and Boydston, K. 2010 “Advanced Image Analysis for the Preservation of Cultural Heritage,” 22nd Annual IS&T/SPIE Symposium on Electronic Imaging, San Jose, January 2010

France, F.G., Emery, D. and Toth, M.B. 2010 “The Convergence of Information Technology, Data and Management in a Library Imaging Program,” Library Quarterly special edition: Digital Convergence: Libraries, Archives, and Museums in the Information Age,

France, F.G., Roussakis, V., Lissa, P., Xamena, M., Santillán, P., Capero de Larrán, M., Doña, G. and Ammirati, G. 2005 “Textile Treasures of Llullaillaco,” North American Textile Conservation Conference, Mexico City, November 2005, 25-30

Museum of High Altitude Archaeology (MAAM), Salta

Schreibman, S. and Hanlon, A.M. 2010 “Determining Value for Digital Humanities Tools: Report on a Survey of Tool Developers,” Digital Humanities Quarterly, 4 (2)

Svensson, P. 2010 “The Landscape of Digital Humanities,” Digital Humanities Quarterly, 4 (1)

UNESCO Charter on the Preservation of Digital Heritage

Conference Info

ADHO - 2011
"Big Tent Digital Humanities"

Hosted at Stanford University

Stanford, California, United States

June 19, 2011 - June 22, 2011

Conference website: https://dh2011.stanford.edu/

Series: ADHO (6)

Organizers: ADHO
