Center for Integrated Area Studies - Kyoto University
To understand an academic humanities paper, a reader must understand each element it mentions (a person, a place, another paper, and so on). Doing so requires knowledge from outside the paper; in other words, linked knowledge is important. Linked Open Data (LOD) (Heath and Bizer, 2011), which emerged from Semantic Web research, provides a way to represent and share such linked knowledge.
We prepared the texts of the "Japanese Journal of Southeast Asian Studies" as a core dataset, and we attempt to link words and documents in that dataset to external resources such as DBpedia (also available in Japanese) and the National Diet Library in Japan.
Although LOD has been growing rapidly (Schmachtenberg et al., 2014), existing LOD rarely covers the specialized knowledge found in individual academic papers. Publishing new LOD is therefore also an important step toward representing the knowledge networks in academic humanities papers.
The Center for Integrated Area Studies, Kyoto University (CIAS) has developed two information tools, MyDatabase (MyDB) and the Resource Sharing System (RSS), to address these difficulties. The main component of MyDB is a database builder that allows humanities researchers to construct and revise databases without expert knowledge. MyDB stores metadata and accepts any metadata vocabulary, including nonstandard ones, so researchers can use a vocabulary suited to their own purposes. On the other hand, this variety of vocabularies makes integration difficult. RSS was developed to integrate heterogeneous databases on the Internet and to give users a uniform interface for searching them seamlessly in one operation. MyDB and RSS have thus helped accelerate open data in the humanities. However, two problems remain, especially for RSS: limited coverage of databases and the initial cost of integration. First, Kyoto University has released KULINE (OPAC), KU-RENAI (repository), KURRA (archive), Open Course Ware, and various databases developed by individual research institutes, but RSS does not integrate any of them. Second, integrating a new database into RSS is time consuming, and links cannot be traced automatically. For now, therefore, RSS is not an appropriate tool for discovering hints or creating new knowledge.
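The integration cost described above can be illustrated with a minimal Python sketch. It is not the RSS implementation; the `Source` class, the field names, and the sample records are all hypothetical. It assumes that each database keeps its own metadata vocabulary and that someone must write a field mapping by hand before a uniform search becomes possible.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """One database with its own (possibly nonstandard) metadata vocabulary."""
    name: str
    records: list[dict]
    field_map: dict[str, str]  # local field name -> common field name


def normalize(source: Source, record: dict) -> dict:
    """Rename a record's local fields to the common vocabulary."""
    return {
        common: record[local]
        for local, common in source.field_map.items()
        if local in record
    }


def search(sources: list[Source], field: str, keyword: str) -> list[tuple[str, dict]]:
    """Search every source through one interface, using the common field names."""
    hits = []
    for source in sources:
        for record in source.records:
            normalized = normalize(source, record)
            if keyword.lower() in str(normalized.get(field, "")).lower():
                hits.append((source.name, normalized))
    return hits


# Hypothetical sample data: two databases describing titles in different vocabularies.
db_a = Source(
    name="db_a",
    records=[{"title": "Rice farming in Java"}],
    field_map={"title": "title"},
)
db_b = Source(
    name="db_b",
    records=[{"daimei": "Trade routes of the Java Sea"}],
    field_map={"daimei": "title"},  # this mapping has to be written by hand
)

results = search([db_a, db_b], "title", "java")
```

Every new database needs its own hand-written `field_map`, which is one way to picture the initial cost of integration; nothing in this structure lets a program follow links from one record to another.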
To overcome these drawbacks, a new project has been launched to develop an information platform for open humanities data. The platform comprises three layers. The first is the "Open Data Layer," which accumulates heterogeneous metadata and uses RDF to describe data of different structures. The second is the "Data Link Layer," which uses ontology techniques such as RDFS and OWL to link ambiguous (uncontrolled) vocabularies and so produce "humanities big data." The third is the "Application Layer." Because big data in the humanities is too large and complicated to retrieve, categorize, and analyze by hand, this layer provides utilities for processing it. The platform will also provide APIs to support mashup applications. We expect the platform to reconstruct a knowledge base from heterogeneous databases, which can then be used to build meaningful chunks from scattered data.
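The three layers can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the platform's code: triples are modeled as tuples rather than with an RDF library, the `ex:` and `my:` identifiers and the sample values are hypothetical, and the only inference shown is a one-step, symmetric reading of `owl:equivalentProperty`.

```python
OWL_EQUIV = "owl:equivalentProperty"


def to_triples(subject: str, record: dict) -> set[tuple[str, str, str]]:
    """Open Data Layer: describe a record of any structure as (s, p, o) triples."""
    return {(subject, predicate, value) for predicate, value in record.items()}


def infer(triples: set[tuple[str, str, str]]) -> set[tuple[str, str, str]]:
    """Data Link Layer: copy each statement onto properties declared equivalent.

    Only direct, symmetric equivalence is handled; transitive chains are not.
    """
    equivalents: dict[str, set[str]] = {}
    for s, p, o in triples:
        if p == OWL_EQUIV:
            equivalents.setdefault(s, set()).add(o)
            equivalents.setdefault(o, set()).add(s)
    inferred = set(triples)
    for s, p, o in triples:
        for q in equivalents.get(p, ()):
            inferred.add((s, q, o))
    return inferred


def query(triples: set[tuple[str, str, str]], predicate: str, value: str) -> list[str]:
    """Application Layer: a small utility that finds subjects by property value."""
    return sorted(s for s, p, o in triples if p == predicate and o == value)


# Hypothetical records that use different vocabularies for the same idea.
graph = to_triples("ex:paper1", {"dc:creator": "Tanaka"})
graph |= to_triples("ex:record7", {"my:chosha": "Tanaka"})

# The link between vocabularies is itself data, not hand-written code.
graph.add(("my:chosha", OWL_EQUIV, "dc:creator"))

linked = infer(graph)
creators = query(linked, "dc:creator", "Tanaka")
```

Because the vocabulary mapping is stored as a triple, adding a new database in this sketch means adding statements rather than writing a new adapter, which is the design difference the layered platform is aiming for.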
Thus, humanities Linked Open Data has been developed, and the "Japanese Journal of Southeast Asian Studies" dataset can be linked to that LOD. This linked knowledge can then help readers from other domains.
Bibliography
Heath, T. and Bizer, C. (2011). Linked Data: Evolving the Web into a Global Data Space. Synthesis Lectures on the Semantic Web. Morgan & Claypool Publishers.
Schmachtenberg, M. et al. (2014). Linking Open Data cloud diagram. http://lod-cloud.net/
Hosted at McGill University, Université de Montréal
Montréal, Canada
Aug. 8, 2017 - Aug. 11, 2017
Conference website: https://dh2017.adho.org/
References: http://web.archive.org/web/20170802132745/https://www.conftool.pro/dh2017/sessions.php
Series: ADHO (12)
Organizers: ADHO