Construction of the corpus of senmyō: one of the oldest materials of Japanese language

poster / demo / art installation
  1. Toshinobu OGISO

    National Institute for Japanese Language and Linguistics (NINJAL)

  2. Neisin GO

    National Institute for Japanese Language and Linguistics (NINJAL)

  3. Yukie IKEDA

    Chuo University

  4. Tetsuya SUNAGA

    Showa Women's University

Work text

We constructed a corpus of senmyō (imperial edicts) written in the 8th century, for use in linguistic research. Senmyō are among the oldest surviving materials of the Japanese language. They are written in Old Japanese in a special notation called "senmyō-gaki", which uses only Chinese characters. To encode this notation, we reproduced it with our own extended tag set based on TEI. We also added word information to the full text of the corpus using MeCab and UniDic. Because some words in senmyō can be read in two ways, Chinese style and Japanese style, we devised a way to assign both readings to the same location when adding word information. The corpus is published through an online search application called "Chunagon".
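The idea of attaching two readings to the same location can be illustrated with a minimal Python sketch. This is not the authors' encoding: the abstract does not describe their data model or their TEI extensions, so the class names, the style labels ("japanese", "chinese"), and the sample string below are all assumptions made for illustration. The sketch only shows one character span carrying both alternative readings, rather than two competing tokens.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Reading:
    style: str  # hypothetical label: "japanese" or "chinese"
    kana: str   # reading written in katakana


@dataclass
class Token:
    start: int  # character offset where the span begins (inclusive)
    end: int    # character offset where the span ends (exclusive)
    surface: str
    readings: List[Reading] = field(default_factory=list)


def annotate(text: str, start: int, end: int,
             readings: List[Tuple[str, str]]) -> Token:
    """Attach every alternative reading to one character span of text."""
    return Token(start, end, text[start:end],
                 [Reading(style, kana) for style, kana in readings])


def readings_by_style(token: Token) -> dict:
    """Return the readings of a token keyed by their style label."""
    return {r.style: r.kana for r in token.readings}


# Illustrative string only; it is not quoted from the published corpus.
text = "天皇大命"
token = annotate(text, 0, 2,
                 [("japanese", "スメラミコト"), ("chinese", "テンノウ")])
```

Because both readings share one `Token`, a search by either reading would resolve to the same offsets in the text, which is the property the abstract describes.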


Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. An online conference was held in its place. Data for this conference were initially prepared and cleaned by May Ning.

Series: ADHO (15)

Organizers: ADHO