A Method to Automatically Georeference and Estimate the Coastline Precision of Digital Historical Maps

paper, specified "short paper"
Authorship
  1. Giovanni Maria Pala

    Oxford University

Work text


Maps were a key technology in modern navigation, yet until recently the large-scale quantitative study of historical maps’ content has been limited by constraints on access to materials and by computational limitations. Consequently, existing historical studies dealing with cartography have relied on representative examples and curated comparisons, without engaging in formal large-scale investigations (Jupp, 2017; Carlton, 2015; Petto, 2007; Akerman, 2006). The recent growth of digital technologies and materials encourages different approaches (Chiang et al., 2014; Kovarsky, 2011), and has produced a range of methods for extracting maps’ textual content (see Machines Reading Maps) and for achieving semantic segmentation for feature recognition (Petitpierre et al., 2022). In line with these recent applications, this contribution presents a new method to register changes in historical maritime cartography by studying the representation of coastlines. To do so, it introduces a new automatic approach to georeferencing historical maps.
The choice of coastlines avoids issues of tampering: contrary to the mapping of land boundaries, the coastline was rarely subject to politically driven alterations. It is a very different kind of border. The precision of a map is understood here as the closeness of the land shapes it represents to current 21st century representations. Because the geographic coordinate system was, in its general construction, already defined by the late 17th century, it is possible to compare changes of positioning within it across time. Importantly, erosion at high scales is assumed to be limited.

Rigorously studying the evolution of the quality of coastline mapmaking would contribute to numerous histories related to maritime trade, printing, and the knowledge economy (Kelly and Ó Gráda, 2019; Pascali, 2017; Dowey, 2017). This would also increase our understanding of political connections, especially between distant regions such as Europe and East Asia (Hostetler, 2009).
The measurement of the evolution of coastline and landmass positioning in historical maps requires the solution of three technical problems: georeferencing, segmentation, and assessing the map’s precision.
Georeferencing is the process of associating an object (e.g., a map scan) with a system of geographic coordinates. Currently this process is almost invariably done by hand, with the user inputting specific Ground Control Points (GCPs) on the digital image, each associated with the equivalent point of known coordinates on the globe. Existing algorithms can then, with increasing accuracy as the number of points increases, create a geo-referenced raster that is readable by GIS software (Jenny and Hurni, 2011; Rumsey and Williams, 2002). This process can be very time consuming. The algorithm proposed here addresses this issue, for the first time, by means of a multi-step approach. As a starting point, it uses state-of-the-art Optical Character Recognition (OCR) software to obtain the longitude and latitude coordinates reported on the boundary of the map. The boundary region is identified with an approach similar to the one used by Ares Oliveira et al. (2018). A Convolutional Neural Network (CNN), modelled after a U-Net encoder-decoder with skip-connections, is then applied to a binary pre-processed version of the map to extract its projection grid. The extracted grid is decomposed into segments via a Probabilistic Hough Transform (Galambos et al., 1999), and the grid’s inclination space is obtained by interpolating the inclinations of the extracted segments. Combining these results, the GCPs are automatically placed at the intersections of the latitude-longitude lines produced from the interpolated grid values. The GCPs’ geocoordinate values are those read by the OCR.
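The last step of the georeferencing stage, placing GCPs at grid intersections and using them to map pixels to coordinates, can be illustrated with a minimal Python sketch. It is not the implementation used in this study. It assumes that each meridian and parallel has already been reduced to a single straight segment in pixel space, that its longitude or latitude label is already known, and that a simple affine transform is an acceptable stand-in for the map's projection (historical projections are generally not affine). All function names and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]
GCP = Tuple[Point, Tuple[float, float]]  # (pixel x/y, (lon, lat))


def line_intersection(s1: Segment, s2: Segment) -> Point:
    """Intersect the infinite lines through two pixel-space segments."""
    (x1, y1), (x2, y2) = s1
    (x3, y3), (x4, y4) = s2
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-12:
        raise ValueError("lines are parallel")
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)


def build_gcps(meridians: Dict[float, Segment],
               parallels: Dict[float, Segment]) -> List[GCP]:
    """Place one GCP at every meridian/parallel crossing."""
    gcps = []
    for lon, m_seg in meridians.items():
        for lat, p_seg in parallels.items():
            gcps.append((line_intersection(m_seg, p_seg), (lon, lat)))
    return gcps


def solve3(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    aug = [list(m[i]) + [v[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("GCPs are degenerate (e.g. collinear)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, 4):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][3] / aug[i][i] for i in range(3)]


def fit_affine(gcps: List[GCP]) -> Tuple[List[float], List[float]]:
    """Least-squares fit of lon = a*x + b*y + c and lat = d*x + e*y + f."""
    ata = [[0.0] * 3 for _ in range(3)]
    at_lon = [0.0] * 3
    at_lat = [0.0] * 3
    for (x, y), (lon, lat) in gcps:
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            at_lon[i] += row[i] * lon
            at_lat[i] += row[i] * lat
    return solve3(ata, at_lon), solve3(ata, at_lat)


def pixel_to_geo(x: float, y: float,
                 lon_coef: List[float], lat_coef: List[float]) -> Point:
    """Apply the fitted affine transform to one pixel."""
    lon = lon_coef[0] * x + lon_coef[1] * y + lon_coef[2]
    lat = lat_coef[0] * x + lat_coef[1] * y + lat_coef[2]
    return (lon, lat)


# Hypothetical grid: two meridians and two parallels with OCR-read labels.
meridians = {10.0: ((100.0, 0.0), (100.0, 500.0)),
             20.0: ((300.0, 0.0), (300.0, 500.0))}
parallels = {50.0: ((0.0, 50.0), (400.0, 50.0)),
             40.0: ((0.0, 250.0), (400.0, 250.0))}

gcps = build_gcps(meridians, parallels)
lon_coef, lat_coef = fit_affine(gcps)
print(pixel_to_geo(200.0, 150.0, lon_coef, lat_coef))
```

In practice the fitted transform would more likely be a higher-order polynomial or a projection-aware warp computed by GIS software from the same GCP list. The affine fit is used here only because it keeps the sketch self-contained.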
Once the image is geo-referenced, every image pixel has a latitude-longitude coordinate assigned to it. The pixels, however, still need to be categorised (for example, as “land” or “sea”). A semantic segmentation (i.e., per-pixel classification) CNN has been trained on synthetic data and original map scans to complete the task. The resulting classified and geo-referenced image can then be used to single out the coastlines.
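How a coastline might be singled out from a classified image can be sketched as follows. This is an illustrative assumption rather than the procedure used in the study: the segmentation output is taken to be a binary grid (1 for land, 0 for sea), and a coastline pixel is defined as a land pixel with at least one sea pixel among its four direct neighbours. The names `LAND`, `SEA` and `coastline_pixels` are hypothetical.

```python
from typing import List, Set, Tuple

LAND, SEA = 1, 0  # assumed class labels of the segmentation output


def coastline_pixels(mask: List[List[int]]) -> Set[Tuple[int, int]]:
    """Return (row, col) of land pixels that touch sea in a 4-neighbourhood."""
    rows, cols = len(mask), len(mask[0])
    coast = set()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != LAND:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc] == SEA:
                    coast.add((r, c))
                    break
    return coast


# A toy 5x5 "island": a 3x3 block of land surrounded by sea.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
coast = coastline_pixels(mask)
print(sorted(coast))
```

Because every pixel already carries a latitude-longitude coordinate after georeferencing, the resulting set of pixel indices can be converted directly into a set of coastline coordinates.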
The final step of this process is the measurement of the distance between the segmented map and its 21st century equivalent. This is the measure of the map’s precision. As a rule, this study employs a simple error rate built on spatial blocks whose granularity is fixed at some arc of a degree, but more advanced approaches are considered (e.g., Wasserstein or Sinkhorn distances, see: Villani, 2009; Cuturi, 2013). The error rates obtained for each map, paired with the metadata available for the cartographic objects, create a rich dataset that can be used to model and test hypotheses with other correlates, and to improve our capacity to ask questions about the evolution of geographic information, precision, and accuracy.
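One possible reading of the block-based error rate is sketched below in Python. The text does not give the exact definition, so the sketch makes its own assumptions: land is supplied as (longitude, latitude) points for both the historical and the modern map, a block counts as land if at least one land point falls inside it, and the error rate is the share of blocks in the study extent on which the two maps disagree. The names `to_blocks`, `block_error_rate` and `block_deg` are hypothetical.

```python
import math
from typing import Iterable, Set, Tuple

LonLat = Tuple[float, float]
Extent = Tuple[float, float, float, float]  # lon_min, lat_min, lon_max, lat_max


def to_blocks(land_points: Iterable[LonLat], extent: Extent,
              block_deg: float) -> Set[Tuple[int, int]]:
    """Bin land points inside the extent into blocks of block_deg degrees."""
    lon_min, lat_min, lon_max, lat_max = extent
    blocks = set()
    for lon, lat in land_points:
        if lon_min <= lon < lon_max and lat_min <= lat < lat_max:
            col = math.floor((lon - lon_min) / block_deg)
            row = math.floor((lat - lat_min) / block_deg)
            blocks.add((col, row))
    return blocks


def block_error_rate(historical_land: Iterable[LonLat],
                     modern_land: Iterable[LonLat],
                     extent: Extent, block_deg: float) -> float:
    """Share of blocks classified differently (land vs sea) by the two maps."""
    lon_min, lat_min, lon_max, lat_max = extent
    n_cols = math.ceil((lon_max - lon_min) / block_deg)
    n_rows = math.ceil((lat_max - lat_min) / block_deg)
    hist = to_blocks(historical_land, extent, block_deg)
    modern = to_blocks(modern_land, extent, block_deg)
    # The symmetric difference holds the blocks that are land in only one map.
    return len(hist ^ modern) / (n_cols * n_rows)


# Toy example: a 4x4 degree extent with 1-degree blocks (16 blocks in total).
extent = (0.0, 0.0, 4.0, 4.0)
modern_land = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5)]
historical_land = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)]
rate = block_error_rate(historical_land, modern_land, extent, block_deg=1.0)
print(rate)
```

A rate of this kind treats every disagreeing block equally, regardless of how far the misplaced land lies from its true position. Optimal-transport distances such as the Wasserstein or Sinkhorn distances take that displacement into account.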

Bibliography

Akerman, J. R. (ed.) (2006). Cartographies of Travel and Navigation. University of Chicago Press.

Ares Oliveira, S., Seguin, B., and Kaplan, F. (2018). dhSegment: A generic deep-learning approach for document segmentation. Frontiers in Handwriting Recognition (ICFHR), 2018 16th International Conference on, pp. 7-12.

Carlton, G. (2015). Worldly Consumers: The Demand for Maps in Renaissance Italy. University of Chicago Press.

Chiang, Y., Leyk, S., and Knoblock, C. A. (2014). A survey of digital map processing techniques. ACM Computing Surveys (CSUR), 47(1): 1-44.

Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. Advances in Neural Information Processing Systems, 26: 2292-2300.

Dowey, J. (2017). Mind over matter: access to knowledge and the British Industrial Revolution. Ph.D. Dissertation, The London School of Economics and Political Science.

Galambos, C., Matas, J., and Kittler, J. (1999). Progressive probabilistic Hough transform for line detection. Proceedings: 1999 IEEE computer society conference on computer vision and pattern recognition, 1: 554-560.

Hostetler, L. (2009). Contending cartographic claims? The Qing empire in Manchu, Chinese, and European maps. In Akerman, J. R. (ed.), The Imperial Map: Cartography and the Mastery of Empire. University of Chicago Press, pp. 93-132.

Jenny, B., and Hurni, L. (2011). Studying cartographic heritage: Analysis and visualization of geometric distortions. Computers and Graphics, 35(2): 402-411.

Jupp, D. L. (2017). Projection, Scale, and Accuracy in the 1721 Kangxi Maps. Cartographica: The International Journal for Geographic Information and Geovisualization, 52(3): 215-232.

Kelly, M., and Ó Gráda, C. (2019). Speed under sail during the early industrial revolution (c. 1750–1830). The Economic History Review, 72(2): 459-480.

Kovarsky, J. (2011). Searching for early maps: use of online library catalogs. ACMLA Bulletin, 138.

Machines Reading Maps, https://www.turing.ac.uk/research/research-projects/machines-reading-maps (accessed 20 April 2022).

Pascali, L. (2017). The wind of change: Maritime technology, trade, and economic development. American Economic Review, 107(9): 2821-2854.

Petitpierre, R., Kaplan, F., and di Lenardo, I. (2022). Generic semantic segmentation of historical maps. CEUR Workshop Proceedings, http://ceur-ws.org, ISSN 1613-0073.

Petto, C. M. (2007). When France Was King of Cartography: The Patronage and Production of Maps in Early Modern France. Lexington Books.

Rumsey, D., and Williams, M. (2002). Historical maps in GIS.

Villani, C. (2009). Optimal Transport: old and new. Berlin: Springer, pp. 93-111.


Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO