Humboldt-Universität zu Berlin (Humboldt University)
SLUB-Dresden
Dingler-Online – The Digitized "Polytechnisches Journal" on Goobi Digitization Suite
Hug, Marius
marius.hug@culture.hu-berlin.de
Humboldt-Universität zu Berlin
Kassung, Christian
CKassung@culture.hu-berlin.de
Humboldt-Universität zu Berlin
Meyer, Sebastian
sebastian.meyer@slub-dresden.de
SLUB-Dresden
This project, based at Humboldt-Universität
zu Berlin, sets out to digitize Dingler’s
"Polytechnical Journal" ("Polytechnisches
Journal"), 1820-1931. In addition to digitizing
the journal’s page images, we encode the OCRed
text according to the Text Encoding Initiative
Guidelines (TEI P5). Our online edition
of Dingler’s journal will be freely available
on a state-of-the-art system called Goobi
Digitization Suite, which is a new production
and presentation solution funded by the DFG
(German Research Foundation).
1. Dingler's Journal
In 1820 the German chemist and industrialist
J. G. Dingler started publishing the
"Polytechnisches Journal". This journal was to
include a personal but representative selection
of a broad variety of articles. Originally most of
these articles had been published in magazines
all over Europe and, though most of them
originated from the UK, there were also
specimens from France, Italy, and Russia.
The "Polytechnisches Journal" was published
over a period of 111 years. It thus became an
extremely important source for the history of
19th-century knowledge, as it documents a period
that included industrialization, progress in
transport and communication, and the
differentiation of various technologies. For
instance, the journal covers the discovery of
electromagnetism by Hans Christian Oersted (1820)
and the theory of relativity by Albert Einstein
(1905/1915). It contains articles on steam
engines and locomotives, as well as bicycles and
automobiles.
Synchronic and diachronic transfer of
knowledge and technique
Articles published in Dingler’s journal give
us an example of the emergence of culture
in a technical context. In the process of
industrialization, new technical achievements
profoundly affected everyday life. It is in the
interplay of science and knowledge that the
journal develops its epistemic significance. The
"Polytechnisches Journal" is unique and highly
relevant for very different research fields that
focus on cultural history as it emerged
from Europe’s technical transformations. It is
significant not only for people engaged in the
history of science but for anyone interested in
the cultural heritage of Europe.
2. Dingler-Online – The encoding
Linking
Since Dingler annotated his editorial work very
thoroughly, we find all the necessary metadata
for each article within the journal itself. He
even went one step further: Dingler
cross-referenced other source material on the
issues treated in each article. "Linking" is
therefore one of the main tasks in enriching the
text. By doing this consistently from the very
beginning of our project, we are aiming at a
network of digitized knowledge for that period
covering the whole of Europe. Any digitized
19th-century magazine with a technical
background is a potential link target.
Indexing
As with any magazine published in print,
researching the contents of the journal is a
time-consuming task. Right from the beginning
Dingler knew he would have to assist those
accessing his material, so he compiled an index
once a year. In 1843 the first so-called
"Real-Index" was published, a third-hand work
which covered the first 78 volumes of the
journal. All in all there are four of these
"Real-Indexes".
Based on these two different kinds of indexes
as well as our index-related TEI encoding, we
will be able to provide a fine-grained,
dynamically generated index. It will consist of
a register of persons (differentiated according
to their role, e.g. author, translator or
originator), objects, and, among others, the
journals that were the source of the published
articles.
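
A register of persons grouped by role can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample document, the placeholder names, and the use of the `role` attribute on `persName` are hypothetical and do not reproduce the project's actual TEI files or encoding decisions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# TEI P5 namespace; the prefix "tei" is only a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample markup with placeholder names (an assumption,
# not taken from the Dingler-Online data).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="article">
      <persName role="author">A. Example</persName>
      <persName role="translator">B. Sample</persName>
    </div>
    <div type="article">
      <persName role="author">A. Example</persName>
    </div>
  </body></text>
</TEI>"""


def build_person_register(tei_xml: str) -> Dict[str, List[str]]:
    """Group the text of all persName elements by their role attribute."""
    root = ET.fromstring(tei_xml)
    register = defaultdict(set)
    for pers in root.iterfind(".//tei:persName", TEI_NS):
        role = pers.get("role", "unspecified")
        name = "".join(pers.itertext()).strip()
        register[role].add(name)  # a set removes repeated mentions
    return {role: sorted(names) for role, names in sorted(register.items())}


register = build_person_register(SAMPLE)
```

Because the register is derived from the encoded text on demand, it stays in step with the markup, which is the sense in which such an index is "dynamically generated".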
The articles – our key component
Dingler’s journal comes in 360 volumes, each
including 4 to 6 issues. The key components
of our edition are the 50 to 170 articles in
each volume. Even at this very basic level, we
distinguish between two types of articles, since
there are in fact a couple of articles published for
the first time in the "Polytechnisches Journal"
in addition to the reprinted articles. We extract
all these articles from the volume and provide
access to downloadable PDF versions as well as
bibliographical metadata in several established
formats. In the long run we aim to provide
access to PDFs generated dynamically via XSL-FO.
Text and images
The editors of the journal worked strictly within
the media conditions of their time. In 1820
Dingler started out using Gothic type for the
text and copper engravings for the illustrations.
In later issues (starting in the 1870s) we find
Antiqua type and floating images integrated
within the text.
Our aim is a re-interpretation of the relationship
between text and images. Dingler completed
each volume with technical drawings and
visualizations on additional plates. Hence, up
to 40 figures on a plate are encoded according
to their specific coordinates using the Image
Markup Tool developed by the University of
Victoria. Via hyperlink we are able to provide
access to a zoomable view of each figure.
This approach has two immediate advantages:
Firstly, it enables parallel reading of text and
image and therefore adopts the original layout,
in which plates were attached to the back of
each volume as foldouts. Secondly, we can
provide a new kind of readability. For economic
reasons the plates were densely packed with
images. Highlighting individual figures on
mouseover will be much more convenient, allowing
readers to inspect them in more detail and thus
enabling a closer integration of text and images.
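
How coordinate-encoded figures support mouseover highlighting can be illustrated with a minimal Python sketch. The class and function names, the figure identifiers, and the pixel values are hypothetical. The field names ulx, uly, lrx and lry are borrowed from the corner attributes of the TEI `zone` element, and we assume, rather than know, that the project's figure regions are rectangles of this kind.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FigureZone:
    """Rectangular region of one figure on a plate, in pixel coordinates."""
    figure_id: str
    ulx: int  # upper-left x
    uly: int  # upper-left y
    lrx: int  # lower-right x
    lry: int  # lower-right y

    def contains(self, x: int, y: int) -> bool:
        return self.ulx <= x <= self.lrx and self.uly <= y <= self.lry


def figure_at(zones: List[FigureZone], x: int, y: int) -> Optional[FigureZone]:
    """Return the first figure whose rectangle contains the pointer position."""
    for zone in zones:
        if zone.contains(x, y):
            return zone
    return None


# Hypothetical plate with two figures (values invented for illustration).
PLATE = [
    FigureZone("fig1", 0, 0, 400, 300),
    FigureZone("fig2", 420, 0, 900, 300),
]

hit = figure_at(PLATE, 500, 120)
```

A viewer could call a lookup of this kind on every pointer movement and highlight the returned figure, or link it to the zoomable view.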
Since right from the beginning Dingler insisted
on very detailed and thus expensive lithographs
rather than wood engravings, we have made it
our task not to veer from the standard set by
Dingler at this point.
3. Dingler-Online II – The appearance
Not least because we face the challenging
task of digitizing Gothic type in more than
220 volumes of the "Polytechnisches Journal",
we find ourselves in good company with two
rather impressive German digitization projects:
Grimm’s "Deutsches Wörterbuch" and Krünitz’s
"Oeconomische Encyclopädie". Both made use
of double-keying and therefore sent their books
or images to Asia, where the text was keyed in
twice. Afterwards, so-called TUSTEP routines
were employed to compare the two resulting
text versions.
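
The principle behind comparing two independently keyed versions can be shown with a short Python sketch. It uses the standard library's difflib in place of TUSTEP, so it is not the routine used by the projects mentioned above, and the two sample strings are invented for illustration.

```python
import difflib
from typing import List, Tuple


def keying_conflicts(version_a: str, version_b: str) -> List[Tuple[str, str]]:
    """List the word spans on which two keyed versions disagree.

    Each result pair holds the reading of version A and that of version B.
    """
    words_a = version_a.split()
    words_b = version_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    conflicts = []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag != "equal":
            conflicts.append((" ".join(words_a[a1:a2]), " ".join(words_b[b1:b2])))
    return conflicts


# Invented sample: the second typist misread "J" as "I" in Gothic type.
KEYED_A = "Die Dampfmaschine wurde im Jahre 1820 verbessert"
KEYED_B = "Die Dampfmaschine wurde im Iahre 1820 verbessert"

conflicts = keying_conflicts(KEYED_A, KEYED_B)
```

Passages on which both versions agree are accepted as they stand, so only the reported conflicts need to be checked against the page image.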
In the following we will take a closer look at
several aspects that take our project one step
further than the aforementioned approaches.
3.1. Goobi Digitization Suite
With the so-called Goobi Digitization Suite, a
software solution funded by the DFG and developed
by the SLUB-Dresden (Sächsische Landesbibliothek –
Staats- und Universitätsbibliothek) and the
SUB-Göttingen (Niedersächsische Staats- und
Universitätsbibliothek), we will be using a
completely new technology.
The Goobi Suite consists of two parts:
Goobi.Production and Goobi.Presentation.
Goobi.Production is a Java-based web tool for
managing a digitization workflow. Among other
features, it comes with a very flexible metadata
editor, a user-based permission system, and
visually enhanced statistics.
Since the Goobi Suite was not yet available at
the beginning of our project, we engaged an
experienced service provider for text digitization
and (semi-)automatic encoding: Editura GmbH.
Their OCR produces very good results even for
Gothic type, provided that the images are
scanned at 600 dpi.
Editura encodes the OCRed text and already
enriches it according to the TEI P5 guidelines.
This step includes tagging the structure and
special attributes of the text to an encoding level
between 3 and 4. The text digitization thus
already goes beyond basic structural encoding,
and we can concentrate on a more scholarly
encoding approach that goes further than other
projects of comparable extent.
Apart from XML files in TEI encoding and
images encoded using the Image Markup Tool,
our service provider delivers elaborate METS
files, which are necessary for presenting
the edition in the so-called DFG-viewer
as well as in Goobi.Presentation, which we
use as part two of the Goobi Digitization
Suite. This is a full-featured web presentation
layer for digital material, based on the
TYPO3 CMS framework, which can host a
regular website, too. Hence, Goobi.Presentation
integrates perfectly into any page inside the
CMS.
The whole software suite is open source and
freely available to everyone. As can be seen in
our project, Goobi.Presentation can be used
independently of Goobi.Production. This
modularity of Goobi is ensured by the consistent
use of the international standards METS, MODS
and TEI.
3.2. Customizing Goobi
The more data there is to present, the
less important any unstructured information
becomes. This is why encoding and directed
access to data, via searching or browsing,
become more and more important.
Goobi.Presentation makes it possible to
customize the search engine. Naturally, one will
be able to search for any term anywhere in the
text. In addition, it is possible to narrow the
search results by different criteria. For
instance, someone looking for all articles on
patent applications for steam engines published
in the magazine in the 1840s will just have
to search for "steam engine", then restrict the
search to the "text type" patent application and
the "time" 1840s.
4. Conclusion
Dingler-Online is an enriched digitization that is
neither simply image-based nor mass-produced.
It is a user-friendly platform that invites
broad use, not restricted to historians of
technology or, for that matter, researchers, but
open to the interested public in general.
References
Das Deutsche Wörterbuch von Jacob und Wilhelm Grimm auf CD-ROM und im Internet. http://germazope.uni-trier.de/Projects/DWB/.
Dingler-Online | Das digitalisierte Polytechnische Journal. http://www.polytechnischesjournal.de/.
Editura GmbH. http://www.editura.de/.
Fischer, F. (2007). 'Dinglers Polytechnisches Journal bis zum Tode seines Begründers (1820-1855)'. Archiv für Geschichte des Buchwesens. 15: 1027-1142.
Goobi – DigitalLibraryModules. http://www.goobi.org.
Oeconomische Encyclopädie online. http://www.kruenitz1.uni-trier.de/.
Hosted at King's College London
London, England, United Kingdom
July 7, 2010 - July 10, 2010
Conference website: http://dh2010.cch.kcl.ac.uk/
Series: ADHO (5)
Organizers: ADHO