Texas A&M University
Sam Houston State University
Studying Picasso as a writer might seem strange, considering
that the Spanish artist is mostly known for his paintings.
However, in the Fall of 2006 we began working on Picasso’s
writings. Audenaert et al. [1] describe the characteristics of
Picasso’s manuscripts and the challenges they pose due to
their pictorial and visual aspects. With over 13,000 artworks
catalogued to date, the On-line Picasso Project [5] also includes
a historical narrative of relevant events in the artist’s life. In
this paper we discuss the contents of the texts—from the
linguistic standpoint—and the implementation of a bilingual
concordance of terms based on a red-black tree. Although
concordances have been widely studied and implemented
within linguistics and humanities, we believe that our collection
raises interesting challenges: first, because of the bilingual
nature of Picasso’s poems, since he wrote in both Spanish
and French, and second, because of the connection between
his texts and his paintings. The work reported in this paper
focuses on the first issue.
Integrating Texts Into the On-line
Picasso Project
A catalogue raisonné is a scholarly, systematic list of an
artist’s known works, or works previously catalogued. The
organization of the catalogues may vary—by time period, by
medium, or by style—and it is useful to consult any prefatory
matter to get an idea of the overall structure imposed by the
cataloguer. Printed catalogues are by necessity constrained to
the time in which they are published. Thus, quite often catalogues
raisonnés are superseded by new volumes or entirely new
editions, which may (or may not) correct an earlier publication
[2]. Pablo Picasso’s artistic creations have been documented
extensively in numerous catalogues. Chipp and Wofsy [3] started
publishing a catalogue raisonné of Picasso’s works that contains
an illustrated synthesis of all catalogues to date on the works
of Pablo Picasso.
In the Fall of 2007 Picasso’s texts were added to the collection
along with their corresponding images, and a concordance of
terms both in Spanish and French was created. The architecture
created for Picasso’s poems is partially based on the one we
developed for the poetry of John Donne [7]. As often happens
in the humanities, each collection has its own characteristics,
which makes a particular architecture hard if not impossible
to reuse directly. For example, Donne’s texts are written in
English; Picasso, in contrast, wrote in both Spanish and French.
The Concordance of Terms
A concordance, according to the Oxford English Dictionary,
is “an alphabetical arrangement of the principal words
contained in a book, with citations of the passages in which
they occur.” When applied to a specific author’s complete
works, concordances become useful tools since they allow
users to locate particular occurrences of one word, or even
more interestingly, the frequency of such words in the entire
oeuvre of an author. Apparently, the first concordances in
English ever put together were done in the thirteenth century,
and dealt with the words, phrases, and texts in the Bible. Such
concordances were intended for the specialized scholar of
biblical texts and were never a popular form of literature. As
might be expected, these were soon followed by a Shakespeare
concordance.
A concordance of the literary works of Pablo Picasso has
more in common with a Biblical concordance than with a
Shakespearian concordance, due to the manner in which the
Spanish artist/poet composed his poems. Many critics have
pointed out the heightened quality of words in Picasso’s
texts, a value that surpasses their own sentential context.
One gets the impression that words are simply selected for
their individual attributes and are then thrown together in the
poems. Picasso himself appears to admit using this technique
when he is quoted as saying that “words will eventually find
a way to get along with each other.” For this reason, readers
of Picasso’s poems become well aware of the frequent
recurrence of certain “essential words,” which one is then
eager to locate precisely in each of the poems to determine
significant nuances.
Figure 1. A tabbed interface presents users with
French and Spanish texts in chronological order. On
the right are text and image presentations.
By narrowing down the number of these “essential words,”
the concordance also allows users to delimit the “thematic
domain” elaborated in Picasso’s writings. Clearly many words
deal with physical duress and the confrontation between good
and evil, as manifestations of concrete human suffering during
the Spanish Civil War and the German occupation of France in
World War II. By identifying the range of words employed, users
can clearly determine the political and cultural environment
that surrounds Picasso’s artistic creations during this period.
Nevertheless, one must not forget that Picasso’s main
contribution to the world is that of a plastic artist. A
Concordance will allow users to identify each of the words
Picasso used and link them to specific graphic images in his
artworks. It has been argued that Picasso’s poetry is quite
“physical” (he often refers to objects, their different colors,
textures, etc.). Even in his compositional technique, one gets
a sense that the way he introduces “physical” words into his
poems emulates the manner in which he inserted “found
objects” in his cubist collages. Many critics have pointed out,
on the other hand, that Picasso’s art, particularly during his
Cubist period, is “linguistic” in nature, exploring the language
of art, the arbitrariness of pictorial expression, etc.
Mallen [6] argues that Cubism explored a certain intuition
Picasso had about the creative nature of visual perception.
Picasso realized that vision involves arbitrary representation,
but, even more importantly, that painting also does. Once
understood as an accepted arbitrary code, painting stopped
being treated as a transparent medium to become an object
in its own right. From then on, Picasso focused his attention
on artistic creation as the creative manipulation of arbitrary
representations. Viewed in this light, painting is merely another
language, governed by strict universal principles similar to those we
find in verbal language, and equally open to infinite possibilities
of expression. A concordance allows users to see these two
interrelated aspects of Picasso’s career fleshed out in itemized
form.
Concordances are often automatically generated from texts,
and therefore fail to group words by classes (lexical category,
semantic content, synonymy, metonymy, etc.). The Concordance
we are developing will allow users to group words in such
categories, thus concentrating on the network of interrelations
between words that go far beyond the specific occurrences in
the poems.
Picasso is a bilingual poet. This raises several interesting
questions connected to what has been pointed out above.
One may wonder, for instance, if Picasso’s thematic domain is
“language-bound,” in other words, whether he communicates
certain concepts in one language but not in the other. A
Concordance will allow users to set correspondences between
words in one language and another. Given Picasso’s strong
Spanish heritage, it would be expected that concrete ideas
(dealing with food, customs, etc.) will tend to be expressed
exclusively in Spanish, while those ideas dealing with general
philosophical and religious problems will oscillate between
the two languages. The possible corroboration of this point is
another objective of the planned Concordance.
Figure 2. Concordance with term frequency.
On the right, term-in-context display.
The Red-Black Tree Implementation
Term concordances require the extraction of terms from a large
corpus along with the metadata related to their occurrences,
an operation that is often computationally expensive. The Digital
Donne, for instance, is a pre-computed concordance. On the
other hand, repositories of texts are under constant revision
as errors are detected and corrected. When the corpus is
modified, part or all of the concordance has to be rebuilt. To
solve this problem, the Picasso concordance is computed on
the fly, without requiring any previous processing.
The repository of poems has initially been divided into poems,
stanzas, and lines, then stored in a database. Using standard join
operations, the poems are reconstructed, allowing the terms
to be retrieved along with additional metadata such as title, section,
and line number. Once the poems have been reconstructed,
each poem line is broken down into terms, which are defined
as a series of characters delimited by a textual space. Each
occurrence of each term, in turn, is inserted into the data
structure, together with its occurrence metadata.
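The term extraction step described above can be sketched as follows. This is a minimal illustration, not the project's code: the row layout, the `Occurrence` fields, and the function name `extract_terms` are all assumptions made for the example, and the rows stand in for the result of the database joins.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Occurrence:
    """Where a term occurs. Field names are illustrative assumptions."""
    poem_id: str
    title: str
    line_number: int
    position: int  # 0-based index of the term within its line


def extract_terms(
    rows: Iterable[Tuple[str, str, int, str]]
) -> Iterator[Tuple[str, Occurrence]]:
    """Yield (term, occurrence) for every whitespace-delimited term.

    Each row is (poem_id, title, line_number, text), i.e. one
    reconstructed poem line plus its metadata.
    """
    for poem_id, title, line_number, text in rows:
        # str.split() with no argument splits on runs of whitespace,
        # so repeated spaces do not produce empty terms.
        for position, term in enumerate(text.split()):
            yield term, Occurrence(poem_id, title, line_number, position)


if __name__ == "__main__":
    sample = [("p1", "Poema", 1, "el sol y el mar")]
    for term, occurrence in extract_terms(sample):
        print(term, occurrence)
```

Terms are kept exactly as they appear between spaces. Case folding or punctuation stripping, if wanted, would be an additional step that the text does not describe.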
Our algorithm relies on an augmented data structure
built on a red-black tree [4,8], where each node
represents one term found in Picasso’s writings and serves as a
pointer to a linked list of that term’s occurrences. A red-black
tree is a self-balancing binary search tree that supports insert,
delete, and search operations in O(log n) time. Because the linked
list only needs to support insertions, and random access
is not required, term occurrences can be
traversed sequentially in O(n) time. Picasso’s literary legacy
can be browsed and explored in two ways: using titles as surrogates,
which are ordered by date, and using the term concordance. Each entry
in the concordance provides the number of occurrences and its corresponding
list, which can be used as an index to browse the poems.
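A data structure of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the project's implementation: it uses the left-leaning variant of the red-black tree (insertion only), a `collections.deque` stands in for the per-term linked list, and the names `Concordance`, `add`, and `entries` are hypothetical.

```python
from collections import deque

RED, BLACK = True, False


class Node:
    def __init__(self, term):
        self.term = term
        self.occurrences = deque()  # stand-in for the linked list
        self.left = self.right = None
        self.color = RED  # new nodes are always inserted red


def _is_red(node):
    return node is not None and node.color == RED


def _rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x


def _rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x


def _insert(h, term, occurrence):
    if h is None:
        h = Node(term)
        h.occurrences.append(occurrence)
        return h
    if term < h.term:
        h.left = _insert(h.left, term, occurrence)
    elif term > h.term:
        h.right = _insert(h.right, term, occurrence)
    else:
        h.occurrences.append(occurrence)  # known term: O(1) append
    # Restore the left-leaning red-black invariants on the way up.
    if _is_red(h.right) and not _is_red(h.left):
        h = _rotate_left(h)
    if _is_red(h.left) and _is_red(h.left.left):
        h = _rotate_right(h)
    if _is_red(h.left) and _is_red(h.right):
        h.color, h.left.color, h.right.color = RED, BLACK, BLACK
    return h


class Concordance:
    def __init__(self):
        self.root = None

    def add(self, term, occurrence):
        self.root = _insert(self.root, term, occurrence)
        self.root.color = BLACK  # the root is always black

    def entries(self, low=None, high=None):
        """In-order traversal: (term, count, occurrences), low <= term < high."""
        def walk(node):
            if node is None:
                return
            if low is None or low < node.term:
                yield from walk(node.left)
            if (low is None or low <= node.term) and (high is None or node.term < high):
                yield node.term, len(node.occurrences), list(node.occurrences)
            if high is None or node.term < high:
                yield from walk(node.right)
        yield from walk(self.root)


if __name__ == "__main__":
    concordance = Concordance()
    for i, word in enumerate("el sol y el mar la luz el sol".split()):
        concordance.add(word, i)
    for term, count, occurrences in concordance.entries():
        print(term, count, occurrences)
```

Calling `entries("l", "m")` visits only the part of the tree holding terms that start with "l", which is one way an alphabetical index could be served. Python compares strings by code point, so accented letters sort after "z"; a Spanish or French index would need locale-aware collation keys instead.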
The retrieval process for terms and their occurrences is
carried out by selecting specific terms or choosing from an
alphabetical index of letters. To achieve this, a subtree is
extracted from the data structure and traversed, obtaining
every occurrence of a term along with additional metadata,
including a unique poem identifier, page, and line number.
Extensible Stylesheet Language Transformations (XSLT) are
used to transform the resulting XML output, extract
the specific term occurrences within a line of the poem,
and produce a surrogate, which is composed of a portion
of the text. Additionally, this surrogate gives access to the
corresponding poems through hyperlinks.
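The surrogate described above, a portion of the line around the term that links back to the poem, can be illustrated with a small function. The project produces it with XSLT over XML output; the sketch below swaps in plain Python purely for illustration, and the function name `make_surrogate`, the `width` parameter, and the URL pattern are hypothetical.

```python
from html import escape
from urllib.parse import quote


def make_surrogate(line_text, position, poem_id, line_number,
                   width=3, base_url="/poems/"):
    """Return an HTML term-in-context snippet linking to the poem.

    position is the 0-based index of the term within the line;
    width is how many neighbouring terms to keep on each side.
    base_url is a made-up URL prefix, not the project's real one.
    """
    terms = [escape(t) for t in line_text.split()]
    start = max(0, position - width)
    window = terms[start:position + width + 1]
    # Emphasize the matched term inside its context window.
    window[position - start] = f"<b>{terms[position]}</b>"
    href = f"{base_url}{quote(poem_id)}#line-{line_number}"
    return f'<a href="{href}">{" ".join(window)}</a>'


if __name__ == "__main__":
    print(make_surrogate("el sol y el mar", 3, "p1", 7, width=1))
```

Keeping the window in whole terms rather than characters matches the space-delimited definition of a term used during extraction.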
A new component of our implementation is a Spanish-French
thesaurus that correlates terms in both languages, along with
their meanings and commentary. Because our concordance is
created on the fly, we have to devise a mechanism to support
this. Additionally, this approach still remains to be tested
with different corpora in other languages, especially those in which
terms are delimited differently and spaces between them play
a different role in language constructs. The term extraction
algorithm is efficient when using spaces as delimiters, a common
case in both Spanish and French. However, other languages
might include composite words.
Acknowledgements
This material is based upon work supported by the National
Science Foundation under Grant No. IIS-0534314.
References
1. Audenaert, N., Karadkar, U., Mallen, E., Furuta, R., and
Tonner, S. “Viewing Texts: An Art-Centered Representation of
Picasso’s Writings,” Digital Humanities (2007): 14-16.
2. Baer, B. Picasso Peintre-Graveur. “Catalogue Raisonné de
L’oeuvre Gravé et des Monotypes.” 4 Vols. Berne, Editions
Kornfeld, 1986-1989.
3. Chipp, H. and Wofsy, A. The Picasso Project. San Francisco:
Alan Wofsy Fine Arts. 1995-2003.
4. Cormen, T., Leiserson, C., Rivest, R., and Stein, C.,
Introduction to Algorithms, The MIT Press, 2nd Edition, 2001.
5. Mallen, Enrique. “The On-line Picasso Project.”
http://picasso.tamu.edu, accessed October 2007.
6. Mallen, Enrique. The Visual Grammar of Pablo Picasso.
Berkeley Insights in Linguistics & Semiotics Series. New York:
Peter Lang. 2003.
7. Monroy, C., Furuta, R., and Stringer, G., “Digital Donne:
Workflow, Editing Tools, and the Reader’s Interface of a
Collection of 17th-century English Poetry.” Proceedings of
JCDL 2007, Vancouver, B.C. 2007.
8. Red-Black Tree. Accessed 2007-11-15,
http://en.wikipedia.org/wiki/Red_black_tree
Hosted at University of Oulu
Oulu, Finland
June 25, 2008 - June 29, 2008
135 works by 231 authors indexed
Conference website: http://www.ekl.oulu.fi/dh2008/
Series: ADHO (3)
Organizers: ADHO