On the Content Model for <respStmt>: Newer is Not Necessarily Better

poster / demo / art installation
Authorship
  1. Sydney D (Syd) Bauman

    Brown University

Work text

The TEI P3 declaration for <respStmt> boils down to
<!ELEMENT respStmt - O ( (resp & name), (resp | name)* ) >
which means that a <respStmt> must contain one <resp> and one <name>, in either order, followed by
any number of <resp>s and <name>s in any combination. (Remember that the globally included elements are
allowed via an inclusion exception on <text>, so they do not need to be mentioned explicitly in this content
model.) So each of the following sequences of elements would be valid content.
(a) <resp>, <name>, <name>, <name>
(b) <name>, <resp>, <resp>, <resp>, <resp>
These make a lot of sense to me. In (a) the <respStmt> is listing all those individuals who shared
or had a particular responsibility. In (b) it lists an individual and all the hats she wore. The content model also
allows for the following.
(c) <name>, <resp>, <name>, <resp>, <name>, <resp>
This also makes a lot of sense: three people, each with his or her own responsibility. This would probably
be better encoded as a series of three <respStmt>s, each with one <name> and one <resp>. But the
content model also allows for
(d) <name>, <resp>, <name>, <name>, <resp>, <resp>, <resp>, <resp>,
<resp>, <name>, <resp>, <name>
What the heck does that mean, I wonder?
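The P3 content model can be tried out mechanically. The following is a minimal sketch in Python that treats a sequence of child elements as a string of one-letter codes and the content model as a regular expression. The encoding (r for <resp>, n for <name>) and the names CODES, P3_MODEL, encode, and valid_p3 are assumptions made for this illustration and are not part of the TEI; globally included elements are ignored, just as they are in the declaration above.

```python
import re

# Encode each child element as one letter: r = <resp>, n = <name>.
CODES = {"resp": "r", "name": "n"}

# P3: ( (resp & name), (resp | name)* )
# The "&" connector means both operands must occur, in either order.
P3_MODEL = re.compile(r"(?:rn|nr)[rn]*")

def encode(children):
    """Turn a list of element names into a string of one-letter codes."""
    return "".join(CODES[c] for c in children)

def valid_p3(children):
    """True if the sequence of children satisfies the P3 content model."""
    return P3_MODEL.fullmatch(encode(children)) is not None

examples = {
    "a": ["resp", "name", "name", "name"],
    "b": ["name", "resp", "resp", "resp", "resp"],
    "c": ["name", "resp"] * 3,
    "d": ["name", "resp", "name", "name"] + ["resp"] * 5 + ["name", "resp", "name"],
}
for label, children in examples.items():
    print(label, valid_p3(children))
```

All four sequences are accepted, while a lone <name> or a run of <resp>s is not, because the leading (resp & name) group insists on one of each.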
The TEI P4 (XML) declaration for <respStmt> boils down to
<!ELEMENT respStmt (resp | name | %m.Incl;)+ >
This content model allows for each of the above sequences of <resp> and <name> (and any other
imaginable sequence of those two elements that would have been valid against the P3 content model), but also
allows the somewhat bizarre
(e) <resp>, <resp>, <resp>
which I suppose is how you would indicate tasks in your project that never got done, and for which
no one is taking responsibility. Even worse, it allows
(f) <name>
which I suppose is how you would indicate a freeloader whom your organization pays, but who does not
actually do anything. And most puzzling, as long as you include at least one element from m.Incl (say,
<cb>; see note 1), it allows you to get away with neither a <resp> nor a <name>:
(g) <cb>
which I suppose is how you indicate that your organization is looking to hire a freeloader who won’t
do anything, but you haven’t found a qualified candidate yet.
Seriously, to me (e), (f), and (g) are clearly errors, and if possible it would be nice to catch them in the
XML validation stage. Further, I consider (d) an error, although I will not be surprised if there are those who
disagree with me. Lastly, I don’t mind giving up the capability to use (c) in the quest to be able to exclude
(d)–(g).
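To see just how permissive the P4 declaration is, here is a small sketch in the same spirit: child elements become one-letter codes and the content model becomes a regular expression. The codes (r for <resp>, n for <name>, i for any element from m.Incl, with <cb> standing in for that class) and the names CODES, P4_MODEL, and valid_p4 are assumptions made for illustration only.

```python
import re

# r = <resp>, n = <name>, i = any element from m.Incl (here just <cb>).
CODES = {"resp": "r", "name": "n", "cb": "i"}

# P4: (resp | name | %m.Incl;)+
P4_MODEL = re.compile(r"[rni]+")

def valid_p4(children):
    """True if the sequence of children satisfies the P4 content model."""
    code = "".join(CODES[c] for c in children)
    return P4_MODEL.fullmatch(code) is not None

print(valid_p4(["resp", "resp", "resp"]))  # (e): responsibilities, nobody named
print(valid_p4(["name"]))                  # (f): a name with no responsibility
print(valid_p4(["cb"]))                    # (g): neither a <resp> nor a <name>
print(valid_p4([]))                        # empty content
```

Only the empty <respStmt> is rejected; anything with at least one child of any of the three kinds passes.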
Thus, in 1998-12, I came up with the following SGML declaration for the WWP; I have included the
comment to explain it a bit:
<!--
*** The following declaration for RESPSTMT is similar to the
*** one found in teicore2.dtd in that NAME and RESP are the
*** only children of RESPSTMT, but different in that we only
*** allow either one NAME or one RESP (and multiples of the
*** other), and we insist that NAMEs come before RESPs. A
*** (purposeful) side effect is that, since we don’t use an
*** “&” connector, the content model is valid XML. Note
*** that, in principle, the content model is
*** ( ( name, resp+ ) | ( name+, resp ) )
*** but that is ambiguous (i.e., non-deterministic).
-->
<!ELEMENT %n.respStmt; - - (
    %n.name;,
    (
        (%n.resp;)+
      | ( (%n.name;)+, %n.resp; )
    )
) >
With the parameter entity references resolved and without some of the extra whitespace, this boils
down to
<!ELEMENT respStmt - - ( name, ( resp+ | ( name+, resp ) ) ) >
The content model here requires that a <name> be first and that a <resp> be last; it allows any
number (including 0) of <name>s or <resp>s, but not both, in between. Thus it allows
(h) <name>, <resp>
(i) <name>, <name>, <name>, <name>, <resp>
(j) <name>, <resp>, <resp>, <resp>
but excludes (a) and (c)–(g). Thus the only concession I need to make is that (a) must be written in the
form of (i), with the <name>s first. Not only does this not bother me, it makes a bit of sense (keeping the encoding of
<respStmt>s a bit more consistent).
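The claims about which sequences this content model admits can be checked with a short sketch. As before, this is only an illustration under assumed conventions: each child is encoded as a letter (r for <resp>, n for <name>, i for an m.Incl element such as <cb>), and the names WWP_MODEL, IN_PRINCIPLE, and accepts are hypothetical. The sketch also compares the deterministic model with the ambiguous "in principle" form from the comment, by brute force over every sequence of up to eight children.

```python
import re
from itertools import product

# r = <resp>, n = <name>
WWP_MODEL = re.compile(r"n(?:r+|n+r)")        # name, ( resp+ | ( name+, resp ) )
IN_PRINCIPLE = re.compile(r"(?:nr+|n+r)")     # ( name, resp+ ) | ( name+, resp )

def accepts(model, code):
    """True if the whole coded sequence matches the content model."""
    return model.fullmatch(code) is not None

sequences = {
    "a": "rnnn", "c": "nrnrnr", "d": "nrnnrrrrrnrn",
    "e": "rrr", "f": "n", "g": "i",
    "h": "nr", "i": "nnnnr", "j": "nrrr",
    "a, names first": "nnnr",
}
for label, code in sequences.items():
    print(label, accepts(WWP_MODEL, code))

# The two formulations should accept exactly the same sequences.
same_language = all(
    accepts(WWP_MODEL, "".join(s)) == accepts(IN_PRINCIPLE, "".join(s))
    for length in range(9)
    for s in product("rn", repeat=length)
)
print(same_language)
```

The two patterns agree on every sequence tested. The difference is only that a parser reading the "in principle" form cannot tell, on seeing the first <name>, which branch of the alternation it is in, whereas the factored form never has to guess.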
So now to make this new content model into a P4 content model all that’s left is to figure out where
and how to insert the globally included elements (see note 2). Luckily the TEI has prepared what could be
thought of as a how-to guide on this very subject (see note 3). I will not delve into the logic in that paper
here; suffice it to say that I think the following is the correct result of applying the logic outlined in that guide.
<!ELEMENT respStmt (
    ( (%m.Incl;)*, ( name, (%m.Incl;)* ) ),
    (
        ( resp, (%m.Incl;)* )+
      |
        (
            ( name, (%m.Incl;)* )+,
            ( resp, (%m.Incl;)* )
        )
    )
) >
This will likely be the WWP replacement for the P4 declaration of <respStmt>, and I am seeking
input as to whether or not this should be used in P5.
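One way to gain some confidence in this declaration is to test the property it is meant to have: a sequence of children should be valid exactly when, with the globally included elements removed, it is valid against the SGML model name, ( resp+ | ( name+, resp ) ). The sketch below checks that by brute force for every sequence of up to seven children. It is an illustration only; the one-letter codes (r for <resp>, n for <name>, i for any m.Incl element) and the names REVISED_P4, WWP_MODEL, valid_revised_p4, and valid_wwp are assumptions, and a regular expression is of course not a DTD parser.

```python
import re
from itertools import product

# r = <resp>, n = <name>, i = any element from m.Incl
REVISED_P4 = re.compile(r"i*ni*(?:(?:ri*)+|(?:ni*)+ri*)")
WWP_MODEL = re.compile(r"n(?:r+|n+r)")

def valid_revised_p4(code):
    """True if the coded sequence satisfies the revised P4 content model."""
    return REVISED_P4.fullmatch(code) is not None

def valid_wwp(code):
    """True if the coded sequence satisfies the SGML (WWP) content model."""
    return WWP_MODEL.fullmatch(code) is not None

# Every sequence of up to 7 children drawn from <resp>, <name>, and m.Incl.
all_codes = (
    "".join(s) for length in range(8) for s in product("rni", repeat=length)
)

# Accepted by the revised model  <=>  valid WWP content once m.Incl is stripped.
consistent = all(
    valid_revised_p4(code) == valid_wwp(code.replace("i", ""))
    for code in all_codes
)
print(consistent)
print(valid_revised_p4("inirir"))  # m.Incl, name, m.Incl, resp, m.Incl, resp
print(valid_revised_p4("i"))       # sequence (g): m.Incl only
```

Within the lengths tested, the globally included elements can appear anywhere without changing which arrangements of <name> and <resp> are permitted, and sequence (g) is no longer valid.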
NOTES
1. “Why is a column break allowed in the <teiHeader>?” I hear you cry. In P3, elements
like <cb>, <lb>, and <pb> are (quite reasonably) inclusion exceptions on <text>. That means they are
allowed to occur inside <text> or any descendant of <text>. This allows you to record the fact that there
was a page break in the middle of a name in your source text:
… And thus
<lb/>it was decreed in the Councell at <name type="place">Nice</name>,
<lb/>that the Byshops should assemble twise
<lb/>every yeare. And in the Councel at <name type="place">Car
<pb n="225"/>
<lb/>thage</name> it was decreed, that the Bysshops ...
(modified from WWP TR00439, John Jewel, “An Apology or Answer in Defence of the Church of
England, 1564”, Bacon, Ann (Cooke), trans.)
while still, because <pb> is not in the content model of <name>, disallowing a <pb> inside a
<name> inside a <respStmt> in the <teiHeader>. Because XML does not have inclusion exceptions,
the content models of <name> and many other elements like it which could appear both in the
<teiHeader> and in <text> need to include the globally included elements. (There are a couple of rare
exceptions, like <titleStmt>, which, although it can be a child of <biblFull>, can appear nowhere
else inside <text>, and since a <biblFull> could not have a <pb> any more than a <fileDesc> could,
<titleStmt> does not need to include such things. At least this is my understanding; please correct me if
I’m wrong.) Thus in P4 a <cb> could indeed occur in the <teiHeader>.
2. You may wonder why I bother allowing the globally included elements in <respStmt> at all. In
most cases <respStmt> occurs in the <teiHeader>, where such elements are not needed; but
<respStmt> can be a child of <bibl> inside the text. Note that I’m presuming that such a <respStmt>
is being authored, not transcribed. Since it is allowed as a child of <bibl>, one could easily imagine that this
more restrictive content model is moot because of the need to encode (From “I’d Like to Teach the World to
Tag”, from spoofters Julia Flanders & Syd Bauman) as
<bibl rend="pre(\() post(\))">From <title rend="pre(&ldquo;)
post(&rdquo;)">I’d Like to Teach the World to Tag</title>,
from <respStmt><resp>spoofters</resp> <name>Julia Flanders &amp;</name> <name>Syd
Bauman</name></respStmt></bibl>
However, I do not think of this as a good way to encode such a reference, as evidenced by the fact
that there is no place to put that ampersand (it isn’t really part of Julia’s name, now, is it?).
3. Thanks to then-TEI editor C. Michael Sperberg-McQueen; see
http://www.tei-c.org/Vault/ED/edw69.sgm, or, pre-formatted into HTML at
http://www.tei-c.org/Vault/ED/edw69.htm.


Conference Info

In review

ACH/ALLC / ACH/ICCH / ALLC/EADH - 2003
"Web X: A Decade of the World Wide Web"

Hosted at University of Georgia

Athens, Georgia, United States

May 29, 2003 - June 2, 2003

83 works by 132 authors indexed


Conference website: http://web.archive.org/web/20071113184133/http://www.english.uga.edu/webx/

Series: ACH/ICCH (23), ALLC/EADH (30), ACH/ALLC (15)

Organizers: ACH, ALLC

Tags
  • Keywords: None
  • Language: English
  • Topics: None