In Search of Task-Centered Software: Building Single Purpose Tools from Multipurpose Components

  Gary F. Simons

    Summer Institute of Linguistics (SIL International)


The mood in the humanities computing community tends to vacillate between excitement and despair. On the one hand, there is excitement over some of the great tools and techniques that have been developed; on the other hand, there is despair over the fact that they are so underused within the humanities community at large. Section 1 of the paper explores that problem and concludes that its root is that we have been focusing on developing multipurpose tools. Section 2 proposes that if we want to develop tools that will reach the majority of our colleagues, then we must start building tools that focus on supporting just one task at a time. Section 3 describes a software engineering strategy for doing this with an economy of effort; it involves building single purpose tools from reusable multipurpose components.
1. Lamenting the state-of-the-art: underused multipurpose tools
In May of 1996, about two dozen leading figures of the humanities computing community gathered in Princeton, New Jersey for a Text Analysis Software Planning Meeting. The meeting was called to focus on two questions: "Why is it, even though we have developed so many great programs, that so few of our colleagues are using them?" and secondly, "What can we do about it?"
Michael Sperberg-McQueen (1996) in his report on the conference lists four factors in answer to the first question:
1. For many potential users, existing software still seems very hard to learn.
2. Current programs don't interoperate well, or at all.
3. Current programs are often closed systems, which cannot easily be extended to deal with problems or analyses not originally foreseen.
4. Almost all current text analysis tools rely on what now seems a hopelessly inadequate model of text structure; they model text as a linear sequence of words rather than as a hierarchical structure of textual elements.
Points 2 through 4 are certainly problems that must be addressed, but they are largely problems that confront the people who are actively trying to use the existing software. The first point, namely, that most of our colleagues find the software hard to learn, is probably the single biggest factor in explaining why it is underused.
What makes software easy to learn? During the last decade, the Macintosh revolution showed us that replacing a command line interface or a question-and-answer interface with a GUI (graphical user interface) substantially improved learnability. At one point we were optimistic that this would solve the problem, but now that we have GUI-based tools, we are still disappointed that they are underused.
What more will it take before we can hope for widespread use of our software? I think there are still two key hurdles to overcome: familiarity and semantic transparency.
The argument for familiarity was made forcefully by two of the organizers of the Princeton meeting, Susan Hockey and Willard McCarty. They emphasized the fact that the majority of our target audience is not particularly proficient at using computers. The tools we have built are new and unfamiliar; the more complicated they are, the less likely they are to be used. Both Hockey and McCarty, in separate presentations, observed that the World Wide Web has quickly become the most ubiquitous and most familiar part of the computing landscape. They argued that if we really want our software to be used by everyone in the target audience, then we need to figure out how to slip it into the Web framework so that, just by clicking on links and filling in forms, users would be running our software without leaving the familiar surroundings of their Web browser.
The second hurdle, semantic transparency, has to do with how well the task which the program performs corresponds to the task the user wants to perform. If what the program offers to do is the same as what the user wants to do, then the program is transparent; otherwise, it is opaque. If a program is opaque to the user, then it is not user friendly, no matter how nice its user interface is. I fear that this is, unfortunately, the current state of our art.
This point is easy to illustrate. Say that I am a lexicographer who has access to an electronically encoded text corpus that would be a big help, but I don't know how to take advantage of it. First, I am just looking for example sentences to illustrate certain headwords in the dictionary. I ask my computing consultant how I would do this and he says, "Here, this concordance program will do that." Later, I note that the sense definitions for some complex entries don't seem quite right and realize that looking at all the occurrences of the words in the corpus would help to sort things out. I tell my consultant what I want to do, and he says, "The concordance program I already gave you will do that, too." Still later I am focusing on the grammatical categories of words in the dictionary and find that I need to verify some of the category assignments. If I could find all the words with the same tag and then compare their uses in context, I could verify that the tags were applied appropriately. When I take this problem to my computing consultant, he says yet again, "Oh, the concordance program does that, too!"
As a lexicographer I had three distinct tasks in mind:
Find a good illustrative example for a word
Identify the senses of meaning of a word
Determine the grammatical category for a word
When I went for help with software, the answer was always the same. Use the tool that performs the task:
Build a concordance
For some, the relationship between the desired task and the prescribed tool with all of its controls would be transparent; these are the people who would succeed in applying the current software to perform their tasks. But for most, the relationship would not be entirely transparent, and these are the individuals who are most likely to remain potential users.
As software developers we are tool builders, and our instinct has been to build tools that apply to as many situations as possible so that they will be used as widely as possible. This indeed was the starting point a decade ago when I embarked with a team of programmers on a project to develop a general purpose computing environment for literary and linguistic computing [8]. Along the way we have been learning some new ways of thinking as we have used the general purpose environment to develop many single purpose tools as part of the LinguaLinks system [7].
2. Focusing on a solution: task-centered single purpose tools
LinguaLinks is an instance of an electronic performance support system, or EPSS (Gery 1991) [6]. Specifically, it is an EPSS for language field workers that supports tasks in the domains of anthropology, language learning, linguistics, literacy, and sociolinguistics. An EPSS is a computer-based system that seeks to support a knowledge worker in performing his or her job. It does so by integrating the software tools needed to do the job with the tutorial and reference materials that are needed to know how to do the job well. A program in an EPSS gives context-sensitive help that not only explains how the program works, but also explains how to do the job. It gives examples, case studies, guidelines, advice on choosing alternatives, background information, and more. The notion of electronic performance support is gaining momentum throughout the business world as a way to provide just-in-time training for workers in a rapidly changing world.
A software development project typically begins with requirements analysis--representatives of the target users are interviewed to determine exactly what the software must do. The Princeton meeting took this approach when it broke into small groups to discuss who the potential users of text analysis software are and what requirements they might have. The result was a long list of potential user groups and an even longer list of needed functions. But from a performance support point of view, this does not get us any closer to software that people will actually use.
Performance support focuses on the job as opposed to the software. Rather than asking who the potential users are, it asks what is the specific job that needs to be done. Once this is identified, the first step is to perform a task analysis [4]. In this process, the job to be done is broken down into all of its subtasks. These in turn are broken down into even smaller tasks. The knowledge, skills, and attitudes needed to perform each task are identified; so are the tasks that can be supported by automated tools. Once the automatable tasks have been identified, a requirements analysis for each proposed software tool can commence.
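The decomposition described above can be sketched as a simple data structure. This is an illustrative sketch only, not a LinguaLinks artifact; the job and task names are hypothetical examples modeled on the lexicography scenario.

```python
# Illustrative sketch of a task analysis: a job is broken down into
# subtasks, and the leaf tasks that a software tool could support are
# flagged so that requirements analysis can begin with them.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    automatable: bool = False
    subtasks: List["Task"] = field(default_factory=list)

def automatable_tasks(task: Task) -> List[str]:
    """Collect every task in the tree that could be supported by a tool."""
    found = [task.name] if task.automatable else []
    for sub in task.subtasks:
        found.extend(automatable_tasks(sub))
    return found

# Hypothetical breakdown of a lexicographer's job
job = Task("Compile a dictionary", subtasks=[
    Task("Find a good illustrative example for a word", automatable=True),
    Task("Identify the senses of meaning of a word", automatable=True),
    Task("Determine the grammatical category for a word", automatable=True),
    Task("Interview native speakers"),  # needs human skills; not automatable
])

print(automatable_tasks(job))
```

Each name returned by `automatable_tasks` becomes a candidate single purpose tool.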
The result of such requirements analysis is a number of single purpose tools, each of which is focused on performing a particular task. For instance, returning to the example above of tasks in lexicography that could be supported by concordancing, some requirements for single purpose tools would be as follows:
A tool to "Find a good illustrative example for a word" would not only have a pane showing a concordance view of occurrences of the selected word, but could also include filtering controls to limit the genre of the source texts or the length and complexity of the sentences that are displayed. It should also have integrated help that describes not only how to operate the tool but also the criteria one should follow in selecting good illustrative examples.
A tool to "Identify the senses of meaning of a word" needs to go beyond a static concordance display to offer the user a means of classifying each individual occurrence with respect to the sense it exemplifies and then of seeing all the occurrences for a given sense listed together. There should also be integrated help describing heuristics for determining whether two occurrences are different senses or nuances of the same sense.
A tool to "Determine the grammatical category for a word" should not only give a concordance display of all the occurrences of the word, but also give a list of the possible grammatical categories, and then offer a concordance display of all the words that have already been assigned to a chosen category. In this way the user can confirm that the word in question has the same grammatical behavior as other members of the category. There should also be integrated helps offering heuristics for determining whether two words are of the same category or of different categories, and helps on developing appropriate names for new categories that must be posited.
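The first of these tools can be sketched in a few lines. This is a minimal illustration of the idea, not the LinguaLinks implementation: a generic keyword-in-context concordance plus one task-specific filtering control (sentence length) that narrows the display to candidate illustrative examples.

```python
# Sketch of a single purpose tool: a concordance of one word, filtered
# to short sentences that make good illustrative examples. The corpus
# and the length threshold are invented for illustration.

def concordance(sentences, word):
    """Yield (before, keyword, after) keyword-in-context lines."""
    for s in sentences:
        tokens = s.split()
        for i, t in enumerate(tokens):
            if t.lower().strip(".,;") == word:
                yield " ".join(tokens[:i]), tokens[i], " ".join(tokens[i + 1:])

def good_examples(sentences, word, max_len=8):
    """Filtering control: keep only occurrences in short sentences."""
    return [line for line in concordance(sentences, word)
            if len((line[0] + " " + line[2]).split()) < max_len]

corpus = [
    "The dog barked all night.",
    "A dog that had been abandoned by its former owners wandered the streets.",
    "Every dog has its day.",
]
for before, kw, after in good_examples(corpus, "dog"):
    print(f"{before:>25} [{kw}] {after}")
```

The second and third tools would reuse the same `concordance` core, swapping in a sense-classification pane or a category comparison pane in place of the length filter.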
The full paper will include screen shots that demonstrate single purpose tools like the above that have been developed in the LinguaLinks system.
The target community's response to text encoding seems to parallel the response to software. In the same way that multipurpose tools have proven to be too complicated for most users, so have multipurpose text encoding schemes like the TEI [10]. Perhaps encoding schemes, like programs, need to be single purpose to become accessible to most users. The Corpus Encoding Standard [5] is an example of an effort that has developed some single purpose DTDs that are based on the TEI.
3. Building the future: single purpose tools from multipurpose components
Software developers have been building multipurpose tools for an obvious reason--we have not had the resources to build a multitude of single purpose tools. Fortunately, new technologies are available that can make it cost effective to pursue a strategy of building single purpose tools that are truly transparent to the target audience.
The established ubiquity of the Web browser as a user environment and the pending ubiquity of XML [2] as a formalism for data encoding and interchange on the Web give us good fixed points for the front end and back end, respectively, of a new generation of tools for humanities computing. I believe that the key to building the software that lies in between is "componentware" [3],[1]. The notion of components in software development is an analogy to the practice that is common in building hardware systems. A customized personal computer, for instance, can be fairly easily built by piecing together a number of prepackaged components (like a power supply, mother board, circuit cards, and peripheral devices). In software, a component is a generally useful bit of functionality that is housed in a reusable package. A customized program can be built by piecing together pre-existing components (like a chooser list, a concordance window, a sorting specification, and so on). The full paper will explore more details of this approach, including a sample based on the example of concordances for the lexicographer discussed above.
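The composition idea can be sketched as follows. The component names here are invented for illustration and are not the actual LinguaLinks components; the point is only that two generally useful pieces are wired together into one tool that does a single job.

```python
# Hedged sketch of componentware: two reusable multipurpose components
# (a chooser list and a concordance pane) are assembled into a single
# purpose tool for finding example sentences. All names are hypothetical.

class ChooserList:
    """Reusable component: pick one item from a list."""
    def __init__(self, items):
        self.items = items
    def choose(self, index):
        return self.items[index]

class ConcordancePane:
    """Reusable component: show the sentences containing a word."""
    def __init__(self, corpus):
        self.corpus = corpus
    def occurrences(self, word):
        return [s for s in self.corpus if word in s.split()]

class ExampleFinderTool:
    """Single purpose tool assembled from the components above."""
    def __init__(self, headwords, corpus):
        self.chooser = ChooserList(headwords)
        self.pane = ConcordancePane(corpus)
    def run(self, index):
        word = self.chooser.choose(index)
        return word, self.pane.occurrences(word)

tool = ExampleFinderTool(["dog", "cat"], ["the dog barks", "a cat sleeps"])
word, hits = tool.run(0)
print(word, hits)
```

A sense-classification tool or a category-verification tool would reuse the same `ConcordancePane` unchanged, adding only the task-specific component each requires.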
The key challenge facing humanities software developers today is to move the functionality now available in large stand-alone multipurpose tools into a number of smaller reusable multipurpose components. With relatively little effort, these components can then be combined and configured in novel ways to build single purpose tools that incorporate task-specific helps. Tools like this should make it easier for novice users to tap into the riches of humanities computing.
1. Chappell, David. 1996. Understanding ActiveX and OLE: a guide for developers and managers. Redmond, WA: Microsoft Press.
2. Cover, Robin. 1998. Extensible Markup Language (XML) Web Site. <_>.
3. CWC. 1997. ComponentWare Consortium Web Site. <_>.
4. Desberg, Peter and Judson H. Taylor. 1986. Essentials of Task Analysis. Lanham, MD: University Press of America.
5. Ide, Nancy, ed. 1996. Corpus Encoding Standard. <_>.
6. PSEI. 1997. A Webzine published by Performance Support Engineering International, Inc. <_http://www.epss.com_>.
7. SIL. 1997. LinguaLinks: Electronic Helps for Language Field Work, Version 2.0. Dallas, TX: Summer Institute of Linguistics. See also <_>.
8. Simons, Gary F. 1988. A Computing Environment for Linguistics, Literary, and Anthropological Research: technical overview. <_>.
9. Sperberg-McQueen, C. M. 1996. Text Analysis Software Planning Meeting, Princeton, 17-19 May 1996: Trip Report. <_>.
10. Sperberg-McQueen, C. M. and Lou Burnard. 1994. Guidelines for electronic text encoding and interchange. Chicago and Oxford: Text Encoding Initiative.


Conference Info

"Virtual Communities"

Hosted at Debreceni Egyetem (University of Debrecen) (Lajos Kossuth University)

Debrecen, Hungary

July 5, 1998 - July 10, 1998

109 works by 129 authors indexed

Series: ACH/ALLC (10), ACH/ICCH (18), ALLC/EADH (25)

Organizers: ACH, ALLC