Toward a Morphosyntactic User Model for Language Analysis and Generation: A PhD Proposal

Lisa N. Michaud
michaud@cis.udel.edu
Computer and Information Sciences Department
University of Delaware
Newark, DE 19716

September 9, 1999

Abstract

This proposal paper is being presented in partial fulfillment of the Ph.D. requirements of the Department of Computer and Information Sciences at the University of Delaware. In this paper, I discuss a user modeling architecture for ICICLE, a natural language system intended for use as a writing tutor for deaf learners of written English. This proposed design, intended to model dynamic aspects of a learner over the passage of time, the acquisition of new knowledge, and multiple sessions with the system, includes components to track the history of interaction with a given user as well as a very complex, dynamic model of user interlanguage grammar and domain knowledge. It has been based on research in language acquisition and in the acquisition of cognitive skills. The focus of the work described in this proposal is the development of the model of interlanguage status, which will be used in the analysis of user language production and in the generation of user-tailored explanations.

Contents

1 Introduction
  1.1 The ICICLE System: Motivation and Goals
  1.2 The User Model: A Proposal
    1.2.1 The Demand for a Model
    1.2.2 Components of the Model
  1.3 Guide to this Proposal
2 Related Work
  2.1 Early Explanation Systems
    2.1.1 XPLAIN
    2.1.2 TEXT
    2.1.3 EES
    2.1.4 Discussion
  2.2 Toward User Modeling in Explanation Systems
    2.2.1 TAILOR
    2.2.2 Meno-tutor
    2.2.3 EDGE
    2.2.4 Discussion
  2.3 Computer-Assisted Language Learning
    2.3.1 HyperTutor
    2.3.2 Mr. Collins
    2.3.3 German Tutor
    2.3.4 Discussion
  2.4 Summary
3 ICICLE System Overview
  3.1 Architecture
    3.1.1 Error Identification
    3.1.2 Response Generation
    3.1.3 The User Model
    3.1.4 The Domain Knowledge Base
    3.1.5 The User Interface
  3.2 Motivation
    3.2.1 A Cyclic Approach
    3.2.2 Teaching a Second Language
  3.3 Implementation Status
4 Text Generation in ICICLE
  4.1 Planner Overview
    4.1.1 Content
    4.1.2 Method
    4.1.3 Form
    4.1.4 History
    4.1.5 Manner
  4.2 Operationalizing a Multi-Phasic Text Planner
    4.2.1 Method: a Brief Sketch
    4.2.2 Form
    4.2.3 History: a Revision Approach
    4.2.4 Manner
  4.3 Realizing the System Response
    4.3.1 Comprehensible Input
    4.3.2 Using FUF
  4.4 Presenting the Explanation to the User
  4.5 Recovering from Failed Explanations
5 Proposal: A User Model for ICICLE
  5.1 Reviewing the Demands on the Model
  5.2 Modeling Second Language Acquisition
    5.2.1 Interlanguage
    5.2.2 Focusing on the Frontier of Acquisition: the ZPD
    5.2.3 Toward Modeling the Interlanguage
    5.2.4 SLALOM: A Proposed Model Architecture
  5.3 Modeling Explicit Language Knowledge
  5.4 The History Models
  5.5 Representing a Changing User
    5.5.1 Initialization
    5.5.2 Retrieving the Information
    5.5.3 Updating
6 Summary and Future Directions
  6.1 Completing the User Knowledge Model Architecture
  6.2 Implementation Goals
    6.2.1 Error Identification Using the Model
    6.2.2 Knowledge Model Updating after Text Analysis
    6.2.3 Pruning the Error List
    6.2.4 Response Planning
  6.3 Evaluation
  6.4 Conclusion
  6.5 Acknowledgments

Chapter 1 Introduction

Approaches to explanation planning and generation in natural language systems have generally moved from an origin in simple, highly restrictive techniques to those with greater flexibility in accommodating the context of the generation activity. Generating explanations which are sensitive to their context has been a goal explicitly or implicitly and to varying extents in many systems, but while many text generation systems define "context" as the preceding dialogue alone, in this work I prefer to see the context as encompassing a much broader scope, including: concepts in the domain which can be compared to the topic at hand; the user's skills in the domain; his or her knowledge about domain topics and their supporting concepts (important if the system wishes to select the depth of its explanation or to generate additional explanatory material at need); and the suitability of different tutorial techniques to the strengths of the user. While the relationships of concepts in the domain can be assumed to be static, the other aspects of this redefined context are dynamic (see Figure 1.1) and they form an ever-changing atmosphere which must be taken into account when generating explanations if the result is to be maximally effective with this particular user --- and because these dynamic context elements are all artifacts of the user, to be aware of them the system must model the user and account for how he or she changes over time.
This work addresses the user modeling issues entailed by the ICICLE system, a natural language system under development which uses both language analysis and language generation to tutor users on their English writing skills. I will focus on the following topics with respect to ICICLE's user modeling: what must be modeled, how it will be modeled, why it can be modeled that way, and where I intend to take this design in the scope of my dissertation work.

1.1 The ICICLE System: Motivation and Goals

The name ICICLE stands for "Interactive Computer Identification and Correction of Language Errors" and refers to an intelligent tutoring system currently being developed at the University of Delaware (McCoy and Masterman (Michaud), 1997; Michaud and McCoy, 1998; Schneider and McCoy, 1998; Michaud and McCoy, 1999). The system's primary goal is to employ natural language processing and generation to tutor deaf students on their written English. Of paramount importance to this goal is the correct analysis of the source and nature of user-generated language errors and the production of tutorial feedback on student performance which is both correct and individualized, taking into account the language knowledge, proficiency, and learning style of the student, as well as the context of previous explanations and related concepts in the domain.

ICICLE's interaction with the user takes place primarily through a cycle of user input and system response. The cycle begins when a user submits a piece of writing for review by the system. The system then performs an analysis on this writing, determines its grammatical errors, and constructs a response in the form of tutorial feedback. This feedback is aimed toward making the student aware of the nature of the errors found in the writing and toward giving him or her the information needed to correct them.
When the student makes those corrections and/or other revisions to the piece, it is resubmitted for analysis and the cycle begins again. As ICICLE is intended to be used by an individual over time and across many pieces of writing, the cycle will be repeated many times.

[Figure 1.1: Elements of context.]

Since American Sign Language (ASL) is a distinct and vastly different language from English (Baker and Cokely, 1980), we view the acquisition of written English skills to be a task in second language acquisition for these learners (Michaud and McCoy, 1998). While providing this instruction, ICICLE will therefore try to satisfy the deaf learner's need for understandable second language input. With poor or no aural capabilities, deaf learners receive nearly all of their English input through written material, often academic texts aimed at the comprehension level of their hearing peers (Anderson, 1993). Since the consensus among most researchers in Second Language Acquisition (cf. Krashen, 1985) holds that second language input at or near the learner's level of existing proficiency is most beneficial for learning, we would like to address this poverty of suitable input in our system-generated explanations. The intent for the surface form of our generated text is to focus upon grammatical constructions which involve those aspects of English the student is currently attempting to master, providing positive examples at a level of accessibility that is not always available to our target learners.

Another way in which ICICLE will address the unique needs of the deaf population is by providing the user with corrections on his or her errors without involving a human teacher. Because this form of instruction may entail less "loss of face" for the learner than a situation with a human tutor, the hope is that this will put the students more at ease and encourage them to write more.
Furthermore, it is our hope that the presentation of the feedback will also allow a student to further explore concepts which he or she has not fully understood; in the evaluation of other systems producing user-tailored output, users found the system more accessible than the human authority they would otherwise be consulting (Carenini et al., 1994; Moore and Mittal, 1996).

1.2 The User Model: A Proposal

The current status of the ICICLE system is a functioning text parser with the ability to recognize and label morphosyntactic errors, delivering "canned" one-sentence explanations of each error (see Section 3.3 for more details). In order to extend this system to obtain more accurate parses [The system currently chooses the first grammatical parse of any list of multiple parses for a sentence, or the first of all of the parses if there is no grammatical possibility.] and to involve the generation of original explanations in a manner tailored to the individual learner, the system must be able to collect and refer to information about that learner. It requires a very complex user model which can store and maintain information about a student across multiple sessions of system interaction in order to adapt itself to the changing needs of a student across the learning journey.

The purpose of this paper is to motivate and outline a proposed design for that model and to detail how the development and the implementation of part of that model will proceed as part of my doctoral work. Parts of the user model design have been sketched in earlier work (McCoy et al., 1996; Michaud and McCoy, 1999), but this paper will be the most comprehensive description of the current design, what remains to be developed, and what questions still need to be answered.

1.2.1 The Demand for a Model

Both the error analysis and the system response processes in the ICICLE architecture place demands on a model of the system user.
This section addresses those demands in order to motivate the components of the model being proposed.

In order to obtain a correct analysis of the source and nature of user errors, the error identification module needs to choose among multiple parses or interpretations of a sentence. Some of these parses represent different structural representations of the text, and in the case of ungrammaticality may place the "blame" for the error on different constituents. Other parses may involve the same violated grammar constituent but with different "sources" for the error. Since determining the nature and cause of student errors is an integral step in deciding how to approach instruction (Matz, 1982), the parser must be able to make principled decisions between these options. For instance, if the phrase "My brother like to go..." [This example has been taken from our corpus of deaf writing samples.] has occurred in the writing of a student, there are several possible situations that could have led to this mistake: the student could be entirely unaware of the English rule for subject/verb agreement; the student could know about the rule, but have applied it incorrectly here due to incomplete knowledge; or the student could have simply mistyped. To determine which of these possibilities is correct, it is necessary for the error analysis component to have at its disposal a model of the student's grammatical proficiency which indicates his or her mastery of such language rules, or features, as the concept of subject/verb agreement (McCoy et al., 1996). This knowledge would also aid in choosing between structurally differentiated parses by providing information on which grammatical constructs the user can be expected to use correctly or incorrectly.

Another responsibility of the error analysis component is to pass a list of errors to the tutorial response component for the generation of instructive text.
It is our wish that ICICLE give instruction only on those language components which are at the user's current level of acquisition; errors on those above this level are likely to be beyond the user's understanding, while errors on those which are well-established are likely to be simple mistakes which do not require instruction. This places an additional demand on the user model: not only must it show the user's depth of knowledge on a given feature, but it also must indicate a "current level" to which the features may be compared. With such knowledge, the error analysis component may trim away those errors outside this indicated realm of accessible and productive learning.

Another part of the system requiring user modeling is the system response module. It is our goal to generate explanations which are individualized, taking into account a broad spectrum of factors which constitute the context of the generation activity, the components of which were outlined at the beginning of this paper: related concepts in the domain, the user's knowledge about the topic and supporting concepts, the dialogue history, and the user's history of system use. A need has already been established for a model of the user's grammar proficiency; added to this now is a hierarchical model of the user's domain knowledge --- metalinguistic knowledge of the terms and concepts used in grammatical explanations. For instance, an explanation about subject-verb agreement requires at the very least an understanding of the concepts subject and verb, and furthermore may require an understanding of the person property of nouns. This model will need to represent both the user's knowledge of these concepts and the relationships between them.
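The dependency between an explanation and its supporting concepts can be illustrated with a minimal sketch. It assumes a hypothetical prerequisite table and a flat set of concepts the user is believed to know; the class name, the table entries, and the traversal order are all illustrative assumptions and are not taken from the ICICLE design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DomainKnowledgeModel:
    """Hypothetical stand-in for a model of metalinguistic knowledge."""
    # concept -> supporting concepts it relies on (illustrative entries only)
    prerequisites: Dict[str, List[str]]
    # concepts the user is currently believed to understand
    known: Set[str] = field(default_factory=set)

    def missing_background(self, concept: str) -> List[str]:
        """List unknown supporting concepts, most basic first."""
        missing: List[str] = []
        seen: Set[str] = set()

        def visit(name: str) -> None:
            for support in self.prerequisites.get(name, []):
                if support in seen:
                    continue
                seen.add(support)
                visit(support)  # descend first so that basics are listed first
                if support not in self.known:
                    missing.append(support)

        visit(concept)
        return missing


# Assumed hierarchy, loosely following the subject-verb agreement example.
PREREQS = {
    "subject-verb agreement": ["subject", "verb", "person"],
    "person": ["noun"],
}
model = DomainKnowledgeModel(PREREQS, known={"subject", "verb", "noun"})
gaps = model.missing_background("subject-verb agreement")
```

Under these assumptions, a response planner could consult such a list to decide whether a supporting concept needs its own explanation before agreement itself is explained.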
Another need of the response module is to have history models which not only store the dialogue history in order to facilitate contextual references to recent explanations and to avoid repetition, but which also track how different types of explanations have succeeded or failed with this user. This information would be used when choosing between different explanation types in order to maximize the learner's potential for understanding the explanation.

Finally, Section 1.1 established that one of ICICLE's goals is to provide generated text whose surface form is at an accessible level of syntactic complexity for the user, using grammatical constructs from the "current level" of acquisition in order to aid learning through the provision of positive examples. The final phase of response generation therefore also needs to make use of a user model, obtaining from it information about which constructs are at this level in order to weight its surface-level generation decisions more heavily toward them.

1.2.2 Components of the Model

In cataloguing the demands which the ICICLE system architecture places on a dynamic user model, I have established that this model must have the following components:

o Knowledge Models
  -- A representation of the user's grammar competence in terms of individual morphosyntactic constructions.
  -- A representation of the user's knowledge of domain concepts underlying the constructions mentioned above.
o History Models
  -- A dialogue history model containing a representation of all of the explanations which have been provided to the user during the current system session.
  -- A system history model holding information about what tutorial approaches have been attempted with this user and their relative success rates over all sessions.

In this paper, the ICICLE system component referred to as the "user model" will generally refer to the large knowledge base spanning all four of these components.
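As a rough illustration of how the four components could sit together in one structure, the following sketch groups them in a single record and shows the error pruning described in Section 1.2.1. The three-way acquisition status, the field names, and the choice to drop errors on features the model has no record of are assumptions made for this example only; the proposal does not fix them at this point.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class Acquisition(Enum):
    """Assumed three-way status of a grammatical feature."""
    ESTABLISHED = "established"  # well-established; errors are likely slips
    CURRENT = "current"          # at the user's current level of acquisition
    BEYOND = "beyond"            # above the current level


@dataclass
class ApproachRecord:
    """Outcome counts for one tutorial approach (hypothetical layout)."""
    attempts: int = 0
    successes: int = 0


@dataclass
class UserModel:
    # Knowledge models
    grammar: Dict[str, Acquisition] = field(default_factory=dict)
    domain_concepts: Set[str] = field(default_factory=set)
    # History models
    dialogue_history: List[str] = field(default_factory=list)  # current session
    system_history: Dict[str, ApproachRecord] = field(default_factory=dict)  # all sessions


def prune_errors(model: UserModel, error_features: List[str]) -> List[str]:
    """Keep only errors on features at the current level of acquisition.

    Features with no entry in the grammar model are dropped here, which is
    an assumption of this sketch rather than a stated design decision.
    """
    return [feature for feature in error_features
            if model.grammar.get(feature) is Acquisition.CURRENT]


learner = UserModel(grammar={
    "plural -s": Acquisition.ESTABLISHED,
    "subject-verb agreement": Acquisition.CURRENT,
    "relative clauses": Acquisition.BEYOND,
})
to_explain = prune_errors(
    learner, ["plural -s", "subject-verb agreement", "relative clauses"])
```

In this example only the agreement error would be passed on for instruction; the other two are treated as a probable slip and as material beyond the learner's reach, respectively.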
Where appropriate, the terminology will be refined to "user grammar model," "domain knowledge model," "dialogue history," and "system history." In some cases, the process of modeling the user may be referred to in terms of the knowledge models alone; these are the largest, most complex elements of the user model as a whole, and they will be the primary focus of this proposal and my subsequent research.

1.3 Guide to this Proposal

The rest of this paper will proceed as follows. I will discuss the relevant previous work in the field of user modeling within tutoring and explanation systems in Chapter 2. In Chapter 3, I will then give a short overview of the architecture and approach of the ICICLE system as a whole. In Chapter 4, I will focus upon the generation aspect of the system, outlining an intensely context-aware text planner which will be making use of the user model. Finally, in Chapter 5 I will discuss the specifics of my user model design and address the implementation issues for placing this model within the ICICLE system. Chapter 6 contains a summary which outlines my plan of attack on this work.

Chapter 2 Related Work

This chapter gives an overview of the efforts of previous explanation-generation systems both within and outside of the field of Computer-Assisted Language Learning. My main intent is to sketch the evolutionary direction of systems which provide tutorial instruction and to compare this direction against ICICLE's design and goals.

2.1 Early Explanation Systems

As mentioned in the Introduction, the tendency of explanation-generation systems has been to move from relatively inflexible beginnings to systems with higher levels of adaptivity to context. In particular, while domain knowledge bases have been a required source of information from the beginning of generation efforts, the extent to which the systems have modeled user-specific information such as the user's knowledge and the dialogue history has increased greatly over time.
This section briefly surveys early explanation-generation systems in order to illustrate this progression.

2.1.1 XPLAIN

William Swartout gave his XPLAIN system (Swartout, 1983) the task of explaining how an expert consulting system arrived at conclusions or why it asked the user certain questions. Its primary goal was to allow a user to understand the reasoning behind an expert system's actions in order to ensure that the user had faith in the recommendations made by the system. XPLAIN acquired this capacity to explain an expert system by providing the programmer with an environment in which to design the expert system. During the design process it tracked how the programmer connected the descriptive domain model (containing facts in the domain of the expert system) with prescriptive domain principles (containing the heuristics and methods for operating in that domain) and then stored these connections for reference when it needed to explain the methods or heuristics. XPLAIN was implemented as part of a reimplementation of Digitalis Therapy Advisor, a medical advising program for doctors.

XPLAIN's generation process was essentially what is called "database tracing," where the generator iterates through a relevant portion of a database and outputs phrases whose organization mirrors that which is hard-coded into the knowledge representation of the system. Each step of reasoning encoded in the system was transformed into a phrase, and the only way in which XPLAIN was able to diverge from the structure of the database was to omit the phrase explaining a given step in a process. One situation in which the system did this was if the step was deemed redundant; if the user was asking about that step, the statement that the step occurred was deemed unnecessary. The system also used a notation called a "viewpoint" marked on each element in the database to determine inclusion or exclusion in an explanation.
In the actual implementation of the system, the only viewpoint which was relevant was the "computer" viewpoint, which was used to indicate that the step was deemed to be important only to the internal workings of the system. This was the case when the step was an artifact from the process of breaking down the procedure into a computer algorithm, and was therefore at too primitive a level to be relevant to an explanation presented to a human. Beyond this, the only viewpoint the system made use of was that of a medical professional, the intended audience for the Digitalis Therapy Advisor program.

The design of XPLAIN did not consider the user as an individual, although the "viewpoints" were intended to be extended in that direction. There was no attempt by the system to establish, maintain, or reference a model of user knowledge; instead, the system assumed a "perfect learner" who understood everything that was explained. This was typical of early explanation work, as shall be illustrated in the next few sections.

2.1.2 TEXT

Another early explanation strategy is the well-known founding work in natural language generation, Kathleen McKeown's schema approach (McKeown, 1985). Operating from the premise that a generation system can use the same discourse strategies humans use in structuring their discourse, McKeown cataloged the rhetorical techniques humans use to present information as rhetorical predicates. Examples included analogy with a known concept and evidence supplied for a given fact. She combined these predicates into four schemata which generalized the paragraph structures she found in naturally occurring text: Attributive, Identification, Constituency, and Compare and Contrast. Essentially, the four schemata represented patterns of rhetorical predicates belonging to coherent paragraphs. They were used for generation in the TEXT system when answering user questions about the structure of a database.
The resultant text, structured by these schemata, was organized in a manner independent of the database structure, freeing the database to be laid out in the manner most suited for the system's internal representation of the data, while the schemata could build from this the structures most suitable for human consumption. The schemata also enabled a generator to produce far more variety than the database traces of Swartout's system. Different purposes for the explanation could result in entirely different structures, not just a change in the level of abstraction, and there were decision points within the schemata where focus constraints could select between options to vary the structure. The result was a certain level of flexibility both with respect to the previous dialogue and with respect to the question that needed to be answered.

There was no user model in TEXT; it assumed a "static, casual and naive user." This user was taken into account, since the text was structured specifically to present the information in a way tailored to a generic human user, but no individuality was acknowledged. As in XPLAIN, no allowance was made for the user misunderstanding the text, either; because the user was assumed to have always understood, the system always trimmed subsequent explanations to avoid details which had been stated earlier.

2.1.3 EES

In the next step of abstraction from the rigidity of database tracing we find the Explainable Expert System (EES) (Moore and Paris, 1989; Moore and Paris, 1992), originally implemented in the Program Enhancement Advisor system (PEA), a tool which assisted users in writing better Lisp programs. Instead of laying out entire paragraph structures as in McKeown's approach, Moore and Paris used Rhetorical Structure Theory (Mann and Thompson, 1988) to recursively structure text with the nucleus/satellite structure defined in RST, where intentional relations linked "spans" of text together.
Unlike the approaches described above, EES modeled its user; the representation it used was a collection of beliefs about the domain and goals within the domain. Its user was dynamic, learning as time passed, and imperfect, capable of misunderstanding an explanation. This imperfection was handled through a text planning approach which stored detailed information about the decisions that were made so that explanations could be reattempted with intelligent modifications that took into account the likely causes of an explanation's failure.

The EES planner used an agenda-based mechanism to post communicative goals represented as effects the system desired to have on the beliefs and/or goals of the user. A library of planning operators was available to apply "linguistic resources," or rhetorical techniques similar to McKeown's predicates, to meet a given goal. In that way, each operator was a kind of miniature schema, detailing some short sequence of actions to achieve a given communicative goal. The planner began its process by posting a general communicative goal on the agenda and then searching for planning operators which solved that goal. Selection of a given operator depended on the satisfaction of its constraints, which referenced the domain database, the user model, and the dialogue history in order to limit the situations in which it could be applied. While some of these constraints were considered "rigid," the constraints on the user model were treated in a loose fashion; if nothing in the user model gave any information about the user's knowledge concerning a specific topic, the constraint was considered satisfied and the assumption that the user knew this topic (in the absence of information to the contrary) was recorded in the plan if this operator was chosen. Once selected, the operator's subgoals would be examined.
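The loose treatment of user model constraints described above can be sketched as follows. This is an illustration only, not a reconstruction of the EES implementation: it covers the user model constraints alone (not those on the domain database or dialogue history), and the operator names, the goal label, and the boolean belief table are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Operator:
    """A planning operator reduced to the parts this sketch needs."""
    name: str
    goal: str              # communicative goal the operator can achieve
    must_know: List[str]   # topics the user is required to know


def user_constraints(op: Operator,
                     beliefs: Dict[str, bool]) -> Optional[List[str]]:
    """Return the assumptions made if the operator applies, else None.

    A topic missing from `beliefs` means the model is silent about it:
    the constraint counts as satisfied and the topic is recorded as an
    assumption. Only an explicit False blocks the operator.
    """
    assumed: List[str] = []
    for topic in op.must_know:
        status = beliefs.get(topic)
        if status is False:
            return None
        if status is None:
            assumed.append(topic)
    return assumed


def select_operator(goal: str, library: List[Operator],
                    beliefs: Dict[str, bool]
                    ) -> Optional[Tuple[Operator, List[str]]]:
    """Pick the first operator for the goal whose constraints hold."""
    for op in library:
        if op.goal != goal:
            continue
        assumed = user_constraints(op, beliefs)
        if assumed is not None:
            return op, assumed  # assumptions travel with the plan step
    return None


LIBRARY = [
    Operator("describe-by-analogy", "explain-concept", ["related-concept"]),
    Operator("define-from-scratch", "explain-concept", []),
]
chosen, assumptions = select_operator("explain-concept", LIBRARY, {})
fallback, _ = select_operator(
    "explain-concept", LIBRARY, {"related-concept": False})
```

Keeping the assumptions alongside the selected operator is what would later allow a planner to treat them as suspect if the explanation fails.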
Since the operators were based on RST relations, the subgoals were classified as nucleus and possible satellites, between which was a specific RST intentional relation. Each subgoal was either a semantic specification of some primitive speech act or an additional communicative goal to be posted to the agenda. The planner continued until all goals from the agenda had been processed and refined down to speech acts.

The hierarchical plan built by EES maintained the intentional relationships between each nucleus and satellite, from the upper level spanning the entire explanation to the lowest level between each semantic proposition. McKeown's schemata had also represented intentions but only at the top level spanning all of the schema's text; by associating intentions with the smaller pieces of text in the hierarchy, the planner of EES could reason about what effect each part of the text was intended to achieve. In the case of explanation failure, that information allowed the system to reevaluate just the relevant pieces of the text when generating a new explanation. Another part of the plan which allowed for reevaluation was the recorded user model assumptions; since there was no strong evidence in favor of an assumption, it was considered suspect on explanation failure.

2.1.4 Discussion

The earliest machine-generated explanations, as Swartout (1983) points out, were canned, pre-typed text that was presented to the user. The flexibility of this approach is of course nil; not only is there one and only one explanation available for a given concept with no variation, but all of the user's explanation needs must be anticipated during the design of the system. Similar to early text adventure games where you could not PUT FISH IN BOWL unless the game creator anticipated that you would try that action, those early systems could not provide any explanation the programmers had not foreseen that you would need.
XPLAIN and its contemporaries generated text by iterating through a database and converting the data into text as it was found. This freed the system to generate explanations on any aspect of the system without having to pre-store the explanations first, but this approach produced text which was entirely dependent upon the structure of the database; the knowledge engineer predetermined the text's structure by the way he or she designed that database, and there was no variation from this structure. No account was made for the possibility of different explanations serving different purposes, except for rough modifications like producing the explanation at different levels within the database hierarchy.

McKeown's work was a significant step forward from this because it enabled the system to impose a structure on the text which was independent of the way it was represented in the database. Her schemata allowed for different structures to be used according to the purpose of the explanation. She was taking the user and the system's interaction with the user into account, letting the context influence more of the generation process. However, her user was generic, unchanging from one individual to the next and static in his or her knowledge.

In systems like EES, we see the emergence of the idea of accounting for the user as a unique person who has individual needs that can influence how the system addresses him or her. The result is the introduction of a user model containing what the system believes are the beliefs and goals of the individual, acknowledging that these beliefs and goals may differ from those held by another individual. With the introduction of the user model, however, comes a host of issues that the EES work did not address. While the EES planner made use of a model, the issues of model initialization, updating, and correction were outside of the scope of that work. In the following section, I discuss systems which explored these issues in more detail.
2.2 Toward User Modeling in Explanation Systems

The primary goal of user modeling could be defined as assessment of the user's changing knowledge over time in order to adapt the communication of knowledge to that user (Spada, 1993). ICICLE's design certainly espouses this view, since we desire to accommodate a dynamic, learning individual and to tailor the delivery of knowledge to that individual as closely as we can. Spada also categorizes user models into three basic types: ideographic, modeling one specific person; prototypic, modeling a population of individuals with no variation; and individualized, starting out with population assumptions and adjusting for individual variation. One could argue that XPLAIN and TEXT were using prototypic models that were built into the systems themselves; with no individual variation, all decisions based on the population of users could be predetermined and hard-coded into the planning mechanism of the system. EES, on the other hand, used an ideographic user model, representing the individual as a collection of beliefs and goals without taking the population into account.

In this section I will introduce several explanation systems whose approaches span both the ideographic approach and, later, the individualized approach, augmenting the modeling of an individual with information about the population; it is this type of modeling system which ICICLE is going to use for its own purposes.

2.2.1 TAILOR

Cécile Paris's TAILOR system (Paris, 1987; Paris, 1988) presented the idea that the user's domain knowledge should be used to select the discourse strategy for structuring the text. A common approach predating her work was to accommodate user knowledge by simply changing the amount of information presented to the user, as in XPLAIN and TEXT, which left out elements the user was assumed to know because they occurred in recent discourse.
Paris stressed the need to affect the type of information as well, pointing out that different kinds of explanations are needed for experts and naive users; their different domain knowledge leads them to be capable of understanding different representations of the information available on the topic. In order to implement her approach, she designed two discourse strategies. The constituency schema, based on McKeown's work, was a strategy designed to describe an object by its constituent parts; the process trace explained the processes associated with the object. She also developed the distinction between the "declarative" constituency approach, which structured text according to the abstract organization in the schema, and the "procedural" process trace, which followed the database structure closely by way of following directives on how to access the knowledge. The choice between these two strategies depended partly on what information was available in the database, but mostly on what the user knew about the topic, as described below. As in previous approaches, TAILOR also used the user model to control the information selected for inclusion in the description, pruning information that was well known or easily inferred and inserting that which was unknown. There were two types of knowledge she represented in the user model: knowledge about objects in the domain, and knowledge about the basic concepts underlying the domain. The knowledge base represented objects in a way that included the related mechanical processes, and an expert's knowledge included the functionality of most objects and processes in the domain, while a naive user did not know about the specific objects and did not understand the underlying basic concepts. Since a given learner could be anywhere between this definition of expert and naive, the user's knowledge was represented by listing which of the objects and underlying concepts were known.
In that way, she could represent a "continuum" of expertise along which the user resides. This design is what is termed an "overlay" model, where the user's knowledge is represented as some subset of the knowledge represented in the system. TAILOR executed the discourse strategies by iterating through augmented transition networks representing the two strategies. The entry into the network at the initial level was based partly on the availability of information in the database --- if there existed no process information connected with the given concept, the constituency schema had to be chosen --- but the decision rested mostly on the user domain expertise. If the user had no local expertise (knowledge about the specific object), the system chose a process trace; otherwise, the system could opt for a constituent description on the basis that the more advanced user would be able to infer the processes involved without explicit explanation. The constituent schema was also chosen if the user had local knowledge of most or all of the functioning subparts of the object. At key points in the networks, when a new part or superordinate was introduced, the process recursed and could choose either strategy for the new object. As the network was traversed, information was retrieved in the database to fill out semantic propositions. Roughly utterancesized, these semantic propositions would be translated into text at the completion of network traversal. TAILOR did not establish the initial state of its user model; it was given as a series of parameters in its input. Because of this, the system did not have any need to make complex reasoning about the possibly incomplete nature of the model; it operated from the assumption that what it was given was correct. It did, however, update the model over time; embracing the perfect learner assumption, TAILOR changed a user model item to "known" whenever it was explained. 
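The overlay model and the strategy choice described for TAILOR can be illustrated with a minimal Python sketch. It is not Paris' implementation: the names `OverlayModel` and `choose_strategy`, the example objects, and the reading of "most" subparts as "more than half" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OverlayModel:
    """Overlay model: the user's knowledge is a subset of the system's (hypothetical class)."""
    known_objects: set = field(default_factory=set)
    known_concepts: set = field(default_factory=set)

    def mark_explained(self, obj):
        # Perfect-learner assumption: whatever has been explained becomes known.
        self.known_objects.add(obj)


def choose_strategy(obj, subparts, has_process_info, model, majority=0.5):
    """Return 'constituency' or 'process_trace' for describing obj.

    The `majority` threshold is an assumed stand-in for "most or all" subparts.
    """
    if not has_process_info:
        # With no process information in the database, only the schema is possible.
        return "constituency"
    if obj in model.known_objects:
        # Local expertise: the user can infer the processes involved.
        return "constituency"
    known = [p for p in subparts if p in model.known_objects]
    if subparts and len(known) / len(subparts) > majority:
        # The user knows most of the functioning subparts.
        return "constituency"
    return "process_trace"
```

A user with an empty model would receive a process trace for any object that has process information, and once `mark_explained` has been called for that object, later descriptions of it would switch to the constituency schema.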
2.2.2 Menotutor

Beverly Woolf's Menotutor system (Woolf, 1984; Woolf and McDonald, 1984), although predating the work described above, was more sophisticated in the way it modeled its user's knowledge and incorporated that model into its planning decisions. Woolf stressed the importance of planning text which was "context-dependent," adapting to the context of the student and the discourse history. As with previous systems, she based her instructive approaches on observations of human strategies, but one important difference from Paris' work was her emphasis on the responsibility of the system as teacher, not only to make high-level distinctions between different methods of explanation, but also to have a level of flexibility detailed enough to avoid teaching above or below a user's current level of understanding. An important and novel aspect of Woolf's user modeling work was her concern with model establishment and updating; she set goals for Menotutor to model student knowledge accurately, to update that model over time, and to use the model to find the most effective presentation method for that individual. This model was not given as input as in TAILOR, and it was updated over time by the system both to become more accurate and to reflect changing user expertise. Although one could describe Menotutor's user model as an overlay model, it had several qualities that differed from the standard. The student's knowledge was modeled as an annotation of a domain knowledge base which included both correct concepts in the domain and misconceptions that a student could possess.
[Some definitions of the "overlay" approach would restrict it to describing a model which represents the user's knowledge as a subset of the system's correct knowledge about the domain; I extend it here to include both correct knowledge and misconceptions because the user's knowledge is still considered to be a subset of a predetermined set of facts, including incorrect facts; i.e., the user cannot be noted as having a misconception outside of those provided in the model.] Also, instead of merely taking note of which items were known by the student, Woolf's system labeled each item with a numerical Expected Competence rating indicating the strength of the user's knowledge. This number was set to a default value at the beginning of interaction with a student, and then revised over time as the system interacted with the user and garnered more knowledge about his or her competence; Woolf held that student-made errors were "powerful clues" for the tutor, who was able to use them to determine the strengths and weaknesses in the student's knowledge. If the student answered questions at that level of competence successfully, the value went up, but if he or she failed to answer questions on that or lower levels, the value was altered downward. The value could, therefore, dip below the default value to which it was initialized, if that value had overestimated the user's knowledge. This model allowed the system to ignore isolated errors, because if a user had a high level of Expected Competence on a concept, a mistake answering a question was likely to have been just a mistake and not a true indication of error. Also, this detailed rating allowed the system to tailor instruction to the precise level of the student, avoiding topics which were below the student's "threshold of learning" because they would be too easy, and avoiding those which were above because they would be too hard.
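The Expected Competence mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0 to 100 scale, the default of 50, the step of 10, the slip threshold of 80, and all names (`ExpectedCompetence`, `record_answer`, `is_probable_slip`) are hypothetical and are not taken from Woolf's system.

```python
class ExpectedCompetence:
    """Per-concept numerical ratings, revised as the student answers questions."""

    def __init__(self, default=50, step=10):
        # Scale, default, and step size are assumptions for illustration.
        self.default = default
        self.step = step
        self.ratings = {}

    def rating(self, concept):
        # Concepts not yet observed carry the default value.
        return self.ratings.get(concept, self.default)

    def record_answer(self, concept, correct):
        # Success raises the rating and failure lowers it, possibly below the default.
        delta = self.step if correct else -self.step
        self.ratings[concept] = max(0, min(100, self.rating(concept) + delta))

    def is_probable_slip(self, concept, high=80):
        # An error on a highly rated concept is treated as an isolated mistake.
        return self.rating(concept) >= high
```

Because a failure lowers the rating from wherever it currently stands, a default that overestimated the student is corrected downward after a few observations, which matches the behavior described above.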
The text generation process in Menotutor iterated through forty network-like states, progressing through three distinct planning levels in a discourse management network, or DMN. In a way, Woolf's DMN echoed the schemata approach, laying out basic patterns from discourse with choice points at various locations where the context could affect which alternative was taken. At each step in this multilevel network, information described what text was to be produced and what paths extended from that point. Although there were "default" paths to take from each point, representing the "context-independent" way of implementing the chosen method of discourse, one of twenty meta-rules could be fired at any point by matching conditions with the context --- the student model or the discourse history --- in order to divert the explanation's development along a "context-dependent" path to a new state somewhere else in the network. The conditions they tested involved the user's command of certain topics, the system's confidence ratings, and the existence of related topics in the domain. Since the user model was assumed to be possibly incomplete, when the system did not have a high certainty of user knowledge, it questioned the user in order to determine how to proceed. The first level in the network, the pedagogic level, involved selecting a general tutorial approach. This choice established the overall expository style, determining the number of times the system would allow the user to interrupt and the amount of questioning the system would perform. The two possibilities were "Socratic" versus "coach-like," translating to a style which would involve a high level of interaction or a style with a low level. Chosen at the beginning of the extended discourse, the strategic selection would remain active until the system perceived troubles with the student, in which case the pedagogic approach could be switched in hopes that the other method would be more effective with this individual.
The second level entailed the construction of a strategy to implement the pedagogy. Choices here might be between questioning the student, describing a concept, or choosing a new topic, driven partly by the pedagogic approach, partly by the user knowledge, and partly by the discourse history. At the third level, the tactical choice was between the speech patterns and language structures that implemented the strategy. Over the long term, Menotutor tracked the success of its chosen pedagogical approaches, and it could change which strategies it used if the current one was not succeeding with a given student. Planning terminated according to the length of the explanation, ending when it reached a certain size.

2.2.3 EDGE

Alison Cawsey's work with EDGE (Cawsey, 1990; Cawsey, 1993) continued the emphasis on what she termed informative explanations --- explanations which take into account the user's knowledge, linking with the user's existing understanding and leaving out superfluous information. As Paris and Woolf had also concluded, Cawsey defined the actions necessary for accomplishing informative explanations to be deciding what material should be included in the explanation and choosing between different ways of structuring the text. Her agenda-based planner operated in the domain of electronic systems and consulted its model of the user's domain knowledge when selecting a discourse strategy and when deciding on exclusion or inclusion of information, as is described below. Like Menotutor, EDGE both established and updated its model over time. Cawsey held that an initial user knowledge model, based on some gross generalization about the user, was going to be inherently inaccurate, but that an interactive system could improve that accuracy over time by continually making note of new information about the user and marking it down. The model also needed to change in order to reflect growth in the user's knowledge.
Her model was therefore initialized according to the level of expertise the user assigned himself. Each concept in the user model bore a rating to show which of the four levels of expertise was stereotypically required for knowing it, so with a broad categorization of the user's expertise, the system could make certain guesses about specific user knowledge on each concept until more information came in. As the tutorial dialogue progressed between human and computer, information about the user's knowledge on specific concepts was derived from that dialogue and marked in the model, overwriting the initial guesses. This was similar to Woolf's approach except that Cawsey's model took the individual into account at least at a rough level when assigning the initial settings. The model in which this information was recorded was a database overlay whose precision lay somewhere between those of TAILOR and Menotutor; backing off from a numerical scale which might introduce finer detail than the system could accurately support, the tags on each concept consisted of the basic known and unknown plus maybe-known and undecided to indicate different levels of system confidence. The contents of the model were hierarchically organized into topics and subtopics, from general subjects (e.g., how certain types of devices work) to specific knowledge (such as what a certain indicator on a certain device means). This was supplemented by a general estimate of the user's overall expertise level, starting at the level the user assigned himself at the beginning of system use and changing as the system made it more accurate or updated it when the user progressed. In the course of system-user interaction, the user both answered questions posed by the system to determine his knowledge and posed questions of his own. These actions gave the system new data about the knowledge of the user, and old data stored in the model was overwritten, improving the model's accuracy.
User actions were not the only instigator of model updating, though; changes were motivated by system actions as well. As in earlier systems, EDGE wished to believe that once a concept had been explained, it was known; but taking into account a possibly imperfect learner, the system only went so far as to revise the tag on a concept to maybe-known status once an explanation had been delivered. In some cases, the system needed to know information on a concept about which it had no explicit judgment yet. In those cases, the generalization hierarchy could be used to infer the level of user knowledge according to specific inference rules. For instance, if all subconcepts were known, then the parent concept could be inferred as known; on a more general and less reliable level, if the "concept difficulty" rating was greater than the current level of general user expertise, the concept was inferred to be unknown. Since the assumptions on which these deductions were made changed over time, implicit information was not recorded in the user model; it was derived from the current model on each occasion that it was needed. In cases where the rules of inference did not give the system any more data about a particular concept and it really needed to know the user's knowledge about it in order to proceed, it questioned the user directly to determine the user's level of knowledge before planning the explanation. Since the user's general level of expertise was used not only in model initialization but also for these implicit decisions, it was also dynamic, and as with Menotutor, the level of expertise could be revised either upward or downward; if the user answered difficult questions, the system increased its estimation of the user's knowledge, and if the user asked questions about easy concepts this led to a downward revision.
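The two inference rules mentioned above, and the cautious update after an explanation, can be sketched as follows. This is a hedged illustration rather than Cawsey's code: the `Concept` class, the function names, the example concepts, and the decision never to downgrade a concept already tagged known are assumptions.

```python
from dataclasses import dataclass, field

KNOWN, UNKNOWN, MAYBE_KNOWN, UNDECIDED = "known", "unknown", "maybe-known", "undecided"


@dataclass
class Concept:
    name: str
    difficulty: int  # expertise level (1 to 4) stereotypically needed to know it
    children: list = field(default_factory=list)


def infer_status(concept, explicit, user_level):
    """Derive a tag on demand; nothing inferred is written back to `explicit`.

    Returns None when no rule applies, meaning the user would have to be asked.
    """
    tag = explicit.get(concept.name, UNDECIDED)
    if tag != UNDECIDED:
        return tag  # an explicit judgment always wins
    if concept.children and all(
        infer_status(c, explicit, user_level) == KNOWN for c in concept.children
    ):
        return KNOWN  # all subconcepts known, so the parent is inferred known
    if concept.difficulty > user_level:
        return UNKNOWN  # less reliable rule based on general expertise
    return None


def after_explanation(explicit, name):
    # Imperfect-learner caution: an explained concept becomes maybe-known only.
    # Leaving an existing "known" tag untouched is an assumption of this sketch.
    if explicit.get(name) != KNOWN:
        explicit[name] = MAYBE_KNOWN
```

Deriving the implicit tags on each call, instead of caching them, mirrors the point that the assumptions behind these deductions change as the general expertise estimate is revised.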
This complex user model was used in three aspects of planning in Cawsey's system: the planner referred to it to determine which strategy to use for structuring the overall explanation, what level of detail to use, and what background or optional information to include. The planner was agenda-based, starting with the overall goal to describe a given circuit and posting this on the agenda, and then searching for content-planning rules to accomplish this goal. EDGE had 25 of these rules, each representing a pattern of subgoals which defined one possible way of describing some aspect or aspects of the circuit; for instance, one rule made a comparison to a similar circuit the user was familiar with, while a different one identified the type of the device, listed its components, and explained its function. These rules made a distinction between "subgoals" and "preconditions," where subgoals always had to be satisfied by planning text but preconditions only resulted in text if not already satisfied by the user model. In the case of the rule comparing the circuit to another one, that was a subgoal and would always result in text; in the case of the other explanation, each of the three subparts was a precondition, so if the user already knew the components of the device, they would not be listed. As a result, the details already known to the user were not reiterated unnecessarily.
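The distinction between subgoals and preconditions can be made concrete with a small Python sketch. The rule encoding as a list of `(kind, goal)` pairs, the goal names, and the `expand_rule` function are hypothetical; only the behavior (subgoals always planned, preconditions skipped when the user model satisfies them) comes from the description above.

```python
def expand_rule(steps, user_knows):
    """Expand one content-planning rule against the user model.

    steps: ordered list of (kind, goal) pairs, kind being "subgoal" or "precondition".
    user_knows: callable returning True if the user model already satisfies a goal.
    Returns (goals_to_plan, skipped_goals).
    """
    to_plan, skipped = [], []
    for kind, goal in steps:
        if kind == "precondition" and user_knows(goal):
            # Already satisfied by the user model: no text is planned for it.
            skipped.append(goal)
        else:
            # Subgoals, and unsatisfied preconditions, always yield text.
            to_plan.append(goal)
    return to_plan, skipped


# Hypothetical encoding of the rule that identifies, lists, and explains a device.
describe_device = [
    ("precondition", "identify-type"),
    ("precondition", "list-components"),
    ("precondition", "explain-function"),
]
plan, skipped = expand_rule(describe_device, lambda g: g == "list-components")
```

Returning the skipped goals alongside the plan keeps a record of every omission that rested on the user model, which is the information a planner would need if the explanation later turned out not to be understood.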
The planner tracked every time it opted not to expand a goal into text because of the possibility that the decision was based on faulty information; if the user's subsequent actions indicated that the explanation was not fully understood, the model-based assumptions that had led to omissions in the text were the first suspects among the reasons why the user failed to understand [Note the difference between "assumptions" in EES and EDGE; in EES, "assumptions" were recorded when the system made a decision not based on the user model because it was incomplete; in EDGE, they were recorded when the system made any decision based on the user model because it might be incorrect.]. Another way in which the planner referred to the model was to decide whether to plan out dialogue actions in the case of dubious user model information. I mentioned earlier that the system was uncertain at times whether a user was familiar with a given concept. The planner would poll the user's understanding first in those cases rather than explain something the user already knew. When the planner reached a primitive, it was generated at that time, so EDGE employed an incremental form of realization; each time a goal was refined down to text, that text was passed through a level of discourse planning (where discourse markers were added for coherence) and then realized by way of filling in utterance templates in the planning rules. One reason EDGE did this was that the explanations tended to be long and involved, and the system might interrupt itself at any time to ask questions of the user; the user might also ask questions along the way. Since this interaction could result in different user model information along the way, executing dialogue actions as they arrived prevented the system from making extensive plans that would have to be scrapped in light of new information.
2.2.4 Discussion

These three systems have significance for the work on ICICLE both for their user modeling efforts and for their generation techniques. In the area of user modeling, all three of these systems used variations on the overlay design, a representation that allows a system to model users not as merely belonging to rough categories of experience but as unique individuals anywhere between complete naïveté and total expertise, possessing knowledge about some concepts in the domain but not others, a level of flexibility important in a system designed to adapt closely to its user. Menotutor augmented this concept by not only modeling the user's knowledge on a set of correct beliefs about the domain, but also on a set of misconceptions. EDGE extended the concept by structuring its model to allow for reasoning which not only reflected the individual but also a population of learners of which the user was a member, achieving the type of model which Spada calls individualized. EDGE and Menotutor introduced the idea of a user model which must be established and maintained by the system rather than taken as given. Since any conclusions the system makes about the user had the potential to be incorrect or incomplete, those systems also had to deal with the possibility of an imperfect or incomplete model. EDGE coped with incompleteness through a hierarchical organization of the domain concepts and rules which allowed the system to infer what it did not know about the user's knowledge from what it did know (an improvement over EES' less principled approach which decided anything undecided was known by the user), and both systems also had the ability to question the user directly to find out more. They dealt with incorrectness by revising the model according to new data as it came in. In this way, they were also able to deal with a model that started out correct but became incorrect because the user's knowledge changed.
The user knowledge modeling aims of ICICLE would be well served by basing our model on the work discussed here. Like Menotutor and EDGE, ICICLE needs to be able to establish an initial model, and I believe that EDGE's choice to base the initialization on information about the user is a step in the right direction, rather than just initializing everyone the same way as in Menotutor. ICICLE will also need to maintain and update the model over time through information obtained from the user, namely in his or her language performance. This model will have two categories of knowledge represented within it: grammar proficiency and metalinguistic underlying domain information. In each of these, we have a specific set of items about which we want to know the status of user knowledge, so an overlay design would function well for our needs (see Section 5.2.3 for a discussion). The explicit information marked directly in the model would be the data drawn from the utterances entered for analysis. We would also like to infer implicit information from relationships between the concepts as in EDGE, however. In the domain knowledge model we should be able to arrange the concepts into a hierarchy of concepts and subconcepts, allowing for rules of inference to be applied when reasoning about related concepts. In the grammar model, we may be able to draw implicit conclusions as well if we can establish certain observations about the acquisition of grammatical forms. This will be discussed more in Chapter 5. Another way in which ICICLE benefits from this earlier work is by examining the way in which the user model affects the text generation process. All three systems focused on using the knowledge of the user as a primary decision factor in multiple areas of text planning, including affecting what is said and in what manner it is expressed.
This is essential in a system which desires to modify its approach for a wide variety of learners, some of whom may require different types of information presentation to excel at the learning task. Woolf's idea of tracking the success of particular tutorial approaches in order to use the one most successful with a user is particularly important. If ICICLE is to succeed as a tutoring system, it will need to choose its pedagogical tools wisely and to produce explanations which are cohesive and meaningful to the individual. The text planner we propose for ICICLE will be covered further in Chapter 4.

2.3 Computer-Assisted Language Learning

ICICLE as a system fits into many categories; while it is an explanation generation system and a user modeling system, it is also a Computer-Assisted Language Learning (CALL) system. In this section I will present brief descriptions of some contemporary systems in the field of CALL, selected because of their relevance to the goals and approach of the ICICLE project. In these descriptions, "L1" is used to denote the learner's first or native language, and "L2" the language he or she is trying to acquire, also called the target language.

2.3.1 HyperTutor

The HyperTutor system (Schuster and Burckett-Picker, 1996) is a learning tool for Spanish-speaking ESL learners of reasonable proficiency. It interacts with the student through a series of translation tasks, presenting Spanish sentences which the student then translates into English. It gives the student notification of whether the English sentence is right or wrong, and in the latter case gives an explanation about the error. Its goals, like those of ICICLE, are to be able to correctly identify the source of an error in order to focus individualized, appropriate instruction. The authors characterize the HyperTutor user model as an interlanguage, or "language-in-process," a concept I will discuss further in Section 5.2.
The essential nature of their model is a store of the language learning strategies the system has observed the student using, where the possible strategies are:

o Direct use of the L1 instead of the L2 when the L2 construct is unknown.
o Negative transfer of L1 grammar into the L2 (using Spanish grammar rules for English constructs to which they do not apply).
o Simplification, where the non-meaningful words in English have been omitted from the English utterance.
o Reduction of redundancy, where errors in morphology reflect the speaker's deletion of what he or she considers to be redundant.
o Positive transfer of the L1 structure into L2 (when it is grammatical in the L2 as well).
o Overgeneralization of an L2 construct to a larger set than is appropriate.

The model containing these strategies is dynamic, growing as the user performs the translation tasks. Whenever the user commits an error which can be attributed to one of the six strategies, the system makes note of this by adding the strategy to the model, and then generates a message describing the error from the point of view of the strategy the user was observed to be using. In this way, the feedback presented to the user can be specific to the real cause of the error, a quality which ICICLE would like to emulate as well.

2.3.2 Mr. Collins

The CALL system Mr. Collins ["Mr. Collins" actually refers to just the user modeling component of a larger system, but the name is also used for the entire system for simplicity.] (Bull, 1997) addresses the learning processes of English speakers acquiring Portuguese, specifically within the restricted domain of Portuguese pronoun usage. Its primary goal is to interact with the student through exercises and discussion, instructing him or her on the efficient use of learning strategies to bolster second language learning.
The strategies it teaches include the positive transfer strategy mentioned above, and also deduction, inferencing, grouping, and actively looking up answers in the resources provided by the system. Because of the restricted domain of L2 constructs being acquired, the instruction of Mr. Collins is almost entirely centered on these strategies and on how they might improve the student's performance. Most of the exercises in Mr. Collins involve Portuguese sentences being presented to the user without the object pronoun. The student must find the correct position for the pronoun in the sentence. The system passively observes the student navigating through the information space available and solving the exercises, only providing instruction when the student requests it or when the system decides the student's performance is suffering due to poor strategy use. With its focus on explicit discussion of strategy use and its restricted domain of instruction, Mr. Collins does not seem to be particularly related to the goals of ICICLE. However, one trait of the system which is of great interest is the flexibility which it brings to bear when presenting its instruction to the user. Mr. Collins presents its material in a variety of different formats, sometimes quoting relevant sentences illustrating its point, sometimes presenting the relevant grammar rules explicitly, and at other times making direct comparisons to the L1. This variety will be discussed further below and also plays an important part in ICICLE's generation goals.

2.3.3 German Tutor

German Tutor (Heift and McFetridge, 1999) is a CALL system under development designed for the instruction of learners of German. Its current implementation is designed for native English speakers and accepts single sentences, parses them, and provides the user with feedback on single errors found. The student modeling architecture in German Tutor is similar to the one described for Meno-tutor (Woolf, 1984) in the previous section.
It utilizes a database containing all of the grammatical constraints the parser can recognize as met or broken, holding a score from 0 to 30 representing the user's knowledge on each constraint. A score from 0 to 9 represents expert knowledge, 10 to 20 is intermediate, and 21 to 30 is novice. The score for each constraint is initialized to 15 at the beginning of a session with the user, and is incremented with each failure, decremented with each success. This model is not stored from one session to the next. Both the parser and the feedback generation process make use of the student model. The parser selects between multiple possible parses by averaging the student's proficiency score across all constraints, yielding a general proficiency level, and comparing this against an ordered list of possible parses; the proficiency level will place the student among the possible parses, from those using simplistic forms to those attempting more complex possibilities. When selecting the subject matter for instruction, German Tutor prioritizes the errors in the sentence and selects the most frequent or relevant one for the purpose of feedback. The feedback the system generates is presented in three formats representing three levels of expertise, varying in the level of abstraction discussing the violated constraint from specific knowledge to be presented to a novice to abstract knowledge for an expert. The student's knowledge level on the topic is used to choose between the three, and the result is presented to the user in English.

2.3.4 Discussion

All of the systems discussed in this section contain some part which matches the goals of the ICICLE system. HyperTutor's claim that the user model captures the interlanguage state of the user closely matches what I discussed in the Introduction as being the goal of the user knowledge model for ICICLE.
However, simply storing the strategies the user is executing to build this interlanguage seems to be an insufficient technique if one wishes to really model the user's internal language hypothesis; HyperTutor's efforts yield no information about the user's knowledge on specific concepts. The authors do not address the possibility of errors whose source is ambiguous, or sentences which hold more than one error; furthermore, the revision of the model over time merely adds strategies, not taking into account the possibility of the user's changing proficiency leading to different strategies being used rather than just more strategies. In ICICLE's model of the L2 knowledge, what is in the interlanguage will take precedence over how it is being built, yielding us much more specific information on which to base instruction. While HyperTutor does provide the possibility of different explanations depending on what the source of the error is deemed to be, it does not match the flexibility of Mr. Collins' varied explanation presentation facility or of the explanation systems discussed in the previous section. The ability to present information in different forms, originating with TEXT and later reflected in the other systems such as TAILOR and EDGE, is very desirable in an instructional system which desires to reflect the individuality of its user, since, as Paris asserted, different people may benefit from different types of information presentation. ICICLE will most certainly embrace this goal as well, to a much larger extent than HyperTutor or German Tutor have accomplished. Finally, German Tutor's user modeling technique most closely resembles that which has been proposed for the ICICLE user knowledge model, by representing a kind of language-element overlay model with varied markings according to user proficiency level.
The main drawback to the approach as implemented in German Tutor is that instead of using the individual ratings to make decisions and relying on a general estimate of expertise only in the absence of other information (as in Menotutor), German Tutor lumps all of its data on user language proficiency into a single average for use in selecting the appropriate parse. This does not seem to be a very accurate way of selecting between parses; a more selective approach that uses the individual markings rather than an overall judgment of user proficiency will be discussed in Chapter 5.

2.4 Summary

In the first two sections, I overviewed some explanation systems which have led up to and contributed to the design of ICICLE's user modeling component and its proposed text planner. We wish for ICICLE to establish and maintain a complex, dynamic user model which is highly important to the text planner, affecting the planning decisions in many ways so as to produce text which is maximally tailored to the individual. In order to accomplish this, we will draw from several aspects of the systems I discussed, including the overlay-based, hierarchical knowledge model design and the use of both direct and indirect information stored in the model. In the third section of this chapter, I briefly described some of ICICLE's contemporaries in the field of Computer-Assisted Language Learning in order to illustrate how the modeling and generation techniques reviewed earlier could add to and improve upon the current state of the art in that area. It is our intent that ICICLE prove more flexible, more wide-reaching, and more informative than these other systems. In the next three chapters I will put forth the main essence of how ICICLE will accomplish these goals. Chapter 3 outlines the general architecture of the ICICLE system as a whole.
Then, in Chapter 4, I will discuss the proposed text planning element for the system in order to motivate the main thrust of my research, the user knowledge modeling component, which will be the focus of Chapter 5.

Chapter 3 ICICLE System Overview

This chapter overviews the ICICLE system architecture as a whole, and gives a brief status report of its current state of implementation. The purpose of this is to provide a view of the larger picture in which the user model will play a part.

3.1 Architecture

To accomplish its goals, ICICLE will use a multi-component architecture represented as a conceptual drawing in Figure 3.1. The primary active components of ICICLE's design are those which accept the user's input (the error identification module) and provide the response (the response generation module). Both of these draw from two knowledge base components: the first is a domain knowledge base containing information on English, ASL, and the errors recognized by the system; the second is the user model I have previously described, capturing the user's grammar and domain knowledge, the dialogue history, and the history of the specific user's interaction with the system.

3.1.1 Error Identification

The analysis of a student's errors is accomplished in ICICLE via a chart-based parser with a coverage of English that has been augmented by error-production rules or mal-rules (Sleeman, 1982; Weischedel et al., 1978) which were derived from an error taxonomy compiled out of actual writing samples from deaf students (Suri, 1993; Suri and McCoy, 1993). These additional rules enable the grammar to recognize syntactic and morphological constituents containing errors produced by the target population. In Section 1.2.1 I addressed the fact that such a parser can and will produce multiple possible parses of a given input sentence, and that the selection of which parse to use requires the use of the user model.
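As a rough illustration of what model-driven parse selection could look like, the following Python sketch scores each candidate parse by how plausible its erroneous and error-free constituents are given the user's status for each construct. It is not ICICLE's implementation: the status labels, the probability values, and the names `Constituent`, `parse_score`, and `select_parse` are all assumptions introduced here for illustration. The scoring is deliberately soft, so an error in a well-known construct lowers a parse's score without ruling it out.

```python
from dataclasses import dataclass

# Hypothetical status labels for a construct in the user's grammar model.
ACQUIRED, ACQUIRING, UNACQUIRED = "acquired", "acquiring", "unacquired"

# Assumed (illustrative) likelihood that the user produces a construct
# correctly, given its status in the model.
P_CORRECT = {ACQUIRED: 0.9, ACQUIRING: 0.5, UNACQUIRED: 0.1}


@dataclass(frozen=True)
class Constituent:
    construct: str   # e.g. "subject-verb-agreement"
    has_error: bool  # True if a mal-rule was needed to parse it


def parse_score(parse, grammar_model):
    """Product of per-constituent plausibilities under the grammar model."""
    score = 1.0
    for c in parse:
        # Constructs missing from the model are treated as "acquiring".
        p = P_CORRECT[grammar_model.get(c.construct, ACQUIRING)]
        score *= (1.0 - p) if c.has_error else p
    return score


def select_parse(parses, grammar_model):
    """Return the candidate parse that best matches the grammar model."""
    return max(parses, key=lambda parse: parse_score(parse, grammar_model))
```

Given a model in which subject-verb agreement is acquired and determiner use is not, a parse that blames the determiner would be preferred over one that blames agreement, while the latter still keeps a non-zero score.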
Given a possibly large set of parses, the error identification module will select the single one whose grammatical and ungrammatical constituents most closely match the grammar model's representation of what constructs the user can be expected to use with or without error. This choice must also take into account multiple possible accounts for why an error occurred, and thus must have some flexibility with respect to how closely the parse matches the model; i.e., the system cannot necessarily throw out a parse which contains an error in a construct the user knows well, for that error could be a simple mistake and not a true reflection of the user's grammatical competence. Lastly, since the same "erroneous" constituent may have one of a list of causes, the parser must not only identify that an error exists, but must "tag" it with a note indicating its nature (e.g., incorrect because it is beyond the learner's understanding, incorrect because of faulty knowledge, or incorrect because of a simple mistake).

Figure 3.1: ICICLE system architecture.

Once a single parse for the sentence has been selected and the errors it contains have been fully identified, those errors are passed back to the user interface so that sentences containing problems may be highlighted. The error identification component also consults the user model to create a pruned list of errors, containing only those which are relevant for tutoring, and passes those to the response generation component. The determination of relevance relies on the model's representation of those language structures which are currently within the student's grasp to learn about.

3.1.2 Response Generation

The response generation module is charged with creating the tutorial feedback which will enable the user to correct the errors that have been found. To do this, it will present to the student a natural language explanation of the errors, after which changes to the text will be encouraged.
The goals of our response generation module are: to be capable of producing a wide variety of tutorial approaches as discussed in Chapter 2; choosing between these approaches, planning their structures, and determining their information content according to the learning styles and knowledge of the student; and enriching its text with relevant information from the dialogue history and the student's domain knowledge. To accomplish this, a multi-level library of planning operators will apply the information resources of the system (both the user model and a large database of system knowledge about the domain) toward forming and revising a hierarchically structured text plan. This module of the ICICLE system is the focus of Chapter 4. The completed text plan will consist of semantic-level utterance specifications which can be fed into a surface text generator. This generator will produce the actual English text that will be displayed to the user through the user interface.

3.1.3 The User Model

The ICICLE user model has already been presented as a complex, dynamic model of user grammar and domain knowledge, dialogue history, and system use. As is indicated in Figure 3.1, this model has bidirectional information flow with both the error identification module and the tutorial response module. The error identification module relies upon the grammar model to select the most appropriate parse of an input sentence; when the parse of a given writing sample is complete, it will then send information back to the user model in order for the grammar model to be updated with new statistics and for the system history model to receive information so it may analyze the success or failure of tutorial methods which it is tracking. In turn, the response generation module also consults both the knowledge and history parts of the user model in order to plan its explanations, and then sends the completed plans back to be stored in the dialogue history.
Model establishment and maintenance issues will be discussed further in the central part of this proposal in Chapter 5.

3.1.4 The Domain Knowledge Base

While the user model is dynamic in ICICLE, its counterpart, the domain knowledge base, is considered static and this is illustrated in Figure 3.1. Although both of the "active" modules draw information from this knowledge source, neither makes modifications to it, as it is assumed that the parsing grammar and the grammatical concepts discussed by the system are unchanging entities. The "Domain Knowledge Base" is a store of the system's domain knowledge and should not be confused with the domain knowledge component of the user model, which stores information about the user's domain knowledge.

The Domain Knowledge Base contains two main components, including the augmented parsing grammar used to cover ungrammatical input as discussed above. The purpose and function of the other element of this knowledge base (labeled Database of Grammatical Concepts in the figure) is to supply information to the explanation generation process. This component stores the domain knowledge from which the system's explanations about English grammar are generated, so it must include information about how to define all of the grammatical forms recognized by the parser. For instance, if the parser can identify errors in subject/verb agreement and in preposition placement, this database must include information about how to explain those errors and the concepts involved in those explanations. We are investigating whether the concepts in this domain lend themselves to a generalization hierarchy in which children inherit parts of their definitions from their parents (such as one representing German Shepherds and Collies as Dogs, which in turn with Rabbits are Mammals, which are Vertebrates, etc.). In any case, the relationships between the concepts do need to be represented.
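To make the idea of a generalization hierarchy concrete, here is a minimal Python sketch in which each concept node stores only its own features and inherits the rest of its definition from its parent. The `Concept` class and the feature names are hypothetical, chosen to mirror the animal example above; the design of the actual database is still open, so this shows only one possible representation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a hypothetical generalization hierarchy."""
    name: str
    parent: Optional["Concept"] = None
    features: dict = field(default_factory=dict)  # locally defined parts only

    def definition(self):
        """Merge inherited features with local ones; local values win."""
        inherited = self.parent.definition() if self.parent else {}
        return {**inherited, **self.features}


# Illustrative data only, following the example in the text.
vertebrate = Concept("Vertebrate", features={"has_backbone": True})
mammal = Concept("Mammal", vertebrate, {"nurses_young": True})
dog = Concept("Dog", mammal, {"barks": True})
collie = Concept("Collie", dog, {"coat": "long"})
```

Calling `collie.definition()` returns the features of every ancestor plus the Collie's own. A grammatical hierarchy could be populated the same way, for instance with a general agreement concept as the parent of subject-verb agreement.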
Because explaining these concepts may involve mentioning other concepts which the user must also understand in order to absorb the explanation, this definitional dependency relationship needs to be noted in some way by indexing the concepts on which an explanation depends from the definition information stored in a concept node. Also, the system may wish to draw comparisons between related concepts, so the database must also include information on the features certain concepts have in common, or those which contrast. The organization of this component will be correlated to that of the domain knowledge component of the user model mentioned above and discussed in more depth in Section 5.3, but again the two knowledge sources will remain distinct because of their vastly different purposes. The exact design of this database has not been fully determined and is a topic of future research.

Figure 3.2: A cycle of user input, system response.

3.1.5 The User Interface

The interface component of the system is responsible for accepting the user's text and passing it to the error identification module, and for displaying the results of that analysis (in the form of highlighting those sentences which have errors) back to the user. It also displays the tutorial text generated by the response generation module and allows the user to make corrections based on the explanations or to request additional information if an explanation has not satisfied him or her. This last function of the user interface involves handling one of the possibilities for initiating a re-planned explanation. Re-planned explanations in ICICLE will be initiated in two ways: when the student accepts the explanation, but then fails to improve his or her performance with respect to the concept involved; or when the student does not accept the information immediately, asking instead for additional/different explanations. This will be addressed further in Chapter 4.
3.2 Motivation

Having presented the essential architecture of the system, I would like to take a moment to outline our general approach and compare it against literature relating to tutoring systems and second language instruction.

3.2.1 A Cyclic Approach

As mentioned in the Introduction, ICICLE's interaction with the user has a cyclic nature; the user submits text to the system for review, the system presents the user with constructive feedback, and the user can make revisions and submit new text. This cycle is portrayed in Figure 3.2. It reflects the two tasks of a tutoring system laid out by Glaser et al. (1987): that of the diagnostician, who must discover the nature and extent of the student's knowledge (in our system by accepting and analyzing user-produced second language text), and that of the strategist, who must plan a response to this discovery (manifested in ICICLE when the system plans and produces tutorial feedback tailored to the learner).

Note that the two participants in this cycle (the user and the system) essentially take discrete turns; the user completes his or her composition before giving it to the system to analyze, and the system controls most of the session during the delivery of tutorial feedback. This approach to tutorial instruction, where a user completes a task before receiving any instruction on his or her performance, is motivated by the theory that the cognitive demands of some tasks are so intense that learning is hampered during their execution (Owen and Sweller, 1985; Sweller, 1988), necessitating a post-completion review. It is our belief that the composition of original text in a non-native language is a task of this level of cognitive difficulty.
Researchers in the field of computer-aided learning have found that post-performance review or reflection is a powerful strategy for learning, and that computer-based learning tools are ideally suited to such approaches since they can perfectly capture the user performance and then review any aspect of it (Collins and Brown, 1988). ICICLE therefore endeavors to utilize its input/response cycle to provide an optimal learning environment. It maximizes the knowledge derived from the composition experience through such a strategy, enabling the user to execute self-correction through review and instruction.

3.2.2 Teaching a Second Language

Having outlined the structural nature of the system interaction with the user, I should also address the content of that interaction. ICICLE is a system whose functionality is based on giving a second language learner explicit feedback on the nature of his or her errors, and yet researchers in the field of second language acquisition have questioned the effectiveness of explicit instruction in the acquisition of new language forms (Krashen, 1981; Beck et al., 1995; Carroll, 1995). These researchers draw a distinction between positive or "Type 1" data, which is exposure to the language being acquired, and negative, explicit, or "Type 2" data, which is explicit instruction on what forms are and are not part of the target language. Krashen claims that explicit (Type 2) data results only in the modification of a "Monitor" which can correct an utterance before or after realization in speech, and he takes a strong stance on the distinction between genuine "acquisition" of second language grammar and the superficial "learning" which results from explicit instruction (Krashen, 1981). The result of this learning is also called "Learned Linguistic Knowledge" and is held to be a separate, distinct area of knowledge representation in the mind (Schwartz, 1993; Beck et al., 1995). However, the stance against the explicit approach is not absolute.
When examining whether or not Type 2 input leads to the restructuring of the learner's internalized second language grammar, Carroll (1995) finds that input must be recognized as corrective and not communicative in order to be effective, and must present novel data to the learner in order to initiate restructuring, but that the metalinguistic capability of experienced learners may be well-developed enough to make use of explicit, specific correction. Cook (1991) cautiously points out that explicitly-taught learners can and do achieve fluency in practice.

Krashen also dissolves some of the absoluteness of the Monitor Theory in (Krashen, 1981), where he holds that formal classroom learning does indeed result in acquisition when the classroom is a "high intake" environment. He defines "intake" as input which aids acquisition. In Section 4.3.1 I will discuss Krashen's Comprehensible Input Theory in more detail, but to briefly describe it, he maintains that the input which aids acquisition is that which occurs just beyond the learner's current level of proficiency. Therefore, when explicit classroom instruction on grammatical forms takes place using the target language at that accessible level for the learner, it results in acquisition --- but of the forms being used in the instruction itself, not of the forms being taught. Krashen goes further to state that for an adult learner, the formal classroom situation is more likely to provide the intake needed for acquisition than informal conversational situations, so that the classroom may actually excel over informal situations for adult learners (Krashen, 1982).
Since ICICLE's generation component will be taking care to provide the "intake" that Krashen describes, it will therefore be contributing to acquisition even if one espouses the view that the actual content of the explanations will only be resulting in "learned" knowledge; and even those researchers who draw this distinction admit that both are present in the language performance of any learner, and some further state that both aspects are required for high literacy in either a first or second language (cf. Bialystok, 1981; Vygotsky, 1986). Furthermore, an Interface Theory of second language acquisition (Bialystok, 1978; Ellis, 1993) holds that these two areas of knowledge are not entirely segregated, and that explicitly-taught knowledge can become internalized knowledge over time, although the acquisition may be constrained by "learnability" concerns tied to the natural order in which learners acquire forms --- i.e., a learner will only acquire what he is taught when he is developmentally ready to acquire it.

The conclusion I draw from this is that the prevailing research supports rather than undermines ICICLE's approach to second language instruction. Not only will ICICLE be designed to provide intake by focusing its language production on the level of comprehensible input for the user, but since it will constrain the topics of its instruction to those which the user is ready to acquire (see Section 5.2), it will satisfy learnability constraints as well, so both the content of its message and the form it takes should lead to positive effects on the learner's production of English.

3.3 Implementation Status

ICICLE's error identification component has been implemented with partial functionality. We have developed an augmented grammar for a parser which is descended from the one presented by Allen (1995). Our implementation makes use of the COMLEX Syntax 2.2 lexicon (Grishman et al., 1994).
Since there is no user model at this time, choices between multiple parses found by the system are made arbitrarily. A Tcl/Tk window-based interface allows the user to type in or load a text file, request an analysis, and view the results. Sentences containing problems are highlighted in colors corresponding to the type of error, and "canned" one-sentence explanations of the error can be accessed. The existing system makes no attempt to model the user or the domain, and does not employ actual text generation.

The user model I am outlining in this proposal should lead to a revised system in which a bidirectional flow of data has been established between the error identification component and the grammar model, basing the parse selection on data in the model and then updating the grammar model according to error statistics from the analysis of the student's text. The user model will also provide the foundation for our text generation module, which is discussed further in Chapter 4.

Chapter 4 Text Generation in ICICLE

In Chapter 2 I reviewed the work of many explanation systems which were driving toward a certain common goal --- that of generating text which was "informative," or "context-driven." This same goal motivates our work with ICICLE. The essential goal behind the text generation component we are developing is to be highly sensitive to the context of the generation activity. As the Introduction established, "context" is defined in this work as encompassing all of the following components: the preceding dialogue, the related concepts in ICICLE's tutoring domain, the user's domain skills and underlying knowledge, and the user's history of system use. In order to achieve sensitivity to such diverse context, it is important that the generation component we propose employ a high level of interactivity with the knowledge bases which provide information about the user and the domain.
It is our hope that we are proposing a text planner design that would be able to accomplish this level of interactivity.

This chapter will overview our design for a text planner which relies heavily on multiple sources of knowledge in order to make its planning decisions. This discussion precedes that of the focus of the proposal (the user modeling component) in order to clearly designate how we will need to design the user model in order that it may provide the needed information to the planner.

The goals of this planner were outlined in Chapter 3 and I will repeat them here. We wish the planner to be capable of: producing a wide variety of tutorial approaches; choosing between these approaches, planning their structures, and determining their information content according to the learning styles and knowledge of the student; and enriching its text with relevant information from the dialogue history and the student's domain knowledge. By accomplishing this, it will not only meet the unique needs of the individual learner, but it will promote an environment of "meaningful learning" (Brown, 1994) where related information is tied together to form stronger and more permanent associations.

4.1 Planner Overview

What follows is an overview of the model which has been developed for planning the tutorial responses of the ICICLE system, previously described in (Michaud and McCoy, 1998). The complexity of the information exchange in the model suggests a need for breaking down the planning process into many stages, combining both bottom-up and top-down processing techniques. In the bottom-up phase, the model first organizes the explanations to be generated into a linear order based on the topic of each; in the top-down phase, each explanation is fleshed out one at a time and then revised before realization.
The multiple phases of processing are driven by successive foci of attention, represented in an "Anatomy of a Response" [My thanks to Chris Pennington, who originated the Anatomy of a Response idea.] which consists of:

o content: the error or errors being discussed in a given system action (explanation)
o method: the pedagogical approach employed in discussing the content
o form: the semantic structure of sentence specifications (each containing a specific rhetorical force) that will eventually realize the method
o history: the discourse-level modifications and annotations that result in an explanation which explicitly realizes its context in the domain and relevant domain concepts
o manner: [Please note that the meaning of the term "manner" referred to in the previous publication (Michaud and McCoy, 1998) has since shifted; the "history" phase now covers what was termed the "manner" component in the past.] preprocessing performed directly preceding surface text generation to establish a linear order of propositions

The subsequent sections will illustrate how these phases have been conjoined to form the framework of an elaborate explanation planner to fit the needs of the ICICLE system. An illustration of the phases showing both the bottom-up and the top-down processing can be found in Figure 4.1.

4.1.1 Content

The content of an ICICLE explanation, in terms of this initial phase of explanation planning, is expressed at the most general level: it identifies the specific error or errors from the user's text which are being discussed. The error analysis phase of ICICLE passes to the tutorial response generator a list of errors the user has committed; these are the seeds from which the generator will construct its feedback. In a context-aware system, these seeds cannot be treated as autonomous content units, since part of the context of an individual unit is made of those units that precede and follow it.
They must be arranged and ordered so that the resulting text presented to the user is cohesive at the large level (over all of the individual responses). Research in language pedagogy (Anderson, 1993) and empirical studies on learning from written texts (Hayes-Roth and Thorndike, 1979) both suggest that grouping together related information would be more effective (in terms of the learner's absorption of the information) than explaining each error in the order in which it occurs in the essay. The first phase of explanation planning, therefore, is to group related content units and to give them an overall order. This will be accomplished through referring to the domain knowledge base; it will contain information about the errors recognized by the system as well as possible grouping strategies for clustering them according to shared features. As part of this clustering, errors of identical type will be merged into one explanation to avoid duplication of effort. Next, the order of the clusters will be determined using information on how to best structure the overall discussion flow, completing the bottom-up phase.

4.1.2 Method

Once the explanations have been placed in order, the first can be sent to be processed by the method phase. This next part of the planning process (and the first part of building the top-down plan) selects a tutorial approach for addressing each error. Given the high-level goal of instructing the user about a given error, the system must now begin building the top-down plan to accomplish that goal through text. ICICLE will have several possible tutorial methods at its disposal, based on research in second language pedagogy. Each of these approaches may appeal to a different style of language learner. Among the possibilities may be:

o To simply provide a corrected form of the sentence.
o To explain the grammar construction that was used incorrectly.
o To provide examples of sentences that illustrate proper usage of the faulty grammatical construction.
o To compare and contrast the grammatical construction involved with its corresponding construction in ASL.

Figure 4.1: The Anatomy of a Response, separated into bottom-up, top-down, and revision phases.

The choice between these methods will be motivated by the user model discussed earlier in this paper and detailed more specifically in Chapter 5. The domain knowledge model provides information on what concepts in the domain the user knows and what he or she is likely to understand, and the system history informs the planner on the user's long-term performance given what methods have been attempted in the past. Over time, the system should be able to make principled decisions on what style of instruction is best suited to this individual.

4.1.3 Form

The selection of the tutorial method sets a general course for the explanation, but in the determination of the specific structure, or form, there are still details to be processed. One could see the method selection as having chosen a general schema for the explanation; the form phase processes the options within that pattern according to the domain knowledge of the user.

Of primary interest here is the requisite knowledge for understanding an explanation. A method which involves explicitly discussing certain grammatical concepts, for instance, would have been chosen only if the student either knows these concepts or is in a position to be able to grasp the concepts if instructed on them. In the latter case, the form constructed by the system must include additional explanatory material to fill in the requisite knowledge. At points in the explanation where an unknown but learnable concept is named, a recursive explanation must be generated.

To show an example, take the method of stating the grammar rule that has been broken. If the user is familiar with the rule, the system can just inform the user of the type of error, and generate a sentence like: "This sentence contains an error in subject-verb agreement."
Alternatively, the user model may indicate that the user is not familiar with the concept of the grammar rule, but is ready to learn about it. The system would then generate a short explanation of what this rule is in English: "This sentence contains an error in subject-verb agreement. In English, third-person singular subjects require a present tense verb to have the agreement marker S at the end." If the user model indicates the user could use a reminder of what third-person singular subjects are, a second level of recursion could be added: "This sentence contains an error in subject-verb agreement. In English, third-person singular subjects (pronouns like HE, SHE, and IT, singular noun phrases like THE DOG, and names like JOHN) require a present tense verb to have the agreement marker S at the end."

A particular quality of the recursion here is that the additional propositions may also follow the same structure possibilities defined at the top level; the first extra sentence was just another simple definition of a concept, where the second was a list of examples. Because of the recursive nature, the same structuring decisions could be used at each level. To avoid an infinitely recursive explanation of this type, the method selection will need to calibrate its decision metrics so that the user's lack of domain knowledge is not so profound that the planner needs to define the terms in every definition. Intuitively, this should not be the case if this topic is in the "current" realm of learning for this user; the vast majority of the prerequisite concepts should already be acquired in order for this topic to be deemed "learnable."
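The recursive expansion just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proposition tuples (`INFORM-ERROR`, `DEFINE`), the status strings `"knows"` and `"learning"`, the dictionaries passed in, and the `max_depth` guard are all hypothetical, not part of the ICICLE design. A concept is expanded only when the user model marks it as currently being learned, and the same expansion is applied to its subconcepts.

```python
def explain(concept, user_status, definitions, subconcepts,
            depth=0, max_depth=3):
    """Return semantic-level propositions defining `concept`, recursing
    into subconcepts that the user is also still learning."""
    # Known concepts need no definition; the depth guard is an assumed
    # safety net against runaway recursion.
    if user_status.get(concept) != "learning" or depth >= max_depth:
        return []
    props = [("DEFINE", concept, definitions[concept])]
    for sub in subconcepts.get(concept, []):
        props.extend(explain(sub, user_status, definitions, subconcepts,
                             depth + 1, max_depth))
    return props


def plan_tell_rule(rule, user_status, definitions, subconcepts):
    """Form-phase sketch for the 'state the broken rule' method."""
    return [("INFORM-ERROR", rule)] + explain(
        rule, user_status, definitions, subconcepts)
```

With a user who knows the rule, the plan contains only the `INFORM-ERROR` proposition, which corresponds to the first example sentence above; marking the rule and one of its subconcepts as `"learning"` yields the two-level explanation. Ordering and sentence structure are left to later phases.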
Note that the examples written out above are not intended to prescribe the exact order the propositions containing these concepts will occur in, or to indicate exact sentence structure and complexity; the form phase will only generate propositions at the semantic level, and the manner phase will have the responsibility of ordering the propositions, while the realization process will be selecting between the syntactic choices. As mentioned in the Introduction, syntactic complexity will be influenced by those grammatical constructions the learner is attempting to acquire. It is therefore divorced from semantic content decisions. While making those semantic decisions, however, the form phase will retain the rhetorical connections between the propositions it generates for the purposes of revision in the history and manner phases, as will be discussed later.

The form phase obviously makes heavy use of the user domain knowledge model. It does not expect the user model to be infallible or complete, so it will take into account the possibility of explanation failure and the need to repair. The issues behind handling possibly incorrect user models will be discussed briefly in Section 4.5.

The text plan at the completion of the form phase contains a first draft of the basic propositional structure of the discourse, molded by the tutorial method chosen and fleshed out with prerequisite data. The next step is to finish informing the explanation of its context by adding explicit contextual references.

4.1.4 History

The primary job of the history component is to take the dialogue history into account. This involves both tying new knowledge into existing knowledge through references to the recent and established past, and making certain that history does not repeat itself in the form of redundant explanations.
In order to achieve these goals in a text plan which is otherwise mostly prepared for sending to a surface realizer, our approach views the history phase as a revision of the existing plan to add this contextual material. It has been observed (Moore, 1993; Rosenblum and Moore, 1993) that comparison and contrast to recent and established material is a powerful tool of human-generated explanations, and essential for generating comprehensible tutorial discourse. At this time, therefore, the text planner needs to begin to make adjustments to its plan in order to insert comparisons and references where possible. The domain context which needs to be exploited for this step is found in three tiers of proximity: the information discussed within this group of explanations, the information discussed earlier in a session, and established information the user has already learned.

The strategies for referring to this context through revision need to accept the propositions planned by the form phase and, operating on their relational structure and the sources of relevant information on hand, perform modifications to accommodate established knowledge. In some cases, this involves modifying propositions which exactly duplicate explanatory material from earlier in this group of explanations; exact repetition is unnecessary, or at least it should be explicitly marked as a repetition. In other cases, the actions taken at this step will involve generating additional propositions comparing parts of the explanation to previous ones and to related domain information known by the user. The revised structure is of the same format as that coming from the form phase: semantic propositions linked by their rhetorical relationships, ready for the final processing before surface realization.
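One of the revision strategies described above, marking exact repetitions, could be sketched as follows in Python. The function name `revise_for_history`, the `REMINDER` wrapper, and the representation of propositions as tuples are assumptions made for illustration; the sketch handles only the nearest tier of context (material already planned or said in the current group of explanations) and does not attempt the comparison-generating revisions.

```python
def revise_for_history(planned, already_said):
    """Mark propositions that exactly duplicate earlier material.

    `planned` is the list of proposition tuples from the form phase;
    `already_said` holds propositions from earlier explanations.
    The output has the same format as the input, so it can be passed
    on to the next phase unchanged.
    """
    seen = set(already_said)
    revised = []
    for prop in planned:
        if prop in seen:
            # Flag the repetition explicitly instead of repeating verbatim.
            revised.append(("REMINDER",) + prop)
        else:
            revised.append(prop)
            seen.add(prop)
    return revised
```

A planner could equally drop the duplicate instead of flagging it; the text leaves both options open, and the choice here is arbitrary.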
4.1.5 Manner

While the previous phase produces a plan which is semantically developed and contextually appropriate, at this point in the planning the semantic propositions in the plan are not yet constrained to a linear order, which is necessary before realization. It is the job of the manner phase to make the final decisions about the linear flow of the explanation, serializing the propositions for generation. This step may also involve some preprocessing of the utterances to be sent to the realizer so that the clause structure reflects the syntactic goals of providing the user with example constructions from his or her current level of syntactic proficiency. Once this phase has completed these adjustments, the plan is ready to be sent to the surface generator, realized into English text, and displayed back to the user.

4.2 Operationalizing a Multi-Phasic Text Planner

An integral facet of the proposed planner that has just been outlined is its multi-phasic nature. Division of language generation into phases is not novel, although many systems have differed on how to divide the process. Woolf's division of planning levels between selecting a pedagogical method and choosing a framework for implementing that method is similar to ours (Woolf, 1984). Likewise, Cawsey's EDGE system made a distinction between choosing the content and mapping out the type of explanation (Cawsey, 1993). As in our division between selecting a method and planning out the form, both systems addressed the need to first decide on a basic method, and then what to include or leave out. Most systems seem to have divided the process into only two phases, but division into more than two levels of planning for multi-sentential text was previously implemented by other systems including (Rambow, 1990).

As introduced above, the planning process for ICICLE's tutorial response generation will be five-tiered, mapped out in an "Anatomy of a Response."
First, the content planning phase will group and order the content units passed to the response generator from the error identification module; then the method planning phase will apply a first set of planning operators to select appropriate tutorial techniques for discussing each unit of content in order; then the form planning phase will address the goals posted by the method selection by selecting planning operators to flesh out a hierarchical text plan using that technique; next, the history phase will revise the plan to add context from the dialogue and domain information; finally, the manner phase will provide specific preprocessing in order to establish a linear order for surface realization, and the resulting plan will then be ready to be fed into a text realizer and passed to the user interface.

Information about this Explanation
  (CORRECTFORM ?original ?correction)
    The correct form of the sentence ?original is ?correction.
  (BROKENRULE ?original ?rule)
    The broken grammar rule in ?original is ?rule.

Information about the User
  (GOODMETHOD ?hearer ?method)
    The system history model indicates that ?method is a good tutorial method to use with this user, or at least does not contradict that possibility, where ?method can be one of: CORRECT, TELLRULE, EXAMPLES, or COMPARE.
  (KNOWS ?hearer ?concept)
    The domain knowledge model indicates that ?concept is in the user's well-established knowledge.
  (LEARNING ?hearer ?concept)
    The user model indicates that ?concept is in the hearer's zone of variation in his domain knowledge; it is currently being learned.
  (UNFAMILIAR ?hearer ?concept)
    The user model indicates that ?concept is beyond the hearer's current zone of acquisition.
  (CANCORRECT ?hearer ?original)
    The hearer is competent to correct the problems in the original sentence.
  (CANLEARN ?hearer ?concept)
    The hearer is competent to learn this concept at this time; this means that the language feature ?concept is at the border of the hearer's zone of acquisition, or more specifically that a majority of the subconcepts involved in explaining ?concept are known.

Domain Information
  (EXAMPLESOF ?rule ?exs)
    The list ?exs contains correct examples of the language feature represented by ?rule, preferably culled from the user's own work.
  (SUBCONCEPTS ?concept ?subs)
    The subconcepts involved in a definition of ?concept are ?subs.
  (DEFINITION ?concept ?def)
    The definition of ?concept is ?def.
  (ASLEQUIVALENT ?rule ?aslequiv)
    The closest ASL equivalent to the English grammar rule ?rule is ?aslequiv.

Goals and Speech Acts
  (INFORM ?speaker ?hearer ?proposition)
    A speech act goal to inform the hearer that a certain proposition holds true.
  (COMPARELANGS ?rule ?aslequiv)
    A goal to compare and contrast the English grammar rule ?rule and the ASL grammar feature ?aslequiv.

Figure 4.2: Propositions and Goals Used in the Method and Form Operators.

This section contains the basic designs of parts of the library of planning operators that will implement these multiple phases of planning in the ICICLE response generation module. The operators are presented here at the depth to which they have been developed so far, a level of detail chosen so that the needs they place on the user model will be clear. Because the content level does not reference the user model, its implication will not be addressed.

The operator design is largely inspired by the operators used in the EES system (Moore and Paris, 1992). They are designed for hierarchical agenda-based planning with constraints that reference multiple sources of knowledge and the ability to recursively post subgoals, as discussed in Section 2.1.3. The subgoals are in the form of a CORE and possible CONTRIBUTORS.
As with the EES work, the operators therefore represent not only one or more spans of text [Here we are using the RST definition of "span."], but also one "relation" connecting the core to each contributor in the operator. Note that the subgoals of EES were called NUCLEUS and SATELLITE, but we have not used these names. Although RST formed the original basis for Moore's work, she has since found issue with some of the shortcomings of RST relations (Moore and Pollack, 1992). A proposal for an improved discourse analysis theory was put forth in (Moser and Moore, 1996) and we intend to make use of this work. One of the issues addressed in the 1996 work is the need to maintain both intentional and informational structure. In (Moore, Unpublished) she defines this informational structure as containing "content-bearing relationships between the propositions express in discourse elements," and she notes that the organization of certain types of text follows this informational structure rather than the intentional structure, where the "natural" structure is to place the "core" element before its "contributing" definition. This representation is more pertinent to ICICLE's needs than an intentional one, as intentional relations will have very little variation in our type of explanation structure (Hobbs, 1996). The conceptual relationships between entities mentioned in the propositions, by contrast, can be used not only to determine the relationships between the contents of different propositions, but also by the realizer to determine the linear order of the clauses (Moore, Unpublished). For this reason, we have chosen to label the subgoals as CORE and CONTRIBUTORS, and to maintain informational rather than intentional links between subgoals generated by the same operator. This information will be used by both the history and manner phases when preprocessing before realization.
ICICLE's operator design will not be homogeneous across the tiers; since each phase has a different objective, the design of the operators available to each must be different. The method and form operators discussed in the following sections are specified by the value of each of their fields, which is notated in a LISP-like format. For the interpretation of these values, Figure 4.2 lists the propositions used and gives a brief definition of their meanings. The fields that these propositions will occur in are:

o EFFECT: the proposition this operator can be applied toward making true
o CONSTRAINTS: those propositions which must be true or must be satisfied in order for the operator to be applied
o CORE: the core subgoal or speech act to achieve the effect of the operator
o CONTRIBUTORS: any additional subgoals which will assist the core in achieving the effect

Some systems have drawn a distinction between "constraints" which must be satisfied without additional planning by the system, and "prerequisites" which can motivate additional planning to satisfy in order to select this operator (Littman and Allen, 1987; Moore and Paris, 1992). There is no "prerequisites" field in the design we propose, but the subgoals posted by the CONTRIBUTORS field express more or less the same idea, since they are placed on the agenda only when needed. These goals must be satisfied in order to satisfy the main goal of the operator, but if they are already satisfied by the user model the system will not place them on the agenda. The CORE, on the other hand, must always be satisfied through planning. Since the planner is not constraining the linear order of the propositions generated until the manner phase, there is no need for the distinction between prerequisites and subgoals, both of which are CONTRIBUTORS. The rhetorical links between planned propositions will lead to an appropriate order when the time comes.
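The four fields and the agenda rule for CONTRIBUTORS can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not the planner's actual design: propositions are represented as fully instantiated tuples (no `?variable` binding), CONSTRAINTS are treated as a plain conjunction rather than nested AND/OR expressions, and the names `Operator`, `applicable`, and `subgoals_to_post` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    name: str
    effect: tuple        # proposition the operator can be applied toward making true
    constraints: list    # propositions that must hold for the operator to apply
    core: tuple          # subgoal or speech act that is always planned
    contributors: list = field(default_factory=list)  # optional supporting subgoals


def applicable(op, facts):
    """Simplification: every constraint must appear among the known facts."""
    return all(c in facts for c in op.constraints)


def subgoals_to_post(op, user_model):
    """The CORE is always posted. A contributor is posted only when the
    user model does not already satisfy it."""
    pending = [g for g in op.contributors if g not in user_model]
    return [op.core] + pending
```

Because contributors are filtered against the user model at posting time, a single field can play the role that a separate "prerequisites" field plays in other designs.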
Following are the prototype operators for the method and form tiers, and brief discussions of the design issues behind developing the history and manner phases.

4.2.1 Method: a Brief Sketch

The complete specification of the method tier of operators will require research and analysis of second language instructional discourse in order to determine the general structure of typical explanations of this type. This kind of analysis has been used by other researchers in generation (e.g., (McKeown, 1985; Paris, 1987; Moore and Paris, 1992)) to develop schemata on which to base their explanations. The method and form operators of ICICLE would be somewhat similar to schemata, where the method selection would entail basically choosing between schemata, and the form phase would plan out the alternatives within that general structure. As a token example, we will postulate four operators along the guidelines of the original four possibilities for methods in ICICLE, which are restated here:

o To simply provide a corrected form of the sentence.
o To explain the grammar construction that was used incorrectly.
o To provide examples of sentences that illustrate proper usage of the faulty grammatical construction.
o To compare and contrast the grammatical construction involved with its corresponding construction in ASL.

These four choices are sketched in Figures 4.3 and 4.4. Note that all but the first post goals that will need to be further refined; an INFORM statement in the first operator indicates that the system will merely inform the user of a fact. At this point, none of these operators is recursive and none is set up to combine more than one method into a single explanation; but these are merely sketches from which we may proceed in specifying what type of information the user model would need to supply for this tier of operators.

NAME: CORRECT (Give a corrected form of the sentence.)
  EFFECT: (CANCORRECT ?hearer ?original)
  CONSTRAINTS: (AND (BROKENRULE ?original ?rule)
                    (GOODMETHOD ?hearer CORRECT))
  CORE: (INFORM ?speaker ?hearer (CORRECTFORM ?original ?correction))
  CONTRIBUTORS: nil

NAME: TELLRULE (Tell the user which grammar rule was broken.)
  EFFECT: (CANCORRECT ?hearer ?original)
  CONSTRAINTS: (AND (BROKENRULE ?original ?rule)
                    (OR (KNOWS ?hearer (CONCEPT ?rule))
                        (AND (UNFAMILIAR ?hearer (CONCEPT ?rule))
                             (CANLEARN ?hearer (CONCEPT ?rule))))
                    (GOODMETHOD ?hearer TELLRULE))
  CORE: (KNOWS ?hearer (BROKENRULE ?original ?rule))
  CONTRIBUTORS: nil

Figure 4.3: Sketches of the Method Operators, part I.

4.2.2 Form

The primary function of the form operators is to take the main goal posted by the chosen method operator and to refine that goal down to speech acts by generating subgoals as needed to provide additional explanatory material. This approach is similar to that described in (Moore and Paris, 1992), where the system opportunistically defines new terms if necessary at the point of generating specifications for their surface generator. Instead of waiting until that point, however, these form operators will recursively specify clause-level semantic propositions to provide the needed material during this earlier phase of planning.

The two operators suggested in Figure 4.5 are possibilities for addressing the subgoal posted by the TELLRULE method operator from Figure 4.3. Here you can see an additional constraint that did not appear in the method operators: (CORE) or (CONTRIBUTOR), depending on whether the operator may be used to satisfy a goal posted in the core or the contributor of an operator. This constraint has been added in order to separate the operator which plans the core part of the explanation from that which recursively generates explanations of relevant subconcepts.
In this case, the first operator that would be chosen would be the RULESTATEMENT operator, whose core is:

  (INFORM ?speaker ?hearer (BROKENRULE ?original ?rule))

To use an example from the Introduction, this could generate the simple statement: "This sentence contains an error in subject-verb agreement." [The exact syntactic structure of this utterance would depend upon the language level of the user. For instance, a more advanced learner who is trying to perfect relative clause usage may benefit more from the sentence, "This sentence contains an error that occurs in subject-verb agreement." See Section 4.3 for more details.] The CONTRIBUTORS field of this operator can then post subgoals: that for any subconcept involved in this statement, the user must know this subconcept. Because these subconcepts need to be known in order for the overall concept to be known, these are the kinds of "prerequisites" mentioned above. If the user domain knowledge model indicates that a subconcept is known, this goal is not put on the agenda; otherwise, it is, and additional operators will be needed to handle it.

NAME: EXAMPLES (Provide examples which illustrate this language feature's correct usage.)
  EFFECT: (CANCORRECT ?hearer ?original)
  CONSTRAINTS: (AND (BROKENRULE ?original ?rule)
                    (GOODMETHOD ?hearer EXAMPLES))
  CORE: (KNOWS ?hearer (EXAMPLESOF ?rule ?exs))
  CONTRIBUTORS: nil

NAME: COMPARE (Compare the language feature with its nearest equivalent in ASL.)
  EFFECT: (CANCORRECT ?hearer ?original)
  CONSTRAINTS: (AND (BROKENRULE ?original ?rule)
                    (ASLEQUIVALENT ?rule ?aslequiv)
                    (GOODMETHOD ?hearer COMPARE))
  CORE: (COMPARELANGS ?rule ?aslequiv)
  CONTRIBUTORS: nil

Figure 4.4: Sketches of the Method Operators, part II.

For the purely recursive part of the definition, the second operator applies.
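The recursive expansion just described can be sketched in a few lines of Python. This is only an illustration: the function name `explain` and the dictionary-based stand-ins for the DEFINITION and SUBCONCEPTS propositions are hypothetical, the `?speaker` and `?hearer` arguments of INFORM are dropped, and the sketch assumes that the subconcept hierarchy contains no cycles and that every concept reached has a definition.

```python
def explain(concept, definitions, subconcepts, known):
    """Return INFORM propositions that define `concept` and, recursively,
    each of its subconcepts that the user does not already know.

    definitions: dict mapping a concept to its definition text
    subconcepts: dict mapping a concept to a list of its subconcepts
    known:       set of concepts the user model marks as known
    """
    props = [("INFORM", ("DEFINITION", concept, definitions[concept]))]
    for sub in subconcepts.get(concept, []):
        # A known subconcept posts no subgoal; an unknown one is explained in turn.
        if sub not in known:
            props.extend(explain(sub, definitions, subconcepts, known))
    return props
```

The propositions come back in depth-first order here purely for convenience; in the proposed planner the linear order is not fixed until the manner phase.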
To further our example, if it is indicated that the user needs a definition of "subject-verb agreement," the generator could expand its explanation to this: "This sentence contains an error in subject-verb agreement. In English, third-person singular subjects require a present tense verb to have the agreement marker S at the end." This could continue, recursively defining other concepts such as "third-person singular subjects."

As mentioned in the beginning of Section 4.2, we have based our operators on the designs discussed in (Moore and Paris, 1992) but plan to substitute "informational" relations for the "intentional" relations used in the original work. Informational relations are more relevant for connecting the individual semantic units generated by form operators, since the primary function of the propositions generated by this phase is to produce text which is related by the information it conveys. The implementations of the operators sketched roughly in Figure 4.5 will need to be augmented to build a data structure with explicit representations of the semantic links between the concepts mentioned in a proposition generated by the operator and those additional definitions spawned from those concepts. It is this informational relationship between clauses which would be most relevant to a realizer which is ordering and configuring the surface structure. It would also be relevant to the history and manner phases, as will be discussed later.

In addition to these semantic relationships which imply a general ordering, the planner may also wish to note a preferred ordering between the subgoals generated by a given operator. These sibling subgoals do not have informational relationships between them, and yet the general flow of the explanation will impose upon them a preferred order of satisfaction.
Therefore, at the time of generating subgoals, additional relationships should be noted so that the order of the speech acts generated by sibling subgoals reflects the order in which concepts are discussed in the overall utterance.

NAME: RULESTATEMENT
  EFFECT: (KNOWS ?hearer (BROKENRULE ?original ?rule))
  CONSTRAINTS: (AND (SUBCONCEPTS (CONCEPT ?rule) ?subs)
                    (CORE))
  CORE: (INFORM ?speaker ?hearer (BROKENRULE ?original ?rule))
  CONTRIBUTORS: (FORANY ?subs (KNOWS ?hearer ?subs))

NAME: RECURSIVEEXPLANATION
  EFFECT: (KNOWS ?hearer ?concept)
  CONSTRAINTS: (AND (SUBCONCEPTS ?concept ?subs)
                    (CONTRIBUTOR))
  CORE: (INFORM ?speaker ?hearer (DEFINITION ?concept ?def))
  CONTRIBUTORS: (FORANY ?subs (KNOWS ?hearer ?subs))

Figure 4.5: Possible Form Operators.

The product of this phase of planning will therefore be the semantic specification of utterances connected by informational relations; this structure of semantic units is otherwise ready for the realization phase, but will first be sent through a revising process to add contextual information. This revision is handled by the history component.

4.2.3 History: a Revision Approach

The history phase modifies the existing text plan to adjust the explanations to accommodate the context in the dialogue history and the user's domain knowledge. Its two responsibilities are to eliminate redundancy and to generate explicit references between related concepts. The latter is an essential task to enable the student to see connections between the new knowledge being presented and knowledge previously or even recently discussed, or between the new knowledge and established knowledge he or she already has.

The exploration of human explanation strategies for the Sherlock (Moore, 1993; Rosenblum and Moore, 1993) and Migraine (Carenini and Moore, 1993) systems [Sherlock and Migraine are applications in which the EES planner has been implemented.]
led the authors to develop a taxonomy to classify the different types of contextual effects to be found in human explanations, of which there were four:

o explicit reference to a previous explanation to point out similarities or differences
o omission of previously-explained material to avoid distracting the student from novel information being presented
o explicit marking of repeated material to distinguish it from new material
o elaboration of previous material in forms of generalization, more detail, or justification

Because of the bulkiness encountered by the other systems when attempting to plan contextual references concurrent with planning the core structure of the text, we propose to implement some of these effects as revision techniques in ICICLE's penultimate planning phase.

Revision in text generation can be described as the process of building an initial draft of the text at some level of representation and then changing that text before presenting it to the user. Existing systems primarily focus on performing the changes subsequent to syntactic and surface realization decisions, since the objective of the revision is to make syntactic changes only, improving style and/or readability while leaving the semantic content the same. Such systems include the Yh system (Gabriel, 1988) which combined propositions in order to improve readability, Rambow's Joyce system (Rambow, 1990), and the REVISOR system by (Callaway and Lester, 1997). These approaches are therefore not highly applicable to ICICLE's revision goals, since we desire to add to the semantic content of the plan, and furthermore the semantic inclusion choices are divorced from surface realization concerns, which are driven by the user's language level.
The choices of what additional information to include must be driven by the context of the performance of the user in terms of what he or she knows about the domain, and the recent dialogue history; the realization choices must be driven by the user's input needs and level of reading comprehension. Therefore, ICICLE's revision process, both semantically motivated and semantically operating, must focus on the propositional structure built by the previous stages, working before (and independent of) syntactic realization choices.

One existing system focuses on content elaboration. Robin's STREAK system (Robin, 1993; Robin, 1994) generates newspaper-style summaries of sports scores from a database of basketball game results. After its initial planning process STREAK uses "revision operators" to take existing structure and produce an altered structure adding pieces of historical context, such as the recent history of a given player or team. STREAK's goal of adding content (specifically, adding historical context) makes its goals very close to those of ICICLE; for this reason, the general implementation idea of revision operators which produce changes on a planned structure has been adopted for our history component. However, STREAK, like the other revision systems, plans its revisions after realization decisions have been made, and it currently only generates a single sentence. ICICLE's history operators will therefore be largely different from those which Robin implemented.

ICICLE's revision goals are unique: it is adding content rather than revising syntax, and its inclusion decisions must be independent of surface realization concerns. It needs to draw both from the dialogue context like Migraine and Sherlock, and from the domain knowledge base like STREAK. To satisfy all of these goals together, our approach to revision must be novel.
To complement the planning operators for determining the first pass of rhetorical information, our planner will therefore also include a library of revision operators that function entirely on the semantic level and the "informational" links addressed earlier to transform that structure into the final form for passing to the realizer. In the following list of proposed techniques based on the taxonomy developed for Sherlock and Migraine (Moore, 1993; Rosenblum and Moore, 1993; Carenini and Moore, 1993), "context" is to be read as the large context containing the user's established knowledge in addition to what has been mentioned in recent or past explanations:

o If a concept mentioned in an explanation is similar to a concept in the context, make explicit reference to the similarities and distinctions between the two.
o If an explanation that has been planned is reiterating all or part of an explanation done earlier in this session [See next section for why this may be possible], modify the latter explanation so that it does not repeat all of the information mentioned before. How much of the information it repeats depends on the system's estimate of how well the user understood the previous explanation (i.e., is it completely new data or a reinforced explanation? Was the user able to perform the correction the last time?). If there is repeated information remaining, include specific mention that the information has already been discussed.
o When information in an explanation is being repeated from an explanation in an earlier session, ensure that the repeated information is explicitly marked to indicate that this is old and not new.

Figure 4.6: Expanding and then revising a definition.

The revision operators of ICICLE will take an existing completed semantic structure (the output from the form phase) and the information in the user model and the domain database to create a new, modified structure.
This structure may have added new propositions which do not change the existing propositions (see the simplistic example in Figure 4.6), or it may have added specifications to existing propositions so that the realizer will know to add phrases like "Remember..." or "As I was saying before when I talked about X..." in order to explicitly mark repeated information.

There are several issues involved with further developing the history process and its operators. The history-planning mechanism needs to make principled decisions on when to add comparisons to related domain concepts. Some of these principles may be based on the complexity of the existing explanation; if there are multiple branches in the informational structure, indicating many concepts and subconcepts being defined, a comparison may overcomplicate the structure. Also, it may be most useful to draw comparisons only to the main topic of the explanation and not the subtopics. These constraints will need to be built into the operator library or the operator-selection mechanism.

Furthermore, when an explanation item is selected for this kind of revision, the planner must be able to find something relevant for the comparison. In order to develop the part of the planner which scans the user's domain knowledge for relevant items, we will need to investigate methods of determining similarity between objects in the domain. One possibility is to follow the lead in (Rosenblum and Moore, 1993; Lemaire and Moore, 1994) and investigate case-based reasoning such as that described in (Aleven and Ashley, 1992) to select legal cases which share or contrast characteristics in specific ways. Note that Figure 3.1 included a database of language features in the domain knowledge base; it is this knowledge source which will supply information on the properties of the different domain concepts for the comparison/contrast actions.

At the conclusion of this phase, the explanation is fully developed semantically.
Propositions which represent utterances are linked with informational relations in a hierarchical text plan which is almost ready for surface realization. The manner phase will provide the final step of preprocessing and then the explanations can be realized and presented to the user.

4.2.4 Manner

In this last step of processing, our generation system needs to serialize the hierarchical text plan. This process will rely upon the informational links connecting the planned utterances in order to place them in a linear order to be fed to the surface generator. As mentioned above, the core/contributor relationships will be instrumental in deciding part of this order; other information will come from the relationships of propositions in the hierarchy to their parents, and order-preference relationships between siblings not separated by the core/contributor distinction.

4.3 Realizing the System Response

I mentioned in the Introduction that one of our goals is to tailor the syntactic level of the surface output to the acquisition level of the learner. In this section, I will discuss more deeply why this is a goal for ICICLE, and give an overview of how it might be accomplished using an available text realizer.

4.3.1 Comprehensible Input

A serious concern for a tutoring system teaching a second language is to produce instruction that can be understood by the learner. Because ICICLE is currently unable to conduct any instruction in the learner's native language, all instruction must be written in English and care must be taken that the level of syntactic complexity does not overwhelm the user. We are particularly concerned about this for our target learner audience, who (as mentioned in the Introduction) typically have very little access to text which is at their syntactic level (Anderson, 1993). We are therefore looking into how we can design the system to produce text at a syntactic complexity appropriate for the learner using our system.
It has been observed that humans unconsciously "simplify" or otherwise modify their speech when addressing second language learners (Schinke-Llano, 1994; Snow and Hoefnagel-Hohle, 1982; Krashen, 1982), but questions have been raised as to whether oversimplification is counterproductive in a learning environment (Schinke-Llano, 1994). Because of this, we do not wish to design ICICLE to simply produce very simple syntax regardless of the learner's actual proficiency; we wish for the level to vary along with the learner's progress.

Stephen Krashen's ideal of "comprehensible input" (Krashen, 1985) forms a basis for our approach. His theory of acquisition holds that a second language learner acquires the target language through being exposed to grammatical forms which are slightly beyond his current level of proficiency; that is, if the learner is currently at level i, the forms to which he must be exposed are at level i+1. With the help of extralinguistic information such as context and content, the learner can still understand the input even though he has not acquired the grammar it contains. Krashen holds that this input is not only helpful to acquisition, but essential (Krashen, 1981; Krashen, 1982; Krashen, 1985).

There has been some argument that tailoring input to exactly level i+1 is not helpful (Krashen, 1981; Ellis, 1992). Natural "foreigner speech" does not hit only level i+1 but its surrounding levels as well (Krashen, 1981; Krashen, 1982; Krashen, 1985). It may even be harmful to consciously tailor speech to that specific level (Schinke-Llano, 1994), partly because it may distort the natural communication (Krashen, 1982). However, these observations have been made based on human-human interaction and what we can accomplish unconsciously compared to what happens when we concentrate on a specific task; an automated text generation system cannot accomplish anything unconsciously, nor will its actions be affected by the direction of its attention.
Therefore, we conclude that directly designing our system to provide level i+1 input is a desirable approach. We do not need to constrain the realizer to produce only input at this level; this would be both very difficult and unnatural. Krashen suggests a "shotgun approach" where the speaker aims for i+1 and acknowledges that the actual area hit will be larger than this. The advantage to this approach is that it not only provides comprehensible input, but also review of known forms and some input which is a little beyond what the user is ready to acquire, an effect which Krashen calls "anticipation." Note that a central aspect of this approach is to be able to tell where a user is on the road to syntactic fluency and, specifically, what forms are at, above, and below his or her level. This will be discussed more thoroughly in Chapter 5.

4.3.2 Using FUF

Our current goal is to modify FUF, a functional unification-based system (Elhadad, 1993), for our realization purposes. Specific exploration into the details of this modification has not yet been made, but the semantic specifications produced by the form phase, revised by the history phase, and linearized by the manner phase would be essentially FUF specifications, clause-sized semantic propositions connected in a representation of the informat