Major attention in computer-assisted learning is being given to immersive, often game-like environments. Students are placed in various problem-solving situations and allowed to explore either on their own or with various kinds of hints (today typically called "scaffolding"). The big questions here are what kinds of hints/scaffolding will be of most help and when they should be given.
Other tools, such as Texas Instruments' TI-Nspire, tackle the problem from the other end. Rather than providing hints, these calculators serve as tools students can use to facilitate problem solving. They serve as prerequisites: more or less comprehensive foundational skills on which learners may build.
Scaffolding and prerequisites both play a central role in all learning systems. The main problem is that good tutoring systems have been difficult and expensive to build. Moreover, their educational benefits have been difficult and expensive to evaluate. Determining effectiveness and efficiency invariably requires direct (and often expensive) empirical evaluation. The results are rarely if ever as good as what a human tutor can do, and comparisons with classroom instruction are often hard to evaluate.
Instructional design models help. Among other things they help identify what must be mastered for success and what can be assumed on entry. Computer Based Instruction (CBI) systems build on assumed prerequisites and are directed at what must be learned. After years of effort, beginning with Control Data's work (under the leadership of William Norris) in the early 1960s, the best CBI is limited to providing pretests to identify areas of weakness, providing instruction aimed at deficiencies and following up with post tests to determine how much has been learned.
ALEKS (a McGraw-Hill company) is one of the better commercially available CBI systems. In ALEKS and other advanced CBI systems (e.g., Paquette, 2007, 2009), to-be-acquired knowledge is represented in terms of relational models.
Intelligent Tutoring System (ITS) research goes further, attempting to duplicate or model what a good tutor can do by adjusting diagnosis and remediation dynamically during instruction. ITS research focuses on modeling and diagnosing what is going on in learner minds (e.g., Anderson, 1993; cf. Koedinger et al, 1997; Scandura et al, 2009). Assumptions are made both about what knowledge (productions) might be in learner minds and about the learning mechanisms controlling the way those productions are used in producing behavior and learning.
Identifying the productions involved in any given domain is a difficult task. Specifying learning mechanisms is even harder. Recognizing these complexities, Carnegie Learning credits Anderson's evolving ACT theories (compare Ohlsson, 2007 and Ohlsson & Koedinger in Scandura et al, 2009), but has focused on integrating ITS with print materials to make them educationally palatable (i.e., more closely aligned with what goes on in classrooms).
The difficulties do not stop there. Ohlsson noted as early as 1987 that specifying remedial actions (what to teach) is much harder than modeling and diagnosis. As in CBI, pedagogical decisions in ITS necessarily depend on the subject matter being taught, on the semantics of the content. Each content domain requires its own unique set of pedagogical decisions. It is not surprising in this context that Ohlsson and Mitrovic found common cause in developing Constraint Based Modeling (CBM, 2007). CBM is a simplified alternative to production-system-based ITS in which the focus is on constraints that must be met during the course of instruction, not on the cognitive constructs (productions) responsible for meeting those constraints.
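The CBM idea just described can be sketched in a few lines. Each constraint is a pair of conditions: a relevance condition (when does the constraint apply?) and a satisfaction condition (if it applies, is it met?). The fraction-addition domain, constraint name and solution fields below are hypothetical illustrations, not part of any actual CBM system:

```python
# Minimal sketch of Constraint Based Modeling (CBM): each constraint
# is a (relevance, satisfaction) pair of predicates over a candidate
# solution. The toy fraction-addition domain here is an assumption
# made purely for illustration.

def relevance_add(sol):
    # Constraint is relevant whenever two fractions are being added.
    return sol["op"] == "add"

def satisfaction_add(sol):
    # If relevant, the denominators must have been made equal
    # before the numerators were added.
    return sol["denom_left"] == sol["denom_right"]

CONSTRAINTS = [
    ("common-denominator", relevance_add, satisfaction_add),
]

def violated(sol):
    """Return names of constraints that are relevant but unsatisfied."""
    return [name for name, rel, sat in CONSTRAINTS
            if rel(sol) and not sat(sol)]

# A learner adds 1/2 + 1/3 without finding a common denominator:
print(violated({"op": "add", "denom_left": 2, "denom_right": 3}))
# A step with a common denominator violates nothing:
print(violated({"op": "add", "denom_left": 6, "denom_right": 6}))
```

Note that nothing in the sketch models the productions responsible for meeting the constraint; the tutor reacts only to violations, which is precisely the simplification CBM offers.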
From their inceptions, the Holy Grail in CBI and ITS has been to duplicate what good teachers do. As shown by Bloom (1984), the best human tutors can improve mastery by 2 sigmas in comparison to normal instruction. This goal has been broadly influential but never achieved through automation. The limited success of CBI, combined with the complexities and cost inefficiencies of ITS, has reduced effort and research support for both CBI and ITS.
This disclosure shows that these trends are premature. Advances in Structural Learning Theory (e.g., Scandura, 2007, Scandura et al, 2009, hereafter SLT) and AuthorIT and TutorIT technologies (Scandura, 2005) based thereon make it possible not only to duplicate human tutors in many areas but to do better. Today, for example, few doubt we can build tutoring systems that teach facts as well or better than humans. “Flash cards”, for example, could easily be replaced by computers—with more efficiency and certain results.
This disclosure goes further. It shows:
a) How AuthorIT makes it possible to create, and TutorIT to deliver, highly adaptive and configurable tutoring systems that can guarantee learning of well-defined math skills (Scandura, 2005, 2007, 2009).
b) Why and in what sense TutorIT tutorials can guarantee mastery of such skills.
c) Why TutorIT tutorials can be developed cost effectively, at half the cost of traditional CBI development.
d) How TutorIT tutorials can be extended to support the development and delivery of higher as well as lower order knowledge.
e) Why TutorIT tutorials can be expected to produce as good or better learning than most human tutors.
More generally, the enormous potential of intelligent tutoring systems (ITS) has been recognized for decades (e.g., Anderson, 1988; Scandura, 1987). Explicit attention to knowledge representation, associated learning theories and decision making logic makes it possible to automate interactions between the learner, tutor and content to be acquired (commonly referred to as the expert module). In principle, ITSs may mimic or even exceed the behavior of the best teachers. This potential, however, has never been fully realized in practice.
The situation today remains much as it has been. The central importance of the content domain, modeling the student, and the interactions between them remain as before. Every automated instructional system consists of: a) one or more learners, human and/or automated; b) content to be acquired by the learners, also known as the knowledge representation (KR); c) a communication or "display" system for representing how questions and information are to be presented to and received from learners; and d) a tutor capable of deciding what and when information is to be presented to the learner and how to react to feedback and/or questions from the learner.
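The four components just listed can be sketched as a minimal architecture. All class and method names below are hypothetical, chosen only to make the roles of (a)-(d) concrete; no actual system is being described:

```python
# Sketch of the four components of an automated instructional system:
# (a) learner, (b) knowledge representation, (c) display, (d) tutor.
# Interfaces are assumptions made purely for illustration.
from dataclasses import dataclass, field

@dataclass
class Learner:                      # (a) a learner, here with canned answers
    responses: list = field(default_factory=list)

@dataclass
class KnowledgeRepresentation:      # (b) content to be acquired (KR)
    items: list = field(default_factory=list)

class Display:                      # (c) communication/"display" system
    def present(self, item):
        print(f"Q: {item}")
    def receive(self, learner):
        return learner.responses.pop(0)

class Tutor:                        # (d) decides what/when to present
    def __init__(self, kr, display):
        self.kr, self.display = kr, display
    def run(self, learner):
        results = []
        for item in self.kr.items:
            self.display.present(item)
            answer = self.display.receive(learner)
            results.append((item, answer))
        return results

kr = KnowledgeRepresentation(items=["2+2?"])
print(Tutor(kr, Display()).run(Learner(responses=["4"])))
```

The interesting design questions, of course, all live inside the tutor's decision loop; the sketch merely fixes where those decisions sit relative to the other three components.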
A major bottleneck in the process has been representation of the content domain (expert model). As Psotka et al (1988, p. 279) put it, "The fundamental problem is organizing knowledge into a clear hierarchy to take best advantage of the redundancies in any particular domain. Without extensive experience in doing this, we are largely ignorant of the kind of links and predicates to use. We are still far from cataloging the kinds of links needed for these semantic hierarchies. The best predicates that might describe these knowledge structures simply, beyond ISA and PART OF hierarchies, still need to be defined. Too little work has been aimed at developing new representations for information and relationships." They go on to mention the need to consider OO frameworks and/or structures containing objects, relationships and operations (pp. 279-280).
A variety of approaches to Knowledge Representation (KR) have been used in ITS development. Among the earliest are Anderson's ACT-R theory (e.g., 1988), based on production systems, and Scandura's (2001) Structural Learning Theory (SLT). The SLT provides a comprehensive and increasingly rigorous theoretical framework for explaining, predicting and controlling problem solving behavior, with particular attention to interactions between teachers and learners (e.g., Scandura, 1971, 1973, 1977, 2001a, 2003, contemporaneously updated and further refined in Scandura, 2005, 2007, 2009). Structural (Cognitive Task) Analysis (SA) comprises an essential part of the SLT, used to construct higher and lower order SLT rules. Each SLT rule is comprised of a hierarchical Abstract Syntax Tree (AST) representing procedural knowledge at multiple levels of abstraction, in which each node consists of an operation or decision operating on a data structure (defined as a structural declarative AST) (e.g., Scandura, 2003). Because an SLT rule consists of both a procedural AST and the structural AST on which it operates, the terms SLT rule, AST and AST-based knowledge representation are used interchangeably herein, with the understanding that each unit of knowledge (each SLT rule) involves a procedural AST operating on a corresponding structural AST (e.g., Scandura, 2001, 2003). Both production systems and procedures in SLT are fully operational in the sense that they can be directly interpreted and executed on a computer. Other popular approaches are based on semantic/relational networks of one sort or another (e.g., knowledge spaces, conceptual graphs). Semantic networks represent structural knowledge hierarchically in terms of nodes and links between them.
Semantic networks have the benefit of representing knowledge in an intuitive fashion. Unlike production systems and SLT procedures, however, they are not easily interpreted (executed on a computer). In effect, each type of representation has important advantages and limitations: Production systems are operational but do not directly reflect structural characteristics of the knowledge they represent. Networks represent the structure but are not directly operational. It also is worth noting that most approaches to KR make a sharp distinction in the way they treat declarative and procedural knowledge, on the one hand, and domain specific and domain independent knowledge, on the other.
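The SLT rule structure described above, a procedural AST paired with the structural AST it operates on, can be sketched as follows. The node layout and the column-subtraction example are assumptions made for illustration, not the actual AuthorIT representation:

```python
# Minimal sketch of an SLT rule: a procedural AST whose nodes are
# operations/decisions at multiple levels of abstraction, paired with
# the structural (declarative) AST it operates on. Labels are
# illustrative assumptions only.

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

# Structural AST: the data a column-subtraction rule operates on.
structure = Node("problem", [
    Node("minuend"), Node("subtrahend"), Node("difference"),
])

# Procedural AST: the same skill at two levels of abstraction.
procedure = Node("subtract-columns", [          # abstract level
    Node("subtract-one-column", [               # refinement
        Node("decide: borrow needed?"),
        Node("borrow"),
        Node("subtract-digits"),
    ]),
])

def leaves(node):
    """Terminal nodes: the finest-grained operations or data elements."""
    if not node.children:
        return [node.label]
    return [l for c in node.children for l in leaves(c)]

print(leaves(procedure))
print(leaves(structure))
```

Because every interior node is refined into behaviorally equivalent children, the same tree can be read at whatever level of abstraction suits testing or instruction, which is the property the surrounding discussion turns on.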
Given the complexity of current approaches, a variety of tools have been proposed to assist in the process. Most of these are tailored toward one kind of content or another. Merrill's ID2 authoring software (1994), for example, offers various kinds of instructional transactions, each tailored to a specific kind of knowledge (e.g., facts, concepts or procedures). Such systems facilitate the development of instructional software but they are limited to prescribed kinds of knowledge. In particular, they are inadequate for delivering instruction where the to-be-acquired knowledge involves more than one type, as normally is the case in the real world.
Such tools can facilitate the process. However, the problem remains that there have been no clear, unambiguous, universally applicable and internally consistent methods for representing content (expert knowledge). In the absence of a generalizable solution to this problem, ITS development depends on subject matter semantics. The way tutor modules interact with student models as well as the human interface through which the learner and tutor communicate have been heavily dependent on the content in question. The widespread use of production systems (e.g., Anderson, 1988) for this purpose is a case in point. Production systems have the theoretical appeal of a simple uniform (condition-action) structure. This uniformity, however, means that all content domains have essentially the same structure—that of a simple list (of productions). In this context it is hard to imagine a general-purpose tutor that might work even reasonably (let alone equally well) with different kinds of content. Without a formalism that more directly represents essential features, it is even harder to see how production systems might be used to automate construction of the human interface.
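The point about production systems having the structure of a simple list can be made concrete. A production system is a flat list of condition-action pairs over a working memory; the toy interpreter below (facts as a set of strings) is an assumption-laden illustration, not ACT-R:

```python
# A production system really is just a flat list of (condition, action)
# pairs, as noted above. Working memory is modeled as a set of facts;
# the addition-with-carry productions are illustrative assumptions.

productions = [
    (lambda wm: "goal:add" in wm and "carry" in wm,
     lambda wm: wm | {"write-carried-sum"}),
    (lambda wm: "goal:add" in wm and "carry" not in wm,
     lambda wm: wm | {"write-sum"}),
]

def cycle(wm):
    """Fire the first production whose condition matches (one cycle)."""
    for condition, action in productions:
        if condition(wm):
            return action(wm)
    return wm

print(sorted(cycle({"goal:add", "carry"})))
print(sorted(cycle({"goal:add"})))
```

Notice that nothing in the list reflects the hierarchical structure of the skill being modeled; any domain, however structured, flattens into the same shape, which is exactly the limitation the paragraph above identifies.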
Representations focused on OO inheritance are limited by their emphasis on objects, with operations subordinate to those objects. Representing rooms, cars and clean operations, for example, involves talking about such things as room and car objects cleaning themselves, rather than more naturally about operations cleaning rooms and cars (e.g., Scandura, 2001b).
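The contrast can be shown in a few lines. In the OO style the operation is subordinate to each object ("the room cleans itself"); in an operation-centric style a single clean operation applies naturally across objects. The classes below are hypothetical illustrations:

```python
# OO style: clean() is subordinate to each object class,
# so "rooms clean themselves" and "cars clean themselves".
class Room:
    def clean(self):
        return "room cleaned"

class Car:
    def clean(self):
        return "car cleaned"

# Operation-centric style: one clean operation applies to rooms
# and cars alike, matching the natural phrasing "clean the room".
def clean(thing):
    return f"{type(thing).__name__.lower()} cleaned"

print(Room().clean(), clean(Car()))
```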
Consequently, ITSs have either been developed de novo for each content domain, or built using authoring tools of one sort or another designed to reduce the effort required (e.g., by providing alternative prototypes, Warren, 2002). Whereas various industry standards (e.g., SCORM) are being developed to facilitate reuse of learning objects (e.g., diagnostic assessments and instructional units) in different learning environments, such standards are designed by committees to represent broad-based consensus. Cohesiveness (internal consistency) and simplicity are at best secondary goals. Specifically, such standards may or may not allow the possibility of building general-purpose tutors that can intelligently guide testing (diagnosis) and instruction based solely on the structure of a KR without reference to semantics specific to the domain in question. They do not, however, offer a solution to the problem.
A recent article on the Structural Learning Theory (SLT) outlines an approach to this problem (Scandura, 2001a).1 SLT was designed from its inception explicitly to address interactions between the learner and some external agent (e.g., observer or tutor), with emphasis on the relativistic nature of knowledge (e.g., Scandura, 1971, 1988). The former articles (i.e., Scandura 2001a,b) summarize the rationale and update essential steps in carrying out a process called Structural (cognitive task) Analysis (SA). In addition to defining the key elements in problem domains, and the behavior and knowledge associated with such domains, special attention is given to the hierarchical nature of Structural Analysis and its applicability to ill-defined as well as well-defined content.2 Among other things, higher order (domain independent) knowledge is shown to play an essential role in ill-defined domains. SA also makes explicit provision for diagnostic testing and a clear distinction between novice, neophyte and expert knowledge. While broad, however, this overview omits essential features as well as the precision necessary to allow unambiguous automation on a computer.
1 The SLT has evolved over a period of several decades beginning in the 1960's. Scandura (2001a) summarizes the current status and new developments in SLT. Appendix A in Scandura (2001a) provides a useful overview of major developments and related publications over the years.
Some key characteristics of SLT are listed here for the reader's convenience:
a) the central importance of structural (cognitive task) analysis,
b) distinctions between lower and higher order knowledge (used to distinguish between domain specific and domain independent knowledge),
c) the representation of knowledge at different levels of abstraction (used to distinguish between levels of expertise, make testing more efficient and/or to guide the learner and/or instruction),
d) explicit processes for assessing what a learner does and does not know relative to a given body of content (i.e., the learner model),
e) a universal control mechanism (playing a central role in problem solving and implementable in a manner that is totally independent of higher as well as lower order knowledge), and
f) assumed fundamental capacities such as processing capacity and processing speed.
More directly related characteristics are detailed for the first time in this series.
2 The process of structural analysis has a long history. Most of the earlier work through the early 1980s concentrated on the identification of (domain independent) higher order knowledge as well as lower order (domain specific) knowledge. The use of ASTs to represent knowledge, however, came largely as a result of later work in software engineering.
Although incomplete from the standpoint of ITS, parallel research in software engineering provides the necessary rigor as far as it goes. This research makes very explicit what has until recently been an informal process (of SA). SA has evolved to the point where it is automatable on a computer (U.S. Pat. No. 6,275,976, Scandura, 2001b) and sufficient for representing the knowledge not only associated with domain specific content but also with domain independent knowledge and ill-defined domains.
A recent article by Scandura (2003) extends this analysis to domain specific systems of interacting components and shows how the above disclosure (U.S. Pat. No. 6,275,976) provides an explicit basis for representing both declarative and procedural knowledge within the same KR. While abstract syntax trees (ASTs) are used for this purpose in the preferred embodiment, it is clear to anyone skilled in the art that any number of formally equivalent embodiments might be used for similar purposes. ASTs, for example, are just one kind of what is commonly referred to as Knowledge Representation (KR).
Programming is inherently a bottom-up process: the process of representing data structures and/or processes (a/k/a to be learned content) in terms of elements in a predetermined set of executable components. These components include functions and operations in procedural programming and objects in OO programming. Software design, on the other hand, is typically a top-down process. Instead of emphasizing the assembly of components to achieve desired ends, the emphasis is on representing what must be learned (or executed in software engineering) at progressive levels of detail.
Like structured analysis3 and OO design in software engineering, SA is a top-down method. However, it is top-down with a big difference: each level of representation is designed to be behaviorally equivalent to all other levels. The realization of SA in AutoBuilder also supports complementary bottom-up automation. Not only does the process lend itself to automation, but it also guarantees that the identified competence is sufficient to produce the desired (i.e., specified) behavior.4
3 Whereas processes and data are refined independently in structured analysis and in OO design, both are refined in parallel in structural analysis (SA).
4 See Scandura (2001a, 2005, 2007; Scandura et al, 2009) for updated information on a General Purpose Intelligent Tutor, which, working in conjunction with content represented in this manner, can guarantee specified learning in a minimum time (e.g., with the fewest possible test and/or instructional interactions between tutor and learner).
The present disclosure is based on the commonly made assumption that any instructional system consists of one or more learners, a human and/or automated tutor, and a representation of the content to be taught. The latter represents the knowledge structure or structures to be acquired by the learners and may be represented in any number of ways. Also assumed is an electronic blackboard or other means of presenting information to and receiving responses from learners. Either the learner or an automated tutor must decide what and when information is to be presented to the learner and/or how to react to feedback from the learner.
Further Background.—In the 1960s, there was a disconnect between educational research and research in subject matter (math) education. Educational research focused on behavioral variables: exposition vs. discovery, example vs. didactic, demonstration vs. discussion, text vs. pictures, aptitude-treatment interactions, etc. (cf. Scandura, 1963, 1964a,b). Subject matter variables were either ignored or limited to such things as simple, moderate or difficult, with little attention to what makes content simple, moderate or difficult. Conversely, research in subject matter (e.g., math) education focused primarily on content (reading, writing, arithmetic skills, algebraic equations, proof, etc.).
In the same time period, instructional design focused on what was to be learned and the prerequisites for same. Task analysis focused initially on behavior—on what learners need to do (Miller, 1959; Gagne, 1966). In my own work, this focus morphed into cognitive task analysis—on what learners must learn for success (e.g., Scandura, 1970, 1971, Durnin & Scandura, 1973). My parallel work in experimental psychology (Greeno & Scandura, 1966; Scandura & Roughead, 1967) in the mid 1960s added the critical dimension of behavior to the equation.
Structural Learning grew out of this disconnect, with the goal of integrating content structure with human cognition and behavior. Structural Learning Theory (SLT) was first introduced as a unified theory in 1970 (published in Scandura 1971a). SLT's focus from day one (and the decade of research on problem solving and rule learning which preceded it) was on what must be learned for success in complex domains, ranging from early studies of problem solving and rule learning (Roughead & Scandura, 1968; Scandura, 1963, 1964a,b, 1973, 1977) to Piagetian conservation (Scandura & Scandura, 1980), constructions with straight edge and compass, mathematical proofs and critical reading (e.g., Scandura, 1977).
This research was focused on the following four basic questions [with their evolution from 1970 through the present]:
Content: What does it mean to know something? And how can one represent knowledge in a way that has behavioral relevance? [1970: Directed graphs (flowcharts) → Now: Abstract Syntax Trees (ASTs) & Structural Analysis (SA)]
Cognition: How do learners use and acquire knowledge? Why is it that some people can solve problems whereas others cannot? [1970: Goal switching → Now: Universal Control Mechanism (UCM)]
Assessing Behavior: How can one determine what an individual does and does not know? [1970: Which paths are known → Now: Which nodes in the AST are known (+), (−), (?)]
Instruction: How does knowledge change over time as a result of interacting with an external environment? [1970: Single level diagnosis & remediation → Now: Multi-level inferences about what is known and what needs to be learned]
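The node-level assessment behind the third question can be sketched as follows: each AST node is marked '+' (known), '-' (not known) or '?' (undetermined), and a parent's status is inferred from its children. The particular inference rule and the subtraction example below are assumptions for illustration:

```python
# Sketch of "which nodes in the AST are known": leaf marks come from
# observed behavior; interior nodes are inferred. A parent is '+'
# only if all children are '+', '-' if any child is '-', else '?'.
# This inference rule is an illustrative assumption.

def infer(tree, marks):
    """tree: {node: [children]}; marks: leaf node -> '+', '-' or '?'."""
    def status(node):
        children = tree.get(node, [])
        if not children:
            return marks.get(node, "?")
        kids = [status(c) for c in children]
        if all(k == "+" for k in kids):
            return "+"
        if any(k == "-" for k in kids):
            return "-"
        return "?"
    return {n: status(n) for n in tree}

tree = {"subtract": ["borrow", "subtract-digits"],
        "borrow": [], "subtract-digits": []}
print(infer(tree, {"borrow": "+", "subtract-digits": "?"}))
```

Inference of this kind is what allows multi-level diagnosis: testing a node at one level of abstraction constrains what must be true of nodes above and below it, so fewer test items are needed.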
Higher order and lower order knowledge played a central role in this research from its inception, with emphasis on the central role of higher order knowledge in problem solving (Scandura, 1971, 1973, 1977). Early SLT research also focused heavily on identifying what individual learners do and do not know relative to what needed to be learned (e.g., Durnin & Scandura, 1974; Scandura, 1971, 1973, 1977).
Deterministic theorizing was a major distinguishing feature of this research (Scandura, 1971). I was focused, even obsessed, with understanding, predicting and (in so far as education is concerned) controlling how individuals solve problems. Despite considerable training in statistics and having conducted a good deal of traditional experimental research (e.g., Greeno & Scandura, 1966; Scandura & Roughead, 1967; Scandura, 1967), I found comparisons based on averaging behavior over multiple subjects unsatisfying. I wanted something better, more akin to what had been accomplished in physics centuries earlier (cf. Scandura, 1974a).5
5 The deterministic philosophy I am proposing represents a major departure in thinking about how to evaluate instruction; in particular, it calls into question the usual measures used in controlled experiments. After understanding how TutorIT works, please see my concluding comments on this subject.
SLT was unique when introduced, and raised considerable interest both in the US and internationally (Scandura, 1971a, 1973, 1977). Literally hundreds of CBI programs based on SLT were developed later in the 1970s and early 1980s, and many sold for decades.
Nonetheless, ITS largely ignored this research and focused on later work in cognitive psychology (Anderson et al, 1990, 1993) and especially the Carnegie school of artificial intelligence based on production systems (esp. Newell & Simon, 1972).
By the mid-1970s, cognitive psychology also discovered the importance of content, often equating theory with alternative ways of representing knowledge. Research focused largely on what (productions or relationships) might be in learner minds and comparing fit with observable behavior. Experimental studies followed the traditional statistical paradigm.
Similarly, most CBI development was heavily influenced by Gagne's work in instructional design (1965), along with that of Merrill and his students (1994). The restricted focus of Reigeluth's (1983, 1987) influential books on Instructional Design largely eliminated or obscured some of SLT's most important features, most notably its focus on precise diagnosis and higher order learning and problem solving. With essential differences requiring significant study, the long and short of it is that, other than our own early tutorials (which made small publisher Queue one of Inc Magazine's 100 fastest growing small businesses), SLT failed to significantly inform on-going research in either CBI or ITS. After the interdisciplinary doctoral program in structural learning I developed at Penn was eliminated in the early-to-mid 1970s, SLT became a little understood historical curiosity.
With recent publications in TICL, depth of understanding in ITS, CBI and SLT has increased in recent years (Mitrovic & Ohlsson, 2007; Paquette, 2007; Scandura, 2007), including their respective advantages and limitations (Scandura, Koedinger, Mitrovic & Ohlsson, Paquette, 2009). Advances in the way knowledge is represented in SLT have the potential of revolutionizing the way tutoring systems are developed, both now and in the future. SLT rules6 were originally represented as directed graphs (e.g., Scandura, 1971a, 1973). Directed graphs (equivalent to flowcharts) make it possible to assess individual knowledge. They have the disadvantage, however, of forcing one to make a priori judgments about level of analysis. They also make it difficult to identify subsets of problems associated with various paths in those graphs.
6 I used the term "rule" rather extensively in behavioral research during the 1960s. Adopting the term "production" from the logician Post in the 1930s, Newell & Simon (1972) introduced the term "production rule" in their influential book on problem solving. Anderson later used the term "rule" in ITS as synonymous with "production rule" (in production systems). Accordingly, it ultimately seemed best to introduce the term "SLT rule" to distinguish the two. Distinctive characteristics of SLT rules became even more important with my introduction of ASTs into SLT. In this context, ASTs represent a long-sought solution to my early attempts at formalization in SLT (see Scandura, 1973, Chapter 3). The importance of ASTs in SLT, however, only gradually became clear to me after using the concept for some time in developing our software engineering tools, despite the fact that ASTs had played a central role for years in compiler theory.
Having spent two decades in software engineering (e.g., Scandura, 1991, 1994, 1995, 1999, 2001), it became increasingly apparent that a specific form of Abstract Syntax Trees (ASTs) offered a long-sought solution. ASTs are a precise formalism derived from compiler theory and widely used in software engineering. To date, ASTs have had almost no impact on knowledge representation, ITS or CBI. However, we will see that they do indeed have very significant advantages in SLT.
An up-to-date version of SLT and its relationship to other recent approaches to automated instructional systems has been documented previously (Scandura, 2007; Scandura, Koedinger, Mitrovic & Ohlsson and Paquette, 2009).
This disclosure focuses on what is most unique about knowledge representation in SLT, along with why and how it offers major advantages in developing adaptive tutoring systems. To avoid misunderstanding, it is helpful to note that the current approach is fundamentally different from others based on statistical analyses (e.g., Sheehan, U.S. Pat. No. 6,144,838). The present disclosure is based on a priori identification of the underlying proficiency model: what an expert in the area believes should be learned for success. In contrast, the Sheehan disclosure explicitly assumes that the true proficiency model is not known a priori. The Sheehan method uses regression, Bayesian decision theory and other statistical methods to model ways in which required skills interact with different item features to produce differences in item difficulty. The present disclosure, by contrast, has a deterministic foundation (i.e., a true proficiency model IS known a priori). In short, the main thing the two have in common is the word "tree-based".
There have been three fundamental advances in SLT in recent years. First is in the way knowledge is represented. SLT rules were originally represented as directed graphs (Flowcharts). They are now represented in terms of Abstract Syntax Trees (ASTs). Second is formalization of a key step in Structural (domain) Analysis (SA), enabling the systematic identification of higher order SLT rules that must be learned for success in ill-defined domains. Third is the complete separation of SLT's control mechanism from higher order knowledge. These advances distinguish knowledge representation in SLT from all others, and have fundamental implications for building adaptive tutoring systems.