`
`Heuristic Classification*
`
`William J. Clancey
`
`Stanford Knowledge Systems Laboratory, 701 Welch Road,
`Building C, Palo Alto, CA 94304, U.S.A.
`
`Recommended by Allen Newell
* Expanded version of "Classification Problem Solving", in: Proceedings Fourth National Conference on Artificial Intelligence, Austin, TX (August 1984) 49-55.

Artificial Intelligence 27 (1985) 289-350
0004-3702/85/$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)
`
`ABSTRACT
`
A broad range of well-structured problems, embracing forms of diagnosis, catalog selection, and
skeletal planning, are solved in 'expert systems' by the methods of heuristic classification. These
programs have a characteristic inference structure that systematically relates data to a pre-enumerated
set of solutions by abstraction, heuristic association, and refinement. In contrast with previous
descriptions of classification reasoning, particularly in psychology, this analysis emphasizes the role of
a heuristic in routine problem solving as a non-hierarchical, direct association between concepts. In
contrast with other descriptions of expert systems, this analysis specifies the knowledge needed to solve
a problem, independent of its representation in a particular computer language. The heuristic
classification problem-solving model provides a useful framework for characterizing kinds of prob-
lems, for designing representation tools, and for understanding non-classification (constructive)
problem-solving methods.
`
To understand something as a specific instance of a more
`general case—which is what understanding a more fundamen-
`tal principle or structure means—is to have learned not only a
`specific thing but also a model for understanding other things
`like it that one may encounter. [13]
`
`1. Introduction
`
`Over the past decade, a variety of heuristic programs, commonly called ‘expert
`systems’, have been written to solve problems in diverse areas of science,
`engineering, business, and medicine. Developing these programs involves
`satisfying an interacting set of requirements: Selecting the application area and
`specific problem to be solved, bounding the problem so that it is computation-
`ally and financially tractable, and implementing a prototype program—to name
`a few obvious concerns. With continued experience, a number of programming
environments or 'tools' have been developed and successfully used to imple-
ment prototype programs [51]. Importantly, the representational units of tools
(such as 'rules' and 'attributes') provide an orientation for identifying manage-
able subproblems and organizing problem analysis. Selecting appropriate ap-
plications now often takes the form of relating candidate problems to known
computational methods, our tools.
Yet, in spite of this experience, when presented with a given 'knowledge
engineering tool', such as EMYCIN [94], we are still hard-pressed to say what
kinds of problems it can be used to solve well. Various studies have demon-
strated advantages of using one representation language instead of another:
for ease in specifying knowledge relationships, control of reasoning, and
perspicuity for maintenance and explanation [2, 3, 21, 27, 92]. Other studies
have characterized in low-level terms why a given problem might be in-
appropriate for a given language, for example, because data are time-varying or
subproblems interact [51]. While these studies reveal the weaknesses and
limitations of the rule-based formalism in particular, they do not clarify the
form of analysis and problem decomposition that has been so successfully used
in these programs. In short, attempts to describe a mapping between kinds of
problems and programming languages have not been satisfactory because they
don't describe what a given program knows: Applications-oriented descriptions
like 'diagnosis' are too general (e.g., solving a diagnostic problem does not
necessarily require a device model), and technological terms like 'rule-based'
do not describe what kind of problem is being solved [48, 49]. We need a better
description of what heuristic programs do and know, a computational charac-
terization of their competence, independent of task and independent of pro-
gramming language implementation. Logic has been suggested as a basis for a
'knowledge-level' analysis to specify what a heuristic program does and might
know [69, 71]. However, we have lacked a set of terms and relations for doing
this.
`
In an attempt to characterize the knowledge-level competence of a variety of
expert systems, a number of programs were analyzed in detail.¹ There is a
striking pattern: These programs proceed through easily identifiable phases of
data abstraction, heuristic mapping onto a hierarchy of pre-enumerated solu-
tions, and refinement within this hierarchy. In short, these programs do what is
commonly called classification, but with the important twist of relating concepts
in different classification hierarchies by non-hierarchical, uncertain inferences.
We call this combination of reasoning heuristic classification.

¹Including: Ten rule-based systems (MYCIN, PUFF, CLOT, HEADMED, SACON from the EMYCIN
family [15], plus WINE, BANKER, the Drilling Advisor, and other proprietary systems developed at
Teknowledge, Inc.), a frame-based system (GRUNDY), and a program coded directly in LISP
(SOPHIE III).

Note carefully: The heuristic classification model characterizes a form of
knowledge and reasoning: patterns of familiar problem situations and solu-
tions, heuristically related. In capturing problem situations that tend to occur
and solutions that tend to work, this knowledge is essentially experiential, with
an overall form that is problem-area-independent. Heuristic classification is a
method of computation, not a kind of problem to be solved. Thus, we refer to
'the heuristic classification method', not 'classification problem'.
`Focusing on epistemological content rather than representational notation,
`this paper proposes a set of terms and relations for describing the knowledge
`used to solve a problem by the heuristic classification method. Subsequent
`sections describe and illustrate the model
`in the analysis of MYCIN, SACON,
GRUNDY, and SOPHIE III. Significantly, a knowledge-level description of these
`programs corresponds very well
`to psychological models of expert problem
`solving. This suggests that the heuristic classification problem-solving model
captures general principles of how experiential knowledge is organized and
`used, and thus generalizes some cognitive science results. A thorough dis-
`cussion relates the model to schema research; and use of a conceptual graph
`notation shows how the inference-structure diagram characteristic of heuristic
`classification can be derived from some simple assumptions about how data and
`solutions are typically related (Section 4). Another detailed discussion then
`considers “what gets selected”, possible kinds of solution (e.g., diagnoses). A
`taxonomy of problem types is proposed that characterizes solutions of problems
`in terms of synthesis or analysis of some system in the world (Section 5). We
`finally turn to the issue of inference control in order to further characterize tool
`requirements for heuristic classification (Section 6), segueing into a brief
`description of constructive problem solving (Section 7).
`This paper explores different perspectives for describing expert systems; it is
`not a conventional description of a particular program or programming
`language. The analysis does produce some specific and obviously useful
`results, such as a distinction between electronic and medical diagnosis pro-
grams (Section 6.2). But there are also a few essays with less immediate
`payoffs, such as the analysis of problem types in terms of systems (Section 5)
`and the discussion of the pragmatics of defining concepts (Section 4.5). Also,
`readers who specialize in problems of knowledge representation should keep in
`mind that
`the discussion of schemas (Section 4) is an attempt to clarify the
`knowledge represented in rule-based expert systems, rather than to introduce
`new representational ideas.
`From another perspective, this paper presents a methodology for analyzing
`problems, preparatory to building an expert system. It
`introduces an inter-
`mediate level of knowledge specification, more abstract than specific concepts
`and relations, but still independent of implementation language. Indeed, one
`aim is to afford a level of awareness for describing expert system design that
`enables knowledge representation languages to be chosen and used more
`
`deliberately.
`We begin with the motivation of wanting to formalize what we have learned
about building expert systems. How can we classify problems? How can we
select problems that are appropriate for our tools? How can we improve our
tools? Our study reveals patterns in knowledge bases: Inference chains are not
`
`
`
`
`arbitrary sequences of implications, they compose relations among concepts in
`a systematic way. Intuitively, we believe that understanding these high-level
`knowledge structures, implicitly encoded in today’s expert systems, will enable
us to teach people how to use representation languages more effectively, and
`also enable us to design better languages. Moreover,
`it
`is a well-established
`principle for designing these programs that the knowledge people are trying to
`express should be stated explicitly, so it will be accessible to auxiliary programs
`for explanation, teaching, and knowledge acquisition (e.g., [30]).
`Briefly, our methodology for specifying the knowledge contained in an
`expert system is based on:
`—a computational distinction between selection and construction of solutions;
`—a relational breakdown of concepts, distinguishing between abstraction and
`heuristic association and between subtype and cause,
`thus revealing the
classification nature of inference chains; and
`—a categorization of problems in terms of synthesis and analysis of systems in
`the world, allowing us to characterize inference in terms of a sequence of
`classifications involving some system.
`The main result of the study is the model of heuristic classification, which
`turns out to be a common problem-solving method in expert systems. Identify-
`ing this computational method is not to be confused with advocating its use.
`Instead, by giving it a name and characterizing it, we open the way to
`describing when it
`is applicable, contrasting it with alternative methods, and
`deliberately using it again when appropriate.
As one demonstration of the value of the model, classification in well-known
medical and electronic diagnosis programs is described in some detail, contras-
ting different perspectives on what constitutes a diagnostic solution and
different methods for controlling inference to derive coherent solutions. Indeed,
`an early motivation for the study was to understand how NEOMYCIN, a medical
`diagnostic program, could be generalized. The resulting tool, called HERACLES
`(roughly standing for ‘Heuristic Classification Shell’) is described briefly, with a
`critique of its capabilities in terms of the larger model that has emerged.
`In the final sections of the paper, we reflect on the adequacy of current
`knowledge engineering tools,
`the nature of a knowledge-level analysis, and
`related research in psychology and artificial
`intelligence. There are several
`strong implications for the practice of building expert systems, designing new
`tools, and continued research in this field. Yet to be delivered, but promised by
`the model, are explanation and teaching programs tailored to the heuristic
`classification model, better knowledge acquisition programs, and demon-
`stration that
`thinking in terms of heuristic classification makes it easier to
`choose problems and build new expert systems.
`
`2. The Heuristic Classification Method Defined
`
We develop the idea of the heuristic classification method by starting with the
common sense notion of classification and relating it to the reasoning that
occurs in heuristic programs.
`
`
`
`
`
`
`2.1. Simple classification
`
`As the name suggests, the simplest kind of classification is identifying some
`unknown object or phenomenon as a member of a known class of objects,
`events, or processes. Typically,
`these classes are stereotypes that are hierar-
`chically organized, and the process of identification is one of matching obser-
`vations of an unknown entity against features of known classes. A paradigmatic
`example is identification of a plant or animal, using a guidebook of features,
such as coloration, structure, and size. MYCIN solves the problem of identifying
an unknown organism from laboratory cultures by matching culture in-
formation against a hierarchy of bacteria (Fig. 2.1).²
[FIG. 2.1. MYCIN's classification of bacteria.]

²For simplicity, we will refer to classification hierarchies throughout this paper, though in
practice these structures are not trees, but almost always 'tangled' structures with some nodes
having multiple parents.
`The essential characteristic of classification is that the problem solver selects
from a set of pre-enumerated solutions. This does not mean, of course, that the
`‘right answer’ is necessarily one of these solutions, just that the problem solver
`will only attempt to match the data against the known solutions, rather than
`construct a new one. Evidence can be uncertain and matches partial, so the
`output might be a ranked list of hypotheses. Besides matching,
`there are
`several rules of inference for making assertions about solutions. For example,
`evidence for a class is indirect evidence that one of its subtypes is present.
`
`
`
2.2. Data abstraction

In the simplest problems, data are solution features, so the matching process is
direct. For example, an unknown organism in MYCIN can be classified directly
given the supplied data of Gram stain and morphology. The features 'Gram
stain negative' and 'rod-shaped' match a class of organisms. The solution might
be refined by getting information that allows subtypes to be discriminated.
`For many problems, solution features are not supplied as data, but are
`inferred by data abstraction. There are three basic relations for abstracting data
`in heuristic programs:
(1) definitional abstraction, based on essential, necessary features of a
concept ("if the structure is a one-dimensional network, then its shape is a
beam");
`(2) qualitative abstraction, a form of definition involving quantitative data,
`usually with respect to some normal or expected value (“if the patient is an
`adult and white blood count is less than 2500, then the white blood count is
`low"); and
`(3) generalization in a subtype hierarchy (“if the client is a judge, then he is
`an educated person”).
`These interpretations are usually made by the program with certainty; belief
`thresholds and qualifying conditions are chosen so the abstraction is categori-
`cal. It is common to refer to this knowledge as being ‘factual’ or ‘definitional’.
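Read as rules, these three relations are categorical. The sketch below is illustrative only; the thresholds and vocabulary are assumptions of this example, not taken from MYCIN or SACON.

```python
# Illustrative encodings of the three data-abstraction relations.
# Thresholds and terms are assumed for the example.

def definitional_abstraction(structure):
    # Definitional: based on essential, necessary features of a concept.
    if structure.get("shape") == "one-dimensional network":
        return "beam"
    return None

def qualitative_abstraction(patient):
    # Qualitative: quantitative data compared against a normal or expected value.
    if patient.get("age", 0) >= 18 and patient.get("wbc", 10000) < 2500:
        return "low white blood count"
    return None

def generalization(client):
    # Generalization: moving up a subtype hierarchy.
    if client.get("occupation") == "judge":
        return "educated person"
    return None

print(qualitative_abstraction({"age": 35, "wbc": 2000}))   # -> low white blood count
print(generalization({"occupation": "judge"}))             # -> educated person
```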
`
`2.3. Heuristic classification
`
`In simple classification, data may directly match solution features or may match
`after being abstracted. In heuristic classification, solutions and solution features
may also be matched heuristically, by direct, non-hierarchical association with
`some concept
`in another classification hierarchy. For example, MYCIN does
`more than identify an unknown organism in terms of visible features of an
`organism: MYCIN heuristically relates an abstract characterization of the patient
`to a classification of diseases. We show this inference structure schematically,
`followed by an example (Fig. 2.2).
`Basic observations about the patient are abstracted to patient categories,
`which are heuristically linked to diseases and disease categories. While only a
`subtype link with E.coli infection is shown here, evidence may actually derive
`from a combination of inferences. Some data might directly match E.coli
`features (an individual organism shaped like a rod and producing a Gram-
`negative stain is seen growing in a culture taken from the patient). Descriptions
`of
`laboratory cultures (describing location, method of collection, and in-
`cubation) can also be related to the classification of diseases.
`The important
`link we have added is a heuristic association between a
`characterization of the patient (‘compromised host’) and categories of diseases
('gram-negative infection'). Unlike definitional and hierarchical inferences, this
`inference makes a great
`leap. A heuristic relation is uncertain, based on
`assumptions of typicality, and is sometimes just a poorly understood cor-
relation. A heuristic is often empirical, deriving from problem-solving
`
`
`
experience; heuristics correspond to the 'rules of thumb', often associated with
expert systems [35].

FIG. 2.2. Inference structure of MYCIN:
  Patient Data --DATA ABSTRACTION--> Patient Abstractions ==HEURISTIC MATCH==> Disease Classes --REFINEMENT--> Diseases
  Example: WBC < 2.5 --QUALITATIVE--> Low WBC --DEFINITIONAL--> Leukopenia --GENERALIZATION--> Immunosuppressed --GENERALIZATION--> Compromised Host ==HEURISTIC==> Gram-Negative Infection --SUBTYPE--> E.coli Infection
`Heuristics of this type reduce search by skipping over intermediate relations
`(this is why we do not call abstraction relations ‘heuristics’). These associations
`are usually uncertain because the intermediate relations may not hold in the
specific case. Intermediate relations may be omitted because they are un-
`observable or poorly understood. In a medical diagnosis program, heuristics
`typically skip over the causal relations between symptoms and diseases. In
`Section 4 we will analyze the nature of these implicit relations in some detail.
`To summarize,
in heuristic classification, abstracted data statements are
associated with specific problem solutions or features that characterize a solu-
`tion. This can be shown schematically in simple terms (Fig. 2.3).
FIG. 2.3. Inference structure of heuristic classification:
  Data --DATA ABSTRACTION--> Data Abstractions ==HEURISTIC MATCH==> Solution Abstractions --REFINEMENT--> Solutions
`This diagram summarizes how a distinguished set of
`terms (data, data
`abstractions, solution abstractions, and solutions) are related systematically by
`
`
`
different kinds of relations. This is the structure of inference in heuristic
classification. The direction of inference and the relations 'abstraction' and
'refinement' are a simplification, indicating a common ordering (generalizing
data and refining solutions), as well as a useful way of remembering the
classification model. In practice, there are many operators for selecting and
ordering inferences, discussed in Section 6.
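The whole inference structure can be sketched as three composed steps, following the example chain of Fig. 2.2. The rule tables below are illustrative stand-ins with assumed belief values, not MYCIN's knowledge base.

```python
# Sketch of the heuristic classification 'horseshoe': data abstraction,
# heuristic match, then refinement. The rules mimic the example of Fig. 2.2
# and are illustrative only (belief values are assumed).

def abstract(data):
    """Categorical data abstraction (qualitative, definitional, generalization)."""
    abstractions = set()
    if data.get("wbc", 99.0) < 2.5:                # qualitative: WBC < 2.5
        abstractions.add("low WBC")
    if "low WBC" in abstractions:                  # definitional
        abstractions.add("leukopenia")
    if "leukopenia" in abstractions:               # generalization
        abstractions.add("immunosuppressed")
    if "immunosuppressed" in abstractions:         # generalization
        abstractions.add("compromised host")
    return abstractions

HEURISTICS = {   # uncertain, non-hierarchical associations between hierarchies
    "compromised host": [("gram-negative infection", 0.6)],
}

SUBTYPES = {     # refinement within the pre-enumerated solution hierarchy
    "gram-negative infection": ["e.coli infection", "klebsiella infection"],
}

def heuristic_classify(data):
    hypotheses = {}
    for abstraction in abstract(data):
        for solution, belief in HEURISTICS.get(abstraction, []):
            hypotheses[solution] = max(hypotheses.get(solution, 0.0), belief)
            for subtype in SUBTYPES.get(solution, []):         # refinement step
                hypotheses[subtype] = max(hypotheses.get(subtype, 0.0), 0.5 * belief)
    return sorted(hypotheses.items(), key=lambda item: -item[1])

print(heuristic_classify({"age": 40, "wbc": 2.0}))
```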
`
3. Examples of Heuristic Classification
`
`Here we schematically describe the architectures of SACON, GRUNDY, and SOPHIE
III
`in terms of heuristic classification. These are brief descriptions, but reveal
`the value of
`this kind of analysis by helping us to understand what
`the
`programs do. After a statement of the problem, the general inference structure
`and an example inference path are given, followed by a brief discussion. In
looking at these diagrams, note that sequences of classifications can be com-
`posed, perhaps involving simple classification at one stage (SACON) or omitting
`‘abstraction’ or ‘refinement’ (GRUNDY and SACON).
`In Section 4, we will reconsider these examples, in an attempt to understand
`the heuristic classification pattern. Our approach will be to pick apart the ‘inner
`structure’ of concepts and to characterize the kinds of relations that are typically
`
`useful for problem solving.
`
`3.1. SACON
`
`Problem: SACON [5] selects classes of behavior that should be further in-
`vestigated by a structural-analysis simulation program (Fig. 3.1).
`
`Discussion: SACON solves two problems by classification—heuristically analyz-
`ing a structure and then using simple classification to select a program. It
`begins by heuristically selecting a simple numeric model
`for analyzing a
`structure (such as an airplane wing). The numeric model, an equation, produces
`stress and deflection estimates, which the program then qualitatively abstracts
`as behaviors to study in more detail. These behaviors, with additional in-
`formation about the material, definitionally characterize different configura-
`tions of the MARC simulation program (e.g.,
`the inelastic-fatigue program).
`
`
`
[FIG. 3.1. Inference structure of SACON: a structure description is abstracted and heuristically matched to a numeric model; the model's stress and deflection estimates are qualitatively abstracted and, with information about the material, definitionally matched to an analysis program (e.g., the inelastic-fatigue program).]
`
`There is no refinement because the solutions to the first problem are just a
`simple set of possible models, and the second problem is only solved to the
`point of specifying program classes. (In another software configuration system
`we analyzed, specific program input parameters are inferred in a refinement
`step.)
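The two-stage composition can be sketched as follows; all names, rules, and the stubbed prediction step are assumptions of this illustration, not SACON's actual knowledge.

```python
# Sketch of composing two classification steps, in the spirit of SACON: a
# heuristic selection of a numeric model, followed by simple (definitional)
# classification of an analysis program. Names and rules are assumed.

def select_numeric_model(structure):
    # Heuristic step: abstracted structure description -> numeric model.
    if structure.get("shape") == "one-dimensional network":   # abstracts to a beam
        return "beam equation"
    return None

def predict_behavior(model):
    # Stand-in for evaluating the numeric model; returns qualitative
    # abstractions of the stress and deflection estimates.
    return {"stress": "high", "deflection": "fatigue-range"} if model else {}

def select_program(behavior, material):
    # Simple classification: behaviors plus material definitionally
    # characterize a configuration of the simulation program.
    if behavior.get("deflection") == "fatigue-range" and material == "metal":
        return "inelastic-fatigue program"
    return "default analysis program"

model = select_numeric_model({"shape": "one-dimensional network", "size": "large"})
print(select_program(predict_behavior(model), material="metal"))
```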
`
`
`
3.2. GRUNDY
`
`Problem: GRUNDY [78] is a model of a librarian, selecting books a person might
`like to read.
`
`Discussion: GRUNDY solves two classification problems heuristically, classifying
`a reader’s personality and then selecting books appropriate to this kind of
`person (Fig. 3.2). While some evidence for person stereotypes is by data
`abstraction (a JUDGE can be inferred to be an EDUCATED-PERSON),
`other evidence is heuristic (watching no TV is neither a necessary nor sufficient
characteristic of an EDUCATED-PERSON).
`Illustrating the power of a knowledge-level analysis, we discover that the
`people and book classifications are not distinct
`in the implementation. For
`example, ‘fast plots’ is a book characteristic, but in the implementation ‘likes
`fast plots’ is associated with a person stereotype. The relation between a person
`stereotype and ‘fast plots’
`is heuristic and should be distinguished from
`abstractions of people and books. One objective of the program is to learn
`better people stereotypes (user models). The classification description of the
`user modeling problem shows that GRUNDY should also be learning better ways to
`characterize books, as well as improving its heuristics. If these are not treated
`separately, learning may be hindered. This example illustrates why a knowledge-
`level analysis should precede representation.
[FIG. 3.2. Inference structure of GRUNDY: self-description and behavior are heuristically matched to people classes (stereotypes), which are heuristically matched to book classes and refined to particular books. For example, 'watches no TV' suggests the Educated Person stereotype, which suggests books with an intelligent main character, such as "Earth Angels".]

It is interesting to note that GRUNDY does not attempt to perfect the user
model before recommending a book. Rather, refinement of the person stereo-
type occurs when the reader rejects book suggestions. Analysis of other
programs indicates that this multiple-pass process structure is common. For
example, the Drilling Advisor makes two passes on the causes of drill sticking,
considering general, inexpensive data first, just as medical programs commonly
consider the 'history and physical' before laboratory data. The high-level,
abstract structure of the heuristic classification model makes possible these
kinds of descriptions and comparisons.
`
3.3. SOPHIE III
`
`Problem: SOPHIE III [12] classifies an electronic circuit in terms of the component
`that is causing faulty behavior (Fig. 3.3).
`
Discussion: SOPHIE's set of pre-enumerated solutions is a lattice of valid
and faulty circuit behaviors. In contrast with MYCIN, SOPHIE's solutions are
device states and component flaws, not stereotypes of disorders. They are
related causally, not by subtype. Data are not only external device behaviors,
but include internal component measurements propagated by the causal analy-
sis of the LOCAL program. Nevertheless, the inference structure of abstractions,
heuristic relations, and refinement fits the heuristic classification model,
demonstrating its generality and usefulness.

[FIG. 3.3. Inference structure of SOPHIE: local circuit measurements are abstracted to qualitative values at ports, heuristically matched to the behavior at some port of some module in the lattice of valid and faulty behaviors, and refined to a component fault. For example, (VOLTAGE N11 N14) > 31V is qualitatively abstracted to '(VOLTAGE N11 N14) is high', heuristically matched to 'variable voltage reference is high or OK', whose cause is refined to 'Q5 collector open'.]
`demonstrating its generality and usefulness.
`
`4. Understanding Heuristic Classification
`
`The purpose of this section is to develop a principled account of why the
`inference structure of heuristic classification takes the characteristic form we
`have discovered. Our approach is to describe what we have heretofore loosely
`called ‘classes’, ‘concepts’, or ‘stereotypes’ in a more formal way, using the
`conceptual graph notation of Sowa [88].
In this formalism, a concept is described by graphs of typed, usually binary
relations among other concepts. This kind of analysis has its origins in
semantic networks [77], the conceptual dependency notation of Schank et al.
[83], the prototype/perspective descriptions of KRL [7], the classification
hierarchies of KL-ONE [85], as well as the predicate calculus.
`Our discussion has several objectives:
`(1) to relate the knowledge encoded in rule—based systems to structures more
`commonly associated with ‘semantic net’ and ‘frame’ formalisms;
`(2) to explicate what kinds of knowledge heuristic rules leave out (and thus
`their advantages for search efficiency and limitations for correctness); and
`(3) to relate the kinds of conceptual
`relations collectively identified in
`knowledge representation research (e.g.,
`the relation between an individual
`and a class) with the pattern of inference that typically occurs during heuristic
classification problem solving (yielding the characteristic inverted horseshoe
`inference structure of Fig. 2.3).
`One important result of this analysis is a characterization of the ‘heuristic
`relation’ in terms of primitive relations among concepts (such as preference,
`accompaniment, and causal enablement), and its difference from more essen-
`tial,
`‘definitional’ characterizations of concepts.
`In short, we are trying to
`systematically characterize the kind of knowledge that is useful for problem
`solving, which relates to our larger aim of devising useful
`languages for
`encoding knowledge in expert systems.
`
`4.1. Schemas vs. definitions
`
In the case of matching features of organisms (MYCIN) or programs (SACON),
features are essential (necessary), identifying characteristics of the object,
event, or process. This corresponds to the Aristotelian notion of concept
definition in terms of necessary properties.³ In contrast, features may be only
`‘incidental’, corresponding to typical manifestations or behaviors. For example,
`
³Sowa [88] provides a good overview of these well-known philosophical distinctions. See also
[28, 72].
`
`
`
`
E. coli is normally found in certain parts of the body, an incidental property. It
is common to refer to the combination of incidental and defining associations
as a 'schema' for the concept.⁴ Inferences made using incidental associations of
a schema are inherently uncertain. For example, we might infer that a parti-
cular person, because he is educated, likes to read books, but this might not be
true. In contrast, an educated person must, by definition, have learned a great
deal about something (though maybe not a formal academic topic).
The nature of schemas and their representation has been studied extensively
in AI. As stated in the introduction (Section 1), our purpose here is to exploit
this research to understand the knowledge contained in rules. We are not
advocating one representation over another; rather we just want to find some
way of writing down knowledge so that we can detect and express patterns. We
use the conceptual graph notation of Sowa because it is simple and it makes
basic distinctions that we find to be useful:
`
- A schema is made up of coherent statements mentioning a given concept, not
a list of isolated, independent features. (A statement is a complete sentence.)
- A schema for a given concept contains relations to other concepts, not just
'attributes and values' or 'slots and values'.
- A concept is typically described from different points of view by a set of
schemata (called a 'schematic cluster'), not a single 'frame'.
- The totality of what people know about a concept usually extends well
beyond the schemas that are pragmatically encoded in programs for solving
limited problems.
`Finally, we adopt Sowa‘s definition of a prototype as a ‘typical individual’, a
`specialization of a concept schema to indicate typical values and characteristics,
`where ranges or sets are described for the class as a whole. Whether a program
`uses prototype or schema descriptions of its solutions is not important to our
`discussion, and many may combine them, including ‘normal’ values, as well as a
`spectrum of expectations.
`
4.2. Alternative encodings of schemas
`
`To develop the above points in some detail, we will consider a conceptual
graph description and how it relates to typical rule-based encodings. Fig. 4.1
shows how knowledge about the concept 'cluster headache' is described using
the conceptual graph notation.⁵
`
⁴Here we use the word 'schema' as a kind of knowledge, not a construct of a particular
`programming language or notation. See [49] for further discussion of this distinction.
⁵One English translation would be: "A cluster headache is a headache that occurs with a
`frequency in clusters, experienced by an older man, accompanied by lacrimation, with charac-
`teristic severe, of location unilateral, occurring at a point in time of early sleep."
`
`
`
[HEADACHE] -> (CHRC) -> [SEVERE]
[HEADACHE] -> (LOC)  -> [UNILATERAL]
[HEADACHE] -> (PTIM) -> [EARLY-SLEEP]
[HEADACHE] -> (FREQ) -> [CLUSTERED]
[HEADACHE] -> (ACCM) -> [LACRIMATION]
[HEADACHE] -> (EXPR) -> [OLDER-MAN]

[EARLY-SLEEP] is [TIME: [STATE: [SLEEP]] -> (AFTER) -> [TIME-PERIOD: @few-hrs]]
[CLUSTERED] is [DAILY] <- (FREQ) <- [EVENT] -> (DURATION) -> [TIME-PERIOD: @1week]
[OLDER-MAN] is [MAN] -> (CHRC) -> [OLD]

FIG. 4.1. Schema describing the concept CLUSTER-HEADACHE and some related concepts.

Concepts appear in brackets; relations are in parentheses. Concepts are also
related by a type hierarchy, e.g., a HEADACHE is a kind of PROCESS, an
`
`OLDER-MAN is a kind of MAN. Relations are constrained to link concepts of
`particular types, e.g., PTIM, a point in time, links a PROCESS to a TIME. For
`convenience, we can also use Sowa‘s linear notation for conceptual graphs.
Thus, OLDER-MAN can be described as a specialization of MAN, "a man
with characteristic old". CLUSTERED is "an event occurring daily for a
week". EARLY-SLEEP is "a few hours after the state of sleep".
`We make no claim that a representation of this kind is complete, com-
`putationally tractable, or even unambiguous. For our purposes here, it is simply
`a notation with the advantage over English prose of systematically revealing
`how what we know about a concept can be (at
`least partially) described in
`terms of its relations to concepts of other types.
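As a purely illustrative rendering, such a graph can be written down as a small data structure. The relation and concept names below follow Fig. 4.1 (the LOC label is inferred from the English reading), but the tuple encoding itself is an assumption of this sketch, not Sowa's notation or any particular system.

```python
# Illustrative encoding of the CLUSTER-HEADACHE schema of Fig. 4.1 as typed,
# binary relations among concepts. The tuple format is an assumption of this
# sketch; LOC (location) is inferred from the English translation of the figure.

TYPE_HIERARCHY = {          # concept -> supertype
    "HEADACHE": "PROCESS",
    "OLDER-MAN": "MAN",
}

CLUSTER_HEADACHE_SCHEMA = [  # (concept, relation, concept)
    ("HEADACHE", "CHRC", "SEVERE"),        # characteristic: severe
    ("HEADACHE", "LOC",  "UNILATERAL"),    # location: unilateral (assumed label)
    ("HEADACHE", "PTIM", "EARLY-SLEEP"),   # point in time: early sleep
    ("HEADACHE", "FREQ", "CLUSTERED"),     # frequency: in clusters
    ("HEADACHE", "ACCM", "LACRIMATION"),   # accompaniment: lacrimation
    ("HEADACHE", "EXPR", "OLDER-MAN"),     # experiencer: an older man
]

def related(schema, concept):
    """Return the (relation, concept) pairs attached to a given concept."""
    return [(rel, obj) for subj, rel, obj in schema if subj == concept]

print(related(CLUSTER_HEADACHE_SCHEMA, "HEADACHE"))
```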
`For contrast, consider how this same knowledge might be encoded in a
`notation based upon objects, attributes, and values, as in MYCIN. Here,
`the
`object would be the PATIENT, and typical attributes would be HEADACHE-
ONSET (with possible values EARLY-MORNING, EARLY-SLEEP, LATE-AFTERNOON)
and DISORDER (with possible values CLUSTER-HEADACHE, INFECTION, etc.).
A typical rule might be, "If the patient has
`headache onset during sleep,
`then the disorder of
`the patient
`is cluster
`headache.” The features of a cluster headache m