WO 02/088926
far more likely to produce a complex family of inter-related concepts with ad-hoc
exceptions. More likely, due to the total domain of discourse being so broad, an ontology
produced in this manner will be extremely context sensitive, leading to many
possibilities for introducing ambiguities and contradictions.
Taking a leaf from our earlier philosophy of simplification through abstraction
layering, we instead choose to define a set of ontologies: one per inter-layer boundary.
Figure 7 indicates these ontologies as curved arrows to the left of the agent stack.
The communication of factual knowledge to IAs in the first level of abstraction is
represented by means of a simple ontology of facts (called the Level 1 Shapes Vector
Ontology). All agents described within this portion of the specification make use of
this mechanism to receive their input. It is worthwhile noting that the knowledge
domain defined by this ontology is quite rigidly limited to incorporate only a universe
of facts -- no higher-level concepts or meta-concepts are expressible in this ontology.
This simplified knowledge domain is uniform enough that a reasonably clean set of
ontological primitives can be concisely described.
Interaction between IAs is strictly limited to avoid the possibility of ambiguity. An
agent may freely report outcomes to the Shapes Vector Event Delivery sub-system,
but inter-IA communication is only possible between agents at adjacent layers in the
architecture. It is specifically prohibited for any agent to exchange knowledge with a
“peer” (an agent within the same layer). If communication is to be provided between
peers, it must be via an intermediary in an upper layer. The reasons underlying these
rules of interaction are principally that they remove chances for ambiguity by forcing
consistent domain-restricted universes of discourse (see below). Furthermore, such
restrictions allow for optimised implementation of the Knowledge Architecture.
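The interaction rules above can be illustrated with a minimal sketch. The `Agent` class, the integer `layer` field and the function name `may_exchange_knowledge` are hypothetical, introduced here only for illustration; the specification does not prescribe this representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    layer: int  # hypothetical layer index; 1 is the lowest layer


def may_exchange_knowledge(sender: Agent, receiver: Agent) -> bool:
    """Permit inter-IA communication only between adjacent layers."""
    return abs(sender.layer - receiver.layer) == 1


sensor_a = Agent("sensor_a", layer=1)
sensor_b = Agent("sensor_b", layer=1)
fusion = Agent("fusion", layer=2)

print(may_exchange_knowledge(sensor_a, sensor_b))  # peers in the same layer
print(may_exchange_knowledge(sensor_a, fusion))    # adjacent layers
```

Under this sketch, two peers that need to share knowledge would both talk to `fusion`, the intermediary in the layer above.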
One specific optimisation made possible by these constraints -- largely due to their
capacity to avoid ambiguity and context -- is that basic factual knowledge may be
represented in terms of traditional context-free relational calculus. This permits the
use of relational database technology in storage and management of knowledge.
Thus, for simple selection and filtering procedures on the knowledge base we can
utilise well known commercial mechanisms which have been optimised over a
number of years rather than having to build a custom knowledge processor inside each
intelligent agent.
Note that we are not suggesting that knowledge processing and retrieval is not
required in an IA. Rather, by specifying certain requirements in a relational
calculus (SQL is a preferable language), the database engine assists by undertaking a
filtering process when presenting a view for processing by the IA. Hence the IA can
potentially reap considerable benefits by only having to process the (considerably
smaller) subset of the knowledge base which is relevant to the IA. This approach
becomes even more appealing when we consider that the implementation of choice
for Intelligent Agents is typically a logic language such as Prolog. Such environments
may incur significant processing delays due to the heavy stack based nature of
processing on modern Von Neumann architectures. However, by undertaking early
filtering processes using optimised relational engines and a simple knowledge
structure, we can minimise the total amount of data that is input into potentially time
consuming tree and stack-based computational models.
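As a hedged illustration of this early filtering step, the sketch below stores simple facts in a relational table and lets the database engine select the subset relevant to one agent. The table layout (`facts` with `subject`, `predicate`, `object` columns), the sample rows and the helper `view_for_agent` are assumptions made for the example; SQLite is used only because it ships with Python and is not named in the specification.

```python
import sqlite3

# In-memory knowledge base holding context-free facts (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts ("
    "id INTEGER PRIMARY KEY, subject TEXT, predicate TEXT, object TEXT)"
)
conn.executemany(
    "INSERT INTO facts (subject, predicate, object) VALUES (?, ?, ?)",
    [
        ("host_a", "sent_telnet_to", "main_server"),
        ("host_b", "sent_http_to", "web_server"),
        ("host_c", "sent_telnet_to", "main_server"),
    ],
)


def view_for_agent(connection, predicate):
    """Let the database engine filter; return only the rows the agent needs."""
    cursor = connection.execute(
        "SELECT subject, predicate, object FROM facts "
        "WHERE predicate = ? ORDER BY id",
        (predicate,),
    )
    return cursor.fetchall()


# The agent's (slower) inference engine now sees two rows instead of three.
telnet_view = view_for_agent(conn, "sent_telnet_to")
print(telnet_view)
```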
The placement of intelligent agents within the various layers of the knowledge
architecture is decided based upon the abstractions embodied within the agent and
the knowledge transforms provided by the agent. Two criteria are considered in
determining whether a placement at layer n is appropriate:
* would the agent be context sensitive in the level n ontology? If so, it should be split
into two or more agents.
* does the agent perform data fusion from one or more entities at level n? If so it must
be promoted to at least level n+1 (to adhere to the requirement of no “horizontal”
`2.2 A Note on the Tardis
A more detailed description of the Tardis is provided in part 5 of the specification.
The Tardis connects the IA Gestalt to the real-time visualisation system. It also
controls the system’s notion of time in order to permit facilities such as replay and
visual or other analysis anywhere along the temporal axis from the earliest data still
stored to the current real world time.
The Tardis is unusual in its ability to connect an arbitrary semantic or deduction to a
visual event. It does this by acting as a very large semantic patch-board. The basic
premise is that for every agreed global semantic (e.g. X window packet arrived
[attribute list]) there is a specific slot in an infinite sized table of globally agreed
`semantics. For practical purposes, there are 2
`slots and therefore the current
`maximum number of agreed semantics available in our environment. No slot, once
`assigned a semantic, is ever reused for any other semantic. Agents that arrive at a
deduction, which matches the slot semantic, simply queue an event into the slot. The
`visual system is profiled to match visual events with slot numbers. Hence visual
`events are matched to semantics.
As for the well-known IP numbers and Ethernet addresses, the Shapes Vector strategy
is to have incremental assignment of semantics to slots. Various taxonomies etc. are
being considered for slot grouping. As the years go by, it is expected that some slots
will fall into disuse as the associated semantic is no longer relevant, while others are
added. It is considered highly preferable, for obvious reasons, that no slot be reused.
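The patch-board idea can be sketched as follows. This is a toy model under stated assumptions: the class name `PatchBoard`, its methods, and the `capacity` value of 1024 are all hypothetical, and the real slot count and event format are defined elsewhere in the specification. The sketch shows only incremental slot assignment, the no-reuse property, and agents queueing events that the visual system later reads by slot number.

```python
from collections import deque


class PatchBoard:
    """Toy semantic patch-board: one slot per agreed semantic, never reused."""

    def __init__(self, capacity):
        self.capacity = capacity      # hypothetical limit on slot count
        self._slot_of = {}            # semantic -> slot number
        self._queues = {}             # slot number -> queued events
        self._next_slot = 0           # only ever increases, so no slot is reused

    def assign(self, semantic):
        """Give a semantic the next free slot (incremental assignment)."""
        if semantic in self._slot_of:
            return self._slot_of[semantic]
        if self._next_slot >= self.capacity:
            raise RuntimeError("no unassigned slots remain")
        slot = self._next_slot
        self._next_slot += 1
        self._slot_of[semantic] = slot
        self._queues[slot] = deque()
        return slot

    def queue_event(self, semantic, attributes):
        """Called by an agent whose deduction matches the slot semantic."""
        self._queues[self._slot_of[semantic]].append(attributes)

    def drain(self, slot):
        """Called by the visual system, which is profiled by slot number."""
        events = list(self._queues[slot])
        self._queues[slot].clear()
        return events


board = PatchBoard(capacity=1024)
slot = board.assign("X window packet arrived")
board.queue_event("X window packet arrived", {"source": "host_a"})
events = board.drain(slot)
```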
`As mentioned, further discussion about the Tardis and its operation can be found in
`part 5 of the specification.
`3. Inferencing Strategies
The fundamental inferencing strategy underlying Shapes Vector is to leave inductive
inferencing as the province of the (human) user and deductive inferencing as typically
the province of the IAs. It is expected that a user of the system will examine
deductive inferences generated by a set of IAs, coupled with visualisation, in order to
arrive at an inductive hypothesis. This separation of duties markedly simplifies the
implementation strategies of the agents themselves. Nevertheless, we propose further
aspects that may produce a very powerful inferencing system.
`3.1 Traditional
Agents can employ either forward chaining or backward chaining, depending on the
role they are required to fulfil. For example, some agents continuously comb their
views of the knowledge base in attempts to form current, up to date, deductions that
are as “high level” as possible. These agents employ forward chaining and typically
inhabit the lower layers of the agent architecture. Forward chaining agents also may
have data stream inputs from low level “sensors”. Based on these and other inputs, as
well as a set of input priorities, these agents work to generate warnings when certain
security-significant deductions become true.
Another set of agents within the Shapes Vector system will be backward chaining
(goal driven) agents. These typically form part of the “User Avatar Set”: a collection of
knowledge elements, which attempt to either prove or disprove user queries
(described more fully in Section 8 of this part).
`3.2 Possiblistic
In executing the possiblistic features incorporated into the level 2 ontology (described
`in Section 7.1 of this part), agents may need to resort to alternative logics. This is
`implied by the inherent multi-valued nature of the possiblistic universe. Where a
`universe of basic facts can be described succinctly in terms of a fact existing or not
`existing, the situation is more complex when symbolic possibility is added. For our
`formulation we chose a three-valued possiblistic universe, in which a fact may be
`existent, non-existent, or possibly existent.
To reason in such a universe we adopt two different algebras. The first is a simple
extension of the basic principle of unification common to computational logic. Instead
of the normal assignation of successful unification to existence and unsuccessful
unification to non-existence, we adopt the following:
* successful unification implies existence,
* the discovery of an explicit fact which precludes unification implies non-existence
(this is referred to as a hard fail),
* unsuccessful unification without an explicit precluding case implies possible
existence (this is referred to as a soft fail).
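The three outcomes listed above can be sketched in a few lines. This is a simplified illustration, not the agents' actual logic engine: plain set membership stands in for real unification, and the names `Existence`, `evaluate`, `known` and `precluded` are hypothetical.

```python
from enum import Enum


class Existence(Enum):
    EXISTENT = "existent"
    NON_EXISTENT = "non-existent"
    POSSIBLY_EXISTENT = "possibly existent"


def evaluate(query, known_facts, precluding_facts):
    """Three-valued outcome; set membership stands in for real unification."""
    if query in known_facts:
        return Existence.EXISTENT            # successful unification
    if query in precluding_facts:
        return Existence.NON_EXISTENT        # hard fail: explicitly precluded
    return Existence.POSSIBLY_EXISTENT       # soft fail: nothing rules it out


# Hypothetical facts: one asserted, one explicitly precluded.
known = {("is-logged-into", "boris", "main_server")}
precluded = {("is-logged-into", "alice", "main_server")}

print(evaluate(("is-logged-into", "boris", "main_server"), known, precluded))
print(evaluate(("is-logged-into", "alice", "main_server"), known, precluded))
print(evaluate(("is-logged-into", "carol", "main_server"), known, precluded))
```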
A second algebra, which may be used to reason in the possiblistic universe, involves a
technique known as “predicate grounding” in which a user-directed pruning of a
unification search allows for certain specified predicates to be ignored (grounded)
when possibilities are being evaluated.
`3.3 Vectors
`Agents operating at higher levels of the Shapes Vector Knowledge Architecture may
`require facilities for reasoning about uncertain and/or incomplete information in a
`more continuous knowledge domain. Purely traditional forward or backward
chaining does not easily express such reasoning, and the three-valued possiblistic
logic may lack the necessary quantitative features desired. To implement such agents
`an alternative inferencing strategy is used based upon notions of vector algebra in a
`multi-dimensional semantic space. This alternative strategy is employed in
`conjunction with more conventional backward chaining techniques. The use of each of
`the paradigms is dependent on the agent, and the domain of discourse.
Our vector-based approach to inferencing revolves around constructing an abstract
space in which relevant facts and deductions may be represented by geometrical
analogues (such as points and vectors), with the proper algebraic relationships
holding true. In general, the construction of such a space for a large knowledge
domain is extremely difficult. For Shapes Vector, we adopt a simplifying strategy of
constructing several distinct deductive spaces, each limited to the (relatively small)
domain of discourse of a single intelligent agent. The approach is empirical and is
only feasible if each agent is restricted to a very small domain of knowledge so that
construction of its space is not overly complex.
The definition of the deductive space for an IA is a methodical and analytical process
undertaken during the design of the agent itself. It involves a consideration of the set
of semantic concepts (“nouns”) which are relevant to the agent, and across which the
agent’s deductions operate. Typically this concept set will contain elements of the
agent’s layer ontology as well as nouns which are meaningful only within the agent
itself. Once the agent’s concept set has been discovered, we can identify within it a
subset of ‘base nouns’ -- concepts which cannot be defined in terms of other members
of the set. This identification is undertaken with reference to a semi-formal
‘connotation spectrum’ (a comparative metric for ontological concepts).
`Such nouns have two important properties:
`* each is semantically orthogonal to every other base noun, and
* every member of the concept set which is not a base noun can be described as a
combination of two or more base nouns.
Collectively, an IA’s set of n base nouns defines an n-dimensional semantic space (in
which each base noun describes an axis). Deductions relevant to the agent constitute
points within this space; the volume bounded by spatial points for the full set of agent
deductions represents the sub-space of possible outputs from that agent. A rich set of
broad-reaching deductions leads to a large volume of the space being covered by the
agent, while a limited deduction set results in a very narrow agent of more limited
utility (but easier to construct). Our present approach to populating the deductive
space is purely empirical, driven by human expert knowledge. The onus is thus upon
the designer of the IA to generate a set of deductions, which (ideally) populate the
space in a uniform manner.
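A minimal sketch of such a space, assuming three made-up base nouns, is given below. The nouns `host`, `service` and `user`, and the helpers `as_point` and `dot`, are hypothetical; the point is only that each base noun supplies one axis, the axes are mutually orthogonal, and any other concept is a combination of two or more base nouns.

```python
# Hypothetical base nouns for one agent; each one defines an axis.
BASE_NOUNS = ("host", "service", "user")


def as_point(**weights):
    """Place a concept in the semantic space by its base-noun components."""
    unknown = set(weights) - set(BASE_NOUNS)
    if unknown:
        raise ValueError(f"not a base noun: {sorted(unknown)}")
    return tuple(float(weights.get(noun, 0.0)) for noun in BASE_NOUNS)


def dot(u, v):
    """Dot product; zero means the two vectors are orthogonal."""
    return sum(a * b for a, b in zip(u, v))


# One unit vector per base noun.
axes = [as_point(**{noun: 1.0}) for noun in BASE_NOUNS]

# A non-base concept described as a combination of two base nouns.
login = as_point(host=1.0, user=1.0)
```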
In reality, the set of deductions that inhabit the space can become quite non-uniform
(“clumpy”) given this empirical approach. Hence rigorous constraint on the domain
covered by an agent is entirely appropriate. Of course this strategy requires an
appropriate mechanism at a higher abstract layer. However, the population of a
higher layer agent can utilise the agents below them in a behavioural manner thereby
treating them as sub-spaces.
Once an agent’s deductive space has been constructed and populated with deductions
(points), it may be used to draw inferences from observed facts. This is achieved by
representing all available and relevant facts as vectors in the multi-dimensional
semantic space and considering how these vectors are located with respect to
deduction points or volumes. A set of fact vectors, when added using vector algebra,
may precisely reach a deduction point in the space. In that situation, a deductive
inference is implied. Alternatively, even in the situation where no vectors or
combinations of vectors precisely inhabit a deduction point, more uncertain
reasoning can be performed using mechanisms such as distance metrics. For example,
it may be implied that a vector, which is “close enough” to a deduction point, is a
weak indicator of that deduction. Furthermore, in the face of partial data, vector
techniques may be used to home in on inferences by identifying facts (vectors),
currently not asserted, which would allow for some significant deduction to be
drawn. Such a situation may indicate that the system should perhaps direct extra
resources towards discovering the existence (or otherwise) of a key fact.
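The mechanism just described can be sketched in a two-dimensional space. Everything specific here is an assumption made for illustration: the Euclidean distance metric, the `tolerance` value, the deduction names and coordinates, and the function names `infer` and `missing_vector`.

```python
import math


def add(*vectors):
    """Component-wise vector sum."""
    return tuple(sum(components) for components in zip(*vectors))


def infer(fact_vectors, deductions, tolerance):
    """Sum the fact vectors and compare the result with each deduction point."""
    total = add(*fact_vectors)
    results = []
    for name, point in deductions.items():
        d = math.dist(total, point)  # Euclidean distance (an assumed metric)
        if math.isclose(d, 0.0, abs_tol=1e-9):
            results.append((name, "deduced"))
        elif d <= tolerance:
            results.append((name, "weakly indicated"))
    return results


def missing_vector(fact_vectors, deduction_point):
    """Fact vector that, if asserted, would reach the deduction point exactly."""
    total = add(*fact_vectors)
    return tuple(p - t for p, t in zip(deduction_point, total))


# Hypothetical deduction points in a 2-D semantic space.
deductions = {"port_scan": (2.0, 1.0), "file_transfer": (0.0, 3.0)}

exact = infer([(1.0, 0.0), (1.0, 1.0)], deductions, tolerance=0.5)
near = infer([(1.0, 0.0), (1.0, 0.7)], deductions, tolerance=0.5)
gap = missing_vector([(1.0, 0.0)], deductions["port_scan"])
```

In this sketch `gap` is the unasserted fact that would complete the `port_scan` deduction, which is the kind of result that might prompt the system to look for a key fact.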
The actual inferencing mechanism to be used within higher-level Shapes Vector
agents is slightly more flexible than the scheme we have described above. Rather than
simply tying facts to vectors defined in terms of the IA’s base nouns, we can define an
independent but spatially continuous ‘fact space’. Figure 8 demonstrates the concept:
a deductive space has been defined in terms of a set of base nouns relevant to the IA.
Occupying the same spatial region is a fact space, whose axes are derived from the
agent's layer ontology. Facts are defined as vectors in this second space: that is, they
are entities fixed with respect to the fact axes. However, since the fact space and
deduction space overlap, these fact vectors also occupy a location with respect to the
base noun axes. It is this location which we use to make deductive inferences based
upon fact vectors. Thus, in the Figure, the fact that the observed fact vector (arrow) is
close to one of the deductions (dots) may allow for assertion of that deduction with a
particular certainty value (a function of exactly how close the vector is to the
deduction point). Note that, since the axes of the fact space are independent of the
axes of the deductive space, it is possible for the former to vary (shift, rotate and/or
translate, perhaps independently) with respect to the latter. If such a variation occurs,
fact vectors (fixed with regard to the fact axes) will have different end-points in
deduction-space. Therefore, after such a relative change in axes, a different set of
deductions may be inferred with different confidence ratings. This mechanism of
semantic relativity may potentially be a powerful tool for performing deductive
inferencing in a dynamically changing environment.
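A two-dimensional sketch of this semantic relativity follows. It assumes that the relative change in axes is a rotation plus an optional translation, and that certainty is `1 / (1 + distance)`; both choices, along with all the names and coordinates, are illustrative assumptions and are not taken from the specification.

```python
import math


def to_deduction_space(fact_vec, angle, offset=(0.0, 0.0)):
    """Express a fact vector (fixed in fact axes) in deduction-space coordinates.

    `angle` is the rotation of the fact axes relative to the deduction axes,
    and `offset` is their translation.
    """
    x, y = fact_vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + offset[0], s * x + c * y + offset[1])


def certainty(point, deduction):
    """Assumed certainty function: 1 at the deduction point, falling with distance."""
    return 1.0 / (1.0 + math.dist(point, deduction))


def best_deduction(fact_vec, deductions, angle):
    """Return the (name, certainty) pair of the closest deduction point."""
    point = to_deduction_space(fact_vec, angle)
    scored = ((name, certainty(point, d)) for name, d in deductions.items())
    return max(scored, key=lambda item: item[1])


deductions = {"A": (1.0, 0.0), "B": (0.0, 1.0)}
fact = (1.0, 0.0)  # fixed with respect to the fact axes

before = best_deduction(fact, deductions, angle=0.0)
after = best_deduction(fact, deductions, angle=math.pi / 2)
```

The same fact vector supports deduction `A` before the axes rotate and deduction `B` afterwards, which is the effect the paragraph above describes.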
An interesting aspect of the preferred approach to vector-based deductive inference is
that it is based fundamentally upon ontological concepts, which can in turn be
expressed as English nouns. This has the effect that the deductions made by an agent
will resemble simple sentences in a very small dialect of pseudo-English. This
language may be a useful medium for a human to interact with the agent in a
relatively natural fashion.
While the inferencing strategy described above has some unorthodox elements in its
approach to time-varying probabilistic reasoning for security applications, there are
more conventional methods that may be used within Shapes Vector IAs in the
instance that the method falls short of its expected deductive potential. Frame based
systems offer one well understood (although inherently limited) alternative paradigm.
Indeed, it is expected that some IAs will be frame based in any case (obtained off the
shelf and equipped with ontology to permit knowledge transfer with the knowledge
As described above, the vector-based deductive engine is able to make weak
assertions of a deduction with an associated certainty value (based on distances in
n-dimensional space). This value can be interpreted in a variety of ways to achieve
different flavours of deductive logic. For example, the certainty value could
potentially be interpreted as a probability of the assertion holding true, derived from a
consideration of the current context and encoded world knowledge. Such an
interpretation delivers a true probabilistic reasoning system. Alternatively, we could
potentially consider a more rudimentary interpretation wherein we consider
assertions with a certainty above a particular threshold (e.g. 0.5) to be “possible”
within a given context. Under these circumstances, our system would deliver a
possiblistic form of reasoning. Numerous other interpretations are also possible.
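The two interpretations can be contrasted in a short sketch. The function names are hypothetical, certainty is assumed to lie in the range 0 to 1, and “above a particular threshold” is read as a strict comparison.

```python
def as_probability(certainty):
    """Probabilistic reading: the certainty value is used directly."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must lie in [0, 1]")
    return certainty


def is_possible(certainty, threshold=0.5):
    """Possiblistic reading: the assertion is 'possible' above the threshold."""
    return as_probability(certainty) > threshold


print(as_probability(0.8), is_possible(0.8))
print(as_probability(0.3), is_possible(0.3))
```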
`Inferencing for Computer Security Applications
As presented, our IA architecture is appropriate to knowledge processing in any
number of domains. To place the work into the particular context for which it is
primarily intended, we will now consider a simple computer security application of
this architecture.
One common, but often difficult, task facing those charged with securing a computer
network is detecting access of network assets which appears authorised (e.g., the user
has the proper passwords etc.) but is actually malicious. Such access incorporates the
so-called “insider threat” (i.e., an authorised user misusing their privileges) as well as
the situation where confidentiality of the identification system has been compromised
(e.g., passwords have been stolen). Typically, Intrusion Detection Systems are not
good at detecting such security breaches, as they are purely based on observing
signatures relating to improper use or traffic.
Shapes Vector’s comprehensive inferencing systems allow it to deduce a detailed
semantic model of the network under consideration. This model, coupled with a user’s
inductive reasoning skills, permits detection of such misuse even in the absence of any
This application of Shapes Vector involves constructing a Gestalt of Intelligent Agents
that are capable of reasoning about relatively low-level facts derived from the
network. Typically these facts would be in the form of observations of traffic flow on
the network. Working collaboratively, the agents deduce the existence of computers
on the network and their intercommunication. Other agents also deduce attributes of
the computers and details of their internal physical and logical states. This
information serves two purposes: one is to build up a knowledge base concerning the
network, and another is to facilitate the visualisation of the network. This latter output
from the agents is used to construct a near real-time 3D visualisation showing the
computers and network interfaces known to exist and their interconnection. Overlaid
onto this “map” is animation denoting the traffic observed by the agents, classified
according to service type.
Observing such a Shapes Vector visualisation, a user may note some visual aspect that
they consider atypical. For example, the user may note a stream of telnet
packets (which itself might be quite normal) traversing the network between the
primary network server and a node which the visualisation shows as only a network
interface. The implications of such an observation are that a node on the network is
generating a considerable body of data, but this data is formatted such that none of
the Shapes Vector agents can deduce anything meaningful about the computer issuing
the traffic (thus no computer shape is visualised, just a bare network interface).
The human user may consider this situation anomalous: given their experience of the
network, most high volume traffic emitters are identified quickly by one or more of
the various IAs. While the telnet session is legitimate, in as much as the proper
passwords have been provided, the situation bears further investigation.
To probe deeper, the User Avatar component of Shapes Vector, described more fully
in Section 8 in Part 2 of the specification, can be used to directly query the detailed
knowledge base the agents have built up behind the (less-detailed) visualisation.
The interaction in this situation might be as follows:
human> answer what User is-logged-into Computer “MainServer”?
`gestalt> Relationship is-logged-into [User Boris, Computer MainServer]
This reveals a user name for the individual currently logged into the server. A further
interaction might be:
human> find all User where id=“Boris”?
gestalt> Entity User (id=Boris, name=“Boris Wolfgang”, type=“guest user”)
An agent has deduced at some stage of knowledge processing that the user called
Boris is logged in using a guest user account. The Shapes Vector user would be aware
that this is also suspicious, perhaps eliciting a further question:
human> answer what is-owned-by User Boris?
gestalt> Relationship is-owned-by [File passwords, User Boris]
Relationship is-owned-by [Process keylogger, User Boris]
Relationship is-owned-by [Process passwordCracker, User Boris]
The facts have, again, been deduced by one or more of the IAs during their
`processing of the original network facts. The human user, again using their own
`knowledge and inductive faculties, would become more suspicious. Their level of
`suspicion might be such that they take action to terminate Boris’ connection to the
`main server.
`In addition to this, the user could ask a range of possiblistic and probabilistic
questions about the state of the network, invoking faculties in the agent Gestalt for
`more speculative reasoning.
`3.4 Other Applications
The IA architecture disclosed herein lends itself to other applications. For example, it
is not uncommon for the Defence community to have many databases in just as many
formats. It is very difficult for analysts to peruse these databases in order to gain
useful insight. There has been much effort aimed at considering how particular
databases may be structured in order for analysts to achieve their objectives. The
problem has proved to be difficult. One of the major hurdles is that extracting the
analysts’ needs and codifying them to structure the data leads to different
requirements not only between analysts, but also different requirements depending
on their current focus. One of the consequences is that in order to structure the data
correctly, it must be context sensitive, which a relational database is not equipped to
Shapes Vector can overcome many of the extant difficulties by permitting knowledge
and deduction rules to be installed into an IA. This IA, equipped with a flexible user
interface and strictly defined query language, can then parse the data in a database in
order to arrive at a conclusion. The knowledge rules and analyst-centric processing
are encoded in the IA, not in the structure of the database itself, which can thus
remain context free. The Shapes Vector system allows incremental adjustment of the
IA without having to re-format and restructure a database through enhancement of
the IA, or through an additional IA with relevant domain knowledge. Either the IA
makes the conclusion, or it can provide an analyst with a powerful tool to arrive at
low level deductions that can be used to arrive at the desired conclusion.
`4. Rules for Constructing an Agent
In Section 2 of this part of the specification, several rules governing agents were
mentioned, e.g. no intra-level communication, and each agent must be context free
within its domain of discourse. Nevertheless, there are still a number of issues which
need clarification to see how an agent can be constructed, and some of the resultant
In a preferred arrangement the three fundamental rules that govern the construction
of an agent are:
1. All agents within themselves must be context free;
2. If a context sensitive rule or deduction becomes apparent, then the agent must be
split into two or more agents;
3. No agent can communicate with its peers in the same level. If an agent’s deduction
requires input from a peer, then the agent must be promoted to a higher level, or a
higher level agent constructed which utilises the agent and the necessary peer(s).
In our current implementation of Shapes Vector, agents communicate with other
entities via the traditional UNIX sockets mechanism as an instantiation of a
component control interface. The agent architecture does not preclude the use of third
party agents or systems. The typical approach to dealing with third party systems is
to provide a “wrapper” which permits communication between the system and
Shapes Vector. This wrapper needs to be placed carefully within the agent hierarchy
so that interaction with the third party system is meaningful in terms of the Shapes
Vector ontologies, as well as permitting the wrapper to act as a bridge between the
third party system and other Shapes Vector agents. The wrapper appears as just
another SV agent.
One of the main implications of the wrapper system is that it may not be possible to
gain access to all of the features of a third party system. If the knowledge cannot be
carried by the ontologies accessible to the wrapper, then the knowledge elements
cannot be transported throughout the system. There are several responses to such
1. The wrapper may be placed at the wrong level.
2. The Ontology may be deficient and in need of revision.
3. The feature of the third party system may be irrelevant and therefore no
adjustments are required.
`5. Agents and Time
In this section we discuss the relationship between the operation of agents and time.
The two main areas disclosed are how the logic based implementation of agents can
handle data streams without resorting to an embedded, sophisticated temporal logic,
and the notion of synthetic time in order to permit simulation, and analysis of data
from multiple time periods.
`5.1 Data Streams and IA’s
One of the fundamental problems facing the use of IAs in the Shapes Vector system is
the changing status of propositions. More precisely, under temporal shifts, all “facts”
are predicates rather than propositions. This issue is further complicated when we
consider that typical implementations of an IA do not handle temporal data streams.
We address this problem by providing each IA with a “time aperture” over which it is
currently processing. A user or a higher level agent can set the value of this aperture.
Any output from an IA is only relevant to its time aperture setting (Figure 10). The
aperture mechanism allows the avoidance of issues such as contradictions in facts
over time, as well as providing a finite data set in what is really a data stream. In fact, the
mechanism being implemented in our system permits multiple, non-intersecting
apertures to be defined for data input.
With time apertures, we can “stutter” or “sweep” along the temporal domain in order
to analyse long streams of data. Clearly, there are a number of issues which still must
be addressed. Chief amongst these is the fact that an aperture may be set which does
not cover, or only partially covers, the data set whereby a critical deduction must be made.
Accordingly, strategies such as aperture change and multiple apertures along the
temporal domain must be implemented in order to raise confidence that the relevant
data is input in order to arrive at the relevant deduction.
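The aperture idea and the coverage problem it raises can be sketched as a window moving over a packet stream. The interpretation used here is an assumption: “stutter” is modelled as non-overlapping windows (step equal to the aperture) and “sweep” as a window advancing one item at a time. The names `windows` and `contains_handshake`, and the sample stream, are hypothetical.

```python
def windows(stream, aperture, step):
    """Yield successive apertures of up to `aperture` items, advancing by `step`."""
    if aperture <= 0 or step <= 0:
        raise ValueError("aperture and step must be positive")
    start = 0
    while start < len(stream):
        yield stream[start:start + aperture]
        if start + aperture >= len(stream):
            break  # the end of the stream has been covered
        start += step


def contains_handshake(window):
    """Toy deduction: true if a SYN is immediately followed by an ACK."""
    return any(a == "SYN" and b == "ACK" for a, b in zip(window, window[1:]))


stream = ["x", "SYN", "ACK", "y"]

# "Stutter": the two critical packets fall either side of an aperture boundary.
stutter_hit = any(contains_handshake(w) for w in windows(stream, aperture=2, step=2))

# "Sweep": overlapping apertures catch the pair.
sweep_hit = any(contains_handshake(w) for w in windows(stream, aperture=2, step=1))
```

The stutter pass misses the deduction while the sweep pass finds it, which illustrates why aperture changes or multiple apertures may be needed to raise confidence.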
While we are aware that we can implement apertures in order to supply us with
useful deductions for a number of circumstances, it is still an open question how to
achieve an optimal set of sweep strategies for a very broad class of deductions where
confidence is high that we obtain what we are scanning for. One area which comes to
mind is the natural “tension” between desired aperture settings. For example, an
aperture setting of 180 degrees (i.e., the whole fact space) is desirable as this considers
all data possible in the stream from the beginning of the epoch of capture to the end of
time, or rather the last data captured. However, this setting is impractical from an
implementation point of view, as well as introducing potential contradictions in the
deductive process. On the other hand, a very small aperture is desirable in that
implementation is easy along with fast processing, but can result in critical packets not
being included in the processing scan.
An initial test of an agent which understands portions of the HTTP protocol has yielded
anecdotal evidence that there may be optimum aperture settings for specific domains
of discourse. HTTP protocol data from a large (5GB) corpus were analysed for a large
network. It was shown that an aperture setting of 64 packets produced the largest set
of deductions for the smallest aperture setting while avoiding the introduction of
The optimal aperture setting is of course affected by the data input, as well as the
domain of discourse. However, if we determine that our corpus is representative of
expected traffic, then a default optimal aperture setting is possible for an agent. This
aperture setting need only then be adjusted as required in the presence of
contradicting deductions or for special processing purposes.
`5.2 Temporal Event Mapping for Agents
In the previous section, we discussed how an agent could have time apertures in
order to process data streams. The issue of time is quite important, especially when
considering that it takes a finite amount of time for a set of agents to arrive at a
deduction and present a visualisation. Also, a user may wish to replay events at
different speeds in order to see security relevant patterns. To provide such facilities in
`Shapes Vector, we introduce the notion of a synthetic clock. All entities in the system
`get their current time from the synthetic clock rather than the real system clock. A
`synthetic clock c

