`
`PCT/AU02/00530
`
`50
`
far more likely to produce a complex family of inter-related concepts with ad-hoc exceptions. More likely, due to the total domain of discourse being so broad, an ontology produced in this manner will be extremely context sensitive, leading to many possibilities for introducing ambiguities and contradictions.
`
Taking a leaf from our earlier philosophy of simplification through abstraction layering, we instead choose to define a set of ontologies: one per inter-layer boundary. Figure 7 indicates these ontologies as curved arrows to the left of the agent stack.
`
The communication of factual knowledge to IAs in the first level of abstraction is represented by means of a simple ontology of facts (called the Level 1 Shapes Vector Ontology). All agents described within this portion of the specification make use of this mechanism to receive their input. It is worthwhile noting that the knowledge domain defined by this ontology is quite rigidly limited to incorporate only a universe of facts -- no higher-level concepts or meta-concepts are expressible in this ontology. This simplified knowledge domain is uniform enough that a reasonably clean set of ontological primitives can be concisely described.
`
Interaction between IAs is strictly limited to avoid the possibility of ambiguity. An agent may freely report outcomes to the Shapes Vector Event Delivery sub-system, but inter-IA communication is only possible between agents at adjacent layers in the architecture. It is specifically prohibited for any agent to exchange knowledge with a “peer” (an agent within the same layer). If communication is to be provided between peers, it must be via an intermediary in an upper layer. The reasons underlying these rules of interaction are principally that they remove chances for ambiguity by forcing consistent domain-restricted universes of discourse (see below). Furthermore, such restrictions allow for optimised implementation of the Knowledge Architecture.
`
`Page 745 of 1488
`
`SAMSUNG EXHIBIT 1002 - part 2 of 2
`
`
`
`
`WO 02/088926
`
`
One specific optimisation made possible by these constraints -- largely due to their capacity to avoid ambiguity and context -- is that basic factual knowledge may be represented in terms of traditional context-free relational calculus. This permits the use of relational database technology in storage and management of knowledge. Thus, for simple selection and filtering procedures on the knowledge base we can utilise well known commercial mechanisms which have been optimised over a number of years rather than having to build a custom knowledge processor inside each intelligent agent.
`
Note that we are not suggesting that knowledge processing and retrieval is not required in an IA. Rather, by specifying certain requirements in a relational calculus (SQL is a preferable language), the database engine assists by undertaking a filtering process when presenting a view for processing by the IA. Hence the IA can potentially reap considerable benefits by only having to process the (considerably smaller) subset of the knowledge base which is relevant to the IA. This approach becomes even more appealing when we consider that the implementation of choice for Intelligent Agents is typically a logic language such as Prolog. Such environments may incur significant processing delays due to the heavy stack-based nature of processing on modern Von Neumann architectures. However, by undertaking early filtering processes using optimised relational engines and a simple knowledge structure, we can minimise the total amount of data that is input into potentially time-consuming tree and stack-based computational models.
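The early-filtering idea above can be sketched with an in-memory relational engine. This is a minimal sketch: the fact schema, table name, and predicate values are invented for illustration and are not the actual Shapes Vector ontology.

```python
import sqlite3

# Hypothetical schema: the Level 1 knowledge base stores simple facts as rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("host-a", "sent-packet-to", "host-b"),
        ("host-b", "runs-service", "telnet"),
        ("host-c", "runs-service", "http"),
    ],
)

# The database engine performs the selection/filtering; only the (much
# smaller) relevant view is handed on to the agent's logic engine.
view = conn.execute(
    "SELECT subject, object FROM facts WHERE predicate = ? ORDER BY subject",
    ("runs-service",),
).fetchall()

for subject, service in view:
    print(subject, service)
```

The point of the design is visible even at this scale: the logic engine never sees the `sent-packet-to` rows at all, because the relational engine filtered them out first.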
`
The placement of intelligent agents within the various layers of the knowledge architecture is decided based upon the abstractions embodied within the agent and the knowledge transforms provided by the agent. Two criteria are considered in determining whether a placement at layer n is appropriate:
`
`
* would the agent be context sensitive in the level n ontology? If so, it should be split into two or more agents.

* does the agent perform data fusion from one or more entities at level n? If so, it must be promoted to at least level n+1 (to adhere to the requirement of no “horizontal” interaction).
`
`2.2 A Note on the Tardis
`
A more detailed description of the Tardis is provided in part 5 of the specification.
`
The Tardis connects the IA Gestalt to the real-time visualisation system. It also controls the system’s notion of time in order to permit facilities such as replay and visual or other analysis anywhere along the temporal axis, from the earliest data still stored to the current real world time.
`
The Tardis is unusual in its ability to connect an arbitrary semantic or deduction to a visual event. It does this by acting as a very large semantic patch-board. The basic premise is that for every agreed global semantic (e.g. X window packet arrived [attribute list]) there is a specific slot in an infinite sized table of globally agreed semantics. For practical purposes, there is a fixed number of slots (a power of two), which therefore sets the current maximum number of agreed semantics available in our environment. No slot, once assigned a semantic, is ever reused for any other semantic. Agents that arrive at a deduction which matches the slot semantic simply queue an event into the slot. The
`
`
`visual system is profiled to match visual events with slot numbers. Hence visual
`
`events are matched to semantics.
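The patch-board mechanism above can be sketched as a table of per-slot event queues. The slot numbers, semantics, and function names here are invented for illustration; the actual slot assignments are part of the globally agreed taxonomy, not shown in this part of the specification.

```python
from collections import defaultdict, deque

# Hypothetical slot assignments: each globally agreed semantic owns one slot
# forever, and slots are never reused for a different semantic.
SLOT_SEMANTICS = {
    1: "X window packet arrived",   # illustrative assignment only
    2: "telnet session opened",     # illustrative assignment only
}

slots = defaultdict(deque)          # slot number -> queue of pending events

def post_deduction(slot: int, attributes: dict) -> None:
    """An agent whose deduction matches the slot semantic queues an event."""
    slots[slot].append(attributes)

def next_visual_event(slot: int):
    """The visual system, profiled on a slot number, drains its events."""
    return slots[slot].popleft() if slots[slot] else None

post_deduction(2, {"src": "host-a", "dst": "MainServer"})
event = next_visual_event(2)
```

Because the agents and the visual system agree only on slot numbers, either side can change independently so long as the slot-to-semantic mapping is preserved.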
`
As for the well-known IP numbers and Ethernet addresses, the Shapes Vector strategy is to have incremental assignment of semantics to slots. Various taxonomies etc. are being considered for slot grouping. As the years go by, it is expected that some slots will fall into disuse as the associated semantic is no longer relevant, while others are added. It is considered highly preferable, for obvious reasons, that no slot be reused. As mentioned, further discussion about the Tardis and its operation can be found in part 5 of the specification.
`
`3. Inferencing Strategies
`
The fundamental inferencing strategy underlying Shapes Vector is to leave inductive inferencing as the province of the (human) user and deductive inferencing as typically the province of the IAs. It is expected that a user of the system will examine deductive inferences generated by a set of IAs, coupled with visualisation, in order to arrive at an inductive hypothesis. This separation of duties markedly simplifies the implementation strategies of the agents themselves. Nevertheless, we propose further aspects that may produce a very powerful inferencing system.
`
`3.1 Traditional
`
Agents can employ either forward chaining or backward chaining, depending on the role they are required to fulfil. For example, some agents continuously comb their views of the knowledge base in attempts to form current, up to date deductions that are as “high level” as possible. These agents employ forward chaining and typically inhabit the lower layers of the agent architecture. Forward chaining agents also may have data stream inputs from low level “sensors”. Based on these and other inputs, as
`
`
well as a set of input priorities, these agents work to generate warnings when certain security-significant deductions become true.
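The forward-chaining behaviour described above can be sketched as a fixed-point loop over a fact base: rules are applied repeatedly until no new deductions appear. The facts and the single rule here are invented for illustration and are far simpler than a real agent's rule set.

```python
# Hypothetical low-layer fact base: observed one-way traffic between hosts.
facts = {("traffic", "host-a", "host-b"), ("traffic", "host-b", "host-a")}

def rules(fact_base):
    """Each rule maps existing facts to new deductions (one rule shown)."""
    new = set()
    for kind, a, b in fact_base:
        # If traffic is seen in both directions, deduce a bidirectional link.
        if kind == "traffic" and ("traffic", b, a) in fact_base:
            new.add(("bidirectional-link", min(a, b), max(a, b)))
    return new

changed = True
while changed:                      # forward-chain to a fixed point
    derived = rules(facts) - facts
    changed = bool(derived)
    facts |= derived
```

A goal-driven (backward chaining) agent would instead start from a query such as `("bidirectional-link", "host-a", "host-b")` and search for facts that establish it.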
`
Another set of agents within the Shapes Vector system will be backward chaining (goal driven) agents. These typically form part of the “User Avatar Set”: a collection of knowledge elements which attempt to either prove or disprove user queries (described more fully in Section 8 of this part).
`
`3.2 Possiblistic
`
In executing the possiblistic features incorporated into the level 2 ontology (described in Section 7.1 of this part), agents may need to resort to alternative logics. This is implied by the inherent multi-valued nature of the possiblistic universe. Where a universe of basic facts can be described succinctly in terms of a fact existing or not existing, the situation is more complex when symbolic possibility is added. For our formulation we chose a three-valued possiblistic universe, in which a fact may be existent, non-existent, or possibly existent.
`
To reason in such a universe we adopt two different algebras. The first is a simple extension of the basic principle of unification common to computational logic. Instead of the normal assignation of successful unification to existence and unsuccessful unification to non-existence, we adopt the following:

* successful unification implies existence,

* the discovery of an explicit fact which precludes unification implies non-existence (this is referred to as a hard fail),

* unsuccessful unification without an explicit precluding case implies possible existence (this is referred to as a soft fail).
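The three-valued outcome algebra above can be sketched as follows. This assumes facts are stored as positive tuples alongside explicit precluding (negated) facts; the fact representation and names are illustrative, not the Prolog-based Shapes Vector implementation.

```python
# The three outcome values of the possiblistic universe.
EXISTENT, NON_EXISTENT, POSSIBLE = "existent", "non-existent", "possible"

# Hypothetical fact base: asserted facts and explicit precluding facts.
facts = {("user", "boris", "logged-in")}
negated = {("user", "mallory", "logged-in")}

def evaluate(query: tuple) -> str:
    if query in facts:
        return EXISTENT        # successful unification
    if query in negated:
        return NON_EXISTENT    # hard fail: an explicit fact precludes it
    return POSSIBLE            # soft fail: no explicit precluding case

r1 = evaluate(("user", "boris", "logged-in"))
r2 = evaluate(("user", "mallory", "logged-in"))
r3 = evaluate(("user", "eve", "logged-in"))
```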
`
`
A second algebra, which may be used to reason in the possiblistic universe, involves a technique known as “predicate grounding”, in which a user-directed pruning of a unification search allows for certain specified predicates to be ignored (grounded) when possibilities are being evaluated.
`
`3.3 Vectors
`
Agents operating at higher levels of the Shapes Vector Knowledge Architecture may require facilities for reasoning about uncertain and/or incomplete information in a more continuous knowledge domain. Purely traditional forward or backward chaining does not easily express such reasoning, and the three-valued possiblistic logic may lack the necessary quantitative features desired. To implement such agents, an alternative inferencing strategy is used based upon notions of vector algebra in a multi-dimensional semantic space. This alternative strategy is employed in conjunction with more conventional backward chaining techniques. The use of each of the paradigms is dependent on the agent and the domain of discourse.
`
Our vector-based approach to inferencing revolves around constructing an abstract space in which relevant facts and deductions may be represented by geometrical analogues (such as points and vectors), with the proper algebraic relationships holding true. In general, the construction of such a space for a large knowledge domain is extremely difficult. For Shapes Vector, we adopt a simplifying strategy of constructing several distinct deductive spaces, each limited to the (relatively small) domain of discourse of a single intelligent agent. The approach is empirical and is only feasible if each agent is restricted to a very small domain of knowledge so that construction of its space is not overly complex.
`
The definition of the deductive space for an IA is a methodical and analytical process undertaken during the design of the agent itself. It involves a consideration of the set
`
`
of semantic concepts (“nouns”) which are relevant to the agent, and across which the agent’s deductions operate. Typically this concept set will contain elements of the agent’s layer ontology as well as nouns which are meaningful only within the agent itself. Once the agent’s concept set has been discovered, we can identify within it a subset of ‘base nouns’ -- concepts which cannot be defined in terms of other members of the set. This identification is undertaken with reference to a semi-formal ‘connotation spectrum’ (a comparative metric for ontological concepts).
`
Such nouns have two important properties:

* each is semantically orthogonal to every other base noun, and

* every member of the concept set which is not a base noun can be described as a combination of two or more base nouns.
`
Collectively, an IA’s set of n base nouns defines an n-dimensional semantic space (in which each base noun describes an axis). Deductions relevant to the agent constitute points within this space; the volume bounded by spatial points for the full set of agent deductions represents the sub-space of possible outputs from that agent. A rich set of broad-reaching deductions leads to a large volume of the space being covered by the agent, while a limited deduction set results in a very narrow agent of more limited utility (but easier to construct). Our present approach to populating the deductive space is purely empirical, driven by human expert knowledge. The onus is thus upon the designer of the IA to generate a set of deductions which (ideally) populate the space in a uniform manner.
`
In reality, the set of deductions that inhabit the space can become quite non-uniform (“clumpy”) given this empirical approach. Hence rigorous constraint on the domain covered by an agent is entirely appropriate. Of course this strategy requires an
`
`
appropriate mechanism at a higher abstract layer. However, the population of higher layer agents can utilise the agents below them in a behavioural manner, thereby treating them as sub-spaces.
`
Once an agent’s deductive space has been constructed and populated with deductions (points), it may be used to draw inferences from observed facts. This is achieved by representing all available and relevant facts as vectors in the multi-dimensional semantic space and considering how these vectors are located with respect to deduction points or volumes. A set of fact vectors, when added using vector algebra, may precisely reach a deduction point in the space. In that situation, a deductive inference is implied. Alternatively, even in the situation where no vector or combination of vectors precisely inhabits a deduction point, more uncertain reasoning can be performed using mechanisms such as distance metrics. For example, it may be implied that a vector which is “close enough” to a deduction point is a weak indicator of that deduction. Furthermore, in the face of partial data, vector techniques may be used to home in on inferences by identifying facts (vectors), currently not asserted, which would allow for some significant deduction to be drawn. Such a situation may indicate that the system should perhaps direct extra resources towards discovering the existence (or otherwise) of a key fact.
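Under stated assumptions, the distance-based scheme above can be sketched as follows. The base nouns (axes), deduction points, fact vectors, distance threshold, and certainty formula are all hypothetical illustrations; the text does not specify a particular metric or certainty function.

```python
import math

# Hypothetical base nouns defining a 3-dimensional semantic space.
BASE_NOUNS = ("host", "service", "volume")

# Hypothetical deduction points in that space.
deductions = {
    "bulk-transfer-in-progress": (1.0, 1.0, 2.0),
    "idle-link": (0.0, 0.0, 0.0),
}

# Currently asserted facts, represented as vectors on the same axes.
fact_vectors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.8)]

# Vector addition of all asserted fact vectors.
total = tuple(sum(axis) for axis in zip(*fact_vectors))

def infer(point_table, vector, threshold=0.5):
    """Return (deduction, certainty) for points "close enough" to the vector.

    The certainty here is a crude linear function of distance; a real agent
    could substitute any monotone mapping.
    """
    results = []
    for name, point in point_table.items():
        d = math.dist(vector, point)
        if d <= threshold:
            results.append((name, 1.0 - d / threshold))
    return results

inferred = infer(deductions, total)
```

In this sketch the summed fact vector lands near the "bulk-transfer-in-progress" point but not exactly on it, so that deduction is asserted weakly, with a certainty below 1.0.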
`
The actual inferencing mechanism to be used within higher-level Shapes Vector agents is slightly more flexible than the scheme we have described above. Rather than simply tying facts to vectors defined in terms of the IA’s base nouns, we can define an independent but spatially continuous ‘fact space’. Figure 8 demonstrates the concept: a deductive space has been defined in terms of a set of base nouns relevant to the IA. Occupying the same spatial region is a fact space, whose axes are derived from the agent’s layer ontology. Facts are defined as vectors in this second space: that is, they are entities fixed with respect to the fact axes. However, since the fact space and deduction space overlap, these fact vectors also occupy a location with respect to the
`
`
base noun axes. It is this location which we use to make deductive inferences based upon fact vectors. Thus, in the Figure, the fact that the observed fact vector (arrow) is close to one of the deductions (dots) may allow for assertion of that deduction with a particular certainty value (a function of exactly how close the vector is to the deduction point). Note that, since the axes of the fact space are independent of the axes of the deductive space, it is possible for the former to vary (shift, rotate and/or translate, perhaps independently) with respect to the latter. If such a variation occurs, fact vectors (fixed with regard to the fact axes) will have different end-points in deduction-space. Therefore, after such a relative change in axes, a different set of deductions may be inferred with different confidence ratings. This mechanism of semantic relativity may potentially be a powerful tool for performing deductive inferencing in a dynamically changing environment.
`
An interesting aspect of the preferred approach to vector-based deductive inference is that it is based fundamentally upon ontological concepts, which can in turn be expressed as English nouns. This has the effect that the deductions made by an agent will resemble simple sentences in a very small dialect of pseudo-English. This language may be a useful medium for a human to interact with the agent in a relatively natural fashion.
`
While the inferencing strategy described above has some unorthodox elements in its approach to time-varying probabilistic reasoning for security applications, there are more conventional methods that may be used within Shapes Vector IAs in the instance that the method falls short of its expected deductive potential. Frame based systems offer one well understood (although inherently limited) alternative paradigm. Indeed, it is expected that some IAs will be frame based in any case (obtained off the shelf and equipped with an ontology to permit knowledge transfer with the knowledge base).
`
`
As described above, the vector-based deductive engine is able to make weak assertions of a deduction with an associated certainty value (based on distances in n-dimensional space). This value can be interpreted in a variety of ways to achieve different flavours of deductive logic. For example, the certainty value could potentially be interpreted as a probability of the assertion holding true, derived from a consideration of the current context and encoded world knowledge. Such an interpretation delivers a true probabilistic reasoning system. Alternatively, we could potentially consider a more rudimentary interpretation wherein we consider assertions with a certainty above a particular threshold (e.g. 0.5) to be “possible” within a given context. Under these circumstances, our system would deliver a possiblistic form of reasoning. Numerous other interpretations are also possible.
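The two interpretations above can be sketched as follows. The 0.5 threshold comes from the text's example; the function names are invented for illustration.

```python
# Probabilistic interpretation: the certainty value is read directly as the
# probability of the assertion holding true in the current context.
def as_probability(certainty: float) -> float:
    return certainty

# Possiblistic interpretation: assertions above a threshold are "possible".
def as_possibility(certainty: float, threshold: float = 0.5) -> bool:
    return certainty > threshold

p = as_probability(0.72)
possible = as_possibility(0.72)
impossible = as_possibility(0.31)
```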
`
`3.4
`
`Inferencing for Computer Security Applications
`
As presented, our IA architecture is appropriate to knowledge processing in any number of domains. To place the work into the particular context for which it is primarily intended, we will now consider a simple computer security application of this architecture.
`
One common, but often difficult, task facing those charged with securing a computer network is detecting access of network assets which appears authorised (e.g., the user has the proper passwords etc.) but is actually malicious. Such access incorporates the so-called “insider threat” (i.e., an authorised user misusing their privileges) as well as the situation where the confidentiality of the identification system has been compromised (e.g., passwords have been stolen). Typically, Intrusion Detection Systems are not good at detecting such security breaches, as they are purely based on observing signatures relating to improper use or traffic.
`
Shapes Vector’s comprehensive inferencing systems allow it to deduce a detailed semantic model of the network under consideration. This model, coupled with a user’s
`
`
inductive reasoning skills, permits detection of such misuse even in the absence of any prior-known “signature”.
`
This application of Shapes Vector involves constructing a Gestalt of Intelligent Agents that are capable of reasoning about relatively low-level facts derived from the network. Typically these facts would be in the form of observations of traffic flow on the network. Working collaboratively, the agents deduce the existence of computers on the network and their intercommunication. Other agents also deduce attributes of the computers and details of their internal physical and logical states. This information serves two purposes: one is to build up a knowledge base concerning the network, and another is to facilitate the visualisation of the network. This latter output from the agents is used to construct a near real-time 3D visualisation showing the computers and network interfaces known to exist and their interconnection. Overlaid onto this “map” is animation denoting the traffic observed by the agents, classified according to service type.
`
Observing such a Shapes Vector visualisation, a user may note some visual aspect that they consider to be atypical. For example, the user may note a stream of telnet packets (which itself might be quite normal) traversing the network between the primary network server and a node which the visualisation shows as only a network interface. The implications of such an observation are that a node on the network is generating a considerable body of data, but this data is formatted such that none of the Shapes Vector agents can deduce anything meaningful about the computer issuing the traffic (thus no computer shape is visualised, just a bare network interface).
`
The human user may consider this situation anomalous: given their experience of the network, most high volume traffic emitters are identified quickly by one or more of the various IAs. While the telnet session is legitimate, in as much as the proper passwords have been provided, the situation bears further investigation.
`
`
To probe deeper, the User Avatar component of Shapes Vector, described more fully in Section 8 in Part 2 of the specification, can be used to directly query the detailed knowledge base the agents have built up behind the (less-detailed) visualisation. The interaction in this situation might be as follows:
`
human> answer what User is-logged-into Computer “MainServer”?
`
`gestalt> Relationship is-logged-into [User Boris, Computer MainServer]
`
This reveals a user name for the individual currently logged into the server. A further interaction might be:
`
human> find all User where id=”Boris”?

gestalt> Entity User (id=Boris, name=”Boris Wolfgang”, type=”guest user”)
`
An agent has deduced at some stage of knowledge processing that the user called Boris is logged in using a guest user account. The Shapes Vector user would be aware that this is also suspicious, perhaps eliciting a further question:
`
human> answer what is-owned-by User “Boris”?

gestalt> Relationship is-owned-by [File passwords, User Boris]
Relationship is-owned-by [Process keylogger, User Boris]
Relationship is-owned-by [Process passwordCracker, User Boris]
`
The facts have, again, been deduced by one or more of the IAs during their processing of the original network facts. The human user, again using their own knowledge and inductive faculties, would become more suspicious. Their level of suspicion might be such that they take action to terminate Boris’ connection to the main server.
`
`
In addition to this, the user could ask a range of possiblistic and probabilistic questions about the state of the network, invoking faculties in the agent Gestalt for more speculative reasoning.
`
3.5 Other Applications
`
The IA architecture disclosed herein lends itself to other applications. For example, it is not uncommon for the Defence community to have many databases in just as many formats. It is very difficult for analysts to peruse these databases in order to gain useful insight. There has been much effort aimed at considering how particular databases may be structured in order for analysts to achieve their objectives. The problem has proved to be difficult. One of the major hurdles is that extracting the analysts’ needs and codifying them to structure the data leads to different requirements not only between analysts, but also different requirements depending on their current focus. One of the consequences is that in order to structure the data correctly, it must be context sensitive, which a relational database is not equipped to handle.
`
Shapes Vector can overcome many of the extant difficulties by permitting knowledge and deduction rules to be installed into an IA. This IA, equipped with a flexible user interface and a strictly defined query language, can then parse the data in a database in order to arrive at a conclusion. The knowledge rules and analyst-centric processing are encoded in the IA, not in the structure of the database itself, which can thus remain context free. The Shapes Vector system allows incremental adjustment of the IA without having to re-format and restructure a database, through enhancement of the IA or through an additional IA with relevant domain knowledge. Either the IA makes the conclusion, or it can provide an analyst with a powerful tool to arrive at low level deductions that can be used to arrive at the desired conclusion.
`
`
`4. Rules for Constructing an Agent
`
`63
`
In Section 2 of this part of the specification, several rules governing agents were mentioned, e.g. no intra-level communication, and each agent must be context free within its domain of discourse. Nevertheless, there are still a number of issues which need clarification to see how an agent can be constructed, and some of the resultant implications.
`
In a preferred arrangement the three fundamental rules that govern the construction of an agent are:

1. All agents within themselves must be context free;

2. If a context sensitive rule or deduction becomes apparent, then the agent must be split into two or more agents;

3. No agent can communicate with its peers in the same level. If an agent’s deduction requires input from a peer, then the agent must be promoted to a higher level, or a higher level agent constructed which utilises the agent and the necessary peer(s).
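Rule 3 can be checked mechanically over a proposed agent configuration. The agent names, layer assignments, and links below are hypothetical, chosen only to illustrate the check.

```python
# Hypothetical agent configuration: name -> layer.
agent_layer = {"packet-watcher": 1, "flow-assembler": 1, "session-fuser": 2}

# Proposed knowledge-exchange links between agents.
links = [("packet-watcher", "session-fuser"),
         ("flow-assembler", "session-fuser")]

def violates_peer_rule(links, layers):
    """Return all links that connect two agents in the same layer."""
    return [(a, b) for a, b in links if layers[a] == layers[b]]

ok = violates_peer_rule(links, agent_layer)                      # no violations
bad = violates_peer_rule(links + [("packet-watcher", "flow-assembler")],
                         agent_layer)                            # peer link
```

The offending peer link in `bad` is exactly the case the rule addresses: the two level-1 agents would instead have to be fused by (or promoted into) a level-2 agent such as the hypothetical `session-fuser`.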
`
In our current implementation of Shapes Vector, agents communicate with other entities via the traditional UNIX sockets mechanism as an instantiation of a component control interface. The agent architecture does not preclude the use of third party agents or systems. The typical approach to dealing with third party systems is to provide a “wrapper” which permits communication between the system and Shapes Vector. This wrapper needs to be placed carefully within the agent hierarchy so that interaction with the third party system is meaningful in terms of the Shapes Vector ontologies, as well as permitting the wrapper to act as a bridge between the third party system and other Shapes Vector agents. The wrapper appears as just another SV agent.
`
`
One of the main implications of the wrapper system is that it may not be possible to gain access to all of the features of a third party system. If the knowledge cannot be carried by the ontologies accessible to the wrapper, then the knowledge elements cannot be transported throughout the system. There are several responses to such cases:

1. The wrapper may be placed at the wrong level.

2. The ontology may be deficient and in need of revision.

3. The feature of the third party system may be irrelevant and therefore no adjustments are required.
`
`5. Agents and Time
`
In this section we discuss the relationship between the operation of agents and time. The two main areas disclosed are how the logic based implementation of agents can handle data streams without resorting to an embedded, sophisticated temporal logic, and the notion of synthetic time in order to permit simulation and analysis of data from multiple time periods.
`
5.1 Data Streams and IAs
`
One of the fundamental problems facing the use of IAs in the Shapes Vector system is the changing status of propositions. More precisely, under temporal shifts, all “facts” are predicates rather than propositions. This issue is further complicated when we consider that typical implementations of an IA do not handle temporal data streams. We address this problem by providing each IA with a “time aperture” over which it is currently processing. A user or a higher level agent can set the value of this aperture.
`
`
Any output from an IA is only relevant to its time aperture setting (Figure 10). The aperture mechanism allows the avoidance of issues such as contradictions in facts over time, as well as providing a finite data set in what is really a data stream. In fact, the mechanism being implemented in our system permits multiple, non-intersecting apertures to be defined for data input.
`
With time apertures, we can “stutter” or “sweep” along the temporal domain in order to analyse long streams of data. Clearly, there are a number of issues which still must be addressed. Chief amongst these is the fact that an aperture may be set which does not cover, or only partially covers, the data set from which a critical deduction must be made. Accordingly, strategies such as aperture change and multiple apertures along the temporal domain must be implemented in order to raise confidence that the relevant data is input in order to arrive at the relevant deduction.
`
While we are aware that we can implement apertures in order to supply us with useful deductions for a number of circumstances, it is still an open question how to achieve an optimal set of sweep strategies for a very broad class of deductions where confidence is high that we obtain what we are scanning for. One area which comes to mind is the natural “tension” between desired aperture settings. For example, an aperture setting of 180 degrees (i.e., the whole fact space) is desirable as this considers all data possible in the stream, from the beginning of the epoch of capture to the end of time, or rather the last data captured. However, this setting is impractical from an implementation point of view, as well as introducing potential contradictions in the deductive process. On the other hand, a very small aperture is desirable in that implementation is easy along with fast processing, but can result in critical packets not being included in the processing scan.
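The “sweep” of apertures along the temporal domain can be sketched as a window generator over a packet stream. The stream contents, aperture width, and step are illustrative only (the text goes on to report 64 packets as a good aperture for HTTP traffic).

```python
# Synthetic packet stream of (timestamp, payload) pairs.
packets = [(t, f"pkt-{t}") for t in range(10)]

def sweep(stream, width, step):
    """Yield successive apertures (windows) over the stream.

    With step == width the apertures are non-intersecting, as required by
    the mechanism described above; step < width would give an overlapping
    "stutter" instead.
    """
    for start in range(0, len(stream), step):
        yield stream[start:start + width]

apertures = list(sweep(packets, width=4, step=4))
```

Each aperture presents the agent with a finite fact set, avoiding the contradictions that would arise from reasoning over the entire unbounded stream at once.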
`
Initial tests of an agent which understands portions of the HTTP protocol have yielded anecdotal evidence that there may be optimum aperture settings for specific domains
`
`
`of discourse. HTTP protocol data from a large (5GB) corpus were analysed for a large
`network. It was shown that an aperture setting of 64 packets produced the largest set
`of deductions for the smallest aperture setting while avoiding the introduction of
`
`contradictions.
`
The optimal aperture setting is of course affected by the data input, as well as the domain of discourse. However, if we determine that our corpus is representative of expected traffic, then a default optimal aperture setting is possible for an agent. This aperture setting need only then be adjusted as required in the presence of contradicting deductions or for special processing purposes.
`
`5.2 Temporal Event Mapping for Agents
`
In the previous section, we discussed how an agent could have time apertures in order to process data streams. The issue of time is quite important, especially when considering that it takes a finite amount of time for a set of agents to arrive at a deduction and present a visualisation. Also, a user may wish to replay events at different speeds in order to see security relevant patterns. To provide such facilities in Shapes Vector, we introduce the notion of a synthetic clock. All entities in the system get their current time from the synthetic clock rather than the real system clock. A synthetic clock can be set arbitrarily to any current or past time, and its rate of change can also be specified.
`
A synthetic clock allows a user to run the system at different speeds and set its notion of time for analysing data. The synthetic clock also permits a variety of simulations to be performed under a number of semantic assumptions (see Section 7 of this part of the specification).
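The synthetic clock can be sketched as a thin wrapper over the real clock with a settable origin and rate of change. The interface shown is an assumption for illustration, not the actual Shapes Vector API.

```python
import time

class SyntheticClock:
    """A clock whose origin can be set to any past or current time and whose
    rate (synthetic seconds per real second) can be specified, supporting
    replay faster or slower than real time."""

    def __init__(self, origin: float, rate: float = 1.0):
        self._origin = origin            # synthetic time at creation
        self._rate = rate                # synthetic seconds per real second
        self._real_start = time.time()

    def now(self) -> float:
        elapsed = time.time() - self._real_start
        return self._origin + elapsed * self._rate

# Replay from synthetic time zero at 10x real speed.
clock = SyntheticClock(origin=0.0, rate=10.0)
```

All entities would read `clock.now()` instead of the system clock, so setting the origin into the past replays stored data, and changing the rate speeds up or slows down the replay uniformly.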
`
`
The above is all very well, but Shapes Vector may at the same time be utilised for current real-time network monitoring as well as running a simulation. In addition, the user may be interested in correlating past analysis conditions with current events and
`
`vice versa. For example, given a hypothesis from an ongoing analysis, the user ma