`
Abdelaziz Mounji
Institut d'Informatique, FUNDP
E-mail: amo@info.fundp.ac.be

January
`
`
`
Introduction

The purpose of this paper is to specify the generic audit record format used by ASAX. It also
provides guidelines for implementing programs that convert a native file to NADF format. Such
a converter program is called a format adaptor.
`
Why a Common Format?

ASAX is a universal tool for data stream analysis (and in particular security audit trail analysis).
In principle, ASAX can therefore analyse arbitrary sequential files. This is achieved by
translating the native file to a universal format called Normalized Audit Data Format (NADF). This ensures
target system independence and avoids the need to tune ASAX for every possible source of data.
`
Specification of NADF File Format

A NADF file is a sequential file of records in NADF format. This format is flexible and allows a
straightforward implementation of format adaptors.
A NADF record consists of:

• a four-byte integer representing the length (in bytes) of the whole NADF record (including
the length field);

• a certain number of contiguous audit data fields. Each audit data field contains the three
following contiguous items:

    identifier: an unsigned short (16-bit) integer which is the identifier of the audit data. This
    item must be aligned on a 2-byte boundary;

    length: an unsigned short integer which is the length of the audit data value;

    value: the audit data value itself.

In addition, audit data identifiers appearing in a NADF record must be sorted in strictly ascending
order. This lets ASAX preprocess audit records efficiently before analysis. Figure 1
shows the general layout of a NADF record.
`
`
`
`Patent Owner Finjan, Inc. - Ex. 2032, p. 1
`
`
`
length | id1 | lg1 | val1 ... | ... | idn | lgn | valn ...

Figure 1: General NADF record layout
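
To make the layout concrete, the following C sketch walks the fields of one record held in memory and checks that the identifiers are strictly ascending. It is only an illustration under stated assumptions: the name nadf_count_fields is ours, the fixed-width types from <stdint.h> stand in for the four-byte length and the unsigned short items, integers are assumed to be stored in the host's byte order, and one padding byte is skipped after a value of odd length, as the alignment rule of this section requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the audit data fields of the record starting at rec.
 * Returns -1 if a field is truncated or if the identifiers are not
 * in strictly ascending order. (Hypothetical helper, not part of ASAX.) */
int nadf_count_fields(const unsigned char *rec)
{
    uint32_t total;                 /* record length, length field included */
    size_t off = sizeof total;      /* first field follows the length field */
    long prev_id = -1;
    int count = 0;

    assert(rec != NULL);
    memcpy(&total, rec, sizeof total);

    while (off < total) {
        uint16_t id, len;

        if (total - off < 4)
            return -1;              /* no room for identifier + length */
        memcpy(&id, rec + off, sizeof id);
        memcpy(&len, rec + off + 2, sizeof len);
        off += 4;

        if (len > total - off)
            return -1;              /* value runs past the end of the record */
        if ((long)id <= prev_id)
            return -1;              /* identifiers must strictly increase */
        prev_id = id;

        off += len;
        if (len % 2 != 0)
            off++;                  /* skip the padding byte after an odd-length value */
        count++;
    }
    return count;
}
```

Because each field carries its own length, a reader never needs to know the native record structure to step from one field to the next.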
`
Audit Data Alignment
It follows from the alignment restriction on audit data identifiers that if the audit data value does
not have an even number of bytes, a padding byte (a space) must be appended to the value. This
byte is not counted in the length of the value.
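
The rule can be sketched in C as a small routine that appends one audit data field to an output buffer. The name nadf_put_field and its parameters are hypothetical, and the sketch assumes that the two short integers are stored in the host's byte order and that the caller has reserved enough room in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append (identifier, length, value) at buf + off and return the offset
 * just past the field, padding byte included. (Hypothetical helper.) */
size_t nadf_put_field(unsigned char *buf, size_t off,
                      uint16_t id, const void *val, uint16_t len)
{
    assert(off % 2 == 0);                  /* identifier must be 2-byte aligned */

    memcpy(buf + off, &id, sizeof id);
    off += sizeof id;
    memcpy(buf + off, &len, sizeof len);   /* len never counts the padding byte */
    off += sizeof len;
    memcpy(buf + off, val, len);
    off += len;

    if (len % 2 != 0)
        buf[off++] = ' ';                  /* padding space after an odd-length value */
    return off;
}
```

A three-byte value such as "abc" therefore occupies eight bytes in the buffer (two for the identifier, two for the length, three for the value and one for the padding space), while its stored length remains 3.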
`
NADF record alignment
Since most machine architectures require 4-byte integers to be 4-byte aligned, a NADF record
must always begin at a 4-byte aligned address. This is especially important if NADF records are
buffered for I/O. Similarly, if the total number of bytes in a NADF record is not a multiple of 4,
padding bytes (spaces) must be added at the end of the record. Again, the record length does not
include the padding bytes (see the Example below).
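
As a small worked illustration of this rule, the two C helpers below compute the space a record occupies once padded and the number of padding spaces to add. The names are ours and are not part of ASAX.

```c
#include <assert.h>
#include <stddef.h>

#define NADF_ALIGN 4u   /* records start on 4-byte boundaries */

/* Size occupied by a record whose length field holds record_len. */
size_t nadf_padded_size(size_t record_len)
{
    assert(record_len >= 4);   /* a record holds at least its length field */
    return (record_len + NADF_ALIGN - 1) / NADF_ALIGN * NADF_ALIGN;
}

/* Number of padding spaces to write after the record. */
size_t nadf_pad_count(size_t record_len)
{
    return nadf_padded_size(record_len) - record_len;
}
```

For example, a record of length 35 occupies 36 bytes and is followed by one space, whereas a record of length 16 needs no padding.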
Finally, a NADF file always begins with a header record having the following structure:
`
struct {
    int len;
    char val[ ];
} TypeNADF = { , "__NADF__ \ " };
`
immediately followed by a padding space character.
I/O routines on NADF files (see nadfio( ASAX)) always check for the existence of the header record
before going any further. This avoids processing non-NADF files.
Note that audit data identifiers are assigned arbitrarily, provided they are all distinct. Each
NADF record may contain a different number of audit data.
`
Example
Suppose we want to convert to NADF format a record having the C declaration:
`
struct {
    char directory[ ];
    int uid;
    char filename[ ];
} Record = {"/tmp", , "/etc/passwd"};
`
and suppose the fields directory, uid and filename are assigned the identifiers , and respectively.
The corresponding NADF record is depicted in Figure 2.
In particular, notice that strings in the NADF record are not null-terminated.
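
The conversion of such a record can be sketched in C as follows. Several things here are assumptions made only for illustration: the identifiers 1, 2 and 3, the array sizes of 16, the use of int32_t for uid, and the host byte order for all integers. The sketch also takes one reading of the alignment rules, namely that the stored record length covers everything up to the end of the last value and that any spaces after it are record padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct native_record {          /* array sizes are assumed */
    char    directory[16];
    int32_t uid;
    char    filename[16];
};

enum { ID_DIRECTORY = 1, ID_UID = 2, ID_FILENAME = 3 };  /* hypothetical identifiers */

/* Write one field at out + off, first padding with a space if needed so
 * that the identifier is 2-byte aligned. Returns the end of the value. */
static size_t put_field(unsigned char *out, size_t off, uint16_t id,
                        const void *val, uint16_t len)
{
    if (off % 2 != 0)
        out[off++] = ' ';
    memcpy(out + off, &id, sizeof id);
    off += sizeof id;
    memcpy(out + off, &len, sizeof len);
    off += sizeof len;
    memcpy(out + off, val, len);           /* strings are copied without '\0' */
    return off + len;
}

/* Encode r into out; returns the number of bytes used, padding included. */
size_t encode_record(const struct native_record *r, unsigned char *out)
{
    size_t off = sizeof(uint32_t);         /* leave room for the length field */
    uint32_t length;

    assert(r != NULL && out != NULL);
    /* fields are written in ascending identifier order */
    off = put_field(out, off, ID_DIRECTORY, r->directory,
                    (uint16_t)strlen(r->directory));
    off = put_field(out, off, ID_UID, &r->uid, (uint16_t)sizeof r->uid);
    off = put_field(out, off, ID_FILENAME, r->filename,
                    (uint16_t)strlen(r->filename));

    length = (uint32_t)off;                /* excludes the trailing padding */
    memcpy(out, &length, sizeof length);
    while (off % 4 != 0)
        out[off++] = ' ';                  /* pad the record to a multiple of 4 */
    return off;
}
```

Under these assumptions, a record holding "/tmp", a four-byte uid and "/etc/passwd" gets a stored length of 35 (4 + 8 + 8 + 15) and occupies 36 bytes, the last one being a padding space.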
`
Guidelines for implementing Format Adaptors

In implementing a Format Adaptor for your native file format, we suggest following the two steps
below:
`
`
`

Figure 2: NADF record
`
Audit Data Description File

• examine carefully the documentation of the software producing the native files to figure out
the exact structure of your native records;

• for each field in the data structure of your native record, assign a unique identifier number
(between and ). If a data field appears in many records, it is a good idea to assign
it the same identifier in each of them;

• choose an external name for each field in the native record. This name is often the same
as in the original record. Fields are referenced by this external name in the RUSSEL
language;

• the mapping between external field names and identifiers must be written in what we call an
audit data description file. This is a simple text file in which each audit data is described by
a sequence of lines. For easier parsing, the first line begins with ' ' followed by a blank,
the second line begins with ' ' followed by a blank, and so on. The first line in each sequence
indicates the audit data identifier, the second indicates the type of the audit data in the native
record, and the third indicates the type of the audit data in the NADF record. The second
and third lines are not interpreted by the present ASAX version but are useful for your own
documentation. The fourth line indicates the external name and, finally, the fifth line contains
a free comment about the meaning of the audit data, which is not interpreted by ASAX. At the
beginning of the file you can optionally write additional comments. For instance, these lines
may contain the version number, machine type, operating system type, etc. These lines must
appear as follows: or more lines beginning with an 'A', followed by or more lines beginning
with a 'B', and so on up to 'F'. A sample data description file is shown in the appendix.

See the ASAX user guide for the BNF syntax of audit data description files.
`
Format Adaptor implementation
Here is a skeleton of a Format Adaptor program:

Begin
    open native file
    create a NADF file                  /* using creat_NADF() */
    allocate buffer for input (native) record
    allocate buffer for output (NADF) record
    read the first native record
    While not end of native file do
    begin
        convert native record to NADF format in the output buffer
        /*
           this is done by converting each field to a sequence of
           identifier, length and value. Remember that identifiers
           must be sorted in a strict ascending order. You should
           convert fields with lower identifiers first.
        */
        write it to NADF file           /* using write_NADF() */
        read the next native record
    end
    close native file
    close NADF file                     /* using close_NADF() */
    release input and output buffers
End.
`
Although you can use common I/O system calls (create, open, write, close), it is highly
recommended that the standard NADF I/O routines (see nadfio( ASAX)) be used instead. These
routines support NADF file I/O at the record level and insert padding bytes between adjacent
records to ensure proper data alignment. Also, creat_NADF() systematically writes the header
record when it creates a new NADF file.
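
For readers who want to see the shape of the main loop in C, here is a minimal sketch built on plain stdio rather than on the NADF I/O routines, whose signatures are not given in this paper. Everything specific in it is hypothetical: the native record layout, the identifiers 1 and 2, and the function names. It stores integers in the host's byte order and, unlike creat_NADF(), does not write the header record, so its output is not a complete NADF file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct native_rec {             /* hypothetical native record layout */
    int32_t uid;
    char    name[12];           /* NUL-padded user name */
};

enum { ID_UID = 1, ID_NAME = 2 };   /* hypothetical identifiers */

static size_t append_field(unsigned char *out, size_t off, uint16_t id,
                           const void *val, uint16_t len)
{
    if (off % 2 != 0)
        out[off++] = ' ';           /* keep the identifier 2-byte aligned */
    memcpy(out + off, &id, sizeof id);
    off += sizeof id;
    memcpy(out + off, &len, sizeof len);
    off += sizeof len;
    memcpy(out + off, val, len);
    return off + len;
}

/* Convert one native record into buf (at least 32 bytes);
 * returns the number of bytes to write, record padding included. */
size_t convert_record(const struct native_rec *r, unsigned char *buf)
{
    const char *nul = memchr(r->name, '\0', sizeof r->name);
    uint16_t name_len = (uint16_t)(nul ? (size_t)(nul - r->name) : sizeof r->name);
    size_t off = sizeof(uint32_t);
    uint32_t length;

    /* lower identifiers first, so the record comes out sorted */
    off = append_field(buf, off, ID_UID, &r->uid, (uint16_t)sizeof r->uid);
    off = append_field(buf, off, ID_NAME, r->name, name_len);

    length = (uint32_t)off;         /* trailing padding is not counted */
    memcpy(buf, &length, sizeof length);
    while (off % 4 != 0)
        buf[off++] = ' ';
    return off;
}

/* Main loop of the adaptor: returns the number of records converted,
 * or -1 if a write fails. */
long convert_stream(FILE *native, FILE *nadf)
{
    struct native_rec rec;
    unsigned char buf[32];
    long count = 0;

    assert(native != NULL && nadf != NULL);
    while (fread(&rec, sizeof rec, 1, native) == 1) {
        size_t n = convert_record(&rec, buf);
        if (fwrite(buf, 1, n, nadf) != n)
            return -1;
        count++;
    }
    return count;
}
```

Keeping the per-record conversion separate from the I/O loop makes it easy to replace the fwrite call with the record-level NADF routines once they are available.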
`
Final Remarks

From our experience in developing format adaptors, some misunderstandings often occur, so we
provide further warnings:

• a NADF file is not an ASCII file; it is in a binary format. If you produce your NADF
file on a given machine architecture and analyze it on a different architecture, byte and
bit ordering problems may arise. (For instance, MC and MC are big-endian with
respect to bytes but little-endian with respect to bits. By contrast, the VAX and Intel
are both little-endian for bytes and bits);

• audit data identifiers must be sorted in ascending order. This is crucial for record
preprocessing by ASAX during analysis;

• in converting a string data field to the format identifier, length and value, the null character
'\0' marking the end of the string (as in C) is unnecessary since the string length is already known
(see the Example above).
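
To illustrate the byte ordering warning, the C sketch below shows how a program could reverse the bytes of the 16-bit and 32-bit integers of a NADF record produced on a machine of the opposite byte order. This is only a possible workaround with names of our own choosing; nothing in this paper says that ASAX performs such a conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the two bytes of an identifier or value length. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse the four bytes of a record length. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Swapping twice must give back the original value. */
void swap_self_check(void)
{
    assert(swap16(swap16(0x1234u)) == 0x1234u);
    assert(swap32(swap32(0x11223344u)) == 0x11223344u);
}
```

Only the integer items are affected: character strings are stored byte by byte and read the same on either kind of machine.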
`
`
`
`
`
`
`A A Sample Audit Data Description File
`
A Audit Data Description File for SunOS . .
`B sunc
`C (cid:25) (cid:25) (cid:26) (cid:26)
D ASAX V .
`
`
` long
` long
` au time
` time since epoch
`
`
` int
` int
` record type
 record type of the audit record
`
`
` int
` int
` uid
` user id
`
`
` int
` int
` pid
` process id
`
`
` string
` string
` filename
` the file name
`
`
`