(19) United States
(12) Patent Application Publication
Greenhill et al.
(10) Pub. No.: US 2008/0297513 A1
(43) Pub. Date: Dec. 4, 2008

(54) METHOD OF ANALYZING DATA

(75) Inventors: Stewart Ellis Smith Greenhill, Hilton (AU); Svetha Venkatesh, Winthrop (AU); Peter Leslie Lee, Wattle Park (AU); Geoffrey Alec William West, Kalamunda (AU); Chiou Peng Lam, Karawara (AU)

Correspondence Address:
EDELL, SHAPIRO & FINNAN, LLC
1901 RESEARCH BOULEVARD, SUITE 400
ROCKVILLE, MD 20850 (US)

(73) Assignee: IPOM PTY LTD, Bentley (AU)

(21) Appl. No.: 12/102,502

(22) Filed: Apr. 14, 2008

Related U.S. Application Data

(63) Continuation of application No. PCT/AU2005/001595, filed on Oct. 14, 2005.

Publication Classification

(51) Int. Cl.
G06T 11/20 (2006.01)
G09G 5/02 (2006.01)
G06F 3/048 (2006.01)
(52) U.S. Cl. 345/440; 345/589; 715/771
(57) ABSTRACT

A computer assisted method of analysis suitable for process control, comprises the steps of: receiving first data streams representing values from a process; receiving second data streams representing states of the process; recording meta-data about the data streams; calculating relationships between pairs of the data streams; and recording relationship data resulting from the calculating step together with an association between at least one relationship datum and its corresponding meta-data.
`
[Representative drawing (FIG. 1): process data 14 and event data 16 imported from Excel, text, OPC and SQL DB sources; correlation database 26; tag grouping into a tag group set 28; neural net training of a SOM model 30; state labels 32; and the correlation matrix, signals, lags, process and model views.]
[Drawing sheets 1 through 11 (FIGS. 1 to 11); images omitted.]
`
`METHOD OF ANALYZING DATA
`
`CROSS REFERENCE TO RELATED
`APPLICATIONS
`
[0001] This application is a continuation of International Application No. PCT/AU2005/001595, filed on Oct. 14, 2005, entitled "Method of Analysing Data," which claims priority under 35 U.S.C. § 119 to Application No. AU 2004905955 filed on Oct. 15, 2004, entitled "Method of Analysing Data," the entire contents of which are hereby incorporated by reference.

FIELD OF THE INVENTION

[0002] This invention concerns a computer assisted method of analysis suitable for process control. In further aspects the invention concerns a computer system for performing the method and computer software for performing the method. The invention has particular utility in the control of Industrial Processes.

BACKGROUND

[0003] Industrial processes involve large and complex systems. Typically, an industrial process involves many thousands of variables which are controlled in part by automatic processes, and in part by human operators. In the operation of these processes large amounts of information are collected by process control and monitoring systems.

[0004] Most tools currently available for process analysis are complex mathematical analysis tools that are general in nature, require an understanding of their language, and are expensive and time consuming to use. Tools such as Matlab, Excel, or Mathcad are routinely used in process engineering environments. However, they require that the data all be stored in memory, limiting the complexity of the problems that can be analyzed or visualized.

SUMMARY

[0005] The invention is a computer assisted method of analysis suitable for process control, comprising the steps of:

[0006] receiving first data streams representing values from a process;

[0007] receiving second data streams representing states of the process;

[0008] recording metadata about the data streams;

[0009] calculating relationships between pairs of the data streams; and

[0010] recording relationship data resulting from the calculating step together with an association between at least one relationship datum and its corresponding meta-data.

[0011] By recording relationship data between the data streams together with corresponding metadata the process engineer is able to gain insight about the process and its control in relation to aspects of the process described by the metadata.

[0012] The data streams may be continuous streams, or they may be discontinuous, discontiguous or even a succession of blocks of data.

[0013] The values of the first data streams may be measurements from the process. The values of the first data streams may be sampled over time. The states of the second data streams may be events or conditions in the process.

[0014] There may be one or more third data streams representing statistics calculated from the first or second data streams, or both.

[0015] The metadata may concern the origins of the data streams, for instance it may comprise tags that identify the location of origin of each respective data stream. The association may link each datum to its respective locations of origin. There may be more than one location depending on the origins of the data streams. The meta-data may include flow charts or plant diagrams. The chart or diagram may display the value of each datum at the location of its source.

[0016] The calculating step may involve calculating correlations of the data streams. The calculating step may involve calculating, for a range of different time lags, autocorrelations of the data streams. Alternatively, or in addition, the calculating step may involve calculating, for a range of different time lags, cross-correlation of pairs of data streams.

[0017] Sub-sets may be created within the relationship data, and each sub-set may comprise data having a value within the same predetermined range of values. For instance, each sub-set may comprise data having a correlation value within the same predetermined range of values. Where the metadata involves tags that label the locations of origins a sub-set is designated a 'tag group'.

[0018] The predetermined range of values is a user selectable parameter, so for instance the user may select a sub-group, or tag group, made up of data streams that are correlated to better than 90%. The degree of correlation may be changed by the user and this may automatically flow through to a change in the composition of the group. A similar result may automatically be achieved when making other changes, such as changing the amount of lag in correlation.

[0019] As time passes and more data is received, the calculating step may be performed again to update the relationship data. The step may even be performed repeatedly in real time.

[0020] The relationship data may be displayed in a first form as a matrix with a single datum in each cell of the matrix. The relationship data calculated for each data stream will appear in both a row and a column of the matrix. The matrix may be convertible directly to raster.

[0021] The rows and columns may be grouped according to the value of the relationship data, in other words the tag groups may automatically be collected together.

[0022] The relationship data may be displayed in a second form as a diagram of metadata having locations marked according to their corresponding relationship datum. The location of the source of each data stream may be indicated in the diagram of metadata.

[0023] The relationship data may be displayed in a third form as a list.

[0024] The data streams may also be displayed in the form of time-series data.

[0025] Historical values of the relationship data or data streams may be displayed.

[0026] Correlations between a pair of data streams may be displayed as a function of lagged time.

[0027] Coding may be used to identify different sub-sets in the display, and this coding may survive when a different view is selected so that a tag group highlighted in one group is still highlighted when the view is changed. The coding may be color coding or shading. A user may be able to select a sub-set by:

[0028] clicking on a cell in the matrix;
`
[0029] clicking on a marked location in the meta-data diagram; or,

[0030] clicking on a datum in the list.

[0031] A neural network may be trained to model the state space of the process.

[0032] In another aspect the invention is a computer system for performing the method.

[0033] A further aspect of the invention is computer software for performing the method.

[0034] In the claims of this application and in the description of the invention, except where the context requires otherwise due to express language or necessary implication, the words "comprise" or variations such as "comprises" or "comprising" are used in an inclusive sense, i.e. to specify the presence of the stated features but not to preclude the presence or addition of further features in various embodiments of the invention.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0035] In order to provide a better understanding of the present invention preferred embodiments will be described below, by way of example only, with reference to the accompanying drawings, in which:

[0036] FIG. 1 is a schematic view of information flow between parts of an embodiment of the present invention.

[0037] FIG. 2 is a large scale visualization of a cross-correlation matrix (717x717 variables).

[0038] FIG. 3 is a small scale visualization of the cross-correlation matrix of FIG. 2 (approx 40x40 variables).

[0039] FIG. 4 is a process view showing tag grouping. The selected tag is displayed as a filled square. The related tags are displayed as filled circles.

[0040] FIG. 5 is a process view showing tag similarity. The selected tag is displayed as a filled square. Other tags are displayed as filled circles, with the shading indicating the degree of correlation according to the currently defined shading mapping.

[0041] FIG. 6 is a signal view showing changes over time for process variables and alarms in a tag group.

[0042] FIG. 7 is a signal view showing signal amplitude using shading rather than plotting on the vertical axis. This is useful for visually identifying patterns in large sets of tags.

[0043] FIG. 8 is a signal view showing a small set of variables with scale information.

[0044] FIG. 9 is a signal view showing all alarm events over a two month period.

[0045] FIG. 10 is a lags view showing cross-correlation between a pair of variables as a function of time.

[0046] FIG. 11 is a state space view labeled according to key performance indicators.

DETAILED DESCRIPTION

[0047] The embodiment described here is used as a Process Data Management System (PDMS), which deals with data from industrial processes. It will be appreciated that the present invention may be used to analyze data from other sources.

[0048] Due to the amount of data produced by a typical industrial process, and the speed at which it must be handled, specialized data structures have been developed to represent this information. An industrial process is intended to mean a non-trivial process in which one or more raw materials are converted into a product. Typically some of the variables in the process may be controlled, such as for example temperature, pressure, flow rate, amount of a raw material. Some of the variables may not be able to be controlled, such as for example ambient temperature, or purity of a raw material. Some examples of industrial processes include an ore refining process, a production line process, a mining process and a construction process. These lists are exemplary and are not intended to be limiting.

[0049] FIG. 1 shows a schematic overview of a process of producing visualizations from imported data according to an embodiment of the present invention. As will be described below the visualizations allow the data from the process to be analyzed to gain an understanding of the process or characteristics of the process. Data 12 is provided from a number of sources. The data 12 is divided into process data 14 and event data 16.

[0050] Process data 14 is regularly-sampled time-series data collected from sensors in the process. The characteristic being measured by the sensor is referred to as a variable and the value(s) of the variable at a given moment in time forms an element of data. Typically, the signals are sampled continuously, with averages being recorded every minute. For a process with 1000 variables, this equates to approximately 1.5 million data elements per day. Occasionally, there are problems with sensors, or with the collection of data from the process historian. This means that data may not be available continuously, and may have "holes". Process data 14 is obtained from an Excel spreadsheet, a text file, an OPC-HDA or an SQL database. (OPC stands for "OLE for Process Control". OLE is a Microsoft protocol for communicating between application processes. OPC is a set of communication protocols used by the process industry, based on OLE communication mechanisms. OPC protocols include: OPC-DA (or OPC Data Access) for real-time access to the values of process variables and OPC-HDA (or OPC Historical Data Access).)

[0051] Event data 16 is irregular data generated to describe events or exceptional conditions. An example of event data is an alarm which is triggered when a certain condition or conditions is/are met. Event data 16 may be obtained from an SQL database or text file.

[0052] The process will usually have process meta-data. The meta-data is data about the process, rather than data collected by operation of the process. It may include descriptions of the structure of the process (for example plant drawings) and the meaning of process variables etc.

[0053] The process data 14 and event data 16 are collected into databases 18. The databases include a process database 20 and an event database 22 and a meta-data database 24. These databases 18 are used to produce dependent databases.

[0054] Correlation techniques are applied to the process data 14 in the process database 20 and event data in the event database 22 to find similarities between variables. The resulting correlation data is saved in a correlation database 26.

[0055] The correlation database 26 can then be used to tag variables that are similar to one another. Such similar variables are stored in a tag group set 28.

[0056] The process data 14 in the process database 20 and event data 16 in the event database 22 may also be used to train a neural network to generate a model of the process. In this example a self organizing map (SOM) model 30 is generated. The SOM model can be used to classify the state of the process and to produce state labels 32.
`
`
[0057] The resulting information can then be used to visualize various aspects of the process. Visualizations 34 can be produced from this information to determine different aspects about the process. The visualizations 34 are useful to show a user, such as a process engineer, what the process is actually doing, as opposed to what the process ought to be doing. The visualizations 34 aim to improve the insight of the engineer into the workings of the process. The visualizations can reveal unexpected relationships, confirm that relationships that were thought to exist do in fact exist, and reveal relationships that should have been obvious as a logical consequence of the process design but for which the engineer may not have made the required deductive link.

[0058] The examples of the visualizations 34 include: a correlation matrix view 36, which uses information from the correlation database 26 and the tag group set 28; a signals view 38, which uses information from the tag group set, the process database and the event database; a lags view 40, which uses information from the correlation database and the process database; a process view 42, which uses information from the tag group set 28, the correlation database 26 and the process meta-data 24; and a Model View 44, which may also be visualized as will be described further below. Other visualizations are possible.

Data

[0059] The process data 14 is imported and stored in the process database 20. The process database 20 holds the process data 14 as a set of values over time for each of the variables in the process. It is important that process data 14 be represented in a way that is both compact and efficient to access. For rapid visualization, it is important to be able to quickly retrieve samples based on a given time range. While general purpose databases are useful in many applications, they impose an additional layer of software and processing between the application and its data. In the PDMS, this may not be acceptable because of the required speed at which information must be processed. Therefore, specialized representations may be used that use domain information to improve speed and reduce the size of the stored data.

[0060] Each process variable may define a series of components to its value over time. For example, each sample may have the following components:

[0061] Time (32-bit integer).

[0062] Duration (32-bit integer). Together with start time, this indicates the time interval over which the sample is valid.

[0063] Value (32-bit float).

[0064] Range (2x32-bit floats). For samples that have been derived from a number of other samples, the system optionally stores a maximum and minimum in addition to the value. This allows (for example) a visualization of a decimated time series to display the full range of the signal for each sample.

[0065] Extra Attributes (8- or 32-bit integer). Each sample may be tagged with one or more additional Boolean or integer attributes packed into integer bit-fields. The main system-defined attribute is Quality, which is defined for data imported from OPC-HDA data sources. Other tags may be defined by the user, and applied on a per-sample basis to stored data.

[0066] There is usually a certain amount of redundancy in the process data 14 that means that not all of the components need to be stored. The PDMS can use information about this redundancy to reduce the size of the stored data, and improve retrieval time.

[0067] Time: Most data is periodic, so a stream can be represented as a sequence of periodic regions. Each region is defined by a start time, sampling period (duration), and a number of evenly spaced, contiguous samples. Time and duration are not explicitly stored for each sample, but are calculated from the region header. Providing the number of holes (i.e. breaks in the periodicity) is small, this representation roughly halves the storage per sample.

[0068] Range: Most data that has been imported from a Distributed Control System ("DCS") is averaged, but does not define the range of the original values. For this data, the range is not stored but is defined to be equivalent to the value.

[0069] Attributes: If a quality measure is not available and no user-defined attributes are defined then there are no additional attributes to be stored, and this field is omitted in the data. If quality is defined, the user may choose to filter out "bad" values in pre-processing, in which case all samples in the time-series are implicitly "good" and again, the attribute field is omitted.

[0070] Quantization: with the above considerations, most time-series data can be represented using a 4-byte float data type per sample. If less than 32 bits of precision is required, it is possible to quantize the data using a per-stream scale and offset factor to map between 32-bit floats and 8- or 16-bit integers. Repeats: when consecutive periodic samples have the same values for attributes that are defined (i.e. value, range, and extra) a run-length encoding is used. Values are stored just once along with a repeat count.

[0071] For periodic data, samples can be rapidly located using a computable offset from the start of each region. For aperiodic data, a binary search allows a given sample to be located in O(log(N)) time, for N samples.

[0072] When process data 14 is imported into the process database 20, certain statistics of the data 14 are calculated and stored in the process database 20 with the data stream. These include: mean, standard deviation, various central moments (skewness, kurtosis), maximum, minimum, and frequency distribution (represented as a histogram using a pre-set number of frequency bins). This information is used during visualization to provide an appropriate scaling for display. The frequency distribution is also used for display, and for certain types of normalization.

[0073] Compression of the process database 20 is not preferred. Many well-known techniques of compression exist including boxcar, backward slope, and straight line interpolation methods. These techniques are lossy (i.e. they discard information) so the reconstructed data may be inaccurate in ways that could be statistically significant. However it is anticipated that some versions of the PDMS may incorporate data compression as an option.

[0074] A facility to decimate time-series data (i.e. to reduce the sampling rate) after filtering out high frequency components may be included. In doing so, it preserves the range information in the resulting data stream because this is an important indicator of variability. This makes it possible to pre-compute a representation of each signal at a number of pre-defined time scales (e.g. 1 minute, 10 minutes, 1 hour, 1 day). This technique (similar to "MIP maps" in 3D graphics) can be used to further accelerate the display of data over long time-scales.
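The range-preserving decimation described above can be sketched as follows. This is a minimal illustration rather than the PDMS implementation: the names `DecimatedSample`, `decimate` and `build_time_scales` are hypothetical, and simple bucket averaging stands in for the unspecified high-frequency filter.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class DecimatedSample:
    value: float  # mean of the source samples in the bucket
    low: float    # minimum of the source samples
    high: float   # maximum of the source samples


def decimate(values: Sequence[float], factor: int) -> List[DecimatedSample]:
    """Reduce the sampling rate by `factor`, keeping the range of each bucket.

    Averaging each bucket acts as a crude low-pass filter; this choice of
    filter is an assumption, not something the text specifies.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    result = []
    for start in range(0, len(values), factor):
        bucket = values[start:start + factor]
        result.append(
            DecimatedSample(sum(bucket) / len(bucket), min(bucket), max(bucket))
        )
    return result


def build_time_scales(
    minute_values: Sequence[float], factors: Sequence[int] = (10, 60, 1440)
) -> Dict[int, List[DecimatedSample]]:
    """Pre-compute one decimated copy per time scale (factors are in minutes)."""
    return {factor: decimate(minute_values, factor) for factor in factors}


# One hour of hypothetical one-minute averages: a ramp from 0 to 59.
minute_values = [float(i) for i in range(60)]
scales = build_time_scales(minute_values, factors=(10, 60))
ten_minute = scales[10]
hourly = scales[60]
```

A display drawing a year of data could then read the coarsest pre-computed scale and still show the full excursion of the signal, because each decimated sample carries its own minimum and maximum.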
`
[0075] The PDMS includes utilities for importing process data from a number of sources:

[0076] Spreadsheet files.

[0077] Text files.

[0078] Databases.

[0079] OPC-HDA servers.

[0080] Spreadsheet files are typically encoded using Microsoft Excel data formats. Many tools shipped with DCS or process historians allow data to be exported in this format. However, there are many limitations on what data can be represented in spreadsheets. Typically, worksheets can have at most 255 columns and 65535 rows. To overcome these limitations, the import system allows process data to be distributed across multiple directories, spreadsheets, and worksheets. An import "wizard" may be used to allow the user to specify what data to import, and how the different sample attributes and meta-data attributes are encoded.
`
[0081] OPC-HDA is a Distributed Component Object Model ("DCOM") based protocol for importing historical data from process historians. DCOM is a Microsoft protocol for communicating between application programs that may be running on different machines. Typically, a process historian (e.g. Pi) collects data in real-time from a DCS system and stores it in a specialized database, usually with the aid of various compression techniques. The OPC-HDA protocols allow clients to retrieve the stored data. This includes:

[0082] Time

[0083] Value

[0084] Quality

[0085] Process data 14 may be imported directly from OPC-HDA servers.
`
[0086] One problem with certain import methods is that process meta-data is not available. For example, OPC-HDA servers often do not support tag browsing. Therefore, a mechanism to separately import meta-data from text files (in CSV format) may be implemented.
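A separate CSV meta-data import of this kind might look like the sketch below. The column headings (`tag`, `description`, `units`), the sample tag names and the `TagMeta` type are assumptions made for illustration; the text does not define a file layout.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, TextIO


@dataclass(frozen=True)
class TagMeta:
    tag_name: str
    description: str
    units: str


def load_tag_metadata(stream: TextIO) -> Dict[str, TagMeta]:
    """Read one meta-data record per CSV row, keyed by tag name."""
    table: Dict[str, TagMeta] = {}
    for row in csv.DictReader(stream):
        name = (row.get("tag") or "").strip()
        if not name:
            continue  # skip rows that do not name a tag
        table[name] = TagMeta(
            tag_name=name,
            description=(row.get("description") or "").strip(),
            units=(row.get("units") or "").strip(),
        )
    return table


# Hypothetical file contents; a real import would open a file instead.
SAMPLE_CSV = """tag,description,units
FEED_TEMP,Feed temperature,degC
FEED_FLOW,Feed flow rate,m3/h
"""

metadata = load_tag_metadata(io.StringIO(SAMPLE_CSV))
```

Keying the table by tag name lets meta-data imported this way be joined to process data streams that arrived from a source without tag browsing.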
[0087] Events 16 are conditions with well defined time and duration. Events are usually related to alarm conditions. Change in alarm state is described by several types of events. Alarm events indicate the time at which an alarm started. Return events indicate when the alarm stopped. Other events indicate how the operators respond to the alarms, for example Enable, Disable, and Acknowledge. Other kinds of operator actions may also be recorded, for example changes to operating set points and operating modes.
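As an illustration of how alarm and return events bracket an alarm condition, the following sketch pairs each alarm event with the next return event on the same tag. The tuple layout, the tag names and the rule of one open alarm per tag are simplifying assumptions, not part of the described system.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[int, str, str]  # (time, tag, event type)


def alarm_intervals(events: Iterable[Event]) -> List[Tuple[str, int, int]]:
    """Pair alarm and return events per tag; events must be sorted by time."""
    open_alarms: Dict[str, int] = {}
    intervals: List[Tuple[str, int, int]] = []
    for time, tag, kind in events:
        if kind == "alarm":
            # Keep the earliest start if the alarm is raised again before returning.
            open_alarms.setdefault(tag, time)
        elif kind == "return" and tag in open_alarms:
            intervals.append((tag, open_alarms.pop(tag), time))
        # Other event types (acknowledge, operator action) do not end an alarm.
    return intervals


# Hypothetical event stream.
events = [
    (100, "TI101", "alarm"),
    (130, "TI101", "acknowledge"),
    (150, "FI202", "alarm"),
    (160, "TI101", "return"),
    (400, "FI202", "return"),
]
intervals = alarm_intervals(events)
```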
[0088] Typically, event streams are used for visualization or alarm analysis. However, for visualization it is important that the event data be efficiently accessible, so the visualization tools generally require a fast binary representation to be used.
`
[0089] The Event Database 22 is a stream of events 16 defined for a number of event variables. In this context, an event variable corresponds to a state of a DCS tag. Events are defined by the following attributes:

[0090] Time.

[0091] Tag.

[0092] Event Type (alarm, return, acknowledge, operator action).

[0093] Subtype (HI, HIHI, etc).

[0094] Priority (high, low, emergency, diagnostic, etc).
`
[0095] Events are stored in a compact binary representation. Times are strictly ordered, so that the closest event to a given time can be located in O(log(N)) time, where N is the number of events. Most attributes are of enumerated types (tag, event type, subtype, and priority) and are represented using small integers (8- or 16-bits). Small look-up tables are used to map these integers to/from string tags. This also ensures that event records have a fixed size, which makes indexing simpler. Each event record also contains a pointer to the next and previous event of the same type, so it is quite efficient to enumerate all of the events of a given type, or to find (for example) the next return event corresponding to a given alarm event.
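The fixed-size record, look-up tables, same-type links and binary search described above can be sketched as follows. The field widths, the byte layout in `RECORD`, the use of record indices in place of pointers, and all names are assumptions chosen for illustration.

```python
import struct
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: time (uint32), tag code (uint16), event type, subtype
# and priority codes (uint8 each), then the indices of the next and previous
# record of the same event type (int32, -1 when there is none).
RECORD = struct.Struct("<IHBBBii")
NONE = -1


class LookupTable:
    """Maps strings to small integer codes and back."""

    def __init__(self) -> None:
        self.names: List[str] = []
        self.codes: Dict[str, int] = {}

    def code(self, name: str) -> int:
        if name not in self.codes:
            self.codes[name] = len(self.names)
            self.names.append(name)
        return self.codes[name]


def pack_events(events: Sequence[Tuple[int, str, str, str, str]],
                tags: LookupTable, types: LookupTable,
                subtypes: LookupTable, priorities: LookupTable) -> bytes:
    """Pack (time, tag, type, subtype, priority) tuples, sorted by time."""
    rows = [[t, tags.code(tag), types.code(ty), subtypes.code(st),
             priorities.code(pr), NONE, NONE]
            for t, tag, ty, st, pr in events]
    last_of_type: Dict[int, int] = {}
    for i, row in enumerate(rows):
        prev = last_of_type.get(row[2], NONE)
        row[6] = prev              # link back to the previous same-type record
        if prev != NONE:
            rows[prev][5] = i      # and forward from it to this record
        last_of_type[row[2]] = i
    return b"".join(RECORD.pack(*row) for row in rows)


def read(buf: bytes, index: int) -> tuple:
    """Fixed-size records make the byte offset a simple multiplication."""
    return RECORD.unpack_from(buf, index * RECORD.size)


def closest_event(buf: bytes, when: int) -> int:
    """Index of the record closest in time to `when`, found by binary search."""
    count = len(buf) // RECORD.size
    if count == 0:
        raise ValueError("no events")
    lo, hi = 0, count
    while lo < hi:                 # find the first record with time >= when
        mid = (lo + hi) // 2
        if read(buf, mid)[0] < when:
            lo = mid + 1
        else:
            hi = mid
    if lo == count:
        return count - 1
    if lo == 0:
        return 0
    before, after = read(buf, lo - 1)[0], read(buf, lo)[0]
    return lo - 1 if when - before <= after - when else lo


tags, types, subtypes, priorities = (LookupTable() for _ in range(4))
buf = pack_events(
    [(100, "TI101", "alarm", "HI", "high"),
     (130, "TI101", "acknowledge", "HI", "high"),
     (160, "TI101", "return", "HI", "high"),
     (200, "FI202", "alarm", "HIHI", "emergency")],
    tags, types, subtypes, priorities)
nearest = closest_event(buf, 150)
next_alarm = read(buf, 0)[5]
```

Because every record has the same size, both the binary search and the same-type links work on plain record indices with no per-record parsing.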
[0096] Event streams may originate from a number of sources:

[0097] Event logs (e.g. text printed by a DCS)

[0098] Event databases, stored in database tables or spreadsheets.

[0099] Normally, events are generated by the DCS, and are logged in an external system. This may be an external process historian, or a customized system like an IMAC logger.
[0100] The PDMS imports event streams from text streams, or from databases. For database import, the user specifies which columns of the input correspond to the event attributes listed above. The user can also define specific mappings between the values of these fields and the resulting enumeration value (e.g. there may be more than one string used to represent an event type, or sub-type). This allows the conversion and the event model to be customized for a particular site.
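A site-specific mapping of this kind could be expressed as in the sketch below. The column names, the alternative spellings and the `EventType` enumeration are hypothetical; only the idea of mapping several strings onto one enumeration value comes from the text.

```python
from enum import IntEnum
from typing import Dict, Mapping


class EventType(IntEnum):
    ALARM = 0
    RETURN = 1
    ACKNOWLEDGE = 2
    OPERATOR_ACTION = 3


# Hypothetical user-defined configuration for one site: which input column
# holds each event attribute, and which strings mean which event type.
COLUMN_MAP = {"time": "Timestamp", "tag": "Point", "event_type": "Kind"}
EVENT_TYPE_MAP = {
    "ALM": EventType.ALARM,
    "ALARM": EventType.ALARM,
    "RTN": EventType.RETURN,
    "NORMAL": EventType.RETURN,
    "ACK": EventType.ACKNOWLEDGE,
    "CHANGE": EventType.OPERATOR_ACTION,
}


def convert_row(row: Mapping[str, str],
                column_map: Mapping[str, str] = COLUMN_MAP,
                type_map: Mapping[str, EventType] = EVENT_TYPE_MAP) -> Dict[str, object]:
    """Translate one input row into the internal event attributes."""
    raw_type = row[column_map["event_type"]].strip().upper()
    return {
        "time": int(row[column_map["time"]]),
        "tag": row[column_map["tag"]].strip(),
        "event_type": type_map[raw_type],  # KeyError flags an unmapped string
    }


converted = convert_row({"Timestamp": "100", "Point": "TI101", "Kind": "alm"})
```

Keeping the mappings as data rather than code is one way the same importer could be reused across sites whose DCS logs spell event types differently.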
[0101] Process meta-data 24 is information about the process, as distinct from information collected from the process. This includes:

[0102] Descriptions of the variables and events in a process. This information is used in the analysis and visualization of data. It includes the DCS name, description, measurement units, and any other information about the measurement (e.g. sensor type, precision, etc).

[0103] Descriptions of the relationships between the variables. For example, a measurement point may be associated with more than one process variable. A variable that is controlled automatically may have, in addition to its value, a set-point and a controller output.

[0104] Descriptions of the structure of the process. Normally, a process is logically divided into separate units. This defines specific physical and functional relationships between variables.

[0105] Drawings of the process structure. This includes process and instrumentation drawings (P&ID).

[0106] Meta-data is used for visualization, and during analysis to select variables based on criteria that are meaningful in the domain.
[0107] Several types of meta-data may be represented within PDMS. Each stream of process data is associated with the following attributes:

[0108] Tag Name

[0109] Description

[0110] Units

[0111] Precomputed statistics and frequency distribution.

[0112] This information is stored in the process meta-data database 24.
`
[0113] Certain types of visualization in the PDMS make use of process drawings. The drawings are stored as image files (e.g. using GIF format). These files can be produced by exporting the data from a CAD system, or by scanning printed drawings. They can be annotated by the user to indicate the position of important process variables. The annotation is stored using an XML data format. The process database may include a drawing database comprising multiple drawings, each with an associated image and XML annotation.
`[0114] Most existing tools require that data be memory
`resident. That is, they assume they can hold all the relevant
`data in memory. This limits the quantity of data that can be
`analyzed. The PDMS uses data structures that are usually
`stored on disk, and hence do not rely upon the availability of
`adequate computer memory. The PDMS can deal with large
`data vectors collected over long time intervals. This leads to
`datasets that are very large, and can exceed the available
`memory in any typical high end computer. Indexing methods
`are included that allow fast retrieval of data from disk and fast
`manipulation in memory. Recursive decomposition of data to
`optimize data for the time-scale of interest avoids using sub-second
`data for a year’s analysis but also avoids data loss that
`is common in process data compression algorithms used in
`most historical visualization tools.
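One way to picture the recursive decomposition described above is a pyramid of summaries, where each level aggregates blocks of the level below and keeps the minimum, maximum, and mean so that extremes are not lost. The sketch below is an assumption about how such a structure could look, not the method of the specification: the names `Summary`, `build_levels` and `pick_level` are hypothetical, and everything is held in memory for brevity, whereas the PDMS is described as keeping its structures on disk.

```python
from typing import List, NamedTuple, Sequence


class Summary(NamedTuple):
    """Aggregate of a block of raw samples; extremes are kept, not discarded."""
    lo: float
    hi: float
    total: float
    count: int

    @property
    def mean(self) -> float:
        return self.total / self.count


def merge(a: Summary, b: Summary) -> Summary:
    return Summary(min(a.lo, b.lo), max(a.hi, b.hi), a.total + b.total, a.count + b.count)


def build_levels(samples: Sequence[float], factor: int = 2) -> List[List[Summary]]:
    """Level 0 is the raw data; each later level merges `factor` neighbours."""
    if factor < 2:
        raise ValueError("factor must be at least 2")
    level = [Summary(v, v, v, 1) for v in samples]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), factor):
            block = level[i:i + factor]
            acc = block[0]
            for s in block[1:]:
                acc = merge(acc, s)
            nxt.append(acc)
        level = nxt
        levels.append(level)
    return levels


def pick_level(levels: List[List[Summary]], max_points: int) -> List[Summary]:
    """Return the finest level that has no more than `max_points` entries."""
    for level in levels:
        if len(level) <= max_points:
            return level
    return levels[-1]


levels = build_levels([1.0, 5.0, 2.0, 8.0, 3.0, 3.0, 9.0, 0.0])
coarse = pick_level(levels, max_points=3)
```

A year-long view would read only a coarse level, while a zoomed-in view reads the finer levels for the short interval of interest.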
`[0115] The PDMS deals with data from both batch and
`continuous processes. There are very few tools available for
`batch processes. This is because of the complexity of the
`description of batch processes. Batch processes require two
`time dimensions to handle both elapsed time and time in a
`process state. They also require a description of the actual
`process equipment associated with any particular batch
`because multiple processing paths may exist through a typical
`batch process. They also require a representation of the state
`of the process and the current process step being employed to
`be recorded in the datasets.
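The two time dimensions mentioned above can be illustrated by tagging each sample with both the time elapsed since the start of the batch and the time spent in the current process state. The sketch below assumes the state changes are available as a sorted list of (time, state) pairs. The names `BatchSample` and `annotate`, and the state labels, are hypothetical.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Sequence, Tuple


class BatchSample(NamedTuple):
    elapsed: float        # time since the batch started
    state: str            # process state (step) active at the sample time
    time_in_state: float  # time since that state was entered
    value: float


def annotate(
    samples: Sequence[Tuple[float, float]],      # (timestamp, value)
    transitions: Sequence[Tuple[float, str]],    # (timestamp, state), sorted by time
    batch_start: float,
) -> List[BatchSample]:
    starts = [t for t, _ in transitions]
    out = []
    for t, value in samples:
        k = bisect_right(starts, t) - 1  # index of the last transition at or before t
        if k < 0:
            continue  # sample precedes the first known state
        state_start, state = transitions[k]
        out.append(BatchSample(t - batch_start, state, t - state_start, value))
    return out


rows = annotate(
    samples=[(5.0, 1.0), (10.0, 2.0), (12.5, 3.0)],
    transitions=[(0.0, "charge"), (10.0, "react")],
    batch_start=0.0,
)
```

With both time axes recorded, batches of different lengths can be compared either from the start of the batch or aligned on entry to a given state.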
`
`Correlation
`
`[0116] The correlation database 26 comprises correlation
`data. Correlation data measures the similarity between process
`variables. The PDMS computes the lagged correlations
`for all pairs of variables, up to a defined time lag.
`Given a data series $x_i$, the mean $\bar{x}$ is:
`
`$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
`
`For two data series $x_i$ and $y_i$, the covariance $s_{xy}$ is:
`
`$$s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i y_i - \bar{x}\,\bar{y} \right)$$
`
`The simple variance $s_x$ of $x_i$ is:
`
`$$s_x^2 = s_{xx}$$
`
`The correlation $R_{xy}$ of two series $x_i$ and $y_i$ is the covariance
`normalized by the product of the variances of the two series.
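The definitions above can be sketched in code as follows. Several points are assumptions rather than statements of the specification: the covariance divisor is taken as n − 1, the normalization is the usual Pearson form (division by the square root of the product of s_xx and s_yy), and a lag of k is taken to mean comparing x at time i with y at time i + k. The function names are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample covariance s_xy, written as the sum of (x_i*y_i - mean_x*mean_y)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    return sum(xi * yi - mx * my for xi, yi in zip(x, y)) / (n - 1)  # assumed divisor


def correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Covariance normalized by sqrt(s_xx * s_yy); undefined for a constant series."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


def lagged_correlations(
    x: Sequence[float], y: Sequence[float], max_lag: int
) -> Dict[int, float]:
    """Correlation of x[i] with y[i + lag] for each lag from 0 to max_lag."""
    n = len(x)
    return {lag: correlation(x[: n - lag], y[lag:]) for lag in range(max_lag + 1)}


x = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
y = [0.0] + x[:-1]  # y repeats x one sample later
r = lagged_correlations(x, y, max_lag=2)
best_lag = max(r, key=r.get)
```

Computing this for every pair of variables gives, for each pair, a correlation value per lag. The lag with the largest magnitude suggests how far one variable trails the other.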
