`
`
`
`U811116173323Bl
`
`(12) United States Patent
`US 6,173,323 B1
`(45) Date of Patent:
`Jan. 9, 2001
`Moghe
`
`(10) Patent 1%.:
`
`(54) ADAPTIVE POLLING RATE ALGORITHM
`FOR SNMP-BASEI) NETWORK
`MONITORING
`
`(75)
`
`Inventor:
`
`I’I‘atyttsh Moghe, Aberdeen, NJ (US)
`
`(73) Assignee: Lucent Technologies Inc, Murray Hill,
`NJ (US)
`
`( * ) Notice:
`
`Under 35 U.S.C. 154th], the term of this
`patent shall be extended [or 0 days.
`
`(21) Appl. No.: 081998313
`
`(22
`
`Filed:
`
`Dec. 24, 1997
`
`Int. CL? ........................... G061" [5116; (1061i 15r'173
`(51)
`
`.....
`__
`__
`(52) U.S. Cl.
`7091224; T092315; 7095225;
`340182507; 340182555; 37932.01
`(58) Field of Search ..................................... 709,924. 226.
`709.235, 227; 340825.07, 825.08, 825.55;
`372220; 710K220; 37992.01
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`791986 Clark ct a1.
`.......................... 304.:“200
`4,598,363 "
`.. 34“.?825118
`8,-‘1987 O‘Brien
`4.689619 *
`
`2.!1990 Cain et al.
`....... 370323?
`4.905.233 *
`”1992 Weinberger et at.
`..... Nail-’16
`5,084,825 *
`
`5,193,151 * M1993 Jain ctal.
`7093237
`11’1994 Rantaniurthy ct a1.
`5.2mm? *
`320.3232
`
`5,633,859 *
`5,!199?
`Iain elal.
`3TIIL’234
`5,659,282 *
`811997 Schiellz
`7093226
`
`4H9!!!) Scltrader et a.
`.. 455.3611
`5,896,561 *
`23‘1999 Sidey ................................... 2119,3223
`5,922,051 *
`OTHER PUBLICATIONS
`
`A multiple access scheme I‘or Wireless access to a broadband
`ATM LAN polling and seetored antennas, Mahmoud, A.S.;
`Mahmoud, 8A.; Falconer. DD. Dept. of Syst. & Compt.
`
`Eng. Carleton Univ., Ottawa, Ont. Canada, pp. 596—608,
`May, 199().*
`Adaptive polling schemes [or an ATM bus, Karlsson, .I.M.;
`Perros, II.G.; Viniolis, 1., Center for Common.
`81. Signal
`Process, North Carolina State Univ., Raleigh, NC, USA, pp.
`403-417, ISBN: (5—7803-0006—8, May 1996.*
`J. Case et al., "A Simple Network Management Protocol
`(SNMP)", RFC 1157, Network Working Group, IETF, 1990.
`R. Storm, “0&A". Open View Adviser, pp. 11—13, 1(2),
`Mar. 1995.
`
`AB. Bondi, "A nonblocking mechanism for regulating the
`transmission of network management polls”, Ill/1’92. pp.
`565—580, May 1997.
`Analysis of symmetric nonexhaustive polling with multiple
`server, Marsanm, de Moraes, Donatelli, Neri, INITOCOM
`")0 May 6,1991].*
`
`* cited by examiner
`
`Priming- browser—Mark I-I. Rineharl
`Assistant Exmniner—Beatriz Prieto
`
`(57)
`
`ABSTRACT
`
`A rate adaptive polling method sensitive to network eon-
`gestion is used to monitor network resources. Initially, a
`baseline rate constraint is set based on desirable engineering
`factors. Unacknowledged polls are retransmitted with
`timeouts, successively increasing by a [actor of 2. During
`each polling round, the rate is adapted in accordance with
`congestion feedback and changes in node status such as
`recently activated or deactivated nodes. Congestion feedw
`back is estimated from a congestion metric computed as the
`average delay of all active nodes. Depending on the value of
`the congestion metric corresponding to polling round R}, the
`minimum rate ol~ polling for round R”, is adapted accord-
`ingly.
`
`3 Claims, 1 Drawing Sheet
`
`POLLING ROUND j+1 HITH Nj.” POLLS
`POLLING HOUNO j HITH N j POLLS
`
`furious rant
`5
`NHS
`t
`
`taI
`
`\
`
`r
`
`I1t'
`
`HATE ESTIMATION
`FOO ROUND 3
`
`HATE ESTIMATION
`FOR HOUNO j+1
`
`RATE ESTIMATION
`FOR ROUND j+2
`
`001
`
`US. Patent No. 8,724,622
`
`Apple Inc.
`APLl 1 19
`
`001
`
`Apple Inc.
`APL1119
`U.S. Patent No. 8,724,622
`
`
`
`US. Patent
`
`Jan. 9, 2001
`
`US 6,173,323 B1
`
`1
`FIG.
`[PRIOR ART]
`
`,5
`OTHER
`
`
`
`
`iO
`
`POLLING
`MESSAGES
`
`
` SUBNET 19
`
`FIG. 2
`
`
`
`POLLING ROUND j HITH Ni POLLS POLLING ROUND j+1 HITH Nj+1 POLLS
`
`‘.
`
`POLLS FROH--\
`HMS
`
`\
`
`E
`:
`
`
`
`HATE ESTIMATION
`FOR ROUND j
`
`HATE ESTIMATION
`FOR ROUND j+1
`
`HATE ESTIMATION
`FOR ROUND j+2
`
`002
`
`002
`
`
`
`US 6,173,323 B1
`
`1
`ADAPTIVE POLLING RATE ALGORITHM
`FOR SNMP-IIASEI) NETWORK
`MONITORING
`
`TECHNI CAI . FIEI .D
`
`This invention relates generally to the field of network
`management and in particular to a method for eflicicnt
`polling of network hosts and resources and a network
`manager for implementing the same.
`BACKGROUND 0]: TIIE INVENTION
`
`1!]
`
`Referring to FIG. 1, identifying the various hosts and
`resources 13, 15 and 17—19, connected and available to a
`network 18 is a vital part of proper network management.
`Typically one host on the network is assigned the task of
`network manager ("NM") 10, running appropriate software,
`while the remaining hosts and resources are identified as
`agents. The manager 10 will periodically request informa-
`tion from the agents using one of a variety of protocols, e.g.
`Simple Network Manager Protocol (“SNMP”) at the appli-
`cation layer, or Packet Internet Groper ("PING“) at the IP -
`layer, and expect a response from each agent using the same
`protocol. This process is referred to as "polling."
`Eflicient polling is becoming increasingly important with
`new bandwidth—intensive applications such as conferencing
`and web-push applications. Proper monitoring of the net-
`work can help deploy such applications. The collected traflic
`pattern of the network can be used to set and administer
`policies on application use, to configure intranet switches
`and routers, and to detect errant behavior. The challenge in
`polling is to be able to poll with high throughput, yet avoid
`intruding on the performance of user applications in the
`network.
`
`_
`
`an
`
`Presently, the de t‘acto network management software is
`marketed by Hewlett Packard under the trade name Open-
`View (“0V”) and described more fully in R. Sturm, "0&A”,
`OpenView Advisor, pp. 11—13, 1(2) (March 1995), hereby
`incorporated by reference as if fully set forth herein. In
`operation, 0V can transmit a maximum of N outstanding
`polls, where N is typically 3. For example, NM ll] may send
`a lirst poll to node 17. NM 10 will
`then wait a time-out
`period T“ for a response. If no response is received, a
`second poll
`is sent
`to node 17 and NM 10 will wait a
`timeout period 2xT1, for a response. This is repeated two
`more times with corresponding time-out periods of 4><T1 and
`8x’l'1, respectively. NM 10 resends the poll these four times
`with continuously increasing timeout periods in order to
`rule out that the lack of a response is due to network traffic.
`If after four attempts no response to the polls is received,
`NM It] concludes that node 17 is unavailable to the network.
`It has been reported that the foregoing method of polling
`network resources is inefficient and leads to network "freeze
`up” as for example when a succession of nodes are unre-
`sponsive. See,A. B. Bondi, "ANonblocking Mechanism for
`Regulating the transmission of Network Management Polls"
`1M ‘97, pp. 565—80 (May 1997), hereby incorporated by
`reference as if fully set forth herein. One proposed solution
`to ameliorate the deficiencies of the 0V method is described
`in Bondi, supra. As described more fully therein, a minimum
`time threshold 1'.
`is set under which no two polls can be
`transmitted, while the maximum number of polls parameter
`N of 0V is eliminated. As a result, freeze up is avoided by
`removing the constraint of N and network traffic is managed
`with constraint ‘5. However, even this proposal suffers from
`its inability to provide feedback relating to network conges—
`tion. The ability of a poller to quickly adapt to congestion is
`becoming increasingly important. Newer applications
`require continual monitoring, for example, at a rate 01.5% of
`network capacity. Such monitoring can cause a network that
`is already congested with user traffic to collapse.
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`003
`
`2
`SUMMARY OF THE INVENTION
`
`Accordingly, the present invention provides an adaptive
`rate of polling which is sensitive to network congestion. In
`accordance with the invention the network resources are
`polled with a minimum rate of transmission to avoid “freeze
`up,“ while also providing feedback as to network conges—
`tion. Initially, a baseline rate constraint
`is set based on
`desirable engineering factors. Unacknowledged polls are
`retransmitted with timeouts, successively increasing by a
`factor of 2. During each polling round, the rate is adapted in
`accordance with congestion feedback based on earlier poll
`responses and changes in node status such as recently
`activated or deactivated nodes.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is prior art diagram of a network of hosts and
`routers monitored by a network manager through polling.
`FIG. 2 is a time line describing the polling rate adaptation
`of the present invention.
`DETAILED DESCRIPTION OF TIIE
`INVENTION
`
`In accordance with the present invention a network man-
`ager begins an initial polling round at a rate R0 and a
`corresponding maximum number of hosts to be polled in a
`polling round, NW”. These initial parameters are typically
`based on engineering criteria. Thus for the initial round j:—,1
`is set equal
`to R0 and N- is set equal to NM“. Using
`SNMP, the network manager sends a poll to each ofN active
`nodes, where polls to consecutive nodes are spaced by a time
`equal to HR, and as in CV, waits for an acknowledgment
`from each polled node for a period of time "’1 In one
`advantageous embodiment of the present invention T'1 is set
`equal to 10 seconds. If an acknowledgment is not received
`within time T', another poll is sent and the NM waits for an
`acknowledgment for a period of time 2x1". This continues
`two more times for corresponding waiting periods of 41le
`and BXTI, respectively. If after four polls a node does not
`respond it is taken oh" the list of active nodes that are used
`in the next immediate round j+1. A node is returned to the
`active list
`through well-known mechanisms like traps, as
`described more fully in J. Case, et al., "A Simple Network
`Management Protocol (SNMP)" RFC 1157, Network Work-
`ing Group. IETF (1990), hereby incorporated by reference
`as if fully set forth herein.
`Referring to FIG. 2, after the last node in round j is polled,
`round j+l begins. Prior to polling further network resources,
`the network manager notes the number of active nodes after
`roundj, as N“, the polling rate of the prior round, R, and
`the congestioi'i feedback whichIs derived from the received
`acknowledgments during roundj These parameters are then
`used to compute the polling rate R,“ for round j+t, which
`entails estimating the network congestion from polls issued
`in round j.
`In particular, a congestion metric D: is computed as the
`average delay over the resolved and outstanding polls. Polls
`are resolved when the node acknowledges the poll. They are
`outstanding when they have not been acknowledged but the
`NM has not concluded that the polled node is inactive. Thus
`the sum of the delays of acknowledged polls is added to the
`sum of the current timeout values of all outstanding delays
`and the result is divided by the number of outstanding and
`resolved polls. This metric represents the acknowledgment
`delay of the active polls.
`With D-, congestion can be estimated by comparison with
`the levelsof timeout described above Thus for example, if
`D- is less than or equal to T1,eeg 10 seconds, the network
`isnormally loaded and not experiencing congestion. For D
`
`003
`
`
`
`US 6,173,323 B1
`
`3
`j-greater than T1 but less than or equal to T2, e.g ZOseconds
`t e network is experiencing a first
`level 01 congestion.
`Similarly for the third and fourth levels of timeout,
`the
`network would be experiencing second and third levels of
`congestion. respectively. Each range of consecutive time~out
`values from a lower bound to a next upper bound, represents
`a level of congestion. Thus for four time-out values, there
`exists congestion levels 1
`through 3, represented by the
`variable k
`
`This congestion feedback D:- is used to adapt the polling
`rate at round j+1. [n the presenjce of any level of congestion
`1 through 3, ie TED—j5’1""+1 the equation
`
`. Ni-I
`m.“
`RM =nu1{N'
`
`Rf
`_
`XRO. ?]
`
`5
`
`ll)
`
`15
`
`4
`
`determining the number of active nodes NIH remaining at
`the conclusion of each round j; and
`removing each node that has failed to acknowledge a poll
`after said four successive polls, from said list of active
`polls.
`4. A method according to claim 3 wherein said adapting
`step assigns a value to R1 equal to the lesser of
`
`Niel
`NM“
`
`X fin.
`
`and
`
`is used. The multiplicative factor 01. 24' corresponds to the
`exponential growth of delay in the presence of congestion
`with a slight increase in throughput. Thus the polling rate
`must be reduced exponentially fast. As congestion increases
`the minimum of the above two terms is biased toward the
`right
`term which causes the polling rate to decrease.
`Furthermore, as nodes become inactive the desired period
`for polling a particular node can be maintained even at a
`slower polling rate since there are fewer nodes to poll within
`the same period of time. In this case the left term is biased *
`as the new polling rate.
`Under a1 normal load, however, where D- is less than or
`equal to ”l, the delay will grow approximately linearly with
`increased l.oad Thus to assure that
`the network will not
`become congested, the right most term of the right side 01'
`the equation is modified by setting k=U and adding a AR
`which is typically a small fraction of R”, for example RUIIO.
`Thus RJ-+ARU.
`[n this way,
`the method of rate adaptive
`polling in accordance with the present invention maximizes
`the benefits of polling by maintaining a desired polling rate
`whenever possible and minimizes the undesirable elTects of
`bandwidth congestion.
`The foregoing merely illustrates the principles of the
`present invention. Those Skilled in the art will be able to
`devise various modifications, which although not explicitly
`described or shown herein, embody the principles of the
`invention and are thus within its spirit.
`What is claimed is:
`l. A method for monitoring network nodes with a mini-
`mum polling rate per roundj. Rn said method comprising the
`steps of:
`polling sajd nodes during each round at said rate R; where
`R equals R0 for said initial round,
`estimating congestion in said network experienced during
`said round j;
`adapting the polling rate [or round j+1 based on said
`estimated congestion during round j, said adapted poll-
`ing rate for round j+l represented as R)“; and
`setting j=fi+1) and repeating said polling, estimating and
`adapting steps.
`2. Amethod according to claim I wherein R}. is set initially
`to RD, and said estimating step further comprising the step of
`computing a congestion metric T3—j as the average delay over
`all resolved and outstanding polls
`3. A method according to claim 2 wherein said network
`comprises an initial maximum number of nodes to be polled
`in a round, NM”, said method further comprising a list of
`active nodes, said polling steps further comprising the steps
`of:
`
`3f]
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`where TWEETI‘”. wherein k is an integer and has a value
`at least equal to l, and assigns a value to R;—H equal to the
`lesser of
`
`
`ngi I
`NM“
`
`x Ru
`
`and lift-MID where fiéTJ.
`5. A computer implemented network manager for moni-
`toring network nodes with a minimum polling rate per round
`j, R1" comprising:
`means for polling said nodes during each round at said
`rate R;— where R}. equals R0 [or initial round;
`means for estimating congestion in said network experi-
`enced during said round j;
`means for adapting the polling rate for round j+1 based on
`said estimated congestion during round j, said adapted
`polling rate represented as R!+1, and
`means for polling said nodes. during round j+l at said
`adapted polling rate R;—+1
`6. The computer implemented network manager of claim
`5 wherein R,- is set initially to R0, and wherein said means
`for estimating further comprising means for computing a
`congestion metric E as the average delay over all resolved
`and outstanding po s.
`7. The computer implemented network manager of claim
`6 wherein said network comprises an initial maximum
`number of nodes to be polled in a round, Nmm, said network
`manager further comprising a list of active nodes, said first
`recited and second recited means for polling further com-
`prising:
`transmitting up to [our successive polls to each node
`where said node fails to acknowledge a prior poll
`within a predetermined period of time T1 and wherein
`'1“ increases by a factor of-7 [or each successive pol];
`determining the number of active nodes NIH remaining at
`the conclusion o1 each round; and
`removing each node that has failed to acknowledge a poll
`after said four successive polls, from said list of active
`polls.
`8. The computer implemented netWork manager of claim
`7 wherein said means for adapting includes means for
`assigning a value to RH equal to the lesser 01
`
`thl
`— ><Rn
`NM
`
`transmitting up to four successive polls to each node
`where said node fails to acknowledge a prior poll
`within a predetermined period of time TJ and wherein
`TI increases by a factor of 2 for each successive poll;
`
`65
`
`and
`
`004
`
`004
`
`
`
`US 6,173,323 B]
`
`where 'I“<fij§'l“”l, wherein R is an integer and has a value 5
`al lcasl equal to 1, and means for assigning a value [0 RI“
`equal [0 [he lesser of
`
`and RfifiRn where [7521“.
`
`a:
`
`no:
`
`005
`
`005
`
`