draft-ietf-ospf-scalability-00.txt  -->   draft-ietf-ospf-scalability-02.txt

   Internet Engineering Task Force                 Gagan L. Choudhury
   Internet Draft                                  Vera D. Sapozhnikova
   Expires in May, 2003                            AT&T Labs
   draft-ietf-ospf-scalability-02.txt
                                                   Anurag S. Maunder
                                                   Sanera Systems

                                                   Vishwas Manral
                                                   Netplane Systems

                                                   November, 2002


      Explicit Marking and Prioritized Treatment of Specific OSPF Packets
      for Faster IGP Convergence and Improved Network Scalability and
                                 Stability

<draft-ietf-ospf-scalability-00.txt>


Status of this Memo

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.
   Distribution of this memo is unlimited. 

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.


Abstract

   In this draft we propose the following mechanisms to improve the
   scalability and stability of OSPF-based networks:

   (1) Process Hello packets at a higher priority compared to other
       OSPF packets.  In order to facilitate this, explicitly mark the
       Hello packets, to differentiate them from other OSPF packets.
       One way of special marking is to use a different Diffserv
       codepoint for Hello packets compared to other OSPF packets.

   Choudhury et. al.                                         [Page 1]

   Internet Draft          Explicit Marking                 May, 2003

   (2) In the absence of special marking, or in addition to it, use
       other mechanisms in order not to miss Hello packets.  One example
       is to treat any packet received over a link as a surrogate for
       a Hello packet (an implicit Hello) for the purpose of keeping
       the link alive.

   (3) The same type of explicit marking and prioritized treatment may
       be beneficial to other OSPF packets as well.  One important
       example is the LSA acknowledgment packet, which can reduce
       retransmissions during periods of congestion.  Other examples
       include (a) Database description (DBD) packet from a slave that
       is used as an acknowledgement, and (b) LSAs carrying intra-area
       topology change information.

   It is possible that some implementations are already using one or
   more of the above mechanisms in order not to miss the processing of
   critical packets during periods of congestion.  However, we suggest
   the above mechanisms to be included as part of the standard so that
   all implementations can benefit from them.


Table of Contents

   1. Introduction...................................................2
   2. The Network Under Simulation...................................5
   3. Simulation Results ............................................7
   4. Observations on Simulation Results ...........................11
   5. Need for Prioritized Treatment of Critical OSPF Packets and
      Special Marking to Facilitate That............................12
   6. Summary.......................................................13
   7. Acknowledgments...............................................14
   8. References....................................................14
   9. Authors' Addresses............................................15


1. Introduction

   Due to world-wide increased traffic demand, data networks are ever
   increasing in size in terms of number of nodes, number of links,
   adjacencies per node and Link State Database size.  Our motivation
   is to improve the ability of large networks to withstand
   the simultaneous or near-simultaneous update of a large number of
   link-state-advertisement messages, or LSAs.  We call this event an
   LSA storm.  An LSA storm may be initiated for many reasons.  Here
   are some examples:

   (a) one or more link failures due to fiber cuts,

   (b) one or more node failures for some reason, e.g., software
       crash or some type of disaster in an office complex hosting
       many nodes,

   (c) requirement of taking down and later bringing back many 
       nodes during a software/hardware upgrade, 

   (d) near-synchronization of the once-in-30-minutes refresh instants
       of some types of LSAs, 

   (e) refresh of all LSAs in the system during a change in software 
       version.

   In addition to LSAs generated as a direct result of link/node
   failures, there may be other indirect LSAs as well.  One example
   in MPLS networks is traffic engineering LSAs generated at other
   links as a result of significant change in reserved bandwidth
   resulting from rerouting of Label Switched Paths (LSPs) that went
   down during the link/node failure.
 
   The LSA storm causes high CPU and memory utilization at the node
   processors causing incoming packets to be delayed or dropped.
   Delayed acknowledgements (beyond the retransmission timer value)
   result in retransmissions, and delayed Hello packets (beyond the
   Router-Dead interval) result in links being declared down.  A
   trunk-down event causes Router LSA generation by its end-point
   nodes.  If traffic engineering LSAs are used for each link then
   that type of LSAs would also be generated by the end-point nodes
   and potentially elsewhere as well due to significant changes in
   reserved bandwidths at other links caused by the failure and reroute
   of LSPs originally using the failed trunk.  Eventually, when the
   link recovers that would also trigger additional Router and traffic
   engineering LSAs.

   The retransmissions and additional LSA generations result in further
   CPU and memory usage, essentially causing a positive feedback loop.
   We define the LSA storm size as the number of LSAs in the original
   storm, not counting any additional LSAs resulting from the
   feedback loop described above.  If the LSA storm is too large then
   the positive feedback loop mentioned above may be large enough to
   indefinitely sustain a large CPU and memory utilization at many
   network nodes, thereby driving the network to an unstable state.

   In the past, network outage events have been reported in IP and
   ATM networks using link-state protocols such as OSPF, IS-IS, PNNI
   or some proprietary variants.  See, for example [Ref1-Ref4].  In
   many of these examples, large scale flooding of LSAs or other
   similar control messages (either naturally or triggered by some bug
   or inappropriate procedure) has been partly or fully responsible
   for network instability and outage.

   It has been suggested [Ref5] to reduce the Hello interval and
   Router-Dead interval significantly in order for OSPF to detect
   link failures and recoveries faster. Reduction of Router-Dead
   interval would make it even more likely for links to be declared
   down due to missed Hellos.

   We use a simulation model to show that there is a certain LSA storm
   size threshold above which the network may show unstable behavior
   caused by a large number of retransmissions, link failures due to
   missed Hello packets and subsequent link recoveries.  We also show
   that the LSA storm size causing instability may be substantially
   increased by providing prioritized treatment to Hello and LSA
   Acknowledgment packets.  Furthermore, if we prioritize Hello
   packets then even when the network operates somewhat above the
   stability threshold, links are not declared down due to missed
   Hellos.  This implies that even though there is
   control plane congestion due to many retransmissions, the data plane
   stays up and no new LSAs are generated (besides the ones in the
   original storm and the refreshes).  Based on these observations
   we propose prioritized treatment of Hello, LSA acknowledgment
   and other critical OSPF packets and a special marking to facilitate
   that.

   One might argue that the scalability issue of large networks should
   be solved solely by dividing the network hierarchically into
   multiple areas so that flooding of LSAs remains localized within
   areas.  However, this approach increases the network management
   and design complexity and may result in less optimal routing between
   areas. Also, ASE LSAs are flooded throughout the AS and it may be
   a problem if there are large numbers of them.  Furthermore,
   a large number of summary LSAs may need to be flooded across
   Areas and their numbers would increase significantly if
   multiple Area Border Routers are employed for the purpose of
   reliability. Thus it is important to allow the network to grow
   towards as large a size as possible under a single area.
   
   Our proposal here is usually in hundreds synergistic with a broader set of ms scalability 
   and stability improvement proposals. [Ref6, Ref7] proposes flooding
   overhead reduction in the numerical examples 
we assume t2 = 200 ms.
? Hi = Hello interval.
? Si = minimum interval between successive SPF calculations.
? ro = Rate at which non-IGP work comes case more than one interface goes to the node (e.g., 
forwarding of data packets).  For the numerical examples we 
assume ro = 0.2.
? T = Total work brought same
   neighbor.  [Ref8] proposes a mechanism for 
   greatly reducing LSA refreshes in stable topologies. [Ref9] compares
   several restricted flooding algorithms in terms of their ability to the node during the LSA storm.  
For each LSA update generated elsewhere, the node will receive 
one new
   withstand large LSA packet over one interface, send an acknowledgement 
packet over that interface, storms and send copies robustness to failure conditions.
   [Ref10] proposes a wide range of congestion control and failure 
   recovery mechanisms.   


          


   Section 2 describes the network under simulation and Section 3
   provides the simulation results.  Section 4 gives the basic
   observations based on the simulation results.  Section 5 explains
   the need for prioritized treatment of certain critical OSPF packets
   and special marking to facilitate that.  Section 6 gives the summary.
          

2. The Network Under Simulation

   We generate a random network over a rectangular grid using a
   modified version of Waxman's algorithm [Ref11] that ensures that
   the network is connected and has a pre-specified number of nodes,
   links, maximum number of neighbors per node, and maximum number
   of adjacencies per node. The rectangular grid resembles the
   continental U.S.A. with maximum one-way propagation delay of 30 ms
   in the East-West direction and maximum one-way propagation delay of
   15 ms in the North-South direction.  We consider two different
   network sizes as explained in Section 3.

   The network has a flat, single-area topology.

   Each node is a Router and each link is a point-to-point link
   connecting two routers.

   We assume that node CPU and memory (not the link bandwidth) are the
   main bottleneck in the LSA flooding process.  This will typically
   be true for high speed links (e.g., OC3 or above) and/or links
   where OSPF traffic gets an adequate Quality of Service (QoS)
   compared to other traffic.
 
   Different Timers:
     LSA refresh interval = 1800 seconds,
     Hello refresh interval = 10 seconds,
     Router-Dead interval = 40 seconds,
     LSA retransmission interval: two values are considered, 10 seconds
       and 5 seconds (note that retransmission is disabled on the
       receipt of either an explicit acknowledgment or a duplicate LSA
       over the same interface that acts as an implicit acknowledgment),
     Minimum time between successive generation of the same LSA = 5
       seconds,
     Minimum time between successive Dijkstra SPF calculations
       is 1 second.

   Packing of LSAs: It is assumed that for any given node, the LSAs
   generated over a 1-second period are packed together to form an LSU
   but no more than 3 LSAs are packed in one LSU.
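
   The packing rule can be sketched as follows.  This is a minimal
   illustration and not part of the simulation described here: the
   function name pack_lsas, the (generation time, LSA id) input
   format, and the choice to align the 1-second windows to whole
   seconds are all assumptions made for the example.

```python
from collections import defaultdict

MAX_LSAS_PER_LSU = 3      # from the text: at most 3 LSAs per LSU
PACKING_WINDOW_S = 1.0    # from the text: LSAs generated over 1 second


def pack_lsas(lsas):
    """Group (gen_time_seconds, lsa_id) pairs from one node into LSUs.

    Assumption: packing windows are aligned to multiples of
    PACKING_WINDOW_S; the text does not say how windows are anchored.
    """
    windows = defaultdict(list)
    for gen_time, lsa_id in sorted(lsas):
        windows[int(gen_time // PACKING_WINDOW_S)].append(lsa_id)

    lsus = []
    for window in sorted(windows):
        ids = windows[window]
        # Split each window into chunks of at most MAX_LSAS_PER_LSU.
        for i in range(0, len(ids), MAX_LSAS_PER_LSU):
            lsus.append(ids[i:i + MAX_LSAS_PER_LSU])
    return lsus
```

   With four LSAs generated in one window and one in the next, the
   sketch yields three LSUs: one full LSU of three LSAs, one with the
   leftover LSA, and one for the following window.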

   LSU/Ack/Hello Processing Times: All processing times are expressed
   in terms of the parameter T.  Two values of T are considered, 1 ms
   and 0.5 ms.

   In the case of a dedicated processor for processing OSPF packets the
   processing time reported represents the true processing time. If the
   processor does other work and only a fraction of its capacity can be
   dedicated to OSPF processing then we have to inflate the processing
   time appropriately to get the effective processing time and in that
   case it is assumed that the inflation factor is already taken into
   account as part of the reported processing time.

   The fixed time to send or receive any LSU, Ack or Hello packet is T.
   In addition, a variable processing time is used for LSU and Ack
   depending on the number and types of LSAs packed.  No variable
   processing time is used for Hello.
   Variable processing time per Router LSA is (0.5 + 0.17L)T where L is
   the number of adjacencies advertised by the Router LSA.  For other
   LSA types (e.g., ASE LSA or a "Link" LSA carrying traffic
   engineering information about a link), the variable processing time
   per LSA is 0.5T.

   Variable processing time for an Ack is 25% that of the corresponding
   LSA.

   It is to be noted that if multiple LSAs are packed in a single LSU
   packet then the fixed processing time is needed only once but the
   variable processing time is needed for every component of the
   packet.
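
   The processing-time model above can be written out as a short
   sketch.  The function names and argument layout are assumptions
   made for illustration; the coefficients (fixed cost T, Router LSA
   cost (0.5 + 0.17L)T, other LSA cost 0.5T, Ack at 25%) come from
   the text.  The sketch also assumes that an Ack covering several
   LSAs pays the fixed cost once, by analogy with the LSU rule.

```python
def router_lsa_variable_time(adjacencies, T):
    """Variable cost of one Router LSA advertising `adjacencies` links."""
    return (0.5 + 0.17 * adjacencies) * T


def lsu_variable_time(router_lsa_adjacencies, num_other_lsas, T):
    """Sum of the variable costs of every LSA packed in one LSU.

    router_lsa_adjacencies: list with one adjacency count per Router LSA.
    num_other_lsas: count of ASE or "Link" LSAs, each costing 0.5T.
    """
    total = sum(router_lsa_variable_time(L, T)
                for L in router_lsa_adjacencies)
    return total + 0.5 * T * num_other_lsas


def lsu_processing_time(router_lsa_adjacencies, num_other_lsas, T):
    """Fixed cost T paid once per packet, plus per-LSA variable cost."""
    return T + lsu_variable_time(router_lsa_adjacencies, num_other_lsas, T)


def ack_processing_time(router_lsa_adjacencies, num_other_lsas, T):
    """Fixed cost T plus 25% of the variable cost of the acked LSAs."""
    return T + 0.25 * lsu_variable_time(
        router_lsa_adjacencies, num_other_lsas, T)


def hello_processing_time(T):
    """Hello packets carry no variable cost."""
    return T
```

   As a worked case, with T = 1 ms an LSU carrying a single Router LSA
   that advertises 50 adjacencies costs 1 + (0.5 + 0.17 x 50) = 10 ms,
   whereas a Hello costs 1 ms.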
 
   The processing time values we use are roughly in the same range of 
   what has been observed in an operational network.

   LSU/Ack/Hello Priority: Two non-preemptive priority levels and
   three priority scenarios are considered. Within each priority level 
   processing is FIFO with new packets of lower priority being
   dropped when the lower priority queue is full.  The higher priority
   packets are never dropped.    
      In Priority scenario 1, all LSUs/Acks/Hellos received at a node 
      are queued at the lower priority.
      In Priority scenario 2, Hellos received at a node are queued at 
      the higher priority but LSUs/Acks are queued at lower priority.
      In Priority scenario 3, Hellos and Acks received at a node are 
      queued at the higher priority but LSUs are queued at lower 
      priority. 
   All packets generated internally to a node (usually triggered by 
   a timer) are processed at the higher priority.  This includes the 
   initial LSA storm, LSA refresh, Hello refresh, LSA retransmission 
   and new LSA generation after detection of a failure or recovery.

   Buffer Size for Incoming LSUs/Acks/Hellos (lower priority): Buffer
   size is assumed to be 2000 packets where a packet is either an Ack,
   LSU, or Hello. 
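
   The queueing discipline described above (two non-preemptive
   priority levels, FIFO within a level, a finite lower-priority
   buffer, and no drops at the higher priority) can be sketched as
   follows.  The class name NodeQueues and its method names are
   hypothetical; only the scenario definitions and the 2000-packet
   buffer size are taken from the text.

```python
from collections import deque

HELLO, ACK, LSU = "hello", "ack", "lsu"

# Packet types that a received packet must have to be queued at the
# higher priority, per Priority scenario (from the text).
HIGH_PRIORITY_TYPES = {
    1: set(),          # scenario 1: everything at lower priority
    2: {HELLO},        # scenario 2: Hellos at higher priority
    3: {HELLO, ACK},   # scenario 3: Hellos and Acks at higher priority
}


class NodeQueues:
    def __init__(self, scenario, low_capacity=2000):
        self.high_types = HIGH_PRIORITY_TYPES[scenario]
        self.low_capacity = low_capacity
        self.high = deque()   # never dropped
        self.low = deque()    # tail-drop when full
        self.dropped = 0

    def enqueue_received(self, pkt_type, payload=None):
        """Queue a packet received from a neighbor; False if dropped."""
        if pkt_type in self.high_types:
            self.high.append((pkt_type, payload))
            return True
        if len(self.low) >= self.low_capacity:
            self.dropped += 1
            return False
        self.low.append((pkt_type, payload))
        return True

    def enqueue_internal(self, pkt_type, payload=None):
        """Timer-driven, internally generated work is high priority."""
        self.high.append((pkt_type, payload))

    def next_packet(self):
        """Pick the next packet once the processor is free.

        Non-preemptive: this is only called between packets, so a
        packet already being processed is never interrupted.
        """
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```

   In scenario 1 a received Hello shares the lower-priority queue with
   LSUs and Acks, so it can wait behind them or be dropped when the
   buffer is full; in scenarios 2 and 3 it bypasses that queue.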

   LSA Refresh: Each LSA is refreshed once in 1800 seconds and the 
   refresh instants of various LSAs in the LSDB are assumed to be 
   uniformly distributed over the 1800 seconds period, i.e., they are 
   completely unsynchronized.  If however, an LSA is generated as part 
   of the initial LSA storm then it goes on a new refresh schedule of 
   once in 1800 seconds starting from its generation time.   

   LSA Storm Generation: As defined earlier, "LSA storm" is the 
   simultaneous or near simultaneous generation of a large number of 
   LSAs. In the case of only Router and ASE LSAs we normally assume  
   that the number of ASE LSAs in the storm is about 4 times that of  
   the Router LSAs, but the ratio is allowed to change if either the  
   Router or the ASE LSAs have reached their maximum possible value.   
   In the case of only Router and Link LSAs (carrying traffic  
   engineering information) we normally assume that the number of Link  
   LSAs in the storm is about 4 times that of the Router LSAs, but the  
   ratio is allowed to change if either the Router or the Link LSAs  
   have reached their maximum possible value.  For any given LSA storm  
   we keep generating LSAs starting from Node index 1 and moving
   upwards, stopping once the correct number of LSAs of each type has
   been generated.  The LSAs generated at any given node are assumed to
   start at an instant uniformly distributed between 20 and 30 seconds 
   from the start of the simulation.  Successive LSA generations at a  
   node are assumed to be spaced apart by 400 ms. It is to be noted  
   that during the period of observation there are other LSAs  
   generated besides the ones in the storm.  These include refresh of  
   LSAs that are not part of the storm and LSAs generated due to  
   possible link failures and subsequent possible link recoveries.
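
   The timing of the storm can be illustrated with a small sketch.
   The function name storm_schedule and the lsas_per_node input (how
   many storm LSAs each node originates) are assumptions; the text
   fixes only the 20 to 30 second start window and the 400 ms spacing.

```python
import random


def storm_schedule(lsas_per_node, rng=None,
                   start_window=(20.0, 30.0), spacing=0.4):
    """Return a time-sorted list of (generation_time, node_index).

    lsas_per_node: dict mapping node index to the number of storm
    LSAs that node originates (a hypothetical input format).
    """
    rng = rng or random.Random()
    schedule = []
    for node in sorted(lsas_per_node):
        # Each node starts at an instant uniform in the start window.
        start = rng.uniform(*start_window)
        for k in range(lsas_per_node[node]):
            # Successive LSAs at a node are spaced 400 ms apart.
            schedule.append((start + k * spacing, node))
    schedule.sort()
    return schedule
```

   Passing a seeded random.Random instance as rng makes a run
   repeatable, which helps when comparing priority scenarios on the
   same storm.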

   Failure/Recovery of Links: If no Hello is received over a link (due 
   to CPU/memory congestion) for longer than Router-Dead Interval then
   the link is declared down.  At a later time, if Hellos are received
   then the link would be declared up.  Whenever a link is declared
   up or down, one Router LSA is generated by each Router on the
   two sides of the point-to-point link.  If "Link LSAs" carrying
   traffic engineering information are used then it is assumed that each
   Router would also generate a Link LSA.  In this case it is also 
   assumed that due to rerouting of LSPs, three other links in the 
   network (selected randomly in the simulation) would have significant 
   change in reserved bandwidth which would result in one Link LSA 
   being generated by the routers on the two ends of each such link.
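
   The link failure and recovery rule, and the LSAs each transition
   triggers, can be sketched as below.  The class LinkMonitor and the
   function lsas_for_link_event are hypothetical names.  The sketch
   assumes that a Hello counts only once the node has processed it,
   and it reads the last sentence above as one Link LSA from each of
   the two end routers of every affected link.

```python
class LinkMonitor:
    """Tracks one link's state from the Hellos processed on it."""

    def __init__(self, router_dead_interval=40.0, now=0.0):
        self.router_dead_interval = router_dead_interval
        self.last_hello = now
        self.up = True

    def on_hello_processed(self, now):
        """Record a processed Hello; return "up" on a recovery."""
        self.last_hello = now
        if not self.up:
            self.up = True
            return "up"
        return None

    def check(self, now):
        """Return "down" if the Router-Dead interval has expired."""
        if self.up and now - self.last_hello > self.router_dead_interval:
            self.up = False
            return "down"
        return None


def lsas_for_link_event(use_link_lsas, rerouted_links=3):
    """Return (router_lsas, link_lsas) triggered by one up/down event."""
    router_lsas = 2                      # one per end router
    link_lsas = 0
    if use_link_lsas:
        # 2 for the affected link, plus 2 for each of the other links
        # whose reserved bandwidth changes (assumed reading, see above).
        link_lsas = 2 + 2 * rerouted_links
    return router_lsas, link_lsas
```

   Under this reading, a single missed-Hello failure with Link LSAs in
   use adds 10 LSAs to the network, and the later recovery adds the
   same number again, which is the feedback effect described in
   Section 1.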


3. Simulation Results

   In this section we study the relative performance of the three
   Priority scenarios defined earlier (no priority to Hello or Ack,
   priority to Hello only, and priority to both Hello and Ack) with a 
   range of Network sizes, LSA retransmission timer values, LSA types, 
   processing time values and Hello/Router-Dead-Interval values:
   
   Network size: Two networks are considered.  Network 1 has 100 nodes, 
   1200 links, maximum number of neighbors per node is 30 and maximum 
   number of adjacencies per node is 50 (the same neighbor may have more
   than one adjacency).   Network 2 has 50 nodes, 600 links, maximum
   number of neighbors per node is 25 and maximum number of adjacencies 
   per node is 48. Dijkstra SPF calculation time for Network 1 is  
   assumed to be 100 ms and that for Network 2 is assumed to be 70 ms.

   LSA Type: Each node has 1 Router LSA (Total of 100 for Network 1 and 
   50 for Network 2). There are no Network LSAs since all links are 
   point-to-point links and no Summary LSAs since the network has only 
   one area. Regarding other LSA types we consider two situations.  In  
   Situation 1 we assume that there are no ASE LSAs and each link has  
   one "Link" LSA carrying traffic engineering information (Total of  
   2400 for Network 1 and 1200 for Network 2). In Situation 2 we assume
   that there are no "Link" LSAs and half of the nodes are AS-Border
   nodes and each border node has 10 ASE LSAs (Total of 500 for  
   Network 1 and 250 for Network 2).  We identify Situation 1 as "Link  
   LSAs" and Situation 2 as "ASE LSAs".
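   The LSA totals quoted above can be checked with simple arithmetic.
   The sketch below is illustrative only.  The helper name lsa_counts
   is hypothetical, and the code assumes that "Link" LSAs are counted
   once per link end (two per link), which is the reading under which
   1200 links give 2400 LSAs.

```python
def lsa_counts(nodes, links, situation):
    """Count LSAs per type for a single-area, point-to-point network."""
    # One Router LSA per node; no Network or Summary LSAs in this setup.
    counts = {"router": nodes, "network": 0, "summary": 0, "link": 0, "ase": 0}
    if situation == "Link LSAs":
        # Assumption: one "Link" LSA per link end, i.e. two per link.
        counts["link"] = 2 * links
    elif situation == "ASE LSAs":
        # Half of the nodes are AS-Border nodes with 10 ASE LSAs each.
        counts["ase"] = 10 * (nodes // 2)
    else:
        raise ValueError("unknown situation: " + situation)
    return counts


if __name__ == "__main__":
    for name, nodes, links in (("Network 1", 100, 1200), ("Network 2", 50, 600)):
        for situation in ("Link LSAs", "ASE LSAs"):
            print(name, situation, lsa_counts(nodes, links, situation))
```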

   LSA retransmission timer value: Two values are considered, 10 
   seconds and 5 seconds (default value).

   Processing time values: Processing times for LSUs, Acks and Hello 
   packets have been previously expressed in terms of a common  
   parameter T.  Two values are considered for T: 1 ms and 0.5 ms.
   
   Hello/Router-Dead-Interval: It is assumed that Router-Dead interval
   is four times the Hello interval.  In one case it is assumed that
   Hello interval is 10 seconds and Router-Dead-Interval is 40
   seconds (default values), and in the other case it is assumed that 
   Hello interval is 2 seconds and Router-Dead-Interval is 8 seconds. 

   Based on network size, LSA type, retransmission timer value,
   processing time value and Hello/Router-Dead-Interval values we 
   develop 6 Test cases as follows:

   Case 1: Network 1, Link LSAs, retransmission timer = 10 sec., 
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 2: Network 1, ASE LSAs, retransmission timer = 10 sec., 
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 3: Network 1, Link LSAs, retransmission timer = 5 sec., 
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 4: Network 1, Link LSAs, retransmission timer = 10 sec., 
           T = 0.5 ms, Hello/Router-Dead-Interval = 10/40 sec.

   Case 5: Network 1, Link LSAs, retransmission timer = 10 sec., 
           T = 1 ms, Hello/Router-Dead-Interval = 2/8 sec.

   Case 6: Network 2, Link LSAs, retransmission timer = 10 sec., 
           T = 1 ms, Hello/Router-Dead-Interval = 10/40 sec.
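   The six cases can also be written down as data, which makes it easy
   to see which parameter each case varies relative to Case 1.  This is
   a sketch for illustration only; the names SimCase, CASES and
   differences are hypothetical and are not part of the simulation
   described in this draft.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SimCase:
    network: int        # 1 = 100 nodes / 1200 links, 2 = 50 nodes / 600 links
    lsa_type: str       # "Link" or "ASE"
    retransmit_s: int   # LSA retransmission timer, seconds
    t_ms: float         # common processing-time parameter T, milliseconds
    hello_s: int        # Hello interval, seconds
    dead_s: int         # Router-Dead-Interval, seconds


CASES = {
    1: SimCase(1, "Link", 10, 1.0, 10, 40),
    2: SimCase(1, "ASE", 10, 1.0, 10, 40),
    3: SimCase(1, "Link", 5, 1.0, 10, 40),
    4: SimCase(1, "Link", 10, 0.5, 10, 40),
    5: SimCase(1, "Link", 10, 1.0, 2, 8),
    6: SimCase(2, "Link", 10, 1.0, 10, 40),
}


def differences(base, other):
    """Return {field: (base value, other value)} for fields that differ."""
    a, b = asdict(base), asdict(other)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


if __name__ == "__main__":
    for number in range(2, 7):
        print("Case", number, "vs Case 1:", differences(CASES[1], CASES[number]))
```

   Each of Cases 2 to 6 changes one aspect of Case 1 (Case 5 changes
   the Hello and Router-Dead intervals together).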


   For each case and for each Priority scenario we study the network 
   stability as a function of the size of the LSA storm.  The stability
   is determined by looking at the number of non-converged LSUs as a   
   function of time. An example is shown in Table 1 for Case 1 and 
   Priority scenario 1 (No priority to Hellos or Acks).
   
=========|==========================================================
         | Number of Non-Converged LSUs in the Network at Time(in sec)
    LSA  |                    
   STORM |====|=====|=====|=====|=====|=====|=====|=====|========|==
   SIZE  |10s | 20s | 30s | 35s | 40s | 50s | 60s | 80s | 100s   |
=========|====|=====|=====|=====|=====|=====|=====|=====|========|==
    100  | 0  |  0  | 24  | 29  | 24  |  1  |  0  |  1  |  1     |
 (Stable)|    |     |     |     |     |     |     |     |        |
---------|----|-----|-----|-----|-----|-----|-----|-----|--------|--
    140  | 0  |  0  | 35  | 48  | 46  | 27  | 14  |  1  |  1     |
 (Stable)|    |     |     |     |     |     |     |     |        |
---------|----|-----|-----|-----|-----|-----|-----|-----|--------|--
    160  | 0  |  0  | 38  | 57  | 55  | 40  | 26  | 65  | 203    |
(Unstable)|   |     |     |     |     |     |     |     |        |
=========|==========================================================

           Table 1: Network Stability Vs. LSA Storm 
              (Case 1, No priority to Hello/Ack)

   

   The LSA storm starts a little after 20 seconds, so for some period
   of time after that the number of non-converged LSUs should stay
   high and then come down for a stable network.  This happens for LSA
   storms of sizes 100 and 140.  With an LSA storm of size 160, the
   number of non-converged LSUs stays high indefinitely due to repeated
   retransmissions, link failures due to Hellos being missed for more
   than the Router-Dead interval (which generate additional LSAs), and
   subsequent link recoveries (which again generate additional LSAs).
   We define the network stability threshold as the maximum allowable
   LSA storm size for which the number of non-converged LSUs comes down
   to a low level after some time.  It turns out that for this example
   the stability threshold is 150.
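   The stability criterion can be applied mechanically to the data of
   Table 1.  The sketch below is illustrative only.  The draft does not
   quantify "a low level", so the cutoff of 5 non-converged LSUs at the
   end of the observation window is an assumption, as are the names
   is_stable and largest_stable_storm.

```python
# Observation times (seconds) and non-converged LSU counts from Table 1.
TIMES = [10, 20, 30, 35, 40, 50, 60, 80, 100]
TABLE_1 = {
    100: [0, 0, 24, 29, 24, 1, 0, 1, 1],
    140: [0, 0, 35, 48, 46, 27, 14, 1, 1],
    160: [0, 0, 38, 57, 55, 40, 26, 65, 203],
}


def is_stable(counts, low_level=5):
    """Treat a run as stable if the last observed count is at a low level.

    low_level is an assumed cutoff, not a value taken from the draft.
    """
    return counts[-1] <= low_level


def largest_stable_storm(results, low_level=5):
    """Largest tested storm size whose run is classified as stable."""
    stable = [size for size, counts in results.items()
              if is_stable(counts, low_level)]
    return max(stable) if stable else None


if __name__ == "__main__":
    for size, counts in sorted(TABLE_1.items()):
        print(size, "stable" if is_stable(counts) else "unstable")
    print("largest stable size tested:", largest_stable_storm(TABLE_1))
```

   With only the three storm sizes of Table 1 this brackets the
   threshold between 140 and 160, which is consistent with the value
   of 150 reported for this example.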

   The network behavior as a function of the LSA storm size can be
   categorized as follows:

   (1) If the LSA storm is well below the stability threshold then
       the CPU/memory congestion lasts only for a short period and
       during this period there are very few retransmissions, very
       few dropped OSPF packets and no link failures due to missed
       Hellos.  LSA storms of this type are observed routinely in
       operational networks and networks recover from them easily.

   (2) If the LSA storm is just below the stability threshold then
       the CPU/memory congestion lasts for a longer period and during
       this period there may be a considerable number of
       retransmissions and dropped OSPF packets.  If Hello packets are
       not given priority then there may also be some link failures
       due to missed Hellos.  However, the network does go back to a
       stable state eventually.  LSA storms of this type may happen
       rarely in operational networks, which recover from them with
       some difficulty.

   (3) If the LSA storm is above the stability threshold then
       the CPU/memory congestion may last indefinitely unless
       some special procedure for relieving congestion is followed.
       During this period there is a considerable number of
       retransmissions and dropped OSPF packets.  If Hello packets are
       not given priority then there would also be link failures due
       to missed Hellos.  LSA storms of this type may happen very
       rarely in operational networks and usually some manual procedure
       such as taking down adjacencies in heavily congested nodes is
       needed.

   (4) If Hello packets are given priority then the network stability
       threshold increases, i.e., the network can withstand a larger
       LSA storm.  Furthermore, even if the network operates at or
       somewhat above this higher stability threshold, Hellos are
       still not missed and so there are no link failures.  So even
       if there is congestion in the control plane due to increased
       retransmissions requiring some special procedures for congestion
       reduction, the data plane remains unaffected.

   (5) If both Hello and Acknowledgement packets are given priority
       then the stability threshold increases even further.


   In Table 2 we show the network stability threshold for the six
   different cases and for the three different priority scenarios
   defined earlier.

|===========|========================================================|
|           |    Maximum Allowable LSA Storm Size For                |
|   Case    |=================|==================|===================|
|  Number   | No Priority to  |Priority to Hello | Priority to Hello |
|           |  Hello or Ack   |      Only        |   and Ack         |
|===========|=================|==================|===================|
|   Case 1  |        150      |        190       |        250        |
|___________|_________________|__________________|___________________|
|   Case 2  |        185      |        215       |        285        |
|___________|_________________|__________________|___________________|
|   Case 3  |        115      |        127       |        170        |
|___________|_________________|__________________|___________________|
|   Case 4  |        320      |        375       |        580        |
|___________|_________________|__________________|___________________|
|   Case 5  |        120      |        175       |        225        |
|___________|_________________|__________________|___________________|
|   Case 6  |        185      |        224       |        285        |
|___________|_________________|__________________|___________________|

       Table 2: Maximum Allowable LSA Storm for a Stable Network
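   To make the size of the improvement in Table 2 easier to read, the
   thresholds can be expressed as a percentage gain over the
   no-priority scenario.  The sketch below only re-expresses the
   numbers of Table 2; the names TABLE_2 and gains are hypothetical.

```python
# Case number -> (no priority, priority to Hello only, priority to Hello and Ack)
TABLE_2 = {
    1: (150, 190, 250),
    2: (185, 215, 285),
    3: (115, 127, 170),
    4: (320, 375, 580),
    5: (120, 175, 225),
    6: (185, 224, 285),
}


def gains(row):
    """Percentage increase of the threshold relative to the no-priority case."""
    none, hello_only, hello_and_ack = row
    return (
        round(100.0 * (hello_only - none) / none, 1),
        round(100.0 * (hello_and_ack - none) / none, 1),
    )


if __name__ == "__main__":
    for case, row in sorted(TABLE_2.items()):
        hello_gain, both_gain = gains(row)
        print("Case %d: Hello only +%.1f%%, Hello and Ack +%.1f%%"
              % (case, hello_gain, both_gain))
```

   For Case 1, for example, the threshold rises by about 27 percent
   with Hello priority and by about 67 percent with both Hello and Ack
   priority.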


4. Observations on Simulation Results

   Table 2 shows that in all cases prioritizing Hello packets increases
   the network stability threshold, and in addition, prioritization of
   LSA Acknowledgment packets increases the stability threshold even
   further.  The reasons for the above observations are as follows.
   The main sources of sustained CPU/memory congestion (or positive
   feedback loop) following an LSA storm are (1) LSA retransmissions
   and (2) links being declared down due to missed Hellos, which in
   turn causes further LSA generation, with the future recovery of the
   link causing even more LSA generation.
   Prioritizing Hello packets practically eliminates the second source
   of congestion.  Prioritizing Acknowledgments significantly reduces
   the first source of congestion, i.e., LSA retransmissions.  It is
   noted that retransmissions cannot be completely eliminated, for the
   following reasons.  Firstly, only the explicit Acknowledgments are
   prioritized, but duplicate LSAs carrying implicit Acknowledgments
   are still served at the lower priority.  Secondly, LSAs may get
   greatly delayed or dropped at the input queue of receivers and
   therefore Acknowledgments may not even get generated, in which case
   prioritizing Acks would not help.  Another factor to keep in mind is
   that since Hellos and Acks are prioritized, the LSAs see bigger
   delay and potential for dropping.  However, the simulation results
   show that on the whole prioritizing Hello and LSA Acks is always
   beneficial and significantly improves the network stability
   threshold.

   Our simulation study also showed that in each of the cases, if
   instead of prioritizing Hello packets we treat any packet received
   over a link as a surrogate for a Hello packet (an implicit Hello),
   then we get about the same stability threshold as obtained with
   prioritizing Hello packets.
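   The implicit Hello idea amounts to refreshing the neighbor
   inactivity timer on any packet received over the link, not only on
   Hello packets.  The following is a minimal sketch under that
   reading; the class NeighborLiveness and its methods are hypothetical
   and do not describe any particular OSPF implementation.

```python
class NeighborLiveness:
    """Tracks whether a neighbor should be declared down."""

    def __init__(self, dead_interval_s, implicit_hello):
        self.dead_interval_s = dead_interval_s
        self.implicit_hello = implicit_hello
        self.last_heard_s = 0.0

    def on_packet(self, now_s, is_hello):
        # With implicit Hellos enabled, any received packet proves the link
        # is alive; otherwise only Hello packets refresh the timer.
        if is_hello or self.implicit_hello:
            self.last_heard_s = now_s

    def is_down(self, now_s):
        return now_s - self.last_heard_s > self.dead_interval_s


if __name__ == "__main__":
    # Hellos are delayed by congestion, but an LSU arrives at t = 30 s.
    for implicit in (False, True):
        neighbor = NeighborLiveness(dead_interval_s=40, implicit_hello=implicit)
        neighbor.on_packet(30.0, is_hello=False)
        print("implicit_hello =", implicit, "down at 45 s:", neighbor.is_down(45.0))
```

   In this sketch a link that is still carrying LSUs or Acks is kept
   up even when the Hello packets themselves are stuck in a congested
   input queue.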

   If we prioritize Hello packets then even when the network operates
   somewhat above the stability threshold, links are not declared
   down due to missed Hellos.  This implies that even though there is 
   control plane congestion due to many retransmissions, the data plane
   stays up and no new LSAs are generated (besides the ones in the 
   original storm and the refreshes).


5. Need for Prioritized Treatment of Critical OSPF Packets and
   Special Marking to Facilitate That

   The observations in the previous section clearly show that
   prioritizing Hello and LSA Acknowledgment packets is greatly
   beneficial in improving the scalability and stability of large
   networks.  In addition to these packets it may be beneficial
   to treat certain other OSPF packets at the higher priority as well.
   One example is the Database Description packet from a slave (during
   the database exchange process between neighbors following a link
   recovery) that is used as an acknowledgment for the previous
   Database Description packet sent from the master.  Another example
   is an LSA carrying change information which may trigger SPF
   calculation and rerouting of Label Switched Paths.  It is preferable
   to transmit this information faster than other LSAs in the network
   that are just refreshes and typically would not trigger any route
   computation or route change.

   Given that there is a need for providing prioritized treatment
   to certain OSPF packets, the next natural question is how to
   facilitate this prioritization.  

   If it is possible to examine the packet header (for the purpose of
   prioritization) much faster than processing the whole packet then
   prioritized treatment is possible without any protocol changes.

   However, we also propose that a special marking be used for
   categorizing all OSPF packets into one of two priority classes.
   It is also important to mark OSPF packets separately from other
   IP packets.  One way to do this is to reserve two diffserv
   codepoints, one for higher priority OSPF packets and another
   one for lower priority OSPF packets.  With this special
   marking it would be easy for OSPF implementers to
   treat Hello, LSA acknowledgment, and other critical OSPF
   packets at a higher priority and thereby significantly
   improve the scalability and stability of networks using
   OSPF.
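   A receiver that sorts OSPF packets into two priority classes could
   be sketched as follows.  This is an illustration only: it classifies
   by the OSPF packet type field (1 = Hello through 5 = Link State
   Acknowledgment) instead of by a codepoint, because this draft does
   not assign codepoint values, and the names priority_class and
   TwoClassQueue are hypothetical.

```python
from collections import deque

# OSPF packet types as carried in the OSPF packet header.
HELLO, DB_DESCRIPTION, LS_REQUEST, LS_UPDATE, LS_ACK = 1, 2, 3, 4, 5

# Assumption for this sketch: only Hello and LS Ack are in the high class.
HIGH_PRIORITY_TYPES = {HELLO, LS_ACK}


def priority_class(ospf_type):
    """Map an OSPF packet type to one of the two priority classes."""
    return "high" if ospf_type in HIGH_PRIORITY_TYPES else "low"


class TwoClassQueue:
    """Strict-priority input queue: high class is always served first."""

    def __init__(self):
        self._queues = {"high": deque(), "low": deque()}

    def enqueue(self, ospf_type, packet):
        self._queues[priority_class(ospf_type)].append(packet)

    def dequeue(self):
        for name in ("high", "low"):
            if self._queues[name]:
                return self._queues[name].popleft()
        return None  # both queues are empty


if __name__ == "__main__":
    queue = TwoClassQueue()
    queue.enqueue(LS_UPDATE, "lsu-1")
    queue.enqueue(HELLO, "hello")
    queue.enqueue(LS_UPDATE, "lsu-2")
    queue.enqueue(LS_ACK, "ack")
    packet = queue.dequeue()
    while packet is not None:
        print(packet)
        packet = queue.dequeue()
```

   Within each class the sketch preserves arrival order, so the
   prioritization only changes the relative order of packets belonging
   to different classes.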


6. Summary

   In this draft we point out that the node processors of a
   network may be subjected to sustained CPU/Memory congestion
   as a result of a large LSA storm caused by some type of 
   failure/recovery of nodes/links or synchronization among refreshes.
   There is a certain LSA storm size threshold above which the network
   may show unstable behavior caused by a large number of 
   retransmissions, link failures due to missed Hello packets and
   subsequent link recoveries.  Using a simulation study we show that
   the LSA storm size causing instability may be substantially
   increased by providing prioritized treatment to Hello and LSA 
   Acknowledgment packets.  Furthermore, if we prioritize Hello 
   packets then even when the network operates somewhat above the 
   stability threshold, links are not declared down due to missed 
   Hellos.  This implies that even though there is 
   control plane congestion due to many retransmissions, the data plane
   stays up and no new LSAs are generated (besides the ones in the 
   original storm and the refreshes).

   Based on the above observations we propose the following:

   (1) Process the Hello packets at a higher priority compared to other
       OSPF packets.  In order to facilitate this, explicitly mark the
       Hello packets to differentiate them from other OSPF packets.
       One way of special marking is to use a different Diffserv 
       codepoint for Hello packets compared to other OSPF packets.
       
   (2) In the absence of special marking, or in addition to it, use 
       other mechanisms in order not to miss Hello packets. One example
       is to treat any packet received over a link as a surrogate for
       a Hello packet (an implicit Hello) for the purpose of keeping 
       the link alive.  Our simulation study shows that this mechanism
       is just as effective as explicitly prioritizing Hello
       packets.

   (3) The same type of explicit marking and prioritized treatment may
       be beneficial to other OSPF packets as well.  One important 
       example is the LSA acknowledgment packet that can reduce 
       retransmissions during periods of congestion.  Our simulation
       study shows that prioritization of both Hello and LSA
       Acknowledgment packets is considerably more effective than
       just prioritizing Hello packets.  Other examples 
       include (a) Database description (DBD) packet from a slave that 
       is used as an acknowledgement, and (b) LSAs carrying intra-area 
       topology change information.

   It is possible that some implementations are already using one or
   more of the above mechanisms in order not to miss the processing of
   critical packets during periods of congestion.  However, we suggest
   that the above mechanisms be included as part of the standard so
   that all implementations can benefit from them.


7. Acknowledgments

   We would like to acknowledge Jerry Ash, Margaret Chiosi, Elie 
   Francis, Jeff Han, Beth Munson, Roshan Rao, Moshe Segal, Mike
   Wardlow, and Pat Wirth for their collaboration and encouragement in 
   our scalability improvement efforts for Link-State-Protocol based 
   networks. 


8. References



   [Ref1] Pappalardo, D., "AT&T, customers grapple with ATM net 
   outage," Network World, February 26, 2001.

   [Ref2] "AT&T announces cause of frame-relay network outage," AT&T 
   Press Release, April 22, 1998.

   [Ref3] Cholewka, K., "MCI Outage Has Domino Effect," Inter@ctive 
   Week, August 20, 1999.

   [Ref4] Jander, M., "In Qwest Outage, ATM Takes Some Heat," Light
   Reading, April 6, 2001.

   [Ref5] C. Alaettinoglu, V. Jacobson and H. Yu, "Towards Milli-
   second IGP Convergence," Work in Progress.

   [Ref6] A. Zinin and M. Shand, "Flooding Optimizations in Link-State
   Routing Protocols," Work in Progress.

   [Ref7] J. Moy, "Flooding over Parallel Point-to-Point Links," Work in
   progress.

   [Ref8] P. Pillay-Esnault, "OSPF Refresh and flooding reduction in  
   stable topologies," Work in progress.

   [Ref9] G. Choudhury, V. Manral, "LSA Flooding Optimization
   Algorithms and Their Simulation Study," Work in progress.

   [Ref10] J. Ash, G. Choudhury, V. Sapozhnikova, M. Sherif, A.  
   Maunder, V. Manral, "Congestion Avoidance & Control for OSPF 
   Networks", Work in Progress.

   [Ref11] B. M. Waxman, "Routing of Multipoint Connections," IEEE
   Journal on Selected Areas in Communications, 6(9):1617-1622, 1988.

   
9. Authors' Addresses

   Gagan L. Choudhury
   AT&T Labs,
   Room D5-3C21
   200 Laurel Avenue
   Middletown, NJ, 07748
   USA
   Phone: (732)420-3721
   email: gchoudhury@att.com


   Vera D. Sapozhnikova
   AT&T
   Room C5-2C29
   200 Laurel Avenue
   Middletown, NJ, 07748
   USA
   Phone: (732)420-2653
   email: sapozhnikova@att.com


   Anurag S. Maunder
   Sanera Systems
   370 San Aleso Ave.
   Second Floor
   Sunnyvale, CA 94085
   Phone: (408)734-6123
   email: amaunder@sanera.net

   Vishwas Manral
   NetPlane
   189, Prashasan Nagar,
   Road Number 72
   Jubilee Hills, Hyderabad
   India
   email: Vishwasm@netplane.com
