JUNOS command output
hannes@Frankfurt> show isis interface detail
IS-IS interface database:
is the lowest Hello interval possible on JUNOS
Relying on Hellos alone means that the routing protocols cannot detect a link failure in less than about 1 second. By tracking the interface state, however, routers can detect loss of liveliness much more quickly.
5.6.2 Interface Tracking
The chipsets that drive modern router interfaces report link errors, such as a loss of signal, to the routing sub-system within a few milliseconds. For high-speed detection, therefore, optical interfaces are the best choice. However, there are still similar problems, as illustrated in Figure 5.10. If there are active elements in the middle of the transmission chain, then local errors are not propagated downstream and the receiving router does not detect that the light went out.
SONET/SDH offers a true advantage over other physical media like Ethernet, which do not propagate local errors to downstream Network Elements.
Many protocols, like Frame Relay and ATM, also include their own Local Management Interface (LMI) protocol, which performs link-layer keep-alive checking, and so on. Unfortunately, there is still no LMI-like protocol for Ethernet. Bi-directional Forwarding Detection attempts to make a neutral liveliness-checking protocol available.
5.6.3 Bi-directional Forwarding Detection (BFD)
BFD is defined in draft-katz-ward-bfd-01, and its encoding rules are documented in draft-katz-ward-bfd-v4v6-1hop-00. BFD is an answer to the following problems:
• Link-Layer neutral high frequency keep-alive protocol
• Offload high frequency keep-alive processing from the IGP Layer
• Support sub-second timers on behalf of protocols that cannot
• Negotiate timers dynamically
The BFD protocol, unlike many other protocols, includes no automatic neighbour discovery. Instead it relies on client software, typically the IP routing protocols, and works from the IGP neighbours they have detected: the IGP asks the BFD module to set up a BFD session to the link IP addresses of the neighbours it provides.
BFD is (at the time of writing) only available for JUNOS. The first release with support for BFD is JUNOS 6.1. The configuration of BFD is a property of the interface {} stanza inside the protocols isis {} branch.
BFD runs on top of UDP, using ports 3784 and 3785. Port 3784 is used for control packets and 3785 is used for Echo Mode traffic. The JUNOS implementation supports only control packets for liveliness detection. Echo Mode is envisioned for the future: the plan is that forwarding plane software can generate that traffic and the control plane is only needed for parameter setup.
The following Tcpdump output shows the parameters that are conveyed in the 24-byte fixed-length packet.
Tcpdump output
The BFD protocol runs on top of UDP ports 3784 and 3785. It is meant as a high-frequency keep-alive mechanism which augments routing protocols that do not have sub-second timer support.
My Discriminator: 0x00000001, Your Discriminator: 0x00000002
Desired min Tx Interval: 100 ms
Required min Rx Interval: 200 ms
Required min Echo Interval: 0 ms
Session state transitions are conveyed through the contents of the Flags field. The Desired/Required timer fields are used for negotiating a common timer that both peers can accept. The pair of discriminators is necessary to multiplex several sessions between a pair of hosts.
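The timer negotiation can be illustrated with a short Python sketch. It assumes the asynchronous-mode rules as they were later standardized for BFD: a router transmits no faster than its peer's Required min Rx interval allows, and its detection time is the peer's multiplier times the agreed receive interval. The class and function names are hypothetical, and the peer values are only loosely modelled on the Tcpdump output above.

```python
from dataclasses import dataclass


@dataclass
class BfdTimers:
    """Timer values one BFD peer advertises (all names are illustrative)."""
    desired_min_tx_ms: int   # how fast this peer would like to send
    required_min_rx_ms: int  # how fast this peer is able to receive
    multiplier: int          # missed packets tolerated before declaring Down


def tx_interval_ms(local: BfdTimers, remote: BfdTimers) -> int:
    # Send no faster than we want to and no faster than the peer can accept.
    return max(local.desired_min_tx_ms, remote.required_min_rx_ms)


def detection_time_ms(local: BfdTimers, remote: BfdTimers) -> int:
    # The peer sends at max(its desired Tx, our required Rx); we declare the
    # session Down after the peer's multiplier worth of such intervals.
    rx_interval = max(local.required_min_rx_ms, remote.desired_min_tx_ms)
    return remote.multiplier * rx_interval


# Assumed example: we advertise 100/100 ms, the peer advertises 100/200 ms.
local = BfdTimers(desired_min_tx_ms=100, required_min_rx_ms=100, multiplier=5)
remote = BfdTimers(desired_min_tx_ms=100, required_min_rx_ms=200, multiplier=5)

print(tx_interval_ms(local, remote))     # our send interval towards the peer
print(detection_time_ms(local, remote))  # our detection time
print(detection_time_ms(remote, local))  # the peer's detection time
```

With these assumed values the slower receiver forces the local router to send every 200 ms, so the two directions of one session end up with different detection times.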
After BFD has been enabled on both sides, you can verify whether a BFD-capable neighbour has been found on the other end and whether the BFD session is Up. The show bfd session command displays the session state.
JUNOS command output
Using the show bfd session command you can display the current state and details of the active BFD sessions.
hannes@Frankfurt> show bfd session extensive
Transmit Address State Interface Detect Time Interval Multiplier
Client ISIS L1, TX interval 0.100, RX interval 0.100, multiplier 5
Client ISIS L2, TX interval 0.100, RX interval 0.100, multiplier 5
Session up time 12:34:22, previous down time 00:00:06
Local diagnostic CtlExpire, remote diagnostic None
Remote heard, hears us
Min async interval 0.100, min slow interval 1.000
Adaptive async tx interval 0.100, rx interval 0.200
Local min tx interval 0.100, min rx interval 0.100, multiplier 5
Remote min tx interval 0.100, min rx interval 0.100, multiplier 5
Local discriminator 1, remote discriminator 2
Echo mode disabled/inactive
1 sessions, 2 clients
Cumulative transmit rate 10.0 pps, cumulative receive rate 5.0 pps
BFD is likely to become the dominant keep-alive protocol due to its open implementation. It is even expected to be the protocol of choice between routers and servers. For server applications like voice-over-IP or financial applications, open-source BFD implementations for hosts are available.
5.7 Summary
IS-IS adjacency processing has changed over the years. It started out with a simple 2-way finite state machine (FSM) and, because that could not detect half-broken links, it quickly evolved to a 3-way FSM. It is remarkable that the defects of the underlying protocol have been solved with just the addition of an optional Adjacency State TLV. Reliably detecting that a neighbour is Up or Down is not enough for today’s service provider environments. On the one hand, the implementation has to be slow enough to protect the network from flapping adjacencies that are propagated through the network; on the other hand, there is a need for quick keep-alive detection mechanisms. Due to the rise of Ethernet as a popular core-facing interface technology, an LMI-like protocol such as Bi-directional Forwarding Detection (BFD) had to be designed. The application of BFD as an IGP detection protocol is just the start. It is expected that BFD will be used for other network protocols and in other environments, such as application keep-alive detection for mission-critical servers.
Generating, Flooding and Ageing LSPs
Unlike distance vector protocols, such as RIP, link-state routing protocols, such as OSPF and IS-IS, do not tell only their neighbours about the topology of the network. Link-state protocols distribute both their IP reachability and their topological view far beyond their adjacent neighbours, ultimately flooding this information to all routers in an area.
To keep the reachability information in the network current, link-state protocols need a basic set of functions that can be used to originate, distribute and finally revoke or time out topology information. In IS-IS-speak, each piece of topology information is encoded in a link-state protocol data unit (LSP). This chapter covers these functions and the surrounding network events that cause the IS-IS protocol to generate, flood and finally age LSPs.
Link-state routing protocols such as IS-IS follow a paradigm that can best be described as distributed databases with local computation, which is quite different from the way other common routing protocols like RIP and BGP work. Distributed databases are discussed in more detail in the following section.
of the places and streets that are next to the tourist’s immediate location. This makes it very difficult to find the best path to a landmark or museum, and in the worst case, the tourist has to try out several paths, being careful not to circle around the same locations. Localized databases work the same way. In contrast, a distributed database approach works differently: here all of the routers share common information about what the network looks like. If the tourist in the example gets lost, a distributed database would give them a more complete map of the best way to get to a particular destination in the city (or in this case, the network).
How does IS-IS compute the map of the network? If each router just contributes its local view to its neighbours, and if that information can be shared among all the routers in a network, then each will ultimately have a global map of the network. Link-state routing protocols, such as IS-IS, work like a jigsaw puzzle, as shown in Figure 6.1. Each router in the figure represents one piece of the puzzle, and if all of the puzzle pieces are present, then each router can start to put the puzzle together to acquire an understanding of what the big picture looks like. The collection of puzzle pieces is called the link-state database. IS-IS has a number of techniques, called flooding and synchronizing, to get a complete map of the network. In this chapter, you will learn about the individual puzzle pieces, which are called link-state protocol data units (LSPs), and how IS-IS distributes the information they contain.
[Figure 6.1: router labels Pennsauken, Frankfurt, London, Washington, NewYork, Paris]
IS-IS, by default, tries to acquire two maps from its neighbours and therefore maintains two databases to store topology-related information. Information for the first map, which typically represents the topology inside a close collection of routers, called a point-of-presence (POP), is stored in the Level 1 (L1) database. Sticking with the lost tourist example, think of this as just a local map that guides you around the next few streets. The second map, which typically represents the backbone structure of the network, is stored in the Level 2 (L2) database. This would best compare to a nationwide map where all the freeways and highways are shown. You can take a quick look into both of these link-state databases to find out exactly which puzzle pieces the database holds by issuing a show isis database command on both the IOS and JUNOS software platforms.
IOS command output
The contents of the IS-IS link-state database can be displayed using the show isis database command:
Amsterdam# show isis database
[ … ]
IS-IS Level-1 Link State Database:
[ … ]
IS-IS Level-2 Link State Database:
JUNOS command output
In the JUNOS software, you can display the IS-IS database using the show isis database command. Watch for an inconsistency between the LSPs being sent and received, as this is a problem indication:
hannes@New-York> show isis database
IS-IS level 1 link-state database:
[ … ]
4 LSPs
IS-IS level 2 link-state database:
LINX-gw.01-00 0x128 0x455a 41901 L1 L2
[ … ]
12 LSPs
The JUNOS software output contains similar information to the IOS software output. The only difference is a slightly more detailed breakdown of the so-called attribute type block, which will be discussed later in this chapter; in this respect the JUNOS output is more verbose than the IOS equivalent.
Based on the information in the two link-state databases for L1 and L2, each router in an IS-IS network computes the topology of the network independently of every other router. This principle of independent router operation is called local computation. This is the topic of the next discussion.
6.2 Local Computation
Routing protocols such as RIP or BGP compute the best path through a network in a distributed fashion. That is, no single RIP or BGP router knows what the other routers know about the route, and this is a real limitation. For instance, each time a RIP router passes on a route to its neighbour, the route gets a worse value. This “worseness” is indicated in a metric field, which represents the hop count (number of routers) that a router is away from the router attached to the source sub-net. In Figure 6.2 the sub-net 192.168.1/24 is directly connected to RIP Router #1. Router #1 reports the sub-net to its neighbours with a hop count of 1. Router #2 learns the sub-net with a metric of 1 and reports it further to the right side of the figure after incrementing the metric field by one. The routing update therefore arrives at Router #3 with a metric of 2. But Router #1 has no idea what value this route has on Router #3. RIP routing illustrates how a distributed computation scheme works.
IS-IS utilizes a totally different way of calculating routing information. Before the route calculation takes place, all IS-IS routers distribute the information about the local views of the routers to each other. Intermediate routers along the way must not change these views (represented in the LSPs). After this distribution (flooding), a common route-calculation scheme, which in IS-IS is called the shortest path first (SPF) algorithm, is applied. Note that each router computes the routes independently from every other router.
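Local computation can be sketched in a few lines of Python. The example below is a plain Dijkstra shortest-path run over a shared link-state database, which is the textbook form of SPF and not the code of any router implementation. The router names are borrowed from this chapter, while the metrics and the `spf` function are made up for illustration.

```python
import heapq


def spf(lsdb, root):
    """Run Dijkstra over a link-state database from the viewpoint of `root`.

    lsdb maps each router to {neighbour: metric}. Returns (distance, first_hop),
    where first_hop is the directly connected neighbour used to reach a router.
    """
    dist = {root: 0}
    first_hop = {}
    done = set()
    heap = [(0, root, None)]
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry, a shorter path was already accepted
        done.add(node)
        if hop is not None:
            first_hop[node] = hop
        for nbr, metric in lsdb.get(node, {}).items():
            new_cost = cost + metric
            if nbr not in done and new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                # Neighbours of the root are their own first hop.
                heapq.heappush(heap, (new_cost, nbr, nbr if node == root else hop))
    return dist, first_hop


# Every router holds the same database (hypothetical metrics) ...
lsdb = {
    "Frankfurt": {"London": 10, "Paris": 30},
    "London": {"Frankfurt": 10, "Paris": 10},
    "Paris": {"Frankfurt": 30, "London": 10},
}

# ... but each one computes routes with itself as the root.
print(spf(lsdb, "Frankfurt"))
print(spf(lsdb, "Paris"))
```

Both calls read the identical database; only the root differs, which is the point of the distributed-database, local-computation model.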
You can watch the result of the SPF calculation by issuing a show ip route isis command on IOS platforms and a show isis route command on Juniper Networks routers.
In IOS, you cannot see the entire results of the SPF calculation; all you can see are the results that make it into the main routing table. That excludes redundant routes that happen to be in both IS-IS levels. The IS-IS learned routes that are active in the routing table can be displayed using the show ip route isis command.
IOS command output
Amsterdam#show ip route isis
tion about the weight or, as it is called in Cisco IOS speak, the administrative distance of the routing protocol that inserted this route into the routing table (in this case, IS-IS). After the “via” statement the IP address of a locally connected router appears (the next-hop). Finally, the end of each line gives the physical interface through which the next-hop can be reached; this is how packets to this destination will leave the router.
In the JUNOS software, you can display both the immediate results from the SPF calculation as well as the routes installed in the routing table. The SPF results are displayed using the show isis route command. The IS-IS learned routes that are active in the main routing table can be displayed using the show route protocol isis command.
JUNOS software command output
hannes@Pennsauken> show isis route
IS-IS routing table Current version: L1: 0 L2: 485
192.168.52.177/32 2 485 127850 int so-3/0/0.0 Frankfurt
192.168.54.164/32 2 485 128510 int so-3/0/0.0 Frankfurt
172.16.176.32/30 2 485 127830 int so-3/0/0.0 New-York
172.16.176.60/30 2 485 123790 int so-3/0/0.0 New-York
[ … ]
The format of the output is one route entry per line. The first column contains the route and the second column contains the level from which this route resulted. The version field is just an internal number that tells you the SPF run upon which this route was calculated. The version field is typically not interesting in troubleshooting networks. The metric tells the distance of the prefix relative to the local router. Next is an indication whether the route is internal or external. Typically all the routes are internal unless routes have been injected from other routing protocols into the IS-IS database. Finally, the interface where the traffic leaves the router is displayed, plus the forwarding router’s name.
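As a reading aid, the column layout just described can be turned into a small Python parser. This is only a sketch: it assumes exactly the seven whitespace-separated columns of the sample output above, and the `IsisRoute` type and `parse_isis_route` function are hypothetical names, not part of any Juniper tooling. Real output may contain extra lines or columns that this does not handle.

```python
from typing import NamedTuple


class IsisRoute(NamedTuple):
    prefix: str      # the route itself
    level: int       # IS-IS level the route resulted from
    version: int     # SPF run on which the route was calculated
    metric: int      # distance of the prefix from the local router
    route_type: str  # "int" for internal, anything else as printed
    interface: str   # interface where traffic leaves the router
    via: str         # name of the forwarding router


def parse_isis_route(line: str) -> IsisRoute:
    """Split one 'show isis route' entry into its seven assumed columns."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    prefix, level, version, metric, route_type, interface, via = fields
    return IsisRoute(prefix, int(level), int(version), int(metric),
                     route_type, interface, via)


route = parse_isis_route("192.168.52.177/32 2 485 127850 int so-3/0/0.0 Frankfurt")
print(route)
```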
JUNOS software command output
hannes@Pennsauken> show route protocol isis
inet.0: 118243 destinations, 246129 routes (118243 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
172.16.44.248/30 *[IS-IS/18] 4d 12:57:11, metric 41550, tag 2
> to 172.16.5.93 via so-3/2/0.0
192.168.49.5/32 *[IS-IS/18] 2d 07:26:54, metric 67850, tag 2
> to 172.16.5.93 via so-2/3/0.0
192.168.49.67/32 *[IS-IS/18] 1d 20:01:28, metric 67860, tag 2
> to 172.16.5.93 via so-7/0/0.0
[ … ]
In the show route protocol isis output we can see a subset of the routes that were displayed by show isis route. Those are the routes that competed for installation in the routing table with other routing protocols that may have had similar information; the routes in this table are the ones that have won. In JUNOS the level of routes is displayed in the tag field: a tag 2 means that this is a Level 2 route. The number in the brackets is a similar value to the administrative distance for IOS platforms, called simply the route preference. The to and via keywords indicate the next-hop and the outgoing interface.
The universal transport vehicle to build the IS-IS database map is called a link-state protocol data unit, or LSP for short, which is OSI-speak for link-state packet. In the following sections, you will see what information an LSP contains, how the LSP gets distributed, and how LSPs get throttled when the network is busy.
6.3 LSPs and Revision Control
The information element that transports IS-IS-related information to populate all the routers’ link-state databases is called the link-state PDU or LSP. Each router in an IS-IS network generates at least one LSP that describes, as the name implies, the current state of the links to other routers. Actually, an LSP conveys more than just the state of the links or circuits on the router. Routers use LSPs as a kind of envelope to get different types of information elements such as IP routes, checksums and even router names across to other routers. LSPs need to be accurate and up to date. If, for example, a link between a pair of routers goes down, both routers must immediately tell the other routers in the network that the link is down. The other routers then update their link-state databases, schedule an SPF calculation, and remove the broken link from any transit paths in the network that might use it.
Now, assume that a remote router gets two conflicting messages at the same time. That is, messages arrive almost simultaneously claiming that a link is both up and down. Consider Figure 6.3. There are three routers connected by point-to-point links. Unfortunately, the link between Router B and Router C is constantly going down, then up, and a short time later, it goes down again. In network-speak this is a flapping link. Next, assume that Router B is a busy router with its CPU loaded close to the ceiling. Therefore, Router B is slower in processing the link-down/link-up events. In addition, the subsequent regeneration of its LSPs that report the link as down or up to other routers is slower.
Figure 6.4 shows the LSP messages from Router A’s perspective. Both Router B and Router C find out first that there is a link-up event. However, Router C processes that event far faster than the overloaded Router B. The trouble occurs as the link flaps again, this time transitioning from the Up to the Down state. Router B still has not sent the previous LSP out because it was too busy. Router B’s LSP is now outdated information because the link is now in the Down state. So both LSPs (Router B’s old Up one and Router C’s accurate Down one) arrive at the same time at Router A. Now Router A needs to decide which is the most accurate report: the Up or the Down state message. The LSP to use should be the most recent one, but in the example a most-recently-received rule would fail, because the propagation delay from Router B to Router A made that inaccurate LSP arrive after Router C’s up-to-date LSP. The conclusion here is that LSPs need to carry some sort of version information to explicitly tell the receiving router what is current and what is outdated LSP information.
6.3.1 Sequence Numbers
Link-state routing protocols carry version information in a Sequence Number field. IS-IS has a linear sequence number space starting from one and counting up. That means that the first LSP that is announced by a router has the sequence number 0x1. Each time the router issues a new view of the local environment to its neighbours, the router will package that information in an LSP, increment the sequence number by one, and send the LSP to all of its neighbours. The neighbours compare the incoming LSP with the LSP in the local database. If the received LSP is new to them (that is, the received LSP is not in the local database at all), then they unconditionally install the LSP into their local link-state database. If the LSP is already installed in the database, the receiving router needs to check if the sequence number is higher than the sequence number of the LSP that is already installed in the link-state database. If the LSP is newer, then the router will flush (or discard) the existing LSP and replace it with the more recent one. If it turns out (like in the previous example) that the most recently arriving LSP is older (has a lower LSP sequence number) than the one installed in the link-state database and therefore carries outdated information, the received LSP is silently discarded.
As IS-IS is a reliable protocol, the router will of course acknowledge receipt of that LSP to the neighbour that sent it. If not acknowledged, the router will see the LSP again after 5 seconds, once the neighbour retransmits it. You can learn more about acknowledging and retransmission of LSPs in Chapter 8.
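The receive-side decision described above can be written down as a minimal Python sketch. The link-state database is reduced to a dictionary from LSP ID to sequence number, and the function name, return strings and the LSP ID `RouterB.00-00` are all hypothetical. The text does not say what happens when the sequence numbers are equal; treating that case as a harmless duplicate is an assumption of this sketch.

```python
def receive_lsp(lsdb, lsp_id, seq):
    """Apply the sequence-number check to one received LSP.

    lsdb maps an LSP ID to the sequence number currently installed.
    Returns a string naming the action taken.
    """
    installed = lsdb.get(lsp_id)
    if installed is None:
        lsdb[lsp_id] = seq   # unknown LSP: install unconditionally
        return "installed"
    if seq > installed:
        lsdb[lsp_id] = seq   # newer version: replace the stored copy
        return "replaced"
    if seq < installed:
        return "discarded"   # outdated version: silently dropped
    return "duplicate"       # same version (assumption: nothing to do)


# Router A's view of one LSP from Router B (hypothetical ID and numbers).
lsdb = {}
print(receive_lsp(lsdb, "RouterB.00-00", 0x11))
print(receive_lsp(lsdb, "RouterB.00-00", 0x12))
print(receive_lsp(lsdb, "RouterB.00-00", 0x11))  # delayed copy arrives late
print(lsdb)
```

A late copy with the lower sequence number loses regardless of when it arrives, which is exactly what a rule based on arrival order cannot guarantee.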
The LSP sequence number field is a 32-bit identifier, giving room for about 4 billion LSP updates. LSPs are subject to strict pacing, which means, for example, that a router must not originate more than one LSP every 5 seconds. 2^32 times 5 seconds results in an interval of 21,474,836,480 seconds, or roughly 681 years. So the sequence number space is not likely to get to its end, at least not until readers are retired, which is typically the timeframe that engineers care about.
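The arithmetic is easy to check. The following Python lines simply redo the calculation from the paragraph above, assuming a 365-day year; the constant names are chosen for illustration.

```python
SEQ_SPACE = 2 ** 32                   # values available in a 32-bit field
MIN_LSP_INTERVAL_S = 5                # at most one new LSP every 5 seconds
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

seconds_to_exhaust = SEQ_SPACE * MIN_LSP_INTERVAL_S
years_to_exhaust = seconds_to_exhaust / SECONDS_PER_YEAR

print(seconds_to_exhaust)       # 21474836480
print(round(years_to_exhaust))  # 681
```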
[Figure 6.4: conflicting messages hit Router A (Link Down to Router C, little processing delay); Router A must decide which is the most recent event]
Seriously, it is just assumed that the LSP sequence space will not run out. Assumptions always cause a lot of trouble for engineers. The root cause of the Y2K scare went back to assumptions about events that should not be a problem but ultimately were. The bottom line is that the Y2K problem cost corporate customers a lot of money to sanity check their applications and to spot software problems before the millennium turnover. But IS-IS is well prepared in that respect, since there actually is something that can be done if the LSP sequence number space is ever maxed out. So what does a router do if it wants to originate a new LSP and does hit the ceiling of the LSP sequence space? Now, the assumption is that this ceiling will never be reached. But even if it finally is, there is a well-defined procedure to handle that event: the router must simply wait for a period of max-age seconds. This sounds odd at first: why does waiting solve anything? And how long does max-age last? As it turns out, it lasts a Lifetime, namely an LSP’s Lifetime.
6.3.2 LSP Lifetimes
In addition to the revision information (the LSP sequence numbers), link-state protocols include in their LSPs a field called the Lifetime, which controls the maximum validity span of LSPs. A router announcing an LSP does not mean that the LSP will be valid forever, only for the number of seconds indicated in the Lifetime field of the LSP. Adding a Lifetime field to the protocol helps to protect against stale (and potentially wrong) entries in the link-state database. Consider a scenario where a router is taken out of the network by being powered down. The LSPs of that powered-down router are still installed in the link-state database of all the routers in the network. Because the originating router did not revoke or purge them (you will see shortly how this works), the LSPs would, without a Lifetime, stay in the link-state database forever. The Lifetime field in the LSP is a 16-bit entity, which means that the Lifetime can be set as high as 2^16-1, or in decimal notation 65,535 seconds, a little over 18 hours.
The Lifetime field provides an answer to the unlikely event of IS-IS LSP sequence number space exhaustion. Before an IS-IS router can generate a new LSP with a sequence number of 1, the router must wait until the Lifetimes of all previous LSPs it has generated have expired and the LSPs have disappeared from all other routers’ link-state databases. At most, this wait (max-age) will be 18 hours. This sounds very high, but waiting 18 hours every 681 years should not be much of a problem for a network. And in practice, IS-IS implementations only use the maximum 18-hour Lifetime when extreme background flooding silence is needed. Most of the time, IS-IS uses the default Lifetime value of 1200 seconds (20 minutes). This value can be changed in most IS-IS implementations, and often it is changed. But what stops every LSP from disappearing from the network every 20 minutes? A periodic LSP refresh.
6.3.3 Periodic Refreshes
Because LSPs have a maximum Lifetime, they need to be refreshed. Refreshing means that a router has to re-originate its LSPs periodically. The re-origination interval has, of course, to be less than the LSP’s Lifetime. For example, if the LSP is valid for 1200 seconds (the default value), the router needs to refresh the LSP in less than 1200 seconds in order to avoid removal of the LSP from the link-state database by other routers. The recommended max-LSP-origination-interval is the Lifetime minus 300 seconds. So in a default environment this would be 900 seconds.
Figure 6.5 shows in a timing diagram how and when a router refreshes its LSPs. Every 900 seconds an LSP with the same information content is created. Here, Router A always reports that the router has links in the Up state to Router B and C. Please note that for each refresh, the Sequence Number is incremented by one (bumped). Each time that LSP is refreshed, the Lifetime gets prolonged for another N seconds, as described in the Lifetime field (the default value is 1200 seconds).
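The refresh timing can be modelled with a short Python sketch. It derives the refresh interval as the Lifetime minus a safety margin (300 seconds as recommended above; 317 seconds is the margin this chapter quotes for Juniper Networks routers) and lists when each refreshed copy is sent and when it would expire. The function names and the tuple layout are hypothetical.

```python
DEFAULT_LIFETIME_S = 1200
MAX_LIFETIME_S = 2 ** 16 - 1  # the Lifetime field is 16 bits wide


def refresh_interval(lifetime_s, safety_margin_s=300):
    """Refresh interval = Lifetime minus a safety margin, in seconds."""
    if not 0 < lifetime_s <= MAX_LIFETIME_S:
        raise ValueError("Lifetime must fit into the 16-bit field")
    if lifetime_s <= safety_margin_s:
        raise ValueError("Lifetime must be larger than the safety margin")
    return lifetime_s - safety_margin_s


def refresh_schedule(first_seq, lifetime_s, count, safety_margin_s=300):
    """Return (send_time_s, sequence_number, expiry_time_s) per refresh."""
    interval = refresh_interval(lifetime_s, safety_margin_s)
    return [(n * interval, first_seq + n, n * interval + lifetime_s)
            for n in range(count)]


print(refresh_interval(DEFAULT_LIFETIME_S))        # recommended margin
print(refresh_interval(DEFAULT_LIFETIME_S, 317))   # margin quoted for JUNOS
for entry in refresh_schedule(1, DEFAULT_LIFETIME_S, 3):
    print(entry)
```

Each refreshed copy is sent before the previous one expires, so the LSP never ages out of the other routers' databases while its originator is alive.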
Both Cisco IOS and JUNOS software originate all LSPs with a default Lifetime of 1200 seconds, as suggested in the ISO 10589 specification. However, you can change this to higher values to reduce the number of refreshes in the network (the refresh timer is seldom set to a lower value). Often these periodic LSP refreshes are called refresh noise, and network administrators want to reduce this noise close to zero. Both Cisco IOS and JUNOS software offer configuration knobs to change the maximum Lifetime of their LSPs and, at the same time, the re-origination interval derived from this value. IOS lets you define the Lifetime and refresh intervals independently from each other. All you have to make sure of is that the max-lsp-lifetime is a few hundred seconds higher than the lsp-refresh-interval. If you modify the max-lsp-lifetime, do not forget to set the lsp-refresh-interval accordingly. If you forget to set the refresh interval, then the LSPs will get refreshed according to the default timer, which is 900 seconds. This will not break anything, but it also does not help to reduce the refresh noise. The outcome might be an LSP originated with the maximum Lifetime of 65,535 seconds which will still be refreshed every 900 seconds.
In IOS you can set the LSP’s Lifetime and refresh interval independently from each other, as shown in the following (note the bolded sections in this code listing).
happen if the Lifetime is set to be smaller than the refresh interval. The refresh interval is calculated automatically. The refresh interval in a Juniper Networks router is the LSP Lifetime minus 317 seconds.