The Complete IS-IS Routing Protocol - P17


The IS-IS configuration looks alright: all interfaces are referenced. At the top there is a pointer to an export policy, which we will examine more closely.

JUNOS configuration

At first sight the static-to-isis policy looks good; however, once you check the indentation of the terms and accept statements, you will find that the policy does not do what the network operator wanted it to do.

hannes@Munich> show configuration policy-options

What the standalone then accept term does is accept every unicast route in the inet.0 routing table and mark it for export into the IS-IS link-state database. Because there is no from statement at the same indentation level as the final then accept statement, we have an unconditional export of the entire Internet routing table into IS-IS. (The final “then” logic is executed when no terms match the route. The logic here is: “Is the route 10/8 or longer?” No, that’s a private address. “Is the route static?” No, it’s an Internet route. “Okay, then unconditionally accept the route into IS-IS.”)
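The evaluation order described above can be illustrated with a small Python sketch. This is a toy model and not JUNOS itself: the original policy listing is not reproduced in this text, so the term names, the match functions and the `Route` and `Term` types are assumptions made for illustration, as is the idea that the corrected policy ends in a reject.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Callable, Optional


@dataclass
class Route:
    prefix: str
    protocol: str  # e.g. "static" or "bgp"


@dataclass
class Term:
    name: str                       # hypothetical term name
    match: Callable[[Route], bool]  # models the "from" conditions
    action: Optional[str] = None    # None models a term without a "then"


def is_10_slash_8_or_longer(route: Route) -> bool:
    return ip_network(route.prefix).subnet_of(ip_network("10.0.0.0/8"))


def is_static(route: Route) -> bool:
    return route.protocol == "static"


def evaluate(route, terms, final_action):
    """Return the action of the first matching term that has one;
    otherwise fall through to the final (standalone) action."""
    for term in terms:
        if term.match(route) and term.action is not None:
            return term.action
    return final_action


# Broken variant: the terms only test, the accept stands alone at the end.
BROKEN = [Term("private", is_10_slash_8_or_longer), Term("static", is_static)]
# Corrected variant: the accept lives inside the static term.
FIXED = [Term("private", is_10_slash_8_or_longer),
         Term("static", is_static, "accept")]

internet_route = Route("198.51.100.0/24", "bgp")
print(evaluate(internet_route, BROKEN, final_action="accept"))  # accept
print(evaluate(internet_route, FIXED, final_action="reject"))   # reject
```

In the broken variant every route ends up at the standalone accept, whatever its origin; in the corrected variant only static routes are accepted.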

The distributed storage space that each node may allocate is (1492 - 27) * 256 = 375,040 bytes, or roughly 375 Kbytes. How many IPv4 prefixes fit in those 375 Kbytes? Figure 12.11 in Chapter 12, “IP Reachability Information”, illustrates the structure and storage requirements of the Extended IP Reachability TLV #135. In the worst case a prefix entry in the TLV consumes 9 bytes and in the best case 5 bytes, due to variable prefix length packing. For the average Internet route we can assume a prefix length between /16 and /24 and safely assume a total storage requirement of 8 bytes per prefix. In a single TLV, on average, 31 prefixes fit, which requires 31 * 8 + 2 (TLV overhead) = 250 bytes to store. An LSP fragment is at maximum 1492 bytes in size. For TLV information there is 1492 - header size (27) = 1465 bytes of space. That means in total we can store 31 * 5 + 26 = 181 routes per fragment. Inside 256 fragments we can store around 46 K routes, which is too little to hold the entire Internet routing table. As soon as the router hits that limit, it pulls the “emergency brake” and sets the overload bit. Finally, it cleans up the mess by purging the previously generated LSPs from the distributed link-state database. And that’s what the router was showing us.

In order to fix the problem, the then accept statement is moved into the term static.
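The fragment arithmetic from the overload discussion can be re-checked with a few lines of Python. The constants are taken from the text; the 255-byte TLV value limit is the standard one-byte TLV length field, and the helper name `routes_per_fragment` is hypothetical.

```python
MAX_LSP_SIZE = 1492    # bytes per LSP fragment
LSP_HEADER = 27        # bytes of LSP header per fragment
FRAGMENTS = 256        # fragments one router may originate
TLV_MAX_VALUE = 255    # one-byte TLV length field
TLV_OVERHEAD = 2       # type byte + length byte
BYTES_PER_PREFIX = 8   # assumed average for /16 to /24 prefixes


def routes_per_fragment():
    space = MAX_LSP_SIZE - LSP_HEADER                        # 1465 bytes
    per_tlv = TLV_MAX_VALUE // BYTES_PER_PREFIX              # 31 prefixes
    full_tlv = per_tlv * BYTES_PER_PREFIX + TLV_OVERHEAD     # 250 bytes
    full_tlvs, rest = divmod(space, full_tlv)                # 5 TLVs, 215 left
    tail = max(rest - TLV_OVERHEAD, 0) // BYTES_PER_PREFIX   # 26 prefixes
    return full_tlvs * per_tlv + tail


total_space = (MAX_LSP_SIZE - LSP_HEADER) * FRAGMENTS
print(total_space)                        # 375040
print(routes_per_fragment())              # 181
print(routes_per_fragment() * FRAGMENTS)  # 46336
```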

JUNOS command output

After the broken routing policy has been changed on the router, the overload bit is automatically cleared.

hannes@Munich> show isis database

IS-IS level 2 link-state database:

LSP ID                      Sequence Checksum Lifetime Attributes
Munich.00-00                   0x1c2   0x2d3b     1192 L1 L2
Pennsauken.00-00               0xc77   0xec5e      711 L1 L2
Frankfurt.00-00                0x198   0xdd86      933 L1 L2


15.4 Summary

Most IS-IS problems can be resolved quickly if you stick to a troubleshooting plan and check from Layer 1 of the OSI Reference Model right up to the Application Layer. In IS-IS, the Application Layer represents the link-state database that holds the network’s link-state PDUs. The network engineer needs to develop an understanding of what functions each layer is performing and what tools he has available to gather information. After information gathering, the collected data needs to be analyzed and interpreted, which requires knowledge of the show commands and debug outputs. For detecting misconfiguration on a router, the network engineer needs to understand where the IS-IS relevant data in the configuration are stored.

The majority of IS-IS problems are related to adjacency formation. The network engineer needs to get familiar with all sorts of debug output for IOS and JUNOS. Just looking at the IS-IS specific configuration is often not enough to resolve a problem. We have demonstrated in the Internet route export case study that an understanding of route export and policy processing is paramount for resolving complex problems.


Network Design

For a long time, link-state protocols were believed not to scale. However, today there are operational networks with more than 1200 routers in a single level. Still, networks that run link-state protocols need to be carefully designed, and a lot of factors need to be considered to get to such a scale. By ignoring certain reasonable constraints, you can easily break a network in certain scenarios. In this chapter you will learn about the critical IS-IS network design factors, all forms of router stress, including flooding stress, SPF stress and forwarding state change stress, as well as what to consider to build robust, fast-converging networks.

16.1 Topology and Reachability Information

In service provider networks there are always at least two protocols in use. The first is an IGP (which could be OSPF or IS-IS), and the other is BGP. One of the first questions asked by networking novices is: why do we need both? It turns out that all IGPs (IS-IS, OSPF, EIGRP) lack one fundamental thing, which is flow control. For IGPs, there is no way to tell an adjacent router that its updates have overwhelmed the receiver and that the sender should throttle down. The only way to deal with the situation is to throw away the updates and wait for retransmission. However, that is still a dangerous game, as it may offload stress at the expense of the sending router, which needs to queue retransmissions and therefore consumes CPU and memory. Careful protocol heuristics need to be implemented to make sure that both the sending and receiving router do not take themselves out of service. Dave Katz, a software engineer with Juniper Networks, who can be blamed for writing the majority of IGP implementations on the Internet (his own self-definition), puts the complexity around finding the right heuristics in a single quote:

Link State Protocols are hard! (Dave Katz)

What network engineers at service providers have been doing is to apply a divide-and-conquer strategy and separate topology from reachability information. Topology information contains the skeleton of the network: it is a graph that describes how the nodes are connected to each other. It does not contain any information about customer networks, server networks, and so on. Ideally, it does not even contain information about the directly connected sub-nets. Figure 16.1 shows that the only information that the routers advertise is their loopback IP address, which is necessary to bring up an iBGP full-mesh distribution network which handles bulk transport of the routing information.


When you run IS-IS over a link you typically advertise your local IP sub-net in your IS-IS LSPs. There is even the notion that local IP sub-nets should not be announced by IS-IS, but rather by BGP. Historically there has not been an option to preclude certain IP sub-nets from being announced. However, recent routing software allows you to change that behaviour. In IOS, there is a single knob that changes the advertising behaviour of directly connected sub-nets. Once you configure the passive-only knob, the routing software walks down the list of configured interfaces and looks for interfaces that are marked as passive. Recall that passive means that you include that interface’s sub-net in your routing update, but you do not try to establish a neighbour relationship or an adjacency over that interface. The loopback interface is by default passive and so if you configure the passive-only option, only the loopback IP address of the router is advertised.


The nice thing about the JUNOS policy is that you may explicitly control the level in which direct routes are suppressed by introducing a to {} statement. The following example shows how to restrict advertisement to the routes related to the loopback0 interface, inside Level 2 LSPs only.

Kirk Lougheed (Cisco Systems) and myself’s goal was to build a routing protocol able to convey 1000 routes and not fall into pieces. If you consider the total routes being today in the Internet, we pushed the envelope a bit. (Yakov Rekhter)

Based on BGP’s superb scaling capabilities, the idea here is to “borrow” the existing BGP distribution mesh being used for transport of Internet routes for internal routes as well.

The conclusion as to why you always need two protocols is therefore: IS-IS scales too poorly for conveying a bulk amount of routes; however, it can quickly discover a topology and provide routing connectivity between router loopback IP addresses. BGP heavily depends on these IGP-supplied routes to bring up the iBGP mesh. Second, BGP is really in the dark when it comes to ascertaining the distance between a pair of routers. Internal BGP sessions are not “targeted” and therefore need an IGP to resolve routes and to give BGP speakers directions.

In order to come up with a design recommendation, let’s first evaluate the forms of stress that routers are exposed to and develop a set of critical design factors based on those insights. From there we will set up some rules to follow when designing an IS-IS network.


16.2.1 Flooding

Unlike link-local packets like Hellos (IIH) or Synchronization packets (SNP), transmitting link-state PDUs (LSPs) has a network-wide bandwidth usage impact. Once a router floods LSPs, it is using bandwidth equal to the number of links in a given topology times the size of the LSP. In the worst case, network-wide transmission of an LSP comes at a cost of the number of all links times the size of an LSP squared. The big gap between the best and the worst case (recall the best case is linear behaviour and the worst case is N^2 behaviour) is solely explainable by the way the topology is meshed.

Consider Figure 16.2, where in a strict ring topology of six routers there is no duplicate transmission of an LSP. As soon as a link breaks, the LSP travels round until every node gets a copy. Note that for greater visibility the propagation of only one LSP is shown. Of course, in real networks both ends of the link that breaks would originate a new LSP. The more links you add to the topology, the more redundant the transmission of LSPs gets. In the ring topology each router sees the LSP one time.

Figure 16.2 In a dense-meshed environment there are lots of duplicate LSPs to process

The worst case is a full mesh of all routers, where a single router failure triggers (N - 1) LSPs being flooded over (N - 2) links, which is O(N^2), through the network. The big problem in a dense- or full-mesh environment is that nodes that already got a copy of LSPs receive many redundant duplicates with the same information.
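The contrast between the ring and the full mesh can be made concrete with a small flooding simulation in Python. It is a deliberately simplified model: a node forwards an LSP once, on first receipt, to every neighbour except the one it came from. Real IS-IS flooding also involves sequence numbers, SRM/SSN flags and timers, none of which are modelled here, and the helper names are hypothetical. The ring with one broken link is modelled as a line of six routers.

```python
from collections import deque


def flood(adjacency, origin):
    """Flood one LSP from origin; return the copies received per node."""
    copies = {node: 0 for node in adjacency}
    seen = {origin}
    queue = deque((origin, nbr) for nbr in adjacency[origin])
    while queue:
        sender, receiver = queue.popleft()
        copies[receiver] += 1
        if receiver in seen:
            continue  # duplicate copy: received, then discarded
        seen.add(receiver)
        queue.extend((receiver, nbr)
                     for nbr in adjacency[receiver] if nbr != sender)
    return copies


def line(n):
    """Ring of n routers after one link broke: a simple chain."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def full_mesh(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}


ring_copies = flood(line(6), origin=0)
mesh_copies = flood(full_mesh(6), origin=0)
print(sum(ring_copies.values()), max(ring_copies.values()))  # 5 1
print(sum(mesh_copies.values()), max(mesh_copies.values()))  # 25 5
```

In this model the broken ring delivers exactly one copy to each router, while in the six-router full mesh each receiver gets one useful copy and four duplicates of a single LSP.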

An additional source of flooding stress comes from turning on the TE extensions. Once you turn on features like Traffic Engineering, DiffServ Traffic Engineering or Auto Bandwidth, the TEDs throughout the network topology need to be updated through the use of the IS-IS flooding sub-system. That means that every router in the network sees (and needs to see) accurate TE information. However, if the TE implementation permits changes to flooding timers, let very conservative timers guide your design. TE extensions are a major source of LSP updates and there should be an effort to reduce these to the minimum possible.

It is recommended that you consider the topology to evaluate the stress resulting from receipt of duplicate LSPs. Densely meshed topologies scale poorly under flooding. Try to avoid full-mesh or near-full-mesh topologies. Sometimes a lot of extra redundancy does not turn into more resiliency.

16.2.2 SPF Stress

Link-state routing protocols were once believed to be CPU-intensive algorithms that exhausted an embedded system’s sparse resources. Because of that belief, both link-state IGPs (OSPF, IS-IS) have provisions to split the link-state domain into smaller units. In OSPF multiple areas, and in IS-IS two levels, are an attempt to spare the control plane CPU when doing the SPF run.

A lot has changed in the last decade. CPUs became (in line with Moore’s Law) faster by a factor of 8000; trunk bandwidth grew from T1 speeds to OC-192c/STM-64. The only thing that has not changed at all is the paranoid thinking that SPF may exhaust the CPU resources of a router. The fact is, the demand that SPF puts on router resources has been outpaced by the processing power of modern CPUs. Table 16.1 shows how SPF execution fares on modern route processors like the Cisco Systems GRP or a Juniper Networks RE 3.0. The CPU requirements of an SPF operation are well understood and well documented by computer scientists. The fundamental relationship is O(N * log(N)), which describes a curve where the CPU requirements grow a little more than linearly, with N being the number of total routers in the network. In practice it is a little more than just log N due to the 2-way check that is needed to verify that a node is connected on both ends and not a dead end.

The results from the simulation in Table 16.1 are impressive. It means that processing a grid of 2000 routers, which are in total connected by 5000 links, has a typical execution runtime of only 100–245 milliseconds. If you consider this table then it is obvious that raw SPF execution time is not a problem for large IS-IS networks. So what is it then? Why are we all so scared of routers running excessive numbers of SPF runs back to back? What is it besides the SPF calculation itself that scares network operators so much?
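The SPF run and the 2-way check discussed in this section can be sketched in Python as a heap-based Dijkstra. This is a minimal illustration, not any vendor’s implementation; the `lsdb` layout (router mapped to the neighbours and metrics it advertises), the `Stub` node and the metrics are assumptions chosen for the example.

```python
import heapq


def spf(lsdb, root):
    """Shortest-path costs from root.

    lsdb maps router -> {neighbour: metric} as advertised by that router.
    A link is used only if the neighbour advertises the router back
    (the 2-way check), so one-sided links become dead ends.
    """
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for nbr, metric in lsdb.get(node, {}).items():
            if node not in lsdb.get(nbr, {}):
                continue  # 2-way check failed
            new_cost = cost + metric
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                heapq.heappush(heap, (new_cost, nbr))
    return dist


# Illustrative topology; "Stub" does not advertise the link back.
lsdb = {
    "Washington": {"NewYork": 4, "Pennsauken": 1},
    "Pennsauken": {"Washington": 1, "NewYork": 1},
    "NewYork": {"Washington": 4, "Pennsauken": 1, "Stub": 2},
    "Stub": {},
}
costs = spf(lsdb, "Washington")
print(costs["NewYork"], "Stub" in costs)  # 2 False
```

Each heap operation costs on the order of log N, which is where the logarithmic factor in the relationship above comes from; the extra dictionary lookup per link is the 2-way check.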

16.2.3 Forwarding State Change Stress

The purpose of the SPF calculation is to find out the shortest path to every edge of the network. However, just the insight that there are better paths available is not enough.

There are no good things, unless you do them! (Erich Kästner)

The router has to pass on the new proximity results to a subsystem called the resolver, which is used to map third-party next-hops to forwarding next-hops. Consider Figure 16.3: if the link between Washington and New York breaks, the SPF calculation will be finished in a matter of microseconds. Each IS-IS speaker is also a BGP speaker and carries several thousand active BGP routes. If the IS-IS topology changes, then the BGP routes that depend on IS-IS need to get changed as well. The resolver now needs to backtrack through all the BGP routes and verify whether the BGP next-hop is affected by a change in the core topology. As you can imagine, walking down a table of several hundreds of thousands of BGP route entries is a resource-intensive task. In our example, there are tons of forwarding state changes to do: all Washington and New York routes need to be changed in a very short time.

After the BGP dependencies have been worked out, this may generate changes in the BGP topology as well: recall that the IGP distance is part of the BGP route selection process. But that is only half of the story, as those things still occur on the control plane.

Table 16.1 Modern route processors can calculate topologies for thousands of nodes and links in under a second.

Routers   Links   SPF runtime (ms)
                  Juniper Networks Routing Engine 3.0   Cisco Systems GRP 12000


The forwarding state change of tens of thousands of routes may stress several sub-systems of an Internet core router. It turns out that changing a forwarding state is one of the most expensive operations in a router. Meanwhile, both Juniper and Cisco have found a way to pass on third-party next-hop information to the line cards and retain the dependency of BGP routes to IS-IS speakers to forwarding interfaces. More on passing on third-party next-hop information, and why it is not always a good idea to attempt to fully resolve a route to its forwarding next-hop, can be found in Chapter 10, “SPF and Route Calculation”.
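One way to picture the difference between fully resolving every BGP route and keeping the next-hop dependency is the following Python sketch. It is an illustration only and does not describe how Juniper or Cisco actually implement this; the prefix keys, the loopback address and the interface names are placeholders, and the 40,000 routes mirror the “40 K active routes” label in Figure 16.3.

```python
def resolve_flat(bgp_routes, igp_next_hop):
    """Full resolution: every BGP prefix stores its forwarding next-hop."""
    return {prefix: igp_next_hop[bgp_nh]
            for prefix, bgp_nh in bgp_routes.items()}


def changed_entries(old_table, new_table):
    """Count entries whose value differs between two tables."""
    return sum(1 for key in new_table
               if old_table.get(key) != new_table[key])


# 40,000 placeholder prefixes, all with the same BGP (third-party) next-hop.
bgp_routes = {f"prefix-{i}": "192.168.1.2" for i in range(40_000)}

# IGP view of that loopback before and after a core link breaks.
igp_before = {"192.168.1.2": "if-to-NewYork"}
igp_after = {"192.168.1.2": "if-to-Pennsauken"}

# Fully resolved table: every prefix entry has to be rewritten.
flat_changes = changed_entries(resolve_flat(bgp_routes, igp_before),
                               resolve_flat(bgp_routes, igp_after))

# Dependency kept: prefixes keep pointing at the loopback, and only the
# loopback-to-interface mapping changes.
indirect_changes = changed_entries(igp_before, igp_after)

print(flat_changes, indirect_changes)  # 40000 1
```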

Figure 16.3 The resolver needs to track and map BGP next-hops to the shortest path resulting from the SPF calculation (diagram labels: Washington, New York, Pennsauken, Frankfurt, London, Paris; link metrics 1, 1, 2 and 4; BGP, 40 K active routes)


16.2.4 CPU and Memory Usage

The two things that utilize the CPU most in an IS-IS router are the SPF calculation and the resolver. SPF calculation puts a short burden on the system, but even in large topologies that burden does not last more than 200 ms using modern route processors. As discussed in the previous section, the far bigger CPU hog is the resolver, which maps BGP routes to forwarding next-hops. SPF execution runtime is ultimately a non-issue; however, the burden that the resolver can put on the system needs to be carefully examined.

In the 1990s, during the explosive growth of the Internet, routers were constantly short of memory. Since then network service providers have been cautious about the memory usage of their routing protocols. There is almost no IS-IS-related documentation regarding memory consumption. The majority of IS-IS implementations use memory in three areas:

1. Link-state database

2. SPF result table

3. Storing neighbour information

The link-state database size is the easiest to predict. It contains mostly raw data that was extracted from the TLVs in an IS-IS PDU. There are also overhead and index structures so the IS-IS software can quickly traverse the database when it is looking for a certain LSP. As a rough guideline, one can state that the size of the link-state database is about double the size that individual LSPs consume on the wire. For example, if the network knows about 100 LSPs with an average length of 400 bytes each, then the size to store this information in the router software is 100 * 400 * 2 = 80 KB.

The size of the SPF result table depends largely on how many IP prefixes are known to IS-IS inside the network. A good estimation here is that each prefix consumes about 70 bytes. For example, if you have 1600 IS-IS prefixes in your network, then the memory consumption on the control plane is 112 KB.

The neighbouring table is the most complex one to calculate, as all the flooding state and the retransmission list need to be kept on a per-adjacency basis. That structure is also dependent on the size of the link-state database, because all the flooding states are tied to both the LSP and the adjacency. There is a lot of clever pointer work involved here, and the overhead to do efficient flooding is enormous. A good approximate figure is that this table is about 50 times the average LSP size multiplied by the number of active adjacencies. For example, if the average LSP is about 400 bytes and the number of adjacencies is eight, then the memory consumption is 400 * 50 * 8 = 160 KB.

If you sum the three memory areas up, then the result for a large network is unlikely to exceed 4–5 MB in total. In IS-IS, the memory consumption is minimal given that route processors with 256 MB–2 GB of memory are mainly deployed in the field. Interestingly, there are large overhead structures in the LSP databases to increase LSP lookup speed and to keep flooding state even for large numbers of adjacencies. This is just more evidence that memory consumption for IS-IS networks with big core routers is a non-issue.
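The three rules of thumb for memory consumption can be combined into one small estimator. The multipliers (double the on-the-wire size, 70 bytes per prefix, 50 times the average LSP size per adjacency) are the ones given in the text; the function name and the dictionary layout are hypothetical.

```python
def isis_memory_bytes(lsps, avg_lsp_size, prefixes, adjacencies):
    """Rough IS-IS memory estimate in bytes, per the guidelines above."""
    lsdb = lsps * avg_lsp_size * 2                 # link-state database
    spf_table = prefixes * 70                      # SPF result table
    neighbours = avg_lsp_size * 50 * adjacencies   # flooding state
    return {
        "lsdb": lsdb,
        "spf_table": spf_table,
        "neighbours": neighbours,
        "total": lsdb + spf_table + neighbours,
    }


estimate = isis_memory_bytes(lsps=100, avg_lsp_size=400,
                             prefixes=1600, adjacencies=8)
print(estimate)
# {'lsdb': 80000, 'spf_table': 112000, 'neighbours': 160000, 'total': 352000}
```

With the example figures used in this section the total stays around 352 KB, far below the memory of the route processors mentioned above.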


The rest of this chapter draws on many of the topics and ideas discussed throughout this book. There is no need to repeat more than the basics of the discussions, however, so we don’t present all of the gory details all over again.

16.3.1 Separate Topology and IP Reachability Data

Perhaps the most important rule is keeping topology and IP reachability data separate. You saw that IGPs are not very good at transporting large numbers of routes, so just avoid it and pass the job to BGP. In large networks (more than 1000 routers per level) you may even decide to advertise directly connected routes in BGP as well. Given that an average IS-IS core router has about five or six directly attached sub-nets, you clearly want to avoid those extra 2500–3000 prefixes at the IS-IS level in order to keep convergence times within an upper bound. An ideal IS-IS LSP contains just a single IP prefix, which is the router’s loopback IP address, plus Extended IS Reach TLVs that point to neighbouring routers.
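The 2500–3000 figure is not derived in the text. One reading that reproduces it, stated here purely as an assumption, is that each directly attached sub-net is a point-to-point link shared by two core routers, so every sub-net is counted once for both ends. A short Python check under that assumption (the function name is hypothetical):

```python
def connected_prefixes(routers, subnets_per_router, routers_per_subnet=2):
    """Unique directly connected sub-nets, assuming each sub-net is
    shared by routers_per_subnet routers (2 for point-to-point links)."""
    return routers * subnets_per_router // routers_per_subnet


low = connected_prefixes(1000, 5)
high = connected_prefixes(1000, 6)
print(low, high)  # 2500 3000
```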

Tcpdump output

An ideal LSP just conveys a single IP prefix per router and passes all other routing information via BGP.

12:36:45.587565 OSI, IS-IS, length: 405

hlen: 27, v: 1, pdu-v: 1, sys-id-len: 6 (0), max-area: 3 (0)

L2 LSP, lsp-id: 2092.1113.4009-00, seq: 0x000002fd, lifetime: 1198s

chksum: 0xe984 (correct), PDU length: 185, Flags: [ L1L2 IS ]

Area address(es) TLV #1, length: 4

Area address (length: 3): 49.0001

Protocols supported TLV #129, length: 1

NLPID(s): IPv4

IPv4 Interface address(es) TLV #132, length: 4

IPv4 interface address: 192.168.1.1

Hostname TLV #137, length: 10

Hostname: Washington

Extended IS Reachability TLV #22, length: 99

IS Neighbor: 1921.6800.1077.00, Metric: 4, sub-TLVs present (12)

IPv4 interface address (subTLV #6), length: 4, 172.17.1.6

IPv4 neighbor address (subTLV #8), length: 4, 172.16.33.37

IPv4 neighbor address (subTLV #8), length: 4, 172.16.33.26

Extended IPv4 reachability TLV #135, length: 9

IPv4 prefix: 192.168.1.1/32, Distribution: up, Metric: 0

Authentication TLV #10, length: 17

HMAC-MD5 password: 68e18feb2e29257113e4bb6580169310

16.3.2 Keep the Number of Active BGP Routes per Node Low

Vendors have come up with smart representations of BGP routes and how those routes depend on IS-IS routes. However, there is one fault condition where even smart route representations inside a router do not gain us much. If an entire BGP speaker disappears, the BGP control plane needs to re-route all of that speaker’s prefixes, which of course takes time. If a router is carrying a large number of active routes, then re-routing takes proportionally longer if that router goes down. Figure 16.4 shows that, on the left-hand side, Washington is a “hotspot” BGP speaker that carries the majority of BGP routes. If this speaker goes down, then you need to re-route all 120 K routes, which can cause a network-wide outage of up to 3 minutes. The logical step is to spread those 120 K routes among several routers as shown on the right-hand side of Figure 16.4.

In well-developed peering meshes, the average number of routes per border router is not more than 10 K. In our example, because of a lack of routers, we still did not put more than 30 K routes per node. In practice, if you receive more than 10 K routes per peer, then you may need to consider a redundant router and spread the incoming prefixes over the two redundant routers. Re-routing 10 K prefixes if the active router breaks down can be done in a matter of 5–10 seconds.

16.3.3 Avoid LSP Fragmentation

IS-IS has plenty of space in the distributed database (precisely 375,040 bytes across the 256 fragments of an LSP). Despite this vast amount of information that an individual IS-IS speaker can originate, you typically do not want to use that storage size, ever. You should try to accommodate all the information that you need in maxLSPsize (1492) - LSP header (27) = 1465 bytes. There may be a number of additional LSP updates if you cross an LSP boundary and have to break things up into another segment. Consider Figure 16.5 to see what happens if you are at the edge of Fragment 0 and an additional adjacency comes up. Router 1921.6800.1018 decides that it needs to start another segment, generates the fragment and floods it. The troubles start if any of the router’s other sub-nets or adjacencies become unavailable. Assume that Adjacency #4 goes down: all the TLVs that follow this particular adjacency then get shifted, and may also fall into another fragment. Considering the example in Figure 16.5, there is no need to
