Loop-Free Alternate Routes - EtherealMind
In this post we take a look at IP FRR and micro-loops. The router calculates loop-free alternate paths in advance and programs them. The technique is called Loop-Free Alternate Fast ReRoute (LFA-FRR), also known as IP Fast ReRoute. This document provides an overview of Juniper's Loop-Free Alternates feature, a solution that delivers fast restoration and convergence.
It covers the case where the repair first hop is reached via a broadcast or non-broadcast multi-access (NBMA) link such as a LAN, and the case where the P or Q node is attached via such a link. It does not, however, cover the more complicated case where the failed interface is a broadcast or NBMA link. This document considers the case when the repair path is confined to either a single area or to the level two routing domain.
Repair paths are precomputed in anticipation of later failures so they can be promptly activated when a failure is detected. A tunneled repair path tunnels traffic to some staging point in the network from which it is known that, in the absence of a worse-than-anticipated failure, the traffic will travel to its destination using normal forwarding without looping back.
This is equivalent to providing a virtual loop-free alternate to supplement the physical loop-free alternates; hence the name "remote LFA FRR". In its simplest form, when a link cannot be entirely protected with local LFA neighbors, the protecting router seeks the help of a remote LFA staging point.
Examples of worse failures are node failures (see Section 7), the failure of a Shared Risk Link Group (SRLG), the independent concurrent failures of multiple links, or broadcast or NBMA links (Section 3); protecting against such failures is out of scope for this specification. In this document, such a tunnel is termed a repair tunnel.
The tail end of this tunnel (the repair tunnel endpoint) is a "PQ node", and the repair mechanism is a "remote LFA". Note that the repair tunnel terminates at some intermediate router between S and E, and not E itself. This is clearly the case, since if it were possible to construct a tunnel from S to E, then a conventional LFA would have been sufficient to effect the repair. In this case, the outer label is S's neighbor's label for the repair tunnel endpoint, and the inner label is the repair tunnel endpoint's label for the packet destination.
In order for S to obtain the correct inner label, it is necessary to establish a targeted LDP session [ RFC ] to the tunnel endpoint. The selection of the specific tunneling mechanism and any necessary enhancements used to provide a repair path is outside the scope of this document. The performance of the encapsulation and decapsulation is efficient, as encapsulation is just a push of one label (like conventional MPLS-TE FRR) and the decapsulation is normally configured to occur at the penultimate hop before the repair tunnel endpoint.
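The label handling described above can be illustrated with a minimal sketch. The function name, the two binding tables, and the label values are hypothetical and exist only for illustration; the sketch assumes S has already learned its neighbor's label for the repair tunnel endpoint over the ordinary LDP session and the endpoint's label for the destination over the targeted LDP session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairLabels:
    outer: int  # next hop's label for the repair tunnel endpoint
    inner: int  # repair tunnel endpoint's label for the packet destination


def build_repair_labels(neighbor_bindings, tldp_bindings, next_hop, pq_node, prefix):
    """Return the (outer, inner) labels S places on a repaired packet.

    neighbor_bindings[next_hop][fec]: labels learned over the link LDP session.
    tldp_bindings[pq_node][prefix]:   labels learned over the targeted LDP session.
    Both tables are hypothetical structures used only in this sketch.
    """
    outer = neighbor_bindings[next_hop][pq_node]
    inner = tldp_bindings[pq_node][prefix]
    return RepairLabels(outer=outer, inner=inner)


# Hypothetical label values, for illustration only.
neighbor_bindings = {"A": {"C": 24001}}
tldp_bindings = {"C": {"10.0.0.0/24": 30017}}
labels = build_repair_labels(neighbor_bindings, tldp_bindings, "A", "C", "10.0.0.0/24")
print(labels)
```

Without the targeted session, the second lookup has nothing to return, which is why the session has to exist before the repair can be used.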
In the control plane, a Targeted LDP (TLDP) session is needed between the repairing node and the repair tunnel endpoint, which will need to be established and the labels processed before the tunnel can be used. The time to establish the TLDP session and acquire labels will limit the speed at which a new tunnel can be put into service.
This is not anticipated to be a problem in normal operation since the managed introduction and removal of links is relatively rare, as is the incidence of failure in a well-managed network. Consequently, the repair tunnel used MUST be provisioned beforehand in anticipation of the failure. Since the location of the repair tunnels is dynamically determined, it is necessary to automatically establish the repair tunnels. Multiple repair tunnels may share a tunnel endpoint.
Construction of Repair Paths
Identifying Required Tunneled Repair Paths
Not all links will require protection using a tunneled repair path. Tunneled repair paths (which may be calculated per prefix) are only required for links that do not have a link or per-prefix LFA. It should be noted that using the Q-space of E as a proxy for the Q-space of each destination can result in failing to identify valid remote LFAs.
The extent to which this reduces the effective protection coverage is topology dependent.
Determining Tunnel Endpoints
The repair tunnel endpoint needs to be a node in the network reachable from S without traversing S-E. In addition, the repair tunnel endpoint needs to be a node from which packets will normally flow towards their destination without being attracted back to the failed link S-E.
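Note that once released from the tunnel, the packet will be forwarded, as normal, on the shortest path from the release point to its destination. This may result in the packet traversing the router E at the far end of the protected link S-E, but this is not required.
The properties that are required of repair tunnel endpoints are as follows: (1) the repair tunnel endpoint must be reachable from the tunnel source without traversing the failed link S-E, and (2) when released from the tunnel, packets must proceed towards their destination without being attracted back over the failed link S-E. In some topologies it will not be possible to find a repair tunnel endpoint that exhibits both the required properties.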
For example, if the ring topology illustrated in Figure 1 had a cost of four for the link B-C while the remaining links had a cost of one, then it would not be possible to establish a tunnel from S to C without resorting to some form of source routing.
Computing Repair Paths
To compute the repair path for link S-E, it is necessary to determine the set of routers that can be reached from S without traversing S-E and match this with the set of routers from which the node E can be reached by normal forwarding without traversing the link S-E.
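The approach used in this memo is as follows:
First, the set of routers that can be reached from S on the shortest path tree without traversing S-E is computed. This is called S's P-space with respect to the failure of link S-E.
Second, the reach of S is extended by also using the P-spaces of its neighbors with respect to the failure of link S-E, since S can choose which neighbor to use as the first hop of the repair. This is called S's extended P-space with respect to the failure of link S-E. The use of extended P-space allows greater repair coverage and is the preferred approach.
Finally, the set of routers from which the node E can be reached, by normal forwarding without traversing the link S-E, is computed. This is called the Q-space of E with respect to the link S-E.
The selection of the preferred node from the set of nodes that are in both extended P-space and Q-space with respect to S-E is described in Section 5.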
A suitable cost-based algorithm to compute the set of nodes common to both extended P-space and Q-space with respect to S-E is provided in Section 5.
P-space
The set of routers that can be reached from S on the shortest path tree without traversing S-E is termed the P-space of S with respect to the link S-E.
The exclusion of routers reachable via an ECMP that includes S-E prevents the forwarding subsystem from attempting to execute a repair via the failed link S-E. Thus, for example, if the Shortest Path First (SPF) computation stores at each node the next hops to be used to reach that node from S, then the node can be added to P-space if none of its next hops are link S-E.
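The next-hop rule above can be sketched as follows. This is an illustration under stated assumptions, not the document's own pseudocode: the function names are hypothetical, the graph is a plain dictionary of link costs, and the six-node unit-cost ring S-A-B-C-D-E is only assumed to match Figure 1, which is not reproduced here.

```python
import heapq


def spf_first_hops(graph, root):
    """Dijkstra from root that also records, for every node, the set of
    root's neighbors used as first hop on any equal-cost shortest path."""
    dist = {root: 0}
    hops = {root: set()}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            via = {v} if u == root else hops[u]
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                hops[v] = set(via)
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                hops[v] |= via  # ECMP: merge first hops
    return dist, hops


def p_space(graph, s, e):
    """Nodes whose shortest paths from s never start over link s-e.
    Assumes a single link between s and e, so first hop e means link s-e."""
    _, hops = spf_first_hops(graph, s)
    return {n for n, via in hops.items() if n != s and e not in via}


# Assumed ring topology, all link costs 1.
RING = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1, "S": 1},
}
print(sorted(p_space(RING, "S", "E")))
```

In this ring, node C is reachable at equal cost both via A and via E, so the ECMP exclusion keeps it out of the P-space.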
In the case of Figure 1, this P-space comprises nodes A and B only.
Extended P-space
The description above calculated router S's P-space rooted at S itself.
However, since router S will only use a repair path when it has detected the failure of the link S-E, the initial hop of the repair path need not be subject to S's normal forwarding decision process. Thus, the concept of extended P-space is introduced.
The use of extended P-space may allow router S to reach potential repair tunnel endpoints that were otherwise unreachable. Since node C is also in E's Q-space with respect to link S-E, there is now a node common to both extended P-space and Q-space that can be used as a repair tunnel endpoint to protect the link S-E.
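A cost-based sketch of the extended P-space follows. It assumes the usual strict-inequality formulation, in which a node P belongs to the extended P-space when, for some neighbor N of S other than E, the shortest path cost N->P is strictly less than the cost of N->S->E->P. The function names and the unit-cost ring (assumed to match Figure 1) are illustrative only, and link costs are taken to be symmetric.

```python
import heapq


def dijkstra(graph, root):
    """Shortest-path cost from root to every reachable node."""
    dist, heap = {root: 0}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def extended_p_space(graph, s, e):
    """Union of the P-spaces of s's neighbors (other than e) with respect
    to link s-e. Assumes a connected graph with symmetric costs."""
    dist_e = dijkstra(graph, e)
    link = graph[s][e]
    result = set()
    for n in graph[s]:
        if n == e:
            continue  # the first hop of the repair must avoid link s-e
        dist_n = dijkstra(graph, n)
        for y, d_ny in dist_n.items():
            # s and e themselves are not useful remote endpoints
            if y not in (s, e) and d_ny < dist_n[s] + link + dist_e[y]:
                result.add(y)
    return result


# Assumed ring topology, all link costs 1.
RING = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1, "S": 1},
}
print(sorted(extended_p_space(RING, "S", "E")))
```

Compared with the plain P-space (A and B in this ring), the extended set gains node C because neighbor A reaches C strictly more cheaply than by going back through S and over S-E.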
Q-space
The set of routers from which the node E can be reached, by normal forwarding without traversing the link S-E, is termed the Q-space of E with respect to the link S-E. It is calculated using a reverse shortest path tree (rSPT) rooted at E. The rSPT uses the cost towards the root rather than from it and yields the best paths towards the root from other nodes in the network. As can be seen in the case of Figure 1, S's P-space and E's Q-space have no node in common, and hence there is no viable repair tunnel endpoint.
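The reverse computation can be sketched as below. The sketch assumes the cost formulation in which a node Q is in the Q-space when its cost towards E is strictly less than its cost towards S plus the cost of link S-E. The function names are hypothetical, the graph maps each node to its outgoing link costs, and the unit-cost ring is only assumed to match Figure 1.

```python
import heapq


def reverse_spf(graph, root):
    """Cost from every node towards root, computed by running Dijkstra
    over the reversed edges (graph[u][v] is the cost of the link u -> v)."""
    reverse = {n: {} for n in graph}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            reverse[v][u] = w
    dist, heap = {root: 0}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in reverse[u].items():
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def q_space(graph, s, e):
    """Nodes whose shortest path towards e cannot traverse link s-e.
    Assumes every node can reach both s and e."""
    to_e = reverse_spf(graph, e)
    to_s = reverse_spf(graph, s)
    link = graph[s][e]
    return {q for q in graph if q not in (s, e) and to_e[q] < to_s[q] + link}


# Assumed ring topology, all link costs 1.
RING = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1, "S": 1},
}
print(sorted(q_space(RING, "S", "E")))
```

Node B is excluded because its two paths towards E have equal cost and one of them crosses S-E, which is the Q-space counterpart of the ECMP exclusion used for P-space.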
When the extended P-space is considered instead, node C is common to both extended P-space and Q-space and can serve as the repair tunnel endpoint. Note that the Q-space calculation could be conducted for each individual destination and a per-destination repair tunnel endpoint determined. However, this would, in the worst case, require an SPF computation per destination, which is not currently considered to be scalable. Therefore, the Q-space of E with respect to link S-E is used as a proxy for the Q-space of each destination.
This approximation is obviously correct since the repair is only used for the set of destinations which were, prior to the failure, routed through node E.
Selecting Repair Paths
The mechanisms described above will identify all the possible repair tunnel endpoints that can be used to protect a particular link.
In a well-connected network, there are likely to be multiple possible release points for each protected link. All will deliver the packets correctly, so arguably, it does not matter which is chosen. However, one repair tunnel endpoint may be preferred over the others on the basis of path cost or some other selection criteria. There is no technical requirement for the selection criteria to be consistent across all routers, but such consistency may be desirable from an operational point of view.
In general, there are advantages in choosing the repair tunnel endpoint closest (shortest metric) to S. Choosing the closest maximizes the opportunity for the traffic to be load balanced once it has been released from the tunnel. As described in [ RFC ], always selecting a PQ node that is downstream to the destination with respect to the repairing node prevents the formation of loops when the failure is worse than expected.
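Putting the pieces together, a minimal sketch of candidate discovery and closest-endpoint selection might look like this. It is an illustration under assumptions rather than a definitive algorithm: the function names are hypothetical, link costs are symmetric, "closest" is taken to mean the lowest shortest-path metric from S in the pre-failure topology, and the ring is only assumed to match Figure 1. The downstream-only restriction is not modeled.

```python
import heapq


def dijkstra(graph, root):
    """Shortest-path cost from root to every reachable node."""
    dist, heap = {root: 0}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def pq_candidates(graph, s, e):
    """Map each node in both extended P-space and Q-space to its metric from s."""
    d = {n: dijkstra(graph, n) for n in graph}  # all-pairs; fine for a sketch
    link = graph[s][e]
    ext_p = {y for n in graph[s] if n != e
             for y in graph if d[n][y] < d[n][s] + link + d[e][y]}
    q = {y for y in graph if d[y][e] < d[y][s] + link}
    return {y: d[s][y] for y in (ext_p & q) - {s, e}}


def select_pq(graph, s, e):
    """Pick the candidate closest to s; ties are broken by node name."""
    cands = pq_candidates(graph, s, e)
    return min(cands, key=lambda y: (cands[y], y)) if cands else None


def ring(bc_cost=1):
    """Assumed six-node ring; only the B-C cost is adjustable."""
    edges = [("S", "A", 1), ("A", "B", 1), ("B", "C", bc_cost),
             ("C", "D", 1), ("D", "E", 1), ("E", "S", 1)]
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


print(select_pq(ring(), "S", "E"))
print(select_pq(ring(4), "S", "E"))
```

With unit costs the only candidate is C. Raising the B-C cost to four leaves no candidate at all, which corresponds to the earlier observation that such a ring cannot be repaired without some form of source routing.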
The use of downstream nodes reduces the repair coverage, and operators are advised to determine whether adequate coverage is achieved before enabling this selection feature.
This section describes a method of computing the remote LFA repair target for a specific failed link using a cost-based algorithm.
The pseudocode provided in this section avoids unnecessary SPF computations; for the sake of readability, it does not otherwise try to optimize the code. It covers the case where the repair first hop is reached via a broadcast or NBMA link such as a LAN, and the case where the P or Q node is attached via such a link. It does not cover the case where the failed interface is a broadcast or NBMA link. To address that case it is necessary to compute the Q-space of each neighbor of the repairing router reachable through the LAN.
However, there are two situations where this behavior may result in a repair path traversing a link or router that should be excluded. One situation is when the first hop on the repair tunnel path from the point of local repair (PLR) to a direct neighbor does not follow the IGP shortest path.
Node protection for route C1 is not applicable. If C1 goes down, traffic destined to C1 is lost anyway. It is node-protecting, because eq2: It is node-protecting if eq2: This relationship between e and c is an important aspect of the analysis, which is discussed in detail in Sections 3.
All impacted destinations are protected against link failure. Node protection upon E1's failure is not applicable, as the only impacted traffic terminates at E1 and hence is lost anyway. Node protection is not applicable. De facto node protection is not applicable. This is particularly the case for dual-plane core or two-tiered IGP metric design; see Sections 3. The IGP convergence does not cause any uLoop. A1, C1, E2, and P. Node protection for route A1 is not applicable. If A1 goes down, traffic to A1 is lost anyway.
Node protection for route C1 is guaranteed: Node protection is guaranteed: De facto node protection is provided for all destinations except to A1, which is not applicable. If C1 goes down, traffic to C1 is lost anyway. It is de facto protected against node failure if eq2: A1, E1, and E2. E2 behaves like E1 and hence is not analyzed further. Node protection upon A1's failure is not applicable, as the traffic to A1 is lost anyway. Node protection upon A1's failure is guaranteed, because eq2: Node protection is guaranteed where applicable.
De facto node protection is available. They benefit from node protection upon failure of A nodes.
Node protection for traffic to A1 upon A1 node failure is not applicable. This LFA is guaranteed to be node-protecting, because eq2: E1's primary route to P is via E1A1.
Its LFA is via A2, because eq1: C1, A3, E3, and P. Node protection is not applicable for traffic to C1 when C1 fails.
It is node-protecting, because eq2: The LFA is via A2, because eq1: This LFA is node-protecting from the viewpoint of A1 computing eq2 if eq2: Note that A3 benefits from de facto node protection. E2's analysis is the same as E1 and hence is omitted.
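The definitions of eq1 and eq2 are not reproduced in this text. The sketch below assumes they are the standard LFA inequalities: eq1 (link-protecting) holds when the candidate neighbor N reaches the destination more cheaply than by returning through S, and eq2 (node-protecting) holds when N's path also avoids the primary next hop F. The topology, node names, and costs are hypothetical and chosen only so that one neighbor satisfies both inequalities and another satisfies only eq1.

```python
import heapq


def dijkstra(graph, root):
    """Shortest-path cost from root to every reachable node."""
    dist, heap = {root: 0}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def is_link_protecting(d, n, s, dest):
    """Assumed eq1: dist(N, D) < dist(N, S) + dist(S, D)."""
    return d[n][dest] < d[n][s] + d[s][dest]


def is_node_protecting(d, n, f, dest):
    """Assumed eq2: dist(N, D) < dist(N, F) + dist(F, D), F = primary next hop."""
    return d[n][dest] < d[n][f] + d[f][dest]


def build(edges):
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


# Hypothetical topology: S reaches D via primary next hop F.
GRAPH = build([("S", "F", 1), ("F", "D", 1), ("S", "N1", 1),
               ("N1", "D", 2), ("S", "N2", 5), ("N2", "F", 1)])
DIST = {n: dijkstra(GRAPH, n) for n in GRAPH}

for n in ("N1", "N2"):
    print(n, is_link_protecting(DIST, n, "S", "D"),
          is_node_protecting(DIST, n, "F", "D"))
```

Here N1 has its own path to D and passes both checks, while N2 reaches D only through F, so it protects against failure of the link S-F but not against failure of node F.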
C1 has no LFA for A1. Indeed, its neighbors C2 and A3 have a shortest path to A1 via C1. It provides node protection, because eq2: Indeed, there are many more destinations reachable over A1C1 than over C1A1.
Typically, most of the traffic traversing link C1A1 is directed to these E nodes; hence, the lack of per-prefix LFAs for the destination A1 might be insignificant. It definitely has a negative impact upon per-link LFAs. The number of destinations impacted by A1C1 failure is much larger than the direction C1A1; hence, the protection is provided for the wrong direction. Some backbone topologies will lead to very good protection coverage, while some others might provide very poor coverage.
Such a study likely requires a planning tool, as each remote destination P would have a different e value exception: C1 has no per-prefix LFA to A1. Second, when per-prefix LFA provides node protection (eq2 is satisfied), per-link LFA provides effective de facto node protection.
A Square Might Become a Full Mesh
If the vertical links of the square are made of parallel links (at the IP topology or below), then one should consider splitting these "vertical links" into "vertical and crossed links".
The topology becomes "full mesh". A typical scenario in which this is prevented would be when the A1C1 bandwidth may be within a building while the A1C2 is between buildings. Hence, while from a router-port viewpoint the operation is cost-neutral, from a cost-of-bandwidth viewpoint it is not.
A Full Mesh Might Be More Economical Than a Square
In a full mesh, the vertical and crossed links play the dominant role, as they support most of the primary and backup paths.
The capacity of the horizontal links can be dimensioned on the basis of traffic destined to a single C node or a single A node, and to a single E node.
Extended U
For the Extended U topology, we define the following terminology: This loopback is in L2. There might be an L2 link between C1 and C2. This is not relevant, as this is not seen from the viewpoint of the L1 topology, which is the focus of our analysis.
It is guaranteed that there is a path from C1LO to C2LO within the L2 topology except if the L2 topology partitions, which is very unlikely and hence not analyzed here. We call "c" its path cost. Once again, as the source and destination addresses are the loopbacks of C1 and C2 and these loopbacks are in L2 only, it is guaranteed that the tunnel does not transit via the L1 domain.
A router supporting such an extension learns that it has one additional potential neighbor in topology level-1 when checking for LFAs. The metric advertised by C2L1 is bigger than the metric advertised by C1L1 by "c". The metric advertised by C2L1 is bigger than the metric advertised by C1L1 by "e".
Node protection is guaranteed, because eq2: Node protection is possible, because eq2: Same as that for the square topology. C1, E3, and P. Indeed, eq1 is true: Remember that the tunnel is not seen by IS-IS for computing primary paths! Node protection is not applicable for traffic to A1 when A1 fails.
Node resistance is applicable for traffic to E1 and E2.
Conclusion
The Extended U topology is as good as the square topology. It does not require any crossed links between the A and C nodes within an aggregation region.
It does not need an L1 link between the C routers in an access region. Note that a link between the C routers might exist in the L2 topology. Assuming such an IGP metric allocation, the following properties are guaranteed: Then, we show that all of the other cases do not have uLoop potential. It can be shown that all of the other routing transitions following a link failure in the analyzed topologies do not have uLoop potential.
Indeed, in each case, for all destinations affected by the failure, the rerouting nodes deviate their traffic directly to adjacent nodes whose paths towards these destinations do not change. As a consequence, all of these routing transitions cannot undergo transient forwarding loops. For example, in the square topology, the failure of directed link A1C1 does not lead to any uLoop.
The destinations reached over that directed link are C1 and P. A1's and E1's shortest paths to these destinations after the convergence go via A2.
Summary
In this section, we summarize the applicability of LFAs detailed in the previous sections.
For link protection, we use "Full" to refer to the applicability of LFAs for each destination, reached via any link of the topology. For node protection, we use "Yes" to refer to the fact that node protection is achieved for a given node. In practice, this means that per-prefix LFAs will be used.
The main optimization objectives for backbone topology design are cost, latency, and bandwidth, constrained by the availability of fiber. Optimizing the design for local IP restoration is more likely to be considered as a non-primary objective.
For example, the way the fiber is laid out and the resulting cost to change it lead to ring topologies in some backbone networks. Also, the capacity-planning process is already complex in the backbone.
The process needs to make sure that the traffic matrix demand is supported by the underlying network capacity under all possible variations of the underlying network (what-if scenarios related to one-SRLG failure). Classically, "supported" means that no congestion is experienced and that the demands are routed along the appropriate latency paths. Selecting the LFA method as a deterministic FRR solution for the backbone would require enhancement of the capacity-planning process to add a third constraint: each variation of the underlying network should lead to sufficient LFA coverage.
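As a rough illustration of what checking that third constraint could involve, the sketch below computes link-protection coverage for one topology variation: the fraction of (source, destination) pairs for which the source has either an equal-cost alternative or a neighbor satisfying the loop-free inequality. The function name and the two test topologies are hypothetical, costs are symmetric, and the inequality is assumed to be the standard link-protecting LFA condition. A planning tool would repeat such a check for every what-if scenario.

```python
import heapq


def dijkstra(graph, root):
    """Shortest-path cost from root to every reachable node."""
    dist, heap = {root: 0}, [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def lfa_coverage(graph):
    """Fraction of (source, destination) pairs with link protection."""
    d = {n: dijkstra(graph, n) for n in graph}
    protected = total = 0
    for s in graph:
        for dest in graph:
            if dest == s:
                continue
            total += 1
            # Neighbors lying on a shortest path from s to dest.
            primary = {n for n, w in graph[s].items()
                       if w + d[n][dest] == d[s][dest]}
            # Other neighbors whose own path to dest does not return via s.
            backups = [n for n in graph[s] if n not in primary
                       and d[n][dest] < d[n][s] + d[s][dest]]
            if len(primary) > 1 or backups:
                protected += 1
    return protected / total


def build(edges):
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


TRIANGLE = build([("a", "b", 1), ("b", "c", 1), ("c", "a", 1)])
SQUARE_RING = build([("a", "b", 1), ("b", "c", 1), ("c", "d", 1), ("d", "a", 1)])
print(lfa_coverage(TRIANGLE), lfa_coverage(SQUARE_RING))
```

The unit-cost triangle is fully covered, whereas the four-node ring protects only the pairs that already have equal-cost paths, which illustrates why ring-shaped backbones tend to give poor LFA coverage.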
We detail this aspect in Section 7. On the other hand, the access network is based on many replications of a small number of well-known well-engineered topologies. In practice, we believe that there are three profiles for the backbone applicability of the LFA method: In the first profile, the designer plans all of the network resilience on IGP convergence.
In such a case, the LFA method is a free bonus. If an LFA is available, then the loss of connectivity is likely reduced by a factor of 10 (to around 50 msec). The LFA method should be very successful here, as it provides a significant improvement without any additional cost.
In the second profile, the designer seeks a very high and deterministic FRR coverage, and he either does not want to or cannot engineer the topology. The LFA method should not be considered in this case. Explicit routing ensures that a backup path exists, whatever the underlying topology.
In the third profile, the designer seeks a very high and deterministic FRR coverage, and he does engineer the topology. The LFA method is appealing in this scenario, as it can provide a very simple way to obtain protection. Furthermore, in practice, the requirement for FRR coverage might be limited to a certain part of the network. In such a case, if the relevant part of the network natively provides a high degree of LFA protection for demands of interest, it might actually be straightforward to improve the topology and achieve the level of protection required for the sub-topology and the demands that matter.