A Comparative Study of Dynamic Routing in Circuit-Switched Networks

by Greg Trangmoe

Introduction

This report traces the development of Dynamic Routing methods for circuit-switched traffic. Circuit-switched traffic (e.g. voice calls) still comprise the largest usage of today's telephone networks. Advances in switching technology have established Dynamic Routing as the dominant alternative to static hierarchical networks. As technology continues to improve, Dynamic Routing must evolve in order to provide enhanced service to users. With performance measured by "Grade Of Service" - the probability that a call encounters network congestion - efficiency can translate into a cheaper network for given performance criteria, or into increased performance at a given cost. A careful examination of three important Dynamic Routing strategies in existence today will serve to classify the approaches to Dynamic Routing and derive the relevant merits ad demerits of each approach against a given set of criteria.

The reader is assumed to have fundamental knowledge of communications networks. The routing problem is described in some detail and would be sufficient for application to the content of this report.

Why Dynamic Routing?

There has been a steady move in research and industry toward Dynamic Routing solutions. A brief description of the history and evolution of routing is presented to motivate current thinking in network routing.

Routing Principles

The goal of routing in a communications network is to direct user traffic from a source to the correct destination in accordance with the network’s service requirements. The service requirements for a given network are often expressed as a set of objectives. Objectives can include maximizing network performance (e.g., delay and throughput) and minimizing costs (e.g., equipment and facilities). The underlying technology of the network imposes constraints on the network objectives. Constraints arise from the limitations of the switching technology, the volume of user and network traffic, and the services requested by the network. It is the multi-objective, multi-constraint nature of routing that makes it such a complex problem in communications networks.

Although many routing solutions have been employed, all communications networks share a core of basic routing functionality. [16]

Assembling and Distributing State Information for Network Traffic
State information for an active network may include available services, available resources, and abnormal conditions in the network. State information may comprise both measured and predicted values obtained from the network and from external sources.

Selecting Feasible or Optimal Routes Based on Network State Information
Feasible routes are those which satisfy all the user- and network-imposed service constraints. Optimal routes are feasible routes that are “best” with respect to the network objectives. It is often too computationally intensive to find the most optimal route for every call but heuristic approaches have been shown to closely approximate optimal routing for a fraction of the computational costs.

Forwarding Traffic Along Selected Routes
Once a route has been selected traffic can be forwarded according to one of two forwarding paradigms: Connection-oriented or Connection-less forwarding. Connection-oriented forwarding requires forwarding directives to be set up prior to using the route to carry user traffic. This approach is often compared to placing a phone call where both participants in the conversation must pick up the phone in order for communication to take place.

Connection-less forwarding requires that the user traffic carry enough information to provide forwarding directives for the switches along the network path. This approach is analogous to the US Postal system of delivering mail in which each piece of mail must have the destination address in order to be delivered.

Routing systems vary widely in the types of state changes to which they respond and the speed of their responses.

Fixed Routing

At one extreme are static routing systems whose routing remains fixed, independent of the current state of the users and the network. Static routing is based on expected rather than actual network state. Static routing involves virtually no real-time activities other than traffic forwarding and thus requires almost no computational resources within the network itself.

This was the primary reason for the development of Bell’s fixed hierarchical network, shown in Figure 1, which had it’s beginnings in the late 1950's. Since the mechanical switches of the day were computationally slow it made sense to reduce the role of the switches to one of forwarding only. The real-time routing computations were essentially eliminated by designing the network to exceed busy-hour, busy-day predictions. This imposed an additional burden on the network designers who needed comprehensive and accurate predictions of network behavior to ensure network reliability. [3,5,6]

The secondary reason for fixed routing was that the telephone companies were reluctant to relinquish a large portion of network control to the network itself, because of the economic consequences if the network were to make incorrect routing decisions. The rate and unpredictability of changes in modern communications networks makes it a necessity to use dynamic routing within the network to improve performance.Safeguards were built into the hierarchical network to provide alternate routes in the event of node or link failure. Figure 1 shows these redundant links as dashed lines between different levels of the hierarchy. The redundant links provide alternate routes for traffic but the network is still considered a static network since routing patterns can not be changed according to the time of day or current network traffic.

Dynamic Routing

A Dynamic Routing system selects routes based on current state information for the network. The state information can be predicted or measured but the route will change depending on the available state information at the time of the traffic request. The advent of smart digital switches in the network allowed for the evolution of traffic routing from fixed hierarchical to dynamic non-hierarchical, in order to eliminate problems that are presently well identified. [8, 15]

As traffic routing is closely related to network design in fixed networks, it becomes clear that using a fixed routing pattern, determined by busy-hour and busy-season traffic measurements, cannot allow the efficient accommodation in unexpected traffic situations.

Network robustness is compromised when a single link failure represents a loss of connection to a significant portion to the overall network. This is also a cause of poor efficiency in the network since large portions of the network can lie idle while calls through one congested link are blocked.

A typical non-hierarchical network is shown in Figure 2. Each of the switches has a direct connection to all other switches in the network. This is in sharp contrast to the hierarchical network in which one possible route exists between two switches in the network. In a N-node non-hierarchical network there is always one direct route between any two nodes and N-2 two-link routes between nodes. It is clear that this type of network will be better able to adapt to failure of one node or link since multiple alternative routes exist between any two nodes.

The telephone companies have come to recognize that dynamic routing is profitable because it reduces both network trunking and operating costs along with call blocking under all conditions. The integration of non-hierarchical routing into the AT&T switched network was completed in 1987.

Dynamic Routing Strategies

Dynamic Routing strategies are typically classified along two lines:

1. The location of the computational engine in the routing system.

2. The type of information used to make routing decisions.

Centralized vs. Distributed

The basic routing functions can in either centralized or decentralized form. The degree of decentralization depends upon the desired dynamism, robustness and manageability of the routing system.

In a centralized implementation, a single entity performs the routing computations. Centralized procedures are easy to manage because the functionality resides at a single entity. Costs can be reduced by concentrating the computational engine at a central location.

There are also drawbacks to centralized implementations. When a central entity fails or is otherwise isolated from the network it affects the performance of the entire network. This fact was evidenced in 1990 when the failure of AT&T's routing computer prevented completion of many calls during a 9-hour period. [7] Also, a routing function's responsiveness to state changes depends on the state of the central processing facility. A highly loaded network will have slower response to state changes than a lightly loaded network. This loss in responsiveness comes at time when the network is loaded and needs to be more responsive. Portions of the network that are physically more distant from the central processor will also see slower response to state changes in the network. Thus, concentrating a routing function at a single entity limits the responsiveness of that function.

With a decentralized routing system, multiple peer entities perform routing functions. If the functionality is replicated, each entity (switch) independently provides routing functionality without exchanging it's state information with it`s peers. If the functionality is distributed, peer switches share state information to cooperatively provide routing functionality.

While decentralization requires higher switching costs and complicated network management schemes, it has numerous advantages. Replicating functionality at multiple entities increases the reliability of the system. It also makes the network responsive to local state changes that a centralized controller may not have detected. Since the computational load is spread over the entire network the routing functionality will be less affected by overloading the network. A distributed implementation will also be more scaleable because adding switches also adds computational units to the network.

Time-Dependent vs. Adaptive

Another important distinction in dynamic routing systems is the nature of the state information used to make routing decisions. In time-dependent routing implementations, the choice of routes taken is a factor of the time of day when the request for traffic takes place. This approach relies on accurate predictions of network traffic as a function of time to avoid congested links or switches in a network. There is strong correlation between network traffic and the time of day as shown in [6]. By taking advantage of time zone changes and regional usage measurements, AT&T has significantly reduced busy-hour blocking without extending network bandwidth. Time-dependent routing is not considered adaptive because routing alternatives remain fixed during a constant time period, such as one hour.

Adaptive routing is driven purely by the current state information available for the network. The term adaptive refers to the networks ability to adapt to conditions and find a route optimal to the present conditions. Unlike time-dependent routing, alternative routes are evaluated on a real-time basis and as such have no dependence on the time of day. Adaptive routing is computation intensive at the switches since the state information must be examined frequently but it is also more responsive to local network conditions.

Examples of Dynamic Routing

Perhaps the best way to make a comparison of Dynamic Routing strategies is through examples of existing methods that embody different characteristics of Dynamic Routing.

Table of Dynamic Routing Classifications

Table 1 breaks the Dynamic Routing alternatives into four groups according to the location of the computational resources in the routing system and the type of information used to make routing decisions. Three examples of Dynamic Routing Systems will be examined that embody one class of Dynamic Routing as shown in Table 1.

The fourth class, Decentralized Time-Dependent Routing would be cost prohibitive since each switch would need the computational power to calculate time-dependent routes for the entire network. Such a system has not been implemented and as such will not be considered in this report.

The routing algorithms presented for each example are presented for a fully-connected network. Fully connected networks are true non-hierarchical networks in which every node has a direct connection to every other node in the network. In reality, most "non-hierarchical" networks do not satisfy this criteria but routing algorithms handle this case as if the missing link were temporarily out of service.

Commonalties Among Dynamic Routing Strategies

Up to this point I've been addressing how routing strategies differ. However, there are some essential features of Dynamic Routing systems that are common to the examples covered here.

First, traffic for node pair (i,j) is always carried by the direct link between i and j unless that link lacks the necessary bandwidth or the link or is out of service. This is not only the obvious first choice for a route but also the "best" route in all cases since using a two-link route consumes twice as much overall network bandwidth.

Second, all of the examples covered here limit the alternate routes to two-link paths from the source to the destination. Considering paths of more than two links would not only be computationally prohibitive but it would also be fruitless since all three-link paths would have to use one of the links that were already found to be congested. The two-link restriction lets the switches operate at higher speeds with less memory than maintaining tables of all possible routes through the network.

Third, links are assumed to reserve a portion of their bandwidth that can not be accessed by traffic following an alternative route. This gives nodes wanting a direct link priority when the link is congested since requests to use the link as an alternative path are refused if bandwidth is scarce on that link. This concept is called Trunk Reservation, where a parameter rA is the number of circuits that must be free on link A if that link is to be used as an alternative path. Overflow calls will be accepted on a link only if r circuits are available while single-link calls are accepted until the link is full. The Trunk Reservation (TR) parameter can be motivated by noting that a call routed through two links could potentially block two direct calls wishing to use the individual links. In this case it would be advantageous to the network if the original call had been blocked or routed through a different alternative path.

At certain critical traffic levels a network with no trunk reservation controls can exhibit unstable behavior. [16] This behavior was first predicted by simple fixed-point analytical models and later verified through simulation. The problem can be most simply observed by looking at the effect that one overloaded link could have on a network without TR parameters. Consider link (i,j) which becomes full and begins routing overflow traffic to it's first alternative path (i,k,j). Without a TR parameter for this alternate path, overflow traffic from the (i,j) link could completely block both the (i,k) link and the (k,j) link. New traffic wishing to use the direct links (i,k) or (k,j) would then be forced to take their first alternate routes, further congesting these links in the network. The net effect of this scenario is that one overloaded link in the network could cause the network to become completely blocked to new traffic while carrying only 50% of it's potential calls.

Fortunately, simulations have shown that reserving a small number of circuits on each link for direct traffic can prevent this undesirable behavior. If slightly higher TR parameter values are set, only the first link of an alternate path needs to restrict circuits to overflow traffic.

Dynamic Non-Hierarchical Routing

Dynamic Non-Hierarchical Routing (DNHR) has resulted in the improvement of network efficiency and reduction of network costs since introduced in 1987 as AT&T replacement of fixed hierarchical routing.

The DNHR system is a centralized routing system utilizing Common Channel Signaling (CCS) to collect and distribute routing information from the nodes throughout the network. The CCS system is actually a packet-switched network developed by AT&T in the 70's to carry signaling information separate from voice traffic. Each node in the circuit-switched network has a dedicated 2400-bps CCS data link to exchange signaling and routing information.

Each node in a DNHR network maintains a table of two-link alternate routes in the event that the direct link between source and destination is unavailable. There are up to 14 alternate routes for any link in the network. The alternate paths are tried in order of their position in the routing table until a path can be found that won't violate the TR parameter on either of the links in that path. The switch can only verify the trunk reservation of the first link of each path since it has no state information for other links in the network. If a call from node i to node j is routed to alternate path (i,k,j) and the second link (k,j) is found to be operating beyond it's TR point, the network will perform crankback for that call.

Crankback is handled by the central facility which receives a message from the second node k in the path indicating a call could not be completed via that path. The central facility then relays the failure to complete on to the source node i which will then try to route the call through the next path in the routing table. If all paths in the routing table have been tried unsuccessfully, the call is considered block and dropped from the network.

The central computing facility is responsible for updating the switches with a new routing table every hour. The order of alternate routes in the routing table is set to distribute overflow traffic through the least busy nodes in the network based on predicted traffic flow for that hour. Obviously, the success of this method of routing depends on accurate predictions of regional network traffic. For small networks, traffic prediction is relatively straightforward using the simplex algorithm to characterize past measurements. [6]

However, in even a moderate sized communications network the simplex algorithm would require hundreds of hours to complete on standard computing hardware. A heuristic approach to this problem termed the unified algorithm was developed by Bell Laboratories.[3] The unified algorithm is a simplified yet accurate and efficient technique to design the DNHR routing tables in a fraction of the time required for the simplex calculation. In fact, the unified algorithm can analyse and route the full scale network in about one hour.

Network design is still a complex and critical task in order to fully realize the benefits of this Dynamic Routing technique. DNHR has no provisions for unexpected network behavior so the network must be designed to handle traffic well beyond the busy-hour peaks or substantial blocking could occur in unpredictable situations.

Dynamic Adaptive Routing

Dynamic Alternative Routing (DAR) is an adaptive call-routing strategy that stochastically selects an alternative route using local information about the loading of outgoing trunks to determine the feasibility of selected routes. DAR has been implemented in the British Telecom public switched-trunk network.

The DAR algorithm for a fully connected N-node network is as follows. Each link (i,j) has a capacity Cij and assigns a trunk reservation parameter r based on a percentage of it total capacity. Every source-destination pair (i,j) in the network maintains a tandem node k, which is the node in common to the two links of that pair's first alternative path. The tandem node is selected at random from all nodes providing a two-link path from the source to the destination.

Fresh-offered traffic between nodes i and j is first offered to the direct link and is always accepted by this link if a circuit is available. Otherwise, the call attempts the two-link alternative specified by it's tandem node k. If the call can not be routed via node k without violating the TR parameters that call is lost and the pair (i,j) selects a new tandem node at random from those available. If the call is successfully delivered through the direct link or through the tandem node that pair retains k as the tandem node.

One of the desirable properties of DAR is it's speed of response. The reselection of a tandem node only occurs after a call fails so per-call processing is low. Since only one alternate path is retained for each link, the routing decision is fast and predictable.

DAR locks on to a good alternate route, and once a route fails, another route is immediate sought. This can be thought of as a learning scheme wherein the probabilities of choosing a particular overflow path are 1 if the path was just successful, 0 if the path just failed, and 1/(N-2) for all other alternate paths.

Over a sufficiently long period, each DAR alternative path will have been selected an equal number of times, since the tandem nodes are selected at random. Since one call is lost each time a new tandem is selected an equal number of calls will be lost on each overflow path. Thus, if pt denotes the proportion of calls offered to tandem t and bt denotes the probability that a call is blocked through t, then the blocking rate ptbt is equal for all tandems t. Hence,

which says that the proportion of overflowing calls routed through any given tandem t is inversely related to the blocking probability on the route. This is the essence by which DAR spreads traffic evenly throughout the network.

DAR can be easily modeled and several variations of DAR have been proposed based on the results of these simulations. In general, the variations of DAR use a method other than pure random to select new tandems from those available. These variations have been shown to yield better performance than classic DAR but they reduce the simplicity of the algorithm.

Real-Time Network Routing

Real-Time Network Routing (RTNR) is adaptive routing method that replaced DNHR in the AT&T network starting in 1991. RTNR is a move away from centralized routing for AT&T but the CCS system continues to play an important role in the implementation of RTNR.

RTNR has several features that are beyond the scope of this survey but bear mentioning. It has been designed to be adaptive over multiple classes of service which makes it easily extendible to carry B-ISDN traffic. RTNR also provided ingress/egress routing service to facilitate integration of Dynamic Routing into global networks. These additional features make RTNR the technology for the foreseeable future at AT&T.

Like the other methods we've seen, RTNR first tries the direct route and the call is accepted on that route if a circuit is available. If the direct trunk is not available, the originating switch communicates with the terminating switch through the central facilities and the CCS network. The terminating switch sends the busy-idle status of all links terminating at that switch which the sending switch then compares with the status of it's own links to find the Least-Loaded Route (LLR) to the destination.

An alternate path through two links, A and B, with TR parameters rA and rB, is considered least-loaded if it has the lowest value loadA,B where

load(i,j) = min { ( r(A)-load(A) ), ( r(B)-load(B) )}

This quantity is often referred to as the TR permissibility of a path since it is a reflection of the bandwidth available in addition to the reservation parameters. A negative TR permissibility would indicate that an alternate route is not available while a large TR permissibility indicates an underutilized path. This is a computationally intensive routing decision to find the "best" route when any available route can carry the call but LLR has been shown to have better performance than either DNHR or DAR. The DAR algorithm may not prove extensible to multiple classes of service while AT&T's RTNR network is currently operating in this mode.

The Future of Dynamic Routing

There have been many studies into the relative performance merits and demerits of each approach to Dynamic Routing.[1,2,8,15] The common arguments heard for and against each approach are listed below.

Centralized vs. Distributed

The dependence of centralized computing facility has caused massive routing failures under adverse network conditions while distributed algorithms have proven robust to multiple failures of nodes and links in the network. For single class of service networks distributed routing algorithms have been shown to provide better performance at a lower cost than centralized systems of similar size. The simplicity of distributed routing schemes such as DAR make it superior in cost and performance to DNHR.

The addition of multiple classes of service may prove too complex for localized routing decisions in a switch. RTNR has been demonstrated as a robust routing scheme that takes advantage of overall knowledge of network conditions to route multiple classes of service.

Time-Dependent vs. Adaptive

As communications networks transition to provide a wider range of services, the range and complexity of traffic variations will also widen. Time-dependent routing alone can not provide the necessary routing efficiency to maintain this level of service in a cost-effective manner.

Several hybrid routing schemes, such a State- and Time-dependent Routing (STR) [11] have been proposed to address this problem but their contribution will most likely be only a transition to purely adaptive strategies such as DAR, and RTNR.

Conclusions

I have described three seperate Dynamic Routing schemes, each encompassing a different attitude towards the problem of routing in a constantly changing network. By extracting the commonalties of the routing problem the essential aspects of Dynamic Routing become apparent - Trunk Reservation, taking the direct route whenever possible, and limiting alternative routes to two-links. These are the rules of Dynamic Routing that can not be ignored by a successful routing scheme. The examples presented here, DNHR, DAR and RTNR have applied the available technology to each provide a different solution to the problem of Dynamic Routing but they all follow the essential rules.

The mapping of technology to improved service is vital if communications networks are to continue providing satisfactory service to users in light of new demand for services. It has become apparent that demand will always outpace the technology to provide the service and Dynamic Routing will be a key component in meeting increased demands in future communications networks.

List of References

Back to the Report Title Page


Last modified: December 11, 1995



trangmoe@ece.arizona.edu

Department of Electrical and Computer Engineering

The University of Arizona

Tucson, AZ  85721

A Comparative Study of Dynamic Routing in Circuit-Switched Networks

by Greg Trangmoe

Introduction

Why Dynamic Routing?

Routing Principles

Fixed Routing

Dynamic Routing

Dynamic Routing Strategies

Centralized vs. Distributed

Time-Dependent vs. Adaptive

Examples of Dynamic Routing

Commonalties Among Dynamic Routing Strategies

Dynamic Non-Hierarchical Routing

Dynamic Adaptive Routing

Real-Time Network Routing

The Future of Dynamic Routing

Centralized vs. Distributed

Time-Dependent vs. Adaptive

Conclusions

List of References Back to the Report Title Page

List of References

Back to the Report Title Page