The reader is assumed to have fundamental knowledge of communications networks. The routing problem is described in some detail and would be sufficient for application to the content of this report.
The goal of routing in a communications network is to direct user traffic from a source to the correct destination in accordance with the network’s service requirements. The service requirements for a given network are often expressed as a set of objectives. Objectives can include maximizing network performance (e.g., delay and throughput) and minimizing costs (e.g., equipment and facilities). The underlying technology of the network imposes constraints on the network objectives. Constraints arise from the limitations of the switching technology, the volume of user and network traffic, and the services requested by the network. It is the multi-objective, multi-constraint nature of routing that makes it such a complex problem in communications networks.
Although many routing solutions have been employed, all communications networks share a core of basic routing functionality. [16]
At one extreme are static routing systems whose routing remains fixed, independent of the current state of the users and the network. Static routing is based on expected rather than actual network state. Static routing involves virtually no real-time activities other than traffic forwarding and thus requires almost no computational resources within the network itself.
Why Dynamic Routing?
There has been a steady move in research and industry toward Dynamic Routing solutions. A brief description of the history and evolution of routing is presented to motivate current thinking in network routing.
Routing Principles
Routing systems vary widely in the types of state changes to which they respond and the speed of their responses.
State information for an active network may include available services, available resources, and abnormal conditions in the network. State information may comprise both measured and predicted values obtained from the network and from external sources.
Feasible routes are those which satisfy all the user- and network-imposed service constraints. Optimal routes are feasible routes that are “best” with respect to the network objectives. It is often too computationally intensive to find the most optimal route for every call but heuristic approaches have been shown to closely approximate optimal routing for a fraction of the computational costs.
Once a route has been selected traffic can be forwarded according to one of two forwarding paradigms: Connection-oriented or Connection-less forwarding. Connection-oriented forwarding requires forwarding directives to be set up prior to using the route to carry user traffic. This approach is often compared to placing a phone call where both participants in the conversation must pick up the phone in order for communication to take place. Fixed Routing
The secondary reason for fixed routing was that the telephone companies were reluctant to relinquish a large portion of network control to the network itself, because of the economic consequences if the network were to make incorrect routing decisions. The rate and unpredictability of changes in modern communications networks makes it a necessity to use dynamic routing within the network to improve performance.Safeguards were built into the hierarchical network to provide alternate routes in the event of node or link failure. Figure 1 shows these redundant links as dashed lines between different levels of the hierarchy. The redundant links provide alternate routes for traffic but the network is still considered a static network since routing patterns can not be changed according to the time of day or current network traffic.
The telephone companies have come to recognize that dynamic routing is profitable because it reduces both network trunking and operating costs along with call blocking under all conditions. The integration of non-hierarchical routing into the AT&T switched network was completed in 1987.
The basic routing functions can in either centralized or decentralized form. The degree of decentralization depends upon the desired dynamism, robustness and manageability of the routing system.
In a centralized implementation, a single entity performs the routing computations. Centralized procedures are easy to manage because the functionality resides at a single entity. Costs can be reduced by concentrating the computational engine at a central location.
There are also drawbacks to centralized implementations. When a central entity fails or is otherwise isolated from the network it affects the performance of the entire network. This fact was evidenced in 1990 when the failure of AT&T's routing computer prevented completion of many calls during a 9-hour period. [7] Also, a routing function's responsiveness to state changes depends on the state of the central processing facility. A highly loaded network will have slower response to state changes than a lightly loaded network. This loss in responsiveness comes at time when the network is loaded and needs to be more responsive. Portions of the network that are physically more distant from the central processor will also see slower response to state changes in the network. Thus, concentrating a routing function at a single entity limits the responsiveness of that function.
With a decentralized routing system, multiple peer entities perform routing functions. If the functionality is replicated, each entity (switch) independently provides routing functionality without exchanging it's state information with it`s peers. If the functionality is distributed, peer switches share state information to cooperatively provide routing functionality.
While decentralization requires higher switching costs and complicated network management schemes, it has numerous advantages. Replicating functionality at multiple entities increases the reliability of the system. It also makes the network responsive to local state changes that a centralized controller may not have detected. Since the computational load is spread over the entire network the routing functionality will be less affected by overloading the network. A distributed implementation will also be more scaleable because adding switches also adds computational units to the network.
Dynamic Routing Strategies
Dynamic Routing strategies are typically classified along two lines:
Centralized vs. Distributed
Time-Dependent vs. Adaptive
Another important distinction in dynamic routing systems is the nature of the state information used to make routing decisions. In time-dependent routing implementations, the choice of routes taken is a factor of the time of day when the request for traffic takes place. This approach relies on accurate predictions of network traffic as a function of time to avoid congested links or switches in a network. There is strong correlation between network traffic and the time of day as shown in [6]. By taking advantage of time zone changes and regional usage measurements, AT&T has significantly reduced busy-hour blocking without extending network bandwidth. Time-dependent routing is not considered adaptive because routing alternatives remain fixed during a constant time period, such as one hour.
Adaptive routing is driven purely by the current state information available for the network. The term adaptive refers to the networks ability to adapt to conditions and find a route optimal to the present conditions. Unlike time-dependent routing, alternative routes are evaluated on a real-time basis and as such have no dependence on the time of day. Adaptive routing is computation intensive at the switches since the state information must be examined frequently but it is also more responsive to local network conditions.
Examples of Dynamic Routing
Perhaps the best way to make a comparison of Dynamic Routing strategies is through examples of existing methods that embody different characteristics of Dynamic Routing.
The fourth class, Decentralized Time-Dependent Routing would be cost prohibitive since each switch would need the computational power to calculate time-dependent routes for the entire network. Such a system has not been implemented and as such will not be considered in this report.
The routing algorithms presented for each example are presented for a fully-connected network. Fully connected networks are true non-hierarchical networks in which every node has a direct connection to every other node in the network. In reality, most "non-hierarchical" networks do not satisfy this criteria but routing algorithms handle this case as if the missing link were temporarily out of service.
At certain critical traffic levels a network with no trunk reservation controls can exhibit unstable behavior. [16] This behavior was first predicted by simple fixed-point analytical models and later verified through simulation. The problem can be most simply observed by looking at the effect that one overloaded link could have on a network without TR parameters. Consider link (i,j) which becomes full and begins routing overflow traffic to it's first alternative path (i,k,j). Without a TR parameter for this alternate path, overflow traffic from the (i,j) link could completely block both the (i,k) link and the (k,j) link. New traffic wishing to use the direct links (i,k) or (k,j) would then be forced to take their first alternate routes, further congesting these links in the network. The net effect of this scenario is that one overloaded link in the network could cause the network to become completely blocked to new traffic while carrying only 50% of it's potential calls.
Fortunately, simulations have shown that reserving a small number of circuits on each link for direct traffic can prevent this undesirable behavior. If slightly higher TR parameter values are set, only the first link of an alternate path needs to restrict circuits to overflow traffic.
The DNHR system is a centralized routing system utilizing Common Channel Signaling (CCS) to collect and distribute routing information from the nodes throughout the network. The CCS system is actually a packet-switched network developed by AT&T in the 70's to carry signaling information separate from voice traffic. Each node in the circuit-switched network has a dedicated 2400-bps CCS data link to exchange signaling and routing information.
Each node in a DNHR network maintains a table of two-link alternate routes in the event that the direct link between source and destination is unavailable. There are up to 14 alternate routes for any link in the network. The alternate paths are tried in order of their position in the routing table until a path can be found that won't violate the TR parameter on either of the links in that path. The switch can only verify the trunk reservation of the first link of each path since it has no state information for other links in the network. If a call from node i to node j is routed to alternate path (i,k,j) and the second link (k,j) is found to be operating beyond it's TR point, the network will perform crankback for that call.
Crankback is handled by the central facility which receives a message from the second node k in the path indicating a call could not be completed via that path. The central facility then relays the failure to complete on to the source node i which will then try to route the call through the next path in the routing table. If all paths in the routing table have been tried unsuccessfully, the call is considered block and dropped from the network.
The central computing facility is responsible for updating the switches with a new routing table every hour. The order of alternate routes in the routing table is set to distribute overflow traffic through the least busy nodes in the network based on predicted traffic flow for that hour. Obviously, the success of this method of routing depends on accurate predictions of regional network traffic. For small networks, traffic prediction is relatively straightforward using the simplex algorithm to characterize past measurements. [6]
However, in even a moderate sized communications network the simplex algorithm would require hundreds of hours to complete on standard computing hardware. A heuristic approach to this problem termed the unified algorithm was developed by Bell Laboratories.[3] The unified algorithm is a simplified yet accurate and efficient technique to design the DNHR routing tables in a fraction of the time required for the simplex calculation. In fact, the unified algorithm can analyse and route the full scale network in about one hour.
Network design is still a complex and critical task in order to fully realize the benefits of this Dynamic Routing technique. DNHR has no provisions for unexpected network behavior so the network must be designed to handle traffic well beyond the busy-hour peaks or substantial blocking could occur in unpredictable situations.
The DAR algorithm for a fully connected N-node network is as follows. Each link (i,j) has a capacity Cij and assigns a trunk reservation parameter r based on a percentage of it total capacity. Every source-destination pair (i,j) in the network maintains a tandem node k, which is the node in common to the two links of that pair's first alternative path. The tandem node is selected at random from all nodes providing a two-link path from the source to the destination.
Fresh-offered traffic between nodes i and j is first offered to the direct link and is always accepted by this link if a circuit is available. Otherwise, the call attempts the two-link alternative specified by it's tandem node k. If the call can not be routed via node k without violating the TR parameters that call is lost and the pair (i,j) selects a new tandem node at random from those available. If the call is successfully delivered through the direct link or through the tandem node that pair retains k as the tandem node.
One of the desirable properties of DAR is it's speed of response. The reselection of a tandem node only occurs after a call fails so per-call processing is low. Since only one alternate path is retained for each link, the routing decision is fast and predictable.
DAR locks on to a good alternate route, and once a route fails, another route is immediate sought. This can be thought of as a learning scheme wherein the probabilities of choosing a particular overflow path are 1 if the path was just successful, 0 if the path just failed, and 1/(N-2) for all other alternate paths.
Over a sufficiently long period, each DAR alternative path will have been selected an equal number of times, since the tandem nodes are selected at random. Since one call is lost each time a new tandem is selected an equal number of calls will be lost on each overflow path. Thus, if pt denotes the proportion of calls offered to tandem t and bt denotes the probability that a call is blocked through t, then the blocking rate ptbt is equal for all tandems t. Hence,
DAR can be easily modeled and several variations of DAR have been proposed based on the results of these simulations. In general, the variations of DAR use a method other than pure random to select new tandems from those available. These variations have been shown to yield better performance than classic DAR but they reduce the simplicity of the algorithm.
RTNR has several features that are beyond the scope of this survey but bear mentioning. It has been designed to be adaptive over multiple classes of service which makes it easily extendible to carry B-ISDN traffic. RTNR also provided ingress/egress routing service to facilitate integration of Dynamic Routing into global networks. These additional features make RTNR the technology for the foreseeable future at AT&T.
Like the other methods we've seen, RTNR first tries the direct route and the call is accepted on that route if a circuit is available. If the direct trunk is not available, the originating switch communicates with the terminating switch through the central facilities and the CCS network. The terminating switch sends the busy-idle status of all links terminating at that switch which the sending switch then compares with the status of it's own links to find the Least-Loaded Route (LLR) to the destination.
An alternate path through two links, A and B, with TR parameters rA and rB, is considered least-loaded if it has the lowest value loadA,B where
The Future of Dynamic Routing
There have been many studies into the relative performance merits and demerits of each approach to Dynamic Routing.[1,2,8,15] The common arguments heard for and against each approach are listed below.
The addition of multiple classes of service may prove too complex for localized routing decisions in a switch. RTNR has been demonstrated as a robust routing scheme that takes advantage of overall knowledge of network conditions to route multiple classes of service.
Several hybrid routing schemes, such a State- and Time-dependent Routing (STR) [11] have been proposed to address this problem but their contribution will most likely be only a transition to purely adaptive strategies such as DAR, and RTNR.
The mapping of technology to improved service is vital if communications networks are to continue providing satisfactory service to users in light of new demand for services. It has become apparent that demand will always outpace the technology to provide the service and Dynamic Routing will be a key component in meeting increased demands in future communications networks.
Conclusions
I have described three seperate Dynamic Routing schemes, each encompassing a different attitude towards the problem of routing in a constantly changing network. By extracting the commonalties of the routing problem the essential aspects of Dynamic Routing become apparent - Trunk Reservation, taking the direct route whenever possible, and limiting alternative routes to two-links. These are the rules of Dynamic Routing that can not be ignored by a successful routing scheme. The examples presented here, DNHR, DAR and RTNR have applied the available technology to each provide a different solution to the problem of Dynamic Routing but they all follow the essential rules.
List of References
Back to the Report Title Page
Last modified: December 11, 1995
trangmoe@ece.arizona.edu
Department of Electrical and Computer Engineering
The University of Arizona
Tucson, AZ 85721