TY - JOUR

T1 - A fault tolerant routing scheme for hypercubes

AU - Day, Khaled

AU - Harous, Saad

AU - Al-Ayyoub, Abdel Elah

PY - 2000

Y1 - 2000

AB - An efficient distributed fault-tolerant routing algorithm for the hypercube is proposed, based on the existence of a complete set of node-disjoint paths between any two nodes. Node failures and repairs may occur dynamically, provided that the total number of faulty nodes at any time is less than the node-connectivity n of the n-cube. Each node records, for each possible destination, which of the associated node-disjoint paths to use. When a message is blocked by a node failure, the source node is warned and requested to switch to a different node-disjoint path. The methods used to identify the paths, to propagate node failure information to source nodes, and to switch from one routing path to another incur little communication and computation overhead. We show that if the faults occur reasonably far apart in time, then all messages will be routed on optimal or near-optimal paths. In the unlikely case where many faults occur in a short period, the algorithm still delivers all messages, but possibly via longer paths. An extension of the algorithm to handle link failures in addition to node failures is discussed. We also show how to adapt the algorithm to k-ary n-cube networks. The algorithm can be similarly adapted to any interconnection network for which there exists a simple characterization of node-disjoint paths between its nodes.

UR - http://www.scopus.com/inward/record.url?scp=26444502158&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=26444502158&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:26444502158

VL - 13

SP - 29

EP - 44

JO - Telecommunication Systems

JF - Telecommunication Systems

SN - 1018-4864

IS - 1

ER -