Artigo Revisado por pares

Design of an extended 2D mesh network‐on‐chip and development of A fault‐tolerantrouting method

2019; Institution of Engineering and Technology; Volume: 13; Issue: 3 Linguagem: Inglês

10.1049/iet-cdt.2018.5032

ISSN

1751-861X

Autores

Yota Kurokawa, Masaru Fukushi,

Tópico(s)

Parallel Computing and Optimization Techniques

Resumo

IET Computers & Digital TechniquesVolume 13, Issue 3 p. 224-232 ArticleFree Access Design of an extended 2D mesh network-on-chip and development of A fault-tolerantrouting method Yota Kurokawa, Yota Kurokawa Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi, 755-8611 JapanSearch for more papers by this authorMasaru Fukushi, Corresponding Author Masaru Fukushi mfukushi@yamaguchi-u.ac.jp Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi, 755-8611 JapanSearch for more papers by this author Yota Kurokawa, Yota Kurokawa Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi, 755-8611 JapanSearch for more papers by this authorMasaru Fukushi, Corresponding Author Masaru Fukushi mfukushi@yamaguchi-u.ac.jp Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi, 755-8611 JapanSearch for more papers by this author First published: 11 April 2019 https://doi.org/10.1049/iet-cdt.2018.5032Citations: 2AboutSectionsPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat Abstract This paper proposes an extended two-dimensional mesh Network-on-Chip architecture for region-based fault tolerant routing methods. The proposed architecture has an additional track of links and switches at the four sides of a mesh network so that it can partially reconfigure the network around faulty regions to provide new detour paths. This allows to simplify the complex routing rules of the existing fault-tolerant routing methods and avoid long detour routing paths. Modified routing method is also proposed for the new architecture and the deadlock freeness is proved. Simulation results show that the proposed architecture with the modified routing method reduces the average communication latency by about 39% compared to the existing state-of-the-art method at the expense of low hardware overhead. 1 Introduction Network-on-Chip (NoC) has emerged as a promising paradigm for the design of large scale System-on-Chip, where hundreds or thousands of circuit modules (i.e. nodes) are implemented into a single VLSI chip. In NoC, nodes are connected by an interconnection network through routers, and data communications between nodes are done by transmitting packets on the network. Using global interconnection structure reduces the difficulty of wiring design and the latency of signal transmission in comparison to point-to-point signal wires or shared buses [[1]]. The most popular network topology is two-dimensional (2D) mesh which offers a simple and regular structure and a small wire length between adjacent nodes. Hence; 2D mesh NoC is appealing for wide range of applications such as robotics, multimedia, networking, and cloud, as well as scientific parallel computing. An important and fundamental issue that must be addressed for NoCs is fault tolerance. Defects during fabrication and faults in system runtime are inevitable. Due to the defects or faults, regular 2D mesh networks become irregular ones, which makes packet routing complex. A deadlock occurs when some packets cannot proceed toward their destinations forever because of the blocking by each other. Therefore, in NoC, packet routing must ensure deadlock freeness, while providing fault-tolerance capability. There have been many fault-tolerant routing methods proposed for the 2D mesh NoCs, e.g. [[2]-[10]]. They can be classified into four categories in terms of the mechanism to prevent deadlocks; virtual channel (VC)-based methods, table-based methods, buffer-less methods, and region-based methods. The VC-based methods adopt several VCs to multiplex networks. Janfaza et al. [[11]] proposed an adaptive routing method which employs timeout and packet reinjection. Information of intermediate nodes is recorded to prevent livelock. Sinha et al. [[12]] proposed a fault-tolerant routing method based on the common XY routing method. This method does not guarantee 100% packet reachability when failure rate increases. Those methods provide adaptiveness for path selection using VCs; however, adding VCs involves adding buffer space and a complex control circuit to the routers. It has been reported [[13]] that routers with VCs require 2 to 3 times as many circuits as those without VCs. It also makes the routers prone to be faulty. The table-based methods adopt a routing table in each router to route packets to the destination. The routing table contains status of the network and/or failure information. Hsin et al. [[14]] proposed a method which employs ant colony optimization. This method does not guarantee 100% packet reachability. Liu et al. [[15]] proposed a method which introduces coarse and fine-grained look-ahead schemes to obtain the information of other routers in the range of 4 hops. Zhao et al. [[16]] proposed a method to provide minimal paths using the information of whole networks. The drawback of this method is high calculation cost for the minimal paths. Mota et al. [[17]] proposed a method to reduce table size by the concept of regions. In general, the table-based methods offer high flexibility in faults/congestion avoidance; however, they require more circuit space to implement a table in the routers and complex calculations to create/update the table. The buffer-less methods [[18]] eliminate the input/output buffers to reduce the size of routers. Instead, they use a cache memory in the core to temporally store packets. Basically, they always route a packet to one of the output ports without keeping the packet waiting (i.e. hot-potato routing). In this method, packets will be discarded using the control information stored in the packets and retransmitted to avoid deadlock/livelock. Yao et al. [[19]] proposed a method which considers the QoS of the buffer-less method. This method gives a priority for each packet which is used in the flow control in the routers. The main drawback of buffer-less methods compared with the other methods is high communication latency due to the retransmission. The region-based methods forms a detour path called Fault Ring (FR) around a cluster of faulty nodes (i.e. faulty region). By defining the moving direction of packets for each FR, deadlock- and livelock-freeness are guaranteed. Chen et al. [[13]] proposed a region-based method called Message Route, and Holsmark et al. [[20]] corrected fraws of Message Route. Fukushima et al. [[21]] proposed a method called Position Route, which enables to reduce the size of clusters of faulty nodes. The region-based methods guarantee 100% packet reachability for arbitrary fault distribution without using VCs and tables, thus require less circuit area than the VC-based and table-based methods. However, there exist several types of FRs depending on their location and they require specific routing rules to detour the faulty regions. This makes routing rules very complex and leads to an increase in the circuit area and the delay of routing process. To overcome the drawback of the region-based methods, we propose a new NoC architecture which has an additional track of links and switches at each side (i.e. boundary) of an original 2D mesh. The basic idea of the proposed architecture is to utilize the different functions of packet routing and partial network reconfiguration. The additional track is used to partially reconfigure the 2D mesh network to provide an alternative detour path for each FR touched on the boundary of the 2D mesh NoC. In addition, we propose a routing method suitable for the proposed architecture by improving the existing region-based fault-tolerant routing method [[21]]. The remainder of this paper is organized as follows. Section 2 presents the NoC architecture and existing region-based fault-tolerant routing methods. Section 3 describes the proposed architecture and the routing method. Section 4 reveals the area overhead of the proposed architecture and the routing performance with the proposed method. Section 5 provides some brief concluding remarks. 2 NoC and fault-tolerant routing 2.1 2D mesh NoC We focus on a 2D mesh NoC which has m and n homogeneous nodes in row and column directions, respectively. Each node consists of a processor core and a router circuit, and has a unique address , where and . In the 2D mesh NoCs, routers are connected with at most four neighbors (i.e. north, south, east, and west) via two unidirectional links to configure the mesh topology. In this paper, a common assumption is made for faults; permanent faults are considered to be associated only with nodes, and links are fault-free. This does not lose generality, because a single faulty link can be treated as two faulty nodes connected on it. 2.2 Region-based fault-tolerant routing Basically, in the typical wormhole routing, a packet is divided into several flits (i.e. flow control digits) and the flits are routed in a pipeline fashion. A header flit, which contains the address of destination node, leads the packet through the network. Region-based routing is an approach of fault-tolerant routing, where faulty regions are defined in advance and detoured by predefined routing rules. Example methods include Message Route [[13], [20]] and Position Route [[21]]. Because this approach does not require any virtual channels to guarantee the deadlock freeness, routers can be implemented at a low area cost, thus suitable for NoC. We give some definitions to describe the routing behavior of Position Route [[21]]. Deactivated node is defined as a fault-free node which has at least one faulty or deactivated neighbor node in both row and column. Fault Block (FB) is a minimum rectangular region which covers more than one faulty/deactivated nodes. Fault Ring (FR) is a set of fault-free nodes that are located around a FB. The north-east node of each FR is called Reference node (Rn). FB is classified into three types depending on the location, i.e. s-chain, f-chain, and f-ring. If a FR touches only south edge of NoC, it is called s-chain. If a FR is on the west edge of NoC, it is called f-chain. If a FR is neither s-chain nor f-chain, it is called f-ring. If a FR is on the north and/or east edge of the NoC, it is treated as f-ring. The examples of these three FRs are illustrated in Fig. 1. Fig. 1Open in figure viewer Types of FR and routing examples In the region-based fault-tolerant routing methods, the concept of message is introduced to guarantee the deadlock-freeness. The message is given for each packet and it has three types, Row First (RF), Column First (CF), and Row Only (RO). Each message represents the routing phase for the packets as follows; RF: westward movement CF: northward or southward movement RO: eastward movement For ease of explanation, in this paper, CF message is denoted as follows; CF-SN: northward movement (from South to North) CF-NS: southward movement (from North to South) The message is stored in every packets and updated at each node based on the addresses of source , destination , and current nodes by the following rules; Initially, if is to the west of , the massage is set to RF at . If is not to the west but to the north or south of , it is set to CF. Otherwise, it is set to RO. At every intermediate nodes s between and , the message is updated by either one of the following rules; (1) If and is on the same column, RF is updated to CF, or (2) if and are on the same row, CF is updated to RO. Any other message updates are prohibited. Note that message is updated in the order of priority, i.e. RF, CF, and RO. In other words, once message becomes CF or RO, it never changes to RF or CF, respectively. The example of messages is also shown in Fig. 1. Fig. 1 also shows a simple routing example for each FR. In the case of f-ring (Fig. 1a), the packet starts from with RF message and detours the FB in a clockwise direction. Then, the message is updated to CF-SN at node [1, 1], and the packet proceeds toward . In the case of s-chain (Fig. 1b), the packet starts from with CF-NS message and proceeds to the south. Then, the message is updated to RO at node [0, 0], and the packet detours the FB in a clockwise direction. In the case of f-chain (Fig. 1c), the packet starts from with RF message and detours the FB in a counter-clockwise direction. Then, the message is updated to CF-SN at node [0, 2], and the packet proceeds toward . By using the message and defining the proposer routing rules for each FR, the region-based routing methods provide a deterministic path for any source-destination pair and guarantee the deadlock- and livelock-freeness (see [[13], [20], [21]] for details). 2.3 Problems of the existing routing methods We point out the problems of the existing routing methods. First, routing rules defined in the existing methods are complicated. For every FRs and messages, proper rules are defined to detour FBs so as not to cause deadlocks. Because those rules are implemented as a circuit module in each router, this will increase the circuit area and process delay of routing decision. Second, exceptional routing mode called Sp-Mode is used. In Position Route [[21]], all nodes surrounded by s-chain and f-chain are subject to be Sp-Mode. Fukushima et al. [[21]] proved that Sp-Mode is essential to avoid deadlocks. Fig. 2 shows an example of Sp-Mode and its routing. As shown in Fig. 2a, a complex deadlock occurs by four packets (from to for ) if Sp-Mode is not considered. On the other hand, as shown in Fig. 2b, packets with CF-NS message (i.e. the packet from to ) is routed to the north in the area of Sp-Mode to prevent the deadlock. Since routing in this mode needs additional information and control, routing rules are further complicated. Moreover, as shown in Fig. 2, routing paths in large S-chains and F-rings are likely to become longer to detour the FBs (e.g. packets from to and from to ), resulting in a large communication latency. Fig. 2Open in figure viewer Example of Sp-Mode Thus, the above two problems lead to the practical disadvantages for NoC. 3 Extended 2D mesh NoC 3.1 Proposed architecture To overcome the problems of the existing methods, we propose an extended 2D mesh NoC architecture. Fig. 3 shows the proposed architecture, where a track of links and switches is added at the four sides of 2D mesh. The possible switch states are also shown in the figure. By using additional new links and switches, it can partially reconfigure the network around FBs touched on the south and west boundary to provide new detour paths. Fig. 3Open in figure viewer Routing examples on the proposed architecture Fig. 4 shows routing examples on the proposed architecture. As shown in this figure, it is possible to take a clockwise detour path in the southern and western boundary. Therefore, even for f-chain and s-chain, the same routing rules as the f-ring can be applied. Furthermore, by eliminating the rules for f-chain and s-chain, there is no need for considering Sp-Mode. The proposed architecture has the effect of reducing the circuit area of all routers in the network. Due to the above reasons, the addition of relatively small hardware enables to overcome the two problems of the existing methods. Fig. 4Open in figure viewer Proposed NoC architecture and switch state Note that each additional switch is changed its state electrically. The state can be easily determined from the fault information of the two neighboring nodes, once the distribution of faulty nodes is fixed. Fig. 5 shows an example of the switch states on the south edge of the network. The switch states on the other edges can be determined similarly. Fig. 5Open in figure viewer Example of a switch state decision on the south edge 3.2 Routing method and deadlock freeness Fig. 6 shows the fault-tolerant routing method for the proposed extended 2D mesh NoC. Basically, the proposed method is based on the existing routing method [[21]]. However, since s-chain, f-chain, and Sp-Mode do not exist, routing rule is significantly simplified. Some modification is also done for ring selection method to guarantee correct routing for overlapped FRs. Fig. 6Open in figure viewer Proposed routing method We now prove the deadlock freeness of the proposed routing method. Lemma 1.The proposed routing method provides deadlock-free routing in the extended 2D mesh NoCs. Proof.Deadlock occurs when some packets wait on one another to move forward (i.e. circular waiting). Therefore, we show that circular waiting never occurs in both clockwise and counterclockwise directions in our routing method.First, we define router functions. EW is defined as the router function that a packet sent from the East neighbor is sent to the West neighbor. NE turn is defined as the function that a packet from the North neighbor is sent to the West neighbor changing the direction 90 degree on the router. Functions for other directions can be defined similarly.For a clockwise direction, we prove that packet transmission path including ES and SW turns never occurs on the same column. ES turn occurs when a packet with RO message encounters the west boundary of a FR. Suppose there exists a packet on the same column as the west boundary of the FR which must proceed southward below the FR. Due to the routing rule in Fig. 6, such packet is once forwarded west neighbor and then proceeds southward. Therefore, no packet proceeds southward along the west boundary over the FR and connects to SW turn.For a counter-clockwise direction, we prove that packet transmission path including EN and NW turns never occurs on the same column. EN turn occurs when the a packet with RO message or a packet with CF-SN message reaches the southeast corner of a FR. Suppose there exists a packet on the same column which must proceed northward over the Rn of the FR. In our routing rule, such packet moves a clockwise direction. Therefore, no packet proceeds northward along the east boundary over the FR and connects to SW turn.Thus, the proposed method is deadlock-free. □ 3.3 Problems of the extended 2D mesh NoC One problem we need to consider in the proposed method is longer paths possibly occurred on FBs on the south boundary of NoCs. Fig. 7 shows the routing example of the existing method. When a destination node exists above the s-chain, the packet detour the FB in a counter-clockwise direction. Fig. 8 shows the routing example of the proposed method for the same faulty node distribution. In the proposed method, FRs are unified to f-ring by the external links; hence, the packet must detour the FB in a clockwise direction by the routing rule shown in Fig. 6. This results in a longer path length and increase in the communication latency in the extended 2D mesh NoCs. Fig. 7Open in figure viewer Routing example of existing routing method Fig. 8Open in figure viewer Routing example of long path 3.4 Routing method based on the formation of partial detour paths and deadlock freeness To overcome the problem of the proposed method shown in Fig. 8, we propose a routing method based on the formation of partial detour paths. For that purpose, after creating FRs, a partial detour path is formed on the east neighbor of the FRs. The proposed method with the partial detour paths is summarized in Fig. 9. Fig. 9Open in figure viewer Proposed routing method with partial detour paths Fig. 10 shows the routing example of the proposed partial detour path for the same faulty node distribution as in Figs. 7 and 8. Thanks to the partial detour path, the packet can detour the FB in a counter-clockwise direction along the partial detour path. As the figure shows, this method allows to take a shorter path like in the s-chain in the existing method. Fig. 10Open in figure viewer Routing example of partial detour path We now prove the deadlock freeness of the proposed routing method. Lemma 2.The proposed routing method with partial detour paths provides deadlock-free routing in the extended 2D mesh NoCs. Proof.For a counter-clockwise direction, we prove that packet a transmission path including EN and NW turns never occurs on the same column. EN turn occurs when the a packet with RO message or a packet with CF-SN message reaches the southeast corner of a FR. In our routing rule, such packet moves a counter-clockwise direction. Therefore, no packet proceeds northward along the east boundary over the FR and connects to NW turn.Thus, the proposed routing method with partial detour paths is deadlock-free. □ 4 Performance evaluation 4.1 Area reduction effect Fig. 11Open in figure viewer Block diagram of a router To evaluate the effect of the proposed extended 2D mesh NoC in reducing the circuit of allrouters in the network, we designed two routers for the existing method [[21]] and the proposed method. Fig. 11 shows the block diagram of the router.Routers are responsible for forwarding packets to the destination nodesdetouring faulty nodes. In the common wormhole routing in NoCs, a packets isdivided into several flits (i.e. flow control digits). Each router is composedof five routing circuits, five input/output units, a crossbar switch, and aswitch allocator. An incoming flit is stored in a buffer in the input unitassociated with the incoming port, and an output port is decided based on therouting method implemented in the route circuit. Then, the crossbar switch isset up to connect the input and the output port by the switch allocator. In thisevaluation, the depth of each input/output buffer is set to be 8/1 flits for16bits flit. We designed the above routers for the existing and the proposed routing methods with Verilog HDL, and built a normal and an extended 2D mesh NoC of 10 10 nodes using them. Note that cores were not designed in both NoCs. We used Xilinx Vivado EDA tool for synthesizing the NoCs. The target FPGA device is Vertex 7 xc7vx485tffg1761-2. We verified the functionality of the circuits of the NoCs using simulator by injecting several test packet patterns. The router of the normal 2D mesh NoC needs 1794 Look Up Tables (LUTs) in the FPGA device, while that of the extended 2D mesh NoC needs 1174 LUTs, which indicates about 35% area reduction. This is the effect of eliminating the routing rules for s-chain, f-chain and Sp-Mode in the proposed extended 2D mesh NoCs. The designed normal 2D mesh NoC consumes 179,400 LUTs, while the extended 2D mesh NoC consumes 119,416 LUTs. The extended 2D mesh NoC with the proposed routing method reduces the circuit area by about 33% compared with the normal 2D mesh NoC with the existing routing method. Those two reduction rates imply that the circuit area for the additional links and switches is quite small, which will be explored in detail in the next section. The above result is interesting in that the reduction of area is achieved by adding extra circuits in the proposed approach. 4.2 Area overhead To evaluate the area overhead of the additional links and switches, we estimate the total area of a normal and the extended 2D mesh NoC. Fig. 12 shows the parameters used in the estimation. Let , , , and be the width of a core, a router, a switch, and a link, respectively, and be the length of a link between adjacent switches, and s be the length of a link between adjacent two modules (i.e. between two cores in the normal 2D mesh NoC, and between a core and a switch in the extended 2D mesh NoC). Fig. 12Open in figure viewer Area parameter The area required for a normal 2D mesh NoC of nodes without additional links and switches is represented by (1)In the extended architecture, the length of an inside link between two adjacent nodes is . Let w represent the width of a wire and also the space between two adjacent wires. For the links propagating 16 bits flit data in both directions, the link width is . The link length (vertical) and (horizontal) between the outer and inner switches are expressed by the following equations, respectively. (2) (3)Let and be the additional area for one node in the south and west sides, respectively. Examples of and are also shown in Fig. 12. Using and , and are given by (4) (5)Total area required for the extended 2D mesh NoC of nodes is represented by (6) is the area derived by assuming that the link length between adjacent two cores in the normal 2D mesh NoC is the same as that in the extended 2D mesh NoC (i.e. ). We evaluate the area overhead defined as follows. (7)Here, following assumptions are made; for the evaluation with the variety of cores, , , and nm. The coefficient of is obtained by the evaluation in the previous section; that is, the ratio of LUTs for a router and an additional switch (i.e. ) is 1174 : 56. Fig. 13 shows AO as a function of for . When , and . When , and . As shown in the figure, despite the addition of the hardware, is less than 2% and negligibly small. Fig. 13Open in figure viewer Area overhead 4.3 Average latency To evaluate the routing performance of our proposed architecture, computer simulations have been conducted. A cycle-accurate custom simulator was developed in C to simulate the behavior of flits in each router. A traditional router is assumed, as shown in Fig. 11. In one cycle of the simulation, one of the following processes is applied for a flit [[22]]; Routing computation: an output port is determined from the addresses of and . Switch allocation: arbitration and setup of the crossbar switch are performed. Switch traversal: the flit traverses the crossbar switch and is stored in the determined output buffer. Each flit passes through a router in 3 cycles if there is no contention; flit moves to the adjacent router at the 4th cycle. We measured average latency for the proposed and the existing routing methods on the extended and normal 2D mesh NoCs, respectively. Latency is defined for each packet as the total cycles to reach from generating at . Each trial is repeated 1000 times changing fault distribution, and the average latency is measured. Other simulation parameters are summarized in Table 1. Table 1. Simulation parameters network size 1010 fault rate 2, 4, 6, 8, 10, 12, 14% packet length 8 flits packet generation rate 0.051.0 packets/cycle/network input/output buffer depth 4/1 flits total simulation length 50,000 cycles Figs. 14-20 show the simulation results for 2–14% fault rates, respectively. First, we compare the average latencies of the proposed method (‘E2DM-NoC’ in Figs. 14-20) with that of the existing method (‘2DM-NoC’). When packet generation rate is relatively low, both methods show almost the same average latencies. On the other hand, when it is high, the proposed method outperforms the existing method. To see the difference in both methods, Table 2 summarizes packet generation rates at which maximum difference is observed, the average latency for each method at the packet generation rate, and the reduction rate. In the case of 2 and 4% fault rates, the proposed method with the extended architecture reduces average latency by about 2.5 and 4.4% at the packet generation rate of 0.70 and 0.70, respectively, compared with the existing state-of-the-art method [[21]]. In the case of 6%, 8%, and 10% fault rates, the reduction is about 10, 17, and 17% at the packet generation rate of 0.55, 0.50, and 0.45, respectively. Those reductions are because of the shorter detour paths provided by the additional links. Table 2. Average latency and reduction rate Fault rate, % Packet generation rate Average latency, cycle Reduction rate, % (1) 2DM-NoC (2) E2DM-NoC (3) Proposed method 2 0.7 128.31 125.15 78.06 2.5 39 4 0.7 471.49 450.55 398.89 4.4 15 6 0.55 251.55 225.98 196.90 10 22 8 0.5 352.83 293.74 272.95 17 23 10 0.45 373.75 310.74 302.91 17 19 12 0.35 484.76 464.01 463.49 4.3 4.4 14 0.2 44.27 43.53 43.43 1.7 1.9 Fig. 14Open in figure viewer Average latency (2% faults) Fig. 15Open in figure viewer Average latency (4% faults) Fig. 16Open in figure viewer Average latency (6% faults) Fig. 17Open in figure viewer Average latency (8% faults) Fig. 18Open in figure viewer Average latency (10% faults) Fig. 19Open in figure viewer Average latency (12% faults) Fig. 20Open in figure viewer Average latency (14% faults) Next, we compare the average latencies of the proposed method (‘E2DM-NoC of extend ring’ in Figs. 14-20) with those of others. Up to a fault rate of 10%, the proposed method outperforms other two methods. In the case of 2, 4, 6, 8, and 10% fault rates, the reduction is about 39, 15, 22, 23, and 19% at the packet generation rate of 0.70, 0.70, 0.55, 0.50, and 0.45, respectively. When fault rate exceeds about 14%, those methods show almost the same latencies. This is because, in such higher fault rates, partial detour paths are less likely to be crated as they may include faulty nodes. Then, the effect of using additional links diminishes by taking longer detour paths. Note that, even in such cases, the advantage of the proposed method to reduce the circuit area of all routers remains unchanged. To further reduce the average latency, we need to add virtual channels and employ certain flow control mechanism [[23], [24]] in the routers to preferentially route non-blocked packets or high priority packets. 5 Conclusion In this paper, we have proposed an extended 2D mesh NoC architecture for region-based fault-tolerant routing methods and modified deadlock free routing method. The proposed architecture has an additional track of links and switches at the four sides of a mesh network to partially reconfigure the network around FBs to provide new detour paths. This allows to simplify the complex routing rules of the existing methods and avoid long detour paths. We have designed NoC routers for the existing and the proposed routing methods and built a normal and an extended 2D mesh NoCs of nodes using them. The NoC router for the proposed method and the extended 2D mesh NoC reduced the circuit area by about 35 and 33%, respectively, compared with the existing ones. The area overhead of the additional links and switches was derived by the estimation of the total area of the extended 2D mesh NoC and found to be less than 2%. Those results are interesting in that area reduction is achieved by adding small amount of hardware. Average communication latency was also evaluated by computer simulations and the results indicated that the proposed architecture with the routing method reduces average latency by up to 39% compared with the existing method. Our future work includes applying this approach to other fault-tolerant routing methods to show the effectiveness and extending the proposed routing method to handle several alternative paths for congestion avoidance. 6 Acknowledgment This work was supported by JSPS KAKENHI Grant Number 18K11217 and Program to Disseminate Tenure Tracking System, MEXT, Japan. 7 References [1]Nurmi, J., Tenhunen, H., Isoaho, J., et al.: ‘ Interconnect-centric design for advanced SoC and NoC’ ( Springer, Berlin, 2004) [2]Charasani, S., Boppana, R.V.: ‘Communication in multicomputers with nonconvex faults’, IEEE Trans. Comput., 1997, 46, (5), pp. 616– 622 [3]Lee, D., Parikh, R., Bertacco, V.: ‘Highly fault-tolerant NoC routing with application-aware congestion management’. Proc. of the 9th Int. Symp. on Networks-on-Chip, Vancouver, Canada, September 2015, pp. 1– 8 [4]Tasi, W.C., Chu, K.C., Hu, Y.H., et al.: ‘Non-minimal,turn-model based NoC routing’, Microprocess. Microsyst., 2013, 37, (8), pp. 899– 914 [5]Glass, C.J., Ni, L.M.: ‘The turn model for adaptive routing’, J. ACM, 1994, 41, (5), pp. 874– 902 [6]Chen, Y.Y., Chang, E.J., Hsin, H.K., et al.: ‘Path-diversity-aware fault-tolerant routing algorithm for network-on-chip systems’, IEEE Trans. Parallel Distrib. Syst., 2017, 28, (3), pp. 838– 849 [7]Glass, C.J, Ni, L.M.: ‘Fault-tolerant wormhole routing in meshes without virtual channels’, IEEE Trans. Parallel Distrib. Syst., 1996, 7, (6), pp. 620– 636 [8]Chiu, G.M.: ‘The odd-even turn model for adaptive routing’, IEEE Trans. Parallel Distrib. Syst., 2000, 11, (7), pp. 729– 738 [9]Mejia, A., Flich, J., Duato, J., et al.: ‘Segment-based routing: an efficient fault-tolerant routing algorithm for meshes and tori’. Proc. of the 40th Int. Parallel and Distributed Processing Symp., Rhodes Island, Greece, April 2006, pp. 84– 93 [10]Wu, J.: ‘A fault-tolerant and deadlock-free routing protocol in 2D meshes based on odd-even turn model’, IEEE Trans. Comput., 2003, 52, (9), pp. 1154– 1169 [11]Janfaza, V., Baharlouei, E.: ‘A new fault-tolerant deadlock-free fully adaptive routing in NOC’. IEEE East-West Design & Test Symp. (EWDTS), Novi Sad, Sebia, September 2017, pp. 1– 6 [12]Sinha, D., Roy, A., Kumar, K.V., et al.: ‘Dn-FTR: fault-tolerant routing algorithm for mesh based network-on-chip’. 4th Int. Conf. on Recent Advances in Information Technology (RAIT), Dhanbad, India, March 2018, pp. 1– 5 [13]Chen, K.H., Chiu, G.M.: ‘Fault-tolerant routing algorithm for meshes without using virtual channels’, J. Inf. Sci. Eng., 1998, 14, (4), pp. 765– 783 [14]Hsin, H.K., Chang, E.J., Lin, C.A., et al.: ‘Ant colony optimization-based fault-aware routing in mesh-based network-on-chip systems’, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., 2014, 33, (11), pp. 1693– 1705 [15]Liu, J., Harkin, J., Li, Y., et al.: ‘Fault-tolerant networks-on-chip routing with coarse and fine-grained look-ahead’, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 2016, 35, (2), pp. 260– 273 [16]Zhao, H., Bagherzadeh, N., Wu, J.: ‘A general fault-tolerant minimal routing for mesh architectures’, IEEE Trans. Comput., 2017, 66, (7), pp. 1240– 1246 [17]Mota, R.G., Silveira, J., Silveira, J., et al.: ‘Efficient routing table minimization for fault-tolerant irregular network-on-chip’. IEEE Int. Conf. on Electronics, Circuits and Systems (ICECS), Monte Carlo, Monaco, December 2016, pp. 632– 635 [18]Moscibroda, T., Mutlu, O.: ‘A case for bufferless routing in on-chip networks’. 36th annual int. symp. on Computer architecture, New York, USA, June 2009, pp. 196– 207 [19]Yao, Z., Sui, X., Xu, T., et al.: ‘QBLESS: a case for QoS-aware bufferless NoCs’. IEEE 22nd Int. Symp. of Quality of Service (IWQoS), Hong Kong, China, May 2014, pp. 93– 98 [20]Holsmark, R., Kumar, S.: ‘Corrections to chen and chiu's fault tolerant routing algorithm’, J. Inf. Sci. Eng., 2007, 23, (6), pp. 1649– 1662 [21]Fukushima, Y., Fukushi, M., Yairi, I.E.: ‘A region-based fault-tolerant routing algorithm for 2D irregular mesh network-on-chip’, J. Electron. Test., 2013, 29, (3), pp. 415– 429 [22]Dally, W.J., Towles, B.: ‘ Principles and practices of interconnection networks’ ( Morgan Kaufman Publishers, San Francisco, CA, USA, 2004) [23]Ruaro, M., Carara, E.A., Moraes, F.G.: ‘Runtime adaptive circuit switching and flow priority in NoC-based MPSoCs’, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2015, 23, (6), pp. 1077– 1088 [24]Avramenko, S., Azad, S.P., Esposito, S., et al.: ‘Qosinnoc: analysis of QoS-aware NoC architectures for mixed-criticality applications’. IEEE 21st Int. Symp. on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Budapest, Hungary, April 2018, pp. 67– 72 Citing Literature Volume13, Issue3Special Issue: Defect and Fault Tolerance in VLSI and Nanotechnology SystemMay 2019Pages 224-232 FiguresReferencesRelatedInformation

Referência(s)