Communication control mechanism in reconfigurable network-on-chip architectures


VIETNAM NATIONAL UNIVERSITY HANOI

UNIVERSITY OF ENGINEERING AND TECHNOLOGY


THI-THUY NGUYEN

COMMUNICATION CONTROL MECHANISM IN RECONFIGURABLE NETWORK-ON-CHIP ARCHITECTURES

Branch: Electronics – Telecommunications Technology

Major: Electronics Engineering

Code: 60520203

MASTER’S THESIS OF ELECTRONICS – TELECOMMUNICATIONS TECHNOLOGY

Supervisor: Assoc. Prof. Xuan-Tu Tran

Hanoi - 2015


TABLE OF CONTENTS

AUTHORSHIP
TABLE OF CONTENTS
List of Figures
List of Tables
List of Abbreviations
INTRODUCTION

NETWORK-ON-CHIP
1.1 Basic concepts
1.1.1 Basic components of a NoC
1.1.2 Network topology
1.1.3 Communication protocol
1.1.4 Routing modes
1.1.5 Buffering strategies
1.1.6 Routing algorithms
1.1.7 Data blocking
1.1.8 Quality of the service
1.2 Communication mechanism in NoC
1.2.1 Definition and classification
1.2.2 Previous works
1.3 Reconfigurable NoCs
1.4 Conclusions

PROPOSED COMMUNICATION CONTROL MECHANISM IN RECONFIGURABLE NOCS ARCHITECTURES
2.1 Target reconfigurable NoC architectures
2.1.1 General parameters
2.1.2 RNoC platform
2.2 Proposed Communication control mechanism
2.2.1 Tracking and replacing routing information mechanism
2.2.2 Flit structures
2.3 Conclusions

ARCHITECTURE OF MODIFIED ROUTER AND NETWORK INTERFACE
3.1 Modified architecture for reconfigurable network router
3.2 NI architecture
3.2.1 C2r_buffer
3.2.2 C2r_controller
3.2.3 Flitizer
3.2.4 Routing_table
3.2.5 Updating_path
3.2.6 R2c_buffer
3.2.7 R2c_controller
3.2.8 De_flitizer
3.3 Conclusions

VERIFICATION, IMPLEMENTATION AND EVALUATION
4.1 Verification method
4.1.1 Verifying basic function of NI
4.1.2 Verifying the mechanism of tracking and replacing routing information
4.2 Implementation result
4.3 Evaluation
4.4 Conclusions

CONCLUSIONS
Publications
References
Appendix

List of Figures

Figure 1-1: An example architecture of a 2D mesh
Figure 1-2: Popular NoC topologies: (a) ring and chordal ring; (b) fat-tree; (c) butterfly fat-tree; (d) 2D-mesh; (e) 2D-torus; (f) 2D folded torus [5]
Figure 1-3: Three frequent routing modes of NoC: (a) Store-And-Forward; (b) Virtual Cut-Through; (c) Wormhole [5]
Figure 1-4: Four popular buffering strategies of NoC: (a) Input queuing; (b) Output queuing; (c) Virtual output queuing; (d) Virtual channel priority input queuing [5]
Figure 1-5: Deadlock example [5]
Figure 1-6: Separate-buffer virtual channels are used to solve the deadlock problem. The data of one VC can use the physical link when the other VCs on the same port are stalled [4]
Figure 1-7: An example of livelock. The outside routers cannot reach the inside ones because they are in deadlock [5]
Figure 1-8: The classification of NoC flow control mechanisms
Figure 1-9: Connection implementation of end-to-end flow control [13]
Figure 1-10: NoC architecture with injection level flow control strategy [14]
Figure 1-11: Router architecture [15]
Figure 1-12: Proposed idea of T-error [12]
Figure 1-13: Logic level of T-error [12]
Figure 2-1: RNoC router architecture [2]
Figure 2-2: (a) The prohibited router is in the middle of a straight segment of the routing path; (b) the prohibited router is at the corner of the routing path [2]
Figure 2-3: The prohibited router appears just before or just after the corner of the routing path [2]
Figure 2-4: Block diagram of the proposed communication mechanism
Figure 2-5: Router-to-router/NI interface and send/accept protocol [23]
Figure 2-6: A flow diagram of the tracking and replacing routing information mechanism
Figure 2-7: Structure of the header flit
Figure 2-8: Structure of the body flit of a normal packet
Figure 2-9: Structure of the body flit of a special packet
Figure 2-10: Structure of the tail flit
Figure 3-1: Micro-architecture of the INPUT PORT of the RNoC router [2]
Figure 3-2: Modifying the VC_Demux of the North input port
Figure 3-3: Modifying the VC_Demux of the East input port
Figure 3-4: Modifying the VC_Demux of the South input port
Figure 3-5: Modifying the VC_Demux of the West input port
Figure 3-6: Architecture of the NI
Figure 3-7: C2R_buffer module
Figure 3-8: C2R_controller module
Figure 3-9: Flitizer module
Figure 3-10: Routing table module
Figure 3-11: Updating path module
Figure 3-12: R2C_Buffer module
Figure 3-13: R2C_controller module
Figure 3-14: De_flitizer module
Figure 4-1: Testbench model
Figure 4-2: Test case 1
Figure 4-3: Simulation for tracking phase and processing phase in test case 1
Figure 4-4: Simulation for replacing phase in test case 1
Figure 4-6: Simulation for tracking phase and processing phase in test case 2
Figure 4-7: Simulation for replacing phase in test case 2
Figure 4-9: Simulation for tracking phase and processing phase in test case 3
Figure 4-10: Simulation for replacing phase in test case 3
Figure 4-11: Timing information
Figure 4-12: ASIC & VLSI design flow [24]

List of Tables

Table 2-1: Function and code for each type of flits
Table 3-1: Codes for directions
Table 3-2: Code for VC_Demux of each input port
Table 3-3: Pin description of C2R_buffer module
Table 3-4: Pin description of R2c_buffer module
Table 4-1: Routing information in test case 1
Table 4-2: Routing information in test case 2
Table 4-3: Routing information in test case 3
Table 4-4: Device Utilization Summary (estimated values)
Table 4-5: Delay information

List of Abbreviations

ASIC Application-Specific Integrated Circuit

CCM Communication Control Mechanism

FIFO First In First Out

FPGA Field Programmable Gate Array

IP Intellectual Property

NoC Network on Chip

QoS Quality of Service

SAF Store-And-Forward

SoC System on Chip

VC Virtual Channel

VCPIQ Virtual Channel Priority Input Queuing

VCT Virtual Cut Through

VHDL VHSIC Hardware Description Language

VHSIC Very High Speed Integrated Circuit

VOQ Virtual Output Queuing

WH Wormhole


INTRODUCTION

To meet the rapidly increasing demands of applications, more and more computing resources such as Central Processing Units, Digital Signal Processors, and specific Intellectual Property (IP) cores are being integrated into a System-on-Chip (SoC). Interconnecting these computing resources has therefore become a major challenge in SoC design. Networks-on-Chip (NoCs) have recently emerged as a promising solution for communication in large SoCs. The NoC architecture provides a new method for connecting different IP cores through an effective, modular, and scalable network [1]. However, standard NoC architectures are not flexible enough to support dynamic environments in which communication characteristics can change strongly at run-time. A new NoC design methodology, called reconfigurable NoCs, has emerged as an alternative solution to tackle these challenges. Reconfigurable NoCs can provide adaptive communication infrastructures and flexible network protocols [1]. Thanks to their ability to reconfigure hardware dynamically, reconfigurable NoCs allow many hardware tasks to be mapped onto the same hardware platform.

Several reconfigurable NoCs have been proposed, and each architecture uses a different method to make the NoC reconfigurable. Depending on the characteristics and features of each reconfigurable NoC, the communication infrastructure and communication protocol change adaptively after the reconfiguration process. As part of the communication protocol, the communication control mechanism (CCM) also needs to be updated for the new environment to ensure the correct operation as well as the communication performance of the network. The RNoC [2] is a reconfigurable NoC platform developed at the Key Laboratory for Smart Integrated Systems (SIS Lab), VNU University of Engineering and Technology. It provides a router architecture that can reconfigure the routing path when the data being transmitted encounters a prohibited node (i.e., a dead node). However, the routing path reconfiguration process increases packet delay. In other words, reconfiguration lengthens the period during which a packet occupies communication resources and thus affects network traffic. Therefore, we propose a CCM for the RNoC platform that guarantees lossless communication and reduces the packet delay of the network.


In our CCM, both node-to-node flow control and end-to-end flow control are used to reach this target. While the node-to-node flow control ensures the lossless communication of RNoC, the end-to-end flow control is responsible for updating the CCM after the reconfiguration process. The main part of our end-to-end flow control is the tracking and replacing information mechanism. To implement this mechanism, the RNoC architecture is modified to support the tracking phase, while the NI architecture is designed to implement the processing phase and the replacing phase. Our NI design is modeled in VHDL. The NIs are then plugged into a 4×4 mesh modified RNoC platform to simulate and verify the operation of our CCM. The simulation results show that our CCM reduces header-to-header delay by about 23.5% to 50%, tail-to-tail delay by 0% to 50%, and packet-to-packet delay by 5.2% to 11.7%. After the verification process, our NI architecture is synthesized for the Virtex-5 XC5VLX330-2ff1760 kit using Xilinx tools. The synthesis results show that our design can operate at a maximum frequency of 294 MHz.

The thesis is organized into four chapters, together with the Introduction and Conclusions sections. Chapter 1 gives an overview of NoC and its basic concepts; the communication mechanisms in NoC and reconfigurable NoCs are also reviewed in this chapter. Chapter 2 presents our proposed communication control mechanism for reconfigurable NoC architectures. The modified architecture of the RNoC router and our NI design are then described in Chapter 3. Chapter 4 explains the method used to simulate and verify our design, and gives the implementation results and the evaluation. Finally, the Conclusions section summarizes the work presented and outlines related future work.


NETWORK-ON-CHIP

In this chapter, we address the basic concepts of the network-on-chip (NoC) paradigm. The communication control mechanism is then presented. Finally, some previous works are reviewed and discussed to provide the state of the art of the research topic.

1.1 Basic concepts

Network-on-chip (NoC) is emerging as a revolutionary methodology for solving the bandwidth bottleneck of shared-bus interconnection. To give a good overview of NoC, this section introduces the main concepts of the NoC paradigm. The network topology, communication mechanisms, routing modes, buffering strategies, routing algorithms, quality of service, flow control and data blocking are explained and analyzed. After that, a review of the state of the art shows the research trends and makes the position of this thesis topic clearer.

1.1.1 Basic components of a NoC

There are three fundamental components in a network-on-chip: routers, Network Interfaces (NIs) or Network Adapters (NAs), and links [3]. In a NoC, routers connect to each other through links in a specific topology. An example architecture of a 2D mesh NoC is shown in Figure 1-1. Each router can communicate with an IP core through an NI.


Figure 1-1: An example architecture of a 2D mesh

• Links: A communication link is a set of wires. Links physically connect the routers in the network and actually carry the communication [3].

• Router: The main function of a router is to route data from source to destination according to the chosen protocol and routing strategy [4]. The architecture of a router depends on many factors, such as the communication protocol, routing mode, and routing algorithm.

• Network Interface (or Network Adapter): As each IP core may have a distinct interface protocol with respect to the network, NoCs must include NIs, which make the logical connection between the IP cores and the routers [3].

1.1.2 Network topology

Figure 1-2: Popular NoC topologies: (a) ring and chordal ring; (b) fat-tree; (c) butterfly fat-tree; (d) 2D-mesh; (e) 2D-torus; (f) 2D folded torus [5]

Because each network topology has strengths and weaknesses, the following physical parameters are usually used to compare them: router degree, network diameter, regularity, symmetry, path diversity, and bisection width. Based on these characteristics, the 2D mesh topology is considered the most predominant one because it is easy to implement with current planar IC technologies, has simple routing strategies, and scales well. The 2D torus topology is also a popular choice when designers want to minimize the network diameter and improve bandwidth. Nevertheless, this topology is more complex to implement, and the long wrap-around links at the external boundary routers can decrease communication performance. The 2D folded torus overcomes the long wrap-around link problem, but its routing algorithm is more complex than those of the above topologies. All three of these topologies have a drawback in their associated network latency.
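As a rough illustration of the network diameter parameter used in this comparison, the following Python sketch measures the diameter of a k×k mesh and a k×k torus by breadth-first search. The function names and the 4×4 size are assumptions chosen for illustration; the sketch counts hops only and ignores wire length, so it does not capture the wrap-around link penalty of the torus.

```python
from collections import deque

def neighbors(x, y, k, wrap):
    """Yield the neighbours of router (x, y) in a k x k mesh (wrap=False) or torus (wrap=True)."""
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if wrap:
            yield nx % k, ny % k          # wrap-around links of the torus
        elif 0 <= nx < k and 0 <= ny < k:
            yield nx, ny                  # mesh: no link beyond the boundary

def diameter(k, wrap):
    """Longest shortest path, in hops, between any two routers (BFS from every router)."""
    worst = 0
    for sx in range(k):
        for sy in range(k):
            dist = {(sx, sy): 0}
            queue = deque([(sx, sy)])
            while queue:
                cur = queue.popleft()
                for nxt in neighbors(cur[0], cur[1], k, wrap):
                    if nxt not in dist:
                        dist[nxt] = dist[cur] + 1
                        queue.append(nxt)
            worst = max(worst, max(dist.values()))
    return worst

mesh_diameter = diameter(4, wrap=False)   # 4x4 mesh
torus_diameter = diameter(4, wrap=True)   # 4x4 torus
```

Under these assumptions the 4×4 mesh has a diameter of 6 hops and the 4×4 torus a diameter of 4 hops, which is the diameter reduction that motivates the torus.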

Fat-tree, butterfly fat-tree and chordal ring topologies are also frequently used to construct a robust NoC. Fat-tree and butterfly fat-tree topologies are often chosen for their lower network latency, thanks to their very small network diameter. However, they have to deal with fan-in/fan-out and wiring complexity problems. Compared with the ring topology, the chordal ring topology has higher communication performance, but its wiring complexity and routing difficulty are also higher.


In conclusion, many topologies can be used in a NoC architecture, so it is very hard to make the best decision. Moreover, the constraints of the application and the network traffic have to be taken into account. There is always a tradeoff between the intrinsic performance of the topology and its implementation overhead.

1.1.3 Communication protocol

The communication protocol defines the strategy for moving data in a NoC [4]. Communication on the network is based on switching techniques, that is, setting up temporary links between two or more nodes in the network to transfer information or a part of the information. Two principal techniques are usually used in NoC: circuit switching and packet switching.

Circuit switching: This was the first switching technique, used in old-generation telephone networks. When circuit switching is applied, the communication link between the source node and the destination node must be set up before any transmission can occur. Therefore, a physical channel always exists between the two communicating units during the data transfer period. The control information and the communication information are independent.

Circuit switching has many advantages, such as strong guarantees, maximum bandwidth and low transmission latency. Therefore, it is suitable for real-time applications. Nevertheless, the circuit has to be granted before the transmission, so the network resources remain occupied until the end of the transmission. Low flexibility is another disadvantage of this technique; for instance, the central controller must be reconfigured whenever a router changes.

Packet switching: In packet switching, data is encapsulated into packets at the source before being sent to the destination. Each packet is composed of successive flits and contains both control and communication information. Packets can be transferred simultaneously via many different paths and are then reordered at the destination.

With this technique, the upper limit of possible network performance can be reached, and resources (i.e., routers and links) can be shared. In contrast to circuit switching, a network using this technique is switched locally between network nodes, not between the communicating units. Another advantage is that no central controller is needed, because packets are routed separately. However, to route packets individually, the routing information has to be included in each packet. The resulting drawback of packet switching is that the routers become more complex. Moreover, the network has to deal with some critical problems, such as high transmission delay, deadlock, and ordering management.

1.1.4 Routing modes

Because of the complexity of packet switching, it is necessary to define a routing mode. The routing mode is the way a packet is forwarded from one network node to the next [5]. Store-And-Forward (SAF), Virtual Cut-Through (VCT) and Wormhole (WH) are the three common routing modes in NoC.

Figure 1-3: Three frequent routing modes of NoC: (a) Store-And-Forward; (b) Virtual Cut-Through; (c) Wormhole [5]

Store-And-Forward (SAF): In this routing mode, all of the flits of a packet are transmitted from one router to the next as a unit: the flits are not sent to the next router until all of them have reached the current router. Therefore, the buffer of each router must be large enough to store an entire data packet, as represented in Figure 1-3(a). However, the limits on implementation area and power consumption in NoC leave little room for router buffers. In addition, the data delay increases at every routing stage, because each router waits for the whole packet.

Virtual Cut-Through (VCT): VCT was proposed to reduce the packet delay at every router. In this mode, a router can start transmitting the flits of a packet to the next router before it has received all of that packet's flits, unlike in SAF mode. This is illustrated in Figure 1-3(b). However, the current router must be able to store the whole packet in case the next router is busy. Therefore, router buffers in this mode have the same size as in SAF mode.


Wormhole (WH): In this mode, a flit of a packet can be forwarded whenever the next router is available, as shown in Figure 1-3(c). As in VCT mode, forwarding does not have to wait until the current router has received the entire packet. In a NoC using WH routing, each packet consists of one header flit, some data flits in the middle, and a final flit called the tail flit. The header flit contains the routing information and reserves the routing channel in each router. The data flits then follow the reserved channel, which is later released by the tail flit. In contrast to the two modes above, this routing mode reduces the buffer size of the network routers as much as possible. Moreover, one packet can occupy several switches at the same time, so the packet latency is also reduced. However, the main drawback of WH mode is that if one flit is blocked, all of the flits following it are blocked as well.

In computer networks, SAF mode is preferred because a large implementation area is available. In NoC, WH is the most common mode because of its low data latency and small router buffers.
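The latency difference between the routing modes can be made concrete with an idealized zero-load model. The Python sketch below assumes that a link carries one flit per cycle, that routers add no extra delay, and that there is no contention; these assumptions and the function names are illustrative and are not taken from [5].

```python
def saf_latency(hops, flits):
    """Store-And-Forward: every hop waits for the whole packet before forwarding it."""
    return hops * flits

def wormhole_latency(hops, flits):
    """Wormhole: the header takes `hops` cycles, the remaining flits follow in a pipeline."""
    return hops + flits - 1

def min_buffer_flits(mode, flits):
    """Smallest per-router buffer (in flits) that the mode can work with in this model."""
    return flits if mode in ("SAF", "VCT") else 1   # SAF and VCT must hold a whole packet

# Example: a 4-flit packet crossing 3 links.
saf = saf_latency(3, 4)
wh = wormhole_latency(3, 4)
```

In this model the 4-flit packet needs 12 cycles with SAF and 6 cycles with wormhole over 3 links. Without contention VCT behaves like wormhole in latency, but it still needs packet-sized buffers.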

1.1.5 Buffering strategies

Input queuing: With this strategy, N queues are established at the N input ports of a router; see Figure 1-4(a). A scheduling arbiter decides which output port is connected to which input at a given time, to avoid conflicts. However, because of head-of-line blocking, the router throughput saturates at about 59% when the number of input ports is large. Head-of-line blocking occurs when the data at the head of a queue cannot access its output port and therefore blocks all the data behind it.


Figure 1-4: Four popular buffering strategies of NoC: (a) Input queuing; (b) Output queuing; (c) Virtual output queuing; (d) Virtual channel priority input queuing [5]

Output queuing: In this strategy, N queues are located at the N output ports of each router, as depicted in Figure 1-4(b). All flits that arrive in the same time slot must be scheduled before the next time slot starts. The crossbar fabric therefore has to run N times as fast as the input/output ports, to cover the case in which all flits target the same output port. This is the disadvantage of output queuing. Its advantage is that it can achieve the maximum throughput of one per input and the best delay performance.

Virtual Output Queuing (VOQ): The idea of this strategy is to combine the advantages of input queuing and output queuing. Starting from the input queuing model, it places N buffers at the N input ports and adds (N-1) virtual buffers for each port to eliminate head-of-line blocking, so there are N² buffers in one N-input-port router. Moreover, this strategy imitates the operation of output queuing, so all inputs can be connected to outputs at the same time. Therefore, the VOQ mechanism has high performance compared with the two strategies above. However, the very large number of buffers makes it expensive in terms of implementation area.

Virtual Channel Priority Input Queuing (VCPIQ): This strategy was proposed to improve switching performance and reduce the disadvantage of the VOQ mechanism. For one physical channel, P queues (P < N) are set up for P virtual channels, which share the bandwidth of the link as illustrated in Figure 1-4(d). Using virtual channels reduces the head-of-line blocking problem, which increases the utilization of the links. However, data latency and switch complexity increase with the number of virtual channels.

1.1.6 Routing algorithms

A routing algorithm determines the path along which a data packet is transmitted from the source to the destination [5]. It plays an important role in NoC communication. Therefore, we have to analyze the requirements and trade them off to reach a suitable solution, one that uses the maximum capacity of the communication nodes while remaining simple to implement on a NoC.

There are two main types of routing algorithms: deterministic and adaptive [5]. In a deterministic routing algorithm, the path between the source and the destination is established without regard to the current state of the network. In a NoC using an adaptive routing algorithm, by contrast, the path between the source and the destination depends closely on the current state of the network. The first type is simpler than the second, so it is preferred in ASICs, which usually have stable data traffic. Adaptive routing algorithms are used in systems with unpredictable network throughput, such as MPSoCs (Multi-Processor SoCs).

Based on where the routing decision is made, routing algorithms can be divided into four subtypes [6]: centralized routing, source routing, distributed routing and multiphase routing. With centralized routing, the path of the data is given by a central controller. If the source decides the route, it is called source routing. In distributed routing, by contrast, the routing decisions are made while the data is being transmitted from the source to the destination. The final type, multiphase routing, is a hybrid of source routing and distributed routing.
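A minimal sketch of a deterministic algorithm is dimension-ordered (XY) routing on a 2D mesh, in which a packet first travels along the X dimension and then along the Y dimension. XY routing is used here only as a common textbook example; this section does not state which algorithm a particular NoC uses, and the (x, y) coordinate convention and function name are assumptions.

```python
def xy_route(src, dst):
    """Return the list of routers visited from src to dst, moving along X first, then Y.

    The path depends only on src and dst, never on the network state,
    which is what makes the algorithm deterministic.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                 # first correct the X coordinate
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                 # then correct the Y coordinate
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

route = xy_route((0, 0), (2, 1))
```

Because the function looks only at the source and destination coordinates, the same pair always yields the same path; an adaptive algorithm would additionally consult congestion or fault information at each hop.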

1.1.7 Data blocking

Deadlock and livelock are the two typical types of data blocking in NoC.


Figure 1-5: Deadlock example [5]

Deadlock: This type of data blocking occurs when at least one flit is blocked while waiting for an event that cannot happen. Figure 1-5 illustrates an example of deadlock. The data occupying links L1 and L2 is waiting for the release of link L3, while link L1 is released if and only if L3 is not occupied. This problem can be addressed by using the virtual channel technique. The way the Virtual Channel (VC) technique works is represented in Figure 1-6: the data of one VC can use the physical link when the other VCs on the same port are stalled.

Figure 1-6: Separate-buffer virtual channels are used to solve the deadlock problem. The data of one VC can use the physical link when the other VCs on the same port are stalled [4]
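The circular waiting behind a deadlock can be sketched as a cycle in a wait-for graph. In the Python sketch below, the packet names and the `waits_for` mapping are hypothetical and do not reproduce Figure 1-5; the function only checks whether some chain of waiting packets closes on itself.

```python
def has_cycle(waits_for):
    """Return True if the wait-for graph contains a cycle (a circular wait).

    waits_for maps each packet to the packets whose resources it is waiting for.
    """
    state = {}  # packet -> "visiting" while on the current DFS path, "done" afterwards

    def visit(node):
        if state.get(node) == "visiting":
            return True                # we came back to a packet on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        for nxt in waits_for.get(node, ()):
            if visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(node) for node in list(waits_for))

# Hypothetical example: three packets, each holding a link the next one needs.
deadlocked = has_cycle({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]})
progressing = has_cycle({"P1": ["P2"], "P2": ["P3"], "P3": []})
```

In the first mapping every packet waits for another one in the ring, so none can advance; in the second, P3 waits for nothing and the chain can drain.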

Livelock: The case in which a packet cannot reach its target router even though it is not totally blocked is called livelock. This occurs when resources perpetually change their state while waiting for other communications to complete [5]. One example of this phenomenon is shown in Figure 1-7.


Figure 1-7: An example of livelock. The outside routers cannot reach the inside ones because they are in deadlock [5]

1.1.8 Quality of the service

Quality of Service (QoS) refers to the aspects of the service provided by the network that determine the level of guarantees for data transfer. Several characteristics are used to assess the QoS of a NoC, for example those related to timing (minimum throughput, maximum latency, maximum latency jitter), to integrity (maximum error rate, maximum packet loss), and to packet delivery (ordered or unordered delivery) [7]. Depending on these aspects, the QoS of a NoC is divided into the following two types:

Best Effort (BE): Packets are transmitted as soon as possible, and the network is dimensioned to obtain the best average performance. However, in general, no guarantee is given for latency or throughput.

Guaranteed Service (GS): In contrast to BE, which is a non-guaranteed service, this type of QoS gives the NoC a guarantee on the minimum throughput and the maximum latency. As a result, a NoC using GS demands more network resources than one using BE [5].

1.2 Communication mechanism in NoC

1.2.1 Definition and classification

Many control mechanisms are built into a NoC to ensure its correct operation, such as flow control mechanisms, congestion control mechanisms, and error control mechanisms. In this work, the communication control mechanism of concern is the flow control mechanism, which was briefly introduced in subsection 1.1.7. The aim of a flow control mechanism is to avoid overloading the network and to moderate the network traffic [5].

The classification of flow control mechanisms is shown in Figure 1-8. First, these mechanisms can be divided into two types: end-to-end flow control and node-to-node (switch-to-switch) flow control. Node-to-node flow control can be further categorized into bufferless flow control and buffered flow control. There are many buffered flow control techniques, such as credit-based flow control, handshaking-signal-based flow control, ACK/NACK flow control, STALL/GO flow control and T-Error flow control [8, 9].

Figure 1-8: The classification of NoC flow control mechanisms

With respect to moderating network traffic, end-to-end flow control regulates global traffic, while node-to-node flow control is responsible for local traffic (between routers). To avoid overloading the network, the end-to-end flow control mechanism ensures that a source node does not produce more data than the destination node can handle, so the buffer in the NI cannot overflow. In addition, the buffers in routers are protected by node-to-node flow control, which determines how the upstream router checks the buffer status of the downstream router.


As a result, end-to-end flow control is implemented in the NI (or NA), while node-to-node flow control takes place in the routers. Node-to-node flow control can also be used to prevent the buffer at the NI from overflowing, since the last router in the routing path can communicate with the destination NI as it would with a normal router. It checks the status of the NI buffer: if the buffer is not full, the data in the last router is sent; otherwise, the data is stored in the buffers of the routers along the routing path. However, if the destination node consumes data more slowly than the source node injects it, flits are stalled across the network. This tends to cause congestion and deadlock in the NoC [10]. Therefore, using link-level flow control without end-to-end flow control is not recommended. Conversely, using end-to-end flow control exclusively, without node-to-node flow control, is not feasible [9].

Bufferless flow control is mainly used in circuit-switched networks, as in [11]. Agarwal et al. note that bufferless flow control has higher latency and lower throughput than buffered flow control [8]. In contrast, buffered flow control is used in packet-switched networks. The following paragraphs describe some buffered flow control techniques.

• Credit-based flow control: a flow control mechanism in which the upstream node has a counter that tracks the number of free slots available in the downstream queue. The counter is decremented whenever a flit is transmitted and incremented whenever a credit signal arrives from the downstream node, which in turn sends a credit whenever it has succeeded in forwarding a flit from its queue to the next hop.

• Handshaking-signal-based flow control: whenever the upstream node sends data to the downstream node, it raises a valid signal. As soon as it has successfully received the data, the downstream node raises a valid signal of its own to inform the upstream node that the data has been consumed.

• ACK/NACK flow control: in this mechanism, the copy of the data sent to the downstream node is kept in the buffer of the upstream node and deleted only when the downstream node asserts an ACK signal. If a NACK signal is received, the upstream node schedules a retransmission of that data. Thanks to the retransmission mechanism, ACK/NACK flow control supports thorough fault detection and handling.

• STALL/GO: two wires (signals) are used to implement this flow control mechanism, one going forward and one going backward. The forward wire indicates that the data being sent is valid, while the backward wire indicates whether the buffer of the downstream node has free space (GO) or is full (STALL). This is a low-overhead scheme that assumes reliable flit delivery [9].

• T-Error: provides logic to detect timing errors in data transmission by using an error control circuit added to the pipeline buffer in each pipeline stage [12]. This mechanism is more complex than the other flow control techniques, yet it supports only partial fault tolerance.
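The credit counter described in the first item of the list can be sketched in a few lines of Python. The class and method names are hypothetical, the downstream queue is modeled inside the same object for brevity, and the credit is assumed to return to the upstream node instantly, which a real link would not do.

```python
from collections import deque

class CreditLink:
    """Upstream view of one link feeding a downstream queue of fixed depth."""

    def __init__(self, depth):
        self.credits = depth        # free downstream slots known to the upstream node
        self.downstream = deque()   # downstream input queue

    def send(self, flit):
        """Transmit a flit if a credit is available; return True on success."""
        if self.credits == 0:
            return False            # no free slot downstream: the flit must wait
        self.credits -= 1           # one slot is now reserved for this flit
        self.downstream.append(flit)
        return True

    def forward(self):
        """Downstream forwards one flit to the next hop and returns a credit."""
        flit = self.downstream.popleft()
        self.credits += 1           # the credit signal reaches the upstream node
        return flit

link = CreditLink(depth=2)
first_ok = link.send("header")
second_ok = link.send("body")
third_ok = link.send("tail")        # refused: both downstream slots are in use
```

Because the upstream node never sends without a credit, the downstream queue cannot overflow, which is the lossless property that node-to-node flow control is meant to provide.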

1.2.2 Previous works

Credit-based end-to-end flow control is implemented in [13]. For each channel, a counter (space) tracks the empty buffer space of the remote destination queue, as shown in Figure 1-9. This counter is initialized with the remote buffer size. When data is sent from the source queue, the counter is decremented. On the remote destination side, a counter in the remote NI (credit) indicates the number of available slots in the NI buffer. When data is consumed by the IP module, this counter is incremented, meaning that credits are produced. These credits are sent to the producer by piggybacking them in packet headers.
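The interplay of the two counters can be sketched as follows. This is only an illustration of the description above under simplifying assumptions: the class names, the word-based accounting and the `build_header` method are hypothetical and are not the interface of the NI in [13].

```python
class SourceNI:
    """Producer side: `space` mirrors the free space of the remote destination queue."""

    def __init__(self, remote_buffer_size):
        self.space = remote_buffer_size   # initialized with the remote buffer size

    def can_send(self, words):
        return words <= self.space

    def send(self, words):
        if not self.can_send(words):
            raise ValueError("not enough space in the remote buffer")
        self.space -= words               # decremented when data leaves the source queue

    def receive_header(self, credits):
        self.space += credits             # credits carried in an incoming packet header


class DestinationNI:
    """Consumer side: `credit` accumulates slots freed by the IP module."""

    def __init__(self):
        self.credit = 0

    def consume(self, words):
        self.credit += words              # the IP module consumed data: credits are produced

    def build_header(self):
        """Piggyback all pending credits on the next packet sent back to the producer."""
        credits, self.credit = self.credit, 0
        return credits


src, dst = SourceNI(remote_buffer_size=8), DestinationNI()
src.send(8)                               # the remote buffer is now assumed full
blocked = not src.can_send(1)
dst.consume(3)                            # the IP module frees three slots
src.receive_header(dst.build_header())    # credits travel back in a packet header
```

After the header arrives, the source may send three more words; until then it stays blocked, so the destination NI buffer cannot overflow.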

Figure 1-9: Connection implementation of end-to-end flow control [13]

The paper showed that the bandwidth overhead can be reduced by up to 20% with the piggybacking scheme [13]. The reason given is that the larger the burst size, the larger the number of credits reported in a credit packet. Consequently, the number of credit packets (i.e., the overhead introduced by credits) decreases. However, the drawback of increasing the burst size is that the buffers need to grow to accommodate the larger bursts.

Tang and Lin [14] introduced a new end-to-end flow control mechanism called injection level flow control. The proposed mechanism manages the data injection rate at the source nodes depending on the network status. In this work, network status refers to the number of data flows sharing a channel (called the share pattern) [14].


The injection level flow control can be separated into four stages, as described in Figure 1-10. The process that calculates the share pattern (sp) is implemented in the destination PE, and the process that analyzes sp to adjust the packet injection rate is implemented in the source PE. The sp signal is transmitted from the destination PE to the source PE through the control network; that is, there are two networks in this work: a data network for data transmission and a control network for control information transmission.

Figure 1-10: NoC architecture with injection level flow control strategy [14]
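The division of work between the destination PE and the source PE can be sketched as below. The exact computation used in [14] is not given here, so both functions are assumptions made for illustration: the share pattern is taken to be the number of distinct flows seen in an observation window, and the injection interval is scaled linearly by it.

```python
def share_pattern(recent_packet_sources):
    """Destination PE (assumed rule): number of distinct flows seen in the window."""
    return len(set(recent_packet_sources))


def injection_interval(sp, base_interval=1):
    """Source PE (assumed rule): cycles to wait between two injected packets.

    The more flows share the channel, the longer the source waits.
    """
    return base_interval * max(1, sp)
```

In this sketch, the value returned by `share_pattern` is what would travel over the control network, and `injection_interval` is evaluated at the source.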

To evaluate the proposed mechanism, the authors extended the open-source simulator Noxim to carry out the simulations and obtain latency and throughput figures. The area overhead of the proposed scheme is very small since it does not change the data network. Two reasons are given. First, the PEs, not the routers, are responsible for collecting and processing the status information. Second, the control information is sent through a dedicated control network which is independent of the data network. However, since the PEs implement the proposed mechanism, which requires calculation and processing operations, this scheme cannot be used in systems that contain non-processor IPs.

The network presented in [15] uses a mesh topology and employs wormhole packet forwarding with hop-by-hop credit-based backpressure flow control (for lossless buffer operation and minimal buffer requirements). The credit-based backpressure flow control is implemented in each router, as shown in Figure 1-11.

Whenever a flit is forwarded from an input port to an output port, one buffer slot becomes available, and a buffer credit is sent back to the previous router on separate out-of-band wires. As a result, an output port of a router keeps track of the number of available flit slots, per service level, in the input-port buffer of the next router. That number is decremented upon transmitting a flit and incremented upon receiving a buffer credit from the next router.


Figure 1-11: Router architecture [15]
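The per-service-level bookkeeping at an output port can be sketched as follows. The class `OutputPort` and its method names are hypothetical; the sketch only mirrors the decrement-on-send and increment-on-credit rule described above, not the router of [15].

```python
class OutputPort:
    """Output port tracking free flit slots in the next router, per service level."""

    def __init__(self, slots_per_level):
        # slots_per_level: {service_level: input-buffer slots in the next router}
        self.credits = dict(slots_per_level)

    def try_send(self, level):
        """Transmit one flit of the given service level if a slot is available."""
        if self.credits[level] == 0:
            return False  # backpressure: the next router has no free slot
        self.credits[level] -= 1
        return True

    def on_credit(self, level):
        """A buffer credit arrived from the next router on the out-of-band wires."""
        self.credits[level] += 1
```

Because each service level has its own counter, a level that has run out of credits does not prevent another level from sending.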

The T-error flow control mechanism, proposed by R. R. Tamhankar et al. [12], makes NoC links tolerant to timing errors caused by unpredictable environmental conditions and wire characteristics. T-error flow control is useful for communication over physical links that either stretch the distance between repeaters or increase the operating frequency with respect to a conventional design. Such links are likely to suffer from timing errors as a consequence. Each pipeline buffer is therefore augmented with an error control circuit to detect and correct errors, as shown in Figure 1-12.


Figure 1-12: Proposed idea of T-error [12]

The logic-level implementation of the T-error scheme is shown in Figure 1-13. Input data is sampled twice, once with the original clock (ck) and once with a delayed clock (ckd). The delay between the ck and ckd clock edges is chosen to ensure that the delayed flip-flop always operates in an error-free manner. As a result, the data captured by the delayed flip-flop is error-free and is used as the reference data.

Figure 1-13: Logic level of T-error [12]

The data sampled by the main flip-flop is compared with the reference data. If it is erroneous, the correct data (the reference data) is sent at the next ck clock edge.
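The double-sampling idea can be expressed as a small behavioural model. This is a sketch under stated assumptions, not the circuit of [12]: the function name, the `skew` parameter (the ck-to-ckd delay) and the one-cycle correction penalty are introduced here for illustration.

```python
def t_error_stage(new_value, old_value, arrival_time, period, skew):
    """Behavioural model of one T-error pipeline stage.

    new_value    : value driven onto the link in this cycle
    old_value    : value that was on the link in the previous cycle
    arrival_time : time at which new_value becomes stable at the receiver
    period       : clock period of ck
    skew         : delay of ckd with respect to ck (assumed large enough for the link)

    Returns (output_value, error_detected, extra_cycles).
    """
    if arrival_time > period + skew:
        # The delayed flip-flop would also miss the data: outside the detection window.
        raise ValueError("arrival time exceeds the T-error detection window")
    # Main flip-flop samples at ck: it sees stale data if the new value is late.
    main_sample = new_value if arrival_time <= period else old_value
    # Delayed flip-flop samples at ckd and is assumed error-free (reference data).
    reference = new_value
    error = main_sample != reference
    # On a mismatch, the reference data is sent at the next ck edge.
    return reference, error, (1 if error else 0)
```

The `ValueError` branch corresponds to the limitation that errors outside the detection window cannot be handled by this scheme.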

The work reports a frequency boost of about 50% while introducing a much smaller correction overhead. However, T-error supports only partial fault tolerance; for instance, errors with large time constants would not be detected [16]. Moreover, real-time systems that operate in a noisy environment must avoid this flow control mechanism. In addition, T-error is more complex than the other flow control techniques [8].

1.3 Reconfigurable NoCs

As introduced at the beginning of Chapter 1, NoC has emerged as the design paradigm for scalable Systems-on-Chip with demanding bandwidth requirements. However, current NoCs are still not flexible enough to support dynamic communication behaviors. One way to provide such flexibility is the reconfigurable NoC. Thanks to their ability to reconfigure hardware dynamically, reconfigurable NoCs allow many hardware tasks to be mapped onto the same hardware platform, which reduces the area and power consumption of the design.

Depending on when the NoC is reconfigured, reconfigurable NoCs can be classified into two types: run-time reconfigurable NoCs and design-time reconfigurable NoCs. In design-time reconfigurable NoCs, the hardware reconfiguration is performed at design time (before the system is executed). For example, the router introduced in [17] has ports which are divided into many small channels (also known as lanes), and the width and the number of lanes can be adjusted at design time to meet the flexibility and bandwidth requirements of the applications. In contrast, run-time reconfigurable methodologies allow NoCs to autonomously adapt their structure and behavior during operation. Because of this high flexibility, run-time reconfigurable NoCs have received a lot of attention. The parameters or elements of a NoC that can be made reconfigurable at run time are the topology (e.g., [18], [19]), the links (e.g., the direction or bandwidth of links [20]), and the router (e.g., [2], [21], [22]). For example, a reconfigurable router can support adaptive routing techniques, adaptive switching techniques, and an adaptive number of VCs (virtual channels) and buffer size per VC, so that they can be dynamically adjusted based on the traffic load and network status.

In most reconfigurable NoC architectures, the reconfiguration process changes the communication infrastructure (routers, NIs and links), the topology and the communication protocols (switching style, routing algorithm, communication control mechanism, etc.) to obtain a new configuration of the NoC. A change in any one of these factors affects the others. For instance, when the communication infrastructure is reconfigured, the communication protocol has to be updated to guarantee the operation of the NoC. Conversely, to implement a change in the communication protocol, the communication infrastructure has to be reconfigured, which can result in a new topology. For example, when faults are detected in links or routers, those links and routers are disabled or prohibited, which changes the topology. The routing algorithm must then be changed to ensure communication between IPs, which can in turn lead to changes in the router or NI architectures. In another case, NI architectures should be able to resize flits to adapt to the new link width when the links are changed to a new configuration. Therefore, in reconfigurable NoC architectures, the communication control mechanism should be able to adapt to these changes in order to fulfil its responsibility.


1.4 Conclusions

In this chapter, we introduced the basic concepts of the network-on-chip (NoC) paradigm and gave an overview of reconfigurable NoCs. In addition, several previous works related to communication control mechanisms were reviewed and discussed to present the state of the art of the topic. In the next chapter, we introduce our proposed communication control mechanism for reconfigurable NoC architectures.


PROPOSED COMMUNICATION CONTROL MECHANISM IN RECONFIGURABLE NOCS ARCHITECTURES

As mentioned in the previous chapter, the communication control mechanism in reconfigurable NoC architectures should be able to adapt to the new configurations of the NoC after the reconfiguration process. In this chapter, we introduce our proposed communication control mechanism for the targeted reconfigurable NoC architectures, which are developed at the Key Laboratory for Smart Integrated Systems (SIS lab).

2.1 Target reconfigurable NoC architectures

2.1.1 General parameters

As described in Chapter 1, many architectures have been proposed for NoC implementations, each with its own specification and features. The communication control mechanism is therefore also specific to each type of NoC architecture. In this work, our targeted reconfigurable NoC architectures have the following common parameters:

 Topology: 2D mesh

 Switching style: packet-switching

 Routing mode: wormhole

 Routing algorithm: deterministic source routing, which can be changed at run time at the routers


In this thesis, we introduce a communication control mechanism that targets the RNoC platform, a reconfigurable NoC platform developed at the SIS lab, VNU University of Engineering and Technology. The next section briefly introduces the RNoC platform.

2.1.2 RNoC platform

The RNoC [2], whose router architecture is shown in Figure 2-1, provides a communication solution for reconfigurable NoCs based on the RNoC router. Thanks to the Routing Modification (RM) port (a virtual port), the RNoC router can route communication data dynamically according to the mode of the router. Whenever the target routing path is blocked, either by unwanted defects or intentionally by a software programmer to meet the requirements of an application, the network changes from normal mode to reconfiguration mode. In the normal communication mode, the deterministic source XY routing algorithm is used; a packet is therefore transmitted along a predetermined path that is stored in a routing table in the Network Interface of the source node. In the reconfiguration mode, a West-First algorithm with a proposed prohibited router surrounding technique is applied.

Figure 2-1: RNoC router architecture [2]
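The normal-mode behaviour, in which the source NI stores a complete XY path for each destination, can be sketched as below. The coordinate convention (x grows eastwards, y grows northwards), the direction labels and the function names are assumptions made for this illustration; they are not taken from [2].

```python
def xy_route(src, dst):
    """Deterministic XY source route: travel along X first, then along Y.

    src, dst : (x, y) router coordinates in a 2D mesh
    Returns the list of output directions, ending with "LOCAL" at the destination.
    """
    (sx, sy), (dx, dy) = src, dst
    hops = []
    hops += ["E" if dx > sx else "W"] * abs(dx - sx)  # X dimension first
    hops += ["N" if dy > sy else "S"] * abs(dy - sy)  # then Y dimension
    hops.append("LOCAL")  # deliver to the local IP at the destination router
    return hops


def build_routing_table(src, destinations):
    """Routing table kept in the source NI: one precomputed path per destination."""
    return {dst: xy_route(src, dst) for dst in destinations}
```

Since the whole path is decided at the source, replacing an entry of this table is enough to change the route of all later packets to that destination.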


The proposed prohibited router surrounding technique changes the routing path so that data is routed around the prohibited router; it is implemented in the RM port of the router that precedes the prohibited router. Depending on the positions of the prohibited router and the destination router, the routing path can be modified in different ways. The authors of RNoC divide the reconfiguration strategy into three cases. In case 1, the prohibited router is in the middle of a straight segment of the routing path, and the proposed technique changes the routing path as shown in Figure 2-2(a) (the dashed line is the old routing path and the solid line is the new one). Figure 2-2(b) shows examples of case 2, in which the prohibited router is at a corner of the routing path.

Figure 2-2: (a) The prohibited router is in the middle of a straight segment of the routing path; (b) the prohibited router is at the corner of the routing path [2]

In case 3, the prohibited router appears just before or just after a corner of the routing path. Figure 2-3 shows some examples of case 3, including a special situation in which the data is routed back out of the router that precedes the prohibited router.


2.2 Proposed Communication control mechanism

To deal with the increase in packet delay that appears after the routing path is reconfigured as described in the previous subsection, we propose a communication control mechanism (CCM) for reconfigurable NoCs. This CCM aims not only to avoid losing data during transmission but also to reduce the packet delay after the reconfiguration process. The block diagram of our mechanism is shown in Figure 2-4.


Figure 2-4: Block diagram of proposed communication mechanism

The Observator module collects and analyses information related to the current CCM, for instance the routing path. If the Observator finds that the current CCM needs to be updated, the received information is sent to the Control making module. This module creates the parameters of the new CCM and sends them to the Actuator module. The Actuator implements the new CCM and applies it to the network under control. For instance, the old routing path, which contains a prohibited node, is replaced by a new routing path. Since there is no prohibited node in the new routing path, the reconfiguration processing unit (RM port) no longer needs to process head flits (or all flits of a packet), which decreases the packet delay.
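The interaction of the three modules can be sketched for the routing-path example. All function names, the representation of a path as a list of router coordinates and the validity check are assumptions made for illustration; the sketch does not describe the hardware architecture of the proposed CCM.

```python
def observator(stored_path, observed_path):
    """Return the observed path if the current CCM needs an update, otherwise None."""
    return observed_path if observed_path != stored_path else None


def control_making(observed_path, prohibited):
    """Create the parameters of the new CCM: here, a validated new routing path."""
    if any(node in prohibited for node in observed_path):
        raise ValueError("new path still crosses a prohibited node")
    return list(observed_path)


def actuator(routing_table, dst, new_path):
    """Apply the new CCM to the network under control by replacing the stored path."""
    routing_table[dst] = new_path


def update_ccm(routing_table, dst, observed_path, prohibited):
    """One pass through Observator, Control making and Actuator.

    Returns True if the routing table entry for dst was replaced.
    """
    report = observator(routing_table[dst], observed_path)
    if report is None:
        return False
    actuator(routing_table, dst, control_making(report, prohibited))
    return True
```

Once the stored path has been replaced, later packets carry a path without the prohibited node, so a second pass with the same observation leaves the table unchanged.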

To implement our CCM, we propose an end-to-end flow control mechanism in the transport layer and a node-to-node flow control mechanism in the link layer. The node-to-node flow control ensures lossless transmission between the mentioned modules through the network, while the end-to-end flow control realizes the operation of the Observator, Control making and Actuator modules. In other words, our node-to-node flow control is responsible for avoiding data loss during transmission, and our end-to-end flow control is responsible for reducing the packet delay after the reconfiguration process.

Our node-to-node flow control mechanism is based on handshaking signals. It ensures that data at the upstream node is sent only when the downstream node is ready to receive it (the input buffer of the downstream node is not full). Specifically, we use a send/accept protocol whose operation is illustrated in Figure 2-5.
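The send/accept rule can be sketched at cycle level as follows. The class `InputPort`, the FIFO `depth` and the function `link_cycle` are hypothetical names; the sketch only captures the rule that a flit moves when send and accept are both asserted, and does not reproduce the signal timing of the actual protocol.

```python
from collections import deque


class InputPort:
    """Input port of the downstream node with a bounded FIFO (depth is assumed)."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = deque()

    @property
    def accept(self):
        # accept is asserted while the input buffer is not full
        return len(self.fifo) < self.depth


def link_cycle(upstream_queue, port):
    """Simulate one clock cycle on a link; return True if a flit was transferred."""
    send = bool(upstream_queue)  # upstream asserts send when it has a flit to offer
    if send and port.accept:
        port.fifo.append(upstream_queue.popleft())
        return True
    return False  # the flit, if any, stays in the upstream buffer
```

A flit that is not accepted remains at the head of the upstream queue, so nothing is lost while the downstream buffer is full.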
