
Model-Based Design for Embedded Systems, Part 14


FIGURE 12.7

Reconfigurable socket abstraction based on the “PLBv46 PLBv46 bridge” architecture. The “PLBv46 slave” and “PLBv46 master burst” blocks are standard IP components, and all blocks except the DCR slave block are part of the bridge. Bus macros are implicitly present on all signals crossing the boundary of the reconfigured region.

[Block diagram labels: slave, PLBv46 master burst, slave buffer interface, LocalLink write buffer, LocalLink read buffer, reset logic, interrupt generation, bridge control logic, bridge status signals, interrupt request, reset, control bus, DCR bus, bus macro enable, reset request, reconfigurable socket, reconfigured region.]

An alternative is to architect the interface around a bus bridge, with independent busses in the static region and in the reconfigurable region. The design of the socket is based on partitioning the Xilinx “PLBv46 PLBv46 bridge” IP [23], as shown in the block diagram in Figure 12.7. Internally, this core is based around 32-bit fixed-width data FIFOs and a small number of control signals. Most of the bridge is treated as part of the static region, with only a small amount of logic required in the reconfigurable region to complete the bridge. In addition to the bus interface, which is primarily used to interface to the reconfigured region, the socket core also contains a control interface (based on the DCR protocol [7]) which is used to generate an independent reset signal to the reconfigurable region and to force signals driven by the reconfigurable module to stable values during reconfiguration.

12.5.3 Direct Memory Access Interfaces

The bus interface above is a generic and flexible interface, which can be used to communicate with the reconfigured portion of the system in different ways. For instance, it may be used by the processor to both send and receive data from the reconfigured region or as a control interface to set parameter values of IP cores executing in the reconfigured region. However, it does have several disadvantages. Primarily, the bandwidth of data to or from the processor is limited because of the overhead of bus arbitration and the fact that the memory range is treated as uncached I/O transactions. Although performance could be improved somewhat for large transactions by using DMA engines or treating data transfer regions as cached and manually managing cache coherency, this would significantly increase the complexity of the processor software. Secondly, many FPGA algorithms require access to external memory for buffering data until it can be processed. For instance, in a network router, packet data may need to be stored until a routing decision can be made, or in a streaming video system, several frames of video data may need to be stored to analyze object motion between frames.

Because of these limitations, it is best to consider the bus interface above as primarily an interface used for low-bandwidth control and configuration information. In systems that require higher bandwidth communication, or direct access to external memory, the control interface can be augmented with additional interfaces to memory. Although it may seem straightforward to include a complementary bus bridge that can be driven by the reconfigured region to provide this functionality, this tends not to be the highest bandwidth option since performance can be limited by the arbitration logic of the PLB bus. This logic is heavily pipelined in order to maximize the bus throughput under a wide variety of usage, typically incurring three cycles of latency before a slave can respond to a bus access.
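The cost of a fixed response latency on short transfers can be illustrated with a rough model. The sketch below assumes, purely for illustration, that each transaction pays the three-cycle latency mentioned above once and then moves one data beat per cycle. It ignores the overlap between transactions that a pipelined bus provides, so it is a pessimistic simplification rather than a model of the actual PLB arbiter. The function name and the burst lengths are hypothetical.

```python
def bus_efficiency(beats: int, latency_cycles: int = 3) -> float:
    """Fraction of bus cycles that carry data for a single transaction.

    Assumption: one data beat per cycle after a fixed startup latency.
    """
    if beats <= 0 or latency_cycles < 0:
        raise ValueError("beats must be positive and latency non-negative")
    return beats / (beats + latency_cycles)


if __name__ == "__main__":
    # Hypothetical burst lengths chosen only to show the trend.
    for beats in (1, 4, 16):
        share = bus_efficiency(beats)
        print(f"{beats:2d}-beat transfer: {share:.0%} of cycles carry data")
```

Under these assumptions a single-beat access uses only a quarter of the cycles it occupies, while longer bursts amortize the latency, which is in line with treating this interface as a control path rather than a data path.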

One solution is to provide an interface connected directly to the native port interface (NPI) of the Xilinx MPMC IP core, as shown in Figure 12.8. Typically, this interface exhibits both lower latency and higher bandwidth than the PLB bus. Although the MPMC must still arbitrate between different ports attempting to use the memory controller, this arbitration can be performed locally within the memory controller and concurrently with the data being provided. The only disadvantage of connecting directly to the memory controller is that other IP cores in the static region cannot be accessed from the reconfigured region. However, since in the SRP usage model these IP cores are likely being managed by device drivers in the operating system of the processor, it is questionable whether such access should be allowed anyway.

[Figure 12.8: block diagram labels: external memory (e.g., DDR/DDR2), arbiter, multiported memory controller, physical interface.]

12.5.4 External Interfaces

In addition to communicating with the static region, a reconfigurable module may also communicate with other interfaces external to the FPGA. In order to accomplish this, a reconfigurable region may include external I/O pins and/or high-speed serial transceivers. For the most part, these resources can be treated as any other FPGA primitives and can be placed and routed as usual.

However, there is some complexity with regard to external I/O pins, since in many FPGA designs, the input/output buffer (IOB) primitives representing external I/O pins are not explicitly instantiated in a user design but are inferred in the synthesis process. Normally in a hierarchical design, the netlist can be synthesized using a special option to disable inference of these primitives, since they will be inferred or instantiated during synthesis of the toplevel design. However, when building a generic FPGA platform, relying on this may not be desirable, since the reconfigured region may require more control over the configuration of these primitives. In other cases, exactly which IOB primitives are explicitly instantiated in a reconfigurable module and which ones are not may not be known when the static design is synthesized and implemented. One way to solve this is to not expose any I/O pins of the reconfigurable region as external signals of the static region, implying that synthesis of the static design will never include IOB primitives for these pins. When a reconfigurable module is synthesized, signals interfacing with the static region are individually tagged with the constraint BUFFER_TYPE set to NONE, indicating that no IOB primitives should be inferred for those signals.

High-speed serial transceivers also have additional design complexity, since each transceiver is associated with specialized clock resources in the FPGA. These clock resources typically include phase-locked loops for clock synchronization and dedicated clock distribution paths and may be shared between transceivers. From the perspective of building FPGA platforms, this resource sharing combined with how transceivers are grouped into configuration frames may need to be considered during the floorplanning stage in order to gain maximum usage of the available transceivers.

[Figure 12.9: design flow diagram. Static design flow labels: EDK base design, EDK platgen (.mhs), floorplanning, UCF merge (.ucf), .ngc, .dts, PR-enabled NGDBuild, Map, and PAR, PR-enabled bitgen, EDK genace.tcl, .bit, .ncd, static.used, static.ace. Module design flow labels: hand design (.ucf, .ngc), PR-enabled NGDBuild, Map, and PAR (.ncd), PRMergeDesign + PR-enabled bitgen, meta-information, C code, gcc + objcopy, EDK genace.tcl, partial.bit, merged.ace, configure.elf.]

[…] static.used for later use. Since by default the interface with the reconfigured region is driven to an idle state, the resulting bitstream can be used in a system without programming the remainder of the FPGA. The device tree for a particular design is generated from the EDK design and, after being converted to a binary device tree blob, can be included in the Linux kernel image or stored as the initial value of a BRAM in the bitstream. Lastly, EDK is used to package the FPGA bitstream with the Linux kernel binary in a bootable image that can be used with Xilinx SystemACE [24] to boot the kernel.

The right-hand side of Figure 12.9 shows a second pass for the implementation of a reconfigurable module. During this pass, the logic of the reconfigurable module is implemented together with a small portion of the static logic called the “context logic.” The context logic is necessary to provide the context of the reconfigurable module, so that hierarchical names in the design and location constraints for clock signals and bus macros can be preserved. The design constraints for implementation are created by merging the design constraints from the static design with any additional design constraints specific to the reconfigurable module, such as pin location constraints. During this pass, the routing resources in the file static.used are excluded from use, since these resources are already used in the static design. The final bitstream for the reconfigurable module is generated by first merging the design database (contained in an ncd file) from both passes, ensuring that the configuration bits used in the static design are programmed correctly. In addition, design rule checks and timing analysis can be applied to the merged design database, to ensure that individual passes were implemented correctly. From the merged design database, it is possible to generate both a partial bitstream that can be used after configuration with the static bitstream and a merged bitstream which can be used as an initial configuration bitstream, with the reconfigurable module already loaded.

To enable reconfiguration in a Linux system, the partial bitstream is encapsulated with the Linux code for performing PR and the meta-information about the reconfigurable module, to generate a Linux executable, as described in Section 12.6.

12.6 Managing Partial Reconfiguration in Linux

Two device drivers are used to manage the reconfiguration process. Primarily, the device driver for the ICAP device performs the actual reconfiguration. When a partial bitstream is written to this device (for instance, using the cp command or the write() system call), the bytes are transferred to the ICAP. Since the device driver does not inspect or modify the stream of bytes, the data being written must include the appropriate control words, as expected by the configuration interface [26]. The device driver also includes simple locking of the ICAP resource, in order to prevent different processes from unexpectedly interleaving accesses to the ICAP. Readback is also possible using this device driver by writing the correct readback request bitstream to the ICAP and subsequently reading data (using the read() system call).

The second device driver used to manage reconfiguration is associated with the reconfigurable socket core. This driver exports a character interface to which meta-information about a reconfigurable module can be written. A simple way of representing this meta-information is in the form of an array of struct platform_device, a data structure which is used internally by Linux to represent devices. A more complex, but perhaps more robust representation of meta-information could be an additional device tree blob. This meta-information is parsed and checksummed and, if valid, is used to notify the Linux kernel of the presence of new devices, which can then be bound to other device drivers. An invalid checksum is interpreted as an indication to unbind any previously loaded devices and release ownership of the reconfigured region. Secondarily, this device driver also enables and disables the bus macros between the static region and the reconfigured region, and controls the reset of the reconfigured region. As with the ICAP device driver, the socket device driver includes a simple locking mechanism in order to prevent a process from unexpectedly reconfiguring an active region in use […]

FIGURE 12.10

The reconfiguration process

[Flowchart steps: reconfigure FPGA; notify kernel of devices; load kernel modules; enable bus macros; reset reconfigurable module; processing; unload kernel modules; release devices; disable bus macros.]

[…] by loading the appropriate kernel modules, and the Linux kernel binds those device drivers to the reconfigured devices. At this point, application code may use the device drivers to communicate with the reconfigured region. A similar sequence of steps in reverse order occurs to unbind the device drivers and release the reconfigured region so that different processing may occur.
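The user-space side of this sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the device node paths are hypothetical, and the meta-information framing (payload followed by a CRC-32) is invented here only to show the valid and invalid checksum cases. The actual drivers define their own format, and loading kernel modules is left out.

```python
import zlib

# Hypothetical device node paths; the text does not name them.
ICAP_DEV = "/dev/icap0"
SOCKET_DEV = "/dev/socket0"


def frame_meta(meta: bytes) -> bytes:
    """Append a CRC-32 to the meta-information (framing is an assumption)."""
    return meta + zlib.crc32(meta).to_bytes(4, "big")


def meta_is_valid(framed: bytes) -> bool:
    """Check the trailing CRC-32, as a socket driver might before binding."""
    if len(framed) < 4:
        return False
    body, stored = framed[:-4], framed[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == stored


def load_module(bitstream: bytes, meta: bytes,
                icap_path: str = ICAP_DEV,
                socket_path: str = SOCKET_DEV) -> None:
    # Step 1: reconfigure the FPGA. The driver forwards bytes unmodified,
    # so the bitstream must already contain the configuration control words.
    with open(icap_path, "wb") as icap:
        icap.write(bitstream)
    # Step 2: notify the kernel of new devices with valid meta-information.
    with open(socket_path, "wb") as sock:
        sock.write(frame_meta(meta))
    # Loading kernel modules for the new devices is omitted in this sketch.


def unload_module(socket_path: str = SOCKET_DEV) -> None:
    # A deliberately invalid checksum: per the text, this is interpreted as
    # a request to unbind devices and release the reconfigured region.
    with open(socket_path, "wb") as sock:
        sock.write(b"\x00" * 8)
```

Passing the paths as parameters keeps the sketch testable against ordinary files; on a real system both calls would need the privileges discussed later in this section.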

Since the ICAP device and the control interface of the socket are exposed through device drivers, it is relatively straightforward to implement reconfiguration through a regular user process. One possibility for implementing this involves linking the bitstream and meta-information into a single executable along with the code for reconfiguration. The process created when this executable is executed can be controlled through any operating system mechanism (such as POSIX signals) to manage the life cycle of the module loaded in the FPGA. The executable can also be linked together with other application code, resulting in a familiar processor-centric usage model for the FPGA fabric. This approach is similar in spirit, but greatly different in implementation from that proposed in [18], which performs essentially the same processes using the Linux kernel’s ability to implement new executable formats.

It is important to recognize that although the reconfiguration process is managed by a user process, it must be treated as a privileged operation executed as the root user, since there are many places where both unintended errors and malicious attacks may result in unintended behavior. Some of these places are not specific to the PR process, such as loading kernel modules, whereas others are more subtle vulnerabilities. For instance, as noted before, partial bitstreams have significant constraints on how they are constructed and are specific to a particular implementation of the static system. More directly, it is possible to trigger reconfiguration of the FPGA through the ICAP interface, resulting in the loss of the current state of the system. If the bus macros are enabled during PR, then it is likely that glitching on the interface signals will result in unintended behavior of the static system.

One particularly common usage error is simply attempting to load a partial bitstream that does not correspond to the current implementation of the static design. This may happen during development when a modification is made to the static region, but a designer neglects to reimplement a reconfigured module. One way of avoiding such errors is to prepend each partial bitstream with a hash generated from the static design. This hash can also be stored in the static design, possibly in the device tree blob, and checked before the partial bitstream is loaded into the FPGA. If the partial bitstream is not signed properly, then the reconfiguration process can be halted without affecting the operation of the static design. This technique can be simply applied to prevent unintended errors, or adapted using more cryptographically secure techniques to prevent malicious attacks [2,4].
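A minimal sketch of the hash-prefix check follows. The choice of SHA-256, the idea of hashing the static bitstream bytes, and all function names are assumptions made for illustration; the text does not specify the hash or what exactly is hashed. As noted above, a plain hash only guards against accidental mismatches, since anyone can recompute it.

```python
import hashlib
import hmac

# SHA-256 is an assumption; the text does not name a hash function.
DIGEST_LEN = hashlib.sha256().digest_size


def static_design_hash(static_bitstream: bytes) -> bytes:
    """Hash identifying one particular implementation of the static design."""
    return hashlib.sha256(static_bitstream).digest()


def tag_partial(partial: bytes, static_hash: bytes) -> bytes:
    """Prepend the static-design hash to a partial bitstream at build time."""
    if len(static_hash) != DIGEST_LEN:
        raise ValueError("unexpected hash length")
    return static_hash + partial


def check_and_strip(tagged: bytes, expected_hash: bytes) -> bytes:
    """Return the raw partial bitstream, or raise before anything is loaded."""
    prefix, payload = tagged[:DIGEST_LEN], tagged[DIGEST_LEN:]
    if not hmac.compare_digest(prefix, expected_hash):
        raise ValueError("partial bitstream does not match the static design")
    return payload
```

In use, the expected hash would be read from wherever the static design stores it (for example, a device tree property), and only the payload returned by `check_and_strip` would be written to the ICAP device.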

12.7 Putting It All Together

This section illustrates an SRP design targeted at a variant of the WARP software-defined radio hardware built by Rice University [12]. Since the original hardware is based on an older Virtex 2 Pro FPGA, we present a design based on an updated Virtex 4 FX 100 device in order to better […]

[Figure 12.11: block diagram of the static subsystem. Recoverable labels: PPC405 (ppc_virtex4 v2.00.b), interrupt controller (xps_intc v1.00a), multiported memory controller (mpmc v3.00b), Ethernet MAC (xps_ll_temac v1.01a), plb, sdma, reconfigurable socket, reconfigured region.]

[…] a bridge from wired Ethernet to a two-radio MIMO system. The design uses a processor to manage the packet headers and to perform configuration management of the radios, while packet payloads are communicated directly between the wired and wireless network interfaces using direct memory access to a processor-managed memory buffer. In the reference design, the packet payload buffer is implemented in BRAM and communicated through a PLB bus. In the reconfigurable design, we assume that the packet payload buffer is implemented in external DRAM, which must be accessed from the reconfigurable region through a separate port of the memory controller. As a nonreconfigurable system, this design uses approximately 50% of the device (21294 of 42176 slices).

The design of the static subsystem is shown in Figure 12.11. This design is architected around the PowerPC 405 processor core and was largely generated using the Base System Builder capability in Xilinx EDK. Standard serial port and Ethernet IP cores provide external connectivity. Access to external 64-bit-wide DDR2 SDRAM, including DMA access for the Ethernet core, is provided by the Xilinx MPMC IP core. In this system, the processor, memory bus, and memory controller are designed to be “quasi-synchronous,” meaning that clocks must be edge-aligned. Based on the speeds of the individual components, a design point was chosen targeting a slow speed grade FPGA (−10) with the memory bus clocked at 83.3 MHz, the memory controller clocked twice as fast (166.6 MHz), and the processor clocked three times as fast (250 MHz).

FIGURE 12.12

Placed and routed design of an FPGA processor platform, targeting a Virtex 4 FX 100.

[Layout labels: reconfigured region, ICAP interface, control interface bus macros, memory interface bus macros, utilized PowerPC core, static region.]

The FPGA layout of the design is shown in Figure 12.12, overlaid with the PR floorplanning constraints. The static region is at the south of the chip, and is exactly two configuration frames tall. This layout provides approximately 8600 slices and 128 external I/O pins, which accommodates both the logic requirements of a simple processor design, and the I/O pin requirements of a 64-bit DDR2 memory interface. A significantly smaller region would fail to provide enough logic cells for the static design, while a larger region would allocate too many pins to the static region, which would be difficult to access from the reconfigurable region.

Note that the majority of the routed signals are contained within the floorplanned area for the static region. The routes entering the top region connect primarily to external I/O pins and FPGA resources, such as clock buffers and the ICAP, located in the center column of the FPGA. Some routes into the top region also connect to the PowerPC cores. Although only one PowerPC is actually used in the static design, current versions of the EA PR tools do not allow PowerPC cores to be part of the reconfigured portion of the design. Hence, this design instantiates both PowerPC cores in the static region, in order to enable use of the JTAG chain, which is assumed to connect through both cores.

The device tree for this design is shown in Figure 12.13. Since the targeted board includes Xilinx SystemACE, this is used to configure the FPGA and initialize external memory with the kernel image. The compressed device tree blob is initialized in the BRAM at address 0xfffff800 and decompressed by the Linux bootwrapper executing out of external memory. The root filesystem is stored on an external file server and loaded over the network interface using the NFS protocol.

[…] as processor cores, and where physical interfaces to the rest of the system are highly flexible and incorporate many features that cannot be easily modeled even at the circuit and gate level.

However, using the architectural features of some FPGAs, such as PR, higher level platforms can be constructed that abstract many of these details and are more appropriate for mapping from a high-level design tool. This chapter has particularly shown how this technique can abstract the complexities associated with including a control processor and operating system as part of an FPGA platform.

    […]
            reg = < 41300000 10000 >;
            xlnx,family = "virtex4";
        } ;
    } ;
    plbv46_dcr_bridge_0: plbv46-dcr-bridge@80700000 {
        compatible = "xlnx,plbv46-dcr-bridge-1.00.a";
        dcr-access-method = "mmio";
        dcr-controller ;
        dcr-mmio-range = < 80700000 1000 >;
        dcr-mmio-stride = <4>;
    } ;
    rs232: serial@84000000 {
        clock-frequency = <4f790d5>;
    } ;
    xps_intc_0: interrupt-controller@81800000 {
        #interrupt-cells = <2>;
        compatible = "xlnx,xps-intc-1.00.a";
        interrupt-controller ;
        reg = < 81800000 10000 >;
        xlnx,num-intr-inputs = <5>;
    } ;
    xps_socket_0: xps-socket@50000000 {
        compatible = "xlnx,xps-socket";
    […]

FIGURE 12.13

Device tree

References

1. B. Blodget, P. James-Roxby, E. Keller, S. McMillan, and P. Sundararajan. A self-reconfiguring platform. In Proceedings of the International Field Programmable Logic and Applications Conference (FPL), Lisbon, Portugal, 2003. Lecture Notes in Computer Science, Vol. 2778, Springer-Verlag, September 2003.

2. J. Castillo, P. Huerta, V. Lopez, and J. Martinez. A secure self-reconfiguring architecture based on open-source hardware. In International Conference on Reconfigurable Computing and FPGAs (ReConFig), Puebla City, Mexico, September 2005.

3. J. Corbet, A. Rubini, and G. Kroah-Hartman. Linux Device Drivers. O’Reilly, Sebastopol, CA, 3rd edition, 2005.

4. R. Fong, S. Harper, and P. Athanas. A versatile framework for FPGA field updates: An application of partial self-reconfiguration. In Proceedings of the IEEE International Workshop on Rapid System Prototyping, San Diego, […]

[…]

7. IBM. Device control register bus architecture specifications version 3.5, January 2006.

8. IBM. 128-bit processor local bus architecture specifications version 4.7, May 2007.

9. I. Kuon and J. Rose. Measuring the gap between FPGAs and ASICs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 26(2):203–215, February 2007.

10. E. A. Lee and S. Neuendorffer. Actor-oriented models for codesign: Balancing re-use and performance. In S. Shukla and J.-P. Talpin (editors), Formal Methods and Models for System Design: A System Level Perspective, pp. 33–56, Kluwer, Norwell, MA, 2004.

11. M. Majer, J. Teich, A. Ahmadinia, and C. Bobda. The Erlangen slot machine: A dynamically reconfigurable FPGA-based computer. Journal of VLSI Signal Processing Systems, 47(1):15–31, March 2007.

12. P. Murphy, A. Sabharwal, and B. Aazhang. Design of WARP: A flexible wireless open-access research platform. In Proceedings of the European Signal Processing Conference (EUSIPCO), Florence, Italy, 2006.

13. A. Parsons et al. A scalable correlator architecture based on modular FPGA hardware and data packetization. In Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, November 2006.

14. A. Parsons et al. A scalable correlator architecture based on modular FPGA hardware and data packetization. Submitted to IEEE Transactions on Signal Processing, available at http://casper.berkeley.edu/ […]

[…]

17. P. Sedcole, B. Blodget, T. Becker, J. Anderson, and P. Lysaght. Modular dynamic reconfiguration in Virtex FPGAs. IEE Proceedings on Computers and Digital Techniques, 153(3):157–164, May 2006.

18. H. K.-H. So and R. W. Brodersen. Improving usability of FPGA-based reconfigurable computers through operating system support. In Proceedings of the International Field Programmable Logic and Applications Conference (FPL), Madrid, Spain, 2006.

19. Sun. OpenSPARC web page, available at http://www.opensparc.net, accessed on March 7, 2008.

20. Triscend. Triscend E5 configurable system-on-chip platform datasheet, July 2001, v1.06.

21. M. Uhm and J. Bezile. Meeting software defined radio cost and power targets: Making SDR feasible. In Military Embedded Systems, pp. 6–8, May 2005.

22. J. Williams and N. Bergmann. Embedded Linux as a platform for dynamically self-reconfiguring systems-on-chip. In Proceedings of the International Multiconference in Computer Science and Computer Engineering (ERSA), Las Vegas, NV, June 2004.

23. Xilinx. PLBv46 to PLBv46 Bridge Data Sheet, ds618 edition. Version 1.00.a, available at http://www.xilinx.com/bvdocs/ipcenter/data_sheet/plbv46_plbv46_bridge.pdf, accessed on March 6, 2008.

24. Xilinx. Embedded System Tools Reference Manual, ug111 v9.2 edition, September 2007.

25. Xilinx. PLBV46 Interface Simplifications, sp026 edition, October 2007.

26. Xilinx. Virtex-4 FPGA Configuration User Guide, ug071 v1.10 edition, April 2008.

27. Xilinx. Virtex-4 FPGA User Guide, ug070 v2.40 edition, April 2008.

28. K. Yaghmour. Building Embedded Linux Systems. O’Reilly, Sebastopol, CA, 2003.
