terms of the resulting language. We have tested this simulation approach in SystemC 2.0 [49], Java [3], and C++ with a thread library. In addition, we have devised a service-based formalism [62] that can effectively integrate models specified at different abstraction levels, in different specification languages, and with different MoCs. We also enhanced our simulation tool to support the co-simulation of these heterogeneous models. Further, this service-based formalism became the foundation of the second generation of
10.3.3.2 Formal Property Verification
Both academia and industry have long studied formal property verification, but the state-explosion problem restricts its usefulness to protocols and other high abstraction levels. At the implementation level or other low abstraction levels, hardware and software engineers have used simulation monitors as basic tools to check simulation traces while debugging designs.
Verification languages, such as Promela, which is used by the Spin model checker [28], allow only simple concurrency modeling and are not amenable to the system design specification, which requires complex syn-
formal semantics, automatically generates verification models for all the levels of the design [15].
Our translator automatically constructs the Spin verification model from the MMM specification, taking care of all system-level constructs. For example, it can automatically generate a verification model for the example in Figure 10.2 and verify the medium’s nonoverwriting properties. Further, as the translator refines the design through structural transformation and architectural mapping, it can prove more properties, including throughput and latency. This kind of property verification typically requires several minutes
of computation on a 1.8 GHz Xeon machine with 1 Gbyte of memory. When
approximate verification and provides the user with a confidence factor on the passing result.
10.3.3.3 Simulation Monitor
Simulation monitors offer an attractive alternative to formal property
[7] to specify quantitative properties. The system can automatically translate the specification to simulation monitors in C++ [16], thus relieving designers from the tedious and error-prone task of writing monitors in the simulator’s language. The monitors analyze the traces and report any LOC formula violations. Like any other simulation-based approach, this one can only disprove an LOC formula if it finds a violation; it can never conclusively prove the formula’s correctness, because that would require exhaustively analyzing traces. The automatic trace analyzer can be used in concert with model checkers. It can perform property verification on a single trace even when other approaches would fail because of their excessive memory and space requirements.
In our experience with applying the automatic LOC-monitor technique to large designs with complex traces, we have found that in most cases the analysis completes in minutes and consumes only hundreds of bytes of data memory to store the LOC formulas. The analysis time tends to grow linearly with the trace size, while the memory requirement remains constant regardless of the trace size.
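The constant-memory, linear-time behavior described above can be illustrated with a minimal sketch. It is not the generated LOC monitor of [16]; it assumes one hypothetical property (consecutive occurrences of an event are at most max_gap time units apart), and the names PeriodMonitor and count_violations are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical streaming monitor for one assumed property:
// consecutive timestamps of an event differ by at most max_gap.
// Only the previous timestamp is stored, so memory use does not
// depend on the trace length, and each trace entry is visited once.
class PeriodMonitor {
public:
    explicit PeriodMonitor(double max_gap) : max_gap_(max_gap) {}

    // Feed the next timestamp of the trace (must be non-decreasing).
    void observe(double t) {
        assert(!seen_ || t >= last_);
        if (seen_ && t - last_ > max_gap_) ++violations_;
        last_ = t;
        seen_ = true;
    }

    std::size_t violations() const { return violations_; }

private:
    double max_gap_;
    double last_ = 0.0;
    bool seen_ = false;
    std::size_t violations_ = 0;
};

// Convenience wrapper: scan a whole trace and count the violations found.
inline std::size_t count_violations(const std::vector<double>& trace,
                                    double max_gap) {
    PeriodMonitor monitor(max_gap);
    for (double t : trace) monitor.observe(t);
    return monitor.violations();
}
```

As with any trace monitor, a result of zero violations only says that this particular trace did not disprove the property.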
10.3.3.4 Quasi-Static Scheduling
We have developed an automatic synthesis technique called quasi-static scheduling (QSS) [19] to schedule a concurrent specification on computational resources that provide limited concurrency. The QSS considers a system to be specified as a set of concurrent processes communicating through FIFO queues, and generates a set of tasks that are fully and statically scheduled, except for data-dependent controls that can be resolved only at runtime. A task usually results from merging parts of several processes together and shows less concurrency than the initial specification. Moreover, the QSS allows interprocess optimizations that are difficult to achieve if processes remain separated, such as replacing interprocess communication with assignments.
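The effect of merging two processes into one task can be sketched by hand. The code below is an illustration of the outcome only, not of the QSS algorithm itself; the producer/consumer computation and the function names are assumptions made for the example.

```cpp
#include <cassert>
#include <queue>

// Original specification (assumed for illustration): a producer writes
// i*i into a FIFO and a consumer reads each value and accumulates it.
inline int run_with_fifo(int n) {
    std::queue<int> fifo;
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        fifo.push(i * i);          // producer process: FIFO write
        assert(!fifo.empty());
        int v = fifo.front();      // consumer process: FIFO read
        fifo.pop();
        sum += v;
    }
    return sum;
}

// Merged task: the write/read pair has been replaced by an assignment,
// so no queue and no interprocess communication remain.
inline int run_merged(int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        int v = i * i;             // assignment instead of write + read
        sum += v;
    }
    return sum;
}
```

Both functions compute the same result; the merged version simply has a fixed, sequential schedule.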
This technique proved particularly effective and allowed us to generate production-quality code with improved performance. Applying the QSS to a significant portion of an MPEG-2 decoder resulted in a 45% increase in the overall performance.
The assumptions that the QSS requires for the input specification form a subset of what the MMM can represent. Therefore, when integrating the
to verify if a design satisfies the required set of rules and how to convey all relevant design information to the QSS tool.
We addressed the first problem by providing a library of interfaces and communication media that implement a FIFO communication model. Those parts of the design optimized with the QSS need to use these communication primitives.
To convey relevant design information to the QSS, we use a back-end tool that translates a design to be scheduled with the QSS into a Petri net specification, which is QSS’s underlying model. The QSS then uses the Petri net to produce a new set of processes. These new processes show no interprocess communication because the QSS removes it. The processes communicate with the environment using the same primitives implemented in the library. The new code can thus be directly plugged into the MMM specification as a refinement of the network selected for scheduling.
10.4 METRO II Design Environment
the University of California, Berkeley, starting in 2006. The following
framework, and the mapping and execution semantics used.
10.4.1 Overview
that were carried out in collaboration with our industrial partners. These considerations are as follows.
1. Heterogeneous IP import. IP providers create models using domain-specific languages and tools. Requiring a single form of design entry in a system-level environment requires complex translation of the original specification into the new language while making sure that semantics is preserved. If different designs or different components within the same design can have different semantics, the heterogeneity has to be supported by the new environment. There are two main challenges that have to be addressed: wrapping and interconnecting the IP.
First, IPs can be described in different languages and can have different semantics that can be tightly related to a particular simulator. Importing the IP entails providing a way of exposing the IP interface. The user must have the necessary aids to define wrappers that mediate between the IP and the framework such that the behavior can be exposed in an unambiguous way.
Second, wrapped components have to be interconnected. Even if the interfaces are exposed in a unified way, interconnecting them is not usually a straightforward process. The data and the flow of control between IP blocks must be exposed in such a way that the framework has sufficient visibility.
2. Behavior-performance orthogonalization. For design frameworks that support multiple abstraction levels, different implementations of the same basic functionality may have the same behavioral representations but different costs. For instance, different processors will be abstracted into the same programmable components. What distinguishes them is the performance vs. cost trade-off. Moreover, not all metrics are considered or optimized simultaneously. It should be possible to introduce performance metrics during the design process, as the design proceeds from specification to implementation.
The specification of what a component does should be independent of how long it takes or how much power it consumes to carry out a task. This is the reason why we introduce dedicated components, called annotators, to annotate events with quantities.
A distinction has to be made between quantities used just to track the value of a specific metric of interest and quantities whose value is used for synchronization. For instance, time is used to synchronize actions and it is not merely a number that is computed based on the state evolution of the system. For quantities that influence the evolution of the system, special components, called schedulers, are provided to arbitrate shared resources.
The separation of schedulers from annotators allows for a simpler specification and provides a cleaner separation between behavior and performance. As a result, instead of a two-phase execution as in METROPOLIS, the execution semantics become three phase.
3. Mapping specification. Mapping relates the functional and architectural models to realize the system model. The specification of this mapping must be carried out such that there is minimal modification to the functional and architectural models themselves.
Following the PBD approach, we want to keep the functionality and the architecture separate. The implementation of the functionality on the architecture is achieved in the mapping step. In order to explore several different implementations with minimal effort, the design environment needs to provide a fast and efficient way of mapping without substantially modifying either the functional or the architectural model. In METROPOLIS, this is achieved by event-level synchronization constraints, as shown in [22]. While providing a powerful way to link the models, this approach breaks the encapsulation of the models by allowing constraints between arbitrary pairs of events and allowing access to any local variables in the scope of the events. Also, since there are no special declarative constructs for mapping, this process of finding events and setting up constraints is not easy for designers to manipulate and debug.
only accessible events for synchronization constraints are the begin/end events of interface methods in function and architecture models. Also, the only accessible values are parameters and return values of the interface methods. This coarser granularity and a more restrictive mapping approach maintain the IP encapsulation and make mapping more robust for designers.
in SystemC 2.2. The framework has been tested under Linux, Solaris, and Cygwin.
FIGURE 10.6
[Figure: METRO II core (event, method, port, interface, mapper, adaptor, constraint solver, annotator, scheduler, manager) built on sc_event and sc_module of the implementation platform, SystemC 2.2.]
The infrastructure is summarized in Figure 10.6. The sc_event and
The connection and coordination of components are carried out through
<p, T, V>, where p is a process that generates the event, T is a tag set, and
system and values are used to represent the states of the system.
Methods, interfaces, and ports are built on the concept of event. A method is characterized by a pair of begin and end events. An interface contains one or more methods. Ports are associated with interfaces, and only ports with compatible interfaces can be connected. A component can have zero or more ports. To handle different aspects of the events, special objects are defined, including annotators, schedulers, and constraint solvers. Annotators annotate events with quantities, schedulers coordinate the execution sequence of events, and constraint solvers resolve the declarative constraints on events. Mappers and adaptors are defined to interconnect components. Mappers bridge the function methods and architecture services. Adaptors interconnect components with heterogeneous MoCs. Finally, the manager coordinates the execution of all the objects using three-phase execution semantics.
a reader component, a mapper, a scheduler, and an annotator in a typical producer–consumer design example is shown in the figure. More details of these elements are introduced below.
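The relationship between events, methods, interfaces, and ports can be sketched in plain C++. This is a data-structure illustration only and does not use the METRO II or SystemC classes; every type and function name below is hypothetical, and "compatible interfaces" is simplified here to "same interface name, one required port and one provided port".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumed stand-in for an event <p, T, V>.
struct Event {
    std::string process;                  // p: process that generates the event
    std::map<std::string, double> tags;   // T: tag set, e.g. a time tag
    std::vector<int> values;              // V: values carried by the event
};

// A method is characterized by a pair of begin and end events.
struct Method {
    std::string name;
    Event begin;
    Event end;
};

// An interface contains one or more methods.
struct Interface {
    std::string name;
    std::vector<Method> methods;
};

enum class PortKind { Required, Provided };

// A port is associated with an interface (referenced by name here).
struct Port {
    PortKind kind;
    std::string interface_name;
};

// Build a method whose begin and end events belong to the given process.
inline Method make_method(const std::string& process, const std::string& name) {
    assert(!name.empty());
    Method m;
    m.name = name;
    m.begin.process = process;
    m.end.process = process;
    return m;
}

// Simplified compatibility rule (an assumption): a required port may be
// connected to a provided port only if both name the same interface.
inline bool can_connect(const Port& a, const Port& b) {
    return a.kind != b.kind && a.interface_name == b.interface_name;
}
```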
10.4.2.1 Components
A component is an object that encapsulates imperative code in a design, either functional or architectural. Components interface with other components via ports. There are two descriptions of component composition: atomic components and composite components. An atomic component is a block specified in some language and is viewed by the framework as a black box with only its interface information exposed. A composite component is a group of one or more objects as well as any connections between them. When an existing IP is being imported, it will be encapsulated by a wrapper, which translates and exposes the appropriate events and interfaces from the IP. The wrapped IP becomes an atomic component in the framework.
FIGURE 10.7 (code fragments; listing incomplete)
Event <proc, tag, value>

M2_INTERFACE(i_func_receiver) {
  M2_TWOARG_PROCEDURE(receive, void *, unsigned long);

M2_COMPONENT(Reader)
public:
  sc_process_handle this_thread;
  SC_HAS_PROCESS(Reader);
  Reader(sc_module_name n) : m2_component(n)
  }
  void main()
    this_thread = sc_get_current_process_handle();

c_double_handshake c("rendezvous");
Reader r("Reader");
M2_CONNECT(r, out_port, c, read_port);

//mapper definition
{
  receive_mapper(sc_module_name name) :
  {}
  void receive(void *
};
//instantiation
//mapping between ports

// setup physical time annotator
ptime_event_list.push_back(r.read_event_end);
std::map<const char*, double, ltstr> ptime_table;
ptime_table[w.write_event_end->get_full_name()]=2;
m2_physical_time_annotator* ptime=new
register_annotator(ptime);

// setup logical time scheduler
m2_logical_time_scheduler("lt_scheduler");
ltime->add_event(r.read_event_beg);
ltime->add_event(w.write_event_beg);
register_scheduler(ltime);
By setting constraints between events associated with the ports of different components, the execution of these components can be coordinated. There are two types of ports: required ports and provided ports. Required ports are used by components to request methods that are implemented in other components. Provided ports are used by components to provide methods to other components. Connections between components are made only between a required port and a provided port with the same interface. The execution semantics that coordinate a pair of required and provided ports will be introduced in Section 10.4.3.
10.4.2.3 Constraint Solvers
Constraints are used to specify the design via declarative means, as opposed to the imperative specification, which is contained in components. Constraints are described in terms of events: their status (enabled or disabled), their tags, and the values associated with them. The events referenced by constraints must be exposed by ports.
Constraint solvers are objects that resolve these declarative constraints during runtime. Depending on the status, tags, and values of the events, constraint solvers decide whether to enable or disable events, thereby coordinating the execution of components.
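As one concrete instance, a solver might enforce that two events are only ever enabled together in the same iteration. The sketch below is a plain C++ illustration under that assumption; SyncSolver, Status, and the event names are hypothetical and do not reproduce the framework's solver base class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class Status { Inactive, Proposed, Enabled };

// Hypothetical solver: each constraint names two events that may only
// be enabled together, i.e. in the same iteration.
class SyncSolver {
public:
    void add_constraint(const std::string& a, const std::string& b) {
        assert(a != b);
        pairs_.push_back(std::make_pair(a, b));
    }

    // Enable both events of a pair only if both are currently proposed;
    // otherwise leave them untouched until a later iteration.
    void resolve(std::map<std::string, Status>& events) const {
        for (const auto& pr : pairs_) {
            auto a = events.find(pr.first);
            auto b = events.find(pr.second);
            if (a == events.end() || b == events.end()) continue;
            if (a->second == Status::Proposed &&
                b->second == Status::Proposed) {
                a->second = Status::Enabled;
                b->second = Status::Enabled;
            }
        }
    }

private:
    std::vector<std::pair<std::string, std::string> > pairs_;
};
```

If only one of the two events has been proposed, resolve leaves it proposed, so the process that proposed it stays suspended until its partner arrives.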
Designers can derive various constraint solvers from the base class solver
constraint solver is provided. Two events that are specified in a synchronization constraint need to be enabled at the same time; during simulation, they need to be enabled in the same iteration. Further examples will be given in Section 10.4.3. Synchronization constraints are used for mapping between the functionality and the architecture, as is explained later.
10.4.2.4 Annotators and Schedulers
In METROPOLIS, both the performance annotation and the scheduling of events were carried out by a special type of component called a quantity manager. As stated before, to have a clearer separation of design concerns, these two aspects will be handled separately by annotators and schedulers.
The instantiation of a physical time annotator is shown in Figure 10.7. The r.read_event_end and w.write_event_end are events associated with a reader and a writer component, respectively. These two events are added to a list of events to be considered for annotation. In addition, a table indexed by these events is created along with the assigned time units required for execution (1 and 2 units, respectively). This list and the table are then added to the annotator object itself. If these events are present during the second phase of execution, their tags will be updated accordingly.
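The table-driven annotation just described can be sketched as follows. This is a plain C++ illustration, not the m2_physical_time_annotator class; the Event and PhysicalTimeAnnotator types are hypothetical, and the update rule (new tag = current time + table entry) is an assumption made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical event with a single physical time tag.
struct Event {
    std::string name;
    double time_tag;
};

// Hypothetical annotator: a table maps event names to the time units
// their execution is assumed to take.
class PhysicalTimeAnnotator {
public:
    void set_duration(const std::string& event_name, double units) {
        assert(units >= 0.0);
        table_[event_name] = units;
    }

    // Second phase: update the tag of every proposed event that appears
    // in the table; events not listed are left unchanged.
    void annotate(std::vector<Event>& proposed, double now) const {
        for (Event& e : proposed) {
            auto it = table_.find(e.name);
            if (it != table_.end()) e.time_tag = now + it->second;
        }
    }

private:
    std::map<std::string, double> table_;
};
```

With durations of 1 and 2 units for the reader and writer end events, annotating at time 10 would give tags 11 and 12 under the assumed rule.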
Schedulers coordinate the execution of the components by enabling or disabling the events proposed by the processes of the components. Based on the local state of the scheduler, the status of the events, as well as their values and tags, the scheduler determines the scheduling of the events. A base
schedulers. A logical time scheduler that schedules the events based on the physical time tags, and a round-robin scheduler that schedules the access to shared resources are provided as library schedulers. An example using the logical time scheduler is shown in the code snippet in Figure 10.7.
10.4.2.5 Mappers
mappers, which synchronize the begin and end events of the functional methods and architectural methods. Designers are only allowed to specify mapping at this service level, with access to the parameters and return values of the methods. When the begin/end events in the functional and architectural methods are synchronized, the parameters and return values can be transferred between the two models. For instance, a functional method may have one parameter that the corresponding architectural method is unaware of. During mapping, the value of this parameter can be passed to the architec-
the service level. The implementation of mappers is a synchronization constraint solver with value passing of parameters and return values.
An example of a mapper is shown in Figure 10.7. This mapper is called “receive_mapper” and is used to map the consumer in a producer–consumer design example to a processing element, p. During mapping, when the receive method is called by the functional model with two arguments, the mapper’s out_port will call the architectural model’s receive method that has three arguments. Also shown in Figure 10.7 are the instantiation of the mapper and the way the mapper is connected between the functional model and the architectural model.
10.4.2.6 Adaptors
There are various ways of handling heterogeneous MoCs in a design. One of the most common approaches is the hierarchical composition as in Ptolemy II [38]. With the hierarchical composition, each level of the hierarchy is homogeneous, i.e., a single MoC exists at each level, while different interaction mechanisms are allowed to be specified at different levels in the hierarchy [26]. To allow models in two heterogeneous MoCs to communicate, a third MoC may need to be found within which the two will be embedded.
In our experience, there is a strong need to interconnect heterogeneous models directly at the same level. For instance, the user may want to connect the output of a base-band-processing component (described by a dataflow model) to the input of an RF component (described by a continuous-time model). This way of handling complexity does not require changing the interface of a model in order to behave like another model. This is in line with one of our main concerns: being able to reuse IPs in different contexts.
The complexity of this approach lies in designing the correct interconnections between different MoCs. To bridge the different semantics of heterogeneous components, we use adaptors to modify events as they pass from one component to another. Denotationally, an adaptor is a relation,
another model.
Adaptors are connected with components through specialized adaptor channels. In the PBD methodology, adaptors can be regarded as the bridge between heterogeneous functional components or between heterogeneous
example of adaptors between dataflow and finite-state machine (FSM) semantics
centered around the connection and coordination of components. The execution semantics discussed here are involved in the simulation of a system for design-space exploration.
10.4.3.1 Three-Phase Execution
semantics, two other concepts must be introduced: process states and event states.
In Figure 10.8, the states that an event can have are shown. Events can be inactive, proposed, or annotated. All events begin as inactive. As the self loop shows, they can remain inactive indefinitely. When a method call on a required port generates an event, it becomes proposed. It will then be annotated. If the event is then deemed appropriate to enable (via a variety of scheduling decisions), it will transition to inactive again.
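The event life cycle above is a small state machine and can be written down directly. The sketch below is illustrative; the Trigger names are hypothetical, and leaving a proposed event unchanged when no annotation occurs is an assumption, since the text only states that a proposed event will then be annotated.

```cpp
#include <cassert>

enum class EventState { Inactive, Proposed, Annotated };

// Hypothetical triggers that move an event between states.
enum class Trigger { None, Propose, Annotate, Enable };

// Transition function for one event:
//   Inactive  --Propose-->  Proposed
//   Proposed  --Annotate--> Annotated
//   Annotated --Enable-->   Inactive
// Any other trigger leaves the state unchanged (self loop).
inline EventState next_state(EventState s, Trigger t) {
    switch (s) {
    case EventState::Inactive:
        return t == Trigger::Propose ? EventState::Proposed : s;
    case EventState::Proposed:
        return t == Trigger::Annotate ? EventState::Annotated : s;
    case EventState::Annotated:
        return t == Trigger::Enable ? EventState::Inactive : s;
    }
    return s;
}
```

An annotated event that is not selected for enabling simply keeps its state and is considered again later.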
execute concurrently until an event is proposed on a required port of the component containing the process or until they are blocked on a provided port. At this point they transition to the suspended state. Once the event is enabled or the internal blocking is resolved, the processes return to the running state.
Based on this treatment of events, the design is partitioned into three phases of execution. In the first phase, processes propose possible events; the second phase associates tags with the proposed events; and the third phase allows a subset of the proposed events to execute.
FIGURE 10.8
[Figure: METRO II process states and event states across the three execution phases. Recoverable labels: “Phase 1,” “Phase 2,” “Propose event(s) or block,” “Event proposed by process,” “3b Enable some events,” “Enable event or resume process,” “Logical time.”]
1. Base model execution. The base model consists of concurrently executing processes that may suspend only after proposing events or by waiting for (blocking) other processes. A process may atomically propose multiple events; this represents nondeterminism in the system. After all processes in the base model are blocked, the design shifts to the second phase. The execution of processes between blocking points is beyond the control of the framework.
2. Quantity annotation. In the second phase, each of the proposed events is annotated with various quantities of interest. For instance, a proposed event may be annotated with local and global time tags. New events may not be proposed during this phase of execution. In this way, events and the methods they correspond to can be associated with cost.
3. Constraint solving. In this phase, a subset of the proposed events is enabled and permitted to execute, while the remaining events remain suspended. Events are enabled according to schedulers and constraint solvers. These enabled events then become inactive again while simultaneously allowing their associated processes to resume to the running state. At most one event per process is permitted to execute. Once again, new events may not be proposed during this stage. Constraint solving may be based on the resolution of declarative constraints or on the imperative code.
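The three phases can be put together in a toy simulation loop. The sketch below is not the METRO II manager; it assumes a hypothetical system in which every process competes for one shared resource, each event costs a fixed number of time units, and a round-robin policy enables exactly one proposed event per pass through the three phases.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical process model: it only counts the events it still has to run.
struct Process {
    int remaining;    // events left to execute
    bool suspended;   // true while a proposed event waits to be enabled
    double time;      // time reached by the process so far
    double tag;       // time tag written onto its proposed event in phase 2
    explicit Process(int n)
        : remaining(n), suspended(false), time(0.0), tag(0.0) {}
};

// Repeat the three phases until all processes finish.
// Returns the number of passes through the three phases.
inline int run_rounds(std::vector<Process>& procs, double cost) {
    assert(cost >= 0.0);
    int rounds = 0;
    std::size_t next = 0;  // round-robin pointer for the shared resource
    for (;;) {
        // Phase 1: every unfinished process has an event proposed
        // and is suspended.
        bool any = false;
        for (Process& p : procs)
            if (p.remaining > 0) { p.suspended = true; any = true; }
        if (!any) break;

        // Phase 2: annotate each proposed event with a time tag.
        for (Process& p : procs)
            if (p.suspended) p.tag = p.time + cost;

        // Phase 3: enable one proposed event; the others stay suspended.
        for (std::size_t k = 0; k < procs.size(); ++k) {
            Process& p = procs[(next + k) % procs.size()];
            if (p.suspended) {
                p.suspended = false;   // process resumes running
                --p.remaining;
                p.time = p.tag;
                next = (next + k + 1) % procs.size();
                break;
            }
        }
        ++rounds;
    }
    return rounds;
}
```

Because only one event is enabled per pass in this sketch, two processes with two and one events to run need three passes in total.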
A collection of three completed phases is referred to as a round. After the constraint solving phase, the states of some processes are switched to running while some others might still be suspended. The execution will then shift to the first phase and start a new round. Those processes that are in the running state will resume their executions. The iterations of these three phases will end when all processes finish their executions. Figure 10.8 illustrates the process states, event states, and the three phases in the execution semantics. Self loops on the inactive and annotated states illustrate that multiple rounds may pass without an update to a particular event’s state.
Table 10.1 illustrates the relationships between events and phases. In the first phase (base), events can be proposed and their values can be read or written. In the second phase (annotation), tags can be read and written and values can be read. In the final phase (constraint solving), events can be disabled and their tags and values can be read. The semantics have been carefully designed so that the event manipulation adheres to our separation of concerns methodology. This is very helpful not only in debugging tion but also in making sure that the framework functions efficiently.
TABLE 10.1
Phase–Event Relationships
TABLE 10.2
details the presence of threads as well as the ability to manipulate events, tags, and values. It also indicates if there is hierarchy. Components and adaptors may have zero or more threads, while annotators and schedulers do not have any threads.
Events, and by extension, services, may be annotated by quantities of interest. Quantities capture the cost of carrying out particular operations and are implemented using quantity managers. Annotators are special components that provide annotation services. Schedulers are similar to quantity managers, but instead of a quantity they provide scheduling and arbitration of shared resources. Adaptors modify tags and provide interfacing between different MoCs. Depending on the MoC used and the needs of the design, different annotators and schedulers can be used.
10.4.3.2 Semantics of Required/Provided Ports
The execution semantics of the required and provided ports are as follows.
For required ports, a component proposes a begin event and associates values with the proposed event that represent the arguments of the method that is requested. When the proposed event is enabled and executed, the control transfers to the component at the other end of the connection, which owns the corresponding provided port. The component waits for the end event to be executed and obtains the return values from the method.
For provided ports, no separate process exists in the component to carry out the provided method. Instead, the component inherits the process from the caller component and executes the events in the provided method using that process. After the method has been executed, the component proposes the end event.
10.4.3.3 Semantics of Mapping
architectural models. The two are then mapped together to produce a system model with performance metrics. Mapping is realized by adding constraints between events from the functional model and events from the architectural model.
We will present three options for the execution semantics of mapping (Figure 10.9). Option 1 is the first call graph shown, option 2 follows, and option 3 is the last. For options 1 and 2, the structural view (upper right of the figure) is a connection between required ports in the functional model and provided ports in the architectural model. For option 3, the mapping structure is different and is between provided ports in function and provided ports in architecture.
The first option is a sequential option in which the functional model begins execution before the architectural model. Some of the highlights of this option are captured in Table 10.3.
Figure 10.9 shows both a structural and a call-graph view of mapping in the first option. The ports in these and future diagrams are specified with the first letter of the component they belong to. Also, ports are designated as “R” or “P” if they are required or provided. “b” and “e” designate the begin and end events, respectively. These designations can be combined. For example, “FP.e” would indicate the end event of component F’s provided port.
Figure 10.9 shows the mapping structure of a system using this option. The functional model contains a method call to G from F. The mapping of this method call occurs by assigning events proposed by FR to events proposed by AP. This is considered a required-port-to-provided-port mapping structure.
Figure 10.9 also shows the call graph of the system. Boxes with single-line borders are events. Boxes that have two-line borders are code blocks that may or may not contain events. The arrows indicate program flow (from left to right). If an arrow is dashed, it means that the two events connected to it are treated as a single event by the framework. The functional component F calls a method from the component G. This is mapped to the architectural component A, which further uses the architectural component B when providing the service.
The execution in this option occurs as follows: component F contains a process. This process is responsible for proposing the event “FR.b.” “FR.b” corresponds to “GP.b” (in G). Once these events are enabled, the “G body” (the code body of the function call to G) can now execute. Upon completion, “GP.e” (in G) will be proposed. This event corresponds to “AP.b” in the architecture. The architecture body, “A body,” can now execute and culminate with the proposal of “AP.e.” As shown, “AP.e” corresponds to “FR.e,” which completes the execution.
As shown, the mapping of methods is carried out by invoking the mapped architectural service in the process of the caller after the corresponding functional method has completed the execution.
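The option 1 ordering can be traced with a few lines of plain C++. This sketch only records the order of events and code bodies in the caller's process; the Trace type and function names are hypothetical, and the end event of G's provided port is written GP.e here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Trace;

// Functional method body and mapped architectural service body.
inline void g_body(Trace& t) { t.push_back("G body"); }
inline void a_body(Trace& t) { t.push_back("A body"); }

// Option 1: the functional method runs first, then the mapped
// architectural service, all in the process of the caller F.
// Each "x = y" entry stands for two events treated as a single event.
inline Trace call_option1() {
    Trace t;
    t.push_back("FR.b = GP.b");   // begin of the call, enabled together
    g_body(t);
    t.push_back("GP.e = AP.b");   // end of G coincides with begin of A
    a_body(t);
    t.push_back("AP.e = FR.e");   // end of A completes the call in F
    assert(t.size() == 5);
    return t;
}

// Position of a label in the trace, or the trace size if it is absent.
inline std::size_t position(const Trace& t, const std::string& label) {
    for (std::size_t i = 0; i < t.size(); ++i)
        if (t[i] == label) return i;
    return t.size();
}
```

In the recorded trace, "G body" always precedes "A body", which is the property that distinguishes this option from the next one.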
In option 2, the execution semantics of mapping involve executing mapped architectural services before their functional counterparts. When a mapped method is invoked by a functional process, the begin event of that method is initially proposed, and a phase change is permitted to occur. If this event is enabled, then the architectural service executes first, immediately followed by the invoked functional method. After this, the end event of that method is proposed, with a subsequent phase change. Both the functional method and the architectural service are executed by the functional process; there are no special mapping processes. Additionally, both the functional method and the architectural service may block internally while waiting for other processes.
TABLE 10.3
Mapping Options Overview
[Recoverable header text: “Option,” “Execution,” “Simulation (Func ↔ Arch),” “Port Correspondence,” “Blocking,” “and Architecture.”]
The functional method is parameterized with arguments and has a return type. The architectural service is also parameterized, but the return value is not used. The correspondence between the architectural service parameters and the functional service parameters is specified at compile time.
This proposal is in some regards the opposite of the first proposal. It is summarized in Table 10.3.
Figure 10.9 for the previous option shows the call graph for execution between the functional and architectural models. Basically, the functional methods need to be completed before the corresponding architectural services start. However, in some cases, this approach may not be able to reflect all the situations in the mapped system.
For instance, let us consider a shared FIFO example. Option 1 cannot ensure that the architectural ordering decision impacts the functional execution, since the functional methods will finish before the architecture is invoked. Therefore, the shared FIFO example may not work as expected with option 1 if one wants to use the state of the architectural FIFO to block functional processes (i.e., when it is full). Essentially, functional nondeterminism cannot be resolved by the architecture. Such operations may be desirable when the architecture is better able to perform given the opportunity to make decisions based on its state (free resources, for example). This also removes some scheduling burden from other areas of the system.
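The shared FIFO argument can be made concrete with a small sketch that contrasts the two orderings. It is an illustration under simplifying assumptions: the architectural FIFO is a bounded counter, the functional FIFO is an unbounded queue, blocking is modeled by returning false, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Architectural model (assumed): a bounded buffer that tracks occupancy.
struct ArchFifo {
    std::size_t capacity;
    std::size_t used;
    bool reserve() {
        assert(used <= capacity);
        if (used == capacity) return false;   // full: request refused
        ++used;
        return true;
    }
};

// Functional model (assumed): an unbounded queue of values.
struct FuncFifo {
    std::deque<int> data;
    void write(int v) { data.push_back(v); }
};

// Option 1 ordering: the functional write completes first, so a full
// architectural FIFO can no longer prevent it.
inline bool write_option1(FuncFifo& f, ArchFifo& a, int v) {
    f.write(v);
    return a.reserve();
}

// Option 2 ordering: the architecture is consulted first; the functional
// write runs only if the architectural FIFO has room.
inline bool write_option2(FuncFifo& f, ArchFifo& a, int v) {
    if (!a.reserve()) return false;   // the caller would remain blocked
    f.write(v);
    return true;
}
```

With an architectural capacity of one, two writes under the option 1 ordering leave two values in the functional queue, whereas under the option 2 ordering the second write does not take place.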