in the application binary. Depending on the current cache state and the execution history, cache misses may occur at different points in time. However, formal methods are able to identify for each basic block the maximum number of cache misses that may occur during the execution [46]. The control flow graph can be annotated with this information, making the longest path analyses feasible again.
Depending on the actual system configuration, the upper bound on the number of transactions per task execution may not be sufficiently accurate. In a formal model, this could translate into an assumed burst of requests that may not occur in practice. This can be addressed with a more detailed analysis of the task control flow, as is done in [1,39], which provides bounds on the minimum distances between any n requests of an activation of that task. This pattern will then repeat with each task activation.
This procedure allows us to conservatively derive the shared resource request bound functions η̃+_τ(w) and η̃−_τ(w) that represent the transaction traffic that each task τ in the system can produce within a given time window of size w. Requesting tasks that share the same processor may be executed in alternation, resulting in a combined request traffic for the complete processor. This again can be expressed as an event model. For example, a straightforward approach is to approximate the processor's request event model (in a given time window) with the aggregation of the request event models of each individual task executing on that processor. Obviously, this is an overestimation, as the tasks will not be executed at the same time; rather, the scheduler will assign the processor exclusively. The resulting requests will be separated by the intermediate executions, which can be captured in the joint shared resource request bound by a piecewise assembly from the elementary streams [39].
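As an illustration of such an aggregation, the following sketch sums per-task request bound functions into a processor-level bound. The periodic-with-jitter event model and all parameter values are assumptions made for this example, not taken from the chapter:

```python
import math

def eta_plus_periodic(w, period, jitter):
    """Upper request bound: the maximum number of requests a task can
    issue in any time window of length w (periodic-with-jitter model)."""
    if w <= 0:
        return 0
    return math.ceil((w + jitter) / period)

def processor_eta_plus(w, tasks):
    """Conservative processor-level request bound: the sum of the
    per-task bounds, deliberately ignoring that the tasks can never
    execute simultaneously on one processor."""
    return sum(eta_plus_periodic(w, t["period"], t["jitter"]) for t in tasks)

# Hypothetical request streams of two tasks sharing one processor.
tasks = [
    {"period": 10.0, "jitter": 2.0},
    {"period": 25.0, "jitter": 5.0},
]
print(processor_eta_plus(50.0, tasks))  # requests possible in a window of 50 time units -> 9
```

The piecewise assembly from elementary streams mentioned above would tighten this bound by accounting for the intermediate executions that separate requests.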
3.3.2 Response Time Analysis in the Presence of Shared Memory Accesses
Memory access delays may be treated differently by various processor implementations. Many processors, including some of the most commonly used, allow tasks to perform coprocessor or memory accesses by offering a multi-cycle operation that stalls the entire processor until the transaction has been processed by the system [44]. In other cases, a set of hardware threads may allow a quick context switch to another thread that is ready, effectively keeping the processor utilized (e.g., [17]). While this behavior usually has a beneficial effect on the average throughput of a system, multithreading requires caution in priority-based systems with reactive or control applications. In this case, the worst-case response time of even high-priority tasks may actually increase [38].
The integration of dynamic memory access delays into the real-time analysis will in the following be performed for a processor with priority-based preemptive scheduling that is stalled during memory accesses. In such a system, a task's worst-case response time is determined by the task's worst-case execution time plus the maximum amount of time the task can be kept from executing because of preemptions by higher-priority tasks and blocking by lower-priority tasks. A task that performs memory accesses is additionally delayed when waiting for the arrival of requested data. Furthermore, preemption times are increased, as the remote memory accesses also cause high-priority tasks to execute longer.

A possible runtime schedule is depicted in Figure 3.5. In the case where both tasks execute in the local memory (Scenario 3.5a), the low-priority task is kept from executing by three invocations of the high-priority task. Local memory accesses are not explicitly shown, as they can be considered to be part of the execution time. When both tasks access the same remote memory (Scenario 3.5b), the finishing time of the lower-priority task increases, because it itself fetches data from the remote memory, and also because of the prolonged preemptions by the higher-priority task (as its request also stalls the processor). The execution of the low-priority task in the example is now stretched such that it suffers from an additional preemption by the other task. Finally, Scenario 3.5c shows the effect of a task on another core, CPUb, that is also accessing the same shared memory, in this case periodically. Whenever the memory is also used by a task on CPUb, CPUa is stalled for a longer time, again increasing the task response times, possibly leading to the violation of a given deadline. As the busy wait adds to the execution time of a task, the total processor load increases, possibly making the overall system unschedulable.

FIGURE 3.5 Example runtime schedules over time t, indicating preemptions and stalled CPU time.
On the basis of these observations, a response time equation can be derived for the example scheduler. The response time represents the sum of several terms, among them:

• The aggregate delay caused by the memory accesses, which is a function of the memory accesses of a specific task and its higher-priority tasks. This is investigated in Section 3.3.3.
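The resulting recurrence can be sketched as a standard fixed-point iteration. This is only an outline under assumptions: `mem_busy` stands in for the aggregate busy time derived in Section 3.3.3, and all task parameters are hypothetical:

```python
import math

def wcrt(task, hp_tasks, blocking, mem_busy):
    """Worst-case response time under static-priority preemptive
    scheduling on a processor stalled during memory accesses.
    mem_busy(task, window) bounds the aggregate memory access delay
    incurred by the task and its higher-priority preempters.
    (No divergence guard; assumes the task set is schedulable.)"""
    r = task["wcet"] + blocking
    while True:
        preemption = sum(math.ceil(r / hp["period"]) * hp["wcet"] for hp in hp_tasks)
        r_next = task["wcet"] + blocking + preemption + mem_busy(task, r)
        if r_next == r:
            return r
        r = r_next

# Hypothetical example: each memory access stalls the core for 2 time units.
mem_busy = lambda task, w: 2 * task["accesses"]
hp = [{"period": 20, "wcet": 4}]
print(wcrt({"wcet": 6, "accesses": 3}, hp, blocking=1, mem_busy=mem_busy))  # -> 17
```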
Variations of such a response time analysis have been presented for single- and multithreaded static-priority preemptive scheduling [38], as well as for round-robin scheduling [41]. Other scheduling policies for which classical real-time analysis is available can be straightforwardly extended to include memory delays by including a term that represents the aggregate busy time due to memory accesses.

3.3.3 Deriving Aggregate Busy Time
Deriving the timing of many memory accesses has recently become an important topic in real-time research. Previously, the worst-case timing of individual events was the main concern. Technically, a sufficient solution to find the delay that a set of many events may experience is to derive the single worst-case load scenario and assume it for every access. However, not every memory request will experience a worst-case system state, such as worst-case time wheel positions in the time division multiple access (TDMA) schedules, or transient overloads in priority-based components. For example, the task on CPUb in Figure 3.5 will periodically access the shared memory and, as a consequence, disturb the accesses by the two tasks on CPUa. A "worst-case memory access" will experience this delay, but of all accesses from CPUb, this happens maximally three times in this example. Thus, accounting this interference for every single memory access leads to very unsatisfactory results, which has previously prevented the use of conservative methods in this context.
The key idea is instead to consider all requests that are processed during the lifetime of a task jointly. We therefore introduce the worst-case accumulated busy time, defined as the total amount of time during which at least one request is issued but is not finished. Multiple requests in a certain amount of time can in total only be delayed by a certain amount of interference, which is expressed by the aggregate busy time.
This aggregate busy time can be efficiently calculated (e.g., for a shared bus): a set of requests is issued from different processors that may interfere with each other. The exact individual request times are unknown and their actual latency is highly dynamic. Extracting detailed timing information (e.g., when a specific cache miss occurs) is virtually impossible, and considering such details in a conservative analysis yields exponential complexity. Consequently, we disregard such details and focus on bounding the aggregate busy time. Given a certain level of dynamism in the system, this consideration will not result in excessive overestimations. Interestingly, even in multithreaded multicore architectures, the conservatism is moderate, summing up to less than a total of 25% of the overestimated response time, as shown in practical experiments [42].
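A minimal sketch of how such a bound could be computed for a shared first-come first-served bus, assuming a simple periodic-with-jitter request model for the interfering processors (the function and parameter names are ours):

```python
import math

def aggregate_busy_time(n, access_time, interferers):
    """Bound the aggregate busy time of n memory requests on a shared
    FCFS bus: own service time plus all interfering requests that the
    other processors can issue while at least one request is pending.
    Computed as a fixed point, since a longer busy time admits more
    interference."""
    b = n * access_time
    while True:
        interference = sum(
            math.ceil((b + p["jitter"]) / p["period"]) * access_time
            for p in interferers
        )
        b_next = n * access_time + interference
        if b_next == b:
            return b
        b = b_next

# Hypothetical numbers: 5 own requests, 1 time unit per bus access,
# one interfering processor issuing a request every 4 time units.
print(aggregate_busy_time(5, 1.0, [{"period": 4.0, "jitter": 0.0}]))  # -> 7.0
```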
Without bus access prioritization, it has to be assumed that every transaction issued by any processor during the lifetime of a task activation i may disturb the transactions issued by i. Usually, the interference is then given by the transactions issued by the other concurrently active tasks on the other processors, as well as the tasks on the same processor, as their requests are treated on a first-come first-served basis. The interested reader is referred to [40] for more details on the calculation of aggregate memory access latencies.

If a memory controller is utilized, this can be very efficiently considered. For example, all requests from a certain processor may be prioritized over those of another. Then, the interference imposed by all lower-priority requests equals zero. Additionally, a small blocking factor of one elementary memory access time is required, in order to model the time before a transaction may be aborted for the benefit of a higher-priority request.

The compositional analysis approach of Section 3.2, used together with the methods of Section 3.3, now delivers a complete framework for the performance analysis of heterogeneous multiprocessor systems with shared memories. The following section turns to detailed modeling of inter-processor communication with the help of HEMs.
3.4 Hierarchical Communication
As explained in Section 3.2, traditional compositional analysis models bus communication by a simple communication task that is directly activated by the sending task, and which directly activates the receiving task. Figure 3.6 shows a simple example system that uses this model for communication, where each output event of the sending tasks, T_a and T_b, triggers the transmission of one message over the bus.

However, the modern communication stacks employed in today's embedded control units (ECUs), for example, in the automotive domain, make this abstraction inadequate. Depending on the configuration of the communication layer, the output events (called signals here) may or may not directly trigger the transmissions of messages (called frames here). For instance, AUTOSAR [2] defines a detailed API for the communication stack, including several frame transmission modes (direct, periodic, mixed, or none) and signal transfer properties (triggered or pending) with key influences on communication timings. Hence, the transmission timings of messages over the bus do not have to be directly connected to the output behaviors of the sending tasks anymore; they may even be completely independent of the task's output behavior (e.g., sending several output signals in one message).

FIGURE 3.7 Communication via ComLayer.
In the example shown in Figure 3.7, the tasks T_a and T_b produce output signals that are transmitted over the bus to the tasks T_c and T_d. The sending tasks write their output data into registers provided by the communication layer, which is responsible for packing the data into messages, called frames here, and triggering the transmission of these frames according to the signal types and transmission modes. On the receiving side, the frames are unpacked, which means that the contained signals are again written into different registers for the corresponding receiving task. Using flat event models, the timings of signal arrivals can only be bounded with a large overestimation.

To adequately consider such effects of modern communication stacks in the system analysis, two elements must be determined:

1. The activation timings of the frames
2. The timings of signals transmitted within these frames arriving at the receiving side
To cope with both challenges, we introduce hierarchical event streams (HESs) modeled by a HEM, which determines the activating function of the frame, captures the timings of the signals assigned to that frame, and, most importantly, defines how the effects on the frame timings influence the timings of the transmitted signals. The latter allows the signals to be unpacked on the receiving side, giving tighter bounds for the activations of those tasks receiving the signals.
The general idea is that a HES has one outer representation in the form of an event stream ES_outer, and each combined event stream has one inner representation, also in the form of an event stream ES_i, where i denotes the task to which the event stream corresponds. The relation between the outer event stream and the inner event stream depends on the hierarchical stream constructor (HSC) that combined the event streams. Each of the involved event streams is defined by the functions δ−(n) and δ+(n) (see Section 3.2.2), returning the minimum and the maximum distance, respectively, between n consecutive events.
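For a simple periodic-with-jitter event stream, which we assume here purely for illustration, the two distance functions could look as follows:

```python
def delta_minus(n, period, jitter):
    """Minimum distance between n consecutive events of a
    periodic-with-jitter stream (never negative)."""
    if n < 2:
        return 0.0
    return max(0.0, (n - 1) * period - jitter)

def delta_plus(n, period, jitter):
    """Maximum distance between n consecutive events of a
    periodic-with-jitter stream."""
    if n < 2:
        return 0.0
    return (n - 1) * period + jitter

print(delta_minus(3, 10.0, 2.0))  # -> 18.0
print(delta_plus(3, 10.0, 2.0))   # -> 22.0
```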
Figure 3.8 illustrates the structure of the HES at the input of the channel C of the example shown in Figure 3.7. The HSC combines the output streams of the tasks T_a and T_b, resulting in the hierarchical input stream of the communication task C. According to the properties and the configuration of the communication layer that is modeled by the HSC, the inner and outer event streams of the HES are calculated. Each event of the outer event stream, ES_outer, represents the sending of one message by the communication layer. The events of a specific inner event stream, ES_a and ES_b, model the timings of only those messages that contain data from the corresponding sending tasks. The detailed calculations of the inner and outer event streams, considering the different signal properties and frame transmission modes, are presented in [37].
FIGURE 3.8 The hierarchical event stream HES_a,b: the HSC relates the outer stream ES_outer (distances between total message releases) to the inner streams (distances between messages containing a new signal from T_a or T_b).

For the local scheduling analysis of the bus, only the outer event stream is relevant. As a result, the best-case response time, R_min, and the worst-case response time, R_max, are obtained. Based on the outer event stream, ES_outer, of the hierarchical input stream, we obtain the outer event stream, ES_outer, of the hierarchical output stream by using the following equations:
δ−_outer(n) = max{ δ−_outer(n) − J_resp, δ−_outer(n − 1) + d_min }    (3.2)

δ+_outer(n) = max{ δ+_outer(n) + J_resp, δ+_outer(n − 1) + d_min }    (3.3)
In fact, Equations 3.2 and 3.3 are generalizations of the output model calculation presented in Equation 3.1. As can be seen, two changes have been made to the message timing. First, the minimum (maximum) distance between a given number of events decreases (increases) by no more than the response time jitter, J_resp = R_max − R_min. Second, two consecutive events at the output of the channel are separated by at least a minimum distance, d_min = R_min. The resulting event stream, modeled by δ−_outer(n) and δ+_outer(n), becomes the outer stream of the output model.
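Equations 3.2 and 3.3 can be evaluated recursively. The following sketch takes the distance functions of the input stream and returns those of the output stream; the variable names are ours, and the input stream is a hypothetical periodic one:

```python
from functools import lru_cache

def output_outer_stream(delta_minus_in, delta_plus_in, j_resp, d_min):
    """Derive the outer stream of the output model (Equations 3.2/3.3):
    shrink/stretch the input distances by the response time jitter,
    while keeping at least d_min between consecutive output events."""
    @lru_cache(maxsize=None)
    def d_minus_out(n):
        if n < 2:
            return 0.0
        return max(delta_minus_in(n) - j_resp, d_minus_out(n - 1) + d_min)

    @lru_cache(maxsize=None)
    def d_plus_out(n):
        if n < 2:
            return 0.0
        return max(delta_plus_in(n) + j_resp, d_plus_out(n - 1) + d_min)

    return d_minus_out, d_plus_out

# Hypothetical periodic input stream with period 10 and jitter 2,
# channel response time jitter J_resp = 3, best-case d_min = 1.
dm, dp = output_outer_stream(
    lambda n: (n - 1) * 10.0 - 2.0,
    lambda n: (n - 1) * 10.0 + 2.0,
    j_resp=3.0, d_min=1.0,
)
print(dm(2), dp(2))  # -> 5.0 15.0
```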
To obtain the inner event streams, ES_i, of the hierarchical output stream, we adapt the inner event streams, ES_i, of the hierarchical input stream according to the changes applied to the outer stream. For the adaptation, we consider the two changes mentioned above separately. First, consider that the minimum distance between n messages decreases by J_resp. Then, the minimum distance between k messages that contain the data of a specific task also decreases by J_resp. Second, we must consider that two consecutive messages become separated by a minimum distance d_min. Figure 3.9a illustrates a sequence of events consisting of two different event types, a and b. Assume that this event sequence models the message timing, where the events labeled by a lowercase a correspond to the messages containing data from task T_a, and the events labeled by a lowercase b correspond to the messages containing data from task T_b. Figure 3.9b shows how this event sequence changes when a minimum distance d_min between two consecutive events is considered. As indicated, the distance between the last two events of type b further decreases because of the minimum distance. Likewise, the maximum distance increases because of the minimum distance, d_min, as can be seen for the first and the second of the events of type b. Based on the minimum distance, d_min, the maximum possible decrease (increase), D_max, in the minimum (maximum) distance between events that can occur because of the minimum distance can be calculated. Note that, in the case of large bursts, D_max can be significantly larger than d_min, since an event can be delayed by its predecessor event, which itself is delayed by its predecessor, and so on. More details can be found in [37].

FIGURE 3.9 (a) The event sequence before applying the minimum distance and (b) the event sequence after considering the minimum distance d_min.
In general, considering the response time jitter, J_resp, and the minimum distance, d_min, the inner stream of the hierarchical output stream, modeling messages that contain data from the task T_i, can be modeled by
given by the corresponding inner stream. Assuming that the task T_c is only activated every time a new signal from the task T_a arrives, the inner event stream ES_a of the hierarchical output stream of the communication task C can directly be used as an input stream of the task T_c.
It is also possible to have event streams with multiple hierarchical layers, for example, when modeling several layers of communication stacks or communications over networks interconnected by gateways, where several packets may be combined into some higher-level communication structure. This can be captured by our HEM by having an inner event stream of a HES that is the outer event stream of another HEM. For more details on multilevel hierarchies, refer to [36].

3.5 Scenario-Aware Analysis
Because of the increasing complexity of modern applications, hard real-time systems are often required to run different scenarios (also called operating modes) over time. For example, an automotive platform may exclusively execute either an ESC or a parking-assistant application. While the investigation of each static scenario can be achieved with classical real-time performance analysis, timing failures during the transition phase can only be uncovered with new methods, which consider the transient overload situation during the transition phase in which both scenarios can impress load artifacts on the system.
Each scenario is characterized by a specific behavior and is associated with a specific set of tasks. A scenario change (SC) from one scenario to another is triggered by a scenario change request (SCR), which may be caused either by the need to change the system functionality over time or by a system transition to a specific internal state requiring an SC. Depending on the task behavior across an SC, three types of tasks are defined:

• Unchanged task: An unchanged task belongs to both task sets of the initial (old) and the new scenario. It remains unchanged and continues executing normally after the SCR.

• Completed task: A completed task only belongs to the old scenario task set. However, to preserve data consistency, completed task jobs activated before the SC are allowed to complete their execution after the SCR. Then the task terminates.

• Added task: An added task only belongs to the new scenario task set. It is initially activated after the SCR. Each added task is assigned an offset value, φ, that denotes its earliest activation time after the SCR.

During an SC, executions of completed, unchanged, and added tasks may interfere with one another, leading to a transient overload on the resource. Since the timing requirements in the system have to be met at any time during the system execution, it is necessary to verify if task deadlines could be missed because of an SC.
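The three task types follow directly from set membership in the old and new scenario task sets, as this small sketch shows (using the communication tasks of the later bus example):

```python
def classify_tasks(old_scenario, new_scenario):
    """Classify tasks across a scenario change (SC) by their membership
    in the old and new scenario task sets."""
    old_set, new_set = set(old_scenario), set(new_scenario)
    return {
        "unchanged": old_set & new_set,   # execute in both scenarios
        "completed": old_set - new_set,   # may finish pending jobs, then terminate
        "added":     new_set - old_set,   # first activated after the SCR (offset φ)
    }

print(classify_tasks(["C1", "C2", "C5"], ["C1", "C3", "C4", "C5"]))
```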
Methods analyzing the timing behavior across an SC under static-priority preemptive scheduling already exist [32,45,47]. However, they are limited to independent tasks mapped on single resources. Under such an assumption, the worst-case response time for an SC for a given task under analysis is proved to be obtained within the busy window during which the SCR occurs, called the transition busy window. These approaches can, however, not be applied to distributed systems because of the so-called echo effect. The echo effect is explained in the following section using the system example in Figure 3.11.
3.5.1 Echo Effect
The system used in the experiments of Section 3.8 (depicted in Figure 3.11) represents a hypothetical automotive system consisting of two IP components, four ECUs, and one multicore ECU connected via a CAN bus. The system is assumed to run two mutually exclusive applications: an ESP application (Sens1, Sens2 → eval1, eval2) and a parking-assistant application (Sens3 → SigOut). A detailed system description can be found in Section 3.8.

Let us focus on what happens on the CAN bus when the ESP application is deactivated (Scenario 1) and the parking-assistant application becomes active (Scenario 2). Depending on which application a communication task belongs to, we can determine the following task types on the bus when an SC occurs from Scenario 1 to Scenario 2: C1 and C5 are unchanged communication tasks, C3 and C4 are added communication tasks, and C2 is a completed communication task. Furthermore, we assume the following priority ordering on the bus: C1 > C2 > C3 > C4 > C5.
When an SC occurs from Scenario 1 to Scenario 2, the added communication task C3 is activated by events sent by the task mon3. However, C3 may have to wait until the prior completed communication task C2 finishes executing before being deactivated. This may lead to a burst of events waiting at the input of C3, which in turn may lead to a burst of events produced at its output. This burst of events is then propagated through the task ctrl3 on ECU4 to the input of C4. In between, this burst of events may have been amplified because of scheduling effects on ECU4 (the task ctrl3 might have to wait until calc finishes executing). Until this burst of events arrives at C4's input (which is a consequence of the SC on the bus), the transition busy window might already be finished on the bus. The effect of the transient overload because of the SC on the bus may therefore not be limited to the transition busy window but be recurrent. We call this recurrent effect the echo effect.

As a consequence of the echo effect, for the worst-case response time calculation across the SC of the low-priority unchanged communication task C5, it is not sufficient to consider only its activations within the transition busy window. Rather, the activations within the successive busy windows need to be considered.
3.5.2 Compositional Scenario-Aware Analysis
The previous example illustrates how difficult it is to predict the effect of the recurrent transient overload after an SC in a distributed system. As a consequence of this unpredictability, it turns out to be very difficult to describe the event timings at task outputs, and therefore to describe the event timings at the inputs of the connected tasks, needed for the response time calculation across the SC. To overcome this problem, we need to describe the event timing at each task output in a way that covers all its possible timing behaviors, even those resulting from the echo effect that might occur after an SC.
This calculation is performed by extending the compositional methodology presented in Section 3.2 as follows. As usual, all external event models at the system inputs are propagated along the system paths until an initial activating event model is available at each task input. Then, global system analysis is performed in the following way. In the first phase, two task response time calculations are performed on each resource. First, for each task we calculate its worst-case response time during the transition busy window. This calculation is described in detail in [13]. Additionally, for each unchanged or added task, using the classical analysis techniques, we calculate its worst-case response times assuming the exclusive execution of the new scenario. Then, for each task, a response time interval is built into which all its observable response times may fall (i.e., the maximum of its response time during the transition busy window and its response time assuming the exclusive execution of the new scenario). The tasks' best-case response times are given by their minimum execution times in all scenarios.

Having determined a response time interval across the SC for each task, the second phase of the global system analysis is performed as usual, describing the traffic timing behavior at task outputs by using event models. Afterward, the calculated output event models are propagated to the connected components, where they are used as activating event models for the subsequent global iteration. If after an iteration all calculated output event models remain unmodified, convergence is reached. As the propagated event models contain all potential event timings, during and after the transition, the calculated task response times are considered valid.
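The two-phase global analysis can be sketched as a fixed-point iteration over output event models. The component interfaces below are assumptions for illustration; `local_analysis` stands in for the scenario-aware response time calculation and `propagate` for the event model propagation:

```python
def global_analysis(components, propagate, local_analysis):
    """Iterate the compositional analysis to a fixed point: run the
    local analysis on every resource, propagate the resulting output
    event models to connected components, and stop once no output
    event model changes anymore (convergence)."""
    output_models = {c: None for c in components}
    while True:
        changed = False
        for c in components:
            new_model = local_analysis(c)      # phase 1: response time intervals
            if new_model != output_models[c]:  # phase 2: output event models
                output_models[c] = new_model
                propagate(c, new_model)        # feed the connected components
                changed = True
        if not changed:
            return output_models

# Toy usage: a "local analysis" that tightens a bound until it stabilizes.
state = {"A": 10, "B": 10}
def local_analysis(c):
    return max(state[c] - 3, 4)
def propagate(c, model):
    state[c] = model
print(global_analysis(["A", "B"], propagate, local_analysis))  # -> {'A': 4, 'B': 4}
```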
3.6 Sensitivity Analysis
As a result of an intensive HW/SW component reuse in the design of embedded systems, there is a need for analysis methods that, besides validating the performance of a system configuration, are able to predict the evolution of the system performance in the context of modifications of component properties.

The system properties represent intrinsic system characteristics determined by the configuration of the system components and the system's interaction with the environment. These include the execution/communication time intervals of the computational/communication tasks, the timing parameters of the task activation models, or the speed factor of the HW resources. These properties are used to build the performance model of the system. Based on this, the system quality is evaluated using a set of performance metrics such as response times, path latencies, event timings at components' outputs, and buffer sizes. These are used to validate the set of performance constraints, determined by local and global deadlines, jitter and buffering constraints, and so on.

The sensitivity analysis of real-time systems investigates the effects on the system quality (e.g., end-to-end delays, buffer sizes, and energy consumption) when subjected to system property variations. It will in the following be used to cover two complementary aspects of real-time system designs: performance characterization and the evaluation of the performance slack.
3.6.1 Performance Characterization
This aspect investigates the behavior of the performance metrics when applying modifications of different system properties. Using the mathematical properties of the functions describing the performance metrics, one can show the dependency between the values of the system properties specified in the input model and the values of the performance metrics. This is especially important for system properties leading to a discontinuous behavior of the performance metrics. The system designer can efficiently use this information to apply the required modifications without critical consequences on the system's performance.

Of special interest is the characterization of system properties whose variation leads to a nonmonotonic behavior of the performance metrics, referred to as timing anomalies. Such anomalies mostly occur because of inter-task functional dependencies, which are directly translated into timing dependencies in the corresponding performance model. The analyses of timing anomalies become relevant in the later design phases, when the estimated property values turn into concrete, fixed values. It is important for the system designer to know the source of such anomalies and which of the property values correspond to nonmonotonic system performance behavior. Since it divides the nonmonotonic (performance-unpredictable) design configuration space into monotonic (performance-predictable) subspaces, timing anomaly analysis is an important element of the design space exploration process.

3.6.2 Performance Slack
In addition to performance characterization, sensitivity analysis determines the bound between feasible and infeasible property values. This bound is called the sensitivity front. The maximum amount by which the initial value of a system property can be modified without jeopardizing the system feasibility is referred to as performance slack.

In general, to preserve the predictability, the modification of the design data is performed in isolation. This means the system designer assumes the variation of a single system characteristic at a time, for example, the implementation of a software component. In terms of performance, this generally corresponds to the variation of a single system property, for example, the worst-case execution time of the modified task. Such modifications are subject to one-dimensional sensitivity analysis. When the modification of a single system property has only local effects on the performance metrics, the computation of the performance slack is quite simple. Several formal methods were previously proposed [3,48]. However, in some other situations, the variation of the initial value affects several local and global performance metrics. Therefore, in order to compute the performance slack, sensitivity analysis has to be assisted by an appropriate system-level performance analysis model.
In many design scenarios, though, changing the initial design data cannot be performed in isolation, such that a required design modification involves the simultaneous variation of several system characteristics, and thus system properties. For example, changes in the execution time of one task may coincide with a changed communication load of another task. Such modifications are the subject of multidimensional sensitivity analysis. Since the system performance metrics are represented as functions of several system properties, and the dependency between these properties is generally unknown, the sensitivity front is more difficult to determine. In general, the complexity of the sensitivity analysis increases exponentially with the number of variable properties in the design space.
Based on the design strategy, two scenarios for using the performanceslack are identified:
• System robustness optimization: Based on the slack values, the designer defines a set of robustness metrics to cover different possible design scenarios. In order to maximize the system robustness at a given cost level, the defined robustness metrics are used as optimization objectives by automatic design space exploration and optimization tools [12]. The scope is to obtain system configurations with less sensitivity to later design changes. More details are given in Section 3.7.

• System dimensioning: To reduce the global system cost, the system designer can decide to use the performance slack for efficient system dimensioning. In this case, instead of looking for system configurations that can accommodate later changes, the performance slack is used to optimize the system cost by selecting cheaper variants for processors, communication resources, or memories. A sufficient slack may even suggest the integration of the entire application on alternative platforms, reducing the number of hardware components [30]. Note that lower cost implies lower hardware costs on one side, and lower power consumption and smaller size on the other.
The sensitivity analysis approach has been tailored differently in order to achieve the previous design goals. Thus, to perform robustness optimization, the sensitivity analysis was integrated into a global optimization framework. For the evaluation of the robustness metrics, it is not necessary to accurately determine the sensitivity front. Instead, using stochastic analysis, the sensitivity front can be approximated using two bounds: the lower bound determines the minimum guaranteed robustness (MGR), while the upper bound determines the maximum possible robustness (MPR). The benefit of using a stochastic analysis instead of an exact analysis is the nonexponential complexity with respect to the number of dimensions, which makes it suitable for a large number of variable system properties. Details about the implementation are given in [12].
For the second design scenario, the exact sensitivity front is required in order to perform the modifications of the system properties. Compared to previous formal sensitivity analysis algorithms, the proposed approach uses a binary search technique, ensuring complete transparency with respect to the application structure, the system architecture, and the scheduling algorithms. In addition, the compatibility with the system-level performance analysis engine allows the analysis of systems with global constraints. The one-dimensional sensitivity analysis algorithms are also used to bound the search space investigated by the stochastic sensitivity analysis approach. The advantage of the exact analysis, when compared to the stochastic analysis, is the ability to handle nonmonotonic search spaces. A detailed description of the sensitivity analysis algorithms for different system properties can be found in [31].
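A minimal sketch of such a one-dimensional binary search, treating the performance analysis as a black-box feasibility test (the `is_feasible` callback is a hypothetical stand-in for the system-level analysis engine), might look as follows. For simplicity it assumes feasibility is monotonic over the search interval; the actual algorithms in [31] also cope with nonmonotonic search spaces:

```python
def sensitivity_search(is_feasible, lo, hi, tol=1e-3):
    """Binary search for the largest value of one system property in
    [lo, hi] for which the black-box analysis still reports feasibility.

    is_feasible: callable taking a candidate property value (e.g., a
    task's WCET) and returning True if all constraints are met.
    Assumes feasibility is monotonic over [lo, hi]."""
    if not is_feasible(lo):
        return None              # no feasible point in the interval
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if is_feasible(mid):
            lo = mid             # mid is feasible: front lies above
        else:
            hi = mid             # mid violates a constraint: front below
    return lo

# Example with a toy "analysis" that accepts WCETs up to 4.2 time units
front = sensitivity_search(lambda c: c <= 4.2, 0.0, 10.0)
```

Because the search only observes the boolean outcome of the analysis, it is indeed transparent with respect to application structure, architecture, and scheduling policy.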
3.7 Robustness Optimization
In the field of embedded system design, robustness is usually associated with reliability and resilience. Therefore, many approaches to fault tolerance against transient and permanent faults, with different assumptions and for different system architectures, can be found in the literature (e.g., in [9,22,28]). These approaches increase the system robustness against effects of external interferences (radiation, heat, etc.) or partial system failure, and are, therefore, crucial for safety-critical systems.
In this chapter, a different notion of robustness for embedded systems is introduced and discussed: robustness to variations of system properties. Informally, a system is called robust if it can sustain system property modifications without severe consequences on system performance and integrity.

In contrast to fault tolerance, which requires the implementation of specific methods such as replication or reexecution mechanisms [18] to ensure robustness against faults, robustness to property variations is a meta problem that does not directly arise from the expected and specified functional system behaviors. It rather represents an intrinsic system property that depends on the system organization (architecture, application mapping, etc.) and its configuration (scheduling, etc.).

Accounting for property variations early during design is key, since even small modifications in systems with complex performance dependencies can have drastic, nonintuitive impacts on the overall system behavior and might lead to severe performance degradation effects [29]. Since performance evaluation and exploration do not cover these effects, it is clear that the use of these methods alone is insufficient to systematically control system performance along the design flow and during the system lifetime. Therefore, explicit robustness evaluation and optimization techniques that build on top of performance evaluation and exploration are needed. They enable the designer to introduce robustness at critical positions in the design, and thus help to avoid critical performance pitfalls.
3.7.1 Use-Cases for Design Robustness
In the following, we discuss situations and scenarios where robustness of hardware and run-time system performance against property variations is expected and is crucial to efficiently design complex embedded systems.
Unknown quality of performance data: First, robustness is desirable to account for data quality issues in early design phases, where data that are required for performance analysis (e.g., task execution times and data rates) are often estimated or based on measurements. As a result of the unknown input data quality, the expressiveness and accuracy of performance analysis results are also unknown. Since even small deviations from estimated property values can have severe consequences on the final system performance, it is obvious that robustness against property variations leverages the applicability of formal analysis techniques during design. Clearly, design risks can be considerably reduced by identifying performance-critical data and systematically optimizing the system for robustness.
Maintainability and extensibility: Secondly, robustness is important to ensure system maintainability and extensibility. Since major system changes in reaction to property variations are not usually possible during late design phases or after deployment, it is important to choose system architectures and configurations that offer sufficient robustness for future modifications and extensions as early as possible. For instance, the huge number of feature combinations in modern embedded systems has led to the problem of product and software variants. Using robustness optimization techniques, systems can be designed, at the outset, to accommodate additional features and changes. Other situations where robustness can increase system maintainability and extensibility include late feature requests, product and software updates (e.g., new firmware), bug fixes, and environmental changes.
Reusability and modularity: Finally, robustness is crucial to ensure component reusability and modularity. Even though these issues can be solved on the functional level by applying middleware concepts, they are still problematic from the performance verification point of view. The reason is that system performance is not composable, prohibiting the straightforward combination of individually correct components in a cut-and-paste manner to whole systems. In this context, robustness to property variations can facilitate the reuse of components across product generations and families, and simplify platform porting.
3.7.2 Evaluating Design Robustness
Sensitivity analysis has already been successfully used for the evaluation and optimization of specific system robustness aspects. In [26], for instance, the authors present a sensitivity analysis technique calculating the maximum input rates that can be processed by stream-processing architectures without violating on-chip buffer constraints. The authors propose to integrate this technique into automated design space exploration to find architectures with optimal stream-processing capabilities, which exhibit a high robustness against input rate increases.
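As a toy illustration of this kind of buffer-constrained rate analysis (not the technique of [26]), one can model a FIFO buffer with a simple backlog recurrence and search for the largest sustainable input rate. The traffic model here (one initial burst followed by constant-rate arrivals, a fixed analysis horizon) and all parameter names are assumptions made for the sketch:

```python
def max_backlog(arrivals, service_per_slot):
    """Worst-case backlog of a FIFO buffer fed by `arrivals` (items
    per time slot) and drained at `service_per_slot` items per slot."""
    backlog = peak = 0.0
    for a in arrivals:
        backlog = max(0.0, backlog + a - service_per_slot)
        peak = max(peak, backlog)
    return peak

def max_input_rate(burst, service, buffer_cap, hi=100.0, tol=1e-4):
    """Largest sustained input rate (items/slot) such that a stream
    with an initial burst of `burst` items never overflows
    `buffer_cap`, found by binary search over candidate rates."""
    lo = 0.0
    horizon = 1000  # finite analysis window (an assumption of the toy)
    while hi - lo > tol:
        r = (lo + hi) / 2.0
        arrivals = [burst + r] + [r] * (horizon - 1)
        if max_backlog(arrivals, service) <= buffer_cap:
            lo = r          # rate sustainable: try a higher one
        else:
            hi = r          # buffer overflows: back off
    return lo
```

Plugged into a design space exploration loop, such a feasibility test would let an optimizer compare candidate architectures by the input-rate headroom they leave.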