… error-correction, and for accommodating mandated changes such as might come from a standards-setting body.

The result of this effort (reading the various documents that define the design, discussing them with team members) must yield an exhaustive list of all variables with precise definitions, exact ranges, and clearly defined relations between dependent ranges and independent variables. The category (subspace) of a variable is determined by whether its value will be used while creating the instance, the context, the activation, the operation, or the excitation.
Likewise, this effort must also yield an exhaustive list of rules (how the design must behave) and guidelines (how the design should behave). Well-written specifications will contain keywords that indicate which is which. Words such as “must”, “will”, and “shall” indicate that a rule is being described, whether explicitly or implicitly. Words such as “should”, “recommended”, “preferred”, and “optimum” indicate that a guideline is being described.
Additionally, this step will very likely expose bugs (contradictions, ambiguities, omissions, etc.) in the specification itself. Clarification and agreement on such matters should be obtained as early as possible in the development to facilitate more rapid project execution and consequent faster time-to-market for the device.
Finally, specifications will change, whether to eliminate bugs or to incorporate enhancements. In particular, industry standards are often revised on schedules contrary to the needs of the many teams around the world who are attempting to realize devices compliant with these specifications. Consequently, it’s vital to associate all variables and their ranges, and all rules and guidelines, with a particular version of the specification. When revised specifications are published, another pass at interpretation will be needed to bring the resulting sets of variables, rules, and guidelines up to date with the revisions.
The results to be obtained from a thorough interpretation of the design specifications are as shown in Fig 4.5:
Fig 4.5 Outcome of interpretation of the specifications
Unresolved bugs listed in the interpretation mean that the rest of the interpretation is still subject to change and may be incomplete or contain errors. A bug in a document may be anything that is:
• incorrect,
• incomplete,
• unclear (vague, contradictory, in need of an illustration or example, etc.), or
• illegible.
Internal documents are those that are under the control of the organization developing the IC or system. Such documents are, in theory, more readily brought up to date, because the decision-makers are all members of the organization responsible for the content of the documents.

External documents are those that are not under the control of the organization developing the IC. Such documents include industry standards and specifications for devices with which the IC must inter-operate. Such documents are usually not readily brought up to date, because external organizations must make whatever changes are needed and then republish the document.
When all documents have been fully interpreted into their variables and ranges, with their rules and guidelines, planning can begin (or continue) with a complete knowledge of the complexity at hand. Planning the project should consider:
• architecture for verification software,
• instrumentation of the target(s),
• processes for applying tests across available compute resources, and
• scope, schedule, and resources.
4.10 Instrumenting the Prototype (§ 2)
During regression testing it is necessary to observe and record the values of all defined variables so that the correctness and completeness of the target can be determined. Rules and guidelines governing the variables are coded as assertions. Assertions are used not only to check rules derived from design specifications, but also by RTL engineers to ensure that correct design principles have not been violated.
For example, a designer of a block that receives a set of one-hot enable signals could write an assertion on these signals that, indeed, only one signal is logically true at any given time within a window of validity. Similarly, ensuring that there are no drive fights on shared signals can be accomplished by writing appropriate assertions.
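A minimal sketch of both checks, written as SystemVerilog assertions, follows. The module boundary and signal names (clk, rst_n, enable, bus) are illustrative assumptions, not taken from any particular design.

```systemverilog
// Hypothetical checker module; signal names are illustrative.
module enable_checks (
  input logic       clk,
  input logic       rst_n,
  input logic [3:0] enable,  // one-hot enable set
  inout wire  [7:0] bus      // shared signal with multiple drivers
);

  // Rule: exactly one enable may be true in any cycle of the
  // validity window (here, whenever reset is deasserted).
  assert property (@(posedge clk) disable iff (!rst_n)
    $onehot(enable))
    else $error("enable is not one-hot: %b", enable);

  // Drive-fight check: a contended shared signal resolves to X,
  // so flag any unknown bits while the design is out of reset.
  assert property (@(posedge clk) disable iff (!rst_n)
    !$isunknown(bus))
    else $error("possible drive fight on bus: %b", bus);

endmodule
```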
A variety of tools are available to assist in the creation of assertions, and a recent book sets forth a standard for the use of assertions in SystemVerilog (see Bergeron et al. 2004).
But there is one drawback to reliance on assertions: they are not synthesized, and they will not be present in the hard prototype. Consequently, internal logic to facilitate coverage analysis and failure analysis (i.e., diagnosability), as well as to collect coverage data, should be designed into the device.
The design requirements for these capabilities (which might constitute a separate morph of the instance) must be defined jointly between the verification team and the device design team. The degree of instrumentation needed will depend, of course, on the functionality of the target and, when the time comes to verify the hard prototype, on how readily the values of variables can be observed at the pins of the device containing the target.

During CRV the responses of the target are monitored continuously and checked against expected values. Checking is performed in two orthogonal dimensions: value and time.3 Value checks are those that compare an observed value with an expected value, computed in some manner by the testbench. Temporal checks are those that compare the arrival time of some value with an expected arrival time, computed in some manner by the testbench.

3 The concepts of function points and function arcs as defined in chapter 3 are already present in commercially available verification software in the form of these two types of checking.
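Both dimensions can be expressed directly as assertions. The sketch below assumes hypothetical signals: 'expected' stands in for whatever reference value the testbench computes, and the 1-to-8 cycle window stands in for the expected arrival time.

```systemverilog
// Illustrative checker; all names (valid, req, gnt, dout,
// expected) are invented for this sketch.
module check_examples (
  input logic        clk, rst_n,
  input logic        valid, req, gnt,
  input logic [31:0] dout, expected
);

  // Value check: the observed value must equal the expected
  // value computed by the testbench.
  assert property (@(posedge clk) disable iff (!rst_n)
    valid |-> (dout == expected))
    else $error("value mismatch: got %0h, expected %0h",
                dout, expected);

  // Temporal check: the grant must arrive within an expected
  // window, here 1 to 8 cycles after the request.
  assert property (@(posedge clk) disable iff (!rst_n)
    req |-> ##[1:8] gnt)
    else $error("gnt did not follow req within 8 cycles");

endmodule
```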
It is useful to consider the many types of checking that must be accommodated by the instrumentation for the soft prototype and for the hard prototype. These many types of checking may be grouped into the following levels of abstraction:
1. Expected results: This level of abstraction is the most basic check: whether values generated by the target match or agree with values generated independently by the testbench, and are produced at the time expected by the testbench. The testbench constitutes a reference machine (of a sort) in that it must know in advance what values to expect from the target and inform the verification engineer when the target’s results do not agree with the testbench’s results. As such, this is perhaps not so much a level of abstraction for checking, but rather a broad definition of checking relevant to all levels of abstraction.
2. Protocol: This level of abstraction refers to the particular way in which information is communicated to and from (and within) the target, more or less disregarding the information being communicated. Checks for Ethernet protocol, for example, are concerned with packet size and bit-stuffing and so forth, necessary to move a payload from here to there. The payload (the information actually being communicated) is checked at the next level of abstraction. One class of protocol always requires detailed monitoring: handshake protocols at clock-domain crossings (CDCs). This will be discussed in greater detail later in this chapter.
3. Transport: This level of checking is concerned with the movement of information across the target. Are bits received over the Ethernet cable properly re-assembled to represent the structure of the information sent from the other end of the cable? Does an item placed in a queue remain unchanged when it is eventually removed? Scoreboards, in which expected data are placed in a holding area (the scoreboard) so that when the target finally generates its received copy, it can be checked against the copy of expected data in the scoreboard, are often used for this level of checking (a minimal scoreboard sketch appears after this list).
4. State coherency: Many targets have distributed state that must remain in agreement, such as the individual caches in a multi-processor system.
5. Transformation of data: Many targets transform data in a manner that transcends basic checking. A JPEG encoder performs an elaborate transformation of data, for example.
6. Properties: These are checks that usually transcend verification with a testbench and CRV. Properties (discussed in chapter 2) such as performance and arbitration fairness will typically be verified through other means, and typically in the hard prototype. Properties of compliance to industry standards are certainly checked at a protocol level, but also at a property level in a hard prototype or firm prototype (FPGA). Running a compliance suite in a software testbench environment is usually impractical.
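The sketch below illustrates the scoreboard mechanism named in item 3, for in-order transport checking. The class and method names are invented; the type parameter T stands for whatever transaction type the target carries and is assumed to support ordinary equality comparison.

```systemverilog
// A minimal in-order scoreboard sketch; names are illustrative.
class scoreboard #(type T = int);
  T expected_q[$];  // holding area for expected data

  // Called when the testbench predicts what the target will emit.
  function void add_expected(T item);
    expected_q.push_back(item);
  endfunction

  // Called when the target actually produces an item; compare it
  // against the oldest outstanding expectation.
  function void check_received(T item);
    T exp;
    if (expected_q.size() == 0) begin
      $error("unexpected item received: %p", item);
      return;
    end
    exp = expected_q.pop_front();
    if (item != exp)
      $error("transport corrupted item: got %p, expected %p",
             item, exp);
  endfunction

  // At end of test, anything still held was dropped by the target.
  function void report_leftovers();
    if (expected_q.size() != 0)
      $error("%0d expected item(s) never received",
             expected_q.size());
  endfunction
endclass
```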
At some point in the construction phase of the project, the logic that will provide sufficient visibility into the internal behavior must be defined. Designers of individual blocks might each provide a list of the signals that are the most useful in diagnosing faulty behavior on the part of the block. If the morph provides some means to capture the values of N signals and if there are B blocks in the target, then each block might be allocated N/B signals for diagnostic capture (with N = 64 capture signals and B = 8 blocks, for example, each block would be allocated 8 signals).
One mechanism that is always needed is one that can impose defined errors on stimuli, so that error-handling functionality can be verified and to facilitate testing of system responses to erroneous stimuli received by (or generated by) the verification target.
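As a sketch of such a mechanism, the fragment below XORs a programmable mask into a stimulus path when enabled, giving precise, repeatable bit errors. The module and signal names are invented for illustration.

```systemverilog
// Hypothetical error-injection point on a stimulus path. In
// practice the control fields would be writable by software so
// that defined errors can be imposed on demand.
module err_inject #(parameter W = 32) (
  input  logic         inject_en, // assert to corrupt the stream
  input  logic [W-1:0] err_mask,  // which bits to flip
  input  logic [W-1:0] data_in,   // clean stimulus
  output logic [W-1:0] data_out   // possibly corrupted stimulus
);
  // XOR with a mask flips exactly the selected bits.
  assign data_out = inject_en ? (data_in ^ err_mask) : data_in;
endmodule
```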
Other mechanisms that are useful for verification of hard prototypes include:
• Exerting control internally on demand of software or firmware, by reserving a range within a processor’s address space for verification control registers. Read/write actions on these registers can exert control over internal signals or enable observation of internal signals (a register-block sketch follows this list).
• Queue throttling, in which the length of a queue can be temporarily redefined to be very small (perhaps having a length of 1), can be very helpful in diagnosing test failures in pipelined processors, prefetching logic, and any other subsystems dependent on proper queuing behavior.
• Special cases that require monitoring may call for additional logic to facilitate collecting data related to the cases to monitor.
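A sketch of the memory-mapped verification control registers from the first bullet follows. The addresses, field layout, and write port are invented for illustration; a real block would sit behind the device's actual register bus.

```systemverilog
// Hypothetical verification control register block.
module verif_ctrl_regs (
  input  logic        clk,
  input  logic        rst_n,
  // simple register-write port (stand-in for the real bus)
  input  logic        wr_en,
  input  logic [7:0]  wr_addr,
  input  logic [31:0] wr_data,
  // controls exported to the rest of the device
  output logic        err_inject_en,  // e.g., drives an injector
                                      // like err_inject above
  output logic [7:0]  queue_max_len   // queue throttling control
);
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      err_inject_en <= 1'b0;
      queue_max_len <= 8'hFF;      // full queue depth by default
    end else if (wr_en) begin
      case (wr_addr)
        8'h00:   err_inject_en <= wr_data[0];
        8'h04:   queue_max_len <= wr_data[7:0];
        default: ;                 // unmapped address: ignore
      endcase
    end
  end
endmodule
```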
4.10.1 An Ounce of Prevention (§ 2)
There are a few rather well-known techniques that can be applied to prevent unnecessary full-layer tape-outs to achieve functional parts. One widely used technique is the inclusion of spare gates in the layout, with connectivity to inputs and outputs available at metal layers. They are not connected to any part of the target’s logic, but they are present in the layout.

These spare gates are powered but not activated, their inputs being tied to a constant logical 0 or 1 to prevent undesired electrical side effects. Then, if a bug is exposed in silicon, it may be possible to make use of advantageously located spare gates to implement whatever changes are necessary. This avoids the expense and schedule delay associated with a full-layer tape-out (to lay down more transistors).
Another technique is similar to spare gates, in that control logic is implemented using a programmable logic array (PLA). These elements can undergo re-wiring to implement logical changes without resorting to a full-layer tape-out.
Another preventive step is referred to as holding a split lot at metal. A lot (a particular quantity) of wafers is normally processed as a batch. By diverting some fraction (such as half) of the wafers from the production line and holding them “at metal” (before metal layers have been laid down), bugs in silicon can be fixed by making metal-mask changes only, rewiring the transistors already implemented on these held wafers. This avoids delays associated with having to start with unprocessed wafers, saving time in getting silicon back from metal-only changes. Arrangements for holding wafers at metal must be made in advance with the processing factory.
Some remedial measures are also available, but they can be expensive. Using a focused ion beam (FIB) to remove layers of oxide and metal and to deposit metal to rewire existing transistors (such as for spare gates) is occasionally used to try out a set of changes before a tape-out. The yield of this expensive process is rather poor, and it may be necessary to sacrifice several parts to get one that works.
During the design for verification it is worthwhile identifying those “tricky” features that might need to be disabled if they do not work properly. A novel instruction-reordering mechanism in an advanced superscalar processor that seemed so nifty during simulation might make silicon prototypes useless for subsequent development work, such as debugging the related operating system or compiler or optimizer. By providing a means to disable this new functionality, downstream development work might be able to continue while the IC team debugs it.
Finally, it should be acknowledged that not all bugs in a digital device must necessarily be fixed. Sometimes it is possible to work around the bug with software or firmware that avoids evoking the faulty behavior from the device. If the workaround doesn’t have intolerable side effects, it might be decided to update the device’s documentation to reflect these new usage requirements.4

4 In fact, the careful reader might discover odd limitations described in product documentation that suggest that the product had a bug for which a workaround was developed.
4.11 Standard Results (§ 3)
As described in the verification plan, a certain set of data must be produced so that standard measures can be made and standard views of the verification results can be provided. The particular measures are discussed in chapter 6.
Fig 4.6 Standard Results
The standard results to be generated include the following:
• values of all variables, suitably time-stamped
• code coverage on a clock-domain basis so that the following measures can be determined:
- statement coverage
- branch coverage
- condition coverage
- convergence
That’s it – really nothing more than is already typically preserved in regression runs.
The particular formats in which data are stored will depend on the tools used for verification. In-house tools and vendor-supplied software will vary widely in the particular ways in which they store data.

In fact, the raw data may come from a variety of sources and appear in multiple formats. For example, simulations of a processor might produce multiple sets of data for each test: a simulation trace of the signals appearing at the I/O ports of the target, an execution history of the program executed by the processor during the test, and a dump of final state. In addition, some data may actually come not from simulation, but from an FPGA-based implementation of the target (a firm prototype). Tools will evolve to consolidate results from all verification platforms:
• simulation of the soft prototype
• emulation using FPGAs (a firm prototype)
• operation of the hard prototype
Fig 4.7 Producing standard results
What matters is that verification results be stored in such a manner that standard results (values of standard variables) can be readily obtained for the purposes of producing standard measures and standard views (the topics of chapter 6).
In other words, the standard results may consist of these raw data plus custom methods that understand the formats of the raw data such that the values of standard variables can be extracted (see Fig 4.7). These methods provide the standard results while hiding (but not disturbing) the raw data.
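One way to organize such methods is a common query over per-format subclasses, sketched below. All class and method names are invented for illustration, not taken from any standard library.

```systemverilog
// Base class: the standard query that hides the raw formats.
virtual class standard_results;
  // Return the time-stamped values recorded for one standard
  // variable, however the raw data happen to be stored.
  pure virtual function void get_values(string var_name,
                                        output time         times[$],
                                        output logic [63:0] values[$]);
endclass

// One subclass per raw format: this one would parse a simulator's
// signal trace; others would wrap an execution history, a final
// state dump, or a firm-prototype (FPGA) capture in the same way.
class sim_trace_results extends standard_results;
  string trace_file;

  function new(string f);
    trace_file = f;
  endfunction

  virtual function void get_values(string var_name,
                                   output time         times[$],
                                   output logic [63:0] values[$]);
    // ... open trace_file and extract var_name's samples here ...
  endfunction
endclass
```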
Future standardization within the industry may eventually address formats for data storage.
4.12 Setting Goals for Coverage and Risk (§ 4)
The verification plan should clearly state the coverage goals so that resources can be managed effectively to reach the goals on schedule. The related goals for risk must be stated as well, so that the verification effort is aligned with management’s expectations regarding risk. Relating coverage to risk is the topic of chapter 7, and formal statements of risk as related to project resources will be described there.
4.12.1 Making Trade-offs (§ 4)
As mentioned in the preceding discussions, the verification manager is faced with the challenge of deploying a finite set of resources to achieve a commercially viable result. Consequently, the entire space of a design will not necessarily undergo thorough verification before tape-out or ramping for production. The initial samples of a device might support only a single clock frequency (or set of frequencies), with production devices or upgrades supporting a wider range of frequencies.
The manager might also choose to exclude previously verified code (“leveraged” code, IP, etc.) from the target for regression and from coverage analysis, if the risk of an unexposed bug in the leveraged code is regarded as acceptably low.
4.12.2 Focusing Resources (§ 4)
As mentioned at the end of chapter 3, the function space to be verified can be quite vast, but it’s usually not necessary to exercise it exhaustively before tape-out or release to volume manufacturing. It is the nature of design bugs that they are nearly always associated with some corner case, so a regression suite that focuses on corner cases will be much more productive at exposing faulty behavior than one which attempts to exercise the entire space uniformly. By adjusting the weights (a.k.a. turning the knobs) for random selection of values for pseudo-random variables such that boundary values are chosen and used preferentially, bug discovery will probably be accelerated.
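As a sketch of such knob-turning, the constraint below uses a SystemVerilog dist weighting to prefer boundary values; the variable, its range, and the specific weights are illustrative assumptions.

```systemverilog
// Weight a pseudo-random variable toward its boundary values.
class stimulus_item;
  rand bit [7:0] burst_len;  // legal range assumed to be 0..255

  // Each named corner value gets weight 20; the entire interior
  // range shares a total weight of 1 (':/' divides the weight
  // across the range), so corners dominate the selections.
  constraint corners_first {
    burst_len dist {
      8'd0    := 20,
      8'd1    := 20,
      8'd254  := 20,
      8'd255  := 20,
      [2:253] :/ 1
    };
  }
endclass
```

Turning the knobs is then a matter of revising these weights between regression runs, rather than rewriting tests.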