ARM Architecture Reference Manual - P21


VFP Programmer’s Model

2.6 System registers

A VFP implementation contains three or more special-purpose system registers:

• The Floating-point System ID register (FPSID) is a read-only register whose value indicates which VFP implementation is being used. See FPSID on page C2-20 for details.

• The Floating-point Status and Control register (FPSCR) is a read/write register which provides all user-level status and control of the floating-point system. See FPSCR on page C2-21 for details of the FPSCR.

• The Floating-point Exception register (FPEXC) is a read/write register, two bits of which provide system-level status and control. The remaining bits of this register can be used to communicate exception information between the hardware and software components of the implementation, in an IMPLEMENTATION DEFINED manner. See FPEXC on page C2-24 for details of the FPEXC.

• Individual VFP implementations can define and use further system registers for the purpose of communicating between the hardware and software components of the implementation. All such registers are IMPLEMENTATION DEFINED. They may not be used outside the implementation itself, except as described in implementation-specific documentation.


2.6.1 FPSID

The FPSID has the following format:

implementor [31:24] | SW [23] | format [22:21] | SNG [20] | architecture [19:16] | part number [15:8] | variant [7:4] | revision [3:0]

Bits[31:24] Contain an implementor code. The following code is defined:

0x41 = A (ARM Ltd)

All other values of the implementor code are reserved by ARM Ltd.

Bit[23] Contains 0 if the implementation contains a hardware coprocessor, or 1 if it is a pure software implementation.

Bits[22:21] Indicate which FSTMX/FLDMX format is used (see Storing and reloading values of unknown precision on page C2-15):

0b00 Indicates standard format 1.

0b01 Indicates standard format 2.

0b10 Is reserved.

0b11 Indicates a non-standard format.

Bit[20] Contains 0 if the implementation supports both single precision and double precision (a D variant of the architecture), or 1 if it only supports single precision (a non-D variant).

Bits[19:16] Contain the architecture version number, encoded as follows:

0b0000 Indicates VFPv1.

All other values of this architecture version code are reserved by ARM Ltd.

Bits[15:8] Contain an IMPLEMENTATION DEFINED representation of the primary part number of the VFP implementation.

Bits[7:4] Contain an IMPLEMENTATION DEFINED variant number. This is typically used to distinguish variants of the same primary part. For example, two variants of the same VFP implementation might have hardware coprocessor interfaces designed to work with different ARM processors.

Bits[3:0] Contain the IMPLEMENTATION DEFINED revision number of the part.

The FPSID register is read-only, and can be accessed in both privileged and unprivileged modes. Attempts to write the FPSID register are ignored.
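The FPSID field boundaries listed above can be sketched as a small decoder. This is only an illustration: the struct and function names are hypothetical, and the FPSID value is passed in as a plain integer rather than read with an FMRX instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an FPSID value. All names here are illustrative. */
typedef struct {
    uint8_t implementor;   /* bits[31:24], 0x41 = ARM Ltd               */
    bool    software_only; /* bit[23], 1 = pure software implementation */
    uint8_t format;        /* bits[22:21], FSTMX/FLDMX format           */
    bool    single_only;   /* bit[20], 1 = non-D variant                */
    uint8_t architecture;  /* bits[19:16], 0b0000 = VFPv1               */
    uint8_t part_number;   /* bits[15:8], IMPLEMENTATION DEFINED        */
    uint8_t variant;       /* bits[7:4], IMPLEMENTATION DEFINED         */
    uint8_t revision;      /* bits[3:0], IMPLEMENTATION DEFINED         */
} fpsid_fields;

/* Split a raw 32-bit FPSID value into its fields. */
fpsid_fields fpsid_decode(uint32_t fpsid)
{
    fpsid_fields f;
    f.implementor   = (uint8_t)((fpsid >> 24) & 0xFFu);
    f.software_only = ((fpsid >> 23) & 1u) != 0u;
    f.format        = (uint8_t)((fpsid >> 21) & 0x3u);
    f.single_only   = ((fpsid >> 20) & 1u) != 0u;
    f.architecture  = (uint8_t)((fpsid >> 16) & 0xFu);
    f.part_number   = (uint8_t)((fpsid >> 8) & 0xFFu);
    f.variant       = (uint8_t)((fpsid >> 4) & 0xFu);
    f.revision      = (uint8_t)(fpsid & 0xFu);
    return f;
}
```

Code that only needs to know whether double precision is available would test single_only and ignore the IMPLEMENTATION DEFINED fields.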


2.6.2 FPSCR

The FPSCR has the following format:

All of these bits can be read and written, and can be accessed in both privileged and unprivileged modes.

Note

All bits described as DNM (Do Not Modify) in the diagram are reserved for future expansion. They are initialized to zeros. Non-initialization code must use read/modify/write techniques when handling the FPSCR, in order to ensure that these bits are not modified. Failure to observe this rule can result in code which has unexpected side effects on future systems.

The FPSCR bits are described in the following subsections.
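The read/modify/write rule in the Note can be sketched as follows. The DNM mask assumes the DNM fields sit at bits[27:25], bit[19], bits[15:13] and bits[7:5], which is what the field positions given in this section imply. The function name is hypothetical, and the FPSCR value is modelled as a plain integer rather than transferred with FMRX/FMXR.

```c
#include <stdint.h>

/* Assumed DNM bits: [27:25], [19], [15:13], [7:5]. */
#define FPSCR_DNM_MASK 0x0E08E0E0u

/*
 * Return the value to write back to the FPSCR when changing only the
 * bits selected by field_mask. Any DNM bits in field_mask are dropped,
 * so the DNM bits of 'current' are always preserved.
 */
uint32_t fpscr_update(uint32_t current, uint32_t new_bits, uint32_t field_mask)
{
    field_mask &= ~FPSCR_DNM_MASK;
    return (current & ~field_mask) | (new_bits & field_mask);
}
```

A caller would read the FPSCR, pass the value through fpscr_update, and write the result back, instead of writing a freshly built constant.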

Condition flags

Bits[31:28] of the FPSCR contain the results of the most recent floating-point comparison:

N Is 1 if the comparison produced a less than result.

Z Is 1 if the comparison produced an equal result.

C Is 1 if the comparison produced an equal, greater than or unordered result.

V Is 1 if the comparison produced an unordered result.

These condition flags do not directly affect conditional execution, either of ARM instructions or of VFP instructions. A comparison instruction is normally followed by an FMSTAT instruction. This transfers the FPSCR condition flags to the ARM CPSR flags, after which they can affect conditional execution.

For more details of how comparisons are performed, see Comparison instructions on page C3-6.
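As a rough illustration of the flag definitions above, the following sketch computes the N, Z, C and V bits for each of the four possible comparison results. The enum, macro and function names are hypothetical.

```c
#include <stdint.h>

/* The four mutually exclusive comparison results (illustrative names). */
typedef enum { CMP_LESS, CMP_EQUAL, CMP_GREATER, CMP_UNORDERED } cmp_result;

#define FPSCR_N (1u << 31)
#define FPSCR_Z (1u << 30)
#define FPSCR_C (1u << 29)
#define FPSCR_V (1u << 28)

/* Build FPSCR bits[31:28] for a comparison result. */
uint32_t fpscr_flags_for(cmp_result r)
{
    uint32_t flags = 0;
    if (r == CMP_LESS)      flags |= FPSCR_N;
    if (r == CMP_EQUAL)     flags |= FPSCR_Z;
    if (r != CMP_LESS)      flags |= FPSCR_C; /* equal, greater than or unordered */
    if (r == CMP_UNORDERED) flags |= FPSCR_V;
    return flags;
}
```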

Flush-to-zero mode control

Bit[24] of the FPSCR is the FZ bit and controls flush-to-zero mode. See Flush-to-zero mode on page C2-13 for details of this processing mode.

FZ == 0 Flush-to-zero mode is disabled and the behavior of the floating-point system is fully compliant with the IEEE 754 standard.

FZ == 1 Flush-to-zero mode is enabled.

FPSCR register layout (bit 31 down to bit 0): N [31] | Z [30] | C [29] | V [28] | DNM [27:25] | FZ [24] | RMODE [23:22] | STRIDE [21:20] | DNM [19] | LEN [18:16] | DNM [15:13] | IXE [12] | UFE [11] | OFE [10] | DZE [9] | IOE [8] | DNM [7:5] | IXC [4] | UFC [3] | OFC [2] | DZC [1] | IOC [0]

Rounding mode control

Bits[23:22] of the FPSCR select the current rounding mode. This rounding mode is used for almost all floating-point instructions. The only floating-point instructions which do not use it are FTOSIZD, FTOSIZS, FTOUIZD and FTOUIZS, which always use RZ mode.

The rounding modes are encoded as follows:

0b00 Indicates Round to Nearest (RN) mode.

0b01 Indicates Round towards Plus Infinity (RP) mode.

0b10 Indicates Round towards Minus Infinity (RM) mode.

0b11 Indicates Round towards Zero (RZ) mode.

See Rounding on page C2-9 for details of the rounding modes.
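A minimal sketch of reading and changing the rounding mode field, using the encoding listed above. The type and function names are hypothetical, and the FPSCR is again modelled as a plain integer.

```c
#include <stdint.h>

/* Rounding mode encodings for FPSCR bits[23:22] (illustrative names). */
typedef enum { VFP_RN = 0, VFP_RP = 1, VFP_RM = 2, VFP_RZ = 3 } vfp_rmode;

#define FPSCR_RMODE_SHIFT 22
#define FPSCR_RMODE_MASK  (3u << FPSCR_RMODE_SHIFT)

/* Extract the current rounding mode from an FPSCR value. */
vfp_rmode fpscr_get_rmode(uint32_t fpscr)
{
    return (vfp_rmode)((fpscr & FPSCR_RMODE_MASK) >> FPSCR_RMODE_SHIFT);
}

/* Return the FPSCR value with only the rounding mode field replaced. */
uint32_t fpscr_set_rmode(uint32_t fpscr, vfp_rmode mode)
{
    return (fpscr & ~FPSCR_RMODE_MASK) | ((uint32_t)mode << FPSCR_RMODE_SHIFT);
}
```

Because fpscr_set_rmode only touches bits[23:22], it leaves every other FPSCR bit, including the reserved ones, unchanged.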

Vector length/stride control

The LEN field (bits[18:16]) of the FPSCR controls the vector length for VFP instructions that operate on short vectors, that is, how many registers are in a vector operand. Similarly, the STRIDE field (bits[21:20]) controls the vector stride, that is, how far apart the registers in a vector lie in the register bank. The allowed combinations of LEN and STRIDE are shown in Table 2-2 on page C2-23. All other combinations of LEN and STRIDE produce UNPREDICTABLE results.

The combination LEN == 0b000, STRIDE == 0b00 is sometimes called scalar mode. When it is in effect, all arithmetic instructions specify simple scalar operations. Otherwise, most arithmetic instructions specify a scalar operation if their destination lies in the range S0-S7 (for single precision) or D0-D3 (for double precision). The full rules used to determine which operands are vectors and full details of how vector operands are specified can be found in Chapter C5 VFP Addressing Modes and in the individual instruction descriptions.
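The scalar mode test and the destination-register rule described above can be sketched as follows. This is a simplification under stated assumptions: it models only the two rules quoted in this section, not the full operand rules of Chapter C5 or the instructions covered by the word "most", and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scalar mode is LEN == 0b000 (bits[18:16]) with STRIDE == 0b00 (bits[21:20]). */
bool fpscr_is_scalar_mode(uint32_t fpscr)
{
    uint32_t len    = (fpscr >> 16) & 0x7u;
    uint32_t stride = (fpscr >> 20) & 0x3u;
    return len == 0u && stride == 0u;
}

/*
 * True if an arithmetic instruction writing register number dest_reg
 * specifies a scalar operation: always in scalar mode, otherwise only
 * when the destination is in S0-S7 or D0-D3.
 */
bool dest_is_scalar(uint32_t fpscr, unsigned dest_reg, bool double_precision)
{
    if (fpscr_is_scalar_mode(fpscr)) {
        return true;
    }
    return double_precision ? (dest_reg <= 3u) : (dest_reg <= 7u);
}
```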

Exception status and control

Bits[12:8] and bits[4:0] of the FPSCR are the trap enable bits and cumulative exception bits respectively for the five types of exception. For details of what these do, see Floating-point exceptions on page C2-10. Table 2-3 shows which bits are associated with each exception.

Table 2-2 Vector length/stride combinations

LEN     STRIDE   Vector length   Vector stride   Double-precision vector instructions
0b000   0b00     1               -               All instructions are scalar

2.6.3 FPEXC

The FPEXC register has the following format:

This register can only be accessed in privileged modes.

The EX bit

The EX bit (bit[31]) is a status bit which specifies how much information needs to be saved to record the state of the floating-point system. It can be read on all VFP implementations, and is mainly of interest to process swap code.

EX == 0 In this case, the only significant state in the floating-point system is the contents of the architecturally defined writable registers, that is, of the general-purpose registers, FPSCR and FPEXC. If EX == 0 when a process is swapped out, only these registers need to be saved, or reloaded when the process is swapped back in. Also, no unexpected ARM exceptions (such as an undefined instruction exception to process a pending exception in the hardware) must occur during the saving and reloading of the registers.

EX == 1 Here, there is additional IMPLEMENTATION DEFINED significant state in the floating-point system which process swap code needs to handle. This typically occurs when VFP hardware requires support code assistance to handle a potential exception, and one or more of the additional hardware system registers contains details of the potential exception. (Some implementations describe this by saying that the hardware is in an exceptional state.) The actions required to swap a process out when EX == 1 and to swap such a process back in are IMPLEMENTATION DEFINED.

The behavior of the EX bit when FPEXC is written is IMPLEMENTATION DEFINED, subject to the constraint that writing a 0 to the EX bit must be a legitimate action. Otherwise, the process swap technique described above for the case EX == 0 cannot work.

The EN bit

The EN bit (bit[30]) is a global enable bit, and can be both read and written.

EN == 1 In this case, the floating-point system is enabled and operates normally.

EN == 0 Here, the floating-point system is disabled. In this state, all VFP instructions are treated as undefined instructions when executed in an unprivileged ARM processor mode, and all except the following are treated as undefined instructions when executed in a privileged ARM processor mode:

• an FMXR to the FPEXC or FPSID register

• an FMRX from the FPEXC or FPSID register.

Note

An FMXR to the FPSCR or an FMRX from the FPSCR is treated as an undefined instruction when EN == 0.

If a VFP implementation contains additional system registers besides FPSID, FPSCR, and FPEXC, the behavior of FMXR instructions to them and FMRX instructions from them is IMPLEMENTATION DEFINED.

Other bits

All bits of the FPEXC other than the EX and EN bits are IMPLEMENTATION DEFINED, including whether they are readable, writable or both. They are typically used in hardware implementations for communicating exception information between the VFP hardware and its support code.

A constraint on how these bits are defined is that when the EX bit is 0, it must be possible to save and reload all significant state in the floating-point system by saving and reloading only the VFP general-purpose registers, FPSCR and FPEXC.
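The EX and EN rules of this section can be sketched as two small predicates. The names are hypothetical. The second function models only the restriction imposed by EN == 0 on FMXR/FMRX transfers for the three architecturally defined system registers; it does not model the separate rule that FPEXC is accessible only in privileged modes, nor any IMPLEMENTATION DEFINED registers.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPEXC_EX (1u << 31)
#define FPEXC_EN (1u << 30)

/* The three architecturally defined system registers (illustrative names). */
typedef enum { VFP_REG_FPSID, VFP_REG_FPSCR, VFP_REG_FPEXC } vfp_sysreg;

/* Process swap code needs IMPLEMENTATION DEFINED handling only if EX == 1. */
bool fpexc_needs_extra_state(uint32_t fpexc)
{
    return (fpexc & FPEXC_EX) != 0u;
}

/* True if the EN bit makes an FMXR/FMRX on 'reg' an undefined instruction. */
bool sysreg_transfer_undefined_by_en(uint32_t fpexc, vfp_sysreg reg, bool privileged)
{
    if ((fpexc & FPEXC_EN) != 0u) {
        return false;             /* enabled: EN imposes no restriction     */
    }
    if (!privileged) {
        return true;              /* disabled: every VFP instruction traps  */
    }
    return reg == VFP_REG_FPSCR;  /* disabled, privileged: FPSCR still traps */
}
```

Process swap code would typically check fpexc_needs_extra_state first and fall back to the plain register save only when it returns false.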


2.7 Reset behavior and initialization

When a hardware VFP implementation is reset, the FPEXC EN bit is reset to 0. The behavior of all other VFP registers and of the remaining bits of FPEXC on hardware reset is IMPLEMENTATION DEFINED.

When the software component of a VFP implementation has finished initializing, the following are true:

• The FPEXC EN bit is set to 1.

• The FPEXC EX bit is set to 0.

• All bits of the FPSCR are set to 0, with the possible exception of the condition code flags in some cases. This selects the following settings:

- normal IEEE 754 mode, not flush-to-zero mode

- the Round to Nearest rounding mode

- scalar mode (vector length 1)

- all exceptions are untrapped, and their cumulative status bits indicate that no exceptions of that type have been detected yet.

It is IMPLEMENTATION DEFINED whether the VFP general-purpose registers and the FPSCR condition flags are initialized, and if so, what values they are initialized to.

Chapter C3

VFP Instruction Set Overview

This chapter gives an overview of the VFP instruction set. It contains the following sections:

• Data-processing instructions on page C3-2.

• Load and Store instructions on page C3-13.

• Register transfer instructions on page C3-17.

3.1 Data-processing instructions

All VFP data-processing instructions are CDP instructions for coprocessors 10 or 11, with the following format:

cond [31:28] | 1 1 1 0 [27:24] | p [23] | D [22] | q [21] | r [20] | Fn [19:16] | Fd [15:12] | cp_num [11:8] | N [7] | s [6] | M [5] | 0 [4] | Fm [3:0]

p, q, r, s These bits collectively form the instruction's primary opcode. See Table 3-1 for the assignment of these opcodes. When all of p, q, r and s are 1, the instruction is a two-operand extension instruction, with an extension opcode specified by the Fn and N bits.

Fd and D These bits normally specify the destination register of the instruction:

• For a single-precision instruction, Fd holds the top 4 bits of the register number and D holds the bottom bit.

• For a double-precision instruction, Fd holds the register number and D must be 0.

If D is 1 in a double-precision instruction, the instruction is UNDEFINED.

For multiply-accumulate instructions, this register is also the accumulate operand register. For comparison instructions, it is the first operand register rather than a destination register.

Fn and N These bits normally specify the first operand register of the instruction:

• For a single-precision instruction, Fn holds the top 4 bits of the register number and N holds the bottom bit.

• For a double-precision instruction, Fn holds the register number and N must be 0.

However, if p, q, r and s are all 1, the instruction is an extension instruction, and the Fn and N fields form an extension opcode instead of specifying a register. See Table 3-2 for the assignment of these extension opcodes.

If N is 1 in a double-precision non-extension instruction, the instruction is UNDEFINED.

Fm and M These bits specify the second operand register of the instruction, or the only operand register for some extension instructions:

• For a single-precision instruction, Fm holds the top 4 bits of the register number and M holds the bottom bit.

• For a double-precision instruction, Fm holds the register number and M must be 0.

If M is 1 in a double-precision instruction, the instruction is UNDEFINED.

cp_num If cp_num is 0b1010 (coprocessor number 10), the instruction is a single-precision instruction. If cp_num is 0b1011 (coprocessor number 11), the instruction is a double-precision instruction.

For the instructions that convert between single-precision and double-precision (FCVTDS and FCVTSD), cp_num matches the source precision.
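The field rules for the data-processing format can be sketched as a decoder. This is an illustration under stated assumptions: the struct and function names are hypothetical, the bit positions are those of the standard CDP encoding (cond at bits[31:28] down to Fm at bits[3:0]), and the mixed-precision conversions FCVTDS and FCVTSD, whose destination precision differs from cp_num, are not given special treatment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded VFP data-processing instruction (illustrative names). */
typedef struct {
    unsigned primary_opcode; /* p:q:r:s as a 4-bit value                  */
    bool     is_double;      /* cp_num == 11                              */
    bool     is_extension;   /* p, q, r and s all 1                       */
    unsigned ext_opcode;     /* Fn:N, meaningful only if is_extension     */
    unsigned fd, fn, fm;     /* register numbers (fn unused if extension) */
    bool     undefined;      /* D, N or M set where it must be 0          */
} vfp_dp_insn;

/* Returns false if insn is not a CDP instruction for coprocessor 10 or 11. */
bool vfp_dp_decode(uint32_t insn, vfp_dp_insn *out)
{
    unsigned cp_num = (insn >> 8) & 0xFu;
    if (((insn >> 24) & 0xFu) != 0xEu || ((insn >> 4) & 1u) != 0u) {
        return false;
    }
    if (cp_num != 10u && cp_num != 11u) {
        return false;
    }

    unsigned p = (insn >> 23) & 1u, d = (insn >> 22) & 1u;
    unsigned q = (insn >> 21) & 1u, r = (insn >> 20) & 1u;
    unsigned n = (insn >> 7) & 1u,  s = (insn >> 6) & 1u, m = (insn >> 5) & 1u;
    unsigned fn = (insn >> 16) & 0xFu, fd = (insn >> 12) & 0xFu, fm = insn & 0xFu;

    out->primary_opcode = (p << 3) | (q << 2) | (r << 1) | s;
    out->is_double      = (cp_num == 11u);
    out->is_extension   = (out->primary_opcode == 0xFu);
    out->ext_opcode     = (fn << 1) | n;

    if (out->is_double) {
        /* Double precision: the 4-bit field is the register number. */
        out->fd = fd; out->fn = fn; out->fm = fm;
        out->undefined = d != 0u || m != 0u || (n != 0u && !out->is_extension);
    } else {
        /* Single precision: 4-bit field is the top bits, D/N/M the bottom bit. */
        out->fd = (fd << 1) | d;
        out->fn = (fn << 1) | n;
        out->fm = (fm << 1) | m;
        out->undefined = false;
    }
    return true;
}
```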

Table 3-1 and Table 3-2 show the assignment of VFP data-processing opcodes. In these tables, Fd is used to mean a destination register of the appropriate precision, that is, Sd for single-precision instructions and Dd for double-precision instructions. Fn and Fm are used similarly.

Table 3-1 VFP data-processing primary opcodes

p q r s   Instruction name (cp_num=10)   Instruction name (cp_num=11)   Instruction functionality
1 1 1 1   Extension instructions

Table 3-2 VFP data-processing extension opcodes

Extension opcode   Instruction name
Fn     N           cp_num=10   cp_num=11   Instruction functionality
0100   0           FCMPS       FCMPD       Compare Fd with Fm, no exceptions on quiet NaNs
0100   1           FCMPES      FCMPED      Compare Fd with Fm, with exceptions on quiet NaNs
0101   0           FCMPZS      FCMPZD      Compare Fd with 0, no exceptions on quiet NaNs
0101   1           FCMPEZS     FCMPEZD     Compare Fd with 0, with exceptions on quiet NaNs
0111   1           FCVTDS      FCVTSD      Single ↔ double precision conversions
1000   0           FUITOS      FUITOD      Unsigned integer → floating-point conversions
1000   1           FSITOS      FSITOD      Signed integer → floating-point conversions
1100   0           FTOUIS      FTOUID      Floating-point → unsigned integer conversions
1100   1           FTOUIZS     FTOUIZD     Floating-point → unsigned integer conversions, RZ mode
1101   0           FTOSIS      FTOSID      Floating-point → signed integer conversions
1101   1           FTOSIZS     FTOSIZD     Floating-point → signed integer conversions, RZ mode

3.1.1 Basic arithmetic instructions and square root

The FADDS, FSUBS, FMULS, FDIVS, and FSQRTS instructions provide the four basic arithmetic operations and square root on single-precision values. Similarly, the FADDD, FSUBD, FMULD, FDIVD, and FSQRTD instructions supply these operations on double-precision values. In addition, the FNMULS and FNMULD instructions supply negated multiplications in single and double precision respectively. Their results are precisely equivalent to those of performing an FMULS or FMULD instruction followed by an FNEGS or FNEGD instruction (which inverts the sign of the result).

All of these instructions can be made to operate on short vectors by setting the FPSCR LEN and STRIDE fields appropriately (see Chapter C5 VFP Addressing Modes for details).

The operations performed by all these instructions are always treated as floating-point operations, both for NaN handling and flush-to-zero mode. In particular, signaling NaN operands cause Invalid Operation exceptions, and in flush-to-zero mode, denormalized operands are treated as +0 and sufficiently small results are forced to +0.

3.1.2 Multiply-accumulate instructions

FMACS, FMACD, FNMACS, FNMACD, FMSCS, FMSCD, FNMSCS, and FNMSCD are multiply-accumulate instructions. They multiply their two main operands, possibly invert the sign bit of the product, add or subtract the value in the destination register and write the result back to the destination register. They are in all respects equivalent to the following sequences of basic arithmetic and negation instructions:

FMACS Sd,Sn,Sm:    FMULS St,Sn,Sm
                   FADDS Sd,St,Sd

FMACD Dd,Dn,Dm:    FMULD Dt,Dn,Dm
                   FADDD Dd,Dt,Dd

FNMACS Sd,Sn,Sm:   FMULS St,Sn,Sm
                   FNEGS St,St
                   FADDS Sd,St,Sd

FNMACD Dd,Dn,Dm:   FMULD Dt,Dn,Dm
                   FNEGD Dt,Dt
                   FADDD Dd,Dt,Dd

FMSCS Sd,Sn,Sm:    FMULS St,Sn,Sm
                   FSUBS Sd,St,Sd

FMSCD Dd,Dn,Dm:    FMULD Dt,Dn,Dm
                   FSUBD Dd,Dt,Dd

FNMSCS Sd,Sn,Sm:   FMULS St,Sn,Sm
                   FNEGS St,St
                   FSUBS Sd,St,Sd

FNMSCD Dd,Dn,Dm:   FMULD Dt,Dn,Dm
                   FNEGD Dt,Dt
                   FSUBD Dd,Dt,Dd

where St or Dt describes a notional register used to hold intermediate results, treated as being a scalar if Sd or Dd is a scalar and a vector if Sd or Dd is a vector.

Note

This implies that each multiply-accumulate operation involves two roundings:

• one on the multiplication result

• one on the result of the final addition or subtraction.

Both of these roundings are performed fully and as defined by the IEEE 754 standard. In particular, these instructions do not specify fused multiply-accumulates as used in a number of other architectures.
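The effect of the two roundings can be illustrated on a host machine with a scalar model of FMACS. This is a sketch only: it assumes the host float type is IEEE 754 single precision and that the host is in round-to-nearest mode, and the function name is hypothetical. The volatile temporaries force each intermediate value to be rounded to single precision and stop the compiler from contracting the two operations into a fused multiply-add.

```c
/*
 * Host-side model of FMACS Sd,Sn,Sm, following the equivalent sequence
 * FMULS St,Sn,Sm followed by FADDS Sd,St,Sd.
 */
float fmacs_model(float sd, float sn, float sm)
{
    volatile float product = sn * sm;  /* first rounding, as in FMULS  */
    volatile float sum = product + sd; /* second rounding, as in FADDS */
    return sum;
}
```

With sn = sm = 1 + 2^-12 and sd = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. Rounding it to single precision drops the 2^-24 term, so the model returns zero, whereas a fused multiply-accumulate would keep that term and return 2^-24.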

All of these instructions can be made to operate on short vectors by setting the FPSCR LEN and STRIDE fields appropriately (see Chapter C5 VFP Addressing Modes for details).

The operations performed by all these instructions are always treated as floating-point operations, both for NaN handling and flush-to-zero mode. In particular, signaling NaN operands cause Invalid Operation exceptions, and in flush-to-zero mode, denormalized operands are treated as +0 and sufficiently small results are forced to +0.

3.1.3 Comparison instructions

The FCMPS, FCMPD, FCMPES, and FCMPED instructions perform comparisons between two register values. The FCMPZS, FCMPZD, FCMPEZS, and FCMPEZD instructions perform comparisons between a register value and the constant +0.

The IEEE 754 standard specifies that precisely one of four relationships holds between any two values being compared. These are as follows:

• Two values are considered equal if any of the following conditions holds:

- They are both numeric and have the same numerical value. This usually means that they have precisely the same representation, but also includes the case that one is +0 and the other is -0.

- They are both +∞ (plus infinity).

- They are both −∞ (minus infinity).

• The first value is considered less than the second value if any of the following conditions holds:

- They are both numeric and the numeric value of the first is less than that of the second.

- The first is −∞ (minus infinity) and the second is numeric.

- The first is numeric and the second is +∞ (plus infinity).

- The first is −∞ (minus infinity) and the second is +∞ (plus infinity).

• The first value is considered greater than the second value if any of the following conditions holds:

- They are both numeric and the numeric value of the first is greater than that of the second.

- The first is +∞ (plus infinity) and the second is numeric.

- The first is numeric and the second is −∞ (minus infinity).

- The first is +∞ (plus infinity) and the second is −∞ (minus infinity).

• Two values are unordered if either or both of them are NaNs.


Note

If both values are the same NaN, the comparison result is unordered, not equal. If an exact bit-by-bit comparison is wanted, the ARM comparison instructions must be used rather than VFP comparison instructions, both for this reason and because +0 and -0 compare as equal.
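The difference between the four-way floating-point relation and a bit-by-bit comparison can be sketched on a host machine. This is an illustration only, assuming the host float type is IEEE 754 single precision; the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* The four mutually exclusive relations (illustrative names). */
typedef enum { REL_EQUAL, REL_LESS, REL_GREATER, REL_UNORDERED } fp_relation;

/* Classify the relation between a and b as a floating-point comparison does. */
fp_relation fp_compare(float a, float b)
{
    if (a != a || b != b) {
        return REL_UNORDERED; /* a NaN is the only value not equal to itself */
    }
    if (a == b) {
        return REL_EQUAL;     /* includes +0 compared with -0 */
    }
    return (a < b) ? REL_LESS : REL_GREATER;
}

/* Exact bit-by-bit comparison, as an integer comparison would perform it. */
bool bits_identical(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}
```

The two functions disagree in exactly the cases the Note describes: +0 and -0 are REL_EQUAL but not bit-identical, and a NaN compared with itself is bit-identical but REL_UNORDERED.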

For all the comparison instructions, the result of the comparison is placed in the FPSCR flags, as shown in Table 3-3:

These FPSCR flag values need to be copied to the ARM CPSR flags before ARM conditional execution can be based on them. For this purpose, a special form of the FMRX instruction (called FMSTAT) is used. This is described in System register transfer instructions on page C3-20.

When the result of the comparison is unordered, it is possible that the comparison can also generate an Invalid Operation exception because of the NaN operand(s). These instructions supply two distinct forms of Invalid Operation exception generation:

• The FCMPS, FCMPD, FCMPZS, and FCMPZD instructions have the normal behavior of generating an Invalid Operation exception when either or both of their operands are signaling NaNs. If neither operand is a signaling NaN, but one or both are quiet NaNs, they generate an unordered result without an accompanying Invalid Operation exception.

• The FCMPES, FCMPED, FCMPEZS, and FCMPEZD instructions generate an Invalid Operation exception when either or both of their operands are NaNs, regardless of whether they are signaling or quiet NaNs. It is not possible to get an unordered result from these instructions without an accompanying Invalid Operation exception.
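The two forms of Invalid Operation generation can be sketched as a predicate over operand classes. The names are hypothetical. The bit-level classifier additionally assumes the usual single-precision NaN encoding, in which the most significant fraction bit distinguishes quiet NaNs (1) from signaling NaNs (0); that detail is not stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { FP_NUMERIC, FP_QUIET_NAN, FP_SIGNALING_NAN } fp_class;

/* Classify raw single-precision bits (assumed NaN encoding, see lead-in). */
fp_class fp_classify_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x007FFFFFu;
    if (exponent != 0xFFu || fraction == 0u) {
        return FP_NUMERIC; /* finite values and infinities */
    }
    return (fraction & 0x00400000u) ? FP_QUIET_NAN : FP_SIGNALING_NAN;
}

/*
 * e_variant is true for FCMPES, FCMPED, FCMPEZS and FCMPEZD.
 * For the compare-with-zero forms, pass FP_NUMERIC as the second class.
 */
bool compare_raises_invalid(bool e_variant, fp_class a, fp_class b)
{
    bool any_snan = (a == FP_SIGNALING_NAN) || (b == FP_SIGNALING_NAN);
    bool any_nan  = (a != FP_NUMERIC) || (b != FP_NUMERIC);
    return e_variant ? any_nan : any_snan;
}
```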

The VFP comparison instructions always treat their operands as scalars, regardless of the settings of the FPSCR LEN and STRIDE fields.

The operations performed by all these instructions are always treated as floating-point operations, both for NaN handling and flush-to-zero mode. In particular, signaling NaN operands cause Invalid Operation exceptions, and in flush-to-zero mode, denormalized operands are treated as +0.

Table 3-3 VFP comparison flag values

Comparison result   N   Z   C   V
Equal               0   1   1   0
Less than           1   0   0   0
Greater than        0   0   1   0
Unordered           0   0   1   1

Title	VFP Programmer's Model
Institution	Sample University
Subject	Computer Architecture
Type	Document
Year of publication	2000