ARM Architecture Reference Manual - P26


5.1.3 Scalar operations

If the destination register lies in the first bank of eight registers, the instruction specifies a scalar operation:

    if d_bank == 0 then
        vec_len = 1
        Sd[0] = d_num
        Sn[0] = n_num
        Sm[0] = m_num

Note

Source operands

The source operands are always scalars, regardless of which bank they are in. This allows individual elements of vectors to be used as scalars.


5.1.4 Mixed vector/scalar operations

If the destination register specified in the instruction does not lie in the first bank of eight registers, but the second source register does, then the destination register and first source register specify vectors and the second source register specifies a scalar:

    if d_bank != 0 and m_bank == 0 then
        vec_len = vector length specified by FPSCR
        for i = 0 to vec_len-1
            Sd[i] = (d_bank << 3) | d_index
            Sn[i] = (n_bank << 3) | n_index
            Sm[i] = m_num
            d_index = d_index + (vector stride specified by FPSCR)
            if d_index > 7 then d_index = d_index - 8
            n_index = n_index + (vector stride specified by FPSCR)
            if n_index > 7 then n_index = n_index - 8
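The loop above can be traced with a short Python sketch. The function name is hypothetical, and passing the FPSCR vector length and stride as plain integer parameters is an assumption made for illustration. The split of a 5-bit register number into a 2-bit bank and a 3-bit index follows the (bank << 3) | index expressions in the pseudocode.

```python
def mixed_vector_scalar(d_num, n_num, m_num, vec_len, stride):
    """Return the (Sd, Sn, Sm) register-number lists for the mixed form.

    vec_len and stride stand in for the FPSCR vector length and stride
    fields (hypothetical parameters, not part of any real API).
    """
    d_bank, d_index = d_num >> 3, d_num & 7
    n_bank, n_index = n_num >> 3, n_num & 7
    m_bank = m_num >> 3
    if d_bank == 0 or m_bank != 0:
        raise ValueError("not a mixed vector/scalar operation")

    sd, sn, sm = [], [], []
    for _ in range(vec_len):
        sd.append((d_bank << 3) | d_index)
        sn.append((n_bank << 3) | n_index)
        sm.append(m_num)  # the scalar operand stays fixed
        d_index += stride
        if d_index > 7:
            d_index -= 8  # wrap to the bottom of the bank
        n_index += stride
        if n_index > 7:
            n_index -= 8
    return sd, sn, sm


# Destination vector starts at s14, first source at s22, scalar in s3.
print(mixed_vector_scalar(14, 22, 3, vec_len=4, stride=1))
```

With a length of 4 and a stride of 1, the destination vector is s14, s15, s8, s9: the index wraps within bank 1 rather than running on into bank 2.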

Notes

First source operand

The first operand is always a vector, regardless of which bank it is in. This allows a set of consecutive registers in the first bank to be treated as a vector.

Vector wrap-around

A vector operand must not wrap around so that it re-uses its first element. Otherwise, the results of the instruction are UNPREDICTABLE. When the FPSCR specifies a vector stride of 1, this is not a restriction, because the vector length is at most 8. When the FPSCR specifies a vector stride of 2, it implies that the vector length must be at most 4.

Operand overlap

If two operands overlap, they must be identical both in terms of which registers are accessed and the order in which they are accessed. Otherwise, the results of the instruction are UNPREDICTABLE. This implies that:

• If the set of register numbers generated in Sd[i] overlaps the set of register numbers generated in Sn[i], then d_num and n_num must be identical.

• If the set of register numbers generated in Sn[i] includes m_num, the vector length must be 1.

It is impossible for the set of register numbers generated in Sd[i] to include m_num, because they lie in different banks.
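The wrap-around rule in the notes can be checked mechanically. The sketch below is an illustration only: the function name and parameters are hypothetical, and the bank size is passed in so that the same check covers banks of eight or four registers.

```python
def reuses_first_element(start_index, vec_len, stride, bank_size=8):
    """Return True if the vector wraps around onto its first element.

    start_index is the index of the first element within its bank;
    vec_len and stride stand in for the FPSCR fields.
    """
    index = start_index
    for _ in range(vec_len - 1):
        index += stride
        if index > bank_size - 1:
            index -= bank_size  # wrap to the bottom of the bank
        if index == start_index:
            return True
    return False


print(reuses_first_element(0, vec_len=8, stride=1))  # stride 1, length 8
print(reuses_first_element(0, vec_len=5, stride=2))  # stride 2, length 5
```

The first call reports no re-use, matching the statement that a stride of 1 imposes no restriction in a bank of eight. The second reports re-use, because a stride of 2 returns to the starting register on the fifth element.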


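5.1.5 Vector operations

If neither the destination register nor the second source register lies in the first bank of eight registers, then all three register operands specify vectors:

    if d_bank != 0 and m_bank != 0 then
        vec_len = vector length specified by FPSCR
        for i = 0 to vec_len-1
            Sd[i] = (d_bank << 3) | d_index
            Sn[i] = (n_bank << 3) | n_index
            Sm[i] = (m_bank << 3) | m_index
            d_index = d_index + (vector stride specified by FPSCR)
            if d_index > 7 then d_index = d_index - 8
            n_index = n_index + (vector stride specified by FPSCR)
            if n_index > 7 then n_index = n_index - 8
            m_index = m_index + (vector stride specified by FPSCR)
            if m_index > 7 then m_index = m_index - 8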

Notes

Vector wrap-around

A vector operand must not wrap around so that it re-uses its first element. Otherwise, the results of the instruction are UNPREDICTABLE. When the FPSCR specifies a vector stride of 1, this is not a restriction, since the vector length is at most 8. When the FPSCR specifies a vector stride of 2, it implies that the vector length must be at most 4.

Operand overlap

If two operands overlap, they must be identical both in terms of which registers are accessed and the order in which they are accessed. Otherwise, the results of the instruction are UNPREDICTABLE. This implies that:

• If the set of register numbers generated in Sd[i] overlaps the set of register numbers generated in Sn[i], then d_num and n_num must be identical.

• If the set of register numbers generated in Sd[i] overlaps the set of register numbers generated in Sm[i], then d_num and m_num must be identical.

• If the set of register numbers generated in Sn[i] overlaps the set of register numbers generated in Sm[i], then n_num and m_num must be identical.
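The overlap conditions can be expressed as a pairwise test on two vector operands. This is a sketch under stated assumptions: the helper names are hypothetical, the FPSCR length and stride are passed as integers, and both operands are treated as vectors in banks of eight registers.

```python
def vector_registers(num, vec_len, stride, bank_size=8):
    """List the register numbers accessed by a vector starting at num."""
    bank, index = divmod(num, bank_size)
    regs = []
    for _ in range(vec_len):
        regs.append(bank * bank_size + index)
        index += stride
        if index > bank_size - 1:
            index -= bank_size
    return regs


def overlap_is_allowed(a_num, b_num, vec_len, stride):
    """True if the two vectors are disjoint or start at the same register."""
    a = set(vector_registers(a_num, vec_len, stride))
    b = set(vector_registers(b_num, vec_len, stride))
    return not (a & b) or a_num == b_num


print(overlap_is_allowed(8, 10, vec_len=4, stride=1))  # s8-s11 vs s10-s13
print(overlap_is_allowed(8, 12, vec_len=4, stride=1))  # s8-s11 vs s12-s15
```

Two vectors that share the same length and stride access the same registers in the same order exactly when they start at the same register, which is why the check reduces to comparing the starting register numbers.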


5.2 Addressing Mode 2 - Double-precision vectors (non-monadic)

When the vector length indicated by the FPSCR is greater than 1, the double-precision two-operand instructions FADDD, FDIVD, FMULD, FNMULD, and FSUBD can specify three different types of behavior:

• One arithmetic operation between two scalar values, yielding a scalar:

ScalarA op ScalarB → ScalarD

When this case is selected (see Scalar operations on page C5-11), it causes just one operation to be performed, overriding the vector length specified in the FPSCR. This allows scalar operations and vector operations to be mixed without the need to reprogram the FPSCR between them.

• A set of N arithmetic operations, where N is the vector length specified in the FPSCR, with the first operand scanning through a vector, the second operand remaining constant and the destination scanning through a vector:

VectorA[0] op ScalarB → VectorD[0]

VectorA[1] op ScalarB → VectorD[1]

...

VectorA[N-1] op ScalarB → VectorD[N-1]

This can be abbreviated to:

VectorA op ScalarB → VectorD

• A set of N arithmetic operations, where N is the vector length specified in the FPSCR, with both operands and the destination scanning through vectors:

VectorA[0] op VectorB[0] → VectorD[0]

VectorA[1] op VectorB[1] → VectorD[1]

...

VectorA[N-1] op VectorB[N-1] → VectorD[N-1]

This can be abbreviated to:

VectorA op VectorB → VectorD

The double-precision three-operand instructions FMACD, FMSCD, FNMACD, and FNMSCD each use the same register for their addition/subtraction operand as for their destination, so they have three forms corresponding to the above three:

• A pure scalar form:

± (ScalarA * ScalarB) ± ScalarD → ScalarD

• A form in which the second multiplication operand is a scalar and everything else scans through vectors:

± (VectorA[0] * ScalarB) ± VectorD[0] → VectorD[0]

± (VectorA[1] * ScalarB) ± VectorD[1] → VectorD[1]

...

± (VectorA[N-1] * ScalarB) ± VectorD[N-1] → VectorD[N-1]

This can be abbreviated to:

± (VectorA * ScalarB) ± VectorD → VectorD

• A form in which everything scans through a vector:

± (VectorA[0] * VectorB[0]) ± VectorD[0] → VectorD[0]

± (VectorA[1] * VectorB[1]) ± VectorD[1] → VectorD[1]

...

± (VectorA[N-1] * VectorB[N-1]) ± VectorD[N-1] → VectorD[N-1]

This can be abbreviated to:

± (VectorA * VectorB) ± VectorD → VectorD
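As a rough illustration of the three multiply-accumulate forms listed above, the following Python sketch models registers as plain floats and lists. It fixes both ± signs to +, and it ignores rounding, exceptions, and register-bank selection, so it shows only the data flow. The function names are hypothetical.

```python
def mac_scalar(a, b, d):
    """Pure scalar form: (ScalarA * ScalarB) + ScalarD -> ScalarD."""
    return a * b + d


def mac_vector_scalar(vec_a, b, vec_d):
    """Second multiplication operand is a scalar; the rest scan vectors."""
    return [a * b + d for a, d in zip(vec_a, vec_d)]


def mac_vector_vector(vec_a, vec_b, vec_d):
    """Every operand scans through a vector."""
    return [a * b + d for a, b, d in zip(vec_a, vec_b, vec_d)]


print(mac_scalar(2.0, 3.0, 1.0))
print(mac_vector_scalar([1.0, 2.0], 3.0, [10.0, 20.0]))
print(mac_vector_vector([1.0, 2.0], [3.0, 4.0], [10.0, 20.0]))
```

In each vector form the returned list plays the role of the updated VectorD, since the instruction writes its result back to the register that supplied the addition operand.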

5.2.1 Register banks

To allow these various forms to be specified, the set of 16 double-precision registers is split into four banks, each of four registers. The form used by an instruction depends on which operands are in the first bank. The general principle behind the rules is that the first bank must be used to hold scalar operands while the other banks are used to hold vector operands. All destination register writes and many source register reads adhere to this principle, but some source register reads can result in scalar access to vector elements or vector accesses to groups of scalars.

A vector operand consists of 2-4 registers from a single bank, with the number of registers being specified by the vector length field of the FPSCR (see Vector length/stride control on page C2-22). The register number in the instruction specifies the register that contains the first element of the vector. Each successive element of the vector is formed by incrementing the register number by the value specified by the vector stride field of the FPSCR. If this causes the register number to overflow the top of the register bank, the register number wraps around to the bottom of the bank, as shown in Figure 5-2.

Figure 5-2 Double-precision register banks

    Bank 0:  d0   d1   d2   d3
    Bank 1:  d4   d5   d6   d7
    Bank 2:  d8   d9   d10  d11
    Bank 3:  d12  d13  d14  d15
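The way a vector operand walks through one of these four-register banks can be sketched in Python. The function name is hypothetical, and the FPSCR vector length and stride are passed as plain integers for illustration.

```python
def double_vector_registers(first_reg, vec_len, stride):
    """Name the registers of a double-precision vector operand.

    first_reg is the register number given in the instruction (0-15).
    """
    bank, index = first_reg >> 2, first_reg & 3
    names = []
    for _ in range(vec_len):
        names.append(f"d{(bank << 2) | index}")
        index += stride
        if index > 3:
            index -= 4  # wrap to the bottom of the same bank
    return names


print(double_vector_registers(6, vec_len=3, stride=1))
```

A vector of length 3 starting at d6 consists of d6, d7, and d4: after d7 the register number wraps to the bottom of bank 1 instead of continuing to d8.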


5.2.2 Operation

The following pages describe each of the three possible forms of the addressing mode:

• Scalar operations on page C5-11

• Mixed vector/scalar operations on page C5-12

• Vector operations on page C5-13.

In each case, the following values are generated:

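vec_len
    The number of individual operations specified by the instruction.

Dd[0] ... Dd[vec_len-1]
    Destination registers of the individual operations.

Dn[0] ... Dn[vec_len-1]
    First source registers of the individual operations.

Dm[0] ... Dm[vec_len-1]
    Second source registers of the individual operations.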

The register numbers specified in the instruction are broken up into bank numbers and indices within the banks as follows:
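A minimal Python sketch of this decomposition follows. It assumes the 4-bit register number splits into a 2-bit bank (upper bits) and a 2-bit index (lower bits), which is consistent with the (d_bank << 2) | d_index expressions used in the sections that follow. The function names are hypothetical, and the form selection uses the bank tests stated in those sections.

```python
def split_double_register(reg_num):
    """Split a double-precision register number into (bank, index)."""
    if not 0 <= reg_num <= 15:
        raise ValueError("double-precision register numbers are 0-15")
    return reg_num >> 2, reg_num & 3


def classify_form(dd, dm):
    """Pick the addressing-mode form from the destination and second source."""
    d_bank, _ = split_double_register(dd)
    m_bank, _ = split_double_register(dm)
    if d_bank == 0:
        return "scalar"
    if m_bank == 0:
        return "mixed vector/scalar"
    return "vector"


print(split_double_register(13))
print(classify_form(dd=4, dm=1))
```

Register d13, for example, is index 1 of bank 3. An instruction with destination d4 and second source d1 selects the mixed form, because only the second source lies in the first bank.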


5.2.3 Scalar operations

If the destination register lies in the first bank of four registers, the instruction specifies a scalar operation:

    if d_bank == 0 then
        vec_len = 1
        Dd[0] = Dd
        Dn[0] = Dn
        Dm[0] = Dm

Notes

Source operands

The source operands are always scalars, regardless of which bank they are in. This allows individual elements of vectors to be used as scalars.


5.2.4 Mixed vector/scalar operations

If the destination register specified in the instruction does not lie in the first bank of four registers, but the second source register does, then the destination register and first source register specify vectors and the second source register specifies a scalar:

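    if d_bank != 0 and m_bank == 0 then
        vec_len = vector length specified by FPSCR
        for i = 0 to vec_len-1
            Dd[i] = (d_bank << 2) | d_index
            Dn[i] = (n_bank << 2) | n_index
            Dm[i] = Dm
            d_index = d_index + (vector stride specified by FPSCR)
            if d_index > 3 then d_index = d_index - 4
            n_index = n_index + (vector stride specified by FPSCR)
            if n_index > 3 then n_index = n_index - 4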

Notes

First source operand

The first operand is always a vector, regardless of which bank it is in. This allows a set of consecutive registers in the first bank to be treated as a vector.

Vector wrap-around

A vector operand must not wrap around so that it re-uses its first element. Otherwise, the results of the instruction are UNPREDICTABLE. When the FPSCR specifies a vector stride of 1, this implies that the vector length must be at most 4. When the FPSCR specifies a vector stride of 2, it implies that the vector length must be at most 2.

Operand overlap

If two operands overlap, they must be identical both in terms of which registers are accessed and the order in which they are accessed. Otherwise, the results of the instruction are UNPREDICTABLE. This implies that:

• If the set of register numbers generated in Dd[i] overlaps the set of register numbers generated in Dn[i], then Dd and Dn must be identical.

• If the set of register numbers generated in Dn[i] includes Dm, then the vector length must be 1.

It is impossible for the set of register numbers generated in Dd[i] to include Dm, because they lie in different banks.


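5.2.5 Vector operations

If neither the destination register nor the second source register lies in the first bank of four registers, then all three register operands specify vectors:

    if d_bank != 0 and m_bank != 0 then
        vec_len = vector length specified by FPSCR
        for i = 0 to vec_len-1
            Dd[i] = (d_bank << 2) | d_index
            Dn[i] = (n_bank << 2) | n_index
            Dm[i] = (m_bank << 2) | m_index
            d_index = d_index + (vector stride specified by FPSCR)
            if d_index > 3 then d_index = d_index - 4
            n_index = n_index + (vector stride specified by FPSCR)
            if n_index > 3 then n_index = n_index - 4
            m_index = m_index + (vector stride specified by FPSCR)
            if m_index > 3 then m_index = m_index - 4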

Notes

Vector wrap-around

A vector operand must not wrap around so that it re-uses its first element. Otherwise, the results of the instruction are UNPREDICTABLE. When the FPSCR specifies a vector stride of 1, this implies that the vector length must be at most 4. When the FPSCR specifies a vector stride of 2, it implies that the vector length must be at most 2.

Operand overlap

If two operands overlap, they must be identical both in terms of which registers are accessed and the order in which they are accessed. Otherwise, the results of the instruction are UNPREDICTABLE. This implies that:

• If the set of register numbers generated in Dd[i] overlaps the set of register numbers generated in Dn[i], then Dd and Dn must be identical.

• If the set of register numbers generated in Dd[i] overlaps the set of register numbers generated in Dm[i], then Dd and Dm must be identical.

• If the set of register numbers generated in Dn[i] overlaps the set of register numbers generated in Dm[i], then Dn and Dm must be identical.


5.3 Addressing Mode 3 - Single-precision vectors (monadic)

When the vector length indicated by the FPSCR is greater than 1, the single-precision one-operand instructions FABSS, FCPYS, FNEGS, and FSQRTS can specify three different types of behavior:

• An operation on a scalar value, yielding a scalar:

Op(ScalarB) → ScalarD

When this case is selected (see Scalar-to-scalar operations on page C5-16), it causes just one operation to be performed, overriding the vector length specified in the FPSCR. This allows scalar operations and vector operations to be mixed without the need to reprogram the FPSCR between them.

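• An operation on a scalar value, whose result is written to each of the N elements of a vector, where N is the vector length specified in the FPSCR:

Op(ScalarB) → VectorD[0]

Op(ScalarB) → VectorD[1]

...

Op(ScalarB) → VectorD[N-1]

This can be abbreviated to:

Op(ScalarB) → VectorD

• A set of N operations, where N is the vector length specified in the FPSCR, with both the operand and the destination scanning through vectors:

Op(VectorB[0]) → VectorD[0]

Op(VectorB[1]) → VectorD[1]

...

Op(VectorB[N-1]) → VectorD[N-1]

This can be abbreviated to:

Op(VectorB) → VectorD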

To allow these various forms to be specified, the set of 32 single-precision registers is split into four banks, each of eight registers. For a description of this, see Register banks on page C5-3.


5.3.1 Operation

The following pages describe each of the three possible forms of the addressing mode:

• Scalar-to-scalar operations on page C5-16

• Scalar-to-vector operations on page C5-17

• Vector-to-vector operations on page C5-18.

In each case, the following values are generated:

vec_len
    The number of individual operations specified by the instruction.

Sd[0] ... Sd[vec_len-1]
    Destination registers of the individual operations.

Sm[0] ... Sm[vec_len-1]
    Source registers of the individual operations.

In all cases, the registers specified by the instruction are determined by concatenating the Fd and Fm fields of the instruction with the D and M bits respectively:

    d_num = (Fd << 1) | D
    m_num = (Fm << 1) | M

These register numbers are then broken up into bank numbers and indices within the banks as follows:

    d_bank = d_num[4:3]
    d_index = d_num[2:0]
    m_bank = m_num[4:3]
    m_index = m_num[2:0]
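The field concatenation and the bank split can be sketched in Python as follows. The function names are hypothetical. Taking the index from the low three bits of the register number is an assumption, chosen to be consistent with the (d_bank << 3) | d_index expressions used in the operation descriptions.

```python
def single_register_number(field, low_bit):
    """Combine a 4-bit Fd or Fm field with its D or M bit."""
    if not 0 <= field <= 15 or low_bit not in (0, 1):
        raise ValueError("field must be 0-15 and low_bit must be 0 or 1")
    return (field << 1) | low_bit


def split_single_register(num):
    """Split a 5-bit register number into (bank, index)."""
    return (num >> 3) & 3, num & 7


d_num = single_register_number(0b1011, 1)
print(d_num, split_single_register(d_num))
```

Here Fd = 0b1011 with D = 1 selects s23, which is index 7 of bank 2.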


5.3.2 Scalar-to-scalar operations

If the destination register lies in the first bank of eight registers, the instruction specifies a scalar operation:

    if d_bank == 0 then
        vec_len = 1
        Sd[0] = d_num
        Sm[0] = m_num

Notes

Source operands

The source operand is always a scalar, regardless of which bank it lies in. This allows individual elements of vectors to be used as scalars.


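5.3.3 Scalar-to-vector operations

If the destination register specified in the instruction does not lie in the first bank of eight registers, but the source register does, then the destination register specifies a vector and the source register specifies a scalar:

    if d_bank != 0 and m_bank == 0 then
        vec_len = vector length specified by FPSCR
        for i = 0 to vec_len-1
            Sd[i] = (d_bank << 3) | d_index
            Sm[i] = m_num
            d_index = d_index + (vector stride specified by FPSCR)
            if d_index > 7 then d_index = d_index - 8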

Notes

Vector wrap-around

A vector operand must not wrap around so that it re-uses its first element. Otherwise, the results of the instruction are UNPREDICTABLE. When the FPSCR specifies a vector stride of 1, this is not a restriction, because the vector length is at most 8. When the FPSCR specifies a vector stride of 2, it implies that the vector length must be at most 4.

Operand overlap

If the source and destination overlap, they must be identical both in terms of which registers are accessed and the order in which they are accessed. This implies that if the set of register numbers generated in Sn[i] includes m_num, the vector length must be 1.


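5.3.4 Vector-to-vector operations

If neither the destination register nor the source register lies in the first bank of eight registers, then both register operands specify vectors:

    if d_bank != 0 and m_bank != 0 then
        vec_len = vector length specified by FPSCR
        for i = 0 to vec_len-1
            Sd[i] = (d_bank << 3) | d_index
            Sm[i] = (m_bank << 3) | m_index
            d_index = d_index + (vector stride specified by FPSCR)
            if d_index > 7 then d_index = d_index - 8
            m_index = m_index + (vector stride specified by FPSCR)
            if m_index > 7 then m_index = m_index - 8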

Notes

Vector wrap-around

A vector operand must not wrap around so that it re-uses its first element. Otherwise, the results of the instruction are UNPREDICTABLE. When the FPSCR specifies a vector stride of 1, this is not a restriction, since the vector length is at most 8. When the FPSCR specifies a vector stride of 2, it implies that the vector length must be at most 4.

Operand overlap

If the source and destination overlap, they must be identical both in terms of which registers are accessed and the order in which they are accessed. Otherwise, the results of the instruction are UNPREDICTABLE. This implies that if the set of register numbers generated in Sd[i] overlaps the set of register numbers generated in Sm[i], d_num and m_num must be identical.


5.4 Addressing Mode 4 - Double-precision vectors (monadic)

When the vector length indicated by the FPSCR is greater than 1, the double-precision one-operand instructions FABSD, FCPYD, FNEGD, and FSQRTD can specify three different types of behavior:

• An operation on a scalar value, yielding a scalar:

Op(ScalarB) → ScalarD

When this case is selected (see Scalar-to-scalar operations on page C5-21), it causes just one operation to be performed, overriding the vector length specified in the FPSCR. This allows scalar operations and vector operations to be mixed without the need to reprogram the FPSCR between them.

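• An operation on a scalar value, whose result is written to each of the N elements of a vector, where N is the vector length specified in the FPSCR:

Op(ScalarB) → VectorD[0]

Op(ScalarB) → VectorD[1]

...

Op(ScalarB) → VectorD[N-1]

This can be abbreviated to:

Op(ScalarB) → VectorD

• A set of N operations, where N is the vector length specified in the FPSCR, with both the operand and the destination scanning through vectors:

Op(VectorB[0]) → VectorD[0]

Op(VectorB[1]) → VectorD[1]

...

Op(VectorB[N-1]) → VectorD[N-1]

This can be abbreviated to:

Op(VectorB) → VectorD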

To allow these various forms to be specified, the set of 16 double-precision registers is split into four banks, each of four registers. For a description of this, see Register banks on page C5-9.


Title	ARM Architecture Reference Manual - P26
Institution	ARM Education (ARM Limited)
Subject	Computer Architecture
Type	Manual
Year of publication	2000
City	Cambridge
