6
Fast Fourier Transform

Digital Signal Processing: Laboratory Experiments Using C and the TMS320C31 DSK
Rulph Chassaing. Copyright © 1999 John Wiley & Sons, Inc. Print ISBN 0-471-29362-8, Electronic ISBN 0-471-20065-4

• The fast Fourier transform using radix-2 and radix-4
• Decimation or decomposition in frequency and in time

6.2 DEVELOPMENT OF THE FFT ALGORITHM WITH RADIX-2
The FFT considerably reduces the computational requirements of the discrete Fourier transform (DFT). The DFT of a discrete-time signal x(nT) is

$$X(k) = \sum_{n=0}^{N-1} x(n)\,W^{nk}, \qquad k = 0, 1, \ldots, N-1 \tag{6.1}$$

where W = e^{–j2π/N} is the twiddle constant and the sampling period T is implied in x(n).
Direct evaluation of the N-point DFT requires on the order of N^2 complex multiplications. Hence, the computational requirements of the DFT can be very intensive, especially for large values of N.
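As a minimal sketch of why the direct DFT is costly, the following Python function evaluates the DFT sum term by term, assuming the usual twiddle constant W = e^{-j2π/N}. The function name `dft` and the choice of test input (the rectangular sequence used in Exercise 6.1) are illustrative and not part of the original text.

```python
import cmath

def dft(x):
    """Direct N-point DFT: X(k) = sum over n of x(n) * W**(n*k)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)   # twiddle constant W = e^{-j2*pi/N}
    # N outputs, each a sum of N products: N*N complex multiplications in all
    return [sum(x[n] * W ** (n * k) for n in range(N)) for k in range(N)]

x = [1, 1, 1, 1, 0, 0, 0, 0]            # rectangular input, N = 8
X = dft(x)
for k, value in enumerate(X):
    print(k, round(value.real, 3), round(value.imag, 3))
```

The nested summation makes the N-by-N cost visible: doubling N quadruples the number of products.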
The FFT algorithm takes advantage of the periodicity and symmetry of the twiddle constants to reduce the computational requirements of the FFT. From the periodicity of W, W^(k+N) = W^k, and from its symmetry, W^(k+N/2) = –W^k.
For a radix-2 (base 2), the FFT decomposes an N-point DFT into two (N/2)-point or smaller DFT’s. Each (N/2)-point DFT is further decomposed into two (N/4)-point DFT’s, and so on. The last decomposition consists of (N/2) two-point DFT’s. The smallest transform is determined by the radix of the FFT. For a radix-2 FFT, N must be a power of two, and the smallest transform or the last decomposition is the two-point DFT. For a radix-4, the last decomposition is a four-point DFT.
6.3 DECIMATION-IN-FREQUENCY FFT ALGORITHM WITH RADIX-2
In the resulting decomposition, shown in Figure 6.2 for N = 8, the outputs X(k) are even in the upper half and they are odd in the lower half. The decomposition process can now be repeated such that each of the (N/2)-point DFT’s is further decomposed into two (N/4)-point DFT’s, as shown in Figure 6.3, again using N = 8 to illustrate.
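The split into even- and odd-indexed outputs can be written compactly. The following is the standard radix-2 DIF identity in the notation used here, offered as a sketch. The subscripted symbol W_{N/2} = W^2 (the twiddle constant of an (N/2)-point DFT) is introduced only for this illustration, and no equation numbers from the text are implied.

```latex
\begin{align*}
X(k) &= \sum_{n=0}^{(N/2)-1} \left[ x(n) + (-1)^{k}\, x\!\left(n + \tfrac{N}{2}\right) \right] W^{nk},
  & k &= 0, 1, \ldots, N-1 \\
X(2k) &= \sum_{n=0}^{(N/2)-1} \left[ x(n) + x\!\left(n + \tfrac{N}{2}\right) \right] W_{N/2}^{nk},
  & k &= 0, 1, \ldots, \tfrac{N}{2}-1 \\
X(2k+1) &= \sum_{n=0}^{(N/2)-1} \left\{ \left[ x(n) - x\!\left(n + \tfrac{N}{2}\right) \right] W^{n} \right\} W_{N/2}^{nk},
  & k &= 0, 1, \ldots, \tfrac{N}{2}-1
\end{align*}
```

The first line uses W^{N/2} = –1. The sums x(n) + x(n + N/2) feed the (N/2)-point DFT that produces the even outputs, and the twiddled differences feed the one that produces the odd outputs.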
The upper section of the output sequence in Figure 6.2 yields the sequence X(0) and X(4) in Figure 6.3, ordered as the even values. X(2) and X(6) from Figure 6.3 represent the odd values. Similarly, the lower section of the output sequence in Figure 6.2 yields X(1) and X(5), ordered as the even values, and X(3) and X(7) as the odd values. This scrambling is due to the decomposition process. The final order of the output sequence X(0), X(4), ... in Figure 6.3 is shown to be scrambled. The output needs to be resequenced or reordered. A special instruction using indirect addressing with bit-reversal, introduced in Chapter 2 in conjunction with circular buffering, is available on the TMS320C3x to reorder such a sequence. The output sequence X(k) represents the DFT of the time sequence x(n).
This is the last decomposition, since we have now a set of (N/2) two-point DFT’s, the lowest decomposition for a radix-2. For the two-point DFT, X(k) in (6.1) can be written as
FIGURE 6.2 Decomposition of N-point DFT into two (N/2)-point DFT’s, for N = 8.
FIGURE 6.3 Decomposition of two (N/2)-point DFT’s into four (N/4)-point DFT’s, for N = 8.
$$X(k) = \sum_{n=0}^{1} x(n)\,W^{nk}, \qquad k = 0, 1 \tag{6.19}$$
or
$$X(0) = x(0)W^{0} + x(1)W^{0} = x(0) + x(1) \tag{6.20}$$
$$X(1) = x(0)W^{0} + x(1)W^{1} = x(0) - x(1) \tag{6.21}$$
since W^1 = e^{–j2π/2} = –1. Equations (6.20) and (6.21) can be represented by the flow graph in Figure 6.4, usually referred to as a butterfly. The final flow graph of an eight-point FFT algorithm is shown in Figure 6.5. This algorithm is referred to as decimation-in-frequency (DIF) because the output sequence X(k) is decomposed (decimated) into smaller subsequences, and this process continues through M stages or iterations, where N = 2^M. The output X(k) is complex with both real and imaginary components, and the FFT algorithm can accommodate either complex or real input values.
The FFT is not an approximation of the DFT. It yields the same result as the DFT with fewer computations required. This reduction becomes more and more important with higher-order FFT.
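The claim that the FFT reproduces the DFT exactly (up to rounding) can be checked numerically. The sketch below is one possible recursive formulation of the radix-2 DIF decomposition, not the book's TMS320C3x program: each call performs one stage of butterflies and hands the sums and the twiddled differences to two half-length transforms. The names `fft_dif` and `dft` and the test vector are assumptions made for illustration.

```python
import cmath

def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    half = N // 2
    W = cmath.exp(-2j * cmath.pi / N)
    # One stage of butterflies: sums feed the even-indexed outputs,
    # twiddled differences feed the odd-indexed outputs.
    upper = [x[n] + x[n + half] for n in range(half)]
    lower = [(x[n] - x[n + half]) * W ** n for n in range(half)]
    X = [0j] * N
    X[0::2] = fft_dif(upper)   # X(0), X(2), X(4), ...
    X[1::2] = fft_dif(lower)   # X(1), X(3), X(5), ...
    return X

def dft(x):
    """Direct DFT used only as a reference for comparison."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(x[n] * W ** (n * k) for n in range(N)) for k in range(N)]

x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 3.0]   # arbitrary real test input
max_diff = max(abs(a - b) for a, b in zip(fft_dif(x), dft(x)))
print("largest difference between FFT and DFT:", max_diff)
```

Because the recursion writes the two half-length results into interleaved positions, the output here is already in natural order; an in-place version leaves it scrambled, as described in the text.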
There are other FFT structures that have been used to illustrate the FFT. An alternative flow graph to the one shown in Figure 6.5 can be obtained with ordered output and scrambled input.
An eight-point FFT is illustrated through an exercise as well as through a programming example. We will see that flow graphs for higher-order FFT (larger N) can readily be obtained.
Exercise 6.1 Eight-Point FFT Using Decimation-in-Frequency
Let the input x(n) represent a rectangular waveform, or x(0) = x(1) = x(2) = x(3) = 1, and x(4) = x(5) = x(6) = x(7) = 0. The eight-point FFT flow graph in Figure 6.5 can be used to find the output sequence X(k), k = 0, 1, ..., 7. With N = 8, four twiddle constants need to be calculated, or
W^0 = 1
W^1 = e^{–j2π/8} = 0.707 – j0.707
W^2 = e^{–j4π/8} = –j
W^3 = e^{–j6π/8} = –0.707 – j0.707
FIGURE 6.4 Two-point FFT butterfly.
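1. At stage 1:
x(0) + x(4) = 1 → x′(0)
x(1) + x(5) = 1 → x′(1)
x(2) + x(6) = 1 → x′(2)
x(3) + x(7) = 1 → x′(3)
[x(0) – x(4)]W^0 = 1 → x′(4)
[x(1) – x(5)]W^1 = 0.707 – j0.707 → x′(5)
[x(2) – x(6)]W^2 = –j → x′(6)
[x(3) – x(7)]W^3 = –0.707 – j0.707 → x′(7)
where x′(0), x′(1), ..., x′(7) represent the intermediate output sequence after the first iteration that becomes the input to the second stage.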
2. At stage 2:
FIGURE 6.5 Eight-point FFT flow graph using decimation-in-frequency.
x′(0) + x′(2) = 2 → x″(0)
x′(1) + x′(3) = 2 → x″(1)
[x′(0) – x′(2)]W^0 = 0 → x″(2)
[x′(1) – x′(3)]W^2 = 0 → x″(3)
x′(4) + x′(6) = 1 – j → x″(4)
x′(5) + x′(7) = (0.707 – j0.707) + (–0.707 – j0.707) = –j1.41 → x″(5)
[x′(4) – x′(6)]W^0 = 1 + j → x″(6)
[x′(5) – x′(7)]W^2 = –j1.41 → x″(7)
The resulting intermediate, second-stage output sequence x″(0), x″(1), ..., x″(7) becomes the input sequence to the third stage.
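The stage-by-stage values of this exercise can be traced in software. The following sketch applies the in-place DIF butterflies of Figure 6.5 and records the data after each stage, so the third entry holds the final output in scrambled order X(0), X(4), X(2), X(6), X(1), X(5), X(3), X(7). The function name `dif_stages` and the loop layout are illustrative assumptions, not the book's implementation.

```python
import cmath

def dif_stages(x):
    """Return the data after each in-place radix-2 DIF stage (last entry is scrambled X)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    data = [complex(v) for v in x]
    stages = []
    span = N // 2                       # distance between the two butterfly inputs
    while span >= 1:
        step = N // (2 * span)          # twiddle exponent increment at this stage
        for start in range(0, N, 2 * span):
            for n in range(span):
                a = data[start + n]
                b = data[start + n + span]
                data[start + n] = a + b                          # upper output
                data[start + n + span] = (a - b) * W ** (n * step)  # lower output
        stages.append(list(data))
        span //= 2
    return stages

s1, s2, s3 = dif_stages([1, 1, 1, 1, 0, 0, 0, 0])
for name, stage in (("stage 1", s1), ("stage 2", s2), ("stage 3", s3)):
    print(name, [complex(round(v.real, 3), round(v.imag, 3)) for v in stage])
```

The first two printed rows correspond to the x′ and x″ sequences listed above; the third row is the output before reordering.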
Exercise 6.2 Sixteen-Point FFT
Let x(0) = x(1) = ... = x(7) = 1, and x(8) = x(9) = ... = x(15) = 0, which represents a rectangular input sequence. The output sequence can be found using the 16-point flow graph shown in Figure 6.6. The intermediate output results after each stage are found in a similar manner to the previous example. Eight twiddle constants W^0, W^1, ..., W^7 need to be calculated for N = 16.
Verify the scrambled output sequence X’s as shown in Figure 6.6. Reorder this output sequence and take its magnitude. Verify the plot in Figure 6.7, which represents a sinc function. The output X(8) represents the magnitude at the Nyquist frequency. These results can be verified with an FFT function available with MATLAB, described in Appendix B.
6.4 DECIMATION-IN-TIME FFT ALGORITHM WITH RADIX-2
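Whereas the decimation-in-frequency (DIF) process decomposes an output sequence into smaller subsequences, the decimation-in-time (DIT) is another process that decomposes the input sequence into smaller subsequences. Let the input sequence be decomposed into an even sequence and an odd sequence, or
x(0), x(2), x(4), ..., x(2n)
and
x(1), x(3), x(5), ..., x(2n + 1)
Applying (6.1) to these sequences gives
$$X(k) = \sum_{n=0}^{(N/2)-1} x(2n)\,W^{2nk} + W^{k}\sum_{n=0}^{(N/2)-1} x(2n+1)\,W^{2nk}$$
which represents two (N/2)-point DFT’s.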
FIGURE 6.8 Decomposition of eight-point DFT into two four-point DFT’s using DIT.
Each (N/2)-point DFT can be decomposed further. For N = 8, each four-point DFT is decomposed into two two-point DFT’s, as shown in Figure 6.9. Since the last decomposition is (N/2) two-point DFT’s, this is as far as this process goes.
Figure 6.10 shows the final flow graph for an eight-point FFT using a decimation-in-time process. The input sequence is shown to be scrambled in Figure 6.10, in the same manner as the output sequence X(k) was scrambled during the decimation-in-frequency process. With the input sequence x(n) scrambled, the resulting output sequence X(k) becomes properly ordered. Identical results are obtained with an FFT using either the decimation-in-frequency (DIF) or the decimation-in-time (DIT) process.
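A short sketch may help relate the even/odd splitting of the input to the scrambled input order. The recursive function below is one way to express the radix-2 DIT decomposition, with the twiddle constant applied to the odd branch only; the helper `dit_input_order` simply repeats the even/odd split on the index list to show where each input ends up. Both names are hypothetical and chosen for this illustration.

```python
import cmath

def dit_input_order(indices):
    """Order in which inputs reach the two-point DFT's after repeated even/odd splits."""
    if len(indices) <= 1:
        return list(indices)
    return dit_input_order(indices[0::2]) + dit_input_order(indices[1::2])

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    W = cmath.exp(-2j * cmath.pi / N)
    even = fft_dit(x[0::2])             # (N/2)-point DFT of x(0), x(2), ...
    odd = fft_dit(x[1::2])              # (N/2)-point DFT of x(1), x(3), ...
    half = N // 2
    X = [0j] * N
    for k in range(half):
        t = W ** k * odd[k]             # twiddle multiplies the second term only
        X[k] = even[k] + t
        X[k + half] = even[k] - t
    return X

order = dit_input_order(list(range(8)))
X = fft_dit([1, 1, 1, 1, 0, 0, 0, 0])
print(order)
print([complex(round(v.real, 3), round(v.imag, 3)) for v in X])
```

For N = 8 the index list comes out as 0, 4, 2, 6, 1, 5, 3, 7, which is the input order shown in Figure 6.10, while the output is in natural order.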
An alternative DIT flow graph to the one shown in Figure 6.10, with ordered input and scrambled output, also can be obtained.
The following exercise shows that the same results are obtained for an eight-point FFT with the DIT process as in Exercise 6.1 with the DIF process.
Exercise 6.3 Eight-Point FFT Using Decimation-in-Time
Given the input sequence x(n) representing a rectangular waveform as in Exercise 6.1, the output sequence X(k), using the DIT flow graph in Figure 6.10, is the same as in Exercise 6.1. The twiddle constants are the same as in Exercise 6.1. Note that the twiddle constant W is multiplied with the second term only (not with the first).
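FIGURE 6.9 Decomposition of two four-point DFT’s into four two-point DFT’s using DIT.
FIGURE 6.10 Eight-point FFT flow graph using decimation-in-time.
1. At stage 1:
x(0) + W^0 x(4) = 1 + 0 = 1 → x′(0)
x(0) – W^0 x(4) = 1 – 0 = 1 → x′(4)
x(2) + W^0 x(6) = 1 + 0 = 1 → x′(2)
x(2) – W^0 x(6) = 1 – 0 = 1 → x′(6)
x(1) + W^0 x(5) = 1 + 0 = 1 → x′(1)
x(1) – W^0 x(5) = 1 – 0 = 1 → x′(5)
x(3) + W^0 x(7) = 1 + 0 = 1 → x′(3)
x(3) – W^0 x(7) = 1 – 0 = 1 → x′(7)
where the sequence x′ represents the intermediate output after the first iteration and becomes the input to the subsequent stage.
2. At stage 2:
x′(0) + W^0 x′(2) = 1 + 1 = 2 → x″(0)
x′(0) – W^0 x′(2) = 1 – 1 = 0 → x″(2)
x′(4) + W^2 x′(6) = 1 + (–j) = 1 – j → x″(4)
x′(4) – W^2 x′(6) = 1 – (–j) = 1 + j → x″(6)
x′(1) + W^0 x′(3) = 1 + 1 = 2 → x″(1)
x′(1) – W^0 x′(3) = 1 – 1 = 0 → x″(3)
x′(5) + W^2 x′(7) = 1 + (–j) = 1 – j → x″(5)
x′(5) – W^2 x′(7) = 1 – (–j) = 1 + j → x″(7)
where the sequence x″ represents the intermediate output after the second iteration and becomes the input to the final stage.
3. At stage 3: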
X(0) = x″(0) + W^0 x″(1) = 4
X(1) = x″(4) + W^1 x″(5) = 1 – j2.414
X(2) = x″(2) + W^2 x″(3) = 0
X(3) = x″(6) + W^3 x″(7) = 1 – j0.414
X(4) = x″(0) – W^0 x″(1) = 0
X(5) = x″(4) – W^1 x″(5) = 1 + j0.414
X(6) = x″(2) – W^2 x″(3) = 0
X(7) = x″(6) – W^3 x″(7) = 1 + j2.414
which is the same output sequence as found in Exercise 6.1.
6.5 BIT REVERSAL FOR UNSCRAMBLING
A bit-reversal procedure allows a scrambled sequence to be reordered. To illustrate this bit-swapping process, let N = 8, represented by three bits. The first and third bits are swapped. For example, (100)b is replaced by (001)b. As such, (100)b specifying the address of X(4) is replaced by or swapped with (001)b specifying the address of X(1). Similarly, (110)b is replaced/swapped with (011)b, or the addresses of X(6) and X(3) are swapped. In this fashion, the output sequence in Figure 6.5 with the DIF, or the input sequence in Figure 6.10 with the DIT, can be reordered.
This bit-reversal procedure can be applied for larger values of N. For example, for N = 64, represented by six bits, the first and sixth bits, the second and fifth bits, and the third and fourth bits are swapped.
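The bit-swapping procedure can be sketched in a few lines of Python. The helper `bit_reverse` mirrors the lowest bits of an index, and `unscramble` swaps each pair of positions once; both names are illustrative assumptions rather than routines from the text.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)   # shift result left, append the current low bit
        i >>= 1
    return r

def unscramble(seq):
    """Return a copy of seq with the element at index i swapped with bit_reverse(i)."""
    N = len(seq)
    bits = N.bit_length() - 1    # N = 8 -> 3 bits, N = 64 -> 6 bits
    out = list(seq)
    for i in range(N):
        j = bit_reverse(i, bits)
        if i < j:                # swap each pair only once
            out[i], out[j] = out[j], out[i]
    return out

scrambled = ["X(0)", "X(4)", "X(2)", "X(6)", "X(1)", "X(5)", "X(3)", "X(7)"]
ordered = unscramble(scrambled)
print(ordered)
```

Indices that read the same in both directions, such as (000)b, (010)b, (101)b, and (111)b for N = 8, stay where they are.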
Bit Reversal with Indirect Addressing
Swapping memory locations is not necessary if the bit-reversed addressing mode available on the TMS320C3x is used. Let N = 8 to illustrate this indirect addressing mode with reversed carry. Suppose that we are given a set of data x(0), x(1), x(2), ..., x(7) that we wish to resequence or scramble to obtain x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7), as we would do in an FFT using the decimation-in-time (DIT) flow graph in Figure 6.10.
1. Set the index register IR0 to one-half the length of the FFT, or IR0 = N/2 = 4, assuming a real input sequence. For a complex input sequence, IR0 is set to N to accommodate the real and imaginary components.
2. Let an auxiliary register such as AR1 contain a base address such as zero or (0000)b for illustration purposes.
3. The instruction
NOP *AR1++(IR0)B
is an indirect mode of addressing instruction for bit reversal, introduced in Chapter 2. On execution, the address 0 is selected, then AR1 is incremented to point at memory address 4, which is the base address of zero offset by IR0.
4. On the second execution of this instruction, memory address 4 is selected, then AR1 is incremented to point at the address 2. We arrive at this address by adding the current address to N/2, or (0100)b + (0100)b = (0010)b with reversed carry. That is, the carry is to the right, or in the reversed direction, so that the binary addition of 1 and 1 is 0, with a carry of 1 to the right. This is caused by the B in the instruction.
5. On the third execution, memory address 2 is selected, then AR1 is incremented to point to memory address 6, and after the fourth execution, AR1 points to memory address 1, because (0110)b + (0100)b = (0001)b with reversed carry, and so on.
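The five steps above can be modeled in software. The sketch below is only a Python model of addition with the carry propagating to the right; it is not a description of the TMS320C3x hardware. The variables `ar1` and `ir0` stand in for the registers AR1 and IR0, and the function names are hypothetical.

```python
def reverse_carry_add(addr, inc, bits):
    """Add inc to addr with the carry propagating toward the least significant bit."""
    result = 0
    carry = 0
    for pos in range(bits - 1, -1, -1):      # start at the most significant bit
        a = (addr >> pos) & 1
        b = (inc >> pos) & 1
        total = a + b + carry
        result |= (total & 1) << pos
        carry = total >> 1                   # carry moves one bit to the right
    return result

def bit_reversed_sequence(n_points):
    """Addresses selected by repeated post-increment with reversed carry."""
    bits = n_points.bit_length() - 1
    ir0 = n_points // 2                      # plays the role of IR0 = N/2
    ar1 = 0                                  # plays the role of AR1, base address 0
    seq = []
    for _ in range(n_points):
        seq.append(ar1)                      # address selected on this execution
        ar1 = reverse_carry_add(ar1, ir0, bits)
    return seq

print(bit_reversed_sequence(8))
```

For N = 8 the model visits the addresses 0, 4, 2, 6, 1, 5, 3, 7, matching the sequence traced in steps 3 through 5.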
We have used this indirect mode of addressing with reversed carry on the input sequence. We can use a similar procedure on the output sequence, which can be performed by loading the auxiliary register AR1 with the last or highest address, then postdecrementing, or
NOP *AR1--(IR0)B
This procedure can be used for higher-order FFT lengths. For a complex FFT, the real components of the input sequence can be arranged in even-numbered addresses and the imaginary components in odd-numbered addresses. The index (offset) register IR0 = N (instead of N/2). The programming FFT examples included later incorporate the bit reversal procedure for swapping addresses.
6.6 DEVELOPMENT OF THE FFT ALGORITHM WITH RADIX-4
A radix-4 (base 4) algorithm can increase the execution speed of the FFT. FFT programs on higher radices and split radices have been developed. We will use a decimation-in-frequency (DIF) decomposition process to introduce the development of the radix-4 FFT. The last or lowest decomposition of a radix-4 algorithm consists of four inputs and four outputs. The order or length of the FFT is 4^M, where M is the number of stages. For a 16-point FFT, there are only two stages or iterations as compared with four stages with the radix-2 algorithm.
The DFT in (6.1) is decomposed into four summations, instead of two, as follows:
$$X(k) = \sum_{n=0}^{(N/4)-1} x(n)W^{nk} + \sum_{n=N/4}^{(N/2)-1} x(n)W^{nk} + \sum_{n=N/2}^{(3N/4)-1} x(n)W^{nk} + \sum_{n=3N/4}^{N-1} x(n)W^{nk} \tag{6.30}$$
Let n = n + N/4, n = n + N/2, n = n + 3N/4 in the second, third, and fourth summations, respectively. Then (6.30) can be written as a single summation over n = 0, 1, ..., (N/4) – 1, which in turn separates into four output subsequences X(4k), X(4k + 1), X(4k + 2), and X(4k + 3) for k = 0, 1, ..., (N/4) – 1. Equations (6.33) through (6.36) represent a decomposition process yielding four four-point DFT’s. The flow graph for a 16-point radix-4 decimation-in-frequency FFT is shown in Figure 6.11. Note the four-point butterfly in the flow graph. The ±j and –1 are not shown in Figure 6.11. The results shown in the flow graph are for the following exercise.
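For reference, the radix-4 DIF split described above can be written out as follows. This is the standard form of the decomposition, reconstructed in the notation used here as a sketch; the symbol W_{N/4} = W^4 is introduced only for this illustration, and the assumption that the last four lines correspond to (6.33) through (6.36) is not confirmed by the surviving text.

```latex
\begin{align*}
X(k) &= \sum_{n=0}^{(N/4)-1} \Big[ x(n) + (-j)^{k}\, x\!\left(n + \tfrac{N}{4}\right)
        + (-1)^{k}\, x\!\left(n + \tfrac{N}{2}\right)
        + (j)^{k}\, x\!\left(n + \tfrac{3N}{4}\right) \Big] W^{nk} \\
X(4k)   &= \sum_{n=0}^{(N/4)-1} \Big[ x(n) + x\!\left(n + \tfrac{N}{4}\right) + x\!\left(n + \tfrac{N}{2}\right) + x\!\left(n + \tfrac{3N}{4}\right) \Big] W_{N/4}^{nk} \\
X(4k+1) &= \sum_{n=0}^{(N/4)-1} \Big[ x(n) - j\,x\!\left(n + \tfrac{N}{4}\right) - x\!\left(n + \tfrac{N}{2}\right) + j\,x\!\left(n + \tfrac{3N}{4}\right) \Big] W^{n}\, W_{N/4}^{nk} \\
X(4k+2) &= \sum_{n=0}^{(N/4)-1} \Big[ x(n) - x\!\left(n + \tfrac{N}{4}\right) + x\!\left(n + \tfrac{N}{2}\right) - x\!\left(n + \tfrac{3N}{4}\right) \Big] W^{2n}\, W_{N/4}^{nk} \\
X(4k+3) &= \sum_{n=0}^{(N/4)-1} \Big[ x(n) + j\,x\!\left(n + \tfrac{N}{4}\right) - x\!\left(n + \tfrac{N}{2}\right) - j\,x\!\left(n + \tfrac{3N}{4}\right) \Big] W^{3n}\, W_{N/4}^{nk}
\end{align*}
```

The first line uses W^{kN/4} = (–j)^k, W^{kN/2} = (–1)^k, and W^{3kN/4} = (j)^k. In the last four lines k runs from 0 to (N/4) – 1, so each line is an (N/4)-point DFT of a twiddled combination of four inputs.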
Exercise 6.4 16-Point FFT With Radix-4
Given the input sequence x(n) as in Exercise 6.2, representing a rectangular sequence x(0) = x(1) = ... = x(7) = 1, and x(8) = x(9) = ... = x(15) = 0, we will find the output sequence for a 16-point FFT with radix-4 using the flow graph in Figure 6.11. The twiddle constants are shown in Table 6.1.
TABLE 6.1 Twiddle constants for 16-point FFT with
The intermediate output sequence after stage 1 is shown in Figure 6.11. For example, after stage 1, the four intermediate values that feed the outputs X(3) and X(15) are 1 + j, 1.307 – j0.541, –j1.414, and –1.307 – j0.541. For example, after stage 2:
X(3) = (1 + j) + (1.307 – j0.541) + (–j1.414) + (–1.307 – j0.541) = 1 – j1.496
and
X(15) = (1 + j)(1) + (1.307 – j0.541)(j) + (–j1.414)(–1) + (–1.307 – j0.541)(–j) = 1 + j5.028
The output sequence X(0), X(1), ..., X(15) is identical to the output sequence obtained with the 16-point FFT with the radix-2 in Figure 6.6. These results also can be verified with MATLAB, described in Appendix B.
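The two sample outputs of this exercise can be checked with a small radix-4 sketch. The recursive function below is one possible formulation of the radix-4 DIF decomposition, assuming the standard four-point butterfly with multipliers 1, ±j, and –1; it is not the book's TMS320C3x program, and the name `fft_radix4_dif` is hypothetical.

```python
import cmath

def fft_radix4_dif(x):
    """Recursive radix-4 decimation-in-frequency FFT; len(x) must be a power of four."""
    N = len(x)
    if N == 1:
        return list(x)
    q = N // 4
    W = cmath.exp(-2j * cmath.pi / N)
    groups = [[], [], [], []]
    for n in range(q):
        a, b, c, d = x[n], x[n + q], x[n + 2 * q], x[n + 3 * q]
        # Four-point butterfly, then a twiddle W**(r*n) on the r-th output
        groups[0].append(a + b + c + d)
        groups[1].append((a - 1j * b - c + 1j * d) * W ** n)
        groups[2].append((a - b + c - d) * W ** (2 * n))
        groups[3].append((a + 1j * b - c - 1j * d) * W ** (3 * n))
    X = [0j] * N
    for r in range(4):
        X[r::4] = fft_radix4_dif(groups[r])   # X(4k + r), k = 0, ..., N/4 - 1
    return X

x = [1] * 8 + [0] * 8                         # rectangular input of Exercise 6.4
X = fft_radix4_dif(x)
print(round(X[3].real, 3), round(X[3].imag, 3))
print(round(X[15].real, 3), round(X[15].imag, 3))
```

With N = 16 the recursion has two levels, matching the two stages of Figure 6.11. The recursive write into interleaved positions leaves the output in natural order, so no digit reversal is needed in this formulation.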
The output sequence is scrambled and needs to be resequenced or reordered. This can be done using a digit reversal procedure, in a similar fashion as a bit reversal in a radix-2 algorithm. The radix-4 (base 4) uses the digits 0, 1, 2, 3. For example, the addresses of X(8) and X(2) need to be swapped because (8)10 in base 10 or decimal is equal to (20)4 in base 4. The two digits, in positions 0 and 1, are reversed to yield (02)4 in base 4, which is also (02)10 in decimal.
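Digit reversal generalizes the bit-reversal idea to other bases. The following sketch, with the hypothetical helper name `digit_reverse`, reverses the base-4 digits of an index and lists the resulting order for a 16-point radix-4 FFT.

```python
def digit_reverse(i, digits, base=4):
    """Reverse the lowest `digits` digits of i when written in the given base."""
    r = 0
    for _ in range(digits):
        r = r * base + i % base   # append the current lowest digit to the result
        i //= base
    return r

# Position p of the scrambled 16-point radix-4 output holds X(digit_reverse(p, 2))
order = [digit_reverse(p, 2) for p in range(16)]
print(order)
print(digit_reverse(8, 2), digit_reverse(2, 2))
```

With `base=2` the same function performs the bit reversal used for the radix-2 algorithm, for example `digit_reverse(4, 3, base=2)` gives 1.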
Although mixed or higher radices can provide further reduction in computation, programming considerations become more complex. As a result, the radix-2 is still the most widely used, followed by the radix-4.
FIGURE 6.11 16-point radix-4 FFT flow graph using decimation-in-frequency.