Lattice Semiconductor Parallel FIR Filter User’s Guide
Table 2. Parallel FIR Filter Parameter Definitions
Functional Description
Tap Array
The Tap Array module stores delayed versions (taps) of the input data. The number of taps of the FIR filter and the data width are user parameters that are fixed at the time of core generation. The array consists of N taps, each W bits wide, organized as a chain of shift registers. All the data registers are reset when the reset_n input is asserted. At every clock edge, each data value shifts into the next sequential register inside the Tap Array, and the first register takes its value from the input data port din.
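The shifting behavior described above can be sketched as a small behavioral model. This is only an illustration, not the core's implementation: the class name TapArray and its methods are hypothetical, and values are treated as raw unsigned W-bit patterns.

```python
class TapArray:
    """Behavioral sketch of an N-tap, W-bit shift register chain (hypothetical model)."""

    def __init__(self, num_taps, width):
        self.num_taps = num_taps
        self.mask = (1 << width) - 1  # assumption: values kept as unsigned W-bit patterns
        self.taps = [0] * num_taps

    def reset(self):
        """Clear every register, as happens when reset_n is asserted."""
        self.taps = [0] * self.num_taps

    def clock(self, din):
        """One clock edge: din enters the first register, older values move down one place."""
        self.taps = [din & self.mask] + self.taps[:-1]
        return list(self.taps)
```

After three clock edges with inputs 1, 2 and 3, a 4-tap model holds [3, 2, 1, 0]: the newest sample sits in the first register and the oldest valid sample is furthest along the chain.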
| Name | Default Value | Allowed Values | Description |
|---|---|---|---|
| Filter Type | Single-cycle | Single-cycle, multi-cycle, decimation or interpolation | Type of filter selected by the user. This determines the rest of the parameter options. |
| Data Width | 8 bits | Real: 4 to 32 bits; Complex: 4 to 16 bits | Width of input data (W) in bits. The width of the coefficients is also equal to this parameter. For complex data types, the data width is equal to the width of the real part and the range is from 4 to 16 bits. |
| Number of Taps | 16 | 4 to 64 | Number of taps (N) in the filter. |
| Computational Cycles | 2 | 2 to 32 | Number of cycles (C) used to perform the filtering process in multi-cycle filters. The output is computed once every C cycles. |
| Decimation Ratio | 2 | 2 to 32 | For decimation filters. Decimation is downsampling of the bit stream. |
| Interpolation Ratio | 2 | 2 to 32 | For interpolation filters. Interpolation is the reverse of decimation. |
| Rounding Method | Nearest | Truncation or nearest | Types of rounding available. |
| Arithmetic Type | Signed | Signed or unsigned | Specifies the type of arithmetic modules for the core. If the symmetricity of the core is even or odd, then the arithmetic type is always signed. |
| Data Type | Real | Real or complex | Specifies the data type of the inputs (din and coeff) and the output (dout) of the Parallel FIR core. When complex I/O mode is selected, the arithmetic type is always signed. |
| Complex I/O Mode | Parallel | Parallel or serial | In the parallel I/O mode, real and imaginary parts are applied on the data bus in the same clock cycle. In the serial mode, real data is applied in the first clock cycle, followed by the imaginary data in the next cycle. |
| Output Width | Full precision | 4 to 97 | Width of output data (OW) in bits. If the width is less than the maximum output width determined by the core generator, the outputs are scaled. |
| Coeffs Loadable | Fixed | Fixed or run-time loadable | Determines if the coefficients are run-time loadable. If the coefficients are run-time loadable, the core has two additional input ports, coeff and loadc, for loading purposes. If the coefficients are fixed during core configuration, no additional input ports are used. |
| Coefficients Format | Hexadecimal | Hexadecimal or decimal | The coefficient values are either in hexadecimal or decimal format. |
| Symmetricity | Even | None, even, or odd | Specifies the impulse response of the filter. Even symmetricity applies to a symmetric impulse response, while Odd symmetricity applies to an anti-symmetric impulse response. Decimation and interpolation filters do not have symmetricity (the value None should be selected). If the symmetricity of the core is even or odd, then the arithmetic type is always signed. |
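As an illustration of the Symmetricity parameter, the following sketch classifies a coefficient list using the usual definitions: a symmetric response satisfies h[n] = h[N-1-n], and an anti-symmetric response satisfies h[n] = -h[N-1-n]. The function name classify_symmetricity is hypothetical and is not part of the core generator.

```python
def classify_symmetricity(coeffs):
    """Return "even", "odd" or "none" for a list of filter coefficients.

    "even" means a symmetric impulse response, "odd" an anti-symmetric one.
    An all-zero list satisfies both tests and is reported as "even".
    """
    n = len(coeffs)
    if all(coeffs[i] == coeffs[n - 1 - i] for i in range(n)):
        return "even"
    if all(coeffs[i] == -coeffs[n - 1 - i] for i in range(n)):
        return "odd"
    return "none"
```

For example, [1, 2, 2, 1] is classified as even and [1, 2, -2, -1] as odd. An anti-symmetric filter with an odd number of taps must have a zero center coefficient, as in [1, 0, -1].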
Coefficient Registers
The Coefficient Registers module stores the FIR filter coefficients. The coefficients can either be loaded at run time or fixed during core generation. If the coefficients are fixed, their values are hardcoded and the coeff bus and the loadc port are not used in this module. If the coefficients are configured to be loadable, they are loaded into the coeff registers sequentially at every clock edge. Loading starts at the first clock edge after loadc goes high and continues as long as loadc is active.
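The loading sequence can be pictured with the following sketch, which replays one coeff value and one loadc value per clock edge. The register ordering, the restart on a new loadc pulse, and the handling of extra values are assumptions made for the illustration; the text above does not specify them. The function name load_coefficients is hypothetical.

```python
def load_coefficients(num_taps, coeff_bus, loadc):
    """Replay per-clock coeff/loadc samples and return the register contents.

    Assumptions (not stated in the guide): the k-th value loaded goes to
    register k, a new loadc pulse restarts at register 0, and values
    beyond num_taps are ignored.
    """
    regs = [0] * num_taps
    index = 0
    for value, enable in zip(coeff_bus, loadc):
        if not enable:
            index = 0  # loadc inactive: nothing is loaded on this edge
            continue
        if index < num_taps:
            regs[index] = value
            index += 1
    return regs
```

With loadc held high for four consecutive clock edges, four consecutive values on the coeff bus fill a 4-tap register set; values present while loadc is low are ignored.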
Data Scheduler
Data scheduling is necessary to supply the tap and coefficient data to the multiplier bank for multi-cycle computations. This module contains the multiplexers that feed the tap and coefficient data to the multiplier bank in batches. For a multi-cycle implementation with C cycles, the number of multipliers, M, is equal to N/C rounded up to the next integer. For a fully parallel implementation (C = 1), the data scheduler reduces to a direct connection. The data scheduler is also used to multiplex data for optimizing decimation and interpolation filters.
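A minimal sketch of the batching idea follows. It computes M = ceil(N/C) and lists which tap indices could be routed to the multiplier bank in each cycle. The contiguous grouping of taps is an assumption chosen for clarity; the actual core may order the batches differently. The function names are hypothetical.

```python
def multiplier_count(num_taps, cycles):
    """M = ceil(N / C), computed with integer arithmetic."""
    return (num_taps + cycles - 1) // cycles


def schedule(num_taps, cycles):
    """Return, for each cycle, the tap indices sent to the multiplier bank.

    Assumption: taps are grouped contiguously, M per cycle, and the last
    batch may be shorter when N is not a multiple of C.
    """
    m = multiplier_count(num_taps, cycles)
    return [list(range(c * m, min((c + 1) * m, num_taps))) for c in range(cycles)]
```

For N = 10 and C = 4, this gives M = 3 and the batches [0, 1, 2], [3, 4, 5], [6, 7, 8] and [9]. For C = 1 there is a single batch containing every tap, which corresponds to the direct connection.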
Multiplier Bank
The Multiplier Bank has M multipliers, each W bits wide, where M is the number of taps N divided by the number of computational cycles C, rounded up to the next integer (M = ceil(N/C)). For a fully parallel implementation, the number of multipliers is equal to the number of taps. The input to the bank comes from the data scheduler and the output goes to the adder tree. The maximum delay through the multiplier bank is equal to the delay of a single multiplier.
Adder Tree and Output Control Unit
The Adder Tree consists of parallel adders instantiated as a binary tree. The Output Control Unit contains the scaling and rounding logic that provides output scalability and selectable rounding. There are also data registers to provide a synchronous, registered output from the filter core. For multi-cycle or decimation filtering, the block contains an additional adder which, combined with the output registers, forms an accumulator.
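The two functions below sketch a binary adder tree and an output scaling step. They are illustrative only: the scaling function assumes that the output is reduced from the full width to the requested width by dropping least significant bits, and that nearest rounding adds half an LSB before the shift. Neither detail is confirmed by the text above. Overflow handling is omitted, and both function names are hypothetical.

```python
def adder_tree(values):
    """Sum values level by level and return (total, number of adder levels)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd levels so every adder has two inputs
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        depth += 1
    return (level[0] if level else 0), depth


def scale_output(value, full_width, out_width, method="nearest"):
    """Reduce a full-precision result to out_width bits (assumed LSB-dropping scheme)."""
    shift = full_width - out_width
    if shift <= 0:
        return value  # output is already full precision
    if method == "truncation":
        return value >> shift  # arithmetic shift, rounds toward minus infinity
    if method == "nearest":
        return (value + (1 << (shift - 1))) >> shift  # add half an LSB, then shift
    raise ValueError("method must be 'truncation' or 'nearest'")
```

Eight products pass through three adder levels. Scaling the 4-bit value 11 down to 2 bits gives 2 with truncation and 3 with nearest rounding.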
Core Operation
There are four distinct implementations of the parallel FIR filter: single-cycle, multi-cycle, decimation and interpolation. This section describes these implementation types in detail and also includes a note on rounding and truncation. The complex data type is supported in all the filter implementations. For a complex data type, the input data can be supplied either all at once (complex-parallel) or in two stages, real data followed by imaginary data (complex-serial). The following notations are used:
N Number of taps
W Width of input data and coefficients
C Number of cycles for a multi-cycle operation
D Decimation ratio
U Interpolation ratio
M Number of multipliers, determined as M = ceil(N/C)
OW Output width
OFW Output full width
Single Cycle
This is the simplest of all implementations, in that it assumes that sufficient resources are available for a fully parallel implementation. For an N-tap filter, it uses N multipliers and N - 1 adders. The output is available on every cycle. The timing diagrams for the single-cycle implementations are given in Figures 2 and 3. As seen in Figure 3, the real and imaginary parts of the input are supplied in successive clock cycles in complex-serial mode, so the data rate is equal to half the clock rate. The input irdy should be asserted high to coincide with every valid real data word at the din port. Similarly, the core asserts the output real_out whenever the real part of the output data is placed on the output bus.
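The single-cycle behavior can be sketched as follows: on every clock a new sample enters the tap array and a full N-term sum of products is produced. This is a behavioral illustration that ignores pipeline latency and bit widths. The helper deserialize_complex shows how a complex-serial stream (real word, then imaginary word) pairs up into complex samples at half the clock rate. Both function names are hypothetical.

```python
def fir_single_cycle(samples, coeffs):
    """Produce one output per input sample using all N taps at once."""
    taps = [0] * len(coeffs)
    outputs = []
    for x in samples:
        taps = [x] + taps[:-1]  # shift the new sample into the tap array
        outputs.append(sum(t * h for t, h in zip(taps, coeffs)))  # N multiplies, N - 1 adds
    return outputs


def deserialize_complex(words):
    """Pair a complex-serial word stream (real, imaginary, real, ...) into complex samples."""
    return [complex(re, im) for re, im in zip(words[0::2], words[1::2])]
```

Feeding a unit impulse into a filter with coefficients [1, 2, 3] returns the coefficients themselves, followed by zeros.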
Figure 2. Timing for Single-cycle, Real or Complex-parallel Mode
Figure 3. Timing for Single-cycle, Complex-serial Mode
Multi-cycle
In a multi-cycle implementation, each output is computed over a period of C cycles. The implementation is similar to the parallel implementation, except that fewer resources are used over multiple cycles. The number of multipliers is M = ceil(N/C), so the multipliers and adders used are roughly 1/C of those used in a fully parallel implementation. There is an additional accumulator (an adder and a register combination) to accumulate the final sum through the C cycles. The timing diagrams for multi-cycle implementations are given in Figures 4 and 5.
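The accumulation over C cycles can be sketched as below. In each cycle at most M = ceil(N/C) products are formed and added to a running sum, and after C cycles the accumulator holds the same value a fully parallel filter would produce. The contiguous batching of taps is an assumption made for the illustration, and the function name is hypothetical.

```python
def multi_cycle_output(taps, coeffs, cycles):
    """Compute one filter output over `cycles` cycles using M multipliers per cycle.

    Returns the final accumulated sum and the accumulator value after each cycle.
    """
    n = len(taps)
    m = (n + cycles - 1) // cycles  # M = ceil(N / C)
    acc = 0
    history = []
    for c in range(cycles):
        lo, hi = c * m, min((c + 1) * m, n)
        acc += sum(taps[i] * coeffs[i] for i in range(lo, hi))  # at most M products
        history.append(acc)
    return acc, history
```

For five taps holding 1 to 5, unit coefficients and C = 2, the accumulator holds 6 after the first cycle and 15 after the second, which matches the direct sum of products.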
Real and Complex-parallel Modes
In the real and complex-parallel modes, the signal irdy is asserted during the first cycle of a multi-cycle operation. The data output of the core changes every C cycles and remains unchanged during the data cycle (each data cycle is C clock cycles wide). The output ordy goes high during the first clock cycle of each data cycle. This operation is shown in Figure 4.
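The output behavior described above can be sketched by expanding a list of filter results into per-clock dout and ordy values. The sketch ignores the core's processing latency and only illustrates that each result is held for C clock cycles, with ordy high in the first of them. The function name output_waveform is hypothetical.

```python
def output_waveform(results, cycles):
    """Expand filter results into per-clock (dout, ordy) sequences.

    Each result is held for `cycles` clocks; ordy is 1 only in the first
    clock of each data cycle. Latency from input to output is not modeled.
    """
    dout, ordy = [], []
    for r in results:
        dout.extend([r] * cycles)
        ordy.extend([1] + [0] * (cycles - 1))
    return dout, ordy
```

For two results and C = 3, dout holds each value for three clocks and ordy follows the pattern 1, 0, 0, 1, 0, 0.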
[Figure 2 timing diagram: signals clk, din and dout, with internal data processing between input and output]
[Figure 3 timing diagram: signals clk, din, irdy, dout and real_out, with internal data processing between input and output]
