DSP Building Blocks & Arithmetic

Low-Area Dual Basis Divider over GF($2^m$)

Authors:

Leilei Song, University of Minnesota (U.S.A.)
Keshab K. Parhi, University of Minnesota (U.S.A.)

Volume 1, Page 627

Abstract:

This paper presents a low-area finite field divider using dual basis representation. The divider is based on a division algorithm that solves the discrete Wiener-Hopf equation using Gauss-Jordan elimination. The hardware complexity of the matrix generation part has been reduced dramatically, from $O(m^2)$ to $O(m)$. When used as a building block in a larger system, this divider can achieve further hardware savings through sub-structure sharing techniques.

ic970627.pdf

VLSI Architecture for Datapath Integration of Arithmetic over $GF(2^m)$ on Digital Signal Processors

Authors:

Wolfram Drescher, Technical University of Dresden (Germany)
Kay Bachmann, Technical University of Dresden (Germany)
Gerhard P. Fettweis, Technical University of Dresden (Germany)

Volume 1, Page 631

Abstract:

This paper examines the implementation of finite field arithmetic (multiplication, division, and exponentiation) for any standard basis $GF(2^m)$ with $m \le 8$ on a DSP datapath. We show how the cells and the interconnection structure of a typical binary multiplier unit can be reused for the finite field operations by adding only a small logic overhead. We develop division and exponentiation from multiplication at the algorithm level and present a simple scheme for implementing all operations on a processor datapath.
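As a rough software illustration of building division and exponentiation on top of multiplication in a standard basis $GF(2^m)$, the sketch below uses shift-and-XOR multiplication, square-and-multiply exponentiation, and inversion via $b^{2^m-2}$. It is not the datapath scheme of the paper: the function names `gf_mul`, `gf_pow`, and `gf_div` and the default field polynomial `0x11B` (the AES polynomial for $m = 8$) are assumptions chosen for the example.

```python
def gf_mul(a, b, poly=0x11B, m=8):
    """Multiply a and b in GF(2^m), standard basis, modulo `poly` (assumed irreducible)."""
    result = 0
    for _ in range(m):
        if b & 1:            # add (XOR) the current multiple of a
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x
        if a & (1 << m):     # reduce when the degree reaches m
            a ^= poly
    return result


def gf_pow(a, e, poly=0x11B, m=8):
    """Exponentiation by repeated squaring, using only gf_mul."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a, poly, m)
        a = gf_mul(a, a, poly, m)
        e >>= 1
    return result


def gf_div(a, b, poly=0x11B, m=8):
    """Division as a * b^(2^m - 2), since b^(2^m - 1) = 1 for nonzero b."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^m)")
    return gf_mul(a, gf_pow(b, (1 << m) - 2, poly, m), poly, m)
```

For example, `gf_div(gf_mul(0x57, 0x83), 0x83)` recovers `0x57`. Inversion through exponentiation costs several multiplications per division, which is the usual trade-off when only a multiplier is available.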

ic970631.pdf

A Fast Direction Sequence Generation Method for CORDIC Processors

Authors:

Seunghyeon Nahm, Seoul National University (Korea)
Wonyong Sung, Seoul National University (Korea)

Volume 1, Page 635

Abstract:

This paper describes a new direction sequence generation method for the circular CORDIC algorithm. The conventional approach employs an angle computation algorithm to control the direction of rotation in the form of a sign sequence, and this sign generation is a bottleneck for fast implementations. The proposed method reduces the number of sequential computations by employing a new angle representation model and linearizing the arctangent function for small angles. The direction sequence can be generated with about a third of the iterative computations required by the conventional algorithm, which reduces the hardware requirements by a similar amount. The algorithm is especially attractive when pipelining is not allowed because of feedback control, as in phase tracking applications. A VLSI implementation example for a high-speed quadrature demodulator is also discussed.
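For orientation, the conventional sequential scheme that the abstract refers to can be sketched as follows: each direction $d_i$ is the sign of the residual angle, which is then updated by $\arctan(2^{-i})$, so every sign depends on the previous step. This is a floating-point sketch of the textbook circular CORDIC in rotation mode, not the proposed method; the function names and the default iteration count are assumptions made for the example.

```python
import math


def cordic_directions(angle, n=24):
    """Conventional sequential generation of the direction (sign) sequence."""
    z = angle
    dirs = []
    for i in range(n):
        d = 1 if z >= 0 else -1          # direction = sign of the residual angle
        dirs.append(d)
        z -= d * math.atan(2.0 ** -i)    # each step depends on the previous one
    return dirs


def cordic_rotate(x, y, dirs):
    """Apply the micro-rotations given by dirs, then remove the constant gain."""
    for i, d in enumerate(dirs):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
    gain = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(len(dirs)))
    return x / gain, y / gain
```

Calling `cordic_rotate(1.0, 0.0, cordic_directions(0.5))` returns values close to `(cos 0.5, sin 0.5)`. The loop in `cordic_directions` is the serial dependency that limits speed when the sign sequence cannot be pipelined.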

ic970635.pdf

A Radix-4 Redundant CORDIC Algorithm with Fast On-line Variable Scale Factor Compensation

Authors:

Chieh-Chih Li, Industrial Research Institute (Taiwan)
Sau-Gee Chen, Nat. Chiao Tung University (Taiwan)

Volume 1, Page 639

Abstract:

In this work, a fast radix-4 redundant CORDIC algorithm with variable scale factor is proposed. The algorithm includes an on-line scale factor decomposition algorithm that transforms the complicated variable scale factor into a sequence of simple shift-and-add operations and performs the variable scale factor compensation in the same fashion. Moreover, the on-line decomposition algorithm itself can be realized with simple and fast hardware. The new CORDIC algorithm needs about 0.8n iterations, the fewest among all CORDIC algorithms and only about two-thirds of the rotations required by the best existing (hybrid radix-2 and radix-4) redundant algorithms. The new algorithm therefore achieves fast rotation iterations together with high-speed, low-overhead scale factor compensation, which are hard to attain simultaneously with existing algorithms. The on-line scale factor compensation can also be applied to existing on-line CORDIC algorithms.

ic970639.pdf

Pipelining of CORDIC Based IIR Digital Filters

Authors:

Jun P. Ma, University of Minnesota (U.S.A.)
Keshab K. Parhi, University of Minnesota (U.S.A.)
Ed F. Deprettere, Delft University of Technology (The Netherlands)

Volume 1, Page 643

Abstract:

CORDIC-based IIR digital filters possess desirable properties for VLSI implementation, such as local connections, regularity, and good finite word-length behavior, but cannot be pipelined to finer levels (such as bit or multi-bit levels) because of the feedback loops. In this paper, a pipelining method for CORDIC-based IIR digital filters is proposed that uses constrained filter design methods and the polyphase decomposition technique. With this method, the filter sample rate can be increased to any desired level.

ic970643.pdf

An Asynchronous Implementation of the MAXLIST Algorithm

Authors:

Chris J. Myers, University of Utah (U.S.A.)
Hao Zheng, University of Utah (U.S.A.)

Volume 1, Page 647

Abstract:

We present an efficient asynchronous VLSI architecture for calculating running maximum or minimum values over a sliding window. Running maximums or minimums are very useful for many signal and image processing tasks. Our architecture performs the calculation using the MAXLIST algorithm. In order to take advantage of the wide delay variations due to data-dependencies and operating conditions, an asynchronous approach is taken to achieve higher performance and lower power. Simulation results demonstrate that our asynchronous architecture is significantly faster than existing and potential synchronous architectures.
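To make the running-maximum task concrete, here is a small software sketch that keeps a list of candidate maxima for the current window. It uses the common monotonic-deque technique, which is an assumption made for illustration and is not claimed to match the MAXLIST algorithm or the asynchronous architecture of the paper; the function name `running_max` is likewise hypothetical.

```python
from collections import deque


def running_max(samples, window):
    """Return the maximum of each full sliding window over `samples` (a list)."""
    candidates = deque()   # indices whose values are strictly decreasing
    out = []
    for i, v in enumerate(samples):
        # Older samples that are not larger than v can never be a maximum again.
        while candidates and samples[candidates[-1]] <= v:
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it has left the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        if i >= window - 1:
            out.append(samples[candidates[0]])
    return out
```

The amount of work per sample depends on the data (how many candidates are discarded), which is the kind of data-dependent delay variation the abstract mentions. A running minimum follows by flipping the comparison to `>=`.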

ic970647.pdf

A Novel Systematic Mapping Approach for Highly Efficient Multiplexed FIR-Filter Architectures

Authors:

Wolfgang Wilhelm, RWTH Aachen (Germany)
Tobias Noll, RWTH Aachen (Germany)

Volume 1, Page 651

Abstract:

A systematic mapping approach leading to efficient VLSI architectures for FIR filters with a wide range of system parameters is presented. The approach is divided into two steps. In the first step, the folding technique is applied at bit level. In the second step, the free parameters of this technique are fixed according to guidelines derived from design strategies for efficient VLSI architectures. For many applications this approach reduces hardware complexity in comparison with state-of-the-art techniques. In addition, the regularity and scalability of the resulting architectures keep the design effort small. To demonstrate the efficiency and flexibility of the approach, a new class of efficient time-shared FIR filters for adaptive equalization and a new class of efficient matched filters for rapid code acquisition in spread spectrum receivers are presented.