Optimized FIR Filter Implementation Using SSE Instructions

Intel® software products take advantage of the performance potential of the Intel® Atom™ processor. When compiler technologies or optimized off-the-shelf libraries are not sufficient to meet extreme performance requirements, hand-optimized routines are justified to maximize performance. This paper describes the step-by-step development of an ultra-fast finite impulse response (FIR) filter using Intel® Streaming SIMD Extensions (Intel® SSE) and other Intel Atom processor features.

FIR filters are one of the primary types of filters used in digital signal processing. This paper describes the optimization of a 16-bit fixed-point FIR filter of order 63. Over several steps, the filter performance was improved by a factor of more than 5 and brought close to the theoretical limit of the current architecture of Intel® Atom™ processors. The gains came from loop unrolling, aggressive use of Intel® SSE instructions, attention to memory alignment, and selection of the most efficient rather than the most obvious SSE instructions.
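As a rough illustration of where such an optimization starts, the sketch below shows a scalar 64-tap, 16-bit FIR loop next to a straightforward SSE2 variant built on `_mm_madd_epi16`. It is not the paper's optimized routine. The function names, the 32-bit accumulator, the unaligned loads, and the convention that `h[k]` is applied to `x[n + k]` are all assumptions made for this example, and none of the unrolling or alignment work is shown.

```c
#include <emmintrin.h>  /* SSE2 integer intrinsics */
#include <stdint.h>

#define TAPS 64  /* an order-63 filter has 64 coefficients */

/* Scalar reference: y[n] = sum over k of h[k] * x[n + k].
   Assumption: x holds n_out + TAPS - 1 samples and the coefficients
   are scaled so that the sum fits in 32 bits. */
void fir_scalar(const int16_t *x, const int16_t *h, int32_t *y, int n_out)
{
    for (int n = 0; n < n_out; n++) {
        int32_t acc = 0;
        for (int k = 0; k < TAPS; k++)
            acc += (int32_t)x[n + k] * h[k];
        y[n] = acc;
    }
}

/* SSE2 variant: each _mm_madd_epi16 multiplies eight 16-bit pairs and
   adds adjacent products, giving four 32-bit partial sums. */
void fir_sse2(const int16_t *x, const int16_t *h, int32_t *y, int n_out)
{
    for (int n = 0; n < n_out; n++) {
        __m128i acc = _mm_setzero_si128();
        for (int k = 0; k < TAPS; k += 8) {
            /* Unaligned loads keep the sketch simple. */
            __m128i xv = _mm_loadu_si128((const __m128i *)(x + n + k));
            __m128i hv = _mm_loadu_si128((const __m128i *)(h + k));
            acc = _mm_add_epi32(acc, _mm_madd_epi16(xv, hv));
        }
        /* Horizontal sum of the four 32-bit lanes. */
        acc = _mm_add_epi32(acc, _mm_srli_si128(acc, 8));
        acc = _mm_add_epi32(acc, _mm_srli_si128(acc, 4));
        y[n] = _mm_cvtsi128_si32(acc);
    }
}
```

Both functions produce the same 32-bit results for the same input, so the scalar loop can serve as a correctness check while the SIMD version is tuned further.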

The methodologies described here can be applied to other FIR filters with little or no modification. The filter order and the number of output values can be changed easily, though the number of output values must be a multiple of eight. The benefit of Intel SSE instructions grows with the number of output values and with the filter order. Floating-point FIR filters can be optimized following the same recipes, with the Intel® SSE instruction set allowing for four-way parallelism (four single-precision values per 128-bit register).
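The four-way floating-point case can be sketched in the same style. The example below is an illustration only: `fir_float_sse` is a hypothetical name, the tap count is assumed to be a multiple of four, and unaligned loads are used for simplicity rather than speed.

```c
#include <xmmintrin.h>  /* SSE single-precision intrinsics */

/* Single-precision FIR: y[n] = sum over k of h[k] * x[n + k].
   Assumptions: taps is a multiple of 4 and x holds
   n_out + taps - 1 samples. */
void fir_float_sse(const float *x, const float *h, float *y,
                   int n_out, int taps)
{
    for (int n = 0; n < n_out; n++) {
        __m128 acc = _mm_setzero_ps();
        for (int k = 0; k < taps; k += 4) {
            __m128 xv = _mm_loadu_ps(x + n + k);  /* four samples */
            __m128 hv = _mm_loadu_ps(h + k);      /* four coefficients */
            acc = _mm_add_ps(acc, _mm_mul_ps(xv, hv));
        }
        /* Horizontal sum: fold lanes 2,3 onto 0,1, then lane 1 onto 0. */
        __m128 sum2 = _mm_add_ps(acc, _mm_movehl_ps(acc, acc));
        __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
        y[n] = _mm_cvtss_f32(_mm_add_ss(sum2, lane1));
    }
}
```

Because the SIMD version adds the products in a different order than a scalar loop, results on general data may differ from a scalar reference in the last bits.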

Multirate filters are commonly implemented using FIR filters. Using interpolation and decimation, the output is resampled to a different data rate. For example, 640 input values may result in 480 output values, a rate change of 3:4. The optimization steps described in this paper also allow for optimal performance of multirate filters on Intel® architecture-based processors.
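To make the 640-to-480 example concrete, the following scalar sketch resamples by a rational factor `up / down` (3/4 in that case) with a polyphase FIR loop. It illustrates the general technique and is not taken from the paper. `resample_fir` and its parameters are hypothetical names, samples before the start of the input are treated as zero, and the coefficients are assumed to be scaled so that the sums fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Rational resampler: conceptually upsample by `up` (zero stuffing),
   filter with h, then keep every `down`-th sample. The polyphase form
   skips the multiplications by the stuffed zeros.
   Returns the number of outputs written: n_in * up / down. */
size_t resample_fir(const int16_t *x, size_t n_in,
                    const int16_t *h, size_t n_taps,
                    unsigned up, unsigned down, int32_t *y)
{
    size_t n_out = n_in * up / down;
    for (size_t m = 0; m < n_out; m++) {
        size_t t = m * down;      /* position in the upsampled stream */
        size_t phase = t % up;    /* first coefficient that hits a real sample */
        size_t base = t / up;     /* newest input sample that contributes */
        int32_t acc = 0;
        /* Step through h in strides of `up`, walking back through x. */
        for (size_t i = 0, k = phase; k < n_taps && i <= base; i++, k += up)
            acc += (int32_t)h[k] * x[base - i];
        y[m] = acc;
    }
    return n_out;
}
```

With `n_in = 640`, `up = 3`, and `down = 4`, the function writes 480 outputs. The inner loop is a short dot product per output, so the SSE techniques used for the single-rate filter would apply to it as well, although the strided access to `h` would first need the coefficients regrouped by phase.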
