
Image Processing Applications On New Generation FPGAs

The new generation of FPGAs with DSP resources and embedded processors is attracting the interest of the image processing market. With these enhanced capabilities, most of the DSP processing work can be offloaded from the software stack to the embedded processors and DSP resources on the FPGA, improving performance and reducing the cost of the whole system.

The traditional way of implementing algorithms in software limits performance because the data is processed serially. The operating frequency can be raised to a certain extent to reach the data rate required for the image data, but increasing the frequency beyond certain limits causes system-level and board-level issues that become a bottleneck in the design.

With the current image processing applications moving towards consumer markets, the amount of data to be processed has increased at a fast pace. The new compression algorithms on the market are keeping up with the increasing data requirement.

DSP processors are also trying to keep up with these requirements. With the ever-increasing need for processing, parallel processing in hardware can help reduce the processing overhead and offer better system performance. An FPGA's flexible architecture enables parallel processing, providing a proper balance between the performance and the cost of the system. In addition, the ability to reprogram the device gives a quick turnaround time.

With the right combination of IP cores, fast time-to-market can be achieved with real-time prototyping. At the same time, FPGAs provide the flexibility to upgrade to new standards. Figure 1 shows a group of image IP cores that can be easily reshuffled to quickly create applications such as video cell phones, set-top boxes, LCD projectors, Keyboard, Video and Mouse over IP (KVMIP), and digital cameras and camcorders. Along with parallel processing and high data rates, these groups of IP cores also provide high configurability, which helps to fine-tune the system to reach a target performance.

The image processing block can be divided into two subsections, namely pixel processing blocks and frame processing blocks. The pixel processing blocks work directly on the incoming pixel data, whereas the frame processing blocks work on images stored as frames. Color space converters, gamma correction and brightness control are examples of pixel processing blocks. Static Huffman, AES, DCT, and interlacing/de-interlacing come under the category of frame processing blocks.

Figure 1. Image IP Blocks

Implementation of pixel processing blocks, like color space conversion, can be a simple job if FPGA resources are freely available and the pixel data frequency is low. The implementation on an FPGA with at least nine multipliers can be fairly simple. The matrix coefficients and offsets are stored in ROM or loaded dynamically through an external host configuration interface. Conversions are performed using a generic 3×3 matrix multiplication. The same can be achieved with just three multipliers, if the incoming pixel frequency is low, by running the internal core at three times the pixel frequency.
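The two multiplier budgets described above can be compared with a small software model. This is only a sketch under stated assumptions: the function names are hypothetical, the coefficients are full-range BT.601 RGB to YCbCr values rounded to 8 fractional bits (the article does not name a particular conversion), and the inner loop of the time-multiplexed version stands in for three multipliers working in the same core clock cycle.

```python
# Hypothetical software model of a 3x3 colour space converter (names and
# coefficients are illustrative assumptions, not taken from the article).
FRAC_BITS = 8  # coefficients are stored scaled by 2**8

# Assumed coefficients: full-range BT.601 RGB -> YCbCr, rounded to 8 fractional bits.
COEFFS = [
    [77, 150, 29],     # Y  ~  0.299 R + 0.587 G + 0.114 B
    [-43, -85, 128],   # Cb ~ -0.169 R - 0.331 G + 0.500 B
    [128, -107, -21],  # Cr ~  0.500 R - 0.419 G - 0.081 B
]
OFFSETS = [0, 128, 128]
ROUND = 1 << (FRAC_BITS - 1)


def clamp8(value):
    """Saturate to the 8-bit output range."""
    return max(0, min(255, value))


def scale(acc, offset):
    """Round, drop the fractional bits, add the offset and saturate."""
    return clamp8(((acc + ROUND) >> FRAC_BITS) + offset)


def csc_parallel(pixel):
    """Nine multipliers: every product of the 3x3 matrix in one pixel clock."""
    return tuple(
        scale(sum(c * p for c, p in zip(row, pixel)), offset)
        for row, offset in zip(COEFFS, OFFSETS)
    )


def csc_time_multiplexed(pixel):
    """Three multipliers reused over three core cycles per pixel."""
    acc = [0, 0, 0]
    for cycle in range(3):        # core clock = 3 x pixel clock
        for mult in range(3):     # one multiplier per output channel
            acc[mult] += COEFFS[mult][cycle] * pixel[cycle]
    return tuple(scale(a, o) for a, o in zip(acc, OFFSETS))
```

Both functions produce the same result for any input pixel; the difference in hardware would be nine multipliers at the pixel clock versus three multipliers plus accumulators at three times the pixel clock.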

When implementing a simple color space conversion IP, it really helps to keep a couple of hooks in the design to make it flexible, for example to add or remove pipeline registers or to configure the number of multipliers to be used. When you move to the Xilinx Virtex-4, the built-in multiplier and accumulator blocks simplify the process even more. All of these considerations reduce the time it takes to customize the IP for a particular FPGA while integrating it with other cores for different applications, saving the engineering time needed to modify the core and avoiding the human errors that come with doing so.

Image resize and image rotation blocks can be implemented either as pixel processing blocks, with the overhead of line storage and a low operating frequency, or as frame processing blocks. A real-time image resize can take up to eight blocks of RAM, along with adder and multiplier trees, and offers only limited upscaling and downscaling capability. This can be a good solution for a system that requires real-time image processing without frame storage in an external memory like DDR. But if you are targeting applications that are constrained by FPGA area and have a high pixel frequency, real-time image resize might not be feasible, and external DDR/SDRAM storage offers a better solution.

Proper partitioning of the logic between hardware and software is one of the factors that decide the overall system efficiency. To implement image resizing entirely in hardware, you can have the hardware calculate the complete image size and derive the horizontal and vertical displacements it needs to read and compress. This logic uses up a good amount of hardware and arithmetic blocks. Since the image size and displacement calculation is not performed often, it can be moved to software, which then simply writes the image size, vertical displacement and horizontal displacement values into the control registers.
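A minimal sketch of the software side of that partition follows. The register names, the 16-bit fixed-point format and the interpretation of "displacement" as a per-output-pixel step through the source image are all assumptions made for illustration; the article does not define them.

```python
# Hypothetical software helper: compute the values a resize core would need
# once per configuration, so the hardware only has to apply them.
FRAC_BITS = 16  # assumed fixed-point format of the displacement registers


def resize_registers(src_w, src_h, dst_w, dst_h):
    """Return control register values (names are illustrative only)."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("image dimensions must be positive")
    return {
        "IMG_WIDTH": dst_w,
        "IMG_HEIGHT": dst_h,
        # Step through the source image per output pixel, in fixed point.
        "H_DISP": (src_w << FRAC_BITS) // dst_w,
        "V_DISP": (src_h << FRAC_BITS) // dst_h,
    }


def source_index(dst_index, step):
    """What the hardware would do per pixel: one multiply and one shift."""
    return (dst_index * step) >> FRAC_BITS


regs = resize_registers(1920, 1080, 960, 540)
print(regs["H_DISP"] / (1 << FRAC_BITS))  # 2.0 source pixels per output pixel
```

The divisions run once in software whenever the image size changes, while the per-pixel work left to the hardware is a multiply and a shift (or an accumulator that adds the step each pixel).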

Video compression plays a vital role in image processing applications. Uncompressed high-definition pictures can easily take 1920 × 1080 × 24 × 30 ≈ 1.49 Gbps. Reserving the compression blocks to work on stored data helps in tolerating the latency that the IP cores might incur while running at a high system frequency.
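The figure above is simply width × height × bits per pixel × frames per second. A short helper (the function name is hypothetical) makes it easy to repeat the calculation for other formats:

```python
def raw_video_bitrate(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


bps = raw_video_bitrate(1920, 1080, 24, 30)
print(f"{bps / 1e9:.2f} Gbps")  # prints "1.49 Gbps"
```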

However, when many frame processing cores fetch data from memory such as DDR, SDRAM or SRAM, it becomes crucial to define the interfaces between the IP cores and the memory controllers.

Any standard bus like PLB, OPB or AMBA can be an easy solution to this, but does the IP really need that overhead? A very simple protocol can be defined with the least overhead in the form of request and grant. Mechanisms like multiplexing the address, data and burst length over the same lines can easily reduce the number of lines, which matters far more in an FPGA than in an ASIC, where every route is created as required. Standard buses are attractive because of their standardization, but remember that they were not developed for implementation in FPGAs. Image IP solutions should not take more than a couple of weeks for integration. Since bus standards are not optimized for FPGAs, we developed our own bus. Using this bus, we can optimize the design for frequency and area with the FPGA architecture in mind.
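To make the request/grant idea concrete, here is a small behavioral sketch. It is not the bus described in the article, whose details are not given: the round-robin policy, the class and function names, and the order in which address, burst length and data share the lines are all assumptions chosen for illustration.

```python
class RoundRobinArbiter:
    """Hypothetical request/grant arbiter for IP cores sharing one memory port."""

    def __init__(self, num_masters):
        self.num_masters = num_masters
        self.last = num_masters - 1  # so that master 0 is checked first

    def grant(self, requests):
        """requests: one bool per master. Returns the granted index or None."""
        for offset in range(1, self.num_masters + 1):
            idx = (self.last + offset) % self.num_masters
            if requests[idx]:
                self.last = idx
                return idx
        return None


def multiplexed_beats(address, burst_length, data_words):
    """Assumed ordering on the shared lines: address, burst length, then data."""
    if burst_length != len(data_words):
        raise ValueError("burst length must match the number of data words")
    beats = [("ADDR", address), ("LEN", burst_length)]
    beats += [("DATA", word) for word in data_words]
    return beats


arbiter = RoundRobinArbiter(3)
print(arbiter.grant([True, False, True]))  # 0
print(arbiter.grant([True, False, True]))  # 2
```

Sharing one set of lines costs two extra beats per burst in this model, in exchange for fewer routed signals between each core and the memory controller.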

One of the key points in developing image processing IP for FPGAs is the reusability and efficiency with which the hardware is implemented. This yields an efficient system in terms of cost and performance, and it also reduces time-to-market, because the IP blocks can be integrated quickly without being modified again, which avoids repeating the verification cycle.
