Image format conversion is commonly implemented within various broadcast infrastructure systems such as servers, switchers, head-end encoders, and specialty studio displays.
At the basic level, the need for image format conversion is driven by the multitude of input image formats that must be converted to high definition (HD) or a different resolution before being stored, encoded, or displayed.
Broadcast infrastructure is a fragmented market in which every vendor has slightly different ‘image format conversion’ requirements, whether in the number of channels, the output resolution, or progressive versus interlaced image processing. Different characteristics also matter in different sub-segments. Overall delay is very important in switcher applications, and latency is likewise a key factor for displays and video-conferencing systems. In server systems, image quality takes priority over latency.
Most of this video processing is done in real-time on HD (generally 1080p) resolution video and in many cases multiple channels of such video streams are processed.
Given both the custom nature of the design and the video processing performance requirements, most broadcast engineers develop these designs on FPGAs to get the highest signal processing performance.
Recognizing this market requirement, FPGA vendors have started providing tools and design frameworks that make it easy for system designers to put together a custom image format conversion signal chain.
This article details the key components of such a design framework and shows how a two-channel up-down-cross conversion design with polyphase scaling and motion-adaptive de-interlacing can be built with it.
Video Design Framework for ‘Image Format Conversion’
Custom image format conversion designs share many functions. These range from simple video pre-processing functions such as color space conversion to complex functions such as polyphase scaling engines and motion-adaptive de-interlacers. However, each customer puts these functions together differently and implements them with slightly different feature sets.
To enable customers to build their custom image format conversions quickly, FPGA vendors have developed a portfolio of video processing function blocks, starting point reference designs, tools, video interface standards, and development kits that enable rapid development and debugging of these designs.
Figure 1 shows how a collection of video function blocks and open interface standards for both streaming and memory-mapped data come together to create a custom video processing signal chain. The key functions that are extensively implemented in image format conversion designs are polyphase scaling and motion adaptive de-interlacing.
Polyphase Scaling & Motion Adaptive Deinterlacing
Put simply, video scaling is the process of generating a pixel that did not exist previously. This is true whether an image is upscaled or downscaled.
There are many ways of generating this pixel. In general, the more complex the method, the better the quality of the result.
The simplest way is to copy the preceding pixel, known as the nearest neighbor method or 1×1 interpolation. A slightly more intelligent method is to take an average of the two neighboring pixels in both the vertical and horizontal dimensions. This is known as bilinear scaling, because a 2×2 pixel array is used to compute the value of a single pixel.
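The two methods can be sketched in a few lines of Python. This is an illustrative model only, not FPGA code: the function names are made up for this example, the image is a plain list of rows, and the nearest neighbor variant picks the closest source pixel by rounding rather than strictly copying the preceding one.

```python
def _clamp(v, lo, hi):
    return min(max(v, lo), hi)

def nearest_neighbor(img, x, y):
    """Return the source pixel closest to the fractional position (x, y)."""
    h, w = len(img), len(img[0])
    xi = _clamp(int(x + 0.5), 0, w - 1)
    yi = _clamp(int(y + 0.5), 0, h - 1)
    return img[yi][xi]

def bilinear(img, x, y):
    """Blend the 2x2 block of source pixels that surrounds (x, y)."""
    h, w = len(img), len(img[0])
    x = _clamp(x, 0, w - 1)
    y = _clamp(y, 0, h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0          # fractional distance from the top-left pixel
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

Sampling halfway between four pixels shows the difference: nearest neighbor returns one of the four source values unchanged, while bilinear returns their average.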
Taking this concept further, one can compute the new pixel from ‘m’ pixels in the horizontal dimension and ‘n’ pixels in the vertical dimension. With four pixels in each dimension, for example, each of the four is assigned a different weight. This scaling algorithm, shown in Figure 2, is therefore essentially the same as polynomial fitting, with the requirement to determine the right coefficients.
Figure 2. The new pixel is generated using a polynomial fitting algorithm.
Altera’s scaling engine comes pre-built with various Lanczos polynomial fitting algorithms (or filter functions) that can be used for image resizing. Lanczos filtering is a multivariate interpolation method used to compute new values for any digitally sampled data. When used for scaling digital images, the Lanczos function indicates which pixels in the original image, and in what proportion, make up each pixel of the final image. Alternatively, customers can load their own coefficients to build a custom scaling engine.
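As a rough illustration of how such coefficients can be derived, the sketch below builds a table of Lanczos weights for one dimension and applies it to a single line of pixels. It is a software model under stated assumptions, not the vendor's implementation: the names `lanczos`, `phase_coefficients`, and `scale_line`, the choice of 16 phases, and the edge clamping are all choices made for this example. A two-dimensional scaler would apply the same idea with ‘m’ taps horizontally and ‘n’ taps vertically.

```python
import math

def lanczos(x, a):
    """Lanczos window function with 'a' lobes: sinc(x) * sinc(x / a)."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def phase_coefficients(taps, phases):
    """One row of 'taps' weights per sub-pixel phase, each normalized to sum to 1."""
    a = taps / 2
    table = []
    for p in range(phases):
        frac = p / phases
        # Tap k sits (k - (taps // 2 - 1)) pixels away from the base source pixel.
        raw = [lanczos(k - (taps // 2 - 1) - frac, a) for k in range(taps)]
        total = sum(raw)
        table.append([c / total for c in raw])
    return table

def scale_line(line, out_len, taps=4, phases=16):
    """Resample one line of pixels to out_len samples with the phase table."""
    table = phase_coefficients(taps, phases)
    step = len(line) / out_len
    out = []
    for i in range(out_len):
        pos = i * step
        base = int(pos)
        coeffs = table[int((pos - base) * phases)]
        acc = 0.0
        for k, c in enumerate(coeffs):
            # Clamp at the line edges instead of reading outside the line.
            idx = min(max(base + k - (taps // 2 - 1), 0), len(line) - 1)
            acc += c * line[idx]
        out.append(acc)
    return out
```

Because every row of weights is normalized, a flat input line stays flat after scaling. The sketch does not widen the filter when downscaling, which a production scaler would normally do to limit aliasing.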
There are various methods to de-interlace video. The method commonly used for image format conversion is some variant of motion adaptive deinterlacing. Motion adaptive deinterlacing algorithms generally first determine whether there is motion between successive frames of video.
The motion adaptive algorithm employed by Altera IP calculates motion by comparing two 3×3 windows across two consecutive frames (or four consecutive fields). The de-interlacing function then applies a form of bob de-interlacing to moving areas of the image and weave-style de-interlacing to still areas.
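The general idea can be modeled in software as follows. This is a simplified sketch, not the algorithm inside the Altera IP: the function names, the sum-of-absolute-differences motion measure, and the hard threshold are assumptions made for illustration. Each frame here is a woven frame, that is, two fields interleaved line by line, and `parity` is the row parity (0 or 1) of the newest field.

```python
def window_sad(a, b, x, y):
    """Sum of absolute differences over a 3x3 window centered on (x, y)."""
    h, w = len(a), len(a[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)   # clamp at the frame edges
            xx = min(max(x + dx, 0), w - 1)
            total += abs(a[yy][xx] - b[yy][xx])
    return total

def deinterlace(prev_frame, curr_frame, parity, threshold):
    """Return a progressive frame built from two woven frames (at least 2 rows)."""
    h, w = len(curr_frame), len(curr_frame[0])
    out = [row[:] for row in curr_frame]
    for y in range(h):
        if y % 2 == parity:
            continue                          # line comes from the newest field
        above = curr_frame[y - 1] if y - 1 >= 0 else curr_frame[y + 1]
        below = curr_frame[y + 1] if y + 1 < h else curr_frame[y - 1]
        for x in range(w):
            if window_sad(prev_frame, curr_frame, x, y) > threshold:
                # Motion detected: bob, i.e. interpolate from the newest field.
                out[y][x] = (above[x] + below[x]) // 2
            # Otherwise weave: keep the pixel from the older field as it is.
    return out
```

With identical consecutive frames the output equals the woven input, so still areas keep full vertical resolution. Where the frames differ, the missing lines are interpolated from the newest field, which avoids combing artifacts at the cost of vertical detail.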
Image Format Conversion in Broadcast Systems
Various broadcast studio systems such as servers, switchers, converters, and monitors use image format conversion. Figure 3 below shows the generic block diagram functions included in studio equipment.
Figure 3. A Functional Block Diagram of a Studio System
Typically, studio equipment ingests or plays out video over an SDI interface and may also have a DVI interface for monitoring the application. The interfaces are followed by some type of format conversion to handle the various video formats.
The key video processing functions performed within the format conversion are deinterlacing and scaling. Other functions are also needed to complete the format conversion, such as chroma resampling, color space conversion, and frame rate conversion. All are part of Altera’s video processing library.
Along with the format conversion functions, functions such as video encoders, decoders, SDI switching, audio processing, and video transport are also seen. Additionally, it is becoming common to see interfaces such as PCIe and Ethernet to enable IP connectivity.
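Two of these supporting functions are simple enough to sketch directly. The example below is a software illustration, not the library's implementation: it assumes 8-bit studio-range YCbCr with BT.709 luma coefficients, chroma samples co-sited with the even luma samples, and linear interpolation for the chroma upsampling. The function names are invented for this example.

```python
# BT.709 luma coefficients
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB

def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit studio-range YCbCr pixel to 8-bit full-range RGB."""
    yn = (y - 16) / 219.0            # luma normalized to 0..1
    pb = (cb - 128) / 224.0          # chroma normalized to -0.5..0.5
    pr = (cr - 128) / 224.0
    r = yn + 2 * (1 - KR) * pr
    b = yn + 2 * (1 - KB) * pb
    g = (yn - KR * r - KB * b) / KG
    return tuple(min(max(round(v * 255), 0), 255) for v in (r, g, b))

def upsample_422_to_444(chroma):
    """Double the horizontal chroma rate by linear interpolation."""
    out = []
    for i, c in enumerate(chroma):
        nxt = chroma[i + 1] if i + 1 < len(chroma) else c   # repeat the last sample
        out.append(c)
        out.append((c + nxt) / 2)
    return out
```

Studio-range black (16, 128, 128) maps to RGB (0, 0, 0) and studio-range white (235, 128, 128) maps to (255, 255, 255). In a 4:2:2 stream, the chroma row has half as many samples as the luma row, so upsampling it first gives every pixel its own Cb and Cr values before the color space conversion.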
FPGA vendors provide various reference designs that can help get started building these image format conversion signal chains. One of the latest designs offered for image format conversion delivers high-quality up, down and cross conversion of SD, HD, and 3G video streams in interlaced or progressive format (see Figure 4).
Figure 4. A Functional Block Diagram of an ‘Image Format Conversion’ Reference Design
The design ingests video over two SDI channels that can handle SD, HD, or 3G-SDI (1080p). The SDI video (YCbCr 4:2:2 with embedded synchronization signals) is first converted from a clocked video format to the flow controlled Avalon-ST Video protocol that is used to interface with the other video processing blocks.
Video from both channels is deinterlaced and scaled using different algorithms. One video stream goes through motion adaptive deinterlacing and polyphase 2×12 scaling, while the other goes through ‘weave’ deinterlacing and nearest neighbor scaling.
Both video streams are then overlaid and mixed with a color test pattern that is generated within the FPGA. The reference design described here is built from the ground up to enable user customization, allowing three levels of customization with varying degrees of flexibility – as illustrated in Figure 5.
Figure 5. Three Levels of Design Customization
At the top is the ability to customize the design in software. Most of these video function blocks have a run-time update capability. This means that these blocks use a memory-mapped slave interface that allows a state machine or an on-chip processor to update their behavior. Each slave interface provides access to a set of control registers; the set of available registers and their width in bits vary with each control interface.
In practice, this allows updating the scaling coefficients and changing the scaling ratio, even switching from downscaling to upscaling, while the system is running (see Figure 6).
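The control flow for such a run-time update might look like the sketch below. Everything in it is hypothetical: the `RegisterBank` class merely simulates a memory-mapped slave in software, and the register offsets (`REG_GO`, `REG_OUT_WIDTH`, `REG_OUT_HEIGHT`) are placeholders, not the register map of any actual core. The real offsets and widths must be taken from the documentation of the block in question.

```python
class RegisterBank:
    """Software stand-in for a block's memory-mapped control registers."""

    def __init__(self, size):
        self.regs = [0] * size

    def write(self, addr, value):
        self.regs[addr] = value & 0xFFFFFFFF   # model 32-bit registers

    def read(self, addr):
        return self.regs[addr]

# Hypothetical register offsets, for illustration only.
REG_GO = 0
REG_OUT_WIDTH = 1
REG_OUT_HEIGHT = 2

def set_output_resolution(bank, width, height):
    """Stop the block, load a new output size, then restart it."""
    bank.write(REG_GO, 0)
    bank.write(REG_OUT_WIDTH, width)
    bank.write(REG_OUT_HEIGHT, height)
    bank.write(REG_GO, 1)

scaler = RegisterBank(4)
set_output_resolution(scaler, 1920, 1080)   # e.g. switch the output to 1080p
```

On real hardware, the on-chip processor would issue the same sequence of writes to the scaler's control interface, so the same input can be downscaled in one frame and upscaled in a later one without rebuilding the design.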
The next level of customization is achieved by adjusting the parameters of the MegaCore functions. All of these video functions are fully parameterizable. As Figure 7 shows, the deinterlacer core allows not only the deinterlacing algorithm to be set, but also options such as the default field, pass-through mode, the number of frames buffered in external memory, the output frame rate, and control for motion bleed.