Category: Journal Papers
An Architecture for Programmable Multi-core IP Accelerated Platform with an Advanced Application of H.264 Codec Implementation
Abstract
A new integrated programmable platform architecture is presented, with the support of multiple accelerators and extensible processing cores. An advanced application for this architecture is to facilitate the implementation of H.264 baseline profile video codec. The platform architecture employs the novel concept of virtual socket and optimized memory access to increase the efficiency for video encoding. The proposed architecture is mapped on an integrated FPGA device, Annapolis WildCard-II™ or WildCard-4™, for verification. According to the evaluation under different configurations, the results show that the overall performance of the architecture, with the integrated accelerators, can sufficiently meet the real-time encoding requirement for H.264 BP at basic levels, and achieve about 2–5.5 and 1–3 dB improvement, in terms of PSNR, as compared with MPEG-2 MP and MPEG-4 SP, respectively. The architecture is highly extensible, and thus can be utilized to benefit the development of multi-standard video codec beyond the description in this paper.
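The quality gains above are quoted in PSNR. As a reminder of what that figure measures, here is a minimal sketch of the standard PSNR definition for 8-bit frames. The function name `psnr`, the nested-list frame representation, and the default peak value of 255 are assumptions made for illustration; they are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, peak=255):
    """Return the PSNR in dB between two equally sized 2-D lists of pixels.

    Illustrative helper (not from the paper): PSNR = 10 * log10(peak^2 / MSE).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("frames must have the same number of rows")
    squared_error = 0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        if len(ref_row) != len(rec_row):
            raise ValueError("frames must have the same number of columns")
        for ref_pixel, rec_pixel in zip(ref_row, rec_row):
            squared_error += (ref_pixel - rec_pixel) ** 2
            count += 1
    if squared_error == 0:
        # Identical frames: the mean squared error is zero, so PSNR is unbounded.
        return math.inf
    mse = squared_error / count
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [[100, 100], [100, 100]]
    decoded = [[101, 99], [100, 100]]
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

Because the scale is logarithmic, a gain of 3 dB corresponds to roughly halving the mean squared error.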
Yifeng Qiu, Wael Badawy and Robert Turney, “An Architecture for Programmable Multi-core IP Accelerated Platform with an Advanced Application of H.264 Codec Implementation” Journal of Signal Processing Systems, Volume 57, Number 2 / November, 2009, 123-137.
Link to the list of other Peer Journal Publications
Interpolation-Free Fractional-Pixel Motion Estimation Algorithms with Efficient Hardware Implementation
A Prototyping Virtual Socket System-On-Platform Architecture with a Novel ACQPPS Motion Estimator for H.264 Video Encoding Applications
Abstract
H.264 delivers high-quality streaming video for various applications. The coding tools involved in H.264, however, make its video codec implementation very complicated, raising the need for algorithm optimization and hardware acceleration. In this paper, a novel adaptive crossed quarter polar pattern search (ACQPPS) algorithm is proposed to realize an enhanced inter prediction for H.264. Moreover, an efficient prototyping system-on-platform architecture is also presented, which can be utilized to realize an H.264 baseline profile encoder with the support of an integrated ACQPPS motion estimator and related video IP accelerators. The implementation results show that the ACQPPS motion estimator can achieve very high estimated image quality, comparable to that of the full search method in terms of peak signal-to-noise ratio (PSNR), while keeping the complexity at an extremely low level. With the integrated IP accelerators and optimized techniques, the proposed system-on-platform architecture sufficiently supports H.264 real-time encoding at low cost.
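The full search method used as the quality reference above can be sketched as follows. This is a generic exhaustive block-matching search with a sum-of-absolute-differences (SAD) cost, not the ACQPPS algorithm, whose search pattern the abstract does not describe. The function names, the nested-list frame layout, and the integer-pixel search window are assumptions made for illustration.

```python
def sad(current, reference, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced candidate."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[by + y][bx + x]
                         - reference[by + dy + y][bx + dx + x])
    return total


def full_search(current, reference, bx, by, size, search_range):
    """Test every integer displacement in the window and keep the lowest SAD.

    Returns ((dx, dy), cost). Candidates that fall outside the reference
    frame are skipped. Illustrative baseline only.
    """
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(current, reference, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost
```

The cost of this baseline grows with the square of the search range, since every one of the (2 * search_range + 1)^2 candidates is evaluated. Fast search algorithms aim to approach its prediction quality while testing far fewer candidates.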
Yifeng Qiu and Wael Badawy, “A Prototyping Virtual Socket System-On-Platform Architecture with a Novel ACQPPS Motion Estimator for H.264 Video Encoding Applications” EURASIP Journal on Embedded Systems, Volume 2009
Efficient Variable Block Size Selection for AVC Low Bitrate Applications
Abstract
The Advanced Video Coding (AVC) standard proposes the usage of Variable Block Size (VBS) motion-compensated prediction and mode decision aiming for an optimized Rate-Distortion (R-D) performance. Unlike Fixed Block Size (FBS) motion-compensated prediction, where all regions of the pictures are treated similarly in terms of temporal prediction, VBS increases the efficiency of encoding by allowing more active regions to be represented with more bits than less active ones. The main concern regarding the usage of VBS motion-compensated prediction is the dramatic increase it adds to the encoder computational requirements, which not only prevents the encoder from satisfying real-time constraints, but also makes it impractical for hardware implementation. This paper presents an efficient VBS selection scheme, which can be applied to any VBS Motion Estimation (ME) module, leading to significant reduction in its computational requirements with minor loss in the quality of the reconstructed picture. The computational requirements reduction is achieved by minimizing the number of required ME searches and simplifying the Mode Decision (MD) operation. In order to meet different applications’ demands, the proposed algorithm can be adjusted to function at any of three operating points, trading off computational requirements with R-D performance. In the paper, the algorithm is described in detail, focusing on the theoretical computational requirements savings. This theoretical analysis is then supported with simulation results performed on three benchmark video sequences with various types of motion.
Keywords: H.264/AVC, motion estimation, variable block size.
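To make the rate-distortion trade-off behind mode decision concrete, the sketch below picks the block partition with the lowest Lagrangian cost J = D + lambda * R, a formulation commonly used in encoder mode decision. It is not the selection scheme proposed in the paper. The function name, the dictionary layout, and every distortion and rate figure in the usage example are hypothetical.

```python
def choose_partition(candidates, lagrange_multiplier):
    """Return (mode, cost) minimising J = distortion + lambda * rate_bits.

    `candidates` maps a partition name to a (distortion, rate_bits) pair.
    Illustrative only: a real encoder obtains these values from motion
    estimation and entropy-coding estimates for each partition.
    """
    best_mode, best_cost = None, None
    for mode, (distortion, rate_bits) in candidates.items():
        cost = distortion + lagrange_multiplier * rate_bits
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


if __name__ == "__main__":
    # Hypothetical numbers: smaller blocks predict better but cost more bits.
    measured = {"16x16": (1000, 20), "16x8": (900, 45), "8x8": (700, 120)}
    print(choose_partition(measured, lagrange_multiplier=5))  # bits are expensive
    print(choose_partition(measured, lagrange_multiplier=1))  # bits are cheap
```

With these made-up figures, a large multiplier (bits are expensive, as at low bitrates) favours the 16x16 partition, while a small multiplier favours 8x8. Evaluating every partition requires a motion search per candidate, which is the computational burden the abstract refers to.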
Reference: Ihab Amer, Wael Badawy, Graham Jullien, Adrian Chirila-Rus, Robert Turney, and Rana Hamed, “Efficient Variable Block Size Selection for AVC Low Bitrate Applications,” IARIA on-line journals, 2010 Vol. 1&2, July 2010.
Automatic License Plate Recognition (ALPR): A State-of-the-Art Review
Abstract:
Automatic license plate recognition (ALPR) is the extraction of vehicle license plate information from an image or a sequence of images. The extracted information can be used with or without a database in many applications, such as electronic payment systems (toll payment, parking fee payment), and freeway and arterial monitoring systems for traffic surveillance. The ALPR uses either a color, black and white, or infrared camera to take images. The quality of the acquired images is a major factor in the success of the ALPR. ALPR as a real-life application has to quickly and successfully process license plates under different environmental conditions, such as indoors, outdoors, day or night time. It should also be generalized to process license plates from different nations, provinces, or states. These plates usually contain different colors, are written in different languages, and use different fonts; some plates may have a single color background and others have background images. The license plates can be partially occluded by dirt, lighting, and towing accessories on the car. In this paper, we present a comprehensive review of the state-of-the-art techniques for ALPR. We categorize different ALPR techniques according to the features they used for each stage, and compare them in terms of pros, cons, recognition accuracy, and processing speed. Future forecasts of ALPR are given at the end.
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 23, Issue: 2)
Page(s): 311–325
ISSN: 1051-8215
INSPEC Accession Number: 13270696
DOI: 10.1109/TCSVT.2012.2203741
Date of Publication: 07 June 2012
Date of Current Version: 01 February 2013
Issue Date: Feb. 2013
Sponsored by: IEEE Circuits and Systems Society
Publisher: IEEE
Reference: S. Du, M. Ibrahim, M. Shehata, and W. Badawy, “Automatic License Plate Recognition (ALPR): A State-of-the-Art Review,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no. 2, pp. 311–325, Feb. 2013.
On-Chip Electrical Field Sensing For Lab-On-A-Chip Applications
Authors: Yehya H. Ghallab, and Wael Badawy
Abstract: This paper presents a novel CMOS electric field sensor, termed a Differential Electric Field Sensitive Field Effect Transistor (DeFET). It is based on a standard 0.18 μm CMOS technology. The DeFET shows a sensitivity of 76 μA/V/μm. The DeFET’s theory of operation is also presented and discussed. Both the experimental and simulation results confirm the DeFET’s theory of operation.
Reference: Yehya H. Ghallab and Wael Badawy, “On-Chip Electrical Field Sensing for Lab-on-a-Chip Applications,” ElectroChemical Transaction, 1, (28) 1 (2006), pp. 1-15.
Hierarchical Adaptive Structure Mesh for Efficient Video Coding
Wael Badawy, “Hierarchical Adaptive Structure Mesh for Efficient Video Coding,” The International Journal on Image and Video Processing, Vol. 17, November 2001
A Multiplication-Free Algorithm and A Parallel Architecture for Affine Transformation
Affine transformation is widely used in image processing. Recently, it has been recommended by MPEG-4 for video motion compensation. This paper presents a novel low-power parallel architecture for texture warping using affine transformation (AT). The architecture uses a novel multiplication-free algorithm that employs the algebraic properties of the AT. Low power has been achieved at different levels of the design. At the algorithmic level, replacing multiplication operations with bit shifting saves the power and delay of using a multiplier. At the architecture level, low power is achieved by using parallel computational units, where the latency constraints and/or the operating latency can be reduced. At the circuit level, using low-power building blocks (such as low-power adders) contributes to the power savings. The proposed architecture is used as a computational kernel in video object coders. It is compatible with MPEG-4 and VRML standards. The architecture has been prototyped in 0.6 μm CMOS technology with three layers of metal. The performance of the proposed architecture shows that it can be used in mobile and handheld applications.
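The abstract does not spell out the multiplication-free algorithm, so the sketch below illustrates only the general idea using a well-known substitute technique: incremental (forward-differencing) evaluation of the affine map x' = a*x + b*y + c, y' = d*x + e*y + f in fixed-point arithmetic. The per-pixel loops use only additions and bit shifts. The function names, the 16-bit fractional precision, and the coefficient names are assumptions, and this should not be read as the authors' algorithm.

```python
FRAC_BITS = 16  # assumed fixed-point precision (illustrative choice)


def to_fixed(value):
    """Convert a coefficient to fixed point once, outside the pixel loops."""
    return int(round(value * (1 << FRAC_BITS)))


def affine_map_incremental(width, height, a, b, c, d, e, f):
    """Map every (x, y) on a width x height grid to rounded integer (x', y').

    Stepping one pixel right adds (a, d); stepping one row down adds (b, e).
    No multiplication is performed inside the loops.
    """
    a_fx, b_fx, c_fx, d_fx, e_fx, f_fx = (
        to_fixed(v) for v in (a, b, c, d, e, f))
    half = 1 << (FRAC_BITS - 1)  # added before shifting to round to nearest
    row_u, row_v = c_fx, f_fx    # fixed-point image of pixel (0, 0)
    mapped = []
    for _ in range(height):
        u, v = row_u, row_v
        row = []
        for _ in range(width):
            row.append(((u + half) >> FRAC_BITS, (v + half) >> FRAC_BITS))
            u += a_fx
            v += d_fx
        mapped.append(row)
        row_u += b_fx
        row_v += e_fx
    return mapped
```

Each row and each pixel within a row starts from an accumulator that depends only on its predecessor, so rows can be handed to separate computational units once their starting accumulators are known. This is one way such a scheme lends itself to parallel hardware.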
Wael Badawy and Magdy Bayoumi, “A Multiplication-Free Algorithm and A Parallel Architecture for Affine Transformation,” The Journal of VLSI Signal Processing-Systems, Kluwer Academic Publishers, Vol. 31, No 2, May 2002, pp. 173-184.
A Parallel Multiplication-Free Algorithm and Architecture for Affine-based Motion Compensation
Affine transformation is widely used in image processing. Recently, it has been recommended by MPEG-4 for video motion compensation. We present a novel low-power parallel architecture for texture warping using affine transformation (AT). The architecture uses a novel multiplication-free algorithm that employs the algebraic properties of the affine transformation. Low power has been achieved at different levels of the design. At the algorithmic level, replacing multiplication operations with bit shifting saves the power and delay of using a multiplier. At the architecture level, low power is achieved by using parallel computational units. At the circuit level, using low-power cells contributes to the power savings. The proposed architecture is used as a computational kernel in video object coders. It is compatible with MPEG-4 and virtual reality modeling language (VRML) standards. The architecture has been prototyped in 0.6-µm CMOS technology with three layers of metal. The performance of the proposed architecture shows that it can be used in mobile and handheld applications.
Wael Badawy and Magdy Bayoumi, “A Parallel Multiplication-Free Algorithm and Architecture for Affine-based Motion Compensation,” The SPIE Journal on Optical Engineering, Vol. 42, No. 1, January 2003, pp. 255–264.