Tag: video coding

 
+

A Low Power VLSI Architecture for Mesh-based Video Motion Tracking

This paper proposes a low-power very large-scale integration (VLSI) architecture for motion tracking. It uses a hierarchical adaptive structured mesh that generates a content-based video representation. The proposed mesh is a coarse-to-fine hierarchical two-dimensional mesh that is formed by recursive triangulation of the initial coarse mesh geometry. The structured mesh offers a significant reduction in the number of bits that describe the mesh topology. The motion of the mesh nodes represents the deformation of the video object. The architecture consists of motion estimation and motion compensation units. The motion estimation architecture generates a progressive mesh code and the motion vectors of the mesh nodes. It reduces power consumption, uses a simpler approach for mesh construction, approximates the mesh-node motion vectors using the three-step search algorithm, and uses a parallel motion estimation core to evaluate them. Moreover, it maximizes the lifetime of the internal buffers. The motion compensation architecture uses a multiplication-free algorithm for affine transformation, which significantly reduces its complexity. Moreover, using pipelined affine units contributes to the power savings. The video motion compensation architecture processes a reference frame, mesh nodes and motion vectors to predict a video frame. It implements parallel threads in which each thread implements a pipelined chain of scalable affine units. This motion compensation algorithm allows the use of one simple warping unit to map a hierarchical structure. The affine unit warps the texture of a patch at any level of the hierarchical mesh independently. The processor uses a memory serialization unit, which interfaces the memory to the parallel units. The architecture has been prototyped using a top-down low-power design methodology. The performance analysis shows that this processor can be used in online object-based video applications such as MPEG and VRML.
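
The three-step search named in the abstract can be illustrated in software. The sketch below is a generic textbook version of the algorithm, not the paper's hardware design: the block size, the initial step of 4 pixels, the SAD cost and the function names are all assumptions made for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n x n blocks."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def three_step_search(cur, ref, x, y, n=4, step=4):
    """Return (dx, dy) for the n x n block whose top-left corner is (x, y).

    Each pass tests the 3 x 3 grid of candidates around the current best
    displacement, then halves the step (4, 2, 1 by default).
    """
    h, w = len(ref), len(ref[0])
    best_dx, best_dy = 0, 0
    best_cost = sad(cur, ref, x, y, x, y, n)
    while step >= 1:
        centre_dx, centre_dy = best_dx, best_dy
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand_dx, cand_dy = centre_dx + dx, centre_dy + dy
                rx, ry = x + cand_dx, y + cand_dy
                if rx < 0 or ry < 0 or rx + n > w or ry + n > h:
                    continue  # candidate block falls outside the frame
                cost = sad(cur, ref, x, y, rx, ry, n)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, cand_dx, cand_dy
        step //= 2
    return best_dx, best_dy
```

With the default step the search evaluates at most 25 candidates instead of the 225 positions of a full ±7 search, at the cost of possibly settling on a local minimum.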

Published in:

Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on (Volume: 49, Issue: 7)

Wael Badawy and Magdy Bayoumi, “A Low Power VLSI Architecture for Mesh-based Video Motion Tracking,” The IEEE Transactions on Circuits and Systems II, Vol. 49, July 2002, pp. 488-504.

+

A Computational RAM (C-RAM) Architecture for Real-Time Mesh-Based Video Motion Tracking: Part II Motion Compensation

This paper presents a new Computational-RAM (C-RAM) architecture for real-time mesh-based video motion tracking. In Part 1, the motion estimation part of the proposed architecture is presented. Here in Part 2, a new C-RAM mesh-based motion compensation architecture is presented. The inputs to the architecture are the mesh-node motion vectors and the reference frame, and the output is the compensated (i.e., predicted) frame. The architecture uses the affine transformation for warping the deformed patches in the reference frame into the undeformed patches in the current frame. The architecture computes the affine parameters using a multiplication-free algorithm. The reference and current frames are stored in embedded S-RAMs generated with Virage™ Memory Compiler. The proposed motion compensation architecture has been prototyped, simulated and synthesized using the TSMC 0.18 μm CMOS technology. Using a 100 MHz clock frequency, the proposed architecture processes one CIF video frame (i.e., 352×288 pixels) in 0.59 ms, which means it can process up to 1694 frames per second. The core area of the proposed motion compensation architecture is 28.04 mm² and it consumes 31.15 mW.
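
One way to see how an affine warp can avoid multipliers is incremental (forward-differencing) evaluation on a right-angled patch whose legs are a power of two long. The sketch below is a generic illustration under those assumptions and is not the algorithm from the paper; the fixed-point width `FRAC`, the nearest-neighbour sampling and the function name are hypothetical choices.

```python
FRAC = 8                 # fractional bits of the fixed-point format (assumed)
HALF = 1 << (FRAC - 1)   # rounding offset for nearest-pixel sampling


def warp_right_triangle(ref, p0, log2_leg, q0, q1, q2):
    """Predict a right-angled patch of the current frame from `ref`.

    The patch has its right angle at p0 and legs of length 2**log2_leg
    along +x and +y. q0, q1, q2 are the positions of the same three
    vertices in the reference frame. Only shifts and additions are used.
    Returns {(x, y): predicted sample}.
    """
    leg = 1 << log2_leg
    shift = FRAC - log2_leg          # dividing by leg becomes a shift
    if shift < 0:
        raise ValueError("leg length exceeds fixed-point precision")
    # Fixed-point source increments for one pixel step in x and in y.
    step_x = ((q1[0] - q0[0]) << shift, (q1[1] - q0[1]) << shift)
    step_y = ((q2[0] - q0[0]) << shift, (q2[1] - q0[1]) << shift)
    h, w = len(ref), len(ref[0])
    out = {}
    row_sx, row_sy = q0[0] << FRAC, q0[1] << FRAC
    for j in range(leg + 1):
        sx, sy = row_sx, row_sy
        for i in range(leg + 1 - j):     # stay on or below the hypotenuse
            # Round to the nearest pixel and clamp to the frame.
            px = min(max((sx + HALF) >> FRAC, 0), w - 1)
            py = min(max((sy + HALF) >> FRAC, 0), h - 1)
            out[(p0[0] + i, p0[1] + j)] = ref[py][px]
            sx += step_x[0]
            sy += step_x[1]
        row_sx += step_y[0]
        row_sy += step_y[1]
    return out
```

Because the affine map is linear, moving one pixel to the right always adds the same increment to the source coordinate, so the inner loop needs two additions per pixel and no multiplication.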

 

Mohammed Sayed and Wael Badawy, “A Computational RAM (C-RAM) Architecture for Real-Time Mesh-Based Video Motion Tracking: Part II Motion Compensation,” Journal of Circuits, Systems and Computers, Vol. 13, Issue 6, December 2004, pp. 1217-1232.

+

A Computational RAM (C-RAM) Architecture for Real-Time Mesh-Based Video Motion Tracking: Part I Motion Estimation

 

This paper presents a new Computational-RAM (C-RAM) architecture for real-time mesh-based video motion tracking. The motion tracking consists of two operations: mesh-based motion estimation and compensation. The proposed motion estimation architecture is presented in Part 1 and the proposed motion compensation architecture is presented in Part 2. The motion estimation architecture stores two frames and computes motion vectors for a regular triangular mesh structure as defined by MPEG-4 Part 2. The motion estimation architecture uses the block-matching algorithm (BMA) to estimate the vertical and horizontal motion vectors for each mesh node. Parallel and pipelined implementations have been used to overcome the huge computational requirements of the motion estimation process. The two frames are stored in embedded S-RAMs generated with Virage™ Memory Compiler. The proposed motion estimation architecture has been prototyped, simulated and synthesized using the TSMC 0.18 μm CMOS technology. At a 100 MHz clock frequency, the proposed architecture processes one CIF video frame (i.e., 352×288 pixels) in 1.48 ms, which means it can process up to 675 frames per second. The core area of the proposed motion estimation architecture is 24.58 mm² and it consumes 46.26 mW.
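
As a rough software picture of block matching on mesh nodes, the sketch below places nodes on a regular grid and runs a full search around one node. The node spacing, window size, search range, border clamping and tie-breaking rule are assumptions for illustration and are not taken from the paper or from the MPEG-4 specification.

```python
def mesh_nodes(width, height, spacing):
    """Node coordinates of a regular mesh with the given spacing."""
    return [(x, y)
            for y in range(0, height, spacing)
            for x in range(0, width, spacing)]


def node_motion_vector(cur, ref, node, half=2, search=3):
    """Full-search block matching on a window centred on a mesh node.

    The window is (2 * half + 1) pixels square and every displacement in
    [-search, search] is tested. Ties go to the shortest displacement.
    """
    h, w = len(ref), len(ref[0])
    nx, ny = node
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = 0
            for j in range(-half, half + 1):
                for i in range(-half, half + 1):
                    # Clamp coordinates so border nodes stay inside the frame.
                    cy = min(max(ny + j, 0), h - 1)
                    cx = min(max(nx + i, 0), w - 1)
                    ry = min(max(ny + j + dy, 0), h - 1)
                    rx = min(max(nx + i + dx, 0), w - 1)
                    cost += abs(cur[cy][cx] - ref[ry][rx])
            key = (cost, abs(dx) + abs(dy))
            if best is None or key < best[0]:
                best = (key, (dx, dy))
    return best[1]
```

Each node's search is independent of the others, which is what makes the parallel and pipelined implementation described in the abstract a natural fit.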

Read More: https://www.worldscientific.com/doi/abs/10.1142/S0218126604001921

 

Mohammed Sayed and Wael Badawy, “A Computational RAM (C-RAM) Architecture for Real-Time Mesh-Based Video Motion Tracking: Part I Motion Estimation,” Journal of Circuits, Systems and Computers, Vol. 13, Issue 6, December 2004, pp. 1203-1216.

+

Algorithm-Based Low Power VLSI Architecture For 2d-Mesh Video Object Motion Tracking

The new VLSI architecture for video object (VO) motion tracking uses a novel hierarchical adaptive structured mesh topology. The structured mesh offers a significant reduction in the number of bits that describe the mesh topology. The motion of the mesh nodes represents the deformation of the VO. Motion compensation is performed using a multiplication-free algorithm for affine transformation, significantly reducing the decoder architecture complexity. Pipelining the affine unit contributes a considerable power saving. The VO motion-tracking architecture is based on a new algorithm. It consists of two main parts: a video object motion-estimation unit (VOME) and a video object motion-compensation unit (VOMC). The VOME processes two consecutive frames to generate a hierarchical adaptive structured mesh and the motion vectors of the mesh nodes. It implements parallel block-matching motion-estimation units to optimize the latency. The VOMC processes a reference frame, mesh nodes and motion vectors to predict a video frame. It implements parallel threads in which each thread implements a pipelined chain of scalable affine units. This motion-compensation algorithm allows the use of one simple warping unit to map a hierarchical structure. The affine unit warps the texture of a patch at any level of the hierarchical mesh independently. The processor uses a memory serialization unit, which interfaces the memory to the parallel units. The architecture has been prototyped using a top-down low-power design methodology. Performance analysis shows that this processor can be used in online object-based video applications such as MPEG-4 and VRML.

Wael Badawy and Magdy Bayoumi, “Algorithm-Based Low Power VLSI Architecture For 2d-Mesh Video Object Motion Tracking,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 4, April 2002, pp. 227-237.

+

A Proposed Hardware Reference Model for Spatial Transformation and Quantization in H.264

 

This paper presents three Very Large Scale Integration prototypes to exploit spatial redundancy in the H.264 standard. The proposed architectures are: (1) forward 4 × 4 integer approximation of DCT transform and quantization, which is applied to all blocks of a frame, (2) the 4 × 4 Hadamard transform and quantization that is applied to the DC coefficients of the luma component when the macroblock is encoded in 16 × 16 intra prediction mode, and (3) the 2 × 2 Hadamard transform and quantization that is applied to the DC coefficients of the chroma component as a second level in the transformation hierarchy. The developed algorithms are adopted by the H.264 standard. A performance analysis shows that the architectures satisfy the real-time constraints required by different digital video applications.
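
The three transforms listed above can be written compactly as separable matrix products. The sketch below uses the standard H.264 forward core transform matrix and Hadamard matrices; it is a plain software illustration, not the paper's hardware prototypes, and it omits the scaling and quantization stages. The function names are hypothetical.

```python
# Forward 4x4 integer approximation of the DCT used by H.264.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]
# 4x4 Hadamard matrix applied to the luma DC coefficients (Intra 16x16).
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]
# 2x2 Hadamard matrix applied to the chroma DC coefficients.
H2 = [[1, 1],
      [1, -1]]


def matmul(a, b):
    """Plain integer matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def transform(block, basis):
    """Two-dimensional separable transform: basis * block * basis^T."""
    return matmul(matmul(basis, block), transpose(basis))
```

For example, `transform(residual, CF)` gives the unscaled core transform of a 4 × 4 residual block, and `transform(dc, H2)` gives the second-level transform of a 2 × 2 array of chroma DC coefficients. Every matrix entry is 0, ±1 or ±2, so a hardware implementation needs only additions, subtractions and one-bit shifts.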

 

I. Amer, W. Badawy, G. Jullien, “A Proposed Hardware Reference Model for Spatial Transformation and Quantization in H.264,” Elsevier Journal of Visual Communication and Image Representation, Volume 17, Issue 2, April 2006, Pages 533-552.

+

An Affine Based Algorithm and SIMD Architecture for Video Compression with Low Bit-rate Applications

This paper presents a new affine-based algorithm and SIMD architecture for video compression with low bit-rate applications. The proposed algorithm is used for mesh-based motion estimation and is named the mesh-based square-matching algorithm (MB-SMA). The MB-SMA is a simplified version of the hexagonal matching algorithm [1]. In this algorithm, a right-angled triangular mesh is used to benefit from a multiplication-free algorithm presented in [2] for computing the affine parameters. The proposed algorithm has lower computational cost than the hexagonal matching algorithm while it produces almost the same peak signal-to-noise ratio (PSNR) values. The MB-SMA outperforms the commonly used motion estimation algorithms in terms of computational cost, efficiency and video quality (i.e., PSNR). The MB-SMA is implemented using an SIMD architecture in which a large number of processing elements have been embedded with SRAM blocks to utilize the large internal memory bandwidth. The proposed architecture needs 26.9 ms to process one CIF video frame. Therefore, it can process 37 CIF frames/s. The proposed architecture has been prototyped using Taiwan Semiconductor Manufacturing Company (TSMC) 0.18-μm CMOS technology and the embedded SRAMs have been generated using the Virage Logic memory compiler.

Published in:

Circuits and Systems for Video Technology, IEEE Transactions on (Volume: 16, Issue: 4)

Mohammed Sayed and Wael Badawy, “An Affine Based Algorithm and SIMD Architecture for Video Compression with Low Bit-rate Applications,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 16, Issue 4, pp. 457-471, April 2006.

+

CAVLC Encoder Design for Real-Time Mobile Video Applications

Abstract

This brief presents a new context-based adaptive variable length coding (CAVLC) architecture. The prototype is designed for the H.264/AVC baseline profile entropy coder. The proposed design offers area savings by reducing the size of the statistic buffer. The arithmetic table elimination technique further reduces the area. The split VLC tables simplify the process of bit-stream generation and also help reduce the area. The proposed architecture is implemented on a Xilinx Virtex II field-programmable gate array (2v3000fg676-4). Simulation results show that the architecture is capable of processing common intermediate format (CIF) and quarter-CIF (QCIF) frame sequences in real time at a core speed of 50 MHz with 6.85-K logic gates.
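
For readers unfamiliar with CAVLC, the sketch below computes the block statistics that a CAVLC encoder gathers before any table lookup: the number of nonzero coefficients, the trailing ±1 coefficients (at most three), the zeros preceding the last nonzero coefficient, and the remaining levels in reverse scan order. It is a generic illustration of the standard's symbol definitions, not the paper's architecture, and it stops short of the VLC tables; the function and key names are hypothetical.

```python
# Zigzag scan order for a 4x4 block, as raster indices (frame coding).
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def cavlc_symbols(block):
    """Return the CAVLC statistics of one quantized 4x4 block."""
    flat = [v for row in block for v in row]
    scan = [flat[i] for i in ZIGZAG_4X4]
    nonzero_pos = [i for i, v in enumerate(scan) if v != 0]
    total_coeff = len(nonzero_pos)

    # Count +/-1 coefficients from the high-frequency end, capped at three.
    trailing_ones = 0
    for i in reversed(nonzero_pos):
        if abs(scan[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # Zeros that precede the last nonzero coefficient in scan order.
    total_zeros = nonzero_pos[-1] + 1 - total_coeff if nonzero_pos else 0
    # Remaining levels, highest frequency first.
    levels = [scan[i] for i in reversed(nonzero_pos)][trailing_ones:]
    return {
        "total_coeff": total_coeff,
        "trailing_ones": trailing_ones,
        "total_zeros": total_zeros,
        "levels": levels,
    }
```

These few counters are the per-block state that the encoder carries into the table stage.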

Published in:

Circuits and Systems II: Express Briefs, IEEE Transactions on (Volume: 54, Issue: 10)

C. A. Rahman and W. Badawy, “CAVLC Encoder Design for Real-time Mobile Video Applications”, The IEEE Trans. on Circuits and Systems II, Oct. 2007 Vol 54, Issue: 10, pp. 873-877.

+

A Simplified 8×8 Transformation And Quantization Real-Time Ip-Block For Mpeg-4 H.264/Avc Applications: A New Design Flow Approach

Abstract

Current multimedia design processes suffer from the excessive time spent testing new IP-blocks against references based on large video encoder specifications (usually several thousand lines of code). The appropriate testing of a single IP-block may require the conversion of the overall encoder from software to hardware, which is difficult to complete in the short time allowed by the competition-driven, reduced time-to-market demanded for the adoption of a new video coding standard. This paper presents a new design flow to accelerate the conformance testing of an IP-block using the H.264/AVC software reference model. An example block of the simplified 8 × 8 transformation and quantization, which is adopted in FRExt, is provided as a case study demonstrating the effectiveness of the approach.

 

Ihab Amer, Wael Badawy, Graham Jullien, Marco Mattavelli, and Robert Turney, “A Simplified 8×8 Transformation And Quantization Real-Time Ip-Block For Mpeg-4 H.264/Avc Applications: A New Design Flow Approach,” Journal of Circuits, Systems, and Computers, Vol. 16, No. 6 (2007), pp. 1011–1026.

+

Towards an H.264/AVC HW/SW Integrated Solution: An Efficient VBSME Architecture

Abstract:

This paper presents an efficient real-time variable block size motion estimation architecture. The proposed architecture provides motion vectors for each 16 × 16 block and its 40 sub-blocks. The proposed architecture is a single-instruction multiple-data architecture integrated with embedded SRAMs on one chip. The architecture has been prototyped using a Xilinx Virtex-4 XC4VSX35-10 field-programmable gate array. It processes 30 CIF fps using a 71-MHz clock frequency. Its maximum clock frequency is 187.7 MHz and the maximum throughput is 20 4CIF fps. The prototyped architecture has 175 k gates and 18 kbits of embedded SRAM.
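
The count of one 16 × 16 block plus 40 sub-blocks follows from the H.264 partition sizes: sixteen 4 × 4, eight 8 × 4, eight 4 × 8, four 8 × 8, two 16 × 8 and two 8 × 16 blocks. A common way to cover all 41 for one candidate displacement is to compute the sixteen 4 × 4 SADs once and add them up, as sketched below. This is a generic SAD-reuse illustration, not necessarily the data flow of the paper's architecture; the function names and the width × height key naming are assumptions.

```python
def sad_4x4_grid(cur, ref):
    """SADs of the sixteen 4x4 sub-blocks of two aligned 16x16 blocks."""
    return [[sum(abs(cur[4 * r + j][4 * c + i] - ref[4 * r + j][4 * c + i])
                 for j in range(4) for i in range(4))
             for c in range(4)]
            for r in range(4)]


def vbs_sads(cur, ref):
    """SADs of all 41 partitions, keyed by 'width x height'."""
    s = sad_4x4_grid(cur, ref)
    sads = {"4x4": [s[r][c] for r in range(4) for c in range(4)]}
    # Merge horizontally or vertically adjacent 4x4 SADs.
    sads["8x4"] = [s[r][c] + s[r][c + 1] for r in range(4) for c in (0, 2)]
    sads["4x8"] = [s[r][c] + s[r + 1][c] for r in (0, 2) for c in range(4)]
    sads["8x8"] = [s[r][c] + s[r][c + 1] + s[r + 1][c] + s[r + 1][c + 1]
                   for r in (0, 2) for c in (0, 2)]
    e = sads["8x8"]  # order: top-left, top-right, bottom-left, bottom-right
    sads["16x8"] = [e[0] + e[1], e[2] + e[3]]   # top half, bottom half
    sads["8x16"] = [e[0] + e[2], e[1] + e[3]]   # left half, right half
    sads["16x16"] = [sum(e)]
    return sads
```

Only the 4 × 4 stage touches pixels; every larger partition costs a few additions, which is why this structure maps well onto an adder tree behind an array of processing elements.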

Published in:

Circuits and Systems II: Express Briefs, IEEE Transactions on (Volume: 55, Issue: 9)

Mohammed Sayed, Wael Badawy, and Graham Jullien, “Towards an H.264/AVC HW/SW Integrated Solution: An Efficient VBSME Architecture”, IEEE Transactions on Circuits and Systems II, Volume: 55, Issue: 9, pp. 912-916, Sept. 2008.
