

# Optimization proposal for an FPGA-based H.264/AVC decoder model

#### Authors: Eng. Laura Quesada del Busto, Eng. Gustavo Javier Aguirre Soler

**III TVD International Forum** 

Havana, September 2015



#### Introduction (1/3)

#### **Digital Terrestrial Television System**



#### **Introduction (2/3)**

#### FPGA software/hardware co-design for the H.264/AVC decoder implemented at *LACETEL*



Embedded system on a Xilinx Virtex-5 FPGA.

The H.264/AVC decoder software runs on the PowerPC processor.

The PowerPC processor communicates with the hardware over the PLB bus.

#### Xilinx ML507 evaluation board



#### Introduction (3/3)

Bypass decoding



Bypass Decoding IP module, implemented in VHDL in 2014.



#### **Problem Statement**

The FPGA-based H.264/AVC decoder implemented at *LACETEL*:

- does not meet the timing requirements needed to become a practical final solution.
- lacks complementary tools to identify and experiment with the blocks that are critical in terms of speed.



Insert the CABAC Bypass Arithmetic Decoding IP module into the H.264/AVC decoder model, together with the timing tools needed to evaluate the system's behavior.



**Design flow** 



#### Module addition to the embedded system (1/3)



#### **Interrupt Controller**

- Connected as a 32-bit slave on the PLB bus.
- Two interrupt inputs used.
- Single interrupt output.
- Priority between interrupt requests determined.
- Each input configured for rising-edge sensitivity.

#### Module addition to the embedded system (2/3)



#### **Timer/Counter**

- Connected as a 32-bit slave on the PLB bus.
- Configured in Generate mode.
- Interrupt output signal connected to the interrupt controller.
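The way the Timer/Counter and the Interrupt Controller cooperate to time long intervals can be illustrated with a small software model. This is only a sketch under assumptions that are not stated on the slides: the counter counts up from zero, wraps at `2**width_bits`, and each wrap raises one interrupt whose handler increments a software overflow count. The class and method names are hypothetical.

```python
class GenerateModeTimer:
    """Toy model of an up-counting timer that signals an interrupt on rollover."""

    def __init__(self, width_bits=32, on_overflow=None):
        self.modulus = 1 << width_bits   # assumed wrap point: 2**width_bits
        self.count = 0
        self.on_overflow = on_overflow   # stands in for the interrupt line

    def advance(self, cycles):
        """Advance the counter by `cycles` bus clock cycles."""
        rollovers, self.count = divmod(self.count + cycles, self.modulus)
        for _ in range(rollovers):
            if self.on_overflow is not None:
                self.on_overflow()       # one interrupt per rollover


class OverflowCounter:
    """Hypothetical interrupt handler that only counts timer overflows."""

    def __init__(self):
        self.overflows = 0

    def isr(self):
        self.overflows += 1


handler = OverflowCounter()
timer = GenerateModeTimer(width_bits=8, on_overflow=handler.isr)  # 8 bits keeps numbers small
timer.advance(1000)
print(handler.overflows, timer.count)  # prints "3 232"
```

With an 8-bit counter, 1000 cycles give three rollovers (3 x 256 = 768) and a final counter value of 232, so the pair (overflow count, counter value) recovers the full cycle count.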

#### Module addition to the embedded system (3/3)



**Bypass Decoder IP module:** hardware module that performs CABAC bypass decoding.



**Design flow** 



Bypass Decoder IP module optimization



#### **Previous implementation of the Bypass Decoder IP module**

- Implemented in a test system with the PowerPC processor and the PLB Bus.
- The processor polls the module's outputs to know when the values are ready.



#### Bypass Decoder IP module optimization (1/3)

#### **Previous implementation of the Bypass Decoder IP module:**

- Twelve software registers.
- A single internal signal in each register.

#### **Optimized Bypass Decoder IP module:**

- Six software registers.
- Several internal signals packed into each register.
- Memory consumption reduced.
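The idea of carrying several internal signals in one software register can be sketched as bit-field packing. The field names and widths below are hypothetical and chosen only for illustration; the slides do not give the actual register map of the module.

```python
# Hypothetical register layout: (field name, width in bits), packed from bit 0 up.
FIELDS = [("start", 1), ("num_bins", 5), ("offset", 9), ("range", 9)]
REGISTER_BITS = 32


def pack(values, fields=FIELDS):
    """Pack several small signals into one 32-bit register word."""
    word, shift = 0, 0
    for name, width in fields:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    if shift > REGISTER_BITS:
        raise ValueError("fields do not fit in one register")
    return word


def unpack(word, fields=FIELDS):
    """Recover the individual signals from a packed register word."""
    values, shift = {}, 0
    for name, width in fields:
        values[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return values


signals = {"start": 1, "num_bins": 3, "offset": 0x1A5, "range": 0x100}
word = pack(signals)
assert unpack(word) == signals  # four signals travel in a single bus access
```

Under this assumed layout, four signals that would otherwise occupy four registers fit in 24 bits of one register, which reduces both the register count and the number of bus transfers.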

#### **Bypass Decoder IP module optimization (3/3)**

Bypass Decoder IP module processing time (9.6 ns) < PLB single data beat read time (32 ns)

Because the module finishes before a single bus read completes, the processor can simply write data to and read data from the Bypass Decoder IP module, with no need for polling or interrupts.
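The timing argument can be written out as a short check. The two figures come from the slide; the cost model is an assumption made here for illustration: each status poll is taken to cost one single-beat PLB read, and processing is taken to start as soon as the input has been written.

```python
PROCESSING_NS = 9.6   # Bypass Decoder IP module processing time (from the slide)
PLB_READ_NS = 32.0    # PLB single data beat read time (from the slide)


def result_ready_before_read(processing_ns, bus_read_ns):
    """True if the module finishes before one bus read could complete."""
    return processing_ns < bus_read_ns


def read_cost_ns(bus_read_ns, status_polls=0):
    """Assumed read cost: `status_polls` status reads plus one data read."""
    return (status_polls + 1) * bus_read_ns


print(result_ready_before_read(PROCESSING_NS, PLB_READ_NS))  # prints "True"
print(read_cost_ns(PLB_READ_NS))                  # direct read: prints "32.0"
print(read_cost_ns(PLB_READ_NS, status_polls=1))  # one poll first: prints "64.0"
```

Under these assumptions even a single status poll doubles the read cost, which is why reading the result directly is attractive once the module is known to be faster than the bus.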



**Design flow** 



Bypass Decoder IP module optimization

H.264/AVC decoder temporal performance evaluation

#### H.264/AVC Decoder temporal performance evaluation (1/3)

Experiment 1: Software H.264/AVC Decoding Timing (previous work at *LACETEL*)



#### H.264/AVC Decoder temporal performance evaluation (2/3)

Experiment 2: Software/Hardware H.264/AVC Decoding Timing (with Bypass Decoder IP module on)




#### H.264/AVC Decoder temporal performance evaluation (3/3)

Experiment 3: Software/Hardware H.264/AVC Decoding Timing (with optimized Bypass Decoder IP module on)




Xilinx Software Development Kit

| H.264/AVC video | Resolution | Frames | Profile    |
|-----------------|------------|--------|------------|
| Video_1         | 176 x 144  | 3      | Main       |
| Video_2         | 352 x 240  | 8      | High       |
| Video_3         | 176 x 144  | 10     | High 4:2:0 |

$$
\text{Time} = \frac{\text{gotten\_value} + \text{max\_cont} \times \text{tmr\_overflow}}{\text{bus\_frec}}
$$
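The timing formula translates directly into a small helper. The reading of the symbols is an interpretation, not something the slide spells out: `gotten_value` is taken as the counter value read at the end of the measurement, `max_cont` as the number of counts per timer period, `tmr_overflow` as the number of overflow interrupts counted, and `bus_frec` as the bus clock frequency in Hz. The numbers in the example are made up.

```python
def elapsed_seconds(gotten_value, max_cont, tmr_overflow, bus_frec):
    """Time = (gotten_value + max_cont * tmr_overflow) / bus_frec."""
    if bus_frec <= 0:
        raise ValueError("bus_frec must be positive")
    total_counts = gotten_value + max_cont * tmr_overflow
    return total_counts / bus_frec


# Made-up example: 1000 counts per period, 2 overflows, 500 counts read, 100 Hz clock.
print(elapsed_seconds(gotten_value=500, max_cont=1000, tmr_overflow=2, bus_frec=100))
# prints "25.0"
```

For a real 32-bit timer one would presumably pass `max_cont=2**32` and the actual bus frequency of the design, neither of which is given on the slide.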


#### Analysis of experimental results

#### **Results for Video\_2**



**LACETEL** Research & Development Telecommunications Institute

#### Conclusions (1/2)

- The Interrupt Controller and the Timer/Counter were added to the embedded system in which the H.264/AVC model is implemented.
- The Bypass Decoder IP module was optimized and fitted into the H.264/AVC decoder system.
- The timing performance of the H.264/AVC decoder was evaluated for different designs, and the results showed that optimizing the Bypass Decoder IP module made it possible to reduce the decoding delay.

#### **Conclusions (2/2)**

- A methodology for inserting IP modules into the H.264/AVC decoder in an embedded system was built.
- The H.264/AVC decoder presented is a platform capable of supporting future extensions, such as the insertion of further IP modules following the approach shown in this work.

- Introduce IP modules into the H.264/AVC decoder system that allow the decoded frames to be displayed through interfaces such as VGA or DVI.
- Time the delay introduced by the remaining elements of the H.264/AVC decoder system to determine which of them need to be optimized.



# DIGITAL TELEVISION LABORATORY





