Compared with H.264/AVC, the newer video compression standard H.265/HEVC achieves a lower bit rate at the same image quality, that is, a higher compression ratio. Because of the visual characteristics of the human eye, the allocation of bits among different regions is a key research issue in dynamic rate coding. If the video can be divided into a region of interest (ROI) and a normal region during encoding, and the bit rate allocated to each can be adjusted dynamically, better subjective video quality can be obtained at the same or even a lower bit rate, improving the user experience. The speed and quality of ROI extraction strongly affect the encoding algorithm, so it is particularly important to achieve low-complexity, high-quality ROI extraction and to allocate the bit rate according to the characteristics of H.265/HEVC video coding.
ROI extraction and bit rate allocation have been applied to JPEG 2000 still image compression, improving the image quality of the ROI and saving bit rate, with the ROI extraction implemented as a VLSI hardware design on an FPGA. Satisfactory results were obtained without significantly increasing the image encoding time, but the system can only be used for still image encoding. Other works have proposed ROI-based rate control for H.265/HEVC, that is, methods for optimizing compression performance, and achieved certain results. This research shows that, although the H.265/HEVC coding standard already reduces the bit rate compared with H.264/AVC, ROI rate control is still effective for HEVC; however, it does not consider the impact of the complexity of the ROI extraction algorithm on the encoding rate. Another approach uses a Gaussian background model to build a virtual background frame, which reduces the bit rate of H.265/HEVC encoding, but it considers neither variable-quality ROI encoding matched to the characteristics of the human eye nor the impact of the efficiency of background frame construction on the encoder rate.
Based on the block-wise nature of video coding algorithms and the fine-grained parallelism of FPGAs, this paper proposes a Gaussian background modeling and ROI mapping method based on block matching, and uses HLS tools to implement and verify the algorithm in hardware on an FPGA platform. The FPGA processing speed reaches 22 fps at 1080p. Encoding the ROI-mapped CTU areas with variable quality yields an average bit rate saving of about 10% while the overall video quality remains stable.
1 Gaussian background modeling and its improvement for video coding
1.1 The basic principles of pixel-based Gaussian background modeling
Gaussian background modeling is a background modeling method based on a probability model. The traditional Gaussian background modeling algorithm operates on pixels. A frame of digital video can be regarded as a two-dimensional discrete function f(x, y, t) of the space-time position (x, y, t). In a given channel of a given color space, f has a unique value for a given (x0, y0, t0); for a given time t0, f can be regarded as a two-dimensional random field, generally taken to be stationary.
From a statistical point of view, the appearance and movement of foreground objects are temporary and sudden, while the background is long-term and reasonably stable. For a given position (x0, y0), the value f(x0, y0, t) over time t follows a certain probability distribution, usually a Gaussian distribution.
The expression of the Gaussian background model is:
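The expressions themselves are not reproduced in this text. For reference, the standard single-Gaussian background model can be written as below. This is the common textbook form, given as an assumption about what the original equations contain rather than a copy of them. The symbols μ, σ, λ and α are the model parameters named later in the text; reading λ as the matching threshold factor and α as the learning rate is the usual interpretation but is also an assumption here.

```latex
% Probability density of the pixel value at a fixed position (x_0, y_0)
P\bigl(f(x_0,y_0,t)\bigr) =
  \frac{1}{\sqrt{2\pi}\,\sigma_{t}}
  \exp\!\left(-\frac{\bigl(f(x_0,y_0,t)-\mu_{t}\bigr)^{2}}{2\sigma_{t}^{2}}\right)

% A pixel is matched to the background when it lies within \lambda standard deviations
\bigl|f(x_0,y_0,t)-\mu_{t-1}\bigr| \le \lambda\,\sigma_{t-1}

% Recursive update of the mean and variance with learning rate \alpha
\mu_{t} = (1-\alpha)\,\mu_{t-1} + \alpha\,f(x_0,y_0,t)
\sigma_{t}^{2} = (1-\alpha)\,\sigma_{t-1}^{2} + \alpha\,\bigl(f(x_0,y_0,t)-\mu_{t}\bigr)^{2}
```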
1.2 Gaussian background modeling based on block matching-ROI mapping algorithm
It can be seen from the expression of the original Gaussian background model that the pixel-based Gaussian background modeling algorithm requires a large number of complex floating-point calculations. It generally takes hundreds of frames to establish the model, which makes the algorithm time-consuming and unsuitable for hardware implementation.
The Gaussian background modeling method considers only the temporal correlation of pixels at the same position and treats all pixels as isolated points. On the one hand, this requires many repetitive calculations; on the other hand, it produces "false alarms" when the background changes.
Video sequences contain spatial redundancy, temporal redundancy and knowledge redundancy. To exploit the spatial redundancy within a frame, video coding algorithms perform intra-frame prediction on blocks, then apply transform coding and quantization to the residual between the predicted and original values to achieve compression.
In this paper, the block matching method is used to replace the pixel matching and update method of the original Gaussian background modeling, and a Gaussian background modeling-ROI extraction algorithm based on block matching is proposed. On the one hand, block-based background modeling and calculation can avoid a large number of calculations in the pixel-based algorithm; on the other hand, block-based Gaussian background modeling can unify the background establishment and the division of video coding blocks.
After the background is established by Gaussian modeling, each new video frame is divided into blocks, and each block is classified as foreground or background according to the sum of absolute differences (SAD) criterion. The SAD criterion is given in equation (5), where B denotes the established background block and C denotes the pixel block at the corresponding position in the current video frame. In this article, N is 8.
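Equation (5) is not reproduced in this text. The sketch below shows the usual form of a block SAD test, that is, the sum of |B(i, j) - C(i, j)| over each N×N block compared against a threshold. The function names, the use of NumPy, and the threshold value of 640 (an average difference of 10 grey levels per pixel in an 8×8 block) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def block_sad(background, current, n=8):
    """Return the SAD of every n-by-n block between two grayscale frames."""
    h, w = background.shape
    h, w = h - h % n, w - w % n  # ignore partial blocks at the right/bottom border
    diff = np.abs(background[:h, :w].astype(np.int32)
                  - current[:h, :w].astype(np.int32))
    # Split rows and columns into (block index, offset inside block) and
    # sum over the two offset axes to get one SAD value per block.
    return diff.reshape(h // n, n, w // n, n).sum(axis=(1, 3))


def foreground_blocks(background, current, n=8, threshold=640):
    """Boolean mask: True where a block's SAD exceeds the (assumed) threshold."""
    return block_sad(background, current, n) > threshold


if __name__ == "__main__":
    bg = np.zeros((16, 16), dtype=np.uint8)
    cur = bg.copy()
    cur[0:8, 8:16] = 20  # change only the top-right 8x8 block
    print(foreground_blocks(bg, cur))
```

Working on whole blocks means a single comparison per 64 pixels, which is the property that makes the decision cheap to parallelize in hardware.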
The basic steps are described as follows:
Step 1: Video block division. The original video is divided into disjoint N×N sub-regions.
Step 2: Model initialization. For each block, initialize the basic parameters μ, σ, λ, and α of the Gaussian model.
Step 3: Frame count check. Read in the video. If the frame count matches the update period p, go to step 4; otherwise go to step 5.
Step 4: Model update. Update the block-based background model.
Step 5: Foreground/background decision. Classify each block as foreground or background according to the SAD criterion.
Step 6: ROI area mapping. Map the foreground block distribution onto the CTUs of the video. In this article, the HEVC CTU size is set to 32×32, and the mapping result is sent to the H.265/HEVC encoder.
The algorithm flow is shown in Figure 1.
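The steps above can be sketched in software as follows. This is a simplified illustration under stated assumptions, not the paper's hardware design: the background is kept as a running mean only (the variance σ and the factor λ are omitted), and the class name, the learning rate, the update period and the SAD threshold are hypothetical values chosen for the example. A CTU is marked as ROI when any of the 8×8 blocks it contains is foreground.

```python
import numpy as np

N = 8     # block size used in the text
CTU = 32  # CTU size used in the text


def map_blocks_to_ctus(fg_blocks, n=N, ctu=CTU):
    """Step 6: mark a CTU as ROI if any n-by-n block inside it is foreground."""
    ratio = ctu // n
    rows, cols = fg_blocks.shape
    # Pad with background so the block grid divides evenly into CTUs.
    padded = np.pad(fg_blocks, ((0, (-rows) % ratio), (0, (-cols) % ratio)),
                    constant_values=False)
    pr, pc = padded.shape
    return padded.reshape(pr // ratio, ratio, pc // ratio, ratio).any(axis=(1, 3))


class BlockBackground:
    """Running-mean background with a block-level SAD decision (assumed parameters)."""

    def __init__(self, first_frame, alpha=0.05, period=4, threshold=640):
        self.alpha, self.period, self.threshold = alpha, period, threshold
        self.background = first_frame.astype(np.float32)  # Step 2
        self.count = 0

    def step(self, frame):
        frame = frame.astype(np.float32)
        self.count += 1
        if self.count % self.period == 0:  # Step 3: update period reached
            # Step 4: exponential update of the background estimate
            self.background = (1 - self.alpha) * self.background + self.alpha * frame
        h, w = frame.shape
        h, w = h - h % N, w - w % N
        diff = np.abs(self.background[:h, :w] - frame[:h, :w])
        sad = diff.reshape(h // N, N, w // N, N).sum(axis=(1, 3))
        return map_blocks_to_ctus(sad > self.threshold)  # Steps 5 and 6


if __name__ == "__main__":
    model = BlockBackground(np.zeros((64, 64), dtype=np.uint8))
    frame = np.zeros((64, 64), dtype=np.uint8)
    frame[32:40, 0:8] = 50  # one moving 8x8 block in the lower-left CTU
    print(model.step(frame))
```

For a 1920×1080 frame the block grid is 135×240, which this mapping pads to give a 34×60 CTU mask.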
2.1 Rate distortion optimization for ROI area
In order to reduce the bit rate and achieve better image quality, rate-distortion optimization can be defined as the following optimization problem: when the bit rate R≤Rmax, the encoding algorithm is adjusted to minimize the distortion D, namely:
Equation (8) is usually used as the basis for RDO. In practice, however, the coding blocks are not independent of each other, so the solution it yields is only locally optimal.
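The equations referred to here are not reproduced in this text. As a reference point, the standard formulation of rate-distortion optimization is sketched below: the constrained problem, its Lagrangian relaxation, and the per-block form obtained when the blocks are assumed independent. The last line appears to correspond to what the text calls equation (8), but that mapping is an assumption. The multiplier λ here is the Lagrange multiplier of RDO, not the parameter λ of the Gaussian model.

```latex
% Constrained problem: minimise distortion subject to a rate budget
\min D \quad \text{s.t.} \quad R \le R_{\max}

% Lagrangian relaxation with multiplier \lambda
\min J, \qquad J = D + \lambda R

% If the K coding blocks are assumed independent, the cost separates
% and each block can be optimised on its own
\min \sum_{i=1}^{K} J_i, \qquad J_i = D_i + \lambda R_i
```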
In this paper, by dividing the ROI area, assuming that the ROI area and the non-ROI area are independent and identically distributed in a frame, the rate-distortion optimization function can be described as:
Since equation (9) takes into account the correlation problem of coding blocks, it can avoid falling into the local optimum to a certain extent. Analysis shows that equation (9) will get a better solution than equation (8).
Further, starting from subjective video quality, the human eye expects better video quality in the ROI area. Therefore, this article adds restrictions in the implementation process:
2.2 HEVC encoding integrated with ROI extraction
This article sends the ROI area into the HEVC encoder for variable-quality coding. To prevent a large difference between the coding parameters of the ROI area and the surrounding non-ROI areas from causing visible blocking artifacts, this paper uses nonlinear compensation to adjust the quantization parameters. The specific method is as follows.
Let q1 be the quantization parameter of coding block A in the ROI area and q2 the quantization parameter of a nearby coding block B in the non-ROI area. Let the center point of A be (xA, yA) and the center point of B be (xB, yB). Then q1, q2 and the Hamming distance D between the centers of A and B should satisfy the following relationship:
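The relationship itself is not reproduced in this text, so the sketch below does not implement it. It only illustrates the general idea of nonlinear QP compensation: the QP of a non-ROI block rises smoothly from the ROI value toward the background value as its distance from the ROI block grows, so that neighbouring blocks never differ abruptly. The exponential ramp, the function name and the `scale` parameter are hypothetical choices made for this example; the clamp to 0..51 is the QP range of 8-bit HEVC.

```python
import math


def smoothed_qp(qp_roi, qp_background, distance, scale=64.0):
    """QP for a non-ROI block whose centre is `distance` pixels from the ROI block centre.

    Illustrative only: an exponential ramp from qp_roi (distance 0)
    toward qp_background (large distance).
    """
    weight = 1.0 - math.exp(-distance / scale)  # 0 at the ROI, approaches 1 far away
    qp = qp_roi + (qp_background - qp_roi) * weight
    return max(0, min(51, round(qp)))  # clamp to the 8-bit HEVC QP range


if __name__ == "__main__":
    for d in (0, 32, 64, 128, 256):
        print(d, smoothed_qp(27, 35, d))
```

The distance is passed in as a plain number so that whichever distance measure the encoder uses between block centres can be supplied by the caller.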
In order to illustrate the effectiveness of the method in this paper, the Gaussian background modeling-ROI algorithm based on block matching is implemented in hardware and embedded in the HEVC encoding process.
This article uses high-level synthesis (HLS) tools on the Xilinx MPSoC platform ZCU102 to build a hardware design for background-modeling-based ROI region mapping and adaptive coding. An HLS tool maps a high-level description written in C/C++ to a hardware description language (VHDL or Verilog), which improves development efficiency.
The hardware includes 3 modules, namely: background establishment, background update, ROI determination and mapping, and finally the mapping result is sent to the video encoder. Its basic structure is shown in Figure 2.
The original video data is buffered in DDR, and line buffers inside the FPGA are used to speed up access. Under the control of the frame counter, the video data multiplexer sends the video to the different processing units, which map the ROI area onto the coding tree units (CTUs) of the H.265 standard. The mapping result is sent to the H.265 encoder. In the encoder, ROI-adaptive QP adjustment is performed according to the nature of each region, and the bitstream generated by encoding is finally written back to DDR.
4 Experimental results and analysis
4.1 Experimental environment
The experiments in this article are based on the Xilinx ZCU102 embedded development platform. The ZCU102 is equipped with a Zynq UltraScale+ XCZU9EG-2FFVB1156 FPGA chip. The internal architecture of the chip consists mainly of two parts: a processing system (PS) and programmable logic (PL).
The hardware resource consumption on the PL side is shown in Table 1. To allow a degree of scalability, the image resolution in the hardware design is configurable, with a maximum resolution of 1 920 × 1 080.
4.2 Background modeling effect and ROI mapping result
Figure 3 shows the FPGA-based background modeling and ROI mapping results. The sequence used is the HEVC standard test sequence BasketballDrill_832×480_50.yuv. Figure 3(a) is the 201st frame of the video sequence, Figure 3(b) is the background frame obtained by modeling with the first 200 frames, and Figure 3(c) is the mapping result for the HEVC CTUs, where the white area is the mapped ROI. It can be seen that the moving people in the video are accurately mapped to an area bounded at CTU granularity. Observing the original video sequence shows that the background changes over time (for example, the basket shakes when hit by the basketball), but these changes do not affect the mapping of the ROI area (that is, there is no "false alarm"), so the algorithm has a certain robustness.
Table 2 compares the processing speeds under different resolutions. The clock frequency of the PL part is 120 MHz. It can be seen from the table that the design of this article can still achieve high real-time performance at a resolution of 1 920 × 1 080.
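As a rough sanity check on these figures, the cycle budget per pixel can be worked out from the numbers quoted in the text. The sketch below assumes that the 22 fps at 1080p figure given in the introduction was measured at the 120 MHz PL clock stated here; the helper names are hypothetical, and the result is simple arithmetic, not a measurement.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """Average number of clock cycles available for each input pixel."""
    return clock_hz / (width * height * fps)


def grid(width, height, size):
    """Number of (rows, columns) of size-by-size units, rounding up at the borders."""
    return (-(-height // size), -(-width // size))


if __name__ == "__main__":
    budget = cycles_per_pixel(120e6, 1920, 1080, 22)
    print(f"cycles per pixel: {budget:.2f}")
    print("8x8 blocks per frame (rows, cols):", grid(1920, 1080, 8))
    print("32x32 CTUs per frame (rows, cols):", grid(1920, 1080, 32))
```

Under that assumption the design spends on average between two and three clock cycles per pixel, over a grid of 135×240 blocks and 34×60 CTUs per 1080p frame.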
4.3 HEVC video coding performance evaluation with embedded ROI rate control
In order to further illustrate the effectiveness of HEVC encoding after embedding the ROI region, this paper verifies the encoding results of the HEVC encoder by experiments. The test sequences under different resolutions and different scenarios were selected to calculate the overall bit rate and PSNR changes. The results are shown in Table 3.
It can be seen from Table 3 that, when the background modeling and ROI mapping algorithm proposed in this article is used for rate control, the overall PSNR of the encoded image does not change significantly while the bit rate is reduced by about 10% on average, which verifies the effectiveness of the algorithm for rate control.
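The two quantities compared in Table 3 can be computed as in the sketch below, which uses the standard definitions of PSNR for 8-bit video and of relative bit rate saving. The function names are hypothetical and the numbers in the usage example are synthetic, chosen only to exercise the formulas; they are not results from the paper.

```python
import numpy as np


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames of the same shape."""
    mse = np.mean((reference.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)


def bitrate_saving(baseline_kbps, roi_kbps):
    """Bit rate saving of the ROI encoder relative to the baseline, in percent."""
    return 100.0 * (baseline_kbps - roi_kbps) / baseline_kbps


if __name__ == "__main__":
    ref = np.zeros((8, 8), dtype=np.uint8)
    dec = np.ones((8, 8), dtype=np.uint8)  # synthetic frame with MSE = 1
    print(f"PSNR: {psnr(ref, dec):.2f} dB")
    print(f"saving: {bitrate_saving(1000.0, 900.0):.1f} %")
```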
5 Conclusion
Based on the block-wise nature of video coding algorithms, this paper proposes a block-based Gaussian background modeling and ROI mapping method, which is implemented on an FPGA using HLS and applied to H.265/HEVC video coding. Experimental results show that the algorithm runs fast on the FPGA platform and can be effectively integrated into an H.265/HEVC hardware encoder. Encoding the extracted ROI area with variable quality in H.265/HEVC yields an average bit rate saving of about 10% while the overall video quality remains stable.