blob: af0c5e06a4a13c9c1e7252504dc6655da3ca435c [file] [log] [blame]
/*!\page encoder_guide AV1 ENCODING TECHNIQUES
AV1 encoding algorithm consists following modules:
- \ref high_level_algo
- To encode a frame, first call \ref av1_receive_raw_frame() to obtain the
raw frame data. Then call \ref av1_get_compressed_data() to encode raw
frame data into compressed frame data.
- \ref partition_search
- \ref intra_mode_search
- \ref inter_mode_search
- \ref transform_search
- \ref in_loop_filter
- \ref rate_control
<b>[SAMPLE CONTEXT ONLY - copied from AV1 overview paper]:</b>
In this paper, we present the core coding tools in AV1 that contribute to
the majority of the 30% reduction in average bitrate compared with the most
performant libvpx VP9 encoder at the same quality.
\section partition Coding block partition
VP9 uses a four-way partition tree starting from the 64x64 level down to 4x4
level, with some additional restrictions for blocks below 8x8 where within an
8x8 block all the sub-blocks should have the same reference frame, as shown in
the top half of Fig. 1, so as to ensure the chroma component can be
processed in a minimum of 4x4 block unit. Note that partitions designated as
“R” refer to as recursive in that the same partition tree is repeated at a
lower scale until we reach the lowest 4x4 level.
\image html partition.png "Fig. 1. Partition tree in VP9 and AV1"
AV1 increases the largest coding block unit to 128x128 and expands the
partition tree to support 10 possible outcomes to further include 4:1/1:4
rectangular coding block sizes. Similar to VP9 only the square block is
allowed for further subdivision. In addition, AV1 adds more flexibility to
sub-8x8 coding blocks by allowing each unit has their own inter/intra mode and
reference frame choice. To support such flexibility, it allows the use of 2x2
inter prediction for chroma component, while retaining the minimum transform
size as 4x4.
\section intra_prediction Intra prediction
VP9 supports 10 intra prediction modes, including eight directional modes
corresponding to angles from 45 to 207 degrees, and two non-directional
predictors: DC and true motion (TM) mode. In AV1, the potential of an intra
coder is further explored in various ways: the granularity of directional
extrapolation is upgraded, non-directional predictors are enriched by taking
into account gradients and evolving correlations, coherence of luma and
chroma signals is exploited, and tools are developed particularly for
artificial content.
-# Enhanced directional intra prediction\n
To exploit more varieties of spatial redundancy in directional textures, in
AV1, directional intra modes are extended to an angle set with finer
granularity for blocks larger than 8x8. The original eight angles are made
nominal angles, based on which fine angle variations in a step size of 3
degrees are introduced, i.e. the prediction angle is presented by a nominal
intra angle plus an angle delta, which is -3x3 multiples of the step size. To
implement directional prediction modes in AV1 via a generic way, the 48
extension modes are realized by a unified directional predictor that links
each pixel to a reference sub-pixel location in the edge and interpolates
the reference pixel by a 2-tap bilinear filter. In total, there are 56
directional intra modes supported in AV1.
Another enhancement for directional intra prediction in AV1 is that, a low-
pass filter is applied to the reference pixel values before they are used
to predict the target block. The filter strength is pre-defined based on
the prediction angle and block size.
-# New non-directional smooth intra predictors\n
VP9 has two non-directional intra prediction modes: DC_PRED and TM_PRED.
AV1 expands on this by adding three new prediction modes: SMOOTH_PRED,
SMOOTH_V_PRED, and SMOOTH_H_PRED. Also a fourth new prediction mode
PAETH_PRED replaces the existing mode TM_PRED. The new modes work as
follows:
- <b>SMOOTH_PRED</b>: Useful for predicting blocks that have a smooth
gradient.
- <b>SMOOTH_V_PRED</b>: Similar to SMOOTH_PRED, but uses quadratic
interpolation only in the vertical direction.
- <b>SMOOTH_H_PRED</b>: Similar to SMOOTH_PRED, but uses quadratic
interpolation only in the horizontal direction.
- <b>PAETH_PRED</b>: Calculate \f$base=left + top -top\_left\f$. Then
predict this pixel as left, top, or top-left pixel depending on which of them
is closest to “base”.
\section inter_prediction Inter prediction
Motion compensation is an essential module in video coding. AV1 has a more
powerful inter coder, which largely extends the pool of reference frames and
motion vectors, breaks the limitation of block-based translational prediction,
also enhances compound prediction by using highly adaptable weighting
algorithms as well as sources.
-# Extended reference frames\n
AV1 extends the number of references for each frame from 3 to 7. Figure 4
demonstrates the multi-layer structure of a golden-frame group, in which an
adaptive number of frames share the same GOLDEN and ALTREF frames. BWDREF
is a look-ahead frame directly coded without applying temporal filtering,
thus more applicable as a backward reference in a relatively shorter
distance. ALTREF2 serves as an intermediate filtered future reference
between GOLDEN and ALTREF.
\image html gf_group.png "Fig. 4. Example of multi-layer structure of a golden-frame group"
-# Advanced compound prediction\n
A collection of new compound prediction tools is developed for AV1 to make
its inter coder more versatile. In this section, any compound prediction
operation can be generalized for a pixel \f$(i,j)\f$ as:
\f$p_f(i,j)=m(i,j)p_1(i,j)+(1-m(i,j))p_2(i,j)\f$, where \f$p_1\f$ and
\f$p_2\f$ are two predictors, and \f$p_f\f$ is the final joint prediction,
with the weighting coefficients \f$m(i,j)\f$ in \f$[0,1]\f$ that are
designed for different use cases and can be easily generated from predefined
tables.
*/
/*!\defgroup encoder_algo Encoder Algorithm
*
* The encoder algorithm describes how a sequence is encoded, including high
* level decision as well as algorithm used at every encoding stage.
*/
/*!\defgroup high_level_algo High-level Algorithm
* \ingroup encoder_algo
* This module describes sequence level/frame level algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup high_level_algo */
/*!\defgroup partition_search Partition Search
* \ingroup encoder_algo
* This module describes partition search algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup partition_search */
/*!\defgroup intra_mode_search Intra Mode Search
* \ingroup encoder_algo
* This module describes intra mode search algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup intra_mode_search */
/*!\defgroup inter_mode_search Inter Mode Search
* \ingroup encoder_algo
* This module describes inter mode search algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup inter_mode_search */
/*!\defgroup transform_search Transform Search
* \ingroup encoder_algo
* This module describes transform search algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup transform_search */
/*!\defgroup in_loop_filter In-loop Filter
* \ingroup encoder_algo
* This module describes in-loop filter algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup in_loop_filter */
/*!\defgroup rate_control Rate Control
* \ingroup encoder_algo
* This module describes rate control algorithm in AV1.
* More details will be added.
* @{
*/
/*! @} - end defgroup rate_control */