 /*!\page encoder_guide AV1 ENCODING TECHNIQUES

   The AV1 encoding algorithm consists of the following modules:
     - \ref high_level_algo
       - To encode a frame, first call \ref av1_receive_raw_frame() to obtain the
         raw frame data. Then call \ref av1_get_compressed_data() to encode raw
         frame data into compressed frame data.
     - \ref partition_search
     - \ref intra_mode_search
     - \ref inter_mode_search
     - \ref transform_search
     - \ref in_loop_filter
     - \ref rate_control

   <b>[SAMPLE CONTEXT ONLY - copied from AV1 overview paper]:</b>

   In this paper, we present the core coding tools in AV1 that contribute to
   the majority of the 30% reduction in average bitrate compared with the most
   performant libvpx VP9 encoder at the same quality.

   \section partition Coding block partition
   VP9 uses a four-way partition tree starting from the 64x64 level down to the
   4x4 level, with additional restrictions for blocks below 8x8: within an 8x8
   block, all sub-blocks must share the same reference frame, as shown in the
   top half of Fig. 1, so that the chroma component can be processed in units
   of at least 4x4. Partitions designated as “R” are recursive: the same
   partition tree is repeated at a lower scale until the lowest 4x4 level is
   reached.

   \image html partition.png "Fig. 1. Partition tree in VP9 and AV1"

   AV1 increases the largest coding block unit to 128x128 and expands the
   partition tree to support 10 possible outcomes, which include 4:1/1:4
   rectangular coding block sizes. As in VP9, only square blocks can be
   subdivided further. In addition, AV1 adds more flexibility to sub-8x8 coding
   blocks by allowing each unit to have its own inter/intra mode and reference
   frame choice. To support this flexibility, it allows 2x2 inter prediction
   for the chroma component, while keeping the minimum transform size at 4x4.
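
   The ten partition outcomes can be listed as a small C sketch. The enumerator
   names follow the naming convention used in libaom, but the layout comments
   and the two helper functions (`num_sub_blocks`, `allows_recursion`) are
   illustrative assumptions for this guide, not the encoder's actual code.

```c
#include <assert.h>

// The ten partition outcomes of a square N x N block (illustrative sketch).
typedef enum {
  PARTITION_NONE,    // one N x N block
  PARTITION_HORZ,    // two N x N/2 blocks, stacked vertically
  PARTITION_VERT,    // two N/2 x N blocks, side by side
  PARTITION_SPLIT,   // four N/2 x N/2 squares, each may be partitioned again
  PARTITION_HORZ_A,  // top half split into two squares, bottom half whole
  PARTITION_HORZ_B,  // top half whole, bottom half split into two squares
  PARTITION_VERT_A,  // left half split into two squares, right half whole
  PARTITION_VERT_B,  // left half whole, right half split into two squares
  PARTITION_HORZ_4,  // four N x N/4 blocks (4:1)
  PARTITION_VERT_4,  // four N/4 x N blocks (1:4)
  PARTITION_TYPES    // number of partition types
} PartitionType;

// Hypothetical helper: how many coding blocks one partition type produces.
static int num_sub_blocks(PartitionType p) {
  switch (p) {
    case PARTITION_NONE: return 1;
    case PARTITION_HORZ:
    case PARTITION_VERT: return 2;
    case PARTITION_HORZ_A:
    case PARTITION_HORZ_B:
    case PARTITION_VERT_A:
    case PARTITION_VERT_B: return 3;
    case PARTITION_SPLIT:
    case PARTITION_HORZ_4:
    case PARTITION_VERT_4: return 4;
    default: assert(0); return 0;
  }
}

// Hypothetical helper: only the square split may be subdivided further.
static int allows_recursion(PartitionType p) { return p == PARTITION_SPLIT; }
```

   Keeping recursion tied to `PARTITION_SPLIT` mirrors the rule stated above
   that only square blocks are subdivided further.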

   \section intra_prediction Intra prediction
   VP9 supports 10 intra prediction modes, including eight directional modes
   corresponding to angles from 45 to 207 degrees, and two non-directional
   predictors: DC and true motion (TM) mode. In AV1, the potential of an intra
   coder is further explored in various ways: the granularity of directional
   extrapolation is upgraded, non-directional predictors are enriched by taking
   into account gradients and evolving correlations, coherence of luma and
   chroma signals is exploited, and tools are developed particularly for
   artificial content.

   -# Enhanced directional intra prediction\n
      To exploit more varieties of spatial redundancy in directional textures,
   AV1 extends the directional intra modes to an angle set with finer
   granularity for blocks larger than 8x8. The original eight angles become
   nominal angles, around which fine angle variations with a step size of 3
   degrees are introduced: the prediction angle is represented by a nominal
   intra angle plus an angle delta, which is an integer multiple, from -3 to
   +3, of the step size. To implement the directional prediction modes in a
   generic way, the 48 extension modes are realized by a unified directional
   predictor that links each pixel to a reference sub-pixel location on the
   edge and interpolates the reference pixel with a 2-tap bilinear filter. In
   total, AV1 supports 56 directional intra modes.

      Another enhancement for directional intra prediction in AV1 is that a
   low-pass filter is applied to the reference pixel values before they are
   used to predict the target block. The filter strength is predefined based
   on the prediction angle and block size.

   -# New non-directional smooth intra predictors\n
      VP9 has two non-directional intra prediction modes: DC_PRED and TM_PRED.
   AV1 expands on this by adding three new prediction modes: SMOOTH_PRED,
   SMOOTH_V_PRED, and SMOOTH_H_PRED. A fourth new prediction mode, PAETH_PRED,
   replaces the existing TM_PRED mode. The new modes work as follows:

      - <b>SMOOTH_PRED</b>: Useful for predicting blocks that have a smooth
   gradient.

      - <b>SMOOTH_V_PRED</b>: Similar to SMOOTH_PRED, but uses quadratic
   interpolation only in the vertical direction.

      - <b>SMOOTH_H_PRED</b>: Similar to SMOOTH_PRED, but uses quadratic
   interpolation only in the horizontal direction.

      - <b>PAETH_PRED</b>: Calculates \f$base = left + top - top\_left\f$, then
   predicts the pixel as the left, top, or top-left pixel, whichever is closest
   to \f$base\f$.
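
   Two of the intra tools described above are simple enough to sketch in C: the
   construction of a fine prediction angle from a nominal angle and an angle
   delta, and the Paeth predictor for a single pixel. The function and macro
   names are hypothetical, and the tie-breaking order in `paeth_predict` (left,
   then top, then top-left) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

// Step size, in degrees, between neighbouring fine angles (from the text).
#define ANGLE_STEP 3
// Largest magnitude of the angle delta, in multiples of ANGLE_STEP.
#define MAX_ANGLE_DELTA 3

// Prediction angle = nominal angle + delta x step, with delta in [-3, 3].
// Eight nominal angles with seven deltas each give 56 directional modes.
static int prediction_angle(int nominal_angle, int angle_delta) {
  assert(angle_delta >= -MAX_ANGLE_DELTA && angle_delta <= MAX_ANGLE_DELTA);
  return nominal_angle + angle_delta * ANGLE_STEP;
}

// Paeth predictor for one pixel: return whichever of left, top, and top-left
// is closest to base = left + top - top_left.
static int paeth_predict(int left, int top, int top_left) {
  const int base = left + top - top_left;
  const int d_left = abs(base - left);
  const int d_top = abs(base - top);
  const int d_top_left = abs(base - top_left);
  // Assumed tie-breaking order: left first, then top, then top-left.
  if (d_left <= d_top && d_left <= d_top_left) return left;
  if (d_top <= d_top_left) return top;
  return top_left;
}
```

   For example, with a nominal angle of 45 degrees and a delta of -3, the
   prediction angle is 36 degrees; with left = 10, top = 30, and top-left = 20,
   the base value is 20, so the top-left pixel is chosen.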

   \section inter_prediction Inter prediction
   Motion compensation is an essential module in video coding. AV1 has a more
   powerful inter coder, which greatly extends the pool of reference frames and
   motion vectors, breaks the limitation of block-based translational
   prediction, and enhances compound prediction by using highly adaptable
   weighting algorithms as well as sources.

   -# Extended reference frames\n
     AV1 extends the number of references for each frame from 3 to 7. Fig. 4
     shows the multi-layer structure of a golden-frame group, in which an
     adaptive number of frames share the same GOLDEN and ALTREF frames. BWDREF
     is a look-ahead frame that is coded directly, without temporal filtering,
     which makes it more suitable as a backward reference at a relatively
     short distance. ALTREF2 serves as an intermediate filtered future
     reference between GOLDEN and ALTREF.

     \image html gf_group.png "Fig. 4. Example of multi-layer structure of a golden-frame group"

   -# Advanced compound prediction\n
      A collection of new compound prediction tools is developed for AV1 to make
   its inter coder more versatile. In this section, any compound prediction
   operation can be generalized for a pixel \f$(i,j)\f$ as:
   \f$p_f(i,j)=m(i,j)p_1(i,j)+(1-m(i,j))p_2(i,j)\f$, where \f$p_1\f$ and
   \f$p_2\f$ are two predictors, and \f$p_f\f$ is the final joint prediction,
   with the weighting coefficients \f$m(i,j)\f$ in \f$[0,1]\f$ that are
   designed for different use cases and can be easily generated from predefined
   tables.
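
   The blending formula above can be sketched in integer arithmetic as follows.
   This is a minimal illustration, not the encoder's implementation: the
   function name `blend_predictors`, the row-major buffer layout, and the 6-bit
   weight precision (a mask value of 64 standing for \f$m=1\f$) are all
   assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

// Fixed-point scale for the weights: m = mask / MASK_MAX, so a mask value in
// [0, MASK_MAX] maps to m in [0, 1]. The 6-bit precision is an assumption.
#define MASK_BITS 6
#define MASK_MAX (1 << MASK_BITS)

// Blend two predictors pixel by pixel:
//   p_f(i,j) = m(i,j) p_1(i,j) + (1 - m(i,j)) p_2(i,j)
// All buffers are row-major with `width` pixels per row.
static void blend_predictors(const uint8_t *p1, const uint8_t *p2,
                             const uint8_t *mask, uint8_t *pf, int width,
                             int height) {
  for (int i = 0; i < height; ++i) {
    for (int j = 0; j < width; ++j) {
      const int idx = i * width + j;
      const int m = mask[idx];
      assert(m <= MASK_MAX);
      // Add half of the scale before shifting so the result is rounded.
      const int sum = m * p1[idx] + (MASK_MAX - m) * p2[idx] + (MASK_MAX >> 1);
      pf[idx] = (uint8_t)(sum >> MASK_BITS);
    }
  }
}
```

   With this convention, a mask of all 64s reproduces the first predictor, a
   mask of all zeros reproduces the second, and a mask of all 32s is a plain
   average of the two.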
 */

 /*!\defgroup encoder_algo Encoder Algorithm
  *
  * The encoder algorithm describes how a sequence is encoded, including the
  * high-level decisions as well as the algorithms used at every encoding stage.
  */

 /*!\defgroup high_level_algo High-level Algorithm
  * \ingroup encoder_algo
  * This module describes the sequence-level and frame-level algorithms in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup high_level_algo */

 /*!\defgroup partition_search Partition Search
  * \ingroup encoder_algo
  * This module describes the partition search algorithm in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup partition_search */

 /*!\defgroup intra_mode_search Intra Mode Search
  * \ingroup encoder_algo
  * This module describes the intra mode search algorithm in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup intra_mode_search */

 /*!\defgroup inter_mode_search Inter Mode Search
  * \ingroup encoder_algo
  * This module describes the inter mode search algorithm in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup inter_mode_search */

 /*!\defgroup transform_search Transform Search
  * \ingroup encoder_algo
  * This module describes the transform search algorithm in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup transform_search */

 /*!\defgroup in_loop_filter In-loop Filter
  * \ingroup encoder_algo
  * This module describes the in-loop filter algorithm in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup in_loop_filter */

 /*!\defgroup rate_control Rate Control
  * \ingroup encoder_algo
  * This module describes the rate control algorithm in AV1.
  * More details will be added.
  * @{
  */
 /*! @} - end defgroup rate_control */