Use a better model for tune=ssim
Compared with the baseline tune=ssim, the average gains are:
PSNR -0.75, SSIM -0.31, MS-SSIM -0.95, VMAF -1.65
Details:
150f SP1 Q
           PSNR    SSIM  MS-SSIM    VMAF
Lowres   -1.503   0.136   -0.514  -2.065
Midres   -0.128  -0.521   -1.136  -1.622
Hdres     0.957  -0.987   -1.580  -0.434
Ugc360p  -2.038   0.159   -0.677  -2.654
150f SP1 VBR
           PSNR    SSIM  MS-SSIM    VMAF
Lowres   -1.418   0.022   -0.565  -1.630
Midres   -0.513  -0.477   -1.126  -1.928
Hdres     0.892  -0.941   -1.497  -0.060
Ugc360p  -2.246   0.128   -0.506  -2.809
Change-Id: I075c44552f02f56bb31994b9e4140630b5cea976
diff --git a/av1/encoder/encoder.c b/av1/encoder/encoder.c
index 62a5340..5bc2155 100644
--- a/av1/encoder/encoder.c
+++ b/av1/encoder/encoder.c
@@ -5751,16 +5751,6 @@
}
}
-// Implementation and modifications of C. Yeo, H. L. Tan, and Y. H. Tan, "On
-// rate distortion optimization using SSIM," Circuits and Systems for Video
-// Technology, IEEE Transactions on, vol. 23, no. 7, pp. 1170-1181, 2013.
-// SSIM_VAR_SCALE defines the strength of the bias towards SSIM in RDO:
-// Test data set: mid_res (33 frames)
-// SSIM_VAR_SCALE avg_psnr ssim ms-ssim
-// 8 8.2 -6.0 -6.4
-// 16 4.0 -5.7 -5.9
-// 32 1.6 -4.4 -4.5
-#define SSIM_VAR_SCALE 16.0
static void set_mb_ssim_rdmult_scaling(AV1_COMP *cpi) {
AV1_COMMON *cm = &cpi->common;
ThreadData *td = &cpi->td;
@@ -5778,10 +5768,6 @@
int row, col;
const int use_hbd = cpi->source->flags & YV12_FLAG_HIGHBITDEPTH;
- // TODO(sdeng): tune this param for 12bit videos.
- double c2 = 58.5225; // (.03*255)^2
- c2 *= SSIM_VAR_SCALE;
-
// Loop through each 16x16 block.
for (row = 0; row < num_rows; ++row) {
for (col = 0; col < num_cols; ++col) {
@@ -5813,7 +5799,10 @@
}
}
var = var / num_of_var;
- var = 2.0 * var + c2;
+
+ // Curve fitting with an exponential model on all 16x16 blocks from the
+ // midres dataset.
+ var = 67.035434 * (1 - exp(-0.0021489 * var)) + 17.492222;
cpi->ssim_rdmult_scaling_factors[index] = var;
log_sum += log(var);
}