Modify final refinement step in disflow

Instead of refining the flow field at pyramid level 0, skip this step
and instead refine the generated correspondences after interpolation.

In theory, refining the generated correspondences is a bit more
accurate, as this can compensate for inaccuracies in the interpolation
step. In practice, we see only a small difference, likely due to the
number of other processing and filtering stages between this and the
eventual global motion parameter selection.

The main benefit is in encode time - pyramid level 0 has by far the
most flow vectors to refine, with one per 64 pixels of the source
image. So replacing this with one refine call per feature point, of
which there are generally many fewer than one per 64 pixels, leads to
a significant reduction in the runtime of the disflow step. This is
especially true at larger resolutions.

An alternative was also tested, where we refined both the dense flow
field and the generated correspondences, but this erased all of the
encode time savings here for essentially no further BDRATE gain.

 Speed | BDRATE-PSNR | BDRATE-SSIM |  Enc time
-------+-------------+-------------+-------------
   1   |   -0.010%   |   -0.015%   |   -0.075%
   2   |   -0.001%   |   -0.018%   |   -0.189%
   3   |    0.000%   |   +0.035%   |   -0.466%
   4   |   -0.021%   |   -0.011%   |   -0.689%

STATS_CHANGED

Change-Id: If04c5cac6114e188462360a110e2d8378bda1a7f
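
To make the encode-time argument concrete, here is a minimal sketch of the
refine-call counts being compared. It is not part of the change: the helper
names are hypothetical, and the only figure taken from the message above is
the density of one level-0 flow vector per 64 source pixels.

```c
#include <assert.h>
#include <stddef.h>

/* From the commit message: level 0 holds one flow vector per 64 pixels. */
#define PIXELS_PER_FLOW_VECTOR 64

/* Hypothetical helper: approximate number of refine calls needed to refine
 * the dense flow field at pyramid level 0 (the step this change skips). */
size_t dense_refine_calls(int width, int height) {
  assert(width > 0 && height > 0);
  return ((size_t)width * (size_t)height) / PIXELS_PER_FLOW_VECTOR;
}

/* Hypothetical helper: refine calls avoided when only the correspondences
 * are refined, i.e. one call per feature point instead of one per vector.
 * Returns 0 if there are at least as many feature points as flow vectors. */
size_t refine_calls_saved(int width, int height, size_t num_corners) {
  const size_t dense = dense_refine_calls(width, height);
  return dense > num_corners ? dense - num_corners : 0;
}
```

For a 1920x1080 frame, dense_refine_calls() gives 32400. With an assumed
4096 feature points (an illustrative number, not a measurement), that is
28304 fewer refine calls. The gap grows with frame area, which is consistent
with the larger savings reported at larger resolutions.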

commit: 541e13d5c4bb14a4a2cd8ce1713606cebf0249af
author: Rachel Barker <rachelbarker@google.com> Mon Nov 13 18:00:28 2023 +0000
committer: Rachel Barker <rachelbarker@google.com> Wed Nov 22 06:28:51 2023 +0000
tree: b074880fd2432998519ac597175137d759d0b182
parent: 9bdcaa921d7e684eb251a058146f585be272526d
diff --git a/aom_dsp/flow_estimation/disflow.c b/aom_dsp/flow_estimation/disflow.c
index a010c81..855a44f 100644
--- a/aom_dsp/flow_estimation/disflow.c
+++ b/aom_dsp/flow_estimation/disflow.c

@@ -96,7 +96,9 @@
   return get_cubic_value_dbl(tmp, v_kernel);
 }
 
-static int determine_disflow_correspondence(CornerList *corners,
+static int determine_disflow_correspondence(const ImagePyramid *src_pyr,
+                                            const ImagePyramid *ref_pyr,
+                                            CornerList *corners,
                                             const FlowField *flow,
                                             Correspondence *correspondences) {
   const int width = flow->width;
@@ -134,10 +136,18 @@
     get_cubic_kernel_dbl(flow_sub_x, h_kernel);
     get_cubic_kernel_dbl(flow_sub_y, v_kernel);
 
-    const double flow_u = bicubic_interp_one(&flow->u[flow_y * stride + flow_x],
-                                             stride, h_kernel, v_kernel);
-    const double flow_v = bicubic_interp_one(&flow->v[flow_y * stride + flow_x],
-                                             stride, h_kernel, v_kernel);
+    double flow_u = bicubic_interp_one(&flow->u[flow_y * stride + flow_x],
+                                       stride, h_kernel, v_kernel);
+    double flow_v = bicubic_interp_one(&flow->v[flow_y * stride + flow_x],
+                                       stride, h_kernel, v_kernel);
+
+    // Refine the interpolated flow vector one last time
+    const int patch_tl_x = x0 - DISFLOW_PATCH_CENTER;
+    const int patch_tl_y = y0 - DISFLOW_PATCH_CENTER;
+    aom_compute_flow_at_point(
+        src_pyr->layers[0].buffer, ref_pyr->layers[0].buffer, patch_tl_x,
+        patch_tl_y, src_pyr->layers[0].width, src_pyr->layers[0].height,
+        src_pyr->layers[0].stride, &flow_u, &flow_v);
 
     // Use original points (without offsets) when filling in correspondence
     // array
@@ -469,7 +479,14 @@
   }
 
   // Compute flow field from coarsest to finest level of the pyramid
-  for (int level = src_pyr->n_levels - 1; level >= 0; --level) {
+  //
+  // Note: We stop after refining pyramid level 1 and interpolating it to
+  // generate an initial flow field at level 0. We do *not* refine the dense
+  // flow field at level 0. Instead, we wait until we have generated
+  // correspondences by interpolating this flow field, and then refine the
+  // correspondences themselves. This is both faster and gives better output
+  // compared to refining the flow field at level 0 and then interpolating.
+  for (int level = src_pyr->n_levels - 1; level >= 1; --level) {
     const PyramidLayer *cur_layer = &src_pyr->layers[level];
     const int cur_width = cur_layer->width;
     const int cur_height = cur_layer->height;
@@ -657,8 +674,8 @@
     return false;
   }
 
-  const int num_correspondences =
-      determine_disflow_correspondence(src_corners, flow, correspondences);
+  const int num_correspondences = determine_disflow_correspondence(
+      src_pyramid, ref_pyramid, src_corners, flow, correspondences);
 
   bool result = ransac(correspondences, num_correspondences, type,
                        motion_models, num_motion_models, mem_alloc_failed);