Retune global-motion-related speed features

* Enable global motion for "good" mode speeds 3 and 4

* Use disflow-based global motion by default, as it is much faster
  and gives better results than feature matching

* Add a speed feature, num_refinement_steps, which sets the number of
  refinement iterations run in av1_refine_integerized_param(), replacing
  the fixed GM_REFINEMENT_COUNT

* Fix interaction between the prune_zero_mv_with_sse speed feature
  and global motion

* Fix assertions in the prune_zero_mv_with_sse code so that they check
  the correct conditions for compound blocks

* Retune the following speed features:
  * Global motion usage
  * Number of global motion refinement steps
  * prune_zero_mv_with_sse
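
For illustration only, a minimal sketch of what an iteration budget such
as num_refinement_steps controls. Every name below is hypothetical and
none of it is taken from libaom: the real refinement lives in
av1_refine_integerized_param(), and the toy step function merely stands
in for one refinement pass over the warp parameters.

```c
#include <stdint.h>

/* Hypothetical callback: runs one refinement pass and returns the
 * resulting warp error. Not a libaom type. */
typedef int64_t (*RefineStepFn)(int step, void *ctx);

/* Runs at most num_steps refinement passes and returns the lowest error
 * seen. With num_steps == 0 the initial error is returned unchanged, so
 * a speed feature can switch refinement off entirely. */
static int64_t refine_with_budget(int64_t initial_error, int num_steps,
                                  RefineStepFn step_fn, void *ctx) {
  int64_t best_error = initial_error;
  for (int i = 0; i < num_steps; ++i) {
    const int64_t err = step_fn(i, ctx);
    if (err < best_error) best_error = err;
  }
  return best_error;
}

/* Toy step used only to exercise the loop: halves the error stored in
 * ctx on every call. Real refinement does not behave this regularly. */
static int64_t toy_halving_step(int step, void *ctx) {
  (void)step;
  int64_t *error = (int64_t *)ctx;
  *error /= 2;
  return *error;
}
```

The point of a per-speed budget is that faster presets can pass a
smaller num_steps and accept a somewhat worse model in exchange for
less encode time.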

Results:

      |      lowres2       |      midres2       |       hdres2
Speed |  BDRATE | Enc time |  BDRATE | Enc time |  BDRATE | Enc time
------+---------+----------+---------+----------+---------+---------
  1   | -0.029% |  -2.289% | -0.057% |  -2.201% | -0.058% |  -0.192%
  2   | +0.010% |  -5.912% | -0.036% |  -4.855% | -0.023% |  -1.333%
  3   | -1.249% |  +3.028% | -0.808% |  +3.212% | -0.379% |  +3.384%
  4   | -1.304% |  +4.202% | -0.846% |  +4.760% | -0.429% |  +4.960%

No change for speeds 5+ or for realtime mode

STATS_CHANGED

Change-Id: Ie054a79890ecf6ef8b57f0dbffd2ebca5aecbda1
diff --git a/av1/encoder/global_motion_facade.c b/av1/encoder/global_motion_facade.c
index 1c31148..941118b 100644
--- a/av1/encoder/global_motion_facade.c
+++ b/av1/encoder/global_motion_facade.c
@@ -103,6 +103,7 @@
   TransformationType model;
   int bit_depth = cpi->common.seq_params->bit_depth;
   GlobalMotionMethod global_motion_method = cpi->oxcf.global_motion_method;
+  int num_refinements = cpi->sf.gm_sf.num_refinement_steps;
 
   for (model = ROTZOOM; model < GLOBAL_TRANS_TYPES_ENC; ++model) {
     int64_t best_warp_error = INT64_MAX;
@@ -151,7 +152,7 @@
             ref_buf[frame]->y_buffer, ref_buf[frame]->y_crop_width,
             ref_buf[frame]->y_crop_height, ref_buf[frame]->y_stride,
             cpi->source->y_buffer, src_width, src_height, src_stride,
-            GM_REFINEMENT_COUNT, best_warp_error, segment_map, segment_map_w,
+            num_refinements, best_warp_error, segment_map, segment_map_w,
             erroradv_threshold);
 
         // av1_refine_integerized_param() can return a TRANSLATION type model