rtc: Superblock ME in partitioning for screen

For real-time screen mode in nonrd_pickmode:
Enable superblock ME (estimate_motion_for_var_based_partition)
for screen, and only allow it for superblocks with motion and
spatial variance above some threshold.

Use it to detect motion whose SAD is small (much less than the
zeromv SAD). This can capture coherent motion such as scrolling.
The motion vector extracted here is then tested in nonrd_pickmode.
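
The gating described above can be sketched roughly as follows. This is
an illustration only, not the code in this change: the function names,
the threshold values, the use of zero-MV SAD as the motion measure, and
the "less than one quarter" reading of "much less" are all assumptions.

```c
#include <stdbool.h>

/* Hypothetical thresholds; the commit message does not give real values. */
#define SB_MOTION_SAD_THRESH 1000u
#define SB_SPATIAL_VAR_THRESH 500u

/* Gate: allow superblock ME only when the superblock shows motion
 * (assumed here to be measured by zero-MV SAD) and has enough spatial
 * variance. Both must exceed their thresholds. */
static bool sb_me_allowed(unsigned int zero_mv_sad, unsigned int spatial_var) {
  return zero_mv_sad > SB_MOTION_SAD_THRESH &&
         spatial_var > SB_SPATIAL_VAR_THRESH;
}

/* Treat the ME result as strong motion when its SAD is much smaller
 * than the zero-MV SAD. "Much smaller" is modeled as below one quarter,
 * which is an assumption made for this sketch. */
static bool sb_me_is_strong(unsigned int me_sad, unsigned int zero_mv_sad) {
  return me_sad < (zero_mv_sad >> 2);
}
```

In this sketch a caller would first check sb_me_allowed(), run the
search, and only hand the resulting vector on as a preferred candidate
when sb_me_is_strong() holds.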

Update int_pro_motion() to allow for a larger search window and
full search, and to consider horizontal and vertical motion separately.
This is needed for superblocks with motion, to capture scroll motion.
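
To illustrate what searching horizontal and vertical motion separately
can look like, here is a minimal sketch of a full 1-D search over
projection vectors. It is not the libaom implementation: proj_sad() and
search_1d() are hypothetical helpers, and the buffer layout (reference
projection padded by `range` entries on each side) is an assumption.

```c
#include <stdlib.h>

/* SAD between a source projection of length n and the window of the
 * reference projection that starts at ref[offset]. */
static unsigned int proj_sad(const int *ref, const int *src, int n,
                             int offset) {
  unsigned int sad = 0;
  for (int i = 0; i < n; ++i) {
    sad += (unsigned int)abs(ref[offset + i] - src[i]);
  }
  return sad;
}

/* Full 1-D search. ref holds n + 2 * range entries, src holds n entries.
 * Returns the displacement in [-range, range] with the smallest
 * projection SAD; zero displacement wins ties against other candidates. */
static int search_1d(const int *ref, const int *src, int n, int range) {
  int best_d = 0;
  unsigned int best_sad = proj_sad(ref, src, n, range);
  for (int d = -range; d <= range; ++d) {
    const unsigned int sad = proj_sad(ref, src, n, range + d);
    if (sad < best_sad) {
      best_sad = sad;
      best_d = d;
    }
  }
  return best_d;
}
```

Running search_1d() once on column sums and once on row sums gives
independent horizontal and vertical displacement estimates, so a pure
vertical scroll is not penalized by a poor horizontal match.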

For superblocks where motion estimation is much better
than zero-motion, the coding block size will be larger and pickmode
will always test the motion vector from superblock ME.
This can lead to better quality with fewer bits for scroll motion.

Visual improvement on clips with scroll (buganizer, youtube),
with ~2-3% BD-rate gain on those clips and a small speedup.

Stats change on rtc_screen: avg/ovr/ssim, IC% speedup
speed 11: -0.923/-1.230/-1.223, 1.332
speed 10: -0.838/-0.857/-1.027, 1.302
speed 9:  -0.898/-0.991/-1.126, 1.384
speed 8:  -0.658/-0.685/-0.751, 0.183
speed 7:  -0.616/-0.889/-0.878, 0.881

Change-Id: I210f86712f8c6f958d5498b19afb97ffa9be8654
diff --git a/av1/encoder/mcomp.h b/av1/encoder/mcomp.h
index 2e26579..fa46b48 100644
--- a/av1/encoder/mcomp.h
+++ b/av1/encoder/mcomp.h
@@ -267,10 +267,9 @@
 
 int av1_init_search_range(int size);
 
-unsigned int av1_int_pro_motion_estimation(const struct AV1_COMP *cpi,
-                                           MACROBLOCK *x, BLOCK_SIZE bsize,
-                                           int mi_row, int mi_col,
-                                           const MV *ref_mv);
+unsigned int av1_int_pro_motion_estimation(
+    const struct AV1_COMP *cpi, MACROBLOCK *x, BLOCK_SIZE bsize, int mi_row,
+    int mi_col, const MV *ref_mv, unsigned int *y_sad_zero, int me_search_par);
 
 int av1_refining_search_8p_c(const FULLPEL_MOTION_SEARCH_PARAMS *ms_params,
                              const FULLPEL_MV start_mv, FULLPEL_MV *best_mv);