Allow more partition searches for screen content

Summary:
Speed 1: ~2% gain on screen content videos, with ~25% slowdown.

Screen content videos tend to select large partition block sizes,
because prediction is good.
The ml partition search speed features sometimes skip
these partition searches, and they can also limit the
maximum partition block size.
For screen content, this cl forces the partition search to always
start at the largest possible block size, and allows the
PARTITION_NONE search and rectangular block sizes.
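
The gating can be sketched as below. This is a simplified illustration,
not the libaom code: SketchConfig, SketchDecisions and
sketch_partition_gates are hypothetical names, and the real conditions
in the diff include additional block-size and frame-boundary checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for encoder state (not libaom types). */
typedef struct {
  bool is_screen_content;          /* mirrors cpi->is_screen_content_type */
  bool intra_cnn_split_enabled;    /* stands in for sf.part_sf.intra_cnn_split */
  bool split_only_enabled;         /* stands in for simple_motion_search_split */
  bool prune_rect_enabled;         /* stands in for simple_motion_search_prune_rect */
  bool auto_max_partition_enabled; /* stands in for auto_max_partition_based_on_simple_motion */
} SketchConfig;

typedef struct {
  bool try_intra_cnn_split;
  bool try_split_only;
  bool try_prune_rect;
  bool use_auto_max_partition;
  bool partition_none_allowed;
} SketchDecisions;

/* For screen content, every ml pruning feature is switched off and
 * PARTITION_NONE is allowed whenever the block lies inside the frame. */
static SketchDecisions sketch_partition_gates(const SketchConfig *cfg,
                                              bool has_rows, bool has_cols,
                                              bool none_allowed_by_features) {
  assert(cfg != NULL);
  const bool ml_allowed = !cfg->is_screen_content;
  SketchDecisions d;
  d.try_intra_cnn_split = ml_allowed && cfg->intra_cnn_split_enabled;
  d.try_split_only = ml_allowed && cfg->split_only_enabled;
  d.try_prune_rect = ml_allowed && cfg->prune_rect_enabled;
  d.use_auto_max_partition = ml_allowed && cfg->auto_max_partition_enabled;
  d.partition_none_allowed = cfg->is_screen_content
                                 ? (has_rows && has_cols)
                                 : none_allowed_by_features;
  return d;
}
```

With is_screen_content set, all four pruning decisions come out false,
so the search keeps the largest block size and the rectangular
partitions as candidates.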

The result is as follows:
Speed 4          avg_psnr    ovr_psnr    ssim
screen_content     -2.98       -3.02      -4.28

Screen content videos from other sets:
Speed 4          avg_psnr    ovr_psnr    ssim
wikipedia420       -8.68       -8.67      -9.31
Dinner_1080p       -6.22       -6.32      -6.31

For speed 1, one clip shows an anomalous result (down 14%).
Excluding it, the performance is:
Speed 1          avg_psnr    ovr_psnr    ssim
screen_content     -1.83       -1.87      -2.71

Screen content videos from other sets:
Speed 1          avg_psnr    ovr_psnr    ssim
wikipedia420       -6.47       -6.49      -6.74
Dinner_1080p       -6.46       -6.59      -6.79

Local speed tests show a ~25% slowdown on screen content
videos, since more partition searches are allowed.

Other discoveries and things to address:
(1). Speed 0 is ~7.5% better than speed 1 on the screen content set.
On other test sets, the difference is ~2%.
This suggests that the ml partition speed features do not work
as well on the screen content set as on other test sets.

(2). Speed 0 encoding time on screen content videos is 3x that of speed 1.
Therefore I think that getting ~2% back at speed 1 for a ~25% slowdown
is a good trade-off.
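
One rough way to compare the two options is gain per percent of extra
encoding time. This metric and the helper name below are assumptions
made for illustration, not something measured or used in this cl.

```c
#include <assert.h>

/* Hypothetical helper: coding gain (in %) obtained per 1% of extra
 * encoding time. time_ratio is new_time / old_time and must exceed 1. */
static double gain_per_percent_slowdown(double gain_pct, double time_ratio) {
  assert(time_ratio > 1.0);
  return gain_pct / ((time_ratio - 1.0) * 100.0);
}
```

With the figures quoted above, this change gives 2 / 25 = 0.08, while
moving from speed 1 to speed 0 gives 7.5 / 200 = 0.0375.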

Change-Id: I71668ace8d595f2868d3d19714e315dc3402a25b
diff --git a/av1/encoder/encodeframe.c b/av1/encoder/encodeframe.c
index 31b221b..a440bf2 100644
--- a/av1/encoder/encodeframe.c
+++ b/av1/encoder/encodeframe.c
@@ -2769,7 +2769,8 @@
   save_context(x, &x_ctx, mi_row, mi_col, bsize, num_planes);
 
   const int try_intra_cnn_split =
-      frame_is_intra_only(cm) && cpi->sf.part_sf.intra_cnn_split &&
+      !cpi->is_screen_content_type && frame_is_intra_only(cm) &&
+      cpi->sf.part_sf.intra_cnn_split &&
       cm->seq_params.sb_size >= BLOCK_64X64 && bsize <= BLOCK_64X64 &&
       bsize >= BLOCK_8X8 && mi_row + mi_size_high[bsize] <= cm->mi_rows &&
       mi_col + mi_size_wide[bsize] <= cm->mi_cols;
@@ -2784,6 +2785,7 @@
   // Use simple_motion_search to prune partitions. This must be done prior to
   // PARTITION_SPLIT to propagate the initial mvs to a smaller blocksize.
   const int try_split_only =
+      !cpi->is_screen_content_type &&
       cpi->sf.part_sf.simple_motion_search_split && do_square_split &&
       bsize >= BLOCK_8X8 && mi_row + mi_size_high[bsize] <= cm->mi_rows &&
       mi_col + mi_size_wide[bsize] <= cm->mi_cols && !frame_is_intra_only(cm) &&
@@ -2797,6 +2799,7 @@
   }
 
   const int try_prune_rect =
+      !cpi->is_screen_content_type &&
       cpi->sf.part_sf.simple_motion_search_prune_rect &&
       !frame_is_intra_only(cm) && do_rectangular_split &&
       (do_square_split || partition_none_allowed ||
@@ -2868,9 +2871,11 @@
 
   // PARTITION_NONE
   if (is_le_min_sq_part && has_rows && has_cols) partition_none_allowed = 1;
+  assert(terminate_partition_search == 0);
   int64_t part_none_rd = INT64_MAX;
-  if (!terminate_partition_search && partition_none_allowed &&
-      !is_gt_max_sq_part) {
+  if (cpi->is_screen_content_type)
+    partition_none_allowed = has_rows && has_cols;
+  if (partition_none_allowed && !is_gt_max_sq_part) {
     int pt_cost = 0;
     if (bsize_at_least_8x8) {
       pt_cost = partition_cost[PARTITION_NONE] < INT_MAX
diff --git a/av1/encoder/partition_strategy.h b/av1/encoder/partition_strategy.h
index 5275432..6405f32 100644
--- a/av1/encoder/partition_strategy.h
+++ b/av1/encoder/partition_strategy.h
@@ -199,13 +199,16 @@
          (mi_col + sb_mi_wide) <= cm->mi_cols;
 }
 
+// Do not use this criteria for screen content videos.
+// Since screen content videos could often find good predictors and the largest
+// block size is likely to be used.
 static INLINE int use_auto_max_partition(AV1_COMP *const cpi,
                                          BLOCK_SIZE sb_size, int mi_row,
                                          int mi_col) {
   assert(IMPLIES(cpi->gf_group.size > 0,
                  cpi->gf_group.index < cpi->gf_group.size));
   AV1_COMMON *const cm = &cpi->common;
-  return !frame_is_intra_only(cm) &&
+  return !frame_is_intra_only(cm) && !cpi->is_screen_content_type &&
          cpi->sf.part_sf.auto_max_partition_based_on_simple_motion !=
              NOT_IN_USE &&
          sb_size == BLOCK_128X128 && is_full_sb(cm, mi_row, mi_col, sb_size) &&