Use unfiltered frame to determine screen content type

BUG=aomedia:2516

Recent encoding results provided by Yunqing@ show that turning off
"--enable_keyframe_filtering" substantially improves encoding speed
on several clips:

Filename                                                    | Speedup
---------------------------------------------------------------------
Netflix_PierSeaside_1920x1080_60fps_8bit_420_60f.y4m        |  38.7%
Netflix_RollerCoaster_1280x720_60fps_8bit_420_60f.y4m       |  54.4%
Netflix_TunnelFlag_1920x1080_60fps_8bit_420_60f.y4m         |  48.3%
aspen_1080p_60f.y4m                                         |  48.9%

The reason is that the screen content type is determined from the
filtered keyframe. All of the above clips are falsely classified as
screen content, so more coding tools are evaluated, which slows down
encoding.

This change uses the unfiltered source frame to determine the screen
content type, which mitigates the slowdown. Coding performance is not
expected to change much.
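
For illustration only, here is a minimal standalone sketch of the kind of
heuristic involved: it classifies a frame by the portion of 16x16 luma
blocks that have few distinct values. It is not the libaom
implementation. The helper names (count_block_colors,
looks_like_screen_content), the max_colors parameter, the 10% threshold,
and the 8-bit-only handling are all assumptions made for this example.
The point is that the result depends only on the luma buffer passed in,
so switching from the filtered to the unfiltered source changes the
input rather than the heuristic.

```c
#include <stdint.h>
#include <string.h>

/* Count the distinct 8-bit luma values in one blk_w x blk_h block. */
static int count_block_colors(const uint8_t *src, int stride, int blk_w,
                              int blk_h) {
  uint8_t seen[256];
  memset(seen, 0, sizeof(seen));
  int n = 0;
  for (int r = 0; r < blk_h; ++r) {
    for (int c = 0; c < blk_w; ++c) {
      const uint8_t v = src[r * stride + c];
      if (!seen[v]) {
        seen[v] = 1;
        ++n;
      }
    }
  }
  return n;
}

/* Hypothetical classifier: returns 1 if at least 10% (assumed threshold)
 * of the full 16x16 blocks contain at most max_colors luma values. */
static int looks_like_screen_content(const uint8_t *y, int stride, int width,
                                     int height, int max_colors) {
  const int blk_w = 16;
  const int blk_h = 16;
  int few = 0;
  int total = 0;
  for (int r = 0; r + blk_h <= height; r += blk_h) {
    for (int c = 0; c + blk_w <= width; c += blk_w) {
      ++total;
      if (count_block_colors(y + r * stride + c, stride, blk_w, blk_h) <=
          max_colors) {
        ++few;
      }
    }
  }
  return total > 0 && few * 10 >= total;
}
```

A caller would pass the y_buffer, y_stride, y_width and y_height of
whichever frame it wants classified.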

STATS_CHANGED

Change-Id: Iadee2bcb73959d9e7093335d395e9e1424c8bcf0
diff --git a/av1/encoder/encode_strategy.c b/av1/encoder/encode_strategy.c
index 8fdd282..e9bc75a 100644
--- a/av1/encoder/encode_strategy.c
+++ b/av1/encoder/encode_strategy.c
@@ -1302,6 +1302,8 @@
 #endif
   }
 
+  // Save unfiltered source.
+  cpi->unfiltered_source = frame_input.source;
 #if CONFIG_REALTIME_ONLY
   if (av1_encode(cpi, dest, &frame_input, &frame_params, &frame_results) !=
       AOM_CODEC_OK) {
diff --git a/av1/encoder/encoder.c b/av1/encoder/encoder.c
index 5bc2155..5bf41f8 100644
--- a/av1/encoder/encoder.c
+++ b/av1/encoder/encoder.c
@@ -3828,12 +3828,12 @@
 
   // Estimate if the source frame is screen content, based on the portion of
   // blocks that have few luma colors.
-  const uint8_t *src = cpi->source->y_buffer;
+  const uint8_t *src = cpi->unfiltered_source->y_buffer;
   assert(src != NULL);
-  const int use_hbd = cpi->source->flags & YV12_FLAG_HIGHBITDEPTH;
-  const int stride = cpi->source->y_stride;
-  const int width = cpi->source->y_width;
-  const int height = cpi->source->y_height;
+  const int use_hbd = cpi->unfiltered_source->flags & YV12_FLAG_HIGHBITDEPTH;
+  const int stride = cpi->unfiltered_source->y_stride;
+  const int width = cpi->unfiltered_source->y_width;
+  const int height = cpi->unfiltered_source->y_height;
   const int bd = cm->seq_params.bit_depth;
   const int blk_w = 16;
   const int blk_h = 16;
diff --git a/av1/encoder/encoder.h b/av1/encoder/encoder.h
index 8c8474a..18749e5 100644
--- a/av1/encoder/encoder.h
+++ b/av1/encoder/encoder.h
@@ -771,6 +771,7 @@
   YV12_BUFFER_CONFIG scaled_source;
   YV12_BUFFER_CONFIG *unscaled_last_source;
   YV12_BUFFER_CONFIG scaled_last_source;
+  YV12_BUFFER_CONFIG *unfiltered_source;
 
   uint8_t tpl_stats_block_mis_log2;  // block granularity of tpl score storage
   TplDepFrame tpl_stats_buffer[MAX_LENGTH_TPL_FRAME_STATS];