Remove unneeded push to winner_mode_stats for kfs

In the current codebase, we push a winner_mode_stat with the current
best_rd before we start trying any prediction mode. Because no mode has
been searched at that point, this entry carries an invalid mbmi, which
results in some extra WINNER_MODE_EVAL in multi-winner mode.

This commit removes the initial push to winner_mode_stats for kfs.
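
As a rough illustration of why the initial push costs extra evaluations,
here is a toy model of a sorted winner list. It is only a sketch and not
the code in rdopt.c: ToyWinnerStat, ToyWinnerList, toy_store_winner,
toy_count_evals, the capacity of 3 and the RD values are all made up for
this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_WINNERS 3 /* hypothetical list capacity */

typedef struct {
  int64_t rd; /* RD cost of the candidate */
  int valid;  /* 0 for a placeholder that holds no searched mode */
} ToyWinnerStat;

typedef struct {
  ToyWinnerStat stats[TOY_MAX_WINNERS];
  int count;
} ToyWinnerList;

/* Insert a candidate, keeping the list sorted by ascending RD cost. */
static void toy_store_winner(ToyWinnerList *list, int64_t rd, int valid) {
  int idx = 0;
  assert(list->count >= 0 && list->count <= TOY_MAX_WINNERS);

  /* Find the first entry that is worse than the new candidate. */
  while (idx < list->count && list->stats[idx].rd <= rd) ++idx;

  /* The list is full and every entry is at least as good: drop it. */
  if (idx == TOY_MAX_WINNERS) return;

  /* Shift worse entries down; the last one falls off if the list is full. */
  int last = list->count < TOY_MAX_WINNERS ? list->count : TOY_MAX_WINNERS - 1;
  for (int i = last; i > idx; --i) list->stats[i] = list->stats[i - 1];

  list->stats[idx].rd = rd;
  list->stats[idx].valid = valid;
  if (list->count < TOY_MAX_WINNERS) ++list->count;
}

/* Number of list entries a later winner-mode pass would have to visit. */
static int toy_count_evals(int seed_placeholder) {
  ToyWinnerList list;
  memset(&list, 0, sizeof(list));

  const int64_t best_rd = 100;         /* made-up initial threshold */
  const int64_t mode_rds[] = { 70, 50 }; /* made-up costs of two real modes */

  /* Old behaviour: seed the list before any mode has been searched. */
  if (seed_placeholder) toy_store_winner(&list, best_rd, 0);

  for (size_t i = 0; i < sizeof(mode_rds) / sizeof(mode_rds[0]); ++i)
    toy_store_winner(&list, mode_rds[i], 1);

  return list.count;
}
```

In this toy setup, toy_count_evals(1) leaves three entries in the list,
one of them the invalid placeholder, while toy_count_evals(0) leaves only
the two real modes.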

Performance:
BDRate Increase (PSNR): 0.02~0.04%
Speed Up: 0.5%~1.0%

STATS_CHANGED

Change-Id: I9141dd62f075964b2e5dde1b6cea6e2126f441d3
diff --git a/av1/encoder/rdopt.c b/av1/encoder/rdopt.c
index 96738e9..0729d305 100644
--- a/av1/encoder/rdopt.c
+++ b/av1/encoder/rdopt.c
@@ -4864,11 +4864,7 @@
   MB_MODE_INFO best_mbmi = *mbmi;
   av1_zero(x->winner_mode_stats);
   x->winner_mode_count = 0;
-  // Initialize best mode stats for winner mode processing
-  const int txfm_search_done = 1;
-  store_winner_mode_stats(
-      &cpi->common, x, mbmi, NULL, NULL, NULL, 0, NULL, bsize, best_rd,
-      cpi->sf.winner_mode_sf.enable_multiwinner_mode_process, txfm_search_done);
+
   /* Y Search for intra prediction mode */
   for (int mode_idx = INTRA_MODE_START; mode_idx < INTRA_MODE_END; ++mode_idx) {
     RD_STATS this_rd_stats;
@@ -4918,6 +4914,7 @@
         intra_mode_info_cost_y(cpi, x, mbmi, bsize, bmode_costs[mbmi->mode]);
     this_rd = RDCOST(x->rdmult, this_rate, this_distortion);
     // Collect mode stats for multiwinner mode processing
+    const int txfm_search_done = 1;
     store_winner_mode_stats(
         &cpi->common, x, mbmi, NULL, NULL, NULL, 0, NULL, bsize, this_rd,
         cpi->sf.winner_mode_sf.enable_multiwinner_mode_process,