Reduce best rdcost value in transform partition search

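Adaptively tighten the best rate-distortion cost bound in the
recursive transform block partition search: before the four sub-blocks
are evaluated, ref_best_rd is clamped to the cost of the unsplit block
(this_rd). For the bus CIF clip at 1000 kbps, this reduces the encoding
time from 1864 seconds to 1756 seconds, about a 6% speedup.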
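
A rough sketch of why a tighter bound can save work, assuming the
sub-block loop stops once the accumulated cost exceeds the bound. This
is not the libaom code: sketch_split_cost, SKETCH_MIN, and the tighten
flag are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Hypothetical stand-in for the split branch of a recursive partition
 * search. sub_rd holds the RD cost of each of the four sub-blocks,
 * this_rd is the cost of the unsplit block, and ref_best_rd is the best
 * cost known so far. Returns the summed split cost, or INT64_MAX if the
 * running sum exceeds the bound. *visited reports how many sub-blocks
 * were evaluated before the loop stopped. */
static int64_t sketch_split_cost(const int64_t sub_rd[4], int64_t this_rd,
                                 int64_t ref_best_rd, int tighten,
                                 int *visited) {
  int64_t sum_rd = 0;
  int i;

  assert(sub_rd != NULL && visited != NULL);

  /* The idea of this change: a split that costs more than the unsplit
   * block cannot win, so the unsplit cost is also a valid bound. */
  if (tighten) ref_best_rd = SKETCH_MIN(this_rd, ref_best_rd);

  *visited = 0;
  for (i = 0; i < 4; ++i) {
    sum_rd += sub_rd[i];
    ++*visited;
    if (sum_rd > ref_best_rd) return INT64_MAX; /* early termination */
  }
  return sum_rd;
}
```

With four sub-blocks costing 40 each, this_rd = 100 and an incoming
bound of 1000, the untightened search visits all four sub-blocks, while
the tightened one stops after the third.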

Change-Id: I5433a1825c0f8b13fcc5ab7e19713a98969d53fc
diff --git a/av1/encoder/rdopt.c b/av1/encoder/rdopt.c
index dc6837e..43b00b8 100644
--- a/av1/encoder/rdopt.c
+++ b/av1/encoder/rdopt.c
@@ -4705,6 +4705,8 @@
 
     assert(tx_size < TX_SIZES_ALL);
 
+    ref_best_rd = AOMMIN(this_rd, ref_best_rd);
+
     for (i = 0; i < 4 && this_cost_valid; ++i) {
       int offsetr = blk_row + (i >> 1) * bsl;
       int offsetc = blk_col + (i & 0x01) * bsl;