Fix calculation of tx_type cost for rectangular transforms

This patch fixes av1_tx_type_cost so that it matches the logic in
av1_write_tx_type. Previously, we wrongly assumed that a 32x16 block
needed to signal its transform type (with rect-tx-ext and
rect-tx-ext-intra) because we were passing 16x16 to get_ext_tx_types.

I've also changed av1_write_tx_type to use get_min_tx_size rather than
inlining its body. This is not a functional change, but it's probably
better to use the same helper function in both places.
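
As a rough illustration of why the second change is not functional, the
sketch below models the helper as a thin wrapper around the square-size
lookup table. The Toy* and toy_* names are hypothetical stand-ins with
only four sizes, not the real libaom definitions; the wrapper shape is
assumed from the diff, where the helper call replaces a direct
txsize_sqr_map lookup.

```c
#include <assert.h>

/* Simplified stand-ins for illustration only; not the libaom enums. */
typedef enum {
  TOY_TX_16X16,
  TOY_TX_32X32,
  TOY_TX_16X32,
  TOY_TX_32X16,
  TOY_TX_SIZES_ALL
} ToyTxSize;

/* Maps each transform size to the largest square that fits inside it. */
static const ToyTxSize toy_txsize_sqr_map[TOY_TX_SIZES_ALL] = {
  TOY_TX_16X16, /* 16x16 */
  TOY_TX_32X32, /* 32x32 */
  TOY_TX_16X16, /* 16x32 */
  TOY_TX_16X16, /* 32x16 */
};

/* Assumed shape of the helper: a bounds check plus the same lookup. */
static ToyTxSize toy_get_min_tx_size(ToyTxSize tx_size) {
  assert(tx_size < TOY_TX_SIZES_ALL);
  return toy_txsize_sqr_map[tx_size];
}
```

Under these assumptions, a 32x16 transform maps to 16x16 whether the
table is indexed directly or through the helper, so swapping one for the
other leaves the written bitstream unchanged.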

Change-Id: Iff6ee0bff2d332d5270fe0219db88c95e0b051d0
diff --git a/av1/encoder/bitstream.c b/av1/encoder/bitstream.c
index 3c305e2..df8ba10 100644
--- a/av1/encoder/bitstream.c
+++ b/av1/encoder/bitstream.c
@@ -1127,7 +1127,7 @@
 #endif
 
   if (!FIXED_TX_TYPE) {
-    const TX_SIZE square_tx_size = txsize_sqr_map[tx_size];
+    const TX_SIZE square_tx_size = get_min_tx_size(tx_size);
     const BLOCK_SIZE bsize = mbmi->sb_type;
     if (get_ext_tx_types(tx_size, bsize, is_inter, cm->reduced_tx_set_used) >
             1 &&