Fix for issue #214: Secondary transform contaminates dqcoeff buffer

Proposed fix for AOM Issue #214.

"Secondary transform doesn't update buffers correctly leading to
suboptimal mode decisions in the encoder."

    The input dqcoeff buffer passed to av1_inverse_transform_block()
    must be accessed read-only, because later stages of the mode decision
    assume it still holds only the intact dequantized primary and
    secondary (if available) transform coefficients.
    For example, computing distortion in the pixel domain runs the
    inverse transform again, which requires intact dequantized
    coefficients.
    
    Before this fix, the inverse transform of IST (Intra Secondary Tx),
    newly introduced and adopted in AV2, wrote in place to the input
    buffer, which contaminated the dqcoeff buffer.
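
The copy-before-transform pattern described above can be illustrated with a minimal standalone sketch. It is not the codec code: coeff_t, SKETCH_MAX_COEFFS, and every sketch_* function are hypothetical stand-ins (for tran_low_t, the work-buffer bound, av1_inv_stxfm() and av1_highbd_inv_txfm_add()), and the arithmetic inside them is arbitrary. It only shows that a const input plus a local scratch copy lets the same coefficients be reconstructed twice with identical results.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t coeff_t;      /* assumption: stand-in for tran_low_t */
#define SKETCH_MAX_COEFFS 64  /* hypothetical bound, for this sketch only */

/* Hypothetical in-place stage (plays the role of the inverse secondary
 * transform): it overwrites the buffer it is given. */
static void sketch_inplace_stage(coeff_t *buf, int n) {
  for (int i = 0; i < n; ++i) buf[i] = 2 * buf[i] + 1;
}

/* Hypothetical final stage (plays the role of the inverse primary
 * transform plus add): reads coefficients, accumulates into dst. */
static void sketch_add_to_dst(const coeff_t *buf, uint16_t *dst, int n) {
  for (int i = 0; i < n; ++i) dst[i] = (uint16_t)(dst[i] + buf[i]);
}

/* The caller's coefficients are const; the in-place stage only ever
 * touches the local scratch copy. */
static void sketch_inverse_block(const coeff_t *dqcoeff, uint16_t *dst,
                                 int n) {
  assert(n >= 0 && n <= SKETCH_MAX_COEFFS);
  coeff_t scratch[SKETCH_MAX_COEFFS];
  memcpy(scratch, dqcoeff, sizeof(coeff_t) * (size_t)n);
  sketch_inplace_stage(scratch, n);
  sketch_add_to_dst(scratch, dst, n);
}

/* Returns 1 if the input is unchanged after two reconstructions and
 * both reconstructions agree, 0 otherwise. */
static int sketch_dqcoeff_preserved(void) {
  const coeff_t dqcoeff[4] = { 3, -1, 0, 2 };
  coeff_t snapshot[4];
  memcpy(snapshot, dqcoeff, sizeof(dqcoeff));
  uint16_t dst_a[4] = { 100, 100, 100, 100 };
  uint16_t dst_b[4] = { 100, 100, 100, 100 };
  sketch_inverse_block(dqcoeff, dst_a, 4);
  sketch_inverse_block(dqcoeff, dst_b, 4);
  return memcmp(dqcoeff, snapshot, sizeof(snapshot)) == 0 &&
         memcmp(dst_a, dst_b, sizeof(dst_a)) == 0;
}
```

If sketch_inverse_block() instead ran the in-place stage directly on a non-const dqcoeff, the second call would start from already-modified coefficients and the two reconstructions would differ, which mirrors the contamination described above.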

BD-Rate change:
For the CTC v4 AI A2 test set, a BD-rate change of -0.10% PSNR
(weighted YUV avg) has been reported.

Fixes #214

Closes #214
diff --git a/av1/common/idct.c b/av1/common/idct.c
index a9a3f1f..4462ef6 100644
--- a/av1/common/idct.c
+++ b/av1/common/idct.c
@@ -318,8 +318,9 @@
 }
 #endif  // CONFIG_CROSS_CHROMA_TX
 
-void av1_inverse_transform_block(const MACROBLOCKD *xd, tran_low_t *dqcoeff,
-                                 int plane, TX_TYPE tx_type, TX_SIZE tx_size,
+void av1_inverse_transform_block(const MACROBLOCKD *xd,
+                                 const tran_low_t *dqcoeff, int plane,
+                                 TX_TYPE tx_type, TX_SIZE tx_size,
                                  uint16_t *dst, int stride, int eob,
                                  int reduced_tx_set) {
   if (!eob) return;
@@ -338,9 +339,14 @@
   assert(((intra_mode >= PAETH_PRED || filter) && txfm_param.sec_tx_type) == 0);
   (void)intra_mode;
   (void)filter;
-  av1_inv_stxfm(dqcoeff, &txfm_param);
 
-  av1_highbd_inv_txfm_add(dqcoeff, dst, stride, &txfm_param);
+  // Work buffer for secondary transform
+  DECLARE_ALIGNED(32, tran_low_t, temp_dqcoeff[MAX_SB_SQUARE]);
+  memcpy(temp_dqcoeff, dqcoeff, sizeof(tran_low_t) * tx_size_2d[tx_size]);
+
+  av1_inv_stxfm(temp_dqcoeff, &txfm_param);
+
+  av1_highbd_inv_txfm_add(temp_dqcoeff, dst, stride, &txfm_param);
 }
 
 // Inverse secondary transform
diff --git a/av1/common/idct.h b/av1/common/idct.h
index b50d972..2652f7c 100644
--- a/av1/common/idct.h
+++ b/av1/common/idct.h
@@ -39,8 +39,9 @@
                                    CctxType cctx_type);
 #endif  // CONFIG_CROSS_CHROMA_TX
 
-void av1_inverse_transform_block(const MACROBLOCKD *xd, tran_low_t *dqcoeff,
-                                 int plane, TX_TYPE tx_type, TX_SIZE tx_size,
+void av1_inverse_transform_block(const MACROBLOCKD *xd,
+                                 const tran_low_t *dqcoeff, int plane,
+                                 TX_TYPE tx_type, TX_SIZE tx_size,
                                  uint16_t *dst, int stride, int eob,
                                  int reduced_tx_set);
 void av1_highbd_iwht4x4_add(const tran_low_t *input, uint16_t *dest, int stride,