Apply the rect fwd tx changes to SSE2 optimization

- Apply changes on tx_size: 4x8, 8x4, 8x16, 16x8.
- Turn on corresponding unit tests on SSE2.
- Partially fix aomedia:113.

Change-Id: I29d15540ab8e9e3681e9caa54e5162bcbbd7af11