Apply the rect fwd tx changes to the SSE2 optimization

- Apply the changes to tx_size 4x8, 8x4, 8x16, and 16x8.
- Turn on the corresponding unit tests for SSE2.
- Partially fix aomedia:113.

Change-Id: I29d15540ab8e9e3681e9caa54e5162bcbbd7af11
5 files changed