Add an SSE2 version of vp9_iwht4x4_16_add.

Takes 80% fewer cycles than the C version.

Change-Id: I841bde1e268ddd33ae2ee75eee94737a400e2cde
4 files changed