Use aligned copy in 8x8 Hadamard transform SSE2

This reduces the 8x8 Hadamard transform cycles by 20%.

Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b
1 file changed