Fixed a computation bug in fdct16_sse2()

fdct16_sse2() was not bit-exact with C reference, fdct16().
The inconsistency was found by writing a unit test for
vp10_fht16x16_sse2().  Since the unit test needs a pending
change on the inherited base class.  I will commit this unit
test after making a header file for this base class.
Passed the uncommitted unit test: vp10_fht16x16_test.cc.

Change-Id: If2b617883c633a3ea90c19e1d018240c8007102b
1 file changed