Fix a performance regression

This commit restores the SSE2 or SSSE3 optimized versions of a couple
of functions, which fixes a ~10% performance regression.

Change-Id: I049786906e5a641224dced63c6492aec9d86d183
1 file changed