vp8 quantization -> intrinsics

Use intrinsics for neon quantization. Slight loss (<5%) of performance
compared to the assembly. Roughly 10x faster on arm64 because that was
running C code before.

Change-Id: I7cf5242d8f29b7eab5bca6a1c20c89c9fc9ca66d
4 files changed