Inline Intrinsic optimized Denoiser

Faster version of denoiser, cut cost by 1.7x for C path, by 3.3x for
SSE2 path.

Change-Id: I154786308550763bc0e3497e5fa5bfd1ce651beb
6 files changed