Add av1_convolve_2d_facade

When convolve_round is on, av1_convolve_2d_facade is used for
interpolation instead of av1_convolve. The convolve_round experiment
code will be removed from av1_convolve in a separate CL.

For now, the intermediate stage rounds by 4 bits, in addition to the
post-rounding applied after the last stage in compound mode.
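
A rough sketch of the rounding scheme described above, not the actual
libaom code. It assumes 2-tap filters for brevity and 7-bit filter
precision. All names (convolve_2d_sketch, post_round_sketch, round_shift,
MAX_W, MAX_H) are hypothetical. Only the 4-bit intermediate rounding is
taken from this message. The final shift amount is an assumption chosen
so that the result returns to pixel range.

```c
#include <assert.h>
#include <stdint.h>

#define FILTER_BITS 7  /* assumed: filter taps sum to 1 << 7 */
#define ROUND_BITS_0 4 /* intermediate rounding, per this message */
#define MAX_W 8        /* sketch-only block size limits */
#define MAX_H 8

/* Round-to-nearest right shift; assumes arithmetic shift for negatives. */
static int32_t round_shift(int32_t v, int bits) {
  return (v + (1 << (bits - 1))) >> bits;
}

/* Hypothetical separable 2-tap convolve over a w x h block. The source
 * must provide w + 1 columns and h + 1 rows. The result is ADDED to acc
 * at high precision, with no rounding after the vertical pass. */
static void convolve_2d_sketch(const uint8_t *src, int src_stride,
                               int32_t *acc, int acc_stride, int w, int h,
                               const int16_t fx[2], const int16_t fy[2]) {
  int32_t im[(MAX_H + 1) * MAX_W];
  assert(w > 0 && w <= MAX_W && h > 0 && h <= MAX_H);

  /* Horizontal pass: round the intermediate result by ROUND_BITS_0. */
  for (int y = 0; y < h + 1; ++y) {
    for (int x = 0; x < w; ++x) {
      const int32_t sum = src[y * src_stride + x] * fx[0] +
                          src[y * src_stride + x + 1] * fx[1];
      im[y * w + x] = round_shift(sum, ROUND_BITS_0);
    }
  }

  /* Vertical pass: accumulate without rounding. */
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      acc[y * acc_stride + x] +=
          im[y * w + x] * fy[0] + im[(y + 1) * w + x] * fy[1];
    }
  }
}

/* Post-rounding: one final shift back to 8-bit pixels, with clamping. */
static void post_round_sketch(const int32_t *acc, int acc_stride,
                              uint8_t *dst, int dst_stride, int w, int h,
                              int bits) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      const int32_t v = round_shift(acc[y * acc_stride + x], bits);
      dst[y * dst_stride + x] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
  }
}
```

In this sketch a single prediction would be post-rounded by
2 * FILTER_BITS - ROUND_BITS_0 bits. A compound prediction calls
convolve_2d_sketch twice on the same zero-initialized acc buffer and
post-rounds by one extra bit, so the average of the two predictions is
rounded once rather than once per prediction.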

This gives a gain of roughly 0.45% on lowres, 0.39% on midres and
roughly 0.6-0.7% on hdres.
Altogether, the gain is 1.15% on lowres, 0.74% on midres and roughly
1.7-1.8% on hdres.

Note that this CL does not restrict the use of the 12-tap filter.
Adding that restriction would cost roughly 0.1% on lowres.

Change-Id: I6332e1d888e28a3b3ddc29711817d66e52cb5cdf
4 files changed