Faster AVX2 implementation of motion compensation modules

Speed up av1_convolve_y_avx2 (~1.5x faster), av1_convolve_y_sr_avx2
(~1.8x faster) and av1_convolve_2d_sr_avx2 (~1.3x faster).

Change-Id: Iaed764a7c4d069a4180c3edb0b1ac57ad36dad21
diff --git a/aom_dsp/aom_dsp.cmake b/aom_dsp/aom_dsp.cmake
index f61af74..6c1d89e 100644
--- a/aom_dsp/aom_dsp.cmake
+++ b/aom_dsp/aom_dsp.cmake
@@ -81,6 +81,7 @@
     "${AOM_ROOT}/aom_dsp/x86/intrapred_avx2.c"
     "${AOM_ROOT}/aom_dsp/x86/inv_txfm_avx2.c"
     "${AOM_ROOT}/aom_dsp/x86/common_avx2.h"
+    "${AOM_ROOT}/aom_dsp/x86/convolve_avx2.h"
     "${AOM_ROOT}/aom_dsp/x86/inv_txfm_common_avx2.h"
     "${AOM_ROOT}/aom_dsp/x86/txfm_common_avx2.h")