x86: enable SSE4.2 in x86_simd_caps

x86_simd_caps() computes the flags used to select SIMD variants.
Although av1_get_crc32c_value() provides an SSE4.2 variant,
x86_simd_caps() never sets HAS_SSE4_2, so that variant is never
invoked. This patch checks the SSE4.2 CPUID feature bit (ECX bit 20)
and sets HAS_SSE4_2 when the bit is present.
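
The detection described above can be sketched in isolation. This is
a minimal illustration, not the code in aom_ports/x86.h: the EX_*
names and flag values are hypothetical, and only the ECX bit
positions (0, 9, 19, 20) are taken from the diff below. The function
takes the ECX value as a parameter so it can be exercised without
executing CPUID.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only; the real HAS_*
 * constants are defined elsewhere in the tree and may differ. */
#define EX_HAS_SSE3   0x01u
#define EX_HAS_SSSE3  0x02u
#define EX_HAS_SSE4_1 0x04u
#define EX_HAS_SSE4_2 0x08u

#define EX_BIT(n) (1u << (n))
/* True only when every bit of `mask` is set in `reg`. */
#define EX_FEATURE_SET(reg, mask) (((reg) & (mask)) == (mask))

/* Maps the ECX feature register to capability flags.
 * Bit positions follow the *_BITS definitions shown in the diff. */
static uint32_t ex_caps_from_ecx(uint32_t reg_ecx) {
  uint32_t flags = 0;
  flags |= EX_FEATURE_SET(reg_ecx, EX_BIT(0)) ? EX_HAS_SSE3 : 0;
  flags |= EX_FEATURE_SET(reg_ecx, EX_BIT(9)) ? EX_HAS_SSSE3 : 0;
  flags |= EX_FEATURE_SET(reg_ecx, EX_BIT(19)) ? EX_HAS_SSE4_1 : 0;
  /* The check this patch adds: ECX bit 20 signals SSE4.2. */
  flags |= EX_FEATURE_SET(reg_ecx, EX_BIT(20)) ? EX_HAS_SSE4_2 : 0;
  return flags;
}
```

A caller would test the result with something like
`if (ex_caps_from_ecx(ecx) & EX_HAS_SSE4_2)` before choosing the
SSE4.2 code path.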

Encoder performance results averaged over all resolutions are as
follows:

    Encoder cpu    Instruction Count Reduction (%)
    2              1.17
    3              1.34
    4              0.98
    5              1.21
    6              1.41

This change is bit-exact for all presets.
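
Bit-exactness is expected because the SSE4.2 crc32 instruction
computes CRC-32C (Castagnoli polynomial), so a hardware variant and a
portable variant of the same checksum should agree on every input.
The sketch below is a plain bitwise CRC-32C for reference only. The
ex_* names are hypothetical, and the 0xFFFFFFFF initial value and
final XOR are the common CRC-32C convention, assumed here rather than
taken from av1_get_crc32c_value().

```c
#include <stddef.h>
#include <stdint.h>

/* Feeds `len` bytes into a running CRC-32C state, one bit at a time.
 * 0x82F63B78 is the bit-reflected Castagnoli polynomial. */
static uint32_t ex_crc32c_update(uint32_t crc, const uint8_t *buf,
                                 size_t len) {
  for (size_t i = 0; i < len; ++i) {
    crc ^= buf[i];
    for (int k = 0; k < 8; ++k) {
      /* Shift right; XOR in the polynomial when the low bit was set. */
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc;
}

/* One-shot CRC-32C with the conventional (assumed) init and final XOR. */
static uint32_t ex_crc32c(const uint8_t *buf, size_t len) {
  return ex_crc32c_update(0xFFFFFFFFu, buf, len) ^ 0xFFFFFFFFu;
}
```

With these conventions, the standard check string "123456789" yields
0xE3069283, which is a convenient value for comparing any two
variants.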

Change-Id: Ic372e4647e586ccfe81e2c75c0da7ad43e7c2921
diff --git a/aom_ports/x86.h b/aom_ports/x86.h
index 6ce163e..4d00d72 100644
--- a/aom_ports/x86.h
+++ b/aom_ports/x86.h
@@ -179,6 +179,7 @@
#define SSE3_BITS BIT(0)
#define SSSE3_BITS BIT(9)
#define SSE4_1_BITS BIT(19)
+#define SSE4_2_BITS BIT(20)
// Bits 27 (OSXSAVE) & 28 (256-bit AVX)
#define AVX_BITS (BIT(27) | BIT(28))
#define AVX2_BITS BIT(5)
@@ -221,6 +222,7 @@
flags |= FEATURE_SET(reg_ecx, SSE3) ? HAS_SSE3 : 0;
flags |= FEATURE_SET(reg_ecx, SSSE3) ? HAS_SSSE3 : 0;
flags |= FEATURE_SET(reg_ecx, SSE4_1) ? HAS_SSE4_1 : 0;
+ flags |= FEATURE_SET(reg_ecx, SSE4_2) ? HAS_SSE4_2 : 0;
// bits 27 (OSXSAVE) & 28 (256-bit AVX)
if (FEATURE_SET(reg_ecx, AVX)) {