Use 1 sample per neighbor for local warping model estimation

Only 1 sample needs to be collected from each neighbor. A maximum of
8 neighbors is used.
In the least-squares (LS) estimation, each projection sample
(sx, sy)->(dx, dy) is intentionally smoothed by assuming that 3
shifted versions, (sx, sy+n)->(dx, dy+n), (sx+n, sy)->(dx+n, dy), and
(sx+n, sy+n)->(dx+n, dy+n), also contribute to the estimation.
For example, instead of using A[0] = sx^2, we use the sum of squares
of the source x over the four points:
A[0] += 4*sx^2 + 4*n*sx + 2*n^2.
In terms of computational cost, this adds little overhead. Coding
gain is mostly the same as the old version. Without the smoothing,
we would lose 0.3% on lowres.
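
As a sanity check on the smoothed A[0] term, here is a minimal sketch,
not the code in this change. The helper names are hypothetical. Two of
the four points have x = sx and two have x = sx+n, so the sum of
squares expands to 4*sx^2 + 4*n*sx + 2*n^2.

```c
#include <stdint.h>

/* Hypothetical helper names for illustration; not taken from libaom. */

/* Sum of x^2 over the four points (sx, sy), (sx, sy+n), (sx+n, sy),
 * (sx+n, sy+n): two points have x = sx and two have x = sx + n. */
static int64_t smoothed_sum_sq_brute(int32_t sx, int32_t n) {
  const int64_t xs[4] = { sx, sx, (int64_t)sx + n, (int64_t)sx + n };
  int64_t sum = 0;
  for (int i = 0; i < 4; ++i) sum += xs[i] * xs[i];
  return sum;
}

/* Closed form of the same sum: 4*sx^2 + 4*n*sx + 2*n^2. */
static int64_t smoothed_sum_sq_closed(int32_t sx, int32_t n) {
  const int64_t s = sx, k = n;
  return 4 * s * s + 4 * k * s + 2 * k * k;
}
```

The closed form needs no per-point loop, which is consistent with the
smoothing adding little overhead.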

Change-Id: I04be32cffa525f7dc8ee583c0bf211d7bdc6e609
diff --git a/av1/common/blockd.h b/av1/common/blockd.h
index 3cd0a02..12e5eab 100644
--- a/av1/common/blockd.h
+++ b/av1/common/blockd.h
@@ -1113,7 +1113,7 @@
     if (!check_num_overlappable_neighbors(mbmi)) return SIMPLE_TRANSLATION;
 #endif
 #if CONFIG_WARPED_MOTION
-    if (!has_second_ref(mbmi) && mbmi->num_proj_ref[0] >= 3)
+    if (!has_second_ref(mbmi) && mbmi->num_proj_ref[0] >= 1)
       return WARPED_CAUSAL;
     else
 #endif  // CONFIG_WARPED_MOTION