Replace division in warped motion least squares

Replaces the int64 and int32 divisions in least-squares and
gamma or delta computation with a mechanism that decomposes
the divisor D such that 1/D = y * 2^-k where y is obtained
from a lookup table indexed by 8 highest bits of the difference
D - 2^floor(log2(D)). The main complexity is now only from
computing this decomposition, which is essentially equivalent
to finding floor(log2(D)) (position of highest
bit in a 64-bit integer).

Also includes an out of memory bug fix and some cleanups.

Change-Id: I9247fdff5f6b4191175d4b4656357bfff626f02c
2 files changed