Move width branch out of height loop

- The AVX2 copy and average functions are faster:
  Copy function: ~4%-57%
  Avg function:  ~17%-54%
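
A minimal scalar sketch of the restructuring, not the actual AVX2 code: the
width is tested once, and each case then runs its own height loop, so the
per-row loop body has no width branch. The names copy_rows and copy_block are
hypothetical, and memcpy stands in for the vector loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy h rows of w bytes each.
 * The loop body contains no branch on the width. */
static void copy_rows(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
                      ptrdiff_t dst_stride, size_t w, int h) {
  for (int y = 0; y < h; ++y) {
    memcpy(dst, src, w);
    src += src_stride;
    dst += dst_stride;
  }
}

/* Hypothetical entry point: branch on the width once, outside the
 * height loop, then hand off to a loop specialized for that width. */
static void copy_block(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
                       ptrdiff_t dst_stride, int w, int h) {
  switch (w) {
    case 4: copy_rows(src, src_stride, dst, dst_stride, 4, h); break;
    case 8: copy_rows(src, src_stride, dst, dst_stride, 8, h); break;
    case 16: copy_rows(src, src_stride, dst, dst_stride, 16, h); break;
    case 32: copy_rows(src, src_stride, dst, dst_stride, 32, h); break;
    case 64: copy_rows(src, src_stride, dst, dst_stride, 64, h); break;
    default: copy_rows(src, src_stride, dst, dst_stride, (size_t)w, h); break;
  }
}
```

Passing a constant width in each case lets the compiler inline copy_rows and
specialize the loop for that width.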

Change-Id: Ib1732cd90eb353379ef50ecbb1e207860969f1c3
2 files changed