Reducing the number of foreach_transformed_block() calls.

The change doesn't affect the bitstream. It changes the order or function
calls and affects how we reconstruct intra- and inter-blocks. Speed up is
about 1...1.5%.

For intra-blocks:
  Before:
    for each transform block read tokens
    for each transform block do prediction
    for each transform block do inverse transform
  Now:
    for each transform block
      read tokens
      do prediction
      do inverse transform

For inter-blocks:
  Before:
    for each transform block read tokens
    for each transform block do inverse transform
  Now:
    for each transform block
      read tokens
      do inverse transform

Change-Id: I12a79bf1aa5a18c351b8010369bd3ff1deae1570
3 files changed