)]}'
{
  "commit": "32059d384c261a9719bf78e5ffd7d86c63841de5",
  "tree": "418c6f0b426c8ba45c6746465da9663ce9e567a5",
  "parents": [
    "d50b000ac8e16c7bdb2dd5ea2ea98dd2f63c3907"
  ],
  "author": {
    "name": "Jerome Jiang",
    "email": "jianj@google.com",
    "time": "Thu Jun 11 13:05:34 2026 -0400"
  },
  "committer": {
    "name": "Jerome Jiang",
    "email": "jianj@google.com",
    "time": "Mon Jun 15 11:00:15 2026 -0700"
  },
  "message": "Optimize Highway SIMD for av1_warp_affine\n\n1. Replaces template function pointer parameters with\ncompile-time enum/boolean template routing to eliminate\nindirect call overhead.\n2. Decays 2D LUT arrays to 1D pointers to completely bypass\nUBSan array bounds check overhead.\n3. Replaces AVX2 vector concatenation permutes with contiguous\nunaligned loads (hn::LoadU) to eliminate Port 5 permute\ncongestion.\n4. Accumulates vertical dot products directly onto\nres_add_const vertical rounding bias, saving post-filtering\nvector additions.\n5. Replaces row indexing multiplications in StoreRows with\nstrided pointer additions.\n\nThe result below is an aggregation of a/b/g/d values for each\nblock.\n\nAVX2 Speedup:\n Block  | Before  | After   | Speedup\n--------+---------+---------+---------\n 4x4    |   61 n |   45 n |  1.37x\n 4x8    |   79 n |   58 n |  1.35x\n 4x16   |  141 n |  106 n |  1.33x\n 8x4    |   62 n |   45 n |  1.38x\n 8x8    |   80 n |   59 n |  1.37x\n 8x16   |  145 n |  107 n |  1.36x\n 16x8   |  144 n |  106 n |  1.36x\n 16x16  |  272 n |  201 n |  1.35x\n 16x32  |  530 n |  392 n |  1.35x\n 32x8   |  270 n |  200 n |  1.35x\n 32x16  |  524 n |  389 n |  1.35x\n 32x32  |  1.0 u |  776 n |  1.34x\n 32x64  |  2.1 u |  1.5 u |  1.33x\n 64x32  |  2.1 u |  1.5 u |  1.35x\n 64x64  |  4.1 u |  3.1 u |  1.34x\n 64x128 |  8.3 u |  6.4 u |  1.30x\n 128x64 |  8.6 u |  6.8 u |  1.27x\n 128x128| 17.2 u | 14.1 u |  1.22x\n\nAVX512 Speedup:\n Block  | Before  | After   | Speedup\n--------+---------+---------+---------\n 4x4    |   63 n |   46 n |  1.38x\n 4x8    |   79 n |   56 n |  1.40x\n 4x16   |  139 n |  102 n |  1.36x\n 8x4    |   64 n |   46 n |  1.41x\n 8x8    |   81 n |   56 n |  1.45x\n 8x16   |  143 n |  102 n |  1.41x\n 16x8   |  140 n |  100 n |  1.40x\n 16x16  |  263 n |  190 n |  1.38x\n 16x32  |  507 n |  370 n |  1.37x\n 32x8   |  259 n |  189 n |  1.38x\n 32x16  |  500 n |  367 n |  1.36x\n 32x32  |  984 n |  726 n |  1.35x\n 32x64  |  1.9 u |  1.4 u |  1.35x\n 64x32  |  1.9 u |  1.4 u |  1.34x\n 64x64  |  3.8 u |  2.9 u |  1.34x\n 64x128 |  7.7 u |  6.0 u |  1.29x\n 128x64 |  8.1 u |  6.2 u |  1.30x\n 128x128| 16.3 u | 13.0 u |  1.25x\n\nChange-Id: I3e5ed8ccc0a0dd23c97e1ca9d89ad6f508a4a590\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "61d5fc6bcdcf35f8f039ef478f11a01546110f37",
      "old_mode": 33188,
      "old_path": "av1/common/warp_plane_hwy.h",
      "new_id": "da0d5450bc7c9be3c85fae8831406641b725369d",
      "new_mode": 33188,
      "new_path": "av1/common/warp_plane_hwy.h"
    }
  ]
}
