{
  "commit": "486cc9894b7e76b09b4ee37dff6f313f27b1c501",
  "tree": "5fbee154d908ccc551811db5d44890225b9972c5",
  "parents": [
    "03d8ebedcddb17eb6f8eae993a09413c67097ac8"
  ],
  "author": {
    "name": "David Turner",
    "email": "david.turner@argondesign.com",
    "time": "Fri Nov 09 15:48:58 2018 +0000"
  },
  "committer": {
    "name": "Debargha Mukherjee",
    "email": "debargha@google.com",
    "time": "Sat Nov 10 22:50:00 2018 +0000"
  },
  "message": "SSE3-optimised av1_nn_predict\n\nI have developed a SIMD-optimised neural network implementation using\nSSE3.  I have also added functional equivalence tests between this and\nthe original implementation.  I added aom_clear_system_state() to a few\nplaces where FPU operations are used after av1_nn_predict.\n\nSpeed-ups over the original C implementation for various network shapes:\n10x64x16: 1.72x\n12x12x1:  2.72x\n12x24x1:  2.35x\n12x32x1:  3.34x\n18x24x4:  0.94x\n18x32x4:  0.93x\n4x16x1:   2.01x\n8x16x1:   1.89x\n8x16x4:   2.02x\n8x24x1:   2.77x\n8x32x1:   2.98x\n8x64x1:   3.76x\n9x32x3:   1.08x\n4x8x4:    1.66x\n\nA few awkwardly-shaped networks are slightly slower: these could be\npadded to more convenient sizes to use the SIMD kernels.\n\nI also wrote an AVX/AVX2 implementation but on these relatively small\nnetworks it was barely faster than the SSE3 code.\n\nChange-Id: I6a72be12cb7df8cf946578c3e01b21a439377d45\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "7976746662dbac0a43500cf04bc08270479e4da1",
      "old_mode": 33188,
      "old_path": "av1/av1.cmake",
      "new_id": "8c926150dc091d82f89e7eb5c162d0a6c2faef4b",
      "new_mode": 33188,
      "new_path": "av1/av1.cmake"
    },
    {
      "type": "modify",
      "old_id": "e38f897c4ca0084e26fefe8635d1e4c0be05c310",
      "old_mode": 33261,
      "old_path": "av1/common/av1_rtcd_defs.pl",
      "new_id": "c167c65681353fbd0b7401b0b5b7314cc2efb0a8",
      "new_mode": 33261,
      "new_path": "av1/common/av1_rtcd_defs.pl"
    },
    {
      "type": "modify",
      "old_id": "d5833f21e6e9aeae279d369d16d589651d023434",
      "old_mode": 33188,
      "old_path": "av1/encoder/encodeframe.c",
      "new_id": "e5b71206b6687898c6486ca152d9fe1dc1011b66",
      "new_mode": 33188,
      "new_path": "av1/encoder/encodeframe.c"
    },
    {
      "type": "modify",
      "old_id": "d21def43a847baed3d624d35f6d3671da7242d09",
      "old_mode": 33188,
      "old_path": "av1/encoder/ml.c",
      "new_id": "ad664acf1c80a7e66fce35dd7ea2c9a8a03e2fd2",
      "new_mode": 33188,
      "new_path": "av1/encoder/ml.c"
    },
    {
      "type": "modify",
      "old_id": "cb8ef2871b78525d7ffc65a6b5d53192737aaa23",
      "old_mode": 33188,
      "old_path": "av1/encoder/ml.h",
      "new_id": "7f2750b31ddf693191ec1aef7f08354572a1e5e3",
      "new_mode": 33188,
      "new_path": "av1/encoder/ml.h"
    },
    {
      "type": "modify",
      "old_id": "3c1cd8f2139127fc9ddb895e6a8e0df728e35cae",
      "old_mode": 33188,
      "old_path": "av1/encoder/rdopt.c",
      "new_id": "5b7747ec9b03456f6622668e8a1b309ab5d0d7e0",
      "new_mode": 33188,
      "new_path": "av1/encoder/rdopt.c"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "c520c3c356ea526ecef383ae6f1dcb7f27c187c0",
      "new_mode": 33188,
      "new_path": "av1/encoder/x86/ml_sse3.c"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "bda8868d79bd498ae6d106ae6def171aaa0678a1",
      "new_mode": 33188,
      "new_path": "test/av1_nn_predict_test.cc"
    },
    {
      "type": "modify",
      "old_id": "d15e5806fd56effbe9c1274f6ed504e6ae5f9eec",
      "old_mode": 33188,
      "old_path": "test/test.cmake",
      "new_id": "12f231973d7790f19038019fcb39b838540905f2",
      "new_mode": 33188,
      "new_path": "test/test.cmake"
    }
  ]
}
