Change the evaluation order of luma palette_size in speed 0

Add separate handling for the speed 0 case, where the speed
feature prune_palette_search_level is 0. The luma palette_size
evaluation in av1_rd_pick_palette_intra_sby() now runs in
ascending rather than descending order, to facilitate early
exit based on the palette header rdcost.
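
As a rough illustration of why ascending order helps, the sketch below
walks palette sizes from smallest to largest and stops once the header
cost alone can no longer beat the best rdcost found so far. It is not
the libaom code: header_rd_cost(), full_rd_cost(), search_ascending()
and the cost numbers are hypothetical, and it assumes the header cost
does not decrease as the palette size grows. The constants take the
values 2 and 8 for illustration.

```c
#include <assert.h>
#include <limits.h>

#define PALETTE_MIN_SIZE 2
#define PALETTE_MAX_SIZE 8

/* Hypothetical header cost; assumed non-decreasing in palette size n. */
static long header_rd_cost(int n) { return 100L * n; }

/* Hypothetical full cost: header cost plus a made-up token cost. */
static long full_rd_cost(int n) {
  static const long token_cost[PALETTE_MAX_SIZE + 1] = {
    0, 0, 900, 500, 300, 350, 400, 450, 500
  };
  return header_rd_cost(n) + token_cost[n];
}

/* Evaluate sizes min_n..max_n in ascending order. Because the header
 * cost is assumed non-decreasing, once it reaches *best_rd no larger
 * size can win, so the loop exits early. The caller initializes
 * *best_rd (e.g. to LONG_MAX). Returns the number of sizes that were
 * fully evaluated. */
static int search_ascending(int min_n, int max_n, long *best_rd,
                            int *best_n) {
  int evaluated = 0;
  assert(min_n >= PALETTE_MIN_SIZE && max_n <= PALETTE_MAX_SIZE);
  assert(min_n <= max_n);
  for (int n = min_n; n <= max_n; ++n) {
    if (header_rd_cost(n) >= *best_rd) break; /* early exit */
    const long rd = full_rd_cost(n);
    ++evaluated;
    if (rd < *best_rd) {
      *best_rd = rd;
      *best_n = n;
    }
  }
  return evaluated;
}
```

With these made-up costs the search stops before reaching the largest
sizes. In descending order the same check could not terminate the loop,
since the remaining (smaller) sizes would still have cheaper headers.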

This change has negligible effect on speed and quality in speed 0
encoding for good and allintra modes. This change is verified to
be bit-exact for speed > 0 in good and allintra encoding modes.

For allintra video encode (on screen content set),

          Instruction Count        BD-Rate Loss(%)
cpu-used     Reduction(%)   avg.psnr  ovr.psnr    ssim
   0          -0.044        -0.0014   -0.0015     0.0000

For good video encode (on screen content set),

          Instruction Count        BD-Rate Loss(%)
cpu-used     Reduction(%)   avg.psnr  ovr.psnr    ssim
   0           0.031         0.0078    0.0056    -0.0094

For AVIF still image encode,

          Instruction Count    BD-Rate Loss(%)
cpu-used     Reduction(%)      psnr       ssim
   0           -0.001          0.0001     0.0000

BUG=aomedia:3096
BUG=aomedia:2959

STATS_CHANGED

Change-Id: I9c9e55813af268c260c6ee0ee30842520cb991f3
diff --git a/av1/encoder/palette.c b/av1/encoder/palette.c
index 04d2390..1b0bf62 100644
--- a/av1/encoder/palette.c
+++ b/av1/encoder/palette.c
@@ -600,6 +600,35 @@
             distortion, skippable, beat_best_rd, ctx, best_blk_skip,
             tx_type_map, color_map, rows * cols);
       }
+    } else if (cpi->sf.intra_sf.prune_palette_search_level == 0) {
+      const int max_n = AOMMIN(colors, PALETTE_MAX_SIZE),
+                min_n = PALETTE_MIN_SIZE;
+      // Perform top color palette search in ascending order.
+      perform_top_color_palette_search(
+          cpi, x, mbmi, bsize, dc_mode_cost, data, top_colors, min_n, max_n + 1,
+          1, /*do_header_rd_based_gating=*/false, &unused, color_cache, n_cache,
+          best_mbmi, best_palette_color_map, best_rd, rate, rate_tokenonly,
+          distortion, skippable, beat_best_rd, ctx, best_blk_skip, tx_type_map);
+      // K-means clustering.
+      if (colors == PALETTE_MIN_SIZE) {
+        // Special case: These colors automatically become the centroids.
+        assert(colors == 2);
+        centroids[0] = lower_bound;
+        centroids[1] = upper_bound;
+        palette_rd_y(cpi, x, mbmi, bsize, dc_mode_cost, data, centroids, colors,
+                     color_cache, n_cache, /*do_header_rd_based_gating=*/false,
+                     best_mbmi, best_palette_color_map, best_rd, rate,
+                     rate_tokenonly, distortion, skippable, beat_best_rd, ctx,
+                     best_blk_skip, tx_type_map, NULL, NULL);
+      } else {
+        // Perform k-means palette search in ascending order.
+        perform_k_means_palette_search(
+            cpi, x, mbmi, bsize, dc_mode_cost, data, lower_bound, upper_bound,
+            min_n, max_n + 1, 1, /*do_header_rd_based_gating=*/false, &unused,
+            color_cache, n_cache, best_mbmi, best_palette_color_map, best_rd,
+            rate, rate_tokenonly, distortion, skippable, beat_best_rd, ctx,
+            best_blk_skip, tx_type_map, color_map, rows * cols);
+      }
     } else {
       const int max_n = AOMMIN(colors, PALETTE_MAX_SIZE),
                 min_n = PALETTE_MIN_SIZE;