[BOLT] a new block reordering algorithm
Summary: A new block reordering algorithm, cache+, that is designed to optimize i-cache performance. On a high level, this algorithm is a greedy heuristic that merges clusters (ordered sequences) of basic blocks, similarly to how it is done in OptimizeCacheReorderAlgorithm. There are two important differences: (a) the metric that is optimized in the procedure, and (b) how two clusters are merged together. Initially all clusters are isolated basic blocks. On every iteration, we pick a pair of clusters whose merging yields the biggest increase in the ExtTSP metric (see CacheMetrics.cpp for exact implementation), which models how i-cache "friendly" a pecific cluster is. A pair of clusters giving the maximum gain is merged to a new clusters. The procedure stops when there is only one cluster left, or when merging does not increase ExtTSP. In the latter case, the remaining clusters are sorted by density. An important aspect is the way two clusters are merged. Unlike earlier algorithms (e.g., OptimizeCacheReorderAlgorithm or Pettis-Hansen), two clusters, X and Y, are first split into three, X1, X2, and Y. Then we consider all possible ways of gluing the three clusters (e.g., X1YX2, X1X2Y, X2X1Y, X2YX1, YX1X2, YX2X1) and choose the one producing the largest score. This improves the quality of the final result (the search space is larger) while keeping the implementation sufficiently fast. (cherry picked from FBD6466264)
Loading
Please sign in to comment