Skip to content
Commit 8b8fd217 authored by Shuxin Yang's avatar Shuxin Yang
Browse files

Fix a defect in code-layout pass, improving Benchmarks/Olden/em3d/em3d by about 30%

(4.58s vs 3.2s on an oldish Mac Tower). 

  The corresponding src is excerpted bellow. The lopp accounts for about 90% of execution time.
  --------------------
    cat -n test-suite/MultiSource/Benchmarks/Olden/em3d/make_graph.c
     90 
     91         for (k=0; k<j; k++)
     92           if (other_node == cur_node->to_nodes[k]) break;

  The defective layout is sketched bellow, where the two branches need to swap.
  ------------------------------------------------------------------------
      L:
         ...
      if (cond) goto out-of-loop
      goto L

  While this code sequence is defective, I don't understand why it incurs 1/3 of 
execution time. CPU-event-profiling indicates the poor laoyout dose not increase
in br-misprediction; it dosen't increase stall cycle at all, and it dosen't 
prevent the CPU detect the loop (i.e. Loop-Stream-Detector seems to be working fine
as well)... 

   The root cause of the problem is that the layout pass calls AnalyzeBranch() 
with basic-block which is not updated to reflect its current layout.

rdar://13966341

llvm-svn: 183174
parent 6df859d8
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment