  1. Nov 17, 2013
    • Add a loop rerolling pass · bf45efde
      Hal Finkel authored
      This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The
      transformation aims to take loops like this:
      
      for (int i = 0; i < 3200; i += 5) {
        a[i]     += alpha * b[i];
        a[i + 1] += alpha * b[i + 1];
        a[i + 2] += alpha * b[i + 2];
        a[i + 3] += alpha * b[i + 3];
        a[i + 4] += alpha * b[i + 4];
      }
      
      and turn them into this:
      
      for (int i = 0; i < 3200; ++i) {
        a[i] += alpha * b[i];
      }
      
      and loops like this:
      
      for (int i = 0; i < 500; ++i) {
        x[3*i] = foo(0);
        x[3*i+1] = foo(0);
        x[3*i+2] = foo(0);
      }
      
      and turn them into this:
      
      for (int i = 0; i < 1500; ++i) {
        x[i] = foo(0);
      }
      
      There are two motivations for this transformation:
      
        1. Code-size reduction (especially relevant, obviously, when compiling for
      code size).
      
        2. Providing greater choice to the loop vectorizer (and generic unroller) to
      choose the unrolling factor (and a better ability to vectorize). The loop
      vectorizer can take vector lengths and register pressure into account when
      choosing an unrolling factor, for example, and a pre-unrolled loop limits that
      choice. This is especially problematic if the manual unrolling was optimized
      for a machine different from the current target.
      
      The current implementation is limited to single basic-block loops only. The
      rerolling recognition should work regardless of how the loop iterations are
      intermixed within the loop body (subject to dependency and side-effect
      constraints), but the significant restriction is that the order of the
      instructions in each iteration must be identical. This seems sufficient to
      capture all current use cases.
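
      As a rough illustration of the equivalence the pass relies on, the sketch
      below writes the first example as two ordinary C++ functions and lets a
      caller compare their results. The function names are hypothetical, and
      integer data is an assumption made so the comparison is exact; this is not
      the pass itself, only the before/after shape of the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-unrolled form: five copies of the body, induction variable steps by 5.
// (Hypothetical helper; assumes the length is a multiple of 5.)
std::vector<int> saxpy_unrolled(std::vector<int> a, const std::vector<int> &b,
                                int alpha) {
  assert(a.size() == b.size() && a.size() % 5 == 0);
  for (std::size_t i = 0; i < a.size(); i += 5) {
    a[i]     += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
  return a;
}

// Rerolled form: one copy of the body, induction variable steps by 1.
std::vector<int> saxpy_rerolled(std::vector<int> a, const std::vector<int> &b,
                                int alpha) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    a[i] += alpha * b[i];
  }
  return a;
}
```

      Both functions should return the same vector for any input whose length is
      a multiple of 5, which is the property a rerolling transformation has to
      preserve.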
      
      This pass is not currently enabled by default at any optimization level.
      
      llvm-svn: 194939
      bf45efde
  2. Nov 15, 2013
    • ArgumentPromotion: correctly transfer TBAA tags and alignments. · bc37658a
      Manman Ren authored
      We used to use std::map<IndicesVector, LoadInst*> for OriginalLoads. When we
      tried to promote two arguments, both wrote to OriginalLoads, so the loads
      created for the two arguments could end up with the same original load, and
      thus with the same TBAA tag and alignment.
      
      The fix is to use std::map<std::pair<Argument*, IndicesVector>, LoadInst*>
      for OriginalLoads, so each Argument will write to different parts of the map.
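
      The key collision can be reproduced with small stand-in types. In the
      sketch below, Argument and LoadInst are simplified placeholders rather
      than the LLVM classes, and both helper functions are hypothetical; only
      the two map types come from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Simplified placeholders for the LLVM classes (assumptions for illustration).
struct Argument { int id; };
struct LoadInst { unsigned alignment; };
using IndicesVector = std::vector<std::uint64_t>;

using OldLoads = std::map<IndicesVector, LoadInst *>;
using NewLoads = std::map<std::pair<Argument *, IndicesVector>, LoadInst *>;

// Old keying: two arguments loaded through the same indices share one slot,
// so the second store replaces the first.
unsigned alignment_seen_for_first_old(LoadInst &first, LoadInst &second) {
  OldLoads loads;
  const IndicesVector indices{0};
  loads[indices] = &first;
  loads[indices] = &second;  // same key: overwrites the entry for `first`
  return loads.at(indices)->alignment;
}

// New keying: the argument is part of the key, so each argument keeps its
// own original load.
unsigned alignment_seen_for_first_new(Argument &arg1, LoadInst &first,
                                      Argument &arg2, LoadInst &second) {
  NewLoads loads;
  const IndicesVector indices{0};
  loads[std::make_pair(&arg1, indices)] = &first;
  loads[std::make_pair(&arg2, indices)] = &second;
  return loads.at(std::make_pair(&arg1, indices))->alignment;
}
```

      With loads of alignment 4 and 16, the old keying reports 16 for the first
      argument while the new keying reports 4. The example assumes both
      Argument objects live in the same array so that comparing their addresses
      is well defined.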
      
      PR17906
      
      llvm-svn: 194846
      bc37658a
  3. Nov 12, 2013
    • Correctly merge constants with explicit and implicit alignments. · dd8757ab
      Rafael Espindola authored
      Constant merge can merge a constant with implicit alignment with one that has
      explicit alignment. Before this change it was assuming that the explicit
      alignment was higher than the implicit one, causing the result to be
      under-aligned in some cases.
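
      One way to picture the problem is the simplified model below. It assumes
      that an explicit alignment of 0 means "implicit", in which case a default
      alignment for the type applies. The struct, its fields, and both
      functions are hypothetical and are not the ConstantMerge code.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of a global constant. explicit_align == 0 means the
// constant has no explicit alignment and falls back to type_default_align.
struct GlobalConst {
  unsigned explicit_align;
  unsigned type_default_align;
};

// Alignment the constant actually requires.
unsigned effective_alignment(const GlobalConst &c) {
  return c.explicit_align != 0 ? c.explicit_align : c.type_default_align;
}

// Model of the faulty assumption: only explicit values are compared, so an
// implicit (0) alignment always loses to any explicit one.
unsigned merged_alignment_buggy(const GlobalConst &a, const GlobalConst &b) {
  return std::max(a.explicit_align, b.explicit_align);
}

// Safer rule: resolve both sides to their effective alignment first.
unsigned merged_alignment(const GlobalConst &a, const GlobalConst &b) {
  return std::max(effective_alignment(a), effective_alignment(b));
}
```

      Merging a constant with explicit alignment 1 and one with an implicit
      alignment that resolves to 8 gives 1 under the faulty model, which is
      under-aligned, and 8 under the safer rule.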
      
      Fixes pr17815.
      
      Patch by Chris Smowton!
      
      llvm-svn: 194506
      dd8757ab
  4. Nov 10, 2013
  5. Nov 04, 2013
  6. Nov 03, 2013
  7. Oct 31, 2013
    • Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". · 282a4703
      Rafael Espindola authored
      There are two ways one could implement hiding of linkonce_odr symbols in LTO:
      * LLVM tells the linker which symbols can be hidden if not used from native
        files.
      * The linker tells LLVM which symbols are not used from other object files,
        but will be put in the dso symbol table if present.
      
      GOLD's API is the second option. It was implemented almost 1:1 in llvm by
      passing the list down to internalize.
      
      LLVM already had partial support for the first option. It is also very similar
      to how ld64 handles hiding these symbols when *not* doing LTO.
      
      This patch then
      * removes the APIs for the DSO list.
      * marks as LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr
        global values and other linkonce_odr values whose address is not used.
      * makes the gold plugin responsible for handling the API mismatch.
      
      llvm-svn: 193800
      282a4703
    • Merge CallGraph and BasicCallGraph. · 6554e5a9
      Rafael Espindola authored
      llvm-svn: 193734
      6554e5a9
  8. Oct 27, 2013
  9. Oct 23, 2013
  10. Oct 22, 2013
  11. Oct 21, 2013
  12. Oct 19, 2013
  13. Oct 17, 2013
  14. Oct 09, 2013
    • Fix a bug in Dead Argument Elimination. · 1cab418c
      Shuxin Yang authored
        If a function seen at compile time is not necessarily the one linked to
      the binary being built, it is illegal to change the actual arguments
      passed to it.
      
        e.g. 
         --------------------------
         void foo(int lol) {
           // foo() has linkage satisfying isWeakForLinker()
           // "lol" is not used at all.
         }
      
         void bar(int lol2) {
            // xform to foo(undef) is illegal, as the compiler does not know which
            // instance of foo() will be linked to the binary being built.
            foo(lol2);
         }
        -----------------------------
      
        Such functions can be captured by isWeakForLinker(). NOTE that
      mayBeOverridden() is insufficient for this purpose as it doesn't include
      linkage types like AvailableExternallyLinkage and LinkOnceODRLinkage.
      Take linkonce_odr* as an example: it indicates a set of *EQUIVALENT* globals
      that can be merged at link-time. However, the semantics of
      *EQUIVALENT* functions include their parameters. Changing parameters breaks
      the assumption.
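
        The difference between the two predicates can be modelled with a plain
      enum. The sketch below is a simplified stand-in, not the LLVM GlobalValue
      API: the enum lists only some linkage types, and which linkages each
      predicate accepts is stated here as an assumption based on the note above.

```cpp
#include <cassert>

// Simplified subset of linkage types (assumption: not the full LLVM list).
enum class Linkage {
  External, Internal, AvailableExternally,
  LinkOnceAny, LinkOnceODR, WeakAny, WeakODR, Common, ExternalWeak
};

// Model of mayBeOverridden(): the definition may be replaced by a
// non-equivalent one.
bool may_be_overridden(Linkage l) {
  return l == Linkage::WeakAny || l == Linkage::LinkOnceAny ||
         l == Linkage::Common || l == Linkage::ExternalWeak;
}

// Model of isWeakForLinker(): the copy seen here is not necessarily the one
// that ends up in the binary, even if all copies are "equivalent".
bool is_weak_for_linker(Linkage l) {
  return l != Linkage::External && l != Linkage::Internal;
}

// Hypothetical guard for rewriting a call such as foo(lol2) into foo(undef).
bool can_replace_unused_actuals_with_undef(Linkage callee_linkage) {
  return !is_weak_for_linker(callee_linkage);
}
```

        In this model a LinkOnceODR callee passes a check based on
      mayBeOverridden() but is correctly rejected by the isWeakForLinker()-based
      guard.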
      
        Thanks to John McCall for his help, especially for explaining the subtle
      differences between linkage types.
      
        rdar://11546243
      
      llvm-svn: 192302
      1cab418c
  15. Oct 07, 2013
  16. Oct 03, 2013
    • Optimize linkonce_odr unnamed_addr functions during LTO. · cda2911c
      Rafael Espindola authored
      Generalize the API so we can distinguish symbols that are needed just for a DSO
      symbol table from those that are used from some native .o.
      
      The symbols that are only wanted for the dso symbol table can be dropped if
      llvm can prove every other dso has a copy (linkonce_odr) and the address is not
      important (unnamed_addr).
      
      llvm-svn: 191922
      cda2911c
  17. Oct 02, 2013
  18. Oct 01, 2013
    • Don't merge tiny functions. · 517d84e2
      Matt Arsenault authored
      It's silly to merge functions like these:
      
      define void @foo(i32 %x) {
        ret void
      }
      
      define void @bar(i32 %x) {
        ret void
      }
      
      to get
      
      define void @bar(i32) {
        tail call void @foo(i32 %0)
        ret void
      }
      
      llvm-svn: 191786
      517d84e2
  19. Sep 22, 2013
  20. Sep 17, 2013
    • Bugfix for PR17099: · dc2c4b44
      Stepan Dyatkovskiy authored
      Wrong cast operation.
      MergeFunctions emitted a Bitcast where a pointer-to-integer operation was needed.
      The patch fixes MergeFunctions::writeThunk by replacing the unconditional
      Bitcast creation with a "Value* createCast(...)" method that checks the
      operand types and selects the proper instruction.
      See the unit test for an example.
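
      The selection logic can be sketched without the LLVM headers. The enums
      and the select_cast function below are hypothetical stand-ins for the
      type checks and instruction kinds involved; the sketch assumes the only
      special cases are integer-to-pointer and pointer-to-integer, with Bitcast
      as the fallback.

```cpp
#include <cassert>

// Hypothetical, simplified view of an operand's type.
enum class TypeKind { Integer, Pointer, Other };

// Kinds of cast instruction the helper can choose between.
enum class CastOp { BitCast, IntToPtr, PtrToInt };

// Pick a cast based on source and destination types instead of always
// emitting a Bitcast.
CastOp select_cast(TypeKind src, TypeKind dst) {
  if (src == TypeKind::Integer && dst == TypeKind::Pointer)
    return CastOp::IntToPtr;
  if (src == TypeKind::Pointer && dst == TypeKind::Integer)
    return CastOp::PtrToInt;
  return CastOp::BitCast;
}
```

      Under this model a pointer-to-integer conversion no longer falls through
      to Bitcast, which is the case the bug report describes.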
      
      llvm-svn: 190859
      dc2c4b44
  21. Sep 16, 2013
  22. Sep 13, 2013
  23. Sep 11, 2013
  24. Sep 10, 2013
  25. Sep 05, 2013
  26. Sep 04, 2013
  27. Sep 03, 2013
    • Enable late-vectorization by default. · 5d78dba6
      Nadav Rotem authored
      This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is run.
      
      Perf gains:
      SingleSource/Benchmarks/Shootout/matrix -37.33%
      MultiSource/Benchmarks/PAQ8p/paq8p  -22.83%
      SingleSource/Benchmarks/Linpack/linpack-pc  -16.22%
      SingleSource/Benchmarks/Shootout-C++/ary3 -15.16%
      MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34%
      MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12%
      
      Regressions:
      SingleSource/Benchmarks/Misc/lowercase  15.10%
      MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18%
      SingleSource/Benchmarks/Shootout-C++/matrix 8.27%
      SingleSource/Benchmarks/CoyoteBench/lpbench 7.30%
      
      llvm-svn: 189858
      5d78dba6
  28. Aug 30, 2013
  29. Aug 29, 2013
    • Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC... · 4c459bcd
      Nadav Rotem authored
      Vectorizer/PassManager:  I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:
      1. They are a kind of canonicalization.
      2. The performance measurements show that it is better to keep them in.
      
      There should be no functional change if you are not enabling the LateVectorization mode.
      
      llvm-svn: 189539
      4c459bcd