  1. Feb 21, 2014
  2. Feb 13, 2014
    • GlobalOpt: Aliases don't have sections, don't copy them when replacing · 22b19da9
      Reid Kleckner authored
      As defined in LangRef, aliases do not have sections.  However, LLVM's
      GlobalAlias class inherits from GlobalValue, which means we can read and
      set its section.  We should probably ban that as a separate change,
      since it doesn't make much sense for an alias to have a section that
      differs from its aliasee.
      
      Fixes PR18757, where the section was being lost on the global in code
      from Clang like:
      
      extern "C" {
      __attribute__((used, section("CUSTOM"))) static int in_custom_section;
      }
      
      Reviewers: rafael.espindola
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D2758
      
      llvm-svn: 201286
      22b19da9
  3. Feb 06, 2014
    • Set default of inlinecold-threshold to 225. · d4612449
      Manman Ren authored
      225 is the default value of inline-threshold. This change will make sure
      we have the same inlining behavior as prior to r200886.
      
      As Chandler points out, even though we don't have code in our testing
      suite that uses the cold attribute, there are larger applications that do
      use it.
      
      r200886 + this commit intend to keep the same behavior as prior to r200886.
      We can later on tune the inlinecold-threshold.
      
      The main purpose of r200886 is to help performance of instrumentation based
      PGO before we actually hook up inliner with analysis passes such as BPI and BFI.
      For instrumentation based PGO, we try to increase inlining of hot functions and
      reduce inlining of cold functions by setting inlinecold-threshold.
      
      Another option suggested by Chandler is to use a boolean flag that controls
      if we should use OptSizeThreshold for cold functions. The default value
      of the boolean flag should not change the current behavior. But it gives us
      less freedom in controlling inlining of cold functions.
      
      llvm-svn: 200898
      d4612449
    • Disable most IR-level transform passes on functions marked 'optnone'. · af4e64d0
      Paul Robinson authored
      Ideally only those transform passes that run at -O0 remain enabled;
      in reality we get as close as we reasonably can.
      Passes are responsible for disabling themselves; it's not the job of
      the pass manager to do it for them.
      
      llvm-svn: 200892
      af4e64d0
  4. Feb 05, 2014
  5. Feb 04, 2014
  6. Feb 03, 2014
    • inalloca: Don't remove dead arguments in the presence of inalloca args · d47a59a4
      Reid Kleckner authored
      It disturbs the layout of the parameters in memory and registers,
      leading to problems in the backend.
      
      The plan for optimizing internal inalloca functions going forward is to
      essentially SROA the argument memory and demote any captured arguments
      (things that aren't trivially written by a load or store) to an indirect
      pointer to a static alloca.
      
      llvm-svn: 200717
      d47a59a4
  7. Jan 28, 2014
  8. Jan 24, 2014
    • Fix known typos · cb402911
      Alp Toker authored
      Sweep the codebase for common typos. Includes some changes to visible function
      names that were misspelt.
      
      llvm-svn: 200018
      cb402911
  9. Jan 23, 2014
  10. Jan 14, 2014
    • Make nocapture analysis work with addrspacecast · e55a2c2e
      Matt Arsenault authored
      llvm-svn: 199246
      e55a2c2e
    • Reapply "LTO: add API to set strategy for -internalize" · 93be7c4f
      Duncan P. N. Exon Smith authored
      Reapply r199191, reverted in r199197 because it carelessly broke
      Other/link-opts.ll.  The problem was that calling
      createInternalizePass("main") would select
      createInternalizePass(bool("main")) instead of
      createInternalizePass(ArrayRef<const char *>("main")).  This commit
      fixes the bug.
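
The overload-resolution pitfall described above can be reproduced without LLVM. The following is a minimal sketch under stated assumptions: `NameList` and `pickOverload` are hypothetical names, with `NameList` standing in for `ArrayRef<const char *>`. A string literal decays to `const char *`, and pointer-to-`bool` is a standard conversion, which outranks the user-defined conversion through a constructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for ArrayRef<const char *>: implicitly
// constructible from a single C string (a user-defined conversion).
struct NameList {
  const char *First;
  NameList(const char *Name) : First(Name) {}
};

// Two overloads shaped like the ones named in the commit message.
std::string pickOverload(bool) { return "bool"; }
std::string pickOverload(NameList) { return "list"; }

// A string literal selects the bool overload, because array-to-pointer
// followed by pointer-to-bool is a standard conversion sequence and
// therefore beats the user-defined conversion to NameList.
// Naming the intended type at the call site avoids the surprise.
void checkOverloadChoice() {
  assert(pickOverload("main") == "bool");
  assert(pickOverload(NameList("main")) == "list");
}
```

Spelling out the wrapper type at the call site is one way to sidestep the problem; the commit message does not say which approach the actual fix took.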
      
      The original commit message follows.
      
      Add API to LTOCodeGenerator to specify a strategy for the -internalize
      pass.
      
      This is a new attempt at Bill's change in r185882, which he reverted in
      r188029 due to problems with the gold linker.  This puts the onus on the
      linker to decide whether (and what) to internalize.
      
      In particular, running internalize before outputting an object file may
      change a 'weak' symbol into an internal one, even though that symbol
      could be needed by an external object file --- e.g., with arclite.
      
      This patch enables three strategies:
      
      - LTO_INTERNALIZE_FULL: the default (and the old behaviour).
      - LTO_INTERNALIZE_NONE: skip -internalize.
      - LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden
        visibility.
      
      LTO_INTERNALIZE_FULL should be used when linking an executable.
      
      Outputting an object file (e.g., via ld -r) is more complicated, and
      depends on whether hidden symbols should be internalized.  E.g., for
      ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and
      LTO_INTERNALIZE_HIDDEN can be used otherwise.  However,
      LTO_INTERNALIZE_FULL is inappropriate, since the output object file will
      eventually need to link with others.
      
      lto_codegen_set_internalize_strategy() sets the strategy for subsequent
      calls to lto_codegen_write_merged_modules() and lto_codegen_compile*().
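
The guidance above on which strategy fits which kind of output can be written down as a small decision function. This is only a sketch: the `InternalizeStrategy` enum is a local mirror of the three names in the message rather than the enum from the LTO C API header, and `LinkJob` and `chooseStrategy` are hypothetical.

```cpp
#include <cassert>

// Local mirror of the three strategies named in the commit message.
enum class InternalizeStrategy { Full, None, Hidden };

// Hypothetical description of what the linker is producing.
struct LinkJob {
  bool ProducesExecutable; // false for relocatable output such as ld -r
  bool KeepPrivateExterns; // -keep_private_externs was passed
};

// Picks a strategy following the rules stated in the commit message.
InternalizeStrategy chooseStrategy(const LinkJob &Job) {
  if (Job.ProducesExecutable)
    return InternalizeStrategy::Full;   // nothing else will link against it
  if (Job.KeepPrivateExterns)
    return InternalizeStrategy::None;   // hidden symbols must stay visible
  return InternalizeStrategy::Hidden;   // only hidden symbols become internal
}

void checkStrategies() {
  assert(chooseStrategy(LinkJob{true, false}) == InternalizeStrategy::Full);
  assert(chooseStrategy(LinkJob{false, true}) == InternalizeStrategy::None);
  assert(chooseStrategy(LinkJob{false, false}) == InternalizeStrategy::Hidden);
}
```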
      
      <rdar://problem/14334895>
      
      llvm-svn: 199244
      93be7c4f
    • Decouple dllexport/dllimport from linkage · 7157bb76
      Nico Rieck authored
      Representing dllexport/dllimport as distinct linkage types prevents using
      these attributes on templates and inline functions.
      
      Instead of introducing further mixed linkage types to include linkonce and
      weak ODR, the old import/export linkage types are replaced with a new
      separate visibility-like specifier:
      
        define available_externally dllimport void @f() {}
        @Var = dllexport global i32 1, align 4
      
      Linkage for dllexported globals and functions is now equal to their linkage
      without dllexport. Imported globals and functions must be either
      declarations with external linkage, or definitions with
      AvailableExternallyLinkage.
      
      llvm-svn: 199218
      7157bb76
    • Revert "Decouple dllexport/dllimport from linkage" · 9d2e0df0
      Nico Rieck authored
      Revert this for now until I fix an issue in Clang with it.
      
      This reverts commit r199204.
      
      llvm-svn: 199207
      9d2e0df0
    • Decouple dllexport/dllimport from linkage · e43aaf79
      Nico Rieck authored
      Representing dllexport/dllimport as distinct linkage types prevents using
      these attributes on templates and inline functions.
      
      Instead of introducing further mixed linkage types to include linkonce and
      weak ODR, the old import/export linkage types are replaced with a new
      separate visibility-like specifier:
      
        define available_externally dllimport void @f() {}
        @Var = dllexport global i32 1, align 4
      
      Linkage for dllexported globals and functions is now equal to their linkage
      without dllexport. Imported globals and functions must be either
      declarations with external linkage, or definitions with
      AvailableExternallyLinkage.
      
      llvm-svn: 199204
      e43aaf79
    • Revert r199191, "LTO: add API to set strategy for -internalize" · 23c0ab53
      NAKAMURA Takumi authored
      Please also update Other/link-opts.ll next time.
      
      llvm-svn: 199197
      23c0ab53
    • LTO: add API to set strategy for -internalize · 43ea3478
      Duncan P. N. Exon Smith authored
      Add API to LTOCodeGenerator to specify a strategy for the -internalize
      pass.
      
      This is a new attempt at Bill's change in r185882, which he reverted in
      r188029 due to problems with the gold linker.  This puts the onus on the
      linker to decide whether (and what) to internalize.
      
      In particular, running internalize before outputting an object file may
      change a 'weak' symbol into an internal one, even though that symbol
      could be needed by an external object file --- e.g., with arclite.
      
      This patch enables three strategies:
      
      - LTO_INTERNALIZE_FULL: the default (and the old behaviour).
      - LTO_INTERNALIZE_NONE: skip -internalize.
      - LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden
        visibility.
      
      LTO_INTERNALIZE_FULL should be used when linking an executable.
      
      Outputting an object file (e.g., via ld -r) is more complicated, and
      depends on whether hidden symbols should be internalized.  E.g., for
      ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and
      LTO_INTERNALIZE_HIDDEN can be used otherwise.  However,
      LTO_INTERNALIZE_FULL is inappropriate, since the output object file will
      eventually need to link with others.
      
      lto_codegen_set_internalize_strategy() sets the strategy for subsequent
      calls to lto_codegen_write_merged_modules() and lto_codegen_compile*().
      
      <rdar://problem/14334895>
      
      llvm-svn: 199191
      43ea3478
  11. Jan 13, 2014
    • [PM] Split DominatorTree into a concrete analysis result object which · 73523021
      Chandler Carruth authored
      can be used by both the new pass manager and the old.
      
      This removes it from any of the virtual mess of the pass interfaces and
      lets it derive cleanly from the DominatorTreeBase<> template. In turn,
      tons of boilerplate interface can be nuked and it turns into a very
      straightforward extension of the base DominatorTree interface.
      
      The old analysis pass is now a simple wrapper. The names and style of
      this split should match the split between CallGraph and
      CallGraphWrapperPass. All of the users of DominatorTree have been
      updated to match using many of the same tricks as with CallGraph. The
      goal is that the common type remains the resulting DominatorTree rather
      than the pass. This will make subsequent work toward the new pass
      manager significantly easier.
      
      Also, in numerous places things became cleaner because I switched from
      re-running the pass (!!! midway through some other pass's run !!!) to
      directly recomputing the domtree.
      
      llvm-svn: 199104
      73523021
    • [cleanup] Move the Dominators.h and Verifier.h headers into the IR · 5ad5f15c
      Chandler Carruth authored
      directory. These passes are already defined in the IR library, and it
      doesn't make any sense to have the headers in Analysis.
      
      Long term, I think there is going to be a much better way to divide
      these matters. The dominators code should be fully separated into the
      abstract graph algorithm and have that put in Support where it becomes
      obvious that even Clang's CFGBlocks can use it. Then the verifier can
      manually construct dominance information from the Support-driven
      interface while the Analysis library can provide a pass which
      caches, reconstructs, and supports a nice update API.
      
      But those are very long term, and so I don't want to leave the really
      confusing structure until that day arrives.
      
      llvm-svn: 199082
      5ad5f15c
  12. Jan 07, 2014
  13. Jan 02, 2014
  14. Dec 12, 2013
    • Fix a use-after-free error in GlobalOpt CleanupConstantGlobalUsers · f59fd7dc
      Hal Finkel authored
      GlobalOpt's CleanupConstantGlobalUsers function uses a worklist array to manage
      constant users to be visited. The pointers in this array need to be weak
      handles because when we delete a constant array, we may also be holding a
      pointer to one of its elements (or an element of one of its elements if we're
      dealing with an array of arrays) in the worklist.
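
The hazard can be illustrated with standard smart pointers. This is an analogy only, not LLVM's value-handle machinery: `Node`, `countLiveVisits` and `runExample` are hypothetical, and `std::weak_ptr` plays the role of a weak handle that stops referring to an object once it is destroyed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical constant: an aggregate owns its elements, so destroying the
// aggregate destroys the elements as well.
struct Node {
  std::vector<std::shared_ptr<Node>> Elements;
};

// Walks a worklist of weak handles and counts entries that are still alive
// when reached. Visiting Root deletes it by releasing the only owning
// reference, which also destroys its elements later in the worklist.
int countLiveVisits(std::shared_ptr<Node> &Root,
                    const std::vector<std::weak_ptr<Node>> &Worklist) {
  int Live = 0;
  for (const std::weak_ptr<Node> &Handle : Worklist) {
    std::shared_ptr<Node> Current = Handle.lock();
    if (!Current)
      continue; // already destroyed; a raw pointer would dangle here
    ++Live;
    if (Current == Root)
      Root.reset(); // the aggregate dies when Current goes out of scope
  }
  return Live;
}

int runExample() {
  auto Root = std::make_shared<Node>();
  Root->Elements.push_back(std::make_shared<Node>());
  // The worklist holds both the aggregate and one of its elements.
  std::vector<std::weak_ptr<Node>> Worklist = {Root, Root->Elements[0]};
  assert(Root);
  return countLiveVisits(Root, Worklist);
}
```

Only the aggregate is visited while alive; the element's handle has expired by the time the loop reaches it, which is the behaviour a worklist of plain pointers cannot provide.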
      
      Fixes PR17347.
      
      llvm-svn: 197178
      f59fd7dc
    • Initialize the barrier pass in llvm::initializeIPO · 26fc4c29
      Hal Finkel authored
      The barrier pass is a temporary hack, and should go away soon. Nevertheless, if
      we don't initialize it, then opt will not understand -barrier, and this will
      break bugpoint (because when it dumps the passes from the default pass manager
      -barrier will be there).
      
      llvm-svn: 197177
      26fc4c29
  15. Dec 11, 2013
  16. Dec 05, 2013
    • Add #pragma vectorize enable/disable to LLVM · 729a3ae9
      Renato Golin authored
      The intended behaviour is to force vectorization on or off when the
      flag is present, and to keep the existing behaviour when it is
      absent. Tests were added to make sure all the cases are covered in
      opt. No tests were added in other tools, on the assumption that they
      use the PassManagerBuilder in the same way.
      
      This patch also removes the outdated -late-vectorize flag, which was
      on by default and not helping much.
      
      The pragma metadata is being attached to the same place as other loop
      metadata, but nothing forbids one from attaching it to a function
      (to enable #pragma optimize) or basic blocks (to hint the basic-block
      vectorizers), etc. The logic should be the same all around.
      
      Patches to Clang to produce the metadata will be produced after the
      initial implementation is agreed upon and committed. Patches to other
      vectorizers (such as SLP and BB) will be added once we're happy with
      the pass manager changes.
      
      llvm-svn: 196537
      729a3ae9
    • Correct word hyphenations · f907b891
      Alp Toker authored
      This patch tries to avoid unrelated changes other than fixing a few
      hyphen-related ambiguities and contractions in nearby lines.
      
      llvm-svn: 196471
      f907b891
  17. Dec 03, 2013
  18. Nov 26, 2013
    • PR17925 bugfix. · abb8505d
      Stepan Dyatkovskiy authored
      Short description.
      
      This issue concerns treating pointers as integers.
      We treat pointers as different if they reference different address spaces.
      At the same time, we treat pointers as equal to integers (of machine address
      width). That was a source of false positives. Consider the following case on
      a 32-bit machine:
      
      void foo0(i32 addrspace(1)* %p)
      void foo1(i32 addrspace(2)* %p)
      void foo2(i32 %p)
      
      foo0 != foo1, while
      foo1 == foo2 and foo0 == foo2.
      
      As you can see, this breaks transitivity, so the result depends on the order
      in which functions appear in the module. The following order causes foo0 and
      foo1 to be merged: foo2, foo0, foo1.
      First, foo0 is merged with foo2 and foo0 is erased. Second, foo1 is
      merged with foo2.
      Depending on the order, functions can be merged that we don't expect to be.
      
      The fix:
      Forbid treating any pointer as an integer, except for pointers in address space 0.
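
The transitivity break and the effect of the fix can be checked with a toy model of the comparison. This is a sketch, not the MergeFunctions code: `ParamType`, `equivalentOld` and `equivalentFixed` are hypothetical names, and a 32-bit target is assumed as in the message.

```cpp
#include <cassert>

// Hypothetical, simplified parameter type: either an integer of some bit
// width or a pointer into some address space.
struct ParamType {
  bool IsPointer;
  unsigned AddrSpace; // meaningful only for pointers
  unsigned Bits;      // meaningful only for integers
};

constexpr unsigned PointerBits = 32; // assumed 32-bit target

// Old rule: any pointer is interchangeable with a pointer-width integer.
bool equivalentOld(const ParamType &A, const ParamType &B) {
  if (A.IsPointer && B.IsPointer)
    return A.AddrSpace == B.AddrSpace;
  if (!A.IsPointer && !B.IsPointer)
    return A.Bits == B.Bits;
  const ParamType &Int = A.IsPointer ? B : A;
  return Int.Bits == PointerBits;
}

// Fixed rule: only address space 0 pointers may be treated as integers.
bool equivalentFixed(const ParamType &A, const ParamType &B) {
  if (A.IsPointer && B.IsPointer)
    return A.AddrSpace == B.AddrSpace;
  if (!A.IsPointer && !B.IsPointer)
    return A.Bits == B.Bits;
  const ParamType &Ptr = A.IsPointer ? A : B;
  const ParamType &Int = A.IsPointer ? B : A;
  return Ptr.AddrSpace == 0 && Int.Bits == PointerBits;
}

void checkTransitivity() {
  const ParamType Foo0{true, 1, 0}, Foo1{true, 2, 0}, Foo2{false, 0, 32};
  // Old rule: foo0 == foo2 and foo1 == foo2, yet foo0 != foo1.
  assert(equivalentOld(Foo0, Foo2) && equivalentOld(Foo1, Foo2));
  assert(!equivalentOld(Foo0, Foo1));
  // Fixed rule: neither non-zero address space pointer matches i32.
  assert(!equivalentFixed(Foo0, Foo2) && !equivalentFixed(Foo1, Foo2));
}
```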
      
      llvm-svn: 195769
      abb8505d
    • [PM] Split the CallGraph out from the ModulePass which creates the · 6378cf53
      Chandler Carruth authored
      CallGraph.
      
      This makes the CallGraph a totally generic analysis object that is the
      container for the graph data structure and the primary interface for
      querying and manipulating it. The pass logic is separated into its own
      class. For compatibility reasons, the pass provides wrapper methods for
      most of the methods on CallGraph -- they all just forward.
      
      This will allow the new pass manager infrastructure to provide its own
      analysis pass that constructs the same CallGraph object and makes it
      available. The idea is that in the new pass manager, the analysis pass's
      'run' method returns a concrete analysis 'result'. Here, that result is
      a 'CallGraph'. The 'run' method will typically do only minimal work,
      deferring much of the work into the implementation of the result object
      in order to be lazy about computing things, but when (like DomTree)
      there is *some* up-front computation, the analysis does it prior to
      handing the result back to the querying pass.
      
      I know some of this is fairly ugly. I'm happy to change it around if
      folks can suggest a cleaner interim state, but there is going to be some
      amount of unavoidable ugliness during the transition period. The good
      thing is that this is very limited and will naturally go away when the
      old pass infrastructure goes away. It won't hang around to bother us
      later.
      
      Next up is the initial new-PM-style call graph analysis. =]
      
      llvm-svn: 195722
      6378cf53
  19. Nov 22, 2013
    • Debug Info: move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp. · cb14bbcc
      Manman Ren authored
      We can share the implementation between StripSymbols and dropping debug info
      for metadata versions that do not match.
      
      Also update the comments to match the implementation. A follow-on patch will
      drop the "Debug Info Version" module flag in StripDebugInfo.
      
      llvm-svn: 195505
      cb14bbcc
    • Add a fixed version of r195470 back. · 6597992c
      Rafael Espindola authored
      The fix is simply to use CurI instead of I when handling aliases to
      avoid accessing an invalid iterator.
      
      original message:
      
      Convert linkonce* to weak* instead of strong.
      
      Also refactor the logic into a helper function. This is an important improvement
      on mingw where the linker complains about mixed weak and strong symbols.
      Converting to weak ensures that the symbol is not dropped, but keeps it in a
      comdat, making the linker happy.
      
      llvm-svn: 195477
      6597992c
    • Revert "Convert linkonce* to weak* instead of strong." · 77aa674c
      Rafael Espindola authored
      This reverts commit r195470.
      Debugging failure in some bots.
      
      llvm-svn: 195472
      77aa674c
    • Convert linkonce* to weak* instead of strong. · 55740325
      Rafael Espindola authored
      Also refactor the logic into a helper function. This is an important improvement
      on mingw where the linker complains about mixed weak and strong symbols.
      Converting to weak ensures that the symbol is not dropped, but keeps it in a
      comdat, making the linker happy.
      
      llvm-svn: 195470
      55740325
  20. Nov 17, 2013
    • Add a loop rerolling flag to the PassManagerBuilder · 29aeb205
      Hal Finkel authored
      This adds a boolean member variable to the PassManagerBuilder to control loop
      rerolling (just like we have for unrolling and the various vectorization
      options). This is necessary for control by the frontend. Loop rerolling remains
      disabled by default at all optimization levels.
      
      llvm-svn: 194966
      29aeb205
    • Add a loop rerolling pass · bf45efde
      Hal Finkel authored
      This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The
      transformation aims to take loops like this:
      
      for (int i = 0; i < 3200; i += 5) {
        a[i]     += alpha * b[i];
        a[i + 1] += alpha * b[i + 1];
        a[i + 2] += alpha * b[i + 2];
        a[i + 3] += alpha * b[i + 3];
        a[i + 4] += alpha * b[i + 4];
      }
      
      and turn them into this:
      
      for (int i = 0; i < 3200; ++i) {
        a[i] += alpha * b[i];
      }
      
      and loops like this:
      
      for (int i = 0; i < 500; ++i) {
        x[3*i] = foo(0);
        x[3*i+1] = foo(0);
        x[3*i+2] = foo(0);
      }
      
      and turn them into this:
      
      for (int i = 0; i < 1500; ++i) {
        x[i] = foo(0);
      }
      
      There are two motivations for this transformation:
      
        1. Code-size reduction (especially relevant, obviously, when compiling for
      code size).
      
        2. Providing greater choice to the loop vectorizer (and generic unroller) to
      choose the unrolling factor (and a better ability to vectorize). The loop
      vectorizer can take vector lengths and register pressure into account when
      choosing an unrolling factor, for example, and a pre-unrolled loop limits that
      choice. This is especially problematic if the manual unrolling was optimized
      for a machine different from the current target.
      
      The current implementation is limited to single basic-block loops only. The
      rerolling recognition should work regardless of how the loop iterations are
      intermixed within the loop body (subject to dependency and side-effect
      constraints), but the significant restriction is that the order of the
      instructions in each iteration must be identical. This seems sufficient to
      capture all current use cases.
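
That the first pair of loops above computes the same result can be checked directly at the source level. The sketch below only compares the two hand-written forms; it does not involve the pass itself. The function names are hypothetical, and the inputs are chosen so every intermediate value is exactly representable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t N = 3200; // trip count taken from the commit message

// Manually unrolled by five, as in the "before" loop.
void axpyUnrolled(std::vector<double> &a, const std::vector<double> &b,
                  double alpha) {
  assert(a.size() >= N && b.size() >= N);
  for (std::size_t i = 0; i < N; i += 5) {
    a[i]     += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
}

// The rerolled form: one element per iteration.
void axpyRerolled(std::vector<double> &a, const std::vector<double> &b,
                  double alpha) {
  assert(a.size() >= N && b.size() >= N);
  for (std::size_t i = 0; i < N; ++i)
    a[i] += alpha * b[i];
}

// Runs both forms on identical inputs and compares the outputs.
bool rerollingPreservesResult() {
  std::vector<double> b(N), a1(N), a2(N);
  for (std::size_t i = 0; i < N; ++i) {
    b[i] = 0.5 * static_cast<double>(i);
    a1[i] = a2[i] = 1.0 + static_cast<double>(i);
  }
  axpyUnrolled(a1, b, 2.0);
  axpyRerolled(a2, b, 2.0);
  return a1 == a2;
}
```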
      
      This pass is not currently enabled by default at any optimization level.
      
      llvm-svn: 194939
      bf45efde
  21. Nov 15, 2013
    • ArgumentPromotion: correctly transfer TBAA tags and alignments. · bc37658a
      Manman Ren authored
      We used to use std::map<IndicesVector, LoadInst*> for OriginalLoads. When we
      tried to promote two arguments, both wrote to the same OriginalLoads entries,
      so the loads created for the two arguments shared one original load and
      therefore received the same TBAA tag and alignment.
      
      The fix is to use std::map<std::pair<Argument*, IndicesVector>, LoadInst*>
      for OriginalLoads, so each Argument will write to different parts of the map.
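
The key collision can be reproduced with plain standard containers. In this sketch `Argument` and `LoadInst` are empty stand-ins rather than the LLVM classes, and the alignment values are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using IndicesVector = std::vector<unsigned>;
struct Argument {};                      // hypothetical stand-in
struct LoadInst { unsigned Alignment; }; // hypothetical stand-in

// Keyed by indices only: two arguments loading the same field collide,
// so the first argument ends up seeing the second argument's load.
unsigned firstArgAlignmentIndexKey() {
  LoadInst LoadA{4}, LoadB{16};
  const IndicesVector FirstField = {0};
  std::map<IndicesVector, LoadInst *> OriginalLoads;
  OriginalLoads[FirstField] = &LoadA; // recorded for argument A
  OriginalLoads[FirstField] = &LoadB; // argument B overwrites it
  assert(OriginalLoads.size() == 1);
  return OriginalLoads[FirstField]->Alignment;
}

// Keyed by (argument, indices): each argument keeps its own entry.
unsigned firstArgAlignmentPairKey() {
  Argument Args[2]; // Args[0] plays A, Args[1] plays B
  LoadInst LoadA{4}, LoadB{16};
  const IndicesVector FirstField = {0};
  std::map<std::pair<Argument *, IndicesVector>, LoadInst *> OriginalLoads;
  OriginalLoads[std::make_pair(&Args[0], FirstField)] = &LoadA;
  OriginalLoads[std::make_pair(&Args[1], FirstField)] = &LoadB;
  assert(OriginalLoads.size() == 2);
  return OriginalLoads[std::make_pair(&Args[0], FirstField)]->Alignment;
}
```

With the index-only key the first argument reports alignment 16, which belongs to the other argument's load; with the pair key it reports its own value of 4.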
      
      PR17906
      
      llvm-svn: 194846
      bc37658a
  22. Nov 12, 2013
    • Correctly merge constants with explicit and implicit alignments. · dd8757ab
      Rafael Espindola authored
      Constant merge can merge a constant with implicit alignment with one that has
      explicit alignment. Before this change it was assuming that the explicit
      alignment was higher than the implicit one, causing the result to be under
      aligned in some cases.
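
One way to picture the corrected behaviour is to resolve each constant's effective alignment first and then keep the larger one. This is a sketch under assumptions, not the ConstantMerge code: the function names are hypothetical, and an explicit alignment of 0 is taken to mean "none given, use the ABI alignment of the type".

```cpp
#include <algorithm>
#include <cassert>

// Assumption: Explicit == 0 means no explicit alignment was specified, in
// which case the type's ABI alignment applies implicitly.
unsigned effectiveAlignment(unsigned Explicit, unsigned AbiAlign) {
  assert(AbiAlign != 0 && "ABI alignment must be known");
  return Explicit != 0 ? Explicit : AbiAlign;
}

// The merged constant must satisfy both originals, so it takes the larger
// of the two effective alignments rather than trusting the explicit one.
unsigned mergedAlignment(unsigned ExplicitA, unsigned ExplicitB,
                         unsigned AbiAlign) {
  return std::max(effectiveAlignment(ExplicitA, AbiAlign),
                  effectiveAlignment(ExplicitB, AbiAlign));
}
```

For example, merging a constant explicitly aligned to 1 with an implicitly aligned one whose ABI alignment is 4 yields 4 under this rule, where assuming the explicit value wins would leave the result under-aligned.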
      
      Fixes pr17815.
      
      Patch by Chris Smowton!
      
      llvm-svn: 194506
      dd8757ab
  23. Nov 10, 2013