  1. Aug 09, 2018
    • [GlobalOpt] Don't apply fastcc if it would break inalloca invariants · 80c6ec11
      Reid Kleckner authored
      The inalloca parameter has to be the only parameter passed in memory.
      Changing the convention to fastcc can break that.
      
      At some point we should teach global opt how to optimize ABI attributes
      like inalloca and maybe byval. These attributes are mainly used to match
      C ABIs. They are harder for LLVM to optimize and they don't always
      generate the best code.
      
      Fixes PR38487
      
      llvm-svn: 339360
  2. Jul 28, 2018
    • [GlobalOpt] Test array indices inside structs for out-of-bounds accesses · fc4b0fe0
      David Green authored
      Clang can now turn arrays such as
        static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0};
      into structs of the form
        @g_data = internal global <{ [8 x i16], [8 x i16] }> ...
      
      GlobalOpt will incorrectly SROA it, not realising that the access to the first
      element may overflow into the second. This fixes it by checking geps more
      thoroughly.
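
      A minimal sketch of the kind of check involved, not the actual GlobalOpt
      code: each constant index of an access is compared against the element
      count of the aggregate it indexes, so index 9 into the first [8 x i16] is
      rejected even though it still lands inside @g_data as a whole. The
      function name and the vector-based model are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a constant GEP: Bounds[I] is the number of elements
// (or struct fields) at nesting level I, Indices[I] is the index applied there.
// Returns true only if every index stays inside its own aggregate.
bool allIndicesInBounds(const std::vector<uint64_t> &Bounds,
                        const std::vector<uint64_t> &Indices) {
  if (Bounds.size() != Indices.size())
    return false; // Malformed access: be conservative.
  for (std::size_t I = 0; I < Bounds.size(); ++I)
    if (Indices[I] >= Bounds[I])
      return false; // This index overflows into a neighbouring element.
  return true;
}
```

      For <{ [8 x i16], [8 x i16] }>, the bounds are {2, 8}: field 0, element 7
      is accepted, while field 0, element 9 is not.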
      
      I believe this makes the globalsra-partial.ll test case invalid as the %i value
      could be out of bounds. I've re-purposed it as a negative test for this case.
      
      Differential Revision: https://reviews.llvm.org/D49816
      
      llvm-svn: 338192
  3. Jul 10, 2018
    • llvm: Add support for "-fno-delete-null-pointer-checks" · 77eeac3d
      Manoj Gupta authored
      Summary:
      Support for this option is needed for building Linux kernel.
      This is a very frequently requested feature by kernel developers.
      
      More details : https://lkml.org/lkml/2018/4/4/601
      
      GCC option description for -fdelete-null-pointer-checks:
      Assume that programs cannot safely dereference null pointers,
      and that no code or data element resides at address zero.
      
      -fno-delete-null-pointer-checks is the inverse of this, implying that
      null pointer dereferencing is not undefined.
      
      This feature is implemented in LLVM IR in this CL as the function attribute
      "null-pointer-is-valid"="true" in IR (Under review at D47894).
      The CL updates several passes that assumed null pointer dereferencing is
      undefined to not optimize when the "null-pointer-is-valid"="true"
      attribute is present.
      
      Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
      
      Reviewed By: efriedma, george.burgess.iv
      
      Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D47895
      
      llvm-svn: 336613
  4. Jun 12, 2018
  5. Jun 04, 2018
    • Move Analysis/Utils/Local.h back to Transforms · 31b98d2e
      David Blaikie authored
      Review feedback from r328165. Split out just the one function from the
      file that's used by Analysis. (As chandlerc pointed out, the original
      change only moved the header and not the implementation anyway - which
      was fine for the one function that was used (since it's a
      template/inlined in the header) but not in general)
      
      llvm-svn: 333954
  6. May 14, 2018
    • Rename DEBUG macro to LLVM_DEBUG. · d34e60ca
      Nicola Zaghen authored
          
      The DEBUG() macro is very generic so it might clash with other projects.
      The renaming was done as follows:
      - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
      - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
      - Manual change to APInt
      - Manually change DOCS as the regex doesn't match it.
      
      In the transition period the DEBUG() macro is still present and aliased
      to the LLVM_DEBUG() one.
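
      A simplified sketch of how such a transitional alias can work. This is
      not LLVM's real macro definition; the DebugEnabled switch is a
      hypothetical stand-in for the debug flag.

```cpp
// Hypothetical runtime switch standing in for a -debug style option.
static bool DebugEnabled = false;

// New, project-prefixed macro: runs X only when debugging is enabled.
#define LLVM_DEBUG(X)                                                          \
  do {                                                                         \
    if (DebugEnabled) {                                                        \
      X;                                                                       \
    }                                                                          \
  } while (false)

// Transitional alias so existing DEBUG(...) call sites keep compiling.
#define DEBUG(X) LLVM_DEBUG(X)
```

      Old call sites such as DEBUG(++Counter) expand to the new macro, so both
      spellings behave the same during the transition.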
      
      Differential Revision: https://reviews.llvm.org/D43624
      
      llvm-svn: 332240
  7. Apr 27, 2018
  8. Mar 21, 2018
    • Fix a couple of layering violations in Transforms · 2be39228
      David Blaikie authored
      Remove #include of Transforms/Scalar.h from Transforms/Utils to fix layering.
      
      Transforms depends on Transforms/Utils, not the other way around. So
      remove the header and the "createStripGCRelocatesPass" function
      declaration (& definition) that is unused and motivated this dependency.
      
      Move Transforms/Utils/Local.h into Analysis because it's used by
      Analysis/MemoryBuiltins.cpp.
      
      llvm-svn: 328165
  9. Feb 28, 2018
  10. Feb 23, 2018
  11. Feb 22, 2018
  12. Feb 02, 2018
    • [GlobalOpt] Include padding in debug fragments · b69e5b73
      Mikael Holmen authored
      Summary:
      When creating the debug fragments for a SRA'd variable, use the types'
      allocation sizes. This fixes issues where the pass would emit too small
      fragments, placed at the wrong offset, for padded types.
      
      An example of this is long double on x86. The type is represented using
      x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes.
      The padding is included in the type's DW_AT_byte_size attribute;
      therefore, the fragments should also include that. Newer GCC releases
      (I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier
      releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a
      10-byte piece, followed by an empty 2/6-byte piece for the padding.
      
      Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty
      pieces results in the value being printed as <optimized out> by GDB.
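
      The arithmetic can be sketched as follows, under the assumption of an
      array whose elements are laid out back to back. The Fragment struct and
      the function name are hypothetical, not the pass's own code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one debug fragment of a split variable.
struct Fragment {
  uint64_t OffsetInBits;
  uint64_t SizeInBits;
};

// Emit one fragment per element, using each element's allocation size
// (value bits plus padding) for both the fragment size and the running offset.
std::vector<Fragment>
fragmentsFromAllocSizes(const std::vector<uint64_t> &AllocSizesInBits) {
  std::vector<Fragment> Result;
  uint64_t Offset = 0;
  for (uint64_t Size : AllocSizesInBits) {
    Result.push_back({Offset, Size});
    Offset += Size;
  }
  return Result;
}
```

      For two x86_fp80 elements with a 16-byte allocation size, this yields
      fragments at bit offsets 0 and 128, each 128 bits wide. Using the 80-bit
      value size instead would give offsets 0 and 80 and leave padding
      uncovered.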
      
      Patch by: David Stenberg
      
      Reviewers: aprantl, JDevlieghere
      
      Reviewed By: aprantl, JDevlieghere
      
      Subscribers: llvm-commits
      
      Tags: #debug-info
      
      Differential Revision: https://reviews.llvm.org/D42807
      
      llvm-svn: 324066
  13. Feb 01, 2018
    • [GlobalOpt] Improve common case efficiency of static global initializer evaluation · 93b0ff20
      Amara Emerson authored
      For very, very large global initializers which can be statically evaluated, the
      code would create vectors of temporary Constants, modifying them in place,
      before committing the resulting Constant aggregate to the global's initializer
      value. This had effectively O(n^2) complexity in the size of the global
      initializer and would cause memory and non-termination issues compiling some
      workloads.
      
      This change performs the static initializer evaluation and creation in batches,
      once for each global in the evaluated IR memory. The existing code is maintained
      as a last resort when the initializers are more complex than simple values in a
      large aggregate. This should theoretically be NFC; there is no test, as the example case
      is massive. The existing test cases pass with this, as well as the llvm test
      suite.
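
      A toy model of the cost difference, assuming an immutable aggregate that
      must be rebuilt whenever it changes. A std::vector<int> stands in for the
      constant aggregate, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Aggregate = std::vector<int>; // Stand-in for an immutable constant.
using Store = std::pair<std::size_t, int>; // (element index, new value)

// Per-store commit: every store rebuilds the whole aggregate, so N stores
// into an N-element aggregate cost O(N^2) element copies.
Aggregate commitOneByOne(Aggregate Init, const std::vector<Store> &Stores) {
  for (const Store &S : Stores) {
    Aggregate Rebuilt(Init); // Models creating a fresh constant.
    Rebuilt[S.first] = S.second;
    Init = std::move(Rebuilt);
  }
  return Init;
}

// Batched commit: copy the elements once, apply all stores, build once.
Aggregate commitBatched(const Aggregate &Init,
                        const std::vector<Store> &Stores) {
  Aggregate Elts(Init);
  for (const Store &S : Stores)
    Elts[S.first] = S.second;
  return Elts;
}
```

      Both functions produce the same aggregate; only the amount of temporary
      work differs.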
      
      To give an example, consider the following C++ code adapted from the clang
      regression tests:
      struct S {
       int n = 10;
       int m = 2 * n;
       S(int a) : n(a) {}
      };
      
      template<typename T>
      struct U {
       T *r = &q;
       T q = 42;
       U *p = this;
      };
      
      U<S> e;
      
      The global static constructor for 'e' will need to initialize 'r' and 'p' of
      the outer struct, while also initializing the 'n' and 'm' members of the
      inner struct 'q'. The batch algorithm simply uses the general CommitValueTo()
      method to handle the complex nested S struct initialization of 'q', before
      processing the outermost members in a single batch. Using CommitValueTo() to
      handle members of the outer struct is inefficient when the struct/array is
      very large, as we end up creating and destroying constant arrays for each
      initialization.
      For the above case, we expect the following IR to be generated:
      
      %struct.U = type { %struct.S*, %struct.S, %struct.U* }
      %struct.S = type { i32, i32 }
      @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e,
                                                       i64 0, i32 1),
                              %struct.S { i32 42, i32 84 }, %struct.U* @e }
      The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex
      constant expression, while the other two elements of @e are "simple".
      
      Differential Revision: https://reviews.llvm.org/D42612
      
      llvm-svn: 323933
  14. Jan 30, 2018
    • Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark · 1f59ae31
      Zaara Syeda authored
      candidates with coldcc attribute.
      
      This recommits r322721 reverted due to sanitizer memory leak build bot failures.
      
      Original commit message:
      This patch adds support for the coldcc calling convention for Power.
      This changes the set of non-volatile registers. It includes a pass to stress
      test the implementation by marking all static directly called functions with
      the coldcc attribute through the option -enable-coldcc-stress-test. It also
      includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
      functions which are cold at all call sites based on BlockFrequencyInfo when
      the containing function does not call any non cold functions.
      
      Differential Revision: https://reviews.llvm.org/D38413
      
      llvm-svn: 323778
  15. Jan 25, 2018
    • [GlobalOpt] Emit fragments using field offsets from struct layout · 886edf8f
      Mikael Holmen authored
      Summary:
      When creating the debug fragments for a SRA'd struct, use the fields'
      offsets, taken from the struct layout, as the offsets for the resulting
      fragments. This fixes an issue where GlobalOpt would emit fragments with
      incorrect offsets for padded fields.
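
      A simplified model of why the layout offsets matter, assuming a
      non-packed struct in which each field is placed at the next multiple of
      its alignment. The Field struct and the function name are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical field description: size and alignment, both in bits.
struct Field {
  uint64_t SizeInBits;
  uint64_t AlignInBits; // Must be non-zero.
};

// Compute each field's bit offset the way a simple struct layout would:
// round the running offset up to the field's alignment, then advance by
// the field's size.
std::vector<uint64_t> fieldOffsetsInBits(const std::vector<Field> &Fields) {
  std::vector<uint64_t> Offsets;
  uint64_t Offset = 0;
  for (const Field &F : Fields) {
    Offset = (Offset + F.AlignInBits - 1) / F.AlignInBits * F.AlignInBits;
    Offsets.push_back(Offset);
    Offset += F.SizeInBits;
  }
  return Offsets;
}
```

      For a struct like { i8, i32 } with a 32-bit aligned i32, the second
      field starts at bit 32, not at bit 8 as a plain running sum of field
      sizes would suggest.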
      
      This should solve PR36016.
      
      Patch by David Stenberg.
      
      Reviewers: aprantl
      
      Reviewed By: aprantl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D42489
      
      llvm-svn: 323411
  16. Jan 17, 2018
    • Revert [PowerPC] This reverts commit rL322721 · c9dc7b45
      Zaara Syeda authored
      Failing build bots. Revert the commit now.
      
      llvm-svn: 322748
    • [PowerPC] Add handling for ColdCC calling convention and a pass to mark · 8e951fd2
      Zaara Syeda authored
      candidates with coldcc attribute.
      
      This patch adds support for the coldcc calling convention for Power.
      This changes the set of non-volatile registers. It includes a pass to stress
      test the implementation by marking all static directly called functions with
      the coldcc attribute through the option -enable-coldcc-stress-test. It also
      includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
      functions which are cold at all call sites based on BlockFrequencyInfo when
      the containing function does not call any non cold functions.
      
      Differential Revision: https://reviews.llvm.org/D38413
      
      llvm-svn: 322721
  17. Jan 11, 2018
    • Make internal/private GVs implicitly dso_local. · e4b0231c
      Rafael Espindola authored
      While updating clang tests for having clang set dso_local I noticed
      that:
      
      - There are *a lot* of tests to update.
      - Many of the updates are redundant.
      
      They are redundant because a GV is "obviously dso_local". This patch
      starts formalizing that a bit by requiring that internal and private
      GVs be dso_local too. Since they all are, we don't have to print
      dso_local to the textual representation, making it a bit more compact
      and easier to read.
      
      llvm-svn: 322317
  18. Nov 07, 2017
  19. Oct 11, 2017
  20. Sep 21, 2017
    • Fixed reverted commit rL312318 · 29202f6d
      Strahinja Petrovic authored
      This patch contains a fix for the reverted commit
      rL312318, which was causing a failure due to the use
      of an unchecked dyn_cast to CIInit.
      
      Patch by: Nikola Prica.
      
      llvm-svn: 313870
  21. Sep 08, 2017
  22. Sep 01, 2017
  23. Aug 31, 2017
  24. Aug 30, 2017
  25. Aug 09, 2017
  26. Aug 04, 2017
  27. Jul 13, 2017
  28. Jul 12, 2017
    • [IPO] Temporarily rollback r307215. · b8ad3eeb
      Davide Italiano authored
      [GlobalOpt] Remove unreachable blocks before optimizing a function.
      While the change is presumably correct, it exposes a latent bug
      in DI which breaks one of the CFI checks. I'll analyze it further
      and try to understand what's going on.
      
      llvm-svn: 307729
    • Enhance synchscope representation · bb80d3e1
      Konstantin Zhuravlyov authored
        OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
        global and local memory. These scopes restrict how synchronization is
        achieved, which can result in improved performance.
      
        This change extends the existing notion of synchronization scopes in LLVM to
        support arbitrary scopes expressed as target-specific strings, in addition to
        the already defined scopes (single thread, system).
      
        The LLVM IR and MIR syntax for expressing synchronization scopes has changed
        to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
        replaces the *singlethread* keyword), or a target-specific name. As before, if
        the scope is not specified, it defaults to CrossThread/System scope.
      
        Implementation details:
          - Mapping from synchronization scope name/string to synchronization scope id
            is stored in LLVM context;
          - CrossThread/System and SingleThread scopes are pre-defined to efficiently
            check for known scopes without comparing strings;
          - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
            the bitcode.
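
        The name-to-id mapping described above can be sketched as a small
        interning table. This is an illustrative model only: the class name,
        the method name, and the concrete id values are assumptions, not the
        LLVM context API.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical per-context table mapping scope names to small integer ids.
class SyncScopeTable {
public:
  using ID = uint8_t;
  // Pre-defined scopes get fixed ids, so known scopes can be checked by
  // comparing ids instead of strings. The values are assumptions.
  enum : ID { SingleThread = 0, System = 1 };

  SyncScopeTable() {
    Names["singlethread"] = SingleThread;
    Names[""] = System; // An unspecified scope means the system scope.
  }

  // Return the id for Name, assigning the next free id on first use.
  ID getOrInsert(const std::string &Name) {
    auto It = Names.find(Name);
    if (It != Names.end())
      return It->second;
    ID NewID = static_cast<ID>(Names.size());
    Names.emplace(Name, NewID);
    return NewID;
  }

private:
  std::unordered_map<std::string, ID> Names;
};
```

        A target-specific name such as "agent" (chosen here only as an
        example) receives a fresh id the first time it is seen and the same id
        on every later lookup.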
      
      Differential Revision: https://reviews.llvm.org/D21723
      
      llvm-svn: 307722
  29. Jul 06, 2017
    • [GlobalOpt] Remove unreachable blocks before optimizing a function. · 7dd0694f
      Davide Italiano authored
      LLVM's definition of dominance allows instructions that are cyclic
      in unreachable blocks, e.g.:
      
        %pat = select i1 %condition, @global, i16* %pat
      
      because any instruction dominates an instruction in a block that's
      not reachable from entry.
      So, remove unreachable blocks from the function, because a) there's
      no point in analyzing them and b) GlobalOpt should otherwise grow
      some more complicated logic to break these cycles.
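
      Finding the blocks to drop amounts to a reachability walk from the
      entry block. A minimal sketch over a plain successor list follows; it
      is a model with a hypothetical function name, not the code used by the
      pass.

```cpp
#include <cstddef>
#include <vector>

// Succs[B] lists the successor blocks of block B; block 0 is the entry.
// Returns a flag per block telling whether it is reachable from the entry.
std::vector<bool>
reachableFromEntry(const std::vector<std::vector<std::size_t>> &Succs) {
  std::vector<bool> Seen(Succs.size(), false);
  if (Succs.empty())
    return Seen;
  std::vector<std::size_t> Worklist;
  Worklist.push_back(0);
  Seen[0] = true;
  while (!Worklist.empty()) {
    std::size_t B = Worklist.back();
    Worklist.pop_back();
    for (std::size_t S : Succs[B]) {
      if (!Seen[S]) {
        Seen[S] = true;
        Worklist.push_back(S);
      }
    }
  }
  return Seen;
}
```

      A block that only branches to itself, like the one that could hold the
      self-referential select above, is never marked and can be deleted
      before the analysis runs.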
      
      Differential Revision:  https://reviews.llvm.org/D35028
      
      llvm-svn: 307215
  30. May 01, 2017
  31. Apr 27, 2017
  32. Apr 26, 2017
    • Reverts commit r301424, r301425 and r301426 · 2cbeb00f
      Sanjoy Das authored
      Commits were:
      
      "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
      "Add a new WeakVH value handle; NFC"
      "Rename WeakVH to WeakTrackingVH; NFC"
      
      The changes assumed pointers are 8 byte aligned on all architectures.
      
      llvm-svn: 301429
    • Rename WeakVH to WeakTrackingVH; NFC · 01de5577
      Sanjoy Das authored
      Summary:
      I plan to use WeakVH to mean "nulls itself out on deletion, but does
      not track RAUW" in a subsequent commit.
      
      Reviewers: dblaikie, davide
      
      Reviewed By: davide
      
      Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
      
      Differential Revision: https://reviews.llvm.org/D32266
      
      llvm-svn: 301424
  33. Apr 20, 2017
  34. Apr 11, 2017
    • Allow DataLayout to specify addrspace for allocas. · 3c1fc768
      Matt Arsenault authored
      LLVM makes several assumptions about address space 0. However,
      alloca is presently constrained to always return this address space.
      There's no real way to avoid using alloca, so without this
      there is no way to opt out of these assumptions.
      
      The problematic assumptions include:
      - That the pointer size used for the stack is the same size as
        the code size pointer, which is also the maximum sized pointer.
      
      - That 0 is an invalid, non-dereferenceable pointer value.
      
      These are problems for AMDGPU because alloca is used to
      implement the private address space, which uses a 32-bit
      index as the pointer value. Other pointers are 64-bit
      and behave more like LLVM's notion of generic address
      space. By changing the address space used for allocas,
      we can change our generic pointer type to be LLVM's generic
      pointer type which does have similar properties.
      
      llvm-svn: 299888
  35. Mar 21, 2017
    • Rename AttributeSet to AttributeList · b518054b
      Reid Kleckner authored
      Summary:
      This class is a list of AttributeSetNodes corresponding to the function
      prototype of a call or function declaration. This class used to be
      called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
      typically accessed by parameter and return value index, so
      "AttributeList" seems like a more intuitive name.
      
      Rename AttributeSetImpl to AttributeListImpl to follow suit.
      
      It's useful to rename this class so that we can rename AttributeSetNode
      to AttributeSet later. AttributeSet is the set of attributes that apply
      to a single function, argument, or return value.
      
      Reviewers: sanjoy, javed.absar, chandlerc, pete
      
      Reviewed By: pete
      
      Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D31102
      
      llvm-svn: 298393