  1. Oct 23, 2020
    • Arthur Eubanks's avatar
      [Inliner] Run always-inliner in inliner-wrapper · 0291e2c9
      Arthur Eubanks authored
      An alwaysinline function may not get inlined in inliner-wrapper due to
      the inlining order.
      
      Previously for the following, the inliner would first inline @a() into @b(),
      
      ```
      define void @a() {
      entry:
        call void @b()
        ret void
      }
      
      define void @b() alwaysinline {
      entry:
        br label %for.cond
      
      for.cond:
        call void @a()
        br label %for.cond
      }
      ```
      
      making @b() recursive and unable to be inlined into @a(), ending at
      
      ```
      define void @a() {
      entry:
        call void @b()
        ret void
      }
      
      define void @b() alwaysinline {
      entry:
        br label %for.cond
      
      for.cond:
        call void @b()
        br label %for.cond
      }
      ```
      
      Running always-inliner first makes sure that we respect alwaysinline in more cases.
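
      The ordering problem above can be reproduced in a toy model. The sketch below is
      not LLVM's inliner: `ToyCallGraph`, `tryInline` and `makeExample` are hypothetical
      names, a function body is reduced to its list of callees, and the only rule
      modelled is the assumption that a directly recursive callee cannot be inlined.

      ```cpp
      #include <cassert>
      #include <map>
      #include <string>
      #include <vector>

      // Toy call graph: function name -> names of the functions it calls.
      // Hypothetical model for illustration only, not LLVM's CallGraph.
      using ToyCallGraph = std::map<std::string, std::vector<std::string>>;

      // True if Fn contains a call to itself.
      bool isDirectlyRecursive(const ToyCallGraph &G, const std::string &Fn) {
        for (const std::string &Callee : G.at(Fn))
          if (Callee == Fn)
            return true;
        return false;
      }

      // Replace every call to Callee inside Caller with Callee's own calls.
      // Returns false, leaving the graph untouched, if Callee is directly
      // recursive (assumed to block inlining) or Caller never calls Callee.
      bool tryInline(ToyCallGraph &G, const std::string &Caller,
                     const std::string &Callee) {
        assert(G.count(Caller) && G.count(Callee) && "unknown function");
        if (isDirectlyRecursive(G, Callee))
          return false;
        const std::vector<std::string> CalleeBody = G.at(Callee);
        std::vector<std::string> NewBody;
        bool Changed = false;
        for (const std::string &C : G.at(Caller)) {
          if (C == Callee) {
            NewBody.insert(NewBody.end(), CalleeBody.begin(), CalleeBody.end());
            Changed = true;
          } else {
            NewBody.push_back(C);
          }
        }
        if (Changed)
          G[Caller] = NewBody;
        return Changed;
      }

      // The example from the commit message: @a calls @b, @b calls @a.
      ToyCallGraph makeExample() {
        ToyCallGraph G;
        G["a"] = {"b"};
        G["b"] = {"a"};
        return G;
      }
      ```

      In this model, `tryInline(G, "b", "a")` followed by `tryInline(G, "a", "b")` leaves
      the call to `b` in place because `b` has become self-recursive, whereas calling
      `tryInline(G, "a", "b")` first succeeds, which mirrors running the always-inliner first.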
      
      Fixes https://bugs.llvm.org/show_bug.cgi?id=46945.
      
      Reviewed By: davidxl, rnk
      
      Differential Revision: https://reviews.llvm.org/D86988
      0291e2c9
  2. Oct 22, 2020
    • Vedant Kumar's avatar
      Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)" · 099bffe7
      Vedant Kumar authored
      This reverts commit 26ee8aff.
      
      It's necessary to bitcast the pointer operand of a lifetime
      marker if it has an opaque pointer type.
      
      rdar://70560161
      099bffe7
    • Arthur Eubanks's avatar
      Port -instnamer to NPM · 92d9a386
      Arthur Eubanks authored
      Some clang tests use this.
      
      Reviewed By: akhuang
      
      Differential Revision: https://reviews.llvm.org/D89931
      92d9a386
    • Layton Kifer's avatar
      [InstCombine][NFC] Use ConstantExpr::getBinOpIdentity · d49911c2
      Layton Kifer authored
      Delete duplicate implementation getSelectFoldableConstant and
      replace with ConstantExpr::getBinOpIdentity.
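
      For readers unfamiliar with the idea, a binary-operator identity is the constant C
      for which `X op C == X`. The standalone sketch below illustrates that concept on
      32-bit integers; `ToyBinOp`, `binOpIdentity` and `applyBinOp` are hypothetical
      names and do not reproduce the signature of ConstantExpr::getBinOpIdentity.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Hypothetical opcode set for this sketch.
      enum class ToyBinOp { Add, Mul, And, Or, Xor, Sub };

      // Evaluate X op Y on 32-bit unsigned integers (wrapping arithmetic).
      uint32_t applyBinOp(ToyBinOp Op, uint32_t X, uint32_t Y) {
        switch (Op) {
        case ToyBinOp::Add: return X + Y;
        case ToyBinOp::Mul: return X * Y;
        case ToyBinOp::And: return X & Y;
        case ToyBinOp::Or:  return X | Y;
        case ToyBinOp::Xor: return X ^ Y;
        case ToyBinOp::Sub: return X - Y;
        }
        assert(false && "unhandled opcode");
        return 0;
      }

      // Writes the two-sided identity of Op to Identity and returns true, or
      // returns false when Op has no constant that works on both sides.
      bool binOpIdentity(ToyBinOp Op, uint32_t &Identity) {
        switch (Op) {
        case ToyBinOp::Add:
        case ToyBinOp::Or:
        case ToyBinOp::Xor:
          Identity = 0;
          return true;
        case ToyBinOp::Mul:
          Identity = 1;
          return true;
        case ToyBinOp::And:
          Identity = UINT32_MAX; // all bits set
          return true;
        case ToyBinOp::Sub:
          return false; // 0 only works on the right: X - 0 == X, but 0 - X != X
        }
        return false;
      }
      ```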
      
      Differential Revision: https://reviews.llvm.org/D89839
      d49911c2
    • Nikita Popov's avatar
      [MemCpyOpt] Move GEP during call slot optimization · 3e375431
      Nikita Popov authored
      When performing a call slot optimization to a GEP destination, it
      will currently usually fail, because the GEP is directly before the
      memcpy and as such does not dominate the call. We should move it
      above the call if that satisfies the domination requirement.
      
      I think that a constant-index GEP is the only useful thing to move
      here, as otherwise isDereferenceablePointer couldn't look through
      it anyway. As such I'm not trying to generalize this further.
      
      Differential Revision: https://reviews.llvm.org/D89623
      3e375431
    • Ettore Tiotto's avatar
      [NFC][PartialInliner]: Clean up code · e6521ce0
      Ettore Tiotto authored
      Make member functions const where possible, use LLVM_DEBUG to print debug traces
      rather than a custom option, pass by reference to avoid null checking, ...
      
      Reviewed By: fhann
      
      Differential Revision: https://reviews.llvm.org/D89895
      e6521ce0
    • Vedant Kumar's avatar
      [InstCombine] Remove dbg.values describing contents of dead allocas · 3419252a
      Vedant Kumar authored
      When InstCombine removes an alloca, it erases the dbg.{addr,declare}
      instructions which refer to the alloca. It would be better to instead
      remove all debug intrinsics which describe the contents of the dead
      alloca, namely all dbg.value(<dead alloca>, ..., DW_OP_deref)'s.
      
      This effectively undoes work performed in an InstCombine run earlier in
      the pipeline by LowerDbgDeclare, which inserts DW_OP_deref dbg.values
      before CallInst users of an alloca. The motivating example looks like:
      
      ```
        define void @foo(i32 %0) {
          %a = alloca i32              ; This alloca is erased.
          store i32 %0, i32* %a
          dbg.value(i32 %0, "arg0")    ; This dbg.value survives.
          dbg.value(i32* %a, "arg0", DW_OP_deref)
          call void @trivially_inlinable_no_op(i32* %a)
          ret void
        }
      ```
      
      If the DW_OP_deref dbg.value is not erased, it becomes dbg.value(undef)
      after inlining, making "arg0" unavailable. But we already have dbg.value
      descriptions of the alloca's value (from LowerDbgDeclare), so the
      DW_OP_deref dbg.value cannot serve its purpose of describing an
      initialization of the alloca by some callee. It invalidates other useful
      dbg.values, causing large gaps in location coverage, so we should delete
      it (even though doing so may cause stale dbg.values to appear, if
      there's a dead store to `%a` in @trivially_inlinable_no_op).
      
      OTOH, it wouldn't be correct to delete all dbg.value descriptions of an
      alloca. Note that it's possible to describe a variable that takes on
      different pointer values, e.g.:
      
      ```
        void use(int *);
        void t(int a, int b) {
          int *local = &a;     // dbg.value(i32* %a.addr, "local")
          local = &b;          // dbg.value(i32* undef, "local")
          use(&a);             //           (note: %b.addr is optimized out)
          local = &a;          // dbg.value(i32* %a.addr, "local")
        }
      ```
      
      In this example, the alloca for "b" is erased, but we need to describe
      the value of "local" as <unavailable> before the call to "use". This
      prevents "local" from appearing to be equal to "&a" at the callsite.
      
      rdar://66592859
      
      Differential Revision: https://reviews.llvm.org/D85555
      3419252a
    • Serguei Katkov's avatar
      [IRCE] consolidate profitability check · 75d0e0cd
      Serguei Katkov authored
      Use BFI if it is available and BPI otherwise.
      This is a promised follow-up after D89541.
      
      Reviewers: ebrevnov, mkazantsev
      Reviewed By: ebrevnov
      Subscribers: llvm-commits
      Differential Revision: https://reviews.llvm.org/D89773
      75d0e0cd
    • Zequan Wu's avatar
      Revert "Revert "SimplifyCFG: Clean up optforfuzzing implementation"" · 2f293411
      Zequan Wu authored
      This reverts commit 716f7636.
      2f293411
    • Zequan Wu's avatar
      Revert "SimplifyCFG: Clean up optforfuzzing implementation" · 716f7636
      Zequan Wu authored
      See discussion: https://reviews.llvm.org/D89590
      This reverts commit cdd006ee.
      716f7636
  3. Oct 21, 2020
  4. Oct 20, 2020
    • Nicolai Hähnle's avatar
      DomTree: Extract (mostly) read-only logic into type-erased base classes · 848a68a0
      Nicolai Hähnle authored
      Avoid having to instantiate and compile a subset of the dominator tree logic
      separately for each node type. More importantly, this allows generic
      algorithms to be built on top of dominator trees without writing them as
      templates -- such algorithms can now use opaque CfgBlockRef and
      CfgInterface instead.
      
      A type-erased implementation of dominator trees could be written in
      terms of CfgInterface as well, but doing so would change the current
      trade-off: it would slightly reduce code size at the cost of a slight
      runtime overhead.
      
      This patch does not change the trade-off, as it only does type-erasure
      where basic blocks can be treated in a fully opaque way, i.e. it only
      moves methods that don't require iteration over CFG successors and
      predecessors.
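
      The split described above can be pictured with a minimal sketch: a non-template
      base class holds the read-only logic over opaque node pointers, and a thin
      template layer adds the typed API. `ToyDomTreeBase` and `ToyDomTree` are
      hypothetical names, and the immediate-dominator map is an assumption made for
      illustration, not the data structure used by the patch.

      ```cpp
      #include <cassert>
      #include <unordered_map>

      // Non-template base: compiled once, knows nothing about the node type.
      class ToyDomTreeBase {
      protected:
        // node -> immediate dominator; the root maps to nullptr.
        std::unordered_map<const void *, const void *> IDom;

      public:
        // True if A dominates B. Every node dominates itself. The query only
        // walks the idom chain, so it never iterates over CFG edges.
        bool dominatesOpaque(const void *A, const void *B) const {
          const void *N = B;
          while (N != nullptr) {
            if (N == A)
              return true;
            auto It = IDom.find(N);
            if (It == IDom.end())
              return false; // node is not in the tree
            N = It->second;
          }
          return false;
        }
      };

      // Thin typed layer: only this part is instantiated per node type.
      template <typename NodeT> class ToyDomTree : public ToyDomTreeBase {
      public:
        void setIDom(const NodeT *Node, const NodeT *Dominator) {
          assert(Node && "node must not be null");
          IDom[Node] = Dominator;
        }
        bool dominates(const NodeT *A, const NodeT *B) const {
          return dominatesOpaque(A, B);
        }
      };
      ```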
      
      v5:
      - rename generic_{begin,end,children} back without the generic_ prefix
        and refer explicitly to base class methods in NewGVN, which wants to
        mutate the order of dominator tree node children directly
      
      v6:
      - style change: iDom -> idom; it's arguable whether this is really
        invalid, since it is actually standard camelCase, but clang-tidy
        complains about it so... *shrug*
      - rename {to,from}Generic -> {wrap,unwrap}Ref
      
      Change-Id: Ib860dc04cf8bb093d8ed00be7def40d662213672
      
      Differential Revision: https://reviews.llvm.org/D83089
      848a68a0
    • Ta-Wei Tu's avatar
      [NPM] port -unify-loop-exits to NPM · 529ecd19
      Ta-Wei Tu authored
      Reviewed By: aeubanks
      
      Differential Revision: https://reviews.llvm.org/D89774
      529ecd19
    • Ta-Wei Tu's avatar
      [NPM] Port -mergereturn to NPM · 59286b36
      Ta-Wei Tu authored
      Reviewed By: aeubanks
      
      Differential Revision: https://reviews.llvm.org/D89781
      59286b36
    • Florian Hahn's avatar
      [DSE] Do not scan users of memory terminators for further reads. · 2e580102
      Florian Hahn authored
      isMemTerminator checks if the current def is a memory terminator that
      terminates the memory pointed to by DefLoc. We do not have to add any of
      their users to the worklist, because the follow-on users cannot read the
      memory in question.
      
      This leads to more stores eliminated in the presence of lifetime calls.
      Previously we added the users of those intrinsics to the worklist,
      limiting elimination.
      
      In terms of removed stores, this gives a nice boost on some benchmarks
      (MultiSource/SPEC2000/SPEC2006 on X86 with -flto -O3):
      
      Same hash: 205 (filtered out)
      Remaining: 32
      Metric: dse.NumFastStores
      
      Program                                           base    patch    diff
       test-suite...000/197.parser/197.parser.test      4.00     8.00  100.0%
       test-suite...rolangs-C++/family/family.test      4.00     7.00   75.0%
       test-suite...marks/7zip/7zip-benchmark.test   1722.00  2189.00   27.1%
       test-suite...CFP2000/177.mesa/177.mesa.test     30.00    38.00   26.7%
       test-suite :: External/Nurbs/nurbs.test         44.00    49.00   11.4%
       test-suite...lications/sqlite3/sqlite3.test    115.00   128.00   11.3%
       test-suite...006/447.dealII/447.dealII.test   2715.00  3013.00   11.0%
       test-suite...ProxyApps-C++/CLAMR/CLAMR.test    237.00   261.00   10.1%
       test-suite...tions/lambda-0.1.3/lambda.test     40.00    44.00   10.0%
       test-suite...3.xalancbmk/483.xalancbmk.test   1366.00  1475.00    8.0%
       test-suite...abench/jpeg/jpeg-6a/cjpeg.test     13.00    14.00    7.7%
       test-suite...oxyApps-C++/miniFE/miniFE.test     43.00    46.00    7.0%
       test-suite...lications/ClamAV/clamscan.test    230.00   246.00    7.0%
       test-suite...006/450.soplex/450.soplex.test    284.00   299.00    5.3%
       test-suite...nsumer-jpeg/consumer-jpeg.test     21.00    22.00    4.8%
      2e580102
    • Florian Hahn's avatar
      [DSE] Bail out from getLocForWriteEx if call is not argmemonly/inacc_mem. · 6439fde6
      Florian Hahn authored
      This change should currently not have any impact, but guards against
      further inconsistencies between MemoryLocation and function attributes.
      6439fde6
    • Simon Pilgrim's avatar
      [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3))... · e372a5f8
      Simon Pilgrim authored
      [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support
      
      Reapplied rGa704d8238c86 with a check for integer/integer vector types to prevent matching with pointer types
      e372a5f8
    • Nicolai Hähnle's avatar
      Introduce CfgTraits abstraction · c0cdd22c
      Nicolai Hähnle authored
      The CfgTraits abstraction simplifies writing algorithms that are
      generic over the type of CFG, and enables writing such algorithms
      as regular non-template code that operates on opaque references
      to CFG blocks and values.
      
      Implementations of CfgTraits provide operations on the concrete
      CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`.
      
      CfgInterface is an abstract base class which provides operations
      on opaque types CfgBlockRef and CfgValueRef. Those opaque types
      encapsulate a `void *`, but the meaning depends on the concrete
      CFG type. For example, MachineCfgTraits -- for use with MachineIR
      in SSA form -- encodes a Register inside CfgValueRef. Converting
      between concrete references and opaque/generic ones is done by
      CfgTraits::{fromGeneric,toGeneric}. Convenience methods
      CfgTraits::{un}wrap{Iterator,Range} are available as well.
      
      Writing algorithms in terms of CfgInterface adds some overhead
      (virtual method calls, plus in some cases it removes the
      opportunity to inline iterators), but can be much more convenient
      since generic algorithms can be written as non-templates.
      
      This patch adds implementations of CfgTraits for all CFGs on
      which dominator trees are calculated, so that the dominator
      tree can be ported to this machinery. Only IrCfgTraits (LLVM IR)
      and MachineCfgTraits (Machine IR in SSA form) are complete; the
      other implementations are limited to the absolute minimum
      required to make the upcoming dominator tree changes work.
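
      A rough sketch of the pattern described above: an opaque reference wraps a
      `void *`, a traits struct converts between concrete and opaque references, and a
      non-template algorithm works through an abstract interface. All `Toy*` names and
      `OpaqueBlockRef` are hypothetical stand-ins rather than the classes added by this
      patch; `wrapRef`/`unwrapRef` follow the naming mentioned in the v6 notes.

      ```cpp
      #include <cassert>
      #include <string>

      // Opaque block reference: a void* whose meaning depends on the CFG type.
      class OpaqueBlockRef {
        void *Ptr = nullptr;

      public:
        OpaqueBlockRef() = default;
        explicit OpaqueBlockRef(void *P) : Ptr(P) {}
        void *get() const { return Ptr; }
        explicit operator bool() const { return Ptr != nullptr; }
      };

      // A hypothetical concrete CFG: a chain of blocks with one successor each.
      struct ToyBlock {
        std::string Name;
        ToyBlock *Succ;
        explicit ToyBlock(std::string N, ToyBlock *S = nullptr)
            : Name(N), Succ(S) {}
      };

      // Traits: conversions between the concrete and the opaque reference.
      struct ToyCfgTraits {
        using BlockRef = ToyBlock *;
        static OpaqueBlockRef wrapRef(BlockRef B) { return OpaqueBlockRef(B); }
        static BlockRef unwrapRef(OpaqueBlockRef R) {
          return static_cast<ToyBlock *>(R.get());
        }
      };

      // Abstract interface operating purely on opaque references.
      class ToyCfgInterface {
      public:
        virtual ~ToyCfgInterface() = default;
        virtual std::string blockName(OpaqueBlockRef B) const = 0;
        virtual OpaqueBlockRef singleSuccessor(OpaqueBlockRef B) const = 0;
      };

      // Implementation for ToyBlock, written in terms of the traits.
      class ToyCfgInterfaceImpl final : public ToyCfgInterface {
      public:
        std::string blockName(OpaqueBlockRef B) const override {
          return ToyCfgTraits::unwrapRef(B)->Name;
        }
        OpaqueBlockRef singleSuccessor(OpaqueBlockRef B) const override {
          return ToyCfgTraits::wrapRef(ToyCfgTraits::unwrapRef(B)->Succ);
        }
      };

      // Generic, non-template algorithm: count the blocks in a successor chain.
      // Each step costs a virtual call, which is the overhead noted above.
      unsigned chainLength(const ToyCfgInterface &Cfg, OpaqueBlockRef Start) {
        unsigned N = 0;
        for (OpaqueBlockRef B = Start; B; B = Cfg.singleSuccessor(B))
          ++N;
        return N;
      }
      ```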
      
      v5:
      - fix MachineCfgTraits::blockdef_iterator and allow it to iterate over
        the instructions in a bundle
      - use MachineBasicBlock::printName
      
      v6:
      - implement predecessors/successors for all CfgTraits implementations
      - fix error in unwrapRange
      - rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
        that is consistent with {wrap,unwrap}{Iterator,Range}
      - use getVRegDef instead of getUniqueVRegDef
      
      v7:
      - std::forward fix in wrapping_iterator
      - fix typos
      
      v8:
      - cleanup operators on CfgOpaqueType
      - address other review comments
      
      Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d
      
      Differential Revision: https://reviews.llvm.org/D83088
      c0cdd22c
    • Atmn Patel's avatar
      [IR] Adds mustprogress as a LLVM IR attribute · 595c6156
      Atmn Patel authored
      This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.
      
      Reviewed By: nikic
      
      Differential Revision: https://reviews.llvm.org/D85393
      595c6156
    • Serguei Katkov's avatar
      [IRCE] Do not transform if loop has small number of iterations · 38799975
      Serguei Katkov authored
      IRCE has some overhead for runtime checks, and when the number of iterations is small
      the overhead can outweigh the benefit of the optimization.
      
      This change uses the BlockFrequencyInfo of the pre-header and header to estimate the
      number of loop iterations. If the estimate is less than irce-min-estimated-iters, we do not transform the loop.
      
      A more complex cost model would probably be better, but for simplicity this seems to be enough.
      
      BFI is used only with the new pass manager, and the pass tries to use it efficiently.
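
      The estimate can be sketched as a ratio of block frequencies. The code below is
      a simplified illustration, not the code from this change: frequencies are plain
      integers, the function names are hypothetical, the use of integer division is an
      assumption, and so is the choice to leave the loop eligible when no estimate is
      available.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Estimated iterations per loop entry: how often the header runs for
      // each time the pre-header runs.
      uint64_t estimatedIterations(uint64_t HeaderFreq, uint64_t PreheaderFreq) {
        assert(PreheaderFreq != 0 && "no estimate without a pre-header frequency");
        return HeaderFreq / PreheaderFreq;
      }

      // Returns false when the estimated trip count is below the threshold, so
      // the runtime-check overhead is unlikely to pay off.
      bool shouldTransformLoop(uint64_t HeaderFreq, uint64_t PreheaderFreq,
                               uint64_t MinEstimatedIters) {
        if (PreheaderFreq == 0)
          return true; // assumption: without an estimate, do not block the transform
        return estimatedIterations(HeaderFreq, PreheaderFreq) >= MinEstimatedIters;
      }
      ```

      Here `MinEstimatedIters` plays the role of the irce-min-estimated-iters option; its
      value is passed in by the caller because this sketch does not assume a default.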
      
      Reviewers: ebrevnov, dantrushin, asbirlea, mkazantsev
      Reviewed By: mkazantsev
      Subscribers: llvm-commits, fhahn
      Differential Revision: https://reviews.llvm.org/D89541
      38799975
    • Jordan Rupprecht's avatar
      [NFC] Inline assertion-only variable · 8a377f1e
      Jordan Rupprecht authored
      8a377f1e
  5. Oct 19, 2020