  1. Aug 22, 2018
  2. Aug 21, 2018
    • Tom Stellard's avatar
      MachineScheduler: Refactor setPolicy() to limit computing remaining latency · ecd6aa5b
      Tom Stellard authored
      Summary:
Computing the remaining latency can be very expensive, especially
on graphs of N nodes where the number of edges approaches N^2.
      
      This reduces the compile time of a pathological case with the
      AMDGPU backend from ~7.5 seconds to ~3 seconds.  This test case has
      a basic block with 2655 stores, each with somewhere between 500
      and 1500 successors and predecessors.
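
      The cost being limited here can be illustrated with a minimal Python
      sketch (hypothetical names, not LLVM's scheduler code): remaining
      latency is a longest-path query over the dependence graph, cheap per
      node with memoization but still expensive to recompute eagerly when
      the number of edges approaches N^2.

      ```python
      # Hypothetical sketch, not LLVM's implementation: "remaining latency"
      # of a node is the longest latency sum along any path to a graph exit.
      def remaining_latency(succs, latency, node, memo=None):
          """Longest latency sum from `node` to any exit, memoized."""
          if memo is None:
              memo = {}
          if node not in memo:
              memo[node] = max(
                  (latency[s] + remaining_latency(succs, latency, s, memo)
                   for s in succs[node]),
                  default=0,  # exit node: nothing remains
              )
          return memo[node]
      ```

      Even with memoization, walking a dense graph for every scheduling
      policy decision adds up, which is why the refactoring limits when
      this value is computed at all.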
      
      Reviewers: atrick, MatzeB, airlied, mareko
      
      Reviewed By: mareko
      
      Subscribers: tpr, javed.absar, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50486
      
      llvm-svn: 340346
      ecd6aa5b
    • Scott Linder's avatar
      [AMDGPU] Consider loads from flat addrspace to be potentially divergent · 72855e36
      Scott Linder authored
      In general we can't assume flat loads are uniform, and cases where we can prove
      they are should be handled through infer-address-spaces.
      
      Differential Revision: https://reviews.llvm.org/D50991
      
      llvm-svn: 340343
      72855e36
    • Zachary Turner's avatar
      [MS Demangler] Fix a few more edge cases. · df4cd7cb
      Zachary Turner authored
      I found these by running llvm-undname over a couple hundred
      megabytes of object files generated as part of building chromium.
      The issues fixed in this patch are:
      
        1) decltype-auto return types.
        2) Indirect vtables (e.g. const A::`vftable'{for `B'})
        3) Pointers, references, and rvalue-references to member pointers.
      
      I have exactly one remaining symbol out of a few hundred MB of object
      files that produces a name we can't demangle, and it's related to
      back-referencing.
      
      llvm-svn: 340341
      df4cd7cb
    • Heejin Ahn's avatar
      [WebAssembly] Restore __stack_pointer after catch instructions · 78d19108
      Heejin Ahn authored
      Summary:
      After the stack is unwound due to a thrown exception, the
`__stack_pointer` global can point to an invalid address. This patch
inserts instructions that restore the `__stack_pointer` global.
      
      Reviewers: jgravelle-google, dschuff
      
      Subscribers: mgorny, sbc100, sunfish, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50980
      
      llvm-svn: 340339
      78d19108
    • Thomas Lively's avatar
      [WebAssembly] v128.const · 22442924
      Thomas Lively authored
      Summary:
      This CL implements v128.const for each vector type. New operand types
      are added to ensure the vector contents can be serialized without LEB
      encoding. Tests are added for instruction selection, encoding,
      assembly and disassembly.
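
      For context on why LEB encoding is avoided here, a sketch of standard
      unsigned LEB128 (a hypothetical helper, not this CL's code) shows that
      the encoded length varies with the value, which makes it a poor fit
      for serializing the fixed-width contents of a v128 immediate:

      ```python
      def encode_uleb128(value):
          """Encode a non-negative integer as unsigned LEB128 bytes."""
          out = bytearray()
          while True:
              byte = value & 0x7F          # low 7 payload bits
              value >>= 7
              if value:
                  out.append(byte | 0x80)  # set continuation bit
              else:
                  out.append(byte)         # final byte, no continuation
                  return bytes(out)
      ```

      Small values take one byte and a 32-bit value up to five, so a
      LEB-encoded vector immediate would have a data-dependent size.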
      
      Reviewers: aheejin, dschuff, aardappel
      
      Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50873
      
      llvm-svn: 340336
      22442924
    • Marcello Maggioni's avatar
      883fe455
    • Florian Hahn's avatar
      [CodeExtractor] Use 'normal destination' BB as insert point to store invoke results. · 7cdf52e4
      Florian Hahn authored
Currently CodeExtractor tries to use the node after an invoke as the
insert point for the store of the invoke's result, if that result is an
out parameter of the region. This fails, as the invoke terminates the
current BB. Instead, we can place the store in the 'normal destination'
BB, as the result is only available on that path.
      
      
      Reviewers: davidxl, davide, efriedma
      
      Reviewed By: davidxl
      
      Differential Revision: https://reviews.llvm.org/D51037
      
      llvm-svn: 340331
      7cdf52e4
    • Heejin Ahn's avatar
      [WebAssembly] Don't make wasm cleanuppads into funclet entries · 9cd7f88a
      Heejin Ahn authored
      Summary:
      Catchpads and cleanuppads are not funclet entries; they are only EH
scope entries. We already don't set `isEHFuncletEntry` for catchpads.
      This patch does the same thing for cleanuppads.
      
      Reviewers: dschuff
      
      Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50654
      
      llvm-svn: 340330
      9cd7f88a
    • Heejin Ahn's avatar
      [WebAssembly] Change writeSPToMemory to writeSPToGlobal (NFC) · 20c9c443
      Heejin Ahn authored
      Summary: SP is now a __stack_pointer global and not a memory address anymore.
      
      Reviewers: dschuff
      
      Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D51046
      
      llvm-svn: 340328
      20c9c443
    • Bjorn Pettersson's avatar
      [RegisterCoalescer] Use substPhysReg in reMaterializeTrivialDef · e0632138
      Bjorn Pettersson authored
      Summary:
      When RegisterCoalescer::reMaterializeTrivialDef is substituting
      a register use in a DBG_VALUE instruction, and the old register
      is a subreg, and the new register is a physical register,
      then we need to use substPhysReg in order to extract the correct
      subreg.
      
      Reviewers: wmi, aprantl
      
      Reviewed By: wmi
      
      Subscribers: hiraditya, MatzeB, qcolombet, tpr, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50844
      
      llvm-svn: 340326
      e0632138
    • Heejin Ahn's avatar
      [WebAssembly] Add isEHScopeReturn instruction property · ed5e06b0
      Heejin Ahn authored
      Summary:
So far, the `isReturn` property has been used to mean both a return
instruction from a function and the end of an EH scope, a scope that
starts with an EH scope entry BB and ends with a catchret or a
cleanupret instruction. Because WinEH uses funclets, all EH-scope-ending
instructions are also real return instructions from a function. But for
wasm, they only serve as the end marker of an EH scope, not as a return
instruction that exits a function. This mismatch caused incorrect prolog
and epilog generation in wasm EH scopes. This patch fixes this.
      
      This patch is in the same vein with rL333045, which splits
      `MachineBasicBlock::isEHFuncletEntry` into `isEHFuncletEntry` and
      `isEHScopeEntry`.
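
      The flag split described above can be pictured with a hypothetical
      sketch (illustrative Python, not the actual instruction property
      definitions):

      ```python
      class InstrProps:
          """Hypothetical stand-in for per-instruction flags."""
          def __init__(self, is_return, is_eh_scope_return):
              self.is_return = is_return                    # exits the function
              self.is_eh_scope_return = is_eh_scope_return  # ends an EH scope

      # WinEH: catchret leaves a funclet, a real function return, so it
      # carries both flags.
      wineh_catchret = InstrProps(is_return=True, is_eh_scope_return=True)
      # wasm: catchret only marks the end of the scope; treating it as a
      # return is what broke prolog/epilog generation.
      wasm_catchret = InstrProps(is_return=False, is_eh_scope_return=True)
      ```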
      
      Reviewers: dschuff
      
      Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D50653
      
      llvm-svn: 340325
      ed5e06b0
    • Craig Topper's avatar
      [InstCombine] Pull simple checks above a more complicated one. NFCI · 3d8fe39c
      Craig Topper authored
I'm assuming it's easier to make sure the RHS of an XOR is all ones than it is to check for the many select patterns we have, so let's check that first. Same with the one-use check.
      
      llvm-svn: 340321
      3d8fe39c
    • Florian Hahn's avatar
      [GVN] Assign new value number to calls reading memory, if there is no MemDep info. · 9583d4fa
      Florian Hahn authored
      Currently we assign the same value number to two calls reading the same
memory location if we do not have MemoryDependence info. Without MemDep
info we cannot guarantee that there is no store between the two calls, so we
      have to assign a new number to the second call.
      
This patch also adds a new option EnableMemDep to enable/disable running
MemoryDependenceAnalysis, and renames NoLoads to NoMemDepAnalysis to be
more explicit about what it does. As the analysis also impacts calls that
read memory, NoLoads was a bit confusing.
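
      The numbering rule can be sketched minimally (hypothetical Python, not
      GVN's actual tables): identical memory-reading calls may only share a
      value number when dependence info can rule out an intervening store.

      ```python
      # Hypothetical sketch of the rule above. Each call is identified by
      # its callee name; without MemDep info every memory-reading call gets
      # a fresh value number even if an identical call was seen before.
      def number_calls(calls, have_memdep):
          table, numbers, next_vn = {}, [], 0
          for callee in calls:
              if have_memdep and callee in table:
                  numbers.append(table[callee])  # proven redundant: reuse
              else:
                  table[callee] = next_vn        # conservatively fresh
                  numbers.append(next_vn)
                  next_vn += 1
          return numbers
      ```

      With MemDep two identical reads share a number; without it the second
      call conservatively gets a new one, as the patch requires.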
      
      Reviewers: efriedma, sebpop, john.brawn, wmi
      
      Reviewed By: efriedma
      
      Differential Revision: https://reviews.llvm.org/D50893
      
      llvm-svn: 340319
      9583d4fa
    • Krzysztof Parzyszek's avatar
      [RegisterCoalscer] Manually remove leftover segments when commuting def · b211434a
      Krzysztof Parzyszek authored
      In removeCopyByCommutingDef, segments from the source live range are
      copied into (and merged with) the segments of the target live range.
      This is performed for all subranges of the source interval. It can
      happen that there will be subranges of the target interval that had
      no corresponding subranges in the source interval, and in such cases
these subranges will not be updated. Since the copy being coalesced
      is about to be removed, these ranges need to be updated by removing
      the segments that are started by the copy.
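
      A minimal sketch of the cleanup (hypothetical Python, not the
      LiveInterval API): model a subrange as a list of (start, end) slot
      pairs and drop the segments the soon-to-be-deleted copy defines.

      ```python
      # Hypothetical sketch: when the copy at `copy_slot` is about to be
      # erased, any segment it starts has no defining instruction left and
      # must be removed by hand.
      def remove_segments_started_by(subrange, copy_slot):
          return [(start, end) for (start, end) in subrange
                  if start != copy_slot]
      ```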
      
      llvm-svn: 340318
      b211434a
    • Benjamin Kramer's avatar
      [NVPTX] Remove ftz variants of cvt with rounding mode · d66dde5a
      Benjamin Kramer authored
      These do not exist in ptxas, it refuses to compile them.
      
      Differential Revision: https://reviews.llvm.org/D51042
      
      llvm-svn: 340317
      d66dde5a
    • Eric Christopher's avatar
      Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift... · 3dc594c1
      Eric Christopher authored
Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction" due to it causing a compiler crash on valid code.
      
      This reverts commit r340016, testcase forthcoming.
      
      llvm-svn: 340315
      3dc594c1
    • Philip Reames's avatar
      [AST] Remove notion of volatile from alias sets [NFCI] · c3c23e8c
      Philip Reames authored
      Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.
      
      It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.)
      
There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare that it's not worth optimizing for.
      
      llvm-svn: 340312
      c3c23e8c
    • Yury Delendik's avatar
      Update DBG_VALUE register operand during LiveInterval operations · 132fc5a8
      Yury Delendik authored
      Summary:
      Handling of DBG_VALUE in ConnectedVNInfoEqClasses::Distribute() was fixed in
      PR16110. However DBG_VALUE register operands are not getting updated. This
      patch properly resolves the value location.
      
      Reviewers: MatzeB, vsk
      
      Reviewed By: MatzeB
      
      Subscribers: kparzysz, thegameg, vsk, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits
      
      Tags: #debug-info
      
      Differential Revision: https://reviews.llvm.org/D48994
      
      llvm-svn: 340310
      132fc5a8
    • Aditya Nandakumar's avatar
      Revert "Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations"" · c0333f71
      Aditya Nandakumar authored
      This reverts commit d1341152d91398e9a882ba2ee924147ea2f9b589.
      
This patch originally made use of nested MachineIRBuilder buildInstr
calls, and since the order of argument processing is not well defined,
the instructions were built in a slightly different order (still
correct). I've removed the nested buildInstr calls to have a defined
order now.
      
      Patch was tested by Mikael.
      
      llvm-svn: 340309
      c0333f71
    • Simon Pilgrim's avatar
      [X86][SSE] Lower vXi8 general shifts to SSE shifts directly. NFCI. · 50eba6b3
      Simon Pilgrim authored
      Most of these shifts are extended to vXi16 so we don't gain anything from forcing another round of generic shift lowering - we know these extended cases are legal constant splat shifts.
      
      llvm-svn: 340307
      50eba6b3
    • Craig Topper's avatar
      [BypassSlowDivision] Teach bypass slow division not to interfere with div by... · b172b888
      Craig Topper authored
      [BypassSlowDivision] Teach bypass slow division not to interfere with div by constant where constants have been constant hoisted, but not moved from their basic block
      
DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block. The constant will be hoisted, but not moved out of the block.
      
      Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all.
      
      This fixes the case from PR38649.
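
      The bypass idea itself can be sketched as follows (hypothetical
      Python, not the pass): divide on a cheap narrow path when both
      operands happen to fit in 32 bits, but leave hoisted ("opaque")
      constant divisors alone so the later div-by-constant optimization can
      still see them.

      ```python
      # Hypothetical sketch of bypassing slow wide division; values are
      # treated as unsigned 64-bit integers.
      def narrow_div32(a, b):
          return (a & 0xFFFFFFFF) // (b & 0xFFFFFFFF)

      def bypass_div(a, b, divisor_is_opaque_constant=False):
          if divisor_is_opaque_constant:
              return a // b          # don't add control flow; let the
                                     # div-by-constant optimization fire
          if a >> 32 == 0 and b >> 32 == 0:
              return narrow_div32(a, b)  # fast narrow path
          return a // b              # full-width slow path
      ```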
      
      Differential Revision: https://reviews.llvm.org/D51000
      
      llvm-svn: 340303
      b172b888
    • Simon Pilgrim's avatar
      [X86][SSE] Lower v8i16 general shifts to SSE shifts directly. NFCI. · 98eb4ae4
      Simon Pilgrim authored
      We don't gain anything from forcing another round of generic shift lowering - we know these are legal constant splat shifts.
      
      llvm-svn: 340302
      98eb4ae4
    • Simon Pilgrim's avatar
      [X86][SSE] Lower directly to SSE shifts in the BLEND(SHIFT, SHIFT) combine. NFCI. · dbe4e9e3
      Simon Pilgrim authored
      We don't gain anything from forcing another round of generic shift lowering - we know these are legal constant splat shifts.
      
      llvm-svn: 340300
      dbe4e9e3
    • Matt Arsenault's avatar
      Try to fix bot build failure · 182bab8d
      Matt Arsenault authored
      llvm-svn: 340296
      182bab8d
    • Farhana Aleen's avatar
      [AMDGPU] Support idot2 pattern. · 3528c803
      Farhana Aleen authored
Summary: Transform
  add(mul((i32)S0.x, (i32)S1.x), add(mul((i32)S0.y, (i32)S1.y), (i32)S3))
    => i/udot2((v2i16)S0, (v2i16)S1, (i32)S3)
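
      The algebraic equivalence behind the pattern can be checked with a
      small sketch (hypothetical names, scalar Python rather than the
      backend's DAG nodes):

      ```python
      # Left-hand side of the transform: two scalar multiplies folded into
      # nested adds with an accumulator S3.
      def scalar_form(s0, s1, s3):
          return s0[0] * s1[0] + (s0[1] * s1[1] + s3)

      # Right-hand side: a 2-element dot product plus the accumulator,
      # which is what the dot2 instruction computes.
      def dot2(s0, s1, s3):
          return sum(a * b for a, b in zip(s0, s1)) + s3
      ```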
      
      Author: FarhanaAleen
      
      Reviewed By: arsenm
      
      Subscribers: llvm-commits, AMDGPU
      
      Differential Revision: https://reviews.llvm.org/D50024
      
      llvm-svn: 340295
      3528c803