  1. Oct 07, 2019
  2. Oct 06, 2019
  3. Oct 05, 2019
    • [X86][AVX] Push sign extensions of comparison bool results through bitops (PR42025) · 8815be04
      Simon Pilgrim authored
      As discussed on PR42025, with more complex boolean math we can end up with many truncations/extensions of the comparison results through each bitop.
      
      This patch handles the cases introduced in combineBitcastvxi1 by pushing the sign extension through the AND/OR/XOR ops so it's just the original SETCC ops that get extended.
      
      Differential Revision: https://reviews.llvm.org/D68226
      
      llvm-svn: 373834
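
The arithmetic fact behind this kind of transform can be checked in isolation: for 1-bit boolean values, sign-extending the result of an AND/OR/XOR yields the same bits as applying the op to the sign-extended operands. The following is a minimal C++ sketch of that identity only. The helper names `sext1` and `sextDistributes` are made up for illustration and do not appear in the patch, and the code does not model the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign-extend a 1-bit boolean to 32 bits.
// false -> 0x00000000, true -> 0xFFFFFFFF.
constexpr std::uint32_t sext1(bool b) { return b ? 0xFFFFFFFFu : 0u; }

// Hypothetical helper: true when sext(a op b) == sext(a) op sext(b)
// holds for AND, OR and XOR on the given pair of booleans.
constexpr bool sextDistributes(bool a, bool b) {
  return sext1(a && b) == (sext1(a) & sext1(b)) &&
         sext1(a || b) == (sext1(a) | sext1(b)) &&
         sext1(a != b) == (sext1(a) ^ sext1(b));
}

// Exhaustive compile-time check over all four input combinations.
static_assert(sextDistributes(false, false) && sextDistributes(false, true) &&
                  sextDistributes(true, false) && sextDistributes(true, true),
              "sign extension commutes with AND/OR/XOR on 1-bit values");
```

Because the identity holds for every input pair, extending only the leaf comparison results and doing the bitops at the wider type gives the same answer as extending after each bitop.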
    • [SLP] avoid reduction transform on patterns that the backend can load-combine · e2321bb4
      Sanjay Patel authored
      I don't see an ideal solution to these 2 related, potentially large, perf regressions:
      https://bugs.llvm.org/show_bug.cgi?id=42708
      https://bugs.llvm.org/show_bug.cgi?id=43146
      
      We decided that load combining was unsuitable for IR because it could obscure other
      optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
      Therefore, preventing SLP from destroying load combine opportunities requires that it
      recognize patterns that could be combined later but not perform the optimization itself
      (it's not a vector combine anyway, so it's probably out of scope for SLP).
      
      Here, we add a scalar cost model adjustment with a conservative pattern match and cost
      summation for a multi-instruction sequence that can probably be reduced later.
      This should prevent SLP from creating a vector reduction unless that sequence is
      extremely cheap.
      
      In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
      will produce a single instruction on these tests like:
      
        movbe   rax, qword ptr [rdi]
      
      or:
      
        mov     rax, qword ptr [rdi]
      
      Not some (half) vector monstrosity as we currently do using SLP:
      
        vpmovzxbq       ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
        vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
        movzx   eax, byte ptr [rdi]
        movzx   ecx, byte ptr [rdi + 5]
        shl     rcx, 40
        movzx   edx, byte ptr [rdi + 6]
        shl     rdx, 48
        or      rdx, rcx
        movzx   ecx, byte ptr [rdi + 7]
        shl     rcx, 56
        or      rcx, rdx
        or      rcx, rax
        vextracti128    xmm1, ymm0, 1
        vpor    xmm0, xmm0, xmm1
        vpshufd xmm1, xmm0, 78          # xmm1 = xmm0[2,3,0,1]
        vpor    xmm0, xmm0, xmm1
        vmovq   rax, xmm0
        or      rax, rcx
        vzeroupper
        ret
      
      Differential Revision: https://reviews.llvm.org/D67841
      
      llvm-svn: 373833
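
For readers without the bug reports open, the source-level shape of a load-combine candidate is roughly the following: eight adjacent byte loads, each zero-extended, shifted into position, and ORed together. This is a sketch under assumptions. The function names `load_le64` and `load_be64` are illustrative and not taken from the bug reports or the patch, and whether a given compiler folds them into a single `mov` or `movbe` depends on the target and optimization settings.

```cpp
#include <cassert>
#include <cstdint>

// Assemble a 64-bit value from 8 bytes in little-endian order.
// On a little-endian target this is equivalent to one 8-byte load.
std::uint64_t load_le64(const std::uint8_t *p) {
  return (static_cast<std::uint64_t>(p[0])) |
         (static_cast<std::uint64_t>(p[1]) << 8) |
         (static_cast<std::uint64_t>(p[2]) << 16) |
         (static_cast<std::uint64_t>(p[3]) << 24) |
         (static_cast<std::uint64_t>(p[4]) << 32) |
         (static_cast<std::uint64_t>(p[5]) << 40) |
         (static_cast<std::uint64_t>(p[6]) << 48) |
         (static_cast<std::uint64_t>(p[7]) << 56);
}

// Same bytes assembled in big-endian order.
// On a little-endian target this is an 8-byte load plus a byte swap.
std::uint64_t load_be64(const std::uint8_t *p) {
  return (static_cast<std::uint64_t>(p[0]) << 56) |
         (static_cast<std::uint64_t>(p[1]) << 48) |
         (static_cast<std::uint64_t>(p[2]) << 40) |
         (static_cast<std::uint64_t>(p[3]) << 32) |
         (static_cast<std::uint64_t>(p[4]) << 24) |
         (static_cast<std::uint64_t>(p[5]) << 16) |
         (static_cast<std::uint64_t>(p[6]) << 8) |
         (static_cast<std::uint64_t>(p[7]));
}
```

The chain of ORs is what makes the pattern look like a reduction to SLP, which is why the cost model adjustment has to recognize it before vectorizing.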
    • [X86] lowerShuffleAsLanePermuteAndRepeatedMask - variable renames. NFCI. · 9ecacb0d
      Simon Pilgrim authored
      Rename some variables to match lowerShuffleAsRepeatedMaskAndLanePermute - prep work toward adding some equivalent sublane functionality.
      
      llvm-svn: 373832
    • BranchFolding - IsBetterFallthrough - assert non-null pointers. NFCI. · f609c0a3
      Simon Pilgrim authored
      Silences static analyzer null dereference warnings.
      
      llvm-svn: 373823
    • Expose ProvidePositionalOption as a public API · 482f4d9a
      Mehdi Amini authored
      The motivation is to reuse the key-value parsing logic here to
      parse instance-specific pass options within the context of MLIR.
      The primary functionality exposed is the "," splitting for
      arrays and the logic for properly handling duplicate definitions
      of a single flag.
      
      Patch by: Parker Schuh <parkers@google.com>
      
      Differential Revision: https://reviews.llvm.org/D68294
      
      llvm-svn: 373815
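
To illustrate what "," splitting for an array-valued option means, here is a standalone C++ sketch. It is not the LLVM CommandLine implementation and does not call `ProvidePositionalOption`. The helper name `splitCommaSeparated` is hypothetical, and keeping empty fields is a choice made for this sketch rather than a statement about how LLVM treats them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split "a,b,c" into {"a", "b", "c"}.
// Empty fields are preserved, so "a,,b" yields {"a", "", "b"}.
std::vector<std::string> splitCommaSeparated(const std::string &value) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type comma = value.find(',', start);
    if (comma == std::string::npos) {
      // No more separators: the remainder is the last element.
      parts.push_back(value.substr(start));
      break;
    }
    parts.push_back(value.substr(start, comma - start));
    start = comma + 1;
  }
  return parts;
}
```

Each resulting element would then be handed to the option's value parser one at a time, which is the behavior the commit describes reusing for per-instance pass options.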
    • Fix a *nasty* miscompile in experimental unordered atomic lowering · d5a4dad2
      Philip Reames authored
      This is an omission in rL371441.  Loads which happened to be unordered weren't being added to the PendingLoad set, and thus weren't being ordered with respect to side effects which followed before the end of the block.
      
      Included test case is how I spotted this.  We had an atomic load being folded into a using instruction after a fence that load was supposed to be ordered with.  I'm sure it showed up a bunch of other ways as well.
      
      Spotted via manual inspection of assembly differences in a corpus with and without the new experimental mode.  Finding this with testing would have been "unpleasant".
      
      llvm-svn: 373814
    • [RISCV] Added missing ImmLeaf predicates · ea835f5c
      Ana Pazos authored
      simm9_lsb0 and simm12_lsb0 operand types were missing predicates.
      
      llvm-svn: 373812
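
Going by the operand names alone, `simm9_lsb0` and `simm12_lsb0` presumably describe signed 9-bit and 12-bit immediates whose least significant bit is zero. A predicate of that shape can be sketched in plain C++ as below. The helper name `isSImmLsb0` is hypothetical, and this is a reading of the names, not the TableGen predicate added by the commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when `imm` fits in a signed N-bit field
// and is even (bit 0 clear).
template <unsigned N> constexpr bool isSImmLsb0(std::int64_t imm) {
  return imm >= -(std::int64_t{1} << (N - 1)) &&
         imm <= (std::int64_t{1} << (N - 1)) - 1 &&
         imm % 2 == 0;
}

// Under this reading, a 9-bit field accepts even values in [-256, 254]
// and a 12-bit field accepts even values in [-2048, 2046].
static_assert(isSImmLsb0<9>(-256) && isSImmLsb0<9>(254), "range ends");
static_assert(!isSImmLsb0<9>(255) && !isSImmLsb0<9>(256), "odd or too wide");
```

Without such a predicate, a pattern using the operand would accept any immediate, which is the gap the commit closes.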
    • Invalidate assumption cache before outlining. · 6a267360
      Aditya Kumar authored
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Reviewers: compnerd, vsk, sebpop, fhahn, tejohnson
      
      Reviewed by: vsk
      
      Differential Revision: https://reviews.llvm.org/D68478
      
      llvm-svn: 373807