  1. Sep 23, 2019
    • Reland "[utils] Implement the llvm-locstats tool" · 0e490ae0
      Djordje Todorovic authored
      The tool reports verbose output for the DWARF debug location coverage.
      For each variable or formal parameter DIE, llvm-locstats computes the
      percentage of the code section bytes, where the DIE is in scope, that
      are covered by a location description. The 0 line shows the number
      (and the percentage) of DIEs with no location information, while the
      100 line shows the number (and the percentage) of DIEs whose location
      information covers all of the code section bytes where the variable
      or parameter is in scope. The 50..59 line shows the number (and the
      percentage) of DIEs whose location information covers between 50 and
      59 percent of their scope.
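      As a minimal illustration (hypothetical C input, not part of this
      commit), a parameter can lose its location partway through its
      scope, which lands its DIE in one of the partial-coverage lines:

      int g(int);

      /* Hypothetical example: once the call clobbers the register
         holding 'n', the DW_TAG_formal_parameter DIE for 'n' may carry
         a location description for only part of its scope, so
         llvm-locstats would count it in a line such as 50..59. */
      int f(int n) {
        int r = n * 2;  /* 'n' still has a location here */
        r += g(r);      /* the call may clobber the register holding 'n' */
        return r;       /* 'n' may have no location from here on */
      }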
      
      Differential Revision: https://reviews.llvm.org/D66526
      
      llvm-svn: 372554
    • Craig Topper · 03b5a13e
    • [X86] Remove SETEQ/SETNE canonicalization code from LowerIntVSETCC_AVX512 to... · 5e26064c
      Craig Topper authored
      [X86] Remove SETEQ/SETNE canonicalization code from LowerIntVSETCC_AVX512 to prevent an infinite loop.
      
      The attached test case would previously enter an infinite loop
      after r365711.
      
      I'm going to move this to X86ISelDAGToDAG.cpp to get the setcc
      to match VPTEST in 32-bit mode in a follow-up commit.
      
      llvm-svn: 372543
    • [X86] Add 32-bit command line to avx512f-vec-test-testn.ll · 1f058538
      Craig Topper authored
      llvm-svn: 372542
    • Prefer AVX512 memcpy when applicable · a7a515cb
      David Zarzycki authored
      When AVX512 is available and the preferred vector width is 512 bits
      or more, we should prefer AVX512 for memcpy().
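      As a rough illustration (hypothetical C, not from this commit), a
      fixed-size copy like the following can then be expanded inline with
      64-byte zmm accesses rather than 32-byte ymm ones:

      #include <string.h>

      /* Hypothetical example: built with -mavx512f
         -mprefer-vector-width=512, this fixed-size memcpy can be
         lowered to a single 64-byte zmm load/store pair instead of
         two 32-byte ymm pairs. */
      void copy64(char *dst, const char *src) {
        memcpy(dst, src, 64);
      }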
      
      https://bugs.llvm.org/show_bug.cgi?id=43240
      
      https://reviews.llvm.org/D67874
      
      llvm-svn: 372540
    • [X86] Convert Constant arguments to MMX shift by i32 intrinsics to... · da4a4707
      Craig Topper authored
      [X86] Convert Constant arguments to MMX shift by i32 intrinsics to TargetConstant during lowering.
      
      This allows us to use timm in the isel table, which is more
      consistent with other intrinsics that now take an immediate.
      
      We can't declare the intrinsic as taking an ImmArg because we
      need to match non-constants to the shift by MMX register
      instruction which we do by mutating the intrinsic id during
      lowering.
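      
      A minimal sketch of the two cases (hypothetical user code, assuming
      <mmintrin.h> and MMX support):

      #include <mmintrin.h>

      /* Constant count: lowered with a TargetConstant and matched as
         timm in the isel table. */
      __m64 shl_const(__m64 v) {
        return _mm_slli_pi32(v, 3);
      }

      /* Non-constant count (allowed for gcc compatibility): the count
         is moved from an integer register into an MMX register and the
         shift-by-register instruction is matched instead. */
      __m64 shl_var(__m64 v, int n) {
        return _mm_slli_pi32(v, n);
      }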
      
      llvm-svn: 372537
    • [X86] Remove stale FIXME. · 5efc928a
      Craig Topper authored
      This goes back to when MMX was migrated to intrinsics only. The
      hack referenced here has been gone for quite a while.
      
      llvm-svn: 372536
    • [X86][SelectionDAGBuilder] Move the hack for handling MMX shift by i32... · a533e877
      Craig Topper authored
      [X86][SelectionDAGBuilder] Move the hack for handling MMX shift by i32 intrinsics into the X86 backend.
      
      These intrinsics should be shift-by-immediate, but gcc allows any
      i32 scalar and clang needs to match that. So we try to detect the
      non-constant case and move the data from an integer register to an
      MMX register.
      
      Previously this was done by creating a v2i32 build_vector and
      bitcast in SelectionDAGBuilder. This had to be done early since
      v2i32 isn't a legal type. The bitcast+build_vector would be DAG
      combined to X86ISD::MMX_MOVW2D which isel will turn into a
      GPR->MMX MOVD.
      
      This commit just moves the whole thing to lowering and emits
      the X86ISD::MMX_MOVW2D directly to avoid the illegal type. The
      test changes just seem to be due to nodes being linearized in a
      different order.
      
      llvm-svn: 372535
    • [X86] Require last argument to LWPINS/LWPVAL builtins to be an ICE. Add ImmArg... · e4c17651
      Craig Topper authored
      [X86] Require last argument to LWPINS/LWPVAL builtins to be an ICE. Add ImmArg to the llvm intrinsics.
      
      Update the isel patterns to use timm instead of imm.
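      
      At the source level (a hypothetical example, assuming -mlwp and
      <x86intrin.h>), the ICE requirement means the flags operand must
      now be a compile-time constant:

      #include <x86intrin.h>

      /* Hypothetical example: the third (flags) operand must be an
         integer constant expression; a runtime value is rejected. */
      unsigned char lwp_insert(unsigned data2, unsigned data1) {
        return __lwpins32(data2, data1, 0x1);
      }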
      
      llvm-svn: 372534
    • [X86] X86DAGToDAGISel::matchBEXTRFromAndImm(): if can't use BEXTR, fallback to... · 7c3d6f5a
      Roman Lebedev authored
      [X86] X86DAGToDAGISel::matchBEXTRFromAndImm(): if can't use BEXTR, fallback to BZHI is profitable (PR43381)
      
      Summary:
      PR43381 notes that while we are good at matching `(X >> C1) & C2` as BEXTR/BEXTRI,
      we only do that if we either have BEXTRI (TBM),
      or if BEXTR is marked as being fast (`-mattr=+fast-bextr`).
      In all other cases we don't match.
      
      But that is mainly only true for AMD CPUs.
      However, for all the CPUs for which we have sched models,
      BZHI is always fast (or the sched models are all bad).
      
      So if we decide that it's unprofitable to emit BEXTR/BEXTRI,
      we should consider falling-back to BZHI if it is available,
      and follow-up with the shift.
      
      While it's really tempting to do something because it's cool, it is
      wise to first think about whether it actually makes sense to do.
      We shouldn't use BZHI just because we can, but only if it is
      beneficial. In particular, it isn't really worth it if the input is
      a register, the mask is small, or we can fold a load.
      But it is worth it if the mask does not fit into 32 bits.
      
      (Careful: I don't know much about Intel CPUs, so my choice of
      `-mcpu` may be bad here.)
      Thus we manage to fold a load:
      https://godbolt.org/z/Er0OQz
      Or if we'd end up using BZHI anyway because the mask is large:
      https://godbolt.org/z/dBJ_5h
      But this isn't actually profitable in the general case, e.g. here
      we'd increase the micro-op count (register renaming is free; mca
      does not seem to model that there):
      https://godbolt.org/z/k6wFoz
      Likewise, it is not worth it if we just get load folding:
      https://godbolt.org/z/1M1deG
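      
      As a rough illustration (hypothetical C, not from this patch), the
      pattern in question and the wide-mask case where the BZHI fallback
      pays off:

      #include <stdint.h>

      /* Narrow mask: a plain SHR+AND (or BEXTR, where it is fast) is
         fine. */
      uint64_t extract_narrow(uint64_t x) {
        return (x >> 21) & 0xFF;
      }

      /* Mask does not fit into 32 bits: can now be lowered as BZHI
         (zero bits 55 and up) followed by a shift, avoiding a 64-bit
         AND immediate. */
      uint64_t extract_wide(uint64_t x) {
        return (x >> 13) & 0x3FFFFFFFFFFull; /* 42 one bits */
      }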
      
      https://bugs.llvm.org/show_bug.cgi?id=43381
      
      Reviewers: RKSimon, craig.topper, davezarzycki, spatel
      
      Reviewed By: craig.topper, davezarzycki
      
      Subscribers: andreadb, hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D67875
      
      llvm-svn: 372532