  1. Feb 15, 2017
    • [AMDGPU] Revert failed scheduling · 582a5237
      Stanislav Mekhanoshin authored
      This patch reverts a region's scheduling to its original untouched state
      if we have decreased occupancy.
      
      In addition it switches to use TargetRegisterInfo occupancy callback
      for pressure limits instead of gradually increasing limits which were
      just passed by. We are going to stay with the best schedule so we do
      not need to tolerate worsened scheduling anymore.
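
The revert-on-regression idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch's implementation: `Region`, `scheduleOrRevert`, and the occupancy callback are hypothetical names, and occupancy is modelled as an arbitrary function of the instruction order.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scheduling region: just an ordered list of
// instruction ids. This is not LLVM's actual data structure.
struct Region {
  std::vector<int> Order;
};

// Applies Candidate as the new order, but restores the original order if the
// occupancy reported by the callback drops. Returns true if Candidate is kept.
template <typename OccupancyFn>
bool scheduleOrRevert(Region &R, std::vector<int> Candidate,
                      OccupancyFn Occupancy) {
  const unsigned Before = Occupancy(R.Order);
  std::vector<int> Saved = R.Order; // remember the untouched state
  R.Order = std::move(Candidate);
  if (Occupancy(R.Order) < Before) {
    R.Order = std::move(Saved); // occupancy decreased: revert
    return false;
  }
  return true;
}
```

With a callback that reports lower occupancy for the candidate order, `scheduleOrRevert` leaves `R.Order` exactly as it was before the call.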
      
      Differential Revision: https://reviews.llvm.org/D29971
      
      llvm-svn: 295206
    • [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs · 0f0e5bd3
      Simon Pilgrim authored
      Add support for specifying an UNPCK input as ZERO; this particularly improves ZEXT cases with non-zero offsets.
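
Why a zero UNPCK input amounts to a zero-extension can be shown with a scalar model of the byte unpack instructions. This is a sketch only: `V16i8`, `unpackLow`, `unpackHigh`, and `lane16` are hypothetical helpers, not LLVM or intrinsic APIs, and the model assumes little-endian lane layout as on x86.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V16i8 = std::array<uint8_t, 16>;

// Scalar model of PUNPCKLBW: interleave the low 8 bytes of A and B.
V16i8 unpackLow(const V16i8 &A, const V16i8 &B) {
  V16i8 R{};
  for (int I = 0; I < 8; ++I) {
    R[2 * I] = A[I];
    R[2 * I + 1] = B[I];
  }
  return R;
}

// Scalar model of PUNPCKHBW: interleave the high 8 bytes of A and B.
V16i8 unpackHigh(const V16i8 &A, const V16i8 &B) {
  V16i8 R{};
  for (int I = 0; I < 8; ++I) {
    R[2 * I] = A[8 + I];
    R[2 * I + 1] = B[8 + I];
  }
  return R;
}

// Reads 16-bit lane number Lane of V, assuming little-endian byte order.
uint16_t lane16(const V16i8 &V, int Lane) {
  return static_cast<uint16_t>(V[2 * Lane] | (V[2 * Lane + 1] << 8));
}
```

Interleaving with an all-zero second input places a zero byte above every source byte, so each 16-bit lane of `unpackHigh(A, Zero)` holds the zero-extended byte `A[8 + Lane]`, which is a zero-extension starting at a non-zero offset.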
      
      llvm-svn: 295169
    • [LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el · ec657929
      Sagar Thakur authored
      Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit.
      
      Reviewed by sdardis, dberris
      Differential: D27697
      
      llvm-svn: 295164
    • [X86][AVX] Remove REX_W from AVX instructions. · b8a4f255
      Ayman Musa authored
      REX_W has no meaning in VEX-encoded AVX instructions.
      
      Differential Revision: https://reviews.llvm.org/D29894
      
      llvm-svn: 295157
    • [X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types · fbc7805e
      Craig Topper authored
      Summary:
      We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.
      
      As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.
      
      I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.
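
The reason an extract of the low 128 bits can sit in front of the broadcast without changing the result can be modelled with plain arrays. This is a toy sketch: `Vec`, `extractLow128`, and `broadcast` are hypothetical names, and elements are assumed to be 32 bits wide, so four of them make up 128 bits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t N> using Vec = std::array<int, N>;

// Models EXTRACT_SUBVECTOR at index 0: keep only the low four elements.
template <std::size_t N> Vec<4> extractLow128(const Vec<N> &V) {
  static_assert(N >= 4, "input must hold at least 128 bits");
  Vec<4> R = {{V[0], V[1], V[2], V[3]}};
  return R;
}

// Models a broadcast of element 0 of a 128-bit input to N result elements.
template <std::size_t N> Vec<N> broadcast(const Vec<4> &In) {
  Vec<N> R;
  R.fill(In[0]);
  return R;
}
```

Because only element 0 is read, `broadcast<8>(extractLow128(Wide))` gives the same result whether `Wide` is a 256-bit or a 512-bit value in this model.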
      
      This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused.
      
      Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0.
      
      Reviewers: delena, RKSimon, zvi
      
      Reviewed By: zvi
      
      Subscribers: igorb, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28747
      
      llvm-svn: 295155
    • [AVX-512] Add PACKSS/PACKUS instructions to load folding tables. · ec5df5f4
      Craig Topper authored
      llvm-svn: 295154
    • [AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups · 19f98c6a
      Stanislav Mekhanoshin authored
      This patch corrects the maximum workgroups per CU if we have big
      workgroups (more than 128). This calculation contributes to the
      occupancy calculation with respect to LDS size.
      
      Differential Revision: https://reviews.llvm.org/D29974
      
      llvm-svn: 295134
  2. Feb 14, 2017
  3. Feb 13, 2017
    • GlobalISel: represent atomic loads & stores via the MachineMemOperand. · 48dfa1a6
      Tim Northover authored
      Also make sure the AArch64 backend doesn't try to convert them into normal
      loads and stores.
      
      llvm-svn: 294993
    • [ARM] Fix crash caused by r294945 · 0ae22022
      James Molloy authored
      I'd missed a creator of FCMP nodes - duplicateCmp().
      
      Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite.
      
      llvm-svn: 294968
    • [mips] divide macro instruction cleanup. · 509da1a4
      Simon Dardis authored
      Clean up the implementation of divide macro expansion by getting rid of a
      FIXME regarding magic numbers and branch instructions. Match GAS' behaviour
      for expansion of ddiv / div in the two and three operand cases. Add the two
      operand alias for MIPSR6. Finally, optimize macro expansion cases where the
      divisor is the $zero register.
      
      Reviewers: slthakur
      
      Differential Revision: https://reviews.llvm.org/D29887
      
      llvm-svn: 294960
    • Fix indentation. NFCI. · fd6a84fb
      Simon Pilgrim authored
      llvm-svn: 294959
    • [CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base · 490d4a6d
      Sanne Wouda authored
      Summary:
      The attached test case fails with "fatal error: error in backend:
      misaligned pc-relative fixup value" as the jump table is misaligned.
      The EmitAlignment existed already for ARM and Thumb-1 code, but was
      missing for Thumb-2.
      
      The test checks that the fatal error disappears when generating an obj
      file, as well as checking the align directive is there when producing an
      asm file.
      
      
      Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker
      
      Reviewed By: samparker
      
      Subscribers: samparker, aemerson, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29650
      
      llvm-svn: 294950
    • [Thumb-1] TBB generation: spot redefinitions of index register · 92497542
      James Molloy authored
      We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that
      a particular register in that sequence is killed (so it can be clobbered by the pseudo).
      
      We weren't noticing if an errant MOV or other instruction had infiltrated the
      sequence we were walking. If it had, and it defined the register we've already
      identified as killed, it makes it live across the tBR_JT and thus unclobberable.
      
      Notice this case and bail out.
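
The check described above can be sketched on a simplified instruction list. This is an illustration only: `Inst` and `killedRegStaysDead` are hypothetical, registers are plain integers, and the real pass works on MachineInstrs rather than on this model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: only the registers it defines matter.
struct Inst {
  std::vector<unsigned> Defs;
};

// Returns true if Reg, killed at KillIdx, can still be treated as dead at
// JumpIdx, i.e. no instruction strictly between the two defines it again.
bool killedRegStaysDead(const std::vector<Inst> &Seq, std::size_t KillIdx,
                        std::size_t JumpIdx, unsigned Reg) {
  for (std::size_t I = KillIdx + 1; I < JumpIdx && I < Seq.size(); ++I)
    for (unsigned D : Seq[I].Defs)
      if (D == Reg)
        return false; // redefinition found: bail out of the transformation
  return true;
}
```

A caller would form the pseudo only when this returns true. An intervening instruction that redefines the register makes it live across the jump, so the sequence has to be left alone.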
      
      llvm-svn: 294949
    • [ARM] Register ConstantIslands with the pass manager · 9b3b8996
      James Molloy authored
      This allows us to use -stop-before/-stop-after/-run-pass - we can now write
      .mir tests.
      
      llvm-svn: 294948
    • [ARM] Use VCMP, not VCMPE, for floating point equality comparisons · d5087896
      James Molloy authored
      When generating a floating point comparison we currently unconditionally
      generate VCMPE. This has the side effect of setting the cumulative Invalid
      bit in FPSCR if any of the operands are QNaN.
      
      It is expected that use of a relational predicate on a QNaN value should
      raise Invalid. Quoting from the C standard:
      
        The relational and equality operators support the usual mathematical
        relationships between numeric values. For any ordered pair of numeric
        values exactly one of the relationships (less, greater, and equal) is true.
        Relational operators may raise the "invalid" floating-point exception when
        argument values are NaNs.
      
      The standard doesn't explicitly state the expectation for equality operators,
      but the implication and obvious expectation is that equality operators
      should not raise Invalid on a QNaN input, as those predicates are wholly
      defined on unordered inputs (to return not equal).
      
      Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if
      QNaN should raise Invalid, and pipe that through to TableGen.
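
The claim that equality predicates are fully defined on unordered inputs can be checked at the source level. The snippet below is a small sketch with hypothetical names (`CompareResults`, `compareAll`). It only shows the comparison results, assuming IEEE 754 semantics without fast-math options. Whether the Invalid flag is raised depends on the target and the instructions the compiler selects, so the sketch does not inspect it.

```cpp
#include <cassert>
#include <limits>

// Results of the equality and relational operators for one pair of values.
struct CompareResults {
  bool Eq, Ne, Lt, Gt;
};

CompareResults compareAll(double A, double B) {
  return {A == B, A != B, A < B, A > B};
}
```

For a quiet NaN operand, `compareAll` reports "not equal" and false for both relational operators. Equality therefore has a well-defined answer on unordered inputs, while the relational operators have none of their usual relationships hold.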
      
      llvm-svn: 294945
    • [X86][SSE] Create matchVectorShuffleWithUNPCK helper function. · 828dee1f
      Simon Pilgrim authored
      Currently only used by target shuffle combining - will use it for lowering as well in a future patch.
      
      llvm-svn: 294943
    • [X86][AVX512] Fix operand classes for some AVX512 instructions to keep consistency between VEX/EVEX versions of the same instruction. · f77219e0
      Ayman Musa authored
      
      Differential Revision: https://reviews.llvm.org/D29873
      
      llvm-svn: 294937
    • [X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors. · 680c73e7
      Craig Topper authored
      
      We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.
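
How an insert of an extracted subvector maps onto a single two-input shuffle can be sketched with a mask builder. This is an illustration under assumptions: `insertOfExtractMask` and `applyShuffle` are hypothetical helpers, indices are counted in elements, and the mask uses the usual convention that values 0..N-1 select from the first input and N..2N-1 from the second. The operand order in the actual patch may differ.

```cpp
#include <cassert>
#include <vector>

// Builds the mask for shuffle(Dst, Src) that is equivalent to
// INSERT_SUBVECTOR(Dst, EXTRACT_SUBVECTOR(Src, ExtIdx), InsIdx).
std::vector<int> insertOfExtractMask(int NumElts, int SubElts, int ExtIdx,
                                     int InsIdx) {
  std::vector<int> Mask(NumElts);
  for (int I = 0; I < NumElts; ++I) {
    const bool InInserted = I >= InsIdx && I < InsIdx + SubElts;
    // Inserted lanes read from Src (second input); all others keep Dst.
    Mask[I] = InInserted ? NumElts + ExtIdx + (I - InsIdx) : I;
  }
  return Mask;
}

// Applies a two-input shuffle mask to A (first input) and B (second input).
std::vector<int> applyShuffle(const std::vector<int> &A,
                              const std::vector<int> &B,
                              const std::vector<int> &Mask) {
  std::vector<int> R;
  const int N = static_cast<int>(A.size());
  for (int M : Mask)
    R.push_back(M < N ? A[M] : B[M - N]);
  return R;
}
```

For eight elements with a four-element subvector extracted at index 4 and inserted at index 4, the mask is `{0, 1, 2, 3, 12, 13, 14, 15}`. Every lane comes from the same position in one of the two inputs, which is the shape of a blend.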
      
      llvm-svn: 294931
    • [X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR. · 53eafa8e
      Craig Topper authored
      Calling getNode results in its internal simplifications running while we're legalizing nodes popped off the worklist during the final DAG combine. This effectively makes a DAG-combine-like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.
      
      llvm-svn: 294929
  4. Feb 12, 2017