  1. Aug 20, 2013
    • Add an option which permits the user to specify, using a bitmask, that various · d8f33625
      Reed Kotler authored
      functions be compiled as mips32, without having to add attributes. This
      is useful in certain situations where you don't want to have to edit the
      function attributes in the source. For now it's only an option used for
      the compiler developers when debugging the mips16 port.
      
      llvm-svn: 188826
    • [mips] Guard micromips instructions with predicate InMicroMips. Also, fix · a43b56d9
      Akira Hatanaka authored
      assembler predicate HasStdEnc so that it is false when the target is micromips.
      
      llvm-svn: 188824
    • ARM: Fix fast-isel copy/paste-o. · 71a78f96
      Jim Grosbach authored
      Update testcase to be more careful about checking register
      values. While regexes are general goodness for these sorts of
      testcases, in this example, the registers are constrained by
      the calling convention, so we can and should check their
      explicit values.
      
      rdar://14779513
      
      llvm-svn: 188819
    • AVX-512: Added more patterns for VMOVSS, VMOVSD, VMOVD, VMOVQ · 540d5825
      Elena Demikhovsky authored
      llvm-svn: 188786
    • [mips][msa] Removed fcge, fcgt, fsge, fsgt · 4260527f
      Daniel Sanders authored
      These instructions were present in a draft spec but were removed before
      publication.
      
      llvm-svn: 188782
    • [SystemZ] Update README · 2bf7b8cc
      Richard Sandiford authored
      We now use MVST, CLST and SRST for the obvious cases.
      
      llvm-svn: 188781
    • [SystemZ] Use SRST to optimize memchr · 6f6d5516
      Richard Sandiford authored
      SystemZTargetLowering::emitStringWrapper() previously loaded the character
      into R0 before the loop and made R0 live on entry.  I'd forgotten that
      allocatable registers weren't allowed to be live across blocks at this stage,
      and it confused LiveVariables enough to cause a miscompilation of f3 in
      memchr-02.ll.
      
      This patch instead loads R0 in the loop and leaves LICM to hoist it
      after RA.  This is actually what I'd tried originally, but I went for
      the manual optimisation after noticing that R0 often wasn't being hoisted.
      This bug forced me to go back and look at why, now fixed as r188774.
      
      We should also try to optimize null checks so that they test the CC result
      of the SRST directly.  The select between null and the SRST GPR result could
      then usually be deleted as dead.
      
      llvm-svn: 188779
    • [mips][msa] Added insve · f2a0f1d1
      Daniel Sanders authored
      llvm-svn: 188777
    • ARM: implement some simple f64 materializations. · f79c3a5a
      Tim Northover authored
      Previously we used a const-pool load for virtually all 64-bit floating values.
      Actually, we can get quite a few common values (including 0.0, 1.0) via "vmov"
      instructions of one stripe or another.
      
      llvm-svn: 188773
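The commit does not say which "vmov" forms it uses. As a rough illustration, assuming one of them is the VFP `vmov.f64` with an 8-bit encoded immediate, the check below sketches which doubles that form can hold. The helper name `encodeVfpImm8` is hypothetical and is not taken from the patch. Under this assumption 1.0 is encodable while 0.0 is not, so 0.0 would have to come from a different vmov variant.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the commit): returns the 8-bit VFP immediate
// that encodes `value`, or -1 when the value does not fit that form.
int encodeVfpImm8(double value) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes IEEE-754 binary64 doubles");
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof bits);

  const std::uint64_t sign = bits >> 63;
  const int exponent = static_cast<int>((bits >> 52) & 0x7ff) - 1023;
  const std::uint64_t fraction = bits & 0xfffffffffffffULL;

  // Only the top four fraction bits may be set; the low 48 must be zero.
  if (fraction & 0xffffffffffffULL) return -1;
  // The unbiased exponent must lie in [-3, 4].
  if (exponent < -3 || exponent > 4) return -1;

  // Three exponent bits, with the top one stored inverted.
  const unsigned expField = (static_cast<unsigned>(exponent + 3) & 0x7u) ^ 0x4u;
  const int encoded =
      static_cast<int>((sign << 7) | (expField << 4) | (fraction >> 48));
  assert(encoded >= 0 && encoded <= 0xff);
  return encoded;
}
```

Values that fail this check, such as 0.1, would still need another strategy, for example the const-pool load the commit message mentions.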
    • [mips][msa] Added and.v, bmnz.v, bmz.v, bsel.v, nor.v, or.v, xor.v · 869bdad9
      Daniel Sanders authored
      llvm-svn: 188767
    • Fix formatting. No functional change. · 7a8cf010
      Craig Topper authored
      llvm-svn: 188746
    • Add AVX-512 and related features to the CPUID detection code. · e13a066c
      Craig Topper authored
      llvm-svn: 188745
    • Move AVX and non-AVX replication inside a couple multiclasses to avoid repeating each instruction for both individually. · fd2b3892
      Craig Topper authored
      
      llvm-svn: 188743
    • [PowerPC] More refactoring prior to real PPC emitPrologue/Epilogue changes. · f381afc9
      Bill Schmidt authored
      (Patch committed on behalf of Mark Minich, whose log entry follows.)
      
      This is a continuation of the refactorings performed in svn rev 188573
      (see that rev's comments for more detail).
      
      This is my stage 2 refactoring: I combined the emitPrologue() &
      emitEpilogue() PPC32 & PPC64 code into a single flow, simplifying a
      lot of the code since in essence the PPC32 & PPC64 code generation
      logic is the same, only the instruction forms are different (in most
      cases). This simplification is necessary because my functional changes
      (yet to come) add significant complexity, and without the
      simplification of my stage 2 refactoring, the overall complexity of
      both emitPrologue() & emitEpilogue() would have become almost
      intractable for most mortal programmers (like me).
      
      This submission was intended to be a pure refactoring (no functional
      changes whatsoever). However, in the process of combining the PPC32 &
      PPC64 flows, I spotted a difference that I believe is a bug (see svn
      rev 186478 line 863, or svn rev 188573 line 888): This line appears to
      be restoring the BP with the original FP content, not the original BP
      content. When I merged the 32-bit and 64-bit code, I used the
      corresponding code from the 64-bit flow, which I believe uses the
      correct offset (BPOffset) for this operation.
      
      llvm-svn: 188741
    • [Sparc] Use HWEncoding instead of unused Num field in Sparc register definitions. Also, correct the definitions of RETL and RET instructions. · f625773b
      Venkatraman Govindaraju authored
      
      llvm-svn: 188738
    • Add a llvm.copysign intrinsic · 0c5c01aa
      Hal Finkel authored
      This adds a llvm.copysign intrinsic. We already have Libfunc recognition for
      copysign (which is turned into the FCOPYSIGN SDAG node). In order to
      autovectorize calls to copysign in the loop vectorizer, we need a corresponding
      intrinsic as well.
      
      In addition to the expected changes to the language reference, the loop
      vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into
      an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a
      few lists in LegalizeVector{Ops,Types} so that vector copysigns can be
      expanded.
      
      In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN
      be Expand for vector types. This seems correct for all in-tree targets, and I
      think is the right thing to do because, previously, there was no way to generate
      vector-valued FCOPYSIGN nodes (and most targets don't specify an action for
      vector-typed FCOPYSIGN).
      
      llvm-svn: 188728
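For readers unfamiliar with the operation, here is a minimal sketch of what a copysign computes at the bit level, followed by the kind of loop the vectorizer would want to handle. It is an illustration only, not the SDAG expansion code; the names `copySignBits` and `copySignArray` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration: take the magnitude bits of one double and the
// sign bit of another.
double copySignBits(double magnitude, double sign) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes IEEE-754 binary64 doubles");
  const std::uint64_t kSignMask = 1ULL << 63;
  std::uint64_t m, s;
  std::memcpy(&m, &magnitude, sizeof m);
  std::memcpy(&s, &sign, sizeof s);

  const std::uint64_t r = (m & ~kSignMask) | (s & kSignMask);
  double result;
  std::memcpy(&result, &r, sizeof result);
  return result;
}

// Element-wise loop of the shape that benefits from a vector copysign.
void copySignArray(const double *magnitude, const double *sign, double *out,
                   std::size_t n) {
  assert(n == 0 || (magnitude && sign && out));
  for (std::size_t i = 0; i < n; ++i)
    out[i] = copySignBits(magnitude[i], sign[i]);
}
```

Each iteration is independent and the per-element work is two masks and an OR, so an element-wise expansion of the vector form is straightforward.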
    • Don't form PPC CTR-based loops around a copysignl call · 1cf48ab8
      Hal Finkel authored
      copysign/copysignf never become function calls (because the SDAG expansion code
      does not lower to the corresponding function call, but rather directly
      implements the associated logic), but copysignl almost always is lowered into a
      call to the requested libm function (and, thus, might clobber CTR).
      
      llvm-svn: 188727
  2. Aug 19, 2013
    • Thumb2 add immediate alias for SP · 4a9df8a7
      Mihai Popa authored
      The Thumb2 add immediate is in fact defined for SP. The manual is misleading because it points to a different section for add immediate with SP; however, the encoding is the same as for add immediate with register, only with the SP operand hard-coded. As such, add immediate with SP and add immediate with register can safely be treated as the same instruction.
      
      All the patch does is adjust a register constraint on an instruction alias.
      
      llvm-svn: 188676
    • AVX-512: added arithmetic and logical operations. · 1490c5eb
      Elena Demikhovsky authored
      ADD, SUB, MUL integer and FP types. OR, AND, XOR.
      Added embedded broadcast form for these instructions.
      
      llvm-svn: 188673
    • [SystemZ] Add negative integer absolute (load negative) · 784a5803
      Richard Sandiford authored
      For now this matches the equivalent of (neg (abs ...)), which did hit a few
      times in projects/test-suite.  We should probably also match cases where
      absolute-like selects are used with reversed arguments.
      
      llvm-svn: 188671
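As a source-level illustration of the two shapes mentioned above, the sketch below shows a negated absolute value and one possible reading of an "absolute-like select with reversed arguments". Both function names are hypothetical, and the second form is my interpretation of the commit text rather than something it spells out.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// The (neg (abs x)) shape: compute |x|, then negate it.
std::int64_t negAbs(std::int64_t x) {
  // Negating the minimum value overflows in C++, so the sketch excludes it.
  assert(x != std::numeric_limits<std::int64_t>::min());
  const std::int64_t absX = x < 0 ? -x : x;
  return -absX;
}

// An abs-style select with its arms swapped: it keeps negative inputs and
// negates the rest, which gives the same result without a separate negate.
std::int64_t negAbsSelect(std::int64_t x) {
  return x < 0 ? x : -x;
}
```

Both functions return the same value for every input that `negAbs` accepts, which is why the second pattern is a natural candidate for the same instruction.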
    • [SystemZ] Add integer absolute (load positive) · 4b897054
      Richard Sandiford authored
      llvm-svn: 188670
    • [SystemZ] Add support for sibling calls · 709bda66
      Richard Sandiford authored
      This first cut is pretty conservative.  The final argument register (R6)
      is call-saved, so we would need to make sure that the R6 argument to a
      sibling call is the same as the R6 argument to the calling function,
      which seems worth keeping as a separate patch.
      
      Saying that integer truncations are free means that we no longer
      use the extending instructions LGF and LLGF for spills in int-conv-09.ll
      and int-conv-10.ll.  Instead we treat the registers as 64 bits wide and
      truncate them to 32 bits where necessary.  I think it's unlikely we'd
      use LGF and LLGF for spills in other situations for the same reason,
      so I'm removing the tests rather than replacing them.  The associated
      code is generic and applies to many more instructions than just
      LGF and LLGF, so there is no corresponding code removal.
      
      llvm-svn: 188669
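To make the R6 restriction concrete, here is a small sketch of call shapes in tail position. It assumes that five integer arguments occupy R2 through R6 in order, which the commit implies but does not state; all function names are hypothetical, and whether a given compiler actually emits a sibling call for any of them depends on the compiler version and optimization settings.

```cpp
#include <cassert>
#include <cstdint>

std::int64_t sumFour(std::int64_t a, std::int64_t b, std::int64_t c,
                     std::int64_t d) {
  return a + b + c + d;
}

std::int64_t sumFive(std::int64_t a, std::int64_t b, std::int64_t c,
                     std::int64_t d, std::int64_t e) {
  return a + b + c + d + e;
}

// Four integer arguments: under the stated assumption the fifth argument
// register is never used, so this is the simple case for a sibling call.
std::int64_t tailFour(std::int64_t a, std::int64_t b) {
  return sumFour(a, b, 1, 2);
}

// Five arguments with the fifth passed through unchanged: the case that
// would need the "same R6 argument" check described above.
std::int64_t forwardFive(std::int64_t a, std::int64_t b, std::int64_t c,
                         std::int64_t d, std::int64_t e) {
  return sumFive(a, b, c, d, e);
}

// Five arguments with a modified fifth value: the call-saved fifth register
// would have to change, so this shape is harder to turn into a sibling call.
std::int64_t changeFifth(std::int64_t a, std::int64_t b, std::int64_t c,
                         std::int64_t d, std::int64_t e) {
  return sumFive(a, b, c, d, e + 1);
}
```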
    • Add the PPC fcpsgn instruction · dbc78e1f
      Hal Finkel authored
      Modern PPC cores support a floating-point copysign instruction, and we can use
      this to lower the FCOPYSIGN node (which is created from calls to the libm
      copysign function). A couple of extra patterns are necessary because the
      operand types of FCOPYSIGN need not agree.
      
      llvm-svn: 188653
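The remark that the operand types need not agree can be pictured with a copysign whose magnitude is a double and whose sign comes from a float. This is a bit-level sketch only, with hypothetical names, and says nothing about how the actual patterns are written.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical illustration: double magnitude, float sign source.
double copySignMixed(double magnitude, float sign) {
  static_assert(sizeof(double) == sizeof(std::uint64_t) &&
                    sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes IEEE-754 binary64 and binary32");
  std::uint64_t m;
  std::uint32_t s;
  std::memcpy(&m, &magnitude, sizeof m);
  std::memcpy(&s, &sign, sizeof s);

  // Move the float's sign (bit 31) into the double's sign position (bit 63).
  const std::uint64_t signBit = static_cast<std::uint64_t>(s >> 31) << 63;
  const std::uint64_t r = (m & ~(1ULL << 63)) | signBit;

  double result;
  std::memcpy(&result, &r, sizeof result);
  return result;
}

// Usage: quick checks of the behaviour described above.
void checkCopySignMixed() {
  assert(copySignMixed(3.0, -1.0f) == -3.0);
  assert(copySignMixed(-3.0, 2.0f) == 3.0);
}
```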
  3. Aug 18, 2013
  4. Aug 17, 2013
  5. Aug 16, 2013
    • [PowerPC] Preparatory refactoring for making prologue and epilogue · 8893a3d1
      Bill Schmidt authored
      safe on PPC32 SVR4 ABI
      
      [Patch and following text by Mark Minich; committing on his behalf.]
      
      There are FIXME's in PowerPC/PPCFrameLowering.cpp, method
      PPCFrameLowering::emitPrologue() related to "negative offsets of R1"
      on PPC32 SVR4. They're true, but the real issue is that on PPC32 SVR4
      (and any ABI without a Red Zone), no spills may be made until after
      the stackframe is claimed, which also includes the LR spill which is
      at a positive offset. The same problem exists in emitEpilogue(),
      though there's no FIXME for it. I intend to fix this issue, making
      LLVM-compiled code finally safe for use on SVR4/EABI/e500 32-bit
      platforms (including in particular, OS-free embedded systems & kernel
      code, where interrupts may share the same stack as user code).
      
      In preparation for making these changes, to make the diffs for the
      functional changes less cluttered, I am providing the non-functional
      refactorings in two stages:
      
      Stage 1 does some minor fluffy refactorings to pull multiple method
      calls up into a single bool, creating named bools for repeated uses of
      obscure logic, moving some code up earlier because either stage 2 or
      my final version will require it earlier, and rewording/adding some
      comments. My stage 1 changes can be characterized as primarily fluffy
      cleanup, the purpose of which may be unclear until the stage 2 or
      final changes are made.
      
      My stage 2 refactorings combine the separate PPC32 & PPC64 logic,
      which is currently performed by largely duplicate code, into a single
      flow, with the differences handled by a group of constants initialized
      early in the methods.
      
      This submission is for my stage 1 changes. There should be no
      functional changes whatsoever; this is a pure refactoring.
      
      llvm-svn: 188573
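The two ideas in this message, a single flow driven by per-target constants and spilling only after the frame is claimed, can be sketched roughly as follows. This is not the PPCFrameLowering code: the struct, the function names, and the LR save offsets (4 for the 32-bit case, 16 for the 64-bit case) are assumptions chosen for illustration, and the output is plain assembly text rather than machine instructions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical bundle of the per-target differences, chosen once up front.
struct FrameForms {
  const char *store;        // store a register to memory
  const char *storeUpdate;  // store and update the stack pointer
  int lrSaveOffset;         // assumed LR save slot in the caller's frame
};

FrameForms selectForms(bool isPPC64) {
  return isPPC64 ? FrameForms{"std", "stdu", 16} : FrameForms{"stw", "stwu", 4};
}

// One flow for both targets. The frame is claimed before LR is spilled, so
// nothing is written below the stack pointer (the no-red-zone case).
std::vector<std::string> emitPrologueSketch(bool isPPC64, int frameSize) {
  assert(frameSize > 0 && frameSize % 16 == 0 && frameSize <= 32000);
  const FrameForms f = selectForms(isPPC64);

  std::vector<std::string> out;
  out.push_back("mflr 0");
  out.push_back(std::string(f.storeUpdate) + " 1, -" +
                std::to_string(frameSize) + "(1)");
  // After the update, the caller's LR slot sits frameSize bytes higher.
  out.push_back(std::string(f.store) + " 0, " +
                std::to_string(frameSize + f.lrSaveOffset) + "(1)");
  return out;
}
```

Only `selectForms` knows about the 32-bit and 64-bit split; the emitting code reads the same either way, which is the simplification the stage 2 refactoring aims for.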
    • R600/SI: Add pattern for xor of i1 · 8522270d
      Michel Danzer authored
      
      
      Fixes two recent piglit regressions with radeonsi.
      
      Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
      llvm-svn: 188559
    • R600/SI: Fix broken encoding of DS_WRITE_B32 · 20680b1c
      Michel Danzer authored
      
      
      The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
      instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
      it to corrupt the encoding of that by clobbering the first operand with
      the second one.
      
      Undo that damage and apply the SMRD logic only to SMRD instructions.
      
      Fixes some derivative-related piglit regressions with radeonsi.
      
      Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
      llvm-svn: 188558