  1. Jan 21, 2014
    • Justin Holewinski's avatar
      7706107e
    • Saleem Abdulrasool's avatar
      tools: support decoding ARM EHABI opcodes in readobj · 9f0a21ef
      Saleem Abdulrasool authored
      Add support to llvm-readobj to decode the actual opcodes.  The ARM EHABI opcodes
      are a variable-length instruction set that describes the operations required for
      properly unwinding stack frames.
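
      As a rough illustration of what decoding such a byte stream involves, the
      following Python sketch handles a small subset of the EHABI unwind opcodes.
      The function name decode_ehabi and the output strings are hypothetical and
      are not llvm-readobj's actual output format; opcodes outside the subset are
      reported as unhandled.

```python
def decode_ehabi(opcodes):
    """Decode a subset of ARM EHABI unwind opcodes into readable strings."""
    out = []
    i = 0
    while i < len(opcodes):
        op = opcodes[i]
        i += 1
        if op & 0xC0 == 0x00:
            # 00xxxxxx: vsp = vsp + (xxxxxx << 2) + 4
            out.append(f"vsp = vsp + {((op & 0x3F) << 2) + 4}")
        elif op & 0xC0 == 0x40:
            # 01xxxxxx: vsp = vsp - (xxxxxx << 2) - 4
            out.append(f"vsp = vsp - {((op & 0x3F) << 2) + 4}")
        elif op & 0xF0 == 0x80:
            # 1000iiii iiiiiiii: two-byte opcode, pop under mask {r15-r12},{r11-r4}
            if i >= len(opcodes):
                raise ValueError("truncated two-byte opcode")
            mask = ((op & 0x0F) << 8) | opcodes[i]
            i += 1
            if mask == 0:
                out.append("refuse to unwind")
            else:
                # Bit 0 of the 12-bit mask is r4, bit 11 is r15.
                regs = [f"r{4 + bit}" for bit in range(12) if mask & (1 << bit)]
                out.append("pop {" + ", ".join(regs) + "}")
        elif op & 0xF8 == 0xA0:
            # 10100nnn: pop r4-r[4+nnn]
            regs = [f"r{r}" for r in range(4, 5 + (op & 0x07))]
            out.append("pop {" + ", ".join(regs) + "}")
        elif op & 0xF8 == 0xA8:
            # 10101nnn: pop r4-r[4+nnn], r14
            regs = [f"r{r}" for r in range(4, 5 + (op & 0x07))] + ["r14"]
            out.append("pop {" + ", ".join(regs) + "}")
        elif op == 0xB0:
            out.append("finish")
        else:
            out.append(f"unhandled opcode 0x{op:02x}")
    return out
```

      For example, decode_ehabi([0x84, 0x80, 0xB0]) walks a two-byte pop opcode
      followed by the one-byte finish opcode, which shows why the decoder has to
      track how many bytes each opcode consumes.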
      
      The primary motivation for this change is to ease the creation of tests for the
      ARM EHABI object emission as well as the unwinding directive handling in the ARM
      IAS.
      
      Thanks to Logan Chien for an extra test case!
      
      llvm-svn: 199708
      9f0a21ef
    • Saleem Abdulrasool's avatar
      ARM IAS: add support for .unwind_raw directive · d9f08603
      Saleem Abdulrasool authored
      This implements the unwind_raw directive for the ARM IAS.  The unwind_raw
      directive takes the form of a stack offset value followed by one or more bytes
      representing the opcodes to be emitted.  The opcode emitted will be interpreted as
      if it were assembled by the opcode assembler via the standard unwinding
      directives.
      
      Thanks to Logan Chien for an extra test!
      
      llvm-svn: 199707
      d9f08603
    • Saleem Abdulrasool's avatar
      ARM IAS: support .personalityindex · 662f5c1a
      Saleem Abdulrasool authored
      The .personalityindex directive is equivalent to the .personality directive with
      the ARM EABI personality with the specific index (0, 1, 2).  Both of these
      directives indicate personality routines, so enhance the personality directive
      handling to take into account personalityindex.
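
      A minimal sketch of the index-to-routine correspondence, written in Python
      for illustration only. The routine names are the standard ARM EHABI
      personality routines; the helper name personality_for_index is hypothetical,
      and the sketch says nothing about how the assembler encodes the result in
      the object file.

```python
# Standard ARM EHABI personality routines, keyed by personality index.
PERSONALITY_ROUTINES = {
    0: "__aeabi_unwind_cpp_pr0",
    1: "__aeabi_unwind_cpp_pr1",
    2: "__aeabi_unwind_cpp_pr2",
}


def personality_for_index(index):
    """Return the personality routine name for a .personalityindex operand."""
    try:
        return PERSONALITY_ROUTINES[index]
    except KeyError:
        raise ValueError(f"personality index must be 0, 1 or 2, got {index}") from None
```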
      
      Bonus fix: flush the UnwindContext at the beginning of a new function.
      
      Thanks to Logan Chien for additional tests!
      
      llvm-svn: 199706
      662f5c1a
    • Kevin Qin's avatar
      [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. · 6d379abd
      Kevin Qin authored
      It was committed as r199628 but reverted in r199631 because it caused
      regression test failures. The cause was that I committed an old version of
      the patch. Sorry for the mistake.
      
      llvm-svn: 199704
      6d379abd
  2. Jan 20, 2014
    • Andrea Di Biagio's avatar
      [X86] Teach how to combine a vselect into a movss/movsd · 450d1661
      Andrea Di Biagio authored
      Add target specific rules for combining vselect dag nodes into movss/movsd
      when possible.
      
      If the vector type of the vselect dag node in input is either MVT::v4i32 or
      MVT::v4f32, then try to fold according to rules:
      
        1) fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B)
        2) fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A)
      
      If the vector type of the vselect dag node in input is either MVT::v2i64 or
      MVT::v2f64 (and we have SSE2), then try to fold according to rules:
      
        3) fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B)
        4) fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A)
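
      The equivalences behind rules 1-4 can be checked with a small lane-level
      model. This is a Python sketch under stated assumptions, not the DAG
      combine itself: vselect picks lane i from A when the mask lane is non-zero
      and from B otherwise, and (movss A, B) / (movsd A, B) are modeled as taking
      lane 0 from B and the remaining lanes from A. The helper names are
      hypothetical.

```python
def vselect(mask, a, b):
    """Lane-wise select: take a[i] where mask[i] is non-zero, else b[i]."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def mov_scalar(a, b):
    """Model of (movss A, B) / (movsd A, B): lane 0 from B, other lanes from A."""
    return [b[0]] + list(a[1:])


def check_folds():
    """Check the four fold rules on sample vectors; return True if all hold."""
    a4, b4 = [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]
    a2, b2 = [1.0, 2.0], [5.0, 6.0]
    return all([
        vselect([0, -1, -1, -1], a4, b4) == mov_scalar(a4, b4),  # rule 1
        vselect([-1, 0, 0, 0], a4, b4) == mov_scalar(b4, a4),    # rule 2
        vselect([0, -1], a2, b2) == mov_scalar(a2, b2),          # rule 3
        vselect([-1, 0], a2, b2) == mov_scalar(b2, a2),          # rule 4
    ])
```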
      
      llvm-svn: 199683
      450d1661
    • Adrian Prantl's avatar
      Debug info: On ARM ensure that all __TEXT sections come before the · 671af5ca
      Adrian Prantl authored
      optional DWARF sections, so compiling with -g does not result in
      different code being generated for PC-relative loads.
      
      This is reapplying a diet r197922 (__TEXT-only).
      
      llvm-svn: 199681
      671af5ca
    • Adrian Prantl's avatar
      Revert "Debug info: On ARM ensure that the data sections come before the" · 1a89924d
      Adrian Prantl authored
      Cut back on the cargo cult. The order of __DATA sections doesn't affect
      generated code.
      
      This reverts commit r197922.
      
      llvm-svn: 199680
      1a89924d
    • James Molloy's avatar
      Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them... · 43ccae1b
      James Molloy authored
      Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them with patterns to match VDUPLN.
      
      llvm-svn: 199675
      43ccae1b
    • Hal Finkel's avatar
      Fix misched-aa-colored.ll to require asserts (trying again) · 9ff54e1f
      Hal Finkel authored
      Perhaps it needs to be in caps.
      
      llvm-svn: 199661
      9ff54e1f
    • Hal Finkel's avatar
      Fix misched-aa-colored.ll to require asserts. · a6bcadeb
      Hal Finkel authored
      -misched=shuffle is only available in builds with assertions enabled
      (!NDEBUG). Maybe we should change that.
      
      llvm-svn: 199659
      a6bcadeb
    • Hal Finkel's avatar
      Update IR when merging slots in stack coloring · cd9569c1
      Hal Finkel authored
      The way that stack coloring updated MMOs when merging stack slots, while
      correct, is suboptimal, and is incompatible with the use of AA during
      instruction scheduling. The solution, which involves the use of const_cast (and
      more importantly, updating the IR from within an MI-level pass), obviously
      requires some explanation:
      
      When the stack coloring pass was originally committed, the code in
      ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using
      GetUnderlyingObject, and all load/store and store/store memory control
      dependencies were added between SUs at the object level (where only one
      object, that returned by GetUnderlyingObject, was used to identify the object
      associated with each MMO). When stack coloring merged stack slots, it would
      replace MMOs derived from the remapped alloca with the alloca with which the
      remapped alloca was being replaced. Because ScheduleDAGInstrs only used single
      objects, and tracked alias sets at the object level, this was a fine solution.
      
      In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use
      GetUnderlyingObjects, and track alias sets using, potentially, multiple
      underlying objects for each MMO. This was done, primarily, to provide the
      ability to look through PHIs, and provide better scheduling for
      induction-variable-dependent loads and stores inside loops. At this point, the
      MMO-updating code in stack coloring became suboptimal, because it would clear
      the MMOs for (i.e. completely pessimize) all instructions for which r169744
      might help in scheduling. Updating the IR directly is the simplest fix for this
      (and the one with, by far, the least compile-time impact), but others are
      possible (we could give each MMO a small vector of potential values, or make
      use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs).
      
      Unfortunately, replacing all MMO values derived from the remapped alloca with
      the base replacement alloca fundamentally breaks our ability to use AA during
      instruction scheduling (which is critical to performance on some targets). The
      reason is that the original MMO might have had an offset (either constant or
      dynamic) from the base remapped alloca, and that offset is not present in the
      updated MMO. One possible way around this would be to use
      GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also
      its offset based on the original offset. Unfortunately, this solution would
      only handle constant offsets, and for safety (because AA is not completely
      restricted to deducing relationships with constant offsets), we would need to
      clear all MMOs without constant offsets over the entire function. This would be
      an even worse pessimization than the current single-object restriction. Any
      other solution would involve passing around a vector of remapped allocas, and
      teaching AA to use it, introducing additional complexity and overhead into AA.
      
      Instead, when remapping an alloca, we replace all IR uses of that alloca as
      well (optionally inserting a bitcast as necessary). This is even more efficient
      than the old MMO-updating code in the stack coloring pass (because it removes
      the need to call GetUnderlyingObject on all MMO values), removes the
      single-object pessimization in the default configuration, and enables the
      correct use of AA during instruction scheduling (all without any additional
      overhead).
      
      LLVM now no longer miscompiles itself on x86_64 when using -enable-misched
      -enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle!
      Fixed PR18497.
      
      Because the alloca replacement is now done at the IR level, unless the MMO
      directly refers to the remapped alloca, the change cannot be seen at the MI
      level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll.
      
      llvm-svn: 199658
      cd9569c1
    • David Woodhouse's avatar
      [x86] Fix disassembly of MOV16ao16 et al. · caaa2850
      David Woodhouse authored
      The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It
      also turns out to have been unnecessary. The disassembler handles the
      AdSize prefix for itself, and doesn't care about the difference between
      (e.g.) MOV8ao8 and MOV8ao8_16 definitions. So just let them coexist and
      don't worry about it.
      
      llvm-svn: 199654
      caaa2850
    • David Woodhouse's avatar
      [x86] Fix 16-bit disassembly of JCXZ/JECXZ · 9c74fdb8
      David Woodhouse authored
      llvm-svn: 199653
      9c74fdb8
    • David Woodhouse's avatar
      [x86] Rename MOVSD/STOSD/LODSD/OUTSD to MOVSL/STOSL/LODSL/OUTSL · 3442f342
      David Woodhouse authored
      The disassembler has a special case for 'L' vs. 'W' in its heuristic for
      checking for 32-bit and 16-bit equivalents. We could expand the heuristic,
      but better just to be consistent in using the 'L' suffix.
      
      llvm-svn: 199652
      3442f342
    • David Woodhouse's avatar
      [x86] Fix disassembly of callw instruction · 70ced3e0
      David Woodhouse authored
      Not quite sure why this was marked isAsmParserOnly, but it means that the
      disassembler can't see it either.
      
      llvm-svn: 199651
      70ced3e0
    • David Woodhouse's avatar
      [x86] Fix 16-bit handling of OpSize bit · 5cf4c675
      David Woodhouse authored
      When disassembling in 16-bit mode the meaning of the OpSize bit is
      inverted. Instructions found in the IC_OPSIZE context will actually
      *not* have the 0x66 prefix, and instructions in the IC context will
      have the 0x66 prefix. Make use of the existing special-case handling
      for the 0x66 prefix being in the wrong place, to cope with this.
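
      The inversion can be summarized with a small model. This Python sketch
      only captures the architectural rule that the 0x66 prefix toggles between
      the default operand size of the current mode and the other size; the
      function names are hypothetical and do not correspond to anything in the
      disassembler.

```python
def operand_size(mode_bits, has_opsize_prefix):
    """Effective operand size (16 or 32) for a 16- or 32-bit code segment."""
    if mode_bits not in (16, 32):
        raise ValueError("mode_bits must be 16 or 32")
    if has_opsize_prefix:
        # 0x66 selects the non-default operand size for the current mode.
        return 32 if mode_bits == 16 else 16
    return mode_bits


def needs_opsize_prefix(mode_bits, operand_bits):
    """True if an instruction with this operand size carries 0x66 in this mode."""
    return operand_size(mode_bits, False) != operand_bits
```

      In this model a 16-bit-operand instruction needs the prefix in 32-bit mode
      but not in 16-bit mode, and a 32-bit-operand instruction is the other way
      around, which is the inversion described above.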
      
      llvm-svn: 199650
      5cf4c675
    • David Woodhouse's avatar
      [x86] Support i386-*-*-code16 triple for emitting 16-bit code · 71d15eda
      David Woodhouse authored
      llvm-svn: 199648
      71d15eda
    • Chandler Carruth's avatar
      [PM] Wire up the Verifier for the new pass manager and connect it to the · 4d35631a
      Chandler Carruth authored
      various opt verifier commandline options.
      
      Mostly mechanical wiring of the verifier to the new pass manager.
      Exercises one of the more unusual aspects of it -- a pass can be either
      a module or function pass interchangeably. If this is ever problematic,
      we can make things more constrained, but for things like the verifier
      where there is an "obvious" applicability at both levels, it seems
      convenient.
      
      This is the next-to-last piece of basic functionality left to make the
      opt commandline driving of the new pass manager minimally functional for
      testing and further development. There is still a lot to be done there
      (notably the factoring into .def files to kill the current boilerplate
      code) but it is relatively uninteresting. The only interesting bit left
      for minimal functionality is supporting the registration of analyses.
      I'm planning on doing that on top of the .def file switch mostly because
      the boilerplate for the analyses would be significantly worse.
      
      llvm-svn: 199646
      4d35631a
    • Kai Nacke's avatar
      ARM: add tlsldo relocation · e51c8138
      Kai Nacke authored
      Add support for the symbol(tlsldo) relocation. This is required in order to 
      solve PR18554.
      
      Reviewed by R. Golin, A. Korobeynikov.
      
      llvm-svn: 199644
      e51c8138
    • Artyom Skrobov's avatar
      [ARM] Do not generate Tag_DIV_use=AllowDIVExt when hardware div is... · 10e76a4e
      Artyom Skrobov authored
      [ARM] Do not generate Tag_DIV_use=AllowDIVExt when hardware div is non-optional: it should have the default value of AllowDIVIfExists
      
      llvm-svn: 199638
      10e76a4e
    • Chandler Carruth's avatar
      Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT." · f835fc6f
      Chandler Carruth authored
      This test fails the newly added regression tests.
      
      llvm-svn: 199631
      f835fc6f
    • Owen Anderson's avatar
      Fix all the remaining lost-fast-math-flags bugs I've been able to find. The... · 1664dc89
      Owen Anderson authored
      Fix all the remaining lost-fast-math-flags bugs I've been able to find.  The most important of these are cases in the generic logic for combining BinaryOperators.
      This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd.
      
      llvm-svn: 199629
      1664dc89
    • Kevin Qin's avatar
      [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. · ff42e06e
      Kevin Qin authored
      llvm-svn: 199628
      ff42e06e
    • Kevin Qin's avatar
      [AArch64 NEON] Accept both #0.0 and #0 for comparing with floating point zero in asm parser. · ef66ff78
      Kevin Qin authored
      For FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT, floating point zero will be
      printed as #0.0 instead of #0. To support legacy code that uses #0,
      let the asm parser accept both #0.0 and #0.
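
      A minimal sketch of the accept-both, print-one behaviour, in Python for
      illustration. The helper names are hypothetical and the real check lives
      in the AArch64 asm parser and instruction printer; only the two spellings
      named above are accepted here.

```python
# Spellings of floating point zero accepted on input (assumption: exactly these two).
ACCEPTED_FP_ZERO = ("#0.0", "#0")


def is_fp_zero_immediate(token):
    """True if the operand token is an accepted spelling of floating point zero."""
    return token.strip() in ACCEPTED_FP_ZERO


def print_fp_zero():
    """Canonical spelling used when printing the operand."""
    return "#0.0"
```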
      
      llvm-svn: 199621
      ef66ff78
  3. Jan 19, 2014
    • Benjamin Kramer's avatar
      InstCombine: Modernize a bunch of cast combines. · b80e1699
      Benjamin Kramer authored
      Also make them vector-aware.
      
      llvm-svn: 199608
      b80e1699
    • Benjamin Kramer's avatar
      InstCombine: Refactor fmul/fdiv combines to handle vectors. · 76b15d04
      Benjamin Kramer authored
      llvm-svn: 199598
      76b15d04
    • Chandler Carruth's avatar
      Fix a really nasty SROA bug with how we handled out-of-bounds memcpy · 1bf38c6a
      Chandler Carruth authored
      intrinsics.
      
      Reported on the list by Evan with a couple of attempts to fix, but it
      took a while to dig down to the root cause. There are two overlapping
      bugs here, both centering around the circumstance of discovering
      a memcpy operand which is known to be completely outside the bounds of
      the alloca.
      
      First, we need to kill the *other* side of the memcpy if it was added to
      this alloca. Otherwise we'll factor it into our slicing and try to
      rewrite it even though we know for a fact that it is dead. This is made
      more tricky because we can visit the sides in either order. So we have
      to both kill the other side and skip instructions marked as dead. The
      latter really should be goodness in every case, but here is a matter of
      correctness.
      
      Second, we need to actually remove the *uses* of the alloca by the
      memcpy when queuing it for later deletion. Otherwise it may still be
      using the alloca when we go to promote it (if the rewrite re-uses the
      existing alloca instruction). Do this by factoring out the
      use-clobbering used when nixing a Phi argument and re-using it
      across the operands of a to-be-deleted instruction.
      
      llvm-svn: 199590
      1bf38c6a
    • Saleem Abdulrasool's avatar
      ARM ELF: ensure that the tag types are corrected · 93900055
      Saleem Abdulrasool authored
      Ensure that the tag types are reflected on a replacement.  This is particularly
      important for the compatibility tag which has multiple representations where the
      last definition wins.
      
      llvm-svn: 199577
      93900055
    • Saleem Abdulrasool's avatar
      ARM: update build attributes for ABI r2.09 · 196c3212
      Saleem Abdulrasool authored
      Update the tag names as per the current ABI errata.  Mark deprecated tags
      as such.
      
      llvm-svn: 199576
      196c3212
    • Arnold Schwaighofer's avatar
      LoopVectorizer: A reduction that has multiple uses of the reduction value is not · cc742dd9
      Arnold Schwaighofer authored
      a reduction.
      
      Really. Under certain circumstances (the use list of an instruction has to be
      set up right - hence the extra pass in the test case) we would not recognize
      when a value in a potential reduction cycle was used multiple times by the
      reduction cycle.
      
      Fixes PR18526.
      radar://15851149
      
      llvm-svn: 199570
      cc742dd9
    • Chandler Carruth's avatar
      [PM] Make the verifier work independently of any pass manager. · 043949d4
      Chandler Carruth authored
      This makes the 'verifyFunction' and 'verifyModule' functions totally
      independent operations on the LLVM IR. It also cleans up their API a bit
      by lifting the abort behavior into their clients and just using an
      optional raw_ostream parameter to control printing.
      
      The implementation of the verifier is now just an InstVisitor with no
      multiple inheritance. It also is significantly more const-correct, and
      hides the const violations internally. The two layers that force us to
      break const correctness are building a DomTree and dispatching through
      the InstVisitor.
      
      A new VerifierPass is used to implement the legacy pass manager
      interface in terms of the other pieces.
      
      The error messages produced may be slightly different now, and we may
      have slightly different short circuiting behavior with different usage
      models of the verifier, but generally everything works equivalently and
      this unblocks wiring the verifier up to the new pass manager.
      
      llvm-svn: 199569
      043949d4
  4. Jan 18, 2014