  1. Aug 05, 2015
  2. Aug 04, 2015
  3. Aug 03, 2015
    • ARM: prefer allocating VFP regs at stride 4 on Darwin. · 910dde7a
      Tim Northover authored
      This is necessary for WatchOS support, where the compact unwind format assumes
      this kind of layout. For now we only want this on Swift-like CPUs though, where
      it's been the Xcode behaviour for ages. Also, since it can expand the prologue
      we don't want it at -Oz.
      
      llvm-svn: 243884
      910dde7a
    • [ARM] Make GlobalMerge merge extern globals by default · f3324cf1
      John Brawn authored
      Enabling merging of extern globals appears to be generally either beneficial or
      harmless. On some benchmark suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
      it gives improvements in the 1-5% range, but in the rest the overall effect is
      zero.
      
      Differential Revision: http://reviews.llvm.org/D10966
      
      llvm-svn: 243874
      f3324cf1
    • Be less conservative about forming IT blocks. · 6967e5e4
      James Molloy authored
      In http://reviews.llvm.org/rL215382, IT forming was made more conservative under
      the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M.
      
      But actually, ARMv6M doesn't even support IT blocks, so that's impossible. The ARMARM for
      v7M, v7AR and v8AR states that the semantics of such an instruction change inside an
      IT block: it doesn't set the flags. So it is fine to use one inside an IT block
      as long as the flags register is dead afterwards.
      
      This gives significant performance improvements in a variety of MPEG based workloads.
      
      Differential revision: http://reviews.llvm.org/D11680
      
      llvm-svn: 243869
      6967e5e4
  4. Aug 02, 2015
  5. Aug 01, 2015
  6. Jul 31, 2015
    • [ARM] Lower modulo operation to generate __aeabi_idivmod on Android · 532a1369
      Sumanth Gundapaneni authored
          
      For a modulo (remainder) operation:
      clang -target armv7-none-linux-gnueabi generates "__modsi3"
      clang -target armv7-none-eabi generates "__aeabi_idivmod"
      clang -target armv7-linux-androideabi generates "__modsi3"
      Android's bionic libc doesn't provide "__modsi3"; it provides
      "__aeabi_idivmod" instead. This patch fixes the LLVM ARMISelLowering to
      generate the correct call whenever there is a modulo operation.
      
      Differential Revision: http://reviews.llvm.org/D11661
      
      llvm-svn: 243717
      532a1369
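As an illustration of the commit above: a combined divide/modulo helper hands back both the quotient and the remainder, so a lowered `a % b` only has to keep the remainder half. The sketch below models that contract in portable C++. `DivModResult`, `model_idivmod` and `lowered_srem` are hypothetical names for this example, not the runtime routine itself or LLVM code.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the {quotient, remainder} pair that a
// combined divide/modulo helper returns.
struct DivModResult {
    int quot;
    int rem;
};

// Models the contract of the helper: one call, both results.
// Division by zero is deliberately not modelled in this sketch.
DivModResult model_idivmod(int numerator, int denominator) {
    std::div_t r = std::div(numerator, denominator);
    return {r.quot, r.rem};
}

// What `a % b` conceptually becomes after lowering: call the helper
// and keep only the remainder.
int lowered_srem(int a, int b) {
    return model_idivmod(a, b).rem;
}
```

The remainder follows C++ truncating division, so it takes the sign of the numerator.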
  7. Jul 30, 2015
    • fix memcpy/memset/memmove lowering when optimizing for size · 1166f2ff
      Sanjay Patel authored
      Fixing MinSize attribute handling was discussed in D11363. 
      This is a prerequisite patch to doing that.
      
      The handling of OptSize when lowering mem* functions was broken
      on Darwin because it wants to ignore -Os for these cases, but the
      existing logic also made it ignore -Oz (MinSize).
      
      The Linux change demonstrates a widespread problem. The backend
      doesn't usually recognize the MinSize attribute by itself; it
      assumes that if the MinSize attribute exists, then the OptSize 
      attribute must also exist. 
      
      Fixing this more generally will be a follow-on patch or two.
      
      Differential Revision: http://reviews.llvm.org/D11568
      
      llvm-svn: 243693
      1166f2ff
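The size decision described in the commit above can be sketched as a small predicate. This is a simplified model under stated assumptions: `FnAttrs`, `TargetOS` and `lowerMemFuncForSize` are hypothetical names, and the real backend reads these facts from the function attributes and the target triple rather than from a struct.

```cpp
// Hypothetical, simplified stand-ins for the function attributes and
// the target OS.
struct FnAttrs {
    bool optSize;  // corresponds to -Os
    bool minSize;  // corresponds to -Oz
};

enum class TargetOS { Darwin, Linux };

// Decide whether mem* lowering should favour size over speed.
bool lowerMemFuncForSize(const FnAttrs &attrs, TargetOS os) {
    // Darwin wants to ignore -Os for these cases, so only -Oz counts.
    if (os == TargetOS::Darwin)
        return attrs.minSize;
    // Elsewhere either attribute is enough. Checking minSize explicitly
    // avoids assuming that minSize always comes with optSize.
    return attrs.optSize || attrs.minSize;
}
```

Testing `minSize` on its own is the point of the sketch: the predicate stays correct even when a function carries MinSize without OptSize.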
    • Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the... · c3890d29
      Nick Lewycky authored
      Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.)
      
      Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h.
      
      llvm-svn: 243585
      c3890d29
  8. Jul 29, 2015
  9. Jul 28, 2015
    • Implement target independent TLS compatible with glibc's emutls.c. · 1e859582
      Chih-Hung Hsieh authored
      The 'common' section TLS is not implemented.
      Current C/C++ TLS variables are not placed in common section.
      DWARF debug info to get the address of TLS variables is not generated yet.
      
      clang and driver changes in http://reviews.llvm.org/D10524
      
        Added -femulated-tls flag to select the emulated TLS model,
        which will be used for old targets like Android that do not
        support ELF TLS models.
      
      Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
      function to convert a SDNode of TLS variable address to a function call
      to __emutls_get_address.
      
      Added code to lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
      for TLSModel::Emulated. Although all targets supporting ELF TLS models are
      enhanced, the emulated TLS model has been tested only for Android ELF targets.
      Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
      emulated TLS variables.
      Modified DwarfCompileUnit.cpp to skip some DIEs for emulated TLS variables.
      
      TODO: Add proper DIE for emulated TLS variables.
            Added new unit tests with emulated TLS.
      
      Differential Revision: http://reviews.llvm.org/D10522
      
      llvm-svn: 243438
      1e859582
    • - Added support for parsing HWDiv features using Target Parser. · 4ea70755
      Alexandros Lamprineas authored
      - Architecture extensions are represented as a bitmap.
      
      Phabricator: http://reviews.llvm.org/D11457
      llvm-svn: 243335
      4ea70755
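A minimal sketch of what "extensions as a bitmap" can look like, assuming each extension owns one bit and a parser maps a feature string to a combination of bits. The enumerator names, bit values, accepted spellings and the function `parseHWDivSketch` are all assumptions made for this illustration; they are not taken from the Target Parser source.

```cpp
#include <cstdint>
#include <string>

// Hypothetical extension bits; real names and values may differ.
enum ArchExt : std::uint32_t {
    EXT_INVALID     = 0,
    EXT_NONE        = 1u << 0,
    EXT_HWDIV_THUMB = 1u << 1,
    EXT_HWDIV_ARM   = 1u << 2,
};

// Map a hardware-divide feature string to a set of extension bits.
// The accepted spellings here are assumptions for the example.
std::uint32_t parseHWDivSketch(const std::string &s) {
    if (s == "none")  return EXT_NONE;
    if (s == "thumb") return EXT_HWDIV_THUMB;
    if (s == "arm")   return EXT_HWDIV_ARM;
    if (s == "arm,thumb" || s == "thumb,arm")
        return EXT_HWDIV_ARM | EXT_HWDIV_THUMB;
    return EXT_INVALID;
}

// Test whether a given extension bit is present in a set.
bool hasExt(std::uint32_t set, ArchExt ext) {
    return (set & ext) != 0;
}
```

With a bitmap, combining features is a bitwise OR and querying one is a bitwise AND, which is why the combined string maps to two bits at once.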
  10. Jul 27, 2015
  11. Jul 24, 2015
    • [ARM] - Fix lowering of shufflevectors in AArch32 · 4d45ff2b
      Luke Cheeseman authored
      Some shufflevectors are currently being incorrectly lowered in the AArch32
      backend, because the existing checks for detecting the NEON operations from the
      shufflevector instruction expect the shuffle mask and the vector operands to be
      of the same length.
      
      This is not always the case, as the mask may be twice as long as the operand.
      In that situation only the lower half of the shufflemask gets checked, so provided
      the lower half looks like a vector transpose (or even is just all -1 for undef),
      the shufflevector may get incorrectly lowered into a vector transpose (VTRN)
      instruction.
      
      This patch fixes this by accommodating both cases and adds regression tests.
      
      Differential Revision: http://reviews.llvm.org/D11407
      
      llvm-svn: 243103
      4d45ff2b
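The mask check described in the commit above can be sketched as follows. This is a simplified, self-contained model, not the backend's code: `halfMatches` and `isTransposeMaskSketch` are hypothetical names, masks are plain `std::vector<int>`, and -1 stands for an undef lane. For two 4-element inputs, a transpose produces lanes {0, 4, 2, 6} (first result) and {1, 5, 3, 7} (second result).

```cpp
#include <cstddef>
#include <vector>

// Check one half of the mask, starting at `base`, against transpose
// result `which` (0 or 1). Undef lanes (-1) match anything.
static bool halfMatches(const std::vector<int> &mask, std::size_t base,
                        std::size_t numElts, std::size_t which) {
    for (std::size_t j = 0; j < numElts; j += 2) {
        int even = mask[base + j];
        int odd = mask[base + j + 1];
        if (even >= 0 && static_cast<std::size_t>(even) != j + which)
            return false;
        if (odd >= 0 && static_cast<std::size_t>(odd) != j + numElts + which)
            return false;
    }
    return true;
}

// True if `mask` describes a transpose of two numElts-wide vectors.
// The mask may be numElts long (one result) or 2 * numElts long (both
// results back to back); in the latter case BOTH halves are checked.
bool isTransposeMaskSketch(const std::vector<int> &mask, std::size_t numElts) {
    if (numElts == 0 || numElts % 2 != 0)
        return false;
    if (mask.size() == numElts)
        return halfMatches(mask, 0, numElts, 0) ||
               halfMatches(mask, 0, numElts, 1);
    if (mask.size() == 2 * numElts)
        return halfMatches(mask, 0, numElts, 0) &&
               halfMatches(mask, numElts, numElts, 1);
    return false;
}
```

A double-length mask whose lower half is all undef but whose upper half is not a transpose is rejected here, which is the case the commit message singles out.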
    • When lowering vector shifts a check is performed to see if the value to shift by · b5c627ab
      Luke Cheeseman authored
      is an immediate. In this check the value is negated and stored in an int64_t.
      The value can be -2^63, whose negation cannot be represented in an int64_t, so
      negating it is undefined behaviour and causes failures. The negation is only
      necessary when the value is within a certain range, so -2^63 never needs to be
      negated. This patch adds that range check and also a regression test.
      
      Differential Revision: http://reviews.llvm.org/D11408
      
      llvm-svn: 243100
      b5c627ab
    • [ARM] Register (existing) ARMLoadStoreOpt pass with LLVM pass manager. · d9c1bc99
      David Gross authored
      Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass.
      
      Subscribers: aemerson, llvm-commits, rengolin
      
      Differential Revision: http://reviews.llvm.org/D11373
      
      llvm-svn: 243052
      d9c1bc99
  12. Jul 23, 2015
  13. Jul 22, 2015
  14. Jul 21, 2015
  15. Jul 20, 2015
    • [ARM] Refactor the prologue/epilogue emission to be more robust. · 71a71485
      Quentin Colombet authored
      This is the first step toward supporting shrink-wrapping for this target.
      
      The changes could be summarized by these items:
      - Expand the tail-call return as part of the expand pseudo pass.
      - Get rid of the assumptions that the epilogue is the exit block:
        * Do not assume which registers are free in the epilogue. (This indirectly
          improves the lowering of the code for the segmented stacks, see the test
          cases.)
        * Take into account that the basic block can be empty.
      
      Related to <rdar://problem/20821730>
      
      llvm-svn: 242714
      71a71485
  16. Jul 18, 2015
    • ARM: Enable MachineScheduler and disable PostRAScheduler for swift. · 9e859806
      Matthias Braun authored
      Reapply r242500 now that the swift schedmodel includes LDRLIT.
      
      This is mostly done to disable the PostRAScheduler, which optimizes for
      instruction latencies and so isn't a good fit for out-of-order
      architectures. This also allows us to leave out the itinerary table in
      swift in favor of the SchedModel ones.
      
      This change leads to performance improvements/regressions by as much as
      10% in some benchmarks; in fact we lose 0.4% performance over the
      llvm-testsuite for reasons that appear to be unknown or out of the
      compiler's control. rdar://20803802 documents the investigation of
      these effects.
      
      While it is probably a good idea to perform the same switch for the
      other ARM out-of-order CPUs, I limited this change to swift as I cannot
      perform the benchmark verification on the other CPUs.
      
      Differential Revision: http://reviews.llvm.org/D10513
      
      llvm-svn: 242588
      9e859806
    • ARM: Add scheduling information for LDRLIT instructions to swift scheduling model · 141d1c9d
      Matthias Braun authored
      These pseudo instructions are only lowered after register allocation and
      are therefore still present when the machine scheduler runs.
      Add a run: line to a testcase that uses the uncommon flags necessary to
      actually produce a LDRLIT instruction on swift.
      
      llvm-svn: 242587
      141d1c9d
  17. Jul 17, 2015
    • Revert "ARM: Enable MachineScheduler and disable PostRAScheduler for swift." · 5a6d5bc1
      Adam Nemet authored
      This reverts commit r242500.
      
      It broke some internal tests and Matthias asked me to revert it while he
      is investigating.
      
      llvm-svn: 242553
      5a6d5bc1
    • [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA · a6702e2f
      James Molloy authored
      No functional change, but it preps codegen for the future when SABSDIFF
      will start getting generated in anger.
      
      llvm-svn: 242546
      a6702e2f
    • ARM: Enable MachineScheduler and disable PostRAScheduler for swift. · 2d8315f8
      Matthias Braun authored
      This is mostly done to disable the PostRAScheduler, which optimizes for
      instruction latencies and so isn't a good fit for out-of-order
      architectures. This also allows us to leave out the itinerary table in
      swift in favor of the SchedModel ones.
      
      This change leads to performance improvements/regressions by as much as
      10% in some benchmarks; in fact we lose 0.4% performance over the
      llvm-testsuite for reasons that appear to be unknown or out of the
      compiler's control. rdar://20803802 documents the investigation of
      these effects.
      
      While it is probably a good idea to perform the same switch for the
      other ARM out-of-order CPUs, I limited this change to swift as I cannot
      perform the benchmark verification on the other CPUs.
      
      Differential Revision: http://reviews.llvm.org/D10513
      
      llvm-svn: 242500
      2d8315f8
    • Arm: Don't define a label twice with two setjmps in a function. · da3d0d73
      Matthias Braun authored
      Constructing a name based on the function name didn't give us a unique
      symbol if we had more than one setjmp in a function. Using
      MCContext::createTempSymbol() always gives us a unique name.
      
      Differential Revision: http://reviews.llvm.org/D9314
      
      llvm-svn: 242482
      da3d0d73
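The naming problem from the commit above can be shown with a toy model. `SymbolContext` below is a hypothetical miniature for this example, not LLVM's MCContext, and the label prefixes are made up; it only contrasts a name derived from the function name with a name drawn from a counter.

```cpp
#include <string>

// The problematic scheme: the label depends only on the function name,
// so two setjmps in the same function get the same label.
std::string labelFromFunctionName(const std::string &functionName) {
    return "SJLJ_" + functionName;  // hypothetical prefix
}

// Hypothetical miniature of a symbol allocator.
class SymbolContext {
    unsigned nextId = 0;

public:
    // Every call yields a fresh name, regardless of the enclosing function.
    std::string createTempSymbol() {
        return ".Ltmp" + std::to_string(nextId++);
    }
};
```

Two calls to `labelFromFunctionName("f")` collide, while two calls to `createTempSymbol()` on the same context never do.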
    • Fix __builtin_setjmp in combination with sjlj exception handling. · 3cd00c17
      Matthias Braun authored
      llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling
      style but is also used in clang to implement __builtin_setjmp.  The ARM
      backend needs to output additional dispatch tables for the SjLj
      exception handling style. These tables, however, can't be emitted if
      llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual
      landing pad blocks exist.
      
      To solve this issue, a new llvm.eh.sjlj.setup_dispatch intrinsic is
      introduced, which is used instead of llvm.eh.sjlj.setjmp in the SjLj
      exception handling lowering, so we can differentiate between the case
      where we actually need to set up a dispatch table and the case where we
      just need the __builtin_setjmp semantics.
      
      Differential Revision: http://reviews.llvm.org/D9313
      
      llvm-svn: 242481
      3cd00c17