  1. Oct 28, 2015
  2. Oct 27, 2015
    • [SimplifyCFG] Don't DCE catchret because the successor is unreachable · 49293709
      David Majnemer authored
      CatchReturnInst has side-effects: it runs a destructor.  This destructor
      could conceivably run forever/call exit/etc. and should not be removed.
      
      llvm-svn: 251461
    • [Bitcode] Fix accidental syntax errors in compatibility tests · 9fde8d60
      Vedant Kumar authored
      We used automated tools to update our IR to its current syntax in commit
      21f77df7 (r247378). While it correctly updated the CHECK lines in our
      compatibility tests, the IR should have remained untouched.  This commit
      fixes the syntax errors.
      
      llvm-svn: 251458
    • [X86][AVX512] Test UNPCK with non-sequential scalars · 94c49435
      Simon Pilgrim authored
      Missing tests for r251297
      
      llvm-svn: 251453
    • [IR] Limit bits used for CallingConv::ID, update tests · ad6d6e74
      Vedant Kumar authored
      Use 10 bits to represent calling convention IDs instead of 13, and
      update the bitcode compatibility tests accordingly. We now error out in
      the bitcode reader when we see bad calling conv IDs.
      
      Thanks to rnk and dexonsmith for feedback!
      
      Differential Revision: http://reviews.llvm.org/D13826
      
      llvm-svn: 251452
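
A minimal sketch of the range check this implies, assuming a "bad" ID is simply one that does not fit in 10 bits. The names `CallingConvBits`, `MaxCallingConvID`, and `isValidCallingConvID` are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical constants for illustration: a 10-bit field can hold the
// values 0..1023.
constexpr unsigned CallingConvBits = 10;
constexpr std::uint64_t MaxCallingConvID =
    (std::uint64_t{1} << CallingConvBits) - 1;

// Returns true if a raw ID read from a bitcode record fits in the field.
// A reader would report an error instead of truncating when this is false.
inline bool isValidCallingConvID(std::uint64_t Raw) {
  return Raw <= MaxCallingConvID;
}
```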
    • [AliasSetTracker] Use mod/ref information for UnknownInstr · b1bb7391
      Hal Finkel authored
      AliasSetTracker does not need to convert the access mode to ModRefAccess if the
      new visited UnknownInst has only 'REF' modrefinfo to existing pointers in the
      sets.
      
      Patch by Andrew Zhogin!
      
      llvm-svn: 251451
    • Use the 'arcp' fast-math-flag when combining repeated FP divisors · bbd4c79c
      Sanjay Patel authored
      This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. 
      This was originally part of D8900.
      
      Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and 
      possibly other changes.
      
      Differential Revision: http://reviews.llvm.org/D9708
      
      llvm-svn: 251450
    • [ScalarEvolutionExpander] PHI on a catchpad can be used on both edges · 235acde9
      David Majnemer authored
      A PHI on a catchpad might be used by both edges out of the catchpad,
      feeding back into a loop.  In this case, just use the insertion point.
      Anything more clever would require new basic blocks or PHI placement.
      
      llvm-svn: 251442
    • [AArch64]Merge halfword loads into a 32-bit load · c9879ecf
      Jun Bum Lim authored
      This recommits r250719, which caused a failure in SPEC2000.gcc
      because of the incorrect insert point for the new wider load.
      
      Convert two halfword loads into a single 32-bit word load with bitfield extract
      instructions. For example:
        ldrh w0, [x2]
        ldrh w1, [x2, #2]
      becomes
        ldr w0, [x2]
        ubfx w1, w0, #16, #16
        and  w0, w0, #0xffff
      
      llvm-svn: 251438
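
The equivalence behind this rewrite can be checked with a small model. This is only a sketch: it assumes little-endian byte order, and the helpers `loadWordLE`, `loadHalfLE`, and `splitMergedLoad` are hypothetical stand-ins for the instructions, not code from the patch.

```cpp
#include <cstdint>

// Stand-in for `ldr w0, [x2]`: assemble a little-endian 32-bit word.
inline std::uint32_t loadWordLE(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Stand-in for `ldrh`: a little-endian 16-bit load, zero-extended.
inline std::uint32_t loadHalfLE(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8);
}

struct HalfPair {
  std::uint32_t lo; // value that ends up in w0
  std::uint32_t hi; // value that ends up in w1
};

// Merged form: one word load, then the equivalents of
// `ubfx w1, w0, #16, #16` and `and w0, w0, #0xffff`.
inline HalfPair splitMergedLoad(const std::uint8_t *p) {
  const std::uint32_t w = loadWordLE(p);
  return HalfPair{w & 0xffffu, (w >> 16) & 0xffffu};
}
```

For any four bytes, `splitMergedLoad(p)` should yield the same two values as `loadHalfLE(p)` and `loadHalfLE(p + 2)`.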
    • Whitespace. · 6f49ecc3
      NAKAMURA Takumi authored
      llvm-svn: 251437
    • Revert r251291, "Loop Vectorizer - skipping "bitcast" before GEP" · 7ef7293b
      NAKAMURA Takumi authored
      It causes miscompilation of llvm/lib/ExecutionEngine/Interpreter/Execution.cpp.
      See also PR25324.
      
      llvm-svn: 251436
    • Tidy a comment. NFC. · aa55507f
      Diego Novillo authored
      llvm-svn: 251434
    • Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add... · 07eeb800
      Cong Hou authored
      Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add successors when optimization is disabled.
      
      When optimization is disabled, the edge weights stored in MBB are not used, so we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't necessarily mean that optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean that all successors just have default weights.
      
      We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is when the MBB has only one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled.
      
      In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this, because many uses of addSuccessor() do not check whether optimization is disabled. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list.
      
      Differential revision: http://reviews.llvm.org/D13963
      
      llvm-svn: 251429
    • [SLP] Be more aggressive about reduction width selection. · ab3215fa
      Charlie Turner authored
      Summary:
      This change could be way off-piste; I'm looking for any feedback on whether it's an acceptable approach.
      
      It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling, which SLP was not vectorizing (on either ARM or X86) before this patch, but does vectorize with it.
      
      I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.
      
      The more principled approach I thought of was to generate several candidate trees and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!
      
      Reviewers: nadav, jmolloy
      
      Subscribers: mssimpso, llvm-commits, aemerson
      
      Differential Revision: http://reviews.llvm.org/D14116
      
      llvm-svn: 251428
    • [SLP] Try a bit harder to find reduction PHIs · cd6e8cf8
      Charlie Turner authored
      Summary:
      Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phis whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads.
      
      There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT.
      
      Reviewers: jmolloy, mcrosier, nadav
      
      Subscribers: mssimpso, nadav, aemerson, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D14063
      
      llvm-svn: 251425
    • [SLP] Treat SelectInsts as reduction values. · 74c387fe
      Charlie Turner authored
      Summary:
      Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values.
      
      The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here.
      
      Reviewers: jmolloy, mzolotukhin, spatel, nadav
      
      Subscribers: nadav, mssimpso, aemerson, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D13949
      
      llvm-svn: 251424
    • [Orc] Fix indentation. · 4a51e5dd
      Lang Hames authored
      llvm-svn: 251423
    • Fix SamplePGO segfault when debug info is missing. · c04270d2
      Diego Novillo authored
      When emitting a remark for a conditional branch annotation, the remark
      uses the line location information of the conditional branch in the
      message.  In some cases, that information is unavailable and the
      optimization would segfault. I'm still not sure whether this is a bug or
      WAI, but the optimizer should not die because of this.
      
      llvm-svn: 251420
    • [ms-inline-asm] Leave alignment in bytes if the native assembler uses bytes · fb1c1c7e
      Reid Kleckner authored
      The existing behavior was correct on Darwin, which is probably the
      platform it was written for.
      
      Before this change, we would rewrite "align 8" to ".align 3" and then
      fail to make it through the integrated assembler because 3 is not a
      power of 2.
      
      Differential Revision: http://reviews.llvm.org/D14120
      
      llvm-svn: 251418
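
A minimal sketch of the distinction, assuming the only input is whether the target assembler's `.align` operand is a byte count or a power-of-two exponent. The helper names and the `alignmentIsInBytes` parameter are hypothetical and not taken from the patch.

```cpp
#include <cassert>

// True if v is a nonzero power of two.
inline bool isPowerOfTwo(unsigned v) { return v != 0 && (v & (v - 1)) == 0; }

// Exact base-2 logarithm of a power of two.
inline unsigned log2Exact(unsigned v) {
  assert(isPowerOfTwo(v) && "alignment must be a power of two");
  unsigned n = 0;
  while (v > 1) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Operand to emit after `.align` for an MS-style `align <bytes>`.
// If the assembler counts in bytes, pass the value through unchanged;
// otherwise emit the exponent (e.g. 8 bytes -> 3).
inline unsigned alignDirectiveOperand(unsigned bytes, bool alignmentIsInBytes) {
  return alignmentIsInBytes ? bytes : log2Exact(bytes);
}
```

With `alignmentIsInBytes` set, "align 8" stays `.align 8`; without it, the operand becomes 3, which is the rewrite that only suits assemblers expecting an exponent.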
    • Rename qsort -> multikey_qsort. NFC. · 5579e0b8
      Rui Ueyama authored
      `qsort` as a file-scope local function name was confusing.
      
      llvm-svn: 251414
    • Prefer ranlib mode over ar mode. · baff6b46
      Ed Schouten authored
      For CloudABI's toolchain I have a symlink that goes from <target>-ar and
      <target>-ranlib to LLVM's ar binary, to mimic GNU Binutils' naming
      scheme. The problem is that if we're targeting ARM64, the name of the
      ranlib executable is aarch64-unknown-cloudabi-ranlib. This already
      contains the string "ar".
      
      Let's move the "ranlib" test above the "ar" test. It's not that likely
      that we're going to see operating systems or hardware architectures that
      are called "ranlib".
      
      Reviewed by:	rafael
      Differential Revision:	http://reviews.llvm.org/D14123
      
      llvm-svn: 251413
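
The ordering issue can be illustrated with a substring-based dispatch. This is a sketch only: `ToolMode` and `classifyByName` are hypothetical names, and it assumes the tool picks its mode by searching the executable name for "ranlib" and "ar".

```cpp
#include <string>

enum class ToolMode { Ranlib, Ar, Unknown };

// Pick a mode from the executable name. "ranlib" is tested first because
// target prefixes such as "aarch64" already contain the substring "ar".
inline ToolMode classifyByName(const std::string &Stem) {
  if (Stem.find("ranlib") != std::string::npos)
    return ToolMode::Ranlib;
  if (Stem.find("ar") != std::string::npos)
    return ToolMode::Ar;
  return ToolMode::Unknown;
}
```

If the two tests were swapped, `aarch64-unknown-cloudabi-ranlib` would match "ar" first and run in ar mode.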
    • [CMake] Get rid of LLVM_DYLIB_EXPORT_ALL, and make it the default, add... · 9c5e41f3
      Chris Bieneman authored
      [CMake] Get rid of LLVM_DYLIB_EXPORT_ALL, and make it the default, add libLLVM-C on darwin to cover the C API needs.
      
      Summary:
      We've had a lot of discussion in the past about the meaningful and useful default behaviors for the llvm-shlib tool. The original implementation was heavily geared toward Apple's use, and I think that was wrong. This patch seeks to correct that.
      
      I've removed the LLVM_DYLIB_EXPORT_ALL variable and made libLLVM export everything by default.
      
      I've also added a new target that is only built on Darwin for libLLVM-C as a library that re-exports the LLVM-C API. This library is not built on Linux because ELF doesn't support re-export libraries in the same way MachO does.
      
      Reviewers: chapuni, resistor, bogner, axw
      
      Subscribers: llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D13842
      
      llvm-svn: 251411
    • [X86][AVX512] add convert float to half · c7cb8806
      Asaf Badouh authored
      Add float-to-half conversion with mask/maskz for the reg-to-reg version and mask for the reg-to-mem version (there is no maskz version for reg-to-mem).
      
      Differential Revision: http://reviews.llvm.org/D14113
      
      llvm-svn: 251409
    • [ARM] Expand ROTL and ROTR of vector value types · 458e79b8
      Charlie Turner authored
      Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection.
      
      Reviewers: rengolin, t.p.northover
      
      Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin
      
      Differential Revision: http://reviews.llvm.org/D14082
      
      llvm-svn: 251401
    • Do not use "else" when both branches return (NFC) · 891c0973
      Mehdi Amini authored
      From: Mehdi Amini <mehdi.amini@apple.com>
      llvm-svn: 251398
    • [ScalarEvolutionExpander] Properly insert no-op casts + EH Pads · dd9a8157
      David Majnemer authored
      We want to insert no-op casts as close as possible to the def.  This is
      tricky when the cast is of a PHI node and the BasicBlocks between the
      def and the use cannot hold any instructions.  Iteratively walk EH pads
      until we hit a non-EH pad.
      
      This fixes PR25326.
      
      llvm-svn: 251393
    • [X86] Make elfiamcu an OS, not an environment. · e1194bdb
      Michael Kuperstein authored
      GNU tools require elfiamcu to take up the entire OS field, so, e.g.
      i?86-*-linux-elfiamcu is not considered a legal triple.
      Make us compatible.
      
      Differential Revision: http://reviews.llvm.org/D14081
      
      llvm-svn: 251390
    • [SimplifyLibCalls] Use range-based loop. No functional change. · c692688c
      Davide Italiano authored
      llvm-svn: 251383
    • Convert cost table lookup functions to return a pointer to the entry or... · ee0c8597
      Craig Topper authored
      Convert cost table lookup functions to return a pointer to the entry or nullptr instead of the index.
      
      This avoids mentioning the table name an extra time and allows the lookup to be done directly in the ifs by relying on the bool conversion of the pointer.
      
      While there, make use of ArrayRef and std::find_if.
      
      llvm-svn: 251382
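
A minimal sketch of the pointer-returning lookup pattern, using plain pointers in place of ArrayRef so it stands alone. `CostEntry` and `lookupCost` are hypothetical names for illustration and are not the LLVM declarations.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical table entry keyed by an opcode and a type identifier.
struct CostEntry {
  int Opcode;
  int TypeId;
  unsigned Cost;
};

// Returns a pointer to the matching entry, or nullptr if there is none,
// instead of an index that the caller would have to apply to the table.
inline const CostEntry *lookupCost(const CostEntry *Begin, std::size_t Count,
                                   int Opcode, int TypeId) {
  const CostEntry *End = Begin + Count;
  const CostEntry *It = std::find_if(Begin, End, [=](const CostEntry &E) {
    return E.Opcode == Opcode && E.TypeId == TypeId;
  });
  return It != End ? It : nullptr;
}
```

A caller can then write `if (const CostEntry *E = lookupCost(Table, N, Op, Ty)) return E->Cost;` without naming the table a second time.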
    • [function-attrs] Refactor code to handle shorter code with early exits. · 69798fb5
      Chandler Carruth authored
      No functionality changed here, but the indentation is substantially
      reduced and IMO the code is much easier to read. I've also added some
      helpful comments.
      
      This is just a clean-up I wrote while studying the code, and that has
      been in my backlog for a while.
      
      llvm-svn: 251381
    • [ValueTracking] Don't special case wrapped ConstantRanges; NFCI · 63d2b779
      Sanjoy Das authored
      Use `getUnsignedMax` directly instead of special casing a wrapped
      ConstantRange.
      
      The previous code would have been "buggy" (and this would have been a
      semantic change) if LLVM allowed !range metadata to denote full
      ranges. E.g. in
      
        %val = load i1, i1* %ptr, !range !{i1 1, i1 1} ;; == full set
      
      ValueTracking would conclude that the high bit (IOW the only bit) in
      %val was zero.
      
      Since !range metadata does not allow empty or full ranges, this change
      is just a minor stylistic improvement.
      
      llvm-svn: 251380
    • [x86] replace integer logic ops with packed SSE FP logic ops · 309c4f93
      Sanjay Patel authored
      If we have an operand to a bitwise logic op that's already in
      an XMM register and the result is going to be sent to an XMM
      register, then use an SSE logic op to avoid moves between the
      integer and vector register files.
      
      Related commits:
      http://reviews.llvm.org/rL248395
      http://reviews.llvm.org/rL248399
      http://reviews.llvm.org/rL248404
      http://reviews.llvm.org/rL248409
      http://reviews.llvm.org/rL248415
      
      This should solve PR22428:
      https://llvm.org/bugs/show_bug.cgi?id=22428
      
      llvm-svn: 251378
    • [SCEV] Refactor out ScalarEvolution::getDataLayout; NFC · 49edd3b3
      Sanjoy Das authored
      llvm-svn: 251375
    • Fix llc crash processing S/UREM for -Oz builds caused by rL250825. · fee370be
      Steve King authored
      When taking the remainder of a value divided by a constant, visitREM()
      attempts to convert the REM to a longer but faster sequence of instructions.
      This conversion calls combine() on a speculative DIV instruction. Commit
      rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes.
      Flow eventually hits unreachable().
      
      This patch adds a test case and a check to prevent visitREM() from trying
      to convert the REM instruction in cases where a DIVREM is possible.
      See http://reviews.llvm.org/D14035
      
      llvm-svn: 251373
    • add FP logic test cases to show current codegen (PR22428) · 28d1598e
      Sanjay Patel authored
      llvm-svn: 251370
    • [mips][ias] Fold needsExpansion() and expandInstruction() together. NFC. · 5bf6eab6
      Daniel Sanders authored
      Summary:
      Previously we maintained two separate switch statements that had to be kept in
      sync. This patch merges them into a single switch.
      
      Reviewers: vkalintiris
      
      Subscribers: llvm-commits, dsanders
      
      Differential Revision: http://reviews.llvm.org/D14012
      
      llvm-svn: 251369
    • Switch ownership of miscellaneous ARM target to myself. · 45a9b282
      Tim Northover authored
      llvm-svn: 251367
  3. Oct 26, 2015