  1. Jan 27, 2017
    • Yichao Yu's avatar
      CMake is funky on detecting Intel 17 as GCC compatible. · e1864d06
      Yichao Yu authored
      Summary: This adds a fallback in case the Intel compiler is not detected correctly.
      
      Reviewers: chapuni
      
      Reviewed By: chapuni
      
      Subscribers: llvm-commits, mgorny
      
      Differential Revision: https://reviews.llvm.org/D27610
      
      llvm-svn: 293230
      e1864d06
    • Tim Northover's avatar
      GlobalISel: support debug intrinsics. · 09aac4ad
      Tim Northover authored
      The translation scheme is mostly cribbed from FastISel, and it's not entirely
      convincing semantically. But it does seem to work in the common cases and allow
      variables to be printed so it can't be all wrong.
      
      llvm-svn: 293228
      09aac4ad
    • Sanjoy Das's avatar
      Revert a couple of InstCombine/Guard checkins · 7516192a
      Sanjoy Das authored
      This change reverts:
      
      r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
      r293058: "[InstCombine] Canonicalize guards for AND condition"
      
      They miscompile cases like:
      
      ```
      declare void @llvm.experimental.guard(i1, ...)
      
      define void @test_guard_not_or(i1 %A, i1 %B) {
        %C = or i1 %A, %B
        %D = xor i1 %C, true
        call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
        ret void
      }
      ```
      
      because they do not transfer the `i32 20, i32 30` parameters to newly
      created guard instructions.
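      
      A minimal sketch of why those operands matter, assuming the reverted
      canonicalization splits one guard on `!(A | B)` into separate guards on
      `!A` and `!B`. `Guard`, `splitNotOr` and `checkSplitKeepsArgs` are
      hypothetical stand-ins, not LLVM APIs; the only point is that each new
      guard has to carry the operands of the guard it replaces:
      
      ```cpp
      #include <cassert>
      #include <vector>
      
      // Hypothetical model of a guard call: a condition plus trailing operands.
      struct Guard {
        bool Cond;
        std::vector<int> Args;
      };
      
      // Splits guard(!(A | B), Args...) into guard(!A, Args...) and
      // guard(!B, Args...). Each new guard gets its own copy of Args.
      inline std::vector<Guard> splitNotOr(bool A, bool B,
                                           const std::vector<int> &Args) {
        std::vector<Guard> Result;
        Result.push_back(Guard{!A, Args});
        Result.push_back(Guard{!B, Args});
        return Result;
      }
      
      // Usage: both guards keep the operands of the original guard.
      inline void checkSplitKeepsArgs() {
        const std::vector<int> Args = {20, 30};
        const std::vector<Guard> Guards = splitNotOr(false, false, Args);
        assert(Guards[0].Args == Args && Guards[1].Args == Args);
      }
      ```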
      
      llvm-svn: 293227
      7516192a
    • Andrew Kaylor's avatar
      Add intrinsics for constrained floating point operations · a0a1164c
      Andrew Kaylor authored
      This commit introduces a set of experimental intrinsics intended to prevent
      optimizations that make assumptions about the rounding mode and floating point
      exception behavior.  These intrinsics will later be extended to specify
      flush-to-zero behavior.  More work is also required to model instruction
      dependencies in machine code and to generate these instructions from clang
      (when required by pragmas and/or command line options that are not currently
      supported).
      
      Differential Revision: https://reviews.llvm.org/D27028
      
      llvm-svn: 293226
      a0a1164c
    • Chandler Carruth's avatar
      [PM] Enable the main loop pass pipelines with everything but · 79b733bc
      Chandler Carruth authored
      loop-unswitch in the main pipelines for the new PM.
      
      All of these now work, and Clang built using this pipeline can build the
      test suite and SPEC without hitting any asserts or ASan failures.
      
      There are still some bugs hiding though -- 7 tests regress with the new
      PM. I'm going to be investigating these, but it seems worthwhile to at
      least get the pipelines in place so that others can play with them, and
      they aren't completely broken.
      
      Differential Revision: https://reviews.llvm.org/D29113
      
      llvm-svn: 293225
      79b733bc
    • Davide Italiano's avatar
      [obj2yaml] Produce correct output for invalid relocations. · 44f1281f
      Davide Italiano authored
      R_X86_64_NONE can be emitted without a symbol associated (well,
      in theory it should never be emitted in an ABI-compliant relocatable
      object). So, if there's no symbol associated to a reloc, emit one
      with an empty name, instead of crashing.
      
      Ack'ed by Michael Spencer offline.
      
      PR: 31768
      llvm-svn: 293224
      44f1281f
    • Krzysztof Parzyszek's avatar
      [Hexagon] Require IPO library in Hexagon build · d6c8e3c9
      Krzysztof Parzyszek authored
      This should unbreak the Hexagon build bots.
      
      llvm-svn: 293221
      d6c8e3c9
  2. Jan 26, 2017
    • Daniel Berlin's avatar
      NewGVN: Fix bug exposed by PR31761 · 1ea5f324
      Daniel Berlin authored
      Summary:
      This does not actually fix the testcase in PR31761 (discussion is
      ongoing on the testcase), but does fix a bug it exposes, where stores
      were not properly clobbering loads.
      
      We accomplish this by unifying the memory equivalence infrastructure
      back into the normal congruence infrastructure, and then properly
      destroying congruence classes when memory state leaders disappear.
      
      Reviewers: davide
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29195
      
      llvm-svn: 293216
      1ea5f324
    • Sanjay Patel's avatar
      [InstCombine] fold (X >>u C) << C --> X & (-1 << C) · 50753f02
      Sanjay Patel authored
      We already have this fold when the lshr has one use, but it doesn't need that
      restriction. We may be able to remove some code from foldShiftedShift().
      
      Also, move the similar:
      (X << C) >>u C --> X & (-1 >>u C)
      ...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst().
      
      That whole function seems questionable since it is called by commonShiftTransforms(),
      but there's really not much in common if we're checking the shift opcodes for every
      fold.
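      
      The identity behind the fold can be checked on plain 32-bit integers. This
      is a sketch, not InstCombine code: `shiftPair`, `maskLowBits` and
      `foldHolds` are illustrative names, and `UINT32_MAX` stands in for the
      all-ones value written as -1 above.
      
      ```cpp
      #include <cassert>
      #include <cstdint>
      
      // Left-hand side: logical shift right, then left, by the same amount.
      inline uint32_t shiftPair(uint32_t X, unsigned C) {
        assert(C < 32 && "shift amount must be smaller than the bit width");
        return (X >> C) << C;
      }
      
      // Right-hand side: clear the low C bits with a mask.
      inline uint32_t maskLowBits(uint32_t X, unsigned C) {
        assert(C < 32 && "shift amount must be smaller than the bit width");
        return X & (UINT32_MAX << C);
      }
      
      // Returns true when both forms agree on X for every in-range shift amount.
      inline bool foldHolds(uint32_t X) {
        for (unsigned C = 0; C < 32; ++C)
          if (shiftPair(X, C) != maskLowBits(X, C))
            return false;
        return true;
      }
      ```
      
      Neither side depends on how many users the shift-right has, which is
      consistent with dropping the one-use restriction.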
      
      llvm-svn: 293215
      50753f02
    • Ahmed Bougacha's avatar
      [GlobalISel] Remove duplicate function using variadic templates. NFC. · b67a3cef
      Ahmed Bougacha authored
      I think the initial version of r293172 was trying:
        std::forward<Args...>(args)...
      which doesn't compile.  This seems like the correct way:
        std::forward<Args>(args)...
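      
      A small standalone sketch of the second form, independent of the
      GlobalISel code. `Tracker`, `make` and `checkForwarding` are hypothetical
      names; the pack `Args` and the pack `args` are expanded in lockstep, one
      `std::forward<T>(t)` per element:
      
      ```cpp
      #include <cassert>
      #include <string>
      #include <utility>
      
      // Records whether it was constructed from an lvalue or an rvalue.
      struct Tracker {
        std::string Value;
        bool Moved;
        explicit Tracker(const std::string &S) : Value(S), Moved(false) {}
        explicit Tracker(std::string &&S) : Value(std::move(S)), Moved(true) {}
      };
      
      // Forwards every argument with its original value category.
      template <typename T, typename... Args>
      T make(Args &&...args) {
        return T(std::forward<Args>(args)...);
      }
      
      // Usage: the value category of each argument is preserved.
      inline void checkForwarding() {
        std::string Name = "isel";
        assert(!make<Tracker>(Name).Moved);               // lvalue stays an lvalue
        assert(make<Tracker>(std::string("isel")).Moved); // rvalue stays an rvalue
      }
      ```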
      
      llvm-svn: 293214
      b67a3cef
    • Krzysztof Parzyszek's avatar
      [Hexagon] Add Hexagon-specific loop idiom recognition pass · c8b94386
      Krzysztof Parzyszek authored
      llvm-svn: 293213
      c8b94386
    • Daniel Berlin's avatar
      NewGVN: Add algorithm overview · db3c7be0
      Daniel Berlin authored
      llvm-svn: 293212
      db3c7be0
    • Zvi Rackover's avatar
      [Doc][LangRef] Fix typo-ish error in description of Masked Gather · b26530cd
      Zvi Rackover authored
      Summary: Fix the example of equivalent expansion for when mask is all ones.
      
      Reviewers: delena
      
      Reviewed By: delena
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29179
      
      llvm-svn: 293206
      b26530cd
    • Sanjay Patel's avatar
      [InstCombine] add tests for shift-shift folds; NFC · 0ca3f64c
      Sanjay Patel authored
      llvm-svn: 293205
      0ca3f64c
    • Balaram Makam's avatar
      [AArch64] Refine Kryo Machine Model · b73d2962
      Balaram Makam authored
      Summary: Refine floating point SQRT and DIV with accurate latency information.
      
      Reviewers: mcrosier
      
      Subscribers: aemerson, rengolin, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29191
      
      llvm-svn: 293204
      b73d2962
    • Kyle Butt's avatar
      [IfConversion] Use reverse_iterator to simplify. NFC · c4614b3e
      Kyle Butt authored
      This simplifies skipping debug instructions and shrinking ranges.
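      
      As a rough illustration of the idea (not the IfConversion code itself),
      walking a block backwards with reverse iterators makes it easy to skip
      trailing debug instructions and to shrink the range. `Inst`,
      `lastRealInst` and `trimmedSize` are hypothetical names over a plain
      `std::vector`:
      
      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstddef>
      #include <vector>
      
      // Hypothetical stand-in for a machine instruction.
      struct Inst {
        int Id;
        bool IsDebug;
      };
      
      // Id of the last non-debug instruction, or -1 if there is none.
      inline int lastRealInst(const std::vector<Inst> &Block) {
        auto It = std::find_if(Block.rbegin(), Block.rend(),
                               [](const Inst &I) { return !I.IsDebug; });
        return It == Block.rend() ? -1 : It->Id;
      }
      
      // Size of the range once trailing debug instructions are dropped.
      // base() converts the reverse iterator into the matching end iterator.
      inline std::size_t trimmedSize(const std::vector<Inst> &Block) {
        auto It = std::find_if(Block.rbegin(), Block.rend(),
                               [](const Inst &I) { return !I.IsDebug; });
        return static_cast<std::size_t>(It.base() - Block.begin());
      }
      
      // Usage: the trailing debug instruction is skipped.
      inline void checkLastRealInst() {
        const std::vector<Inst> Block = {{1, false}, {2, false}, {3, true}};
        assert(lastRealInst(Block) == 2);
        assert(trimmedSize(Block) == 2);
      }
      ```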
      
      llvm-svn: 293202
      c4614b3e
    • Sean Fertile's avatar
      [PPC] cleanup of mayLoad/mayStore flags and memory operands. · 3c8c385a
      Sean Fertile authored
      1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store
         instructions.
      2) Updated the flags on a number of intrinsics indicating that they write
          memory.
      3) Added SDNPMemOperand flags for some target dependent SDNodes so that they
         propagate their memory operand
      
      Review: https://reviews.llvm.org/D28818
      llvm-svn: 293200
      3c8c385a
    • Daniel Berlin's avatar
      NewGVN: Make unreachable blocks be marked with unreachable · 2b83492e
      Daniel Berlin authored
      llvm-svn: 293196
      2b83492e
    • Stanislav Mekhanoshin's avatar
      Replace addEarlyAsPossiblePasses callback with adjustPassManager · 81598117
      Stanislav Mekhanoshin authored
      This change introduces adjustPassManager target callback giving a
      target an opportunity to tweak PassManagerBuilder before pass
      managers are populated.
      
      This generalizes and replaces addEarlyAsPossiblePasses target
      callback. In particular that can be used to add custom passes to
      extension points other than EP_EarlyAsPossible.
      
      Differential Revision: https://reviews.llvm.org/D28336
      
      llvm-svn: 293189
      81598117
    • Nirav Dave's avatar
      Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." · d32a421f
      Nirav Dave authored
      This reverts commit r293184 which is failing in LTO builds
      
      llvm-svn: 293188
      d32a421f
    • Serge Rogatch's avatar
      [XRay][Arm32] Reduce the portion of the stub and implement more staging for tail calls - in LLVM · e09ba748
      Serge Rogatch authored
      Summary:
      This patch provides more staging for tail calls in XRay Arm32. When the logging part of XRay is ready for tail calls, supporting them in the core part of XRay Arm32 may be as easy as changing the number passed to the handler from 1 to 2.
      Coupled patch:
      - https://reviews.llvm.org/D28674
      
      Reviewers: dberris, rengolin
      
      Reviewed By: dberris
      
      Subscribers: llvm-commits, iid_iunknown, aemerson, rengolin, dberris
      
      Differential Revision: https://reviews.llvm.org/D28673
      
      llvm-svn: 293185
      e09ba748
    • Nirav Dave's avatar
      In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. · de6516c4
      Nirav Dave authored
          * Simplify Consecutive Merge Store Candidate Search
      
          Now that address aliasing is much less conservative, push through
          simplified store merging search and chain alias analysis which only
          checks for parallel stores through the chain subgraph. This is cleaner
          as it separates non-interfering loads/stores from the
          store-merging logic.
      
          When merging stores, search up the chain through a single load, and
          find all possible stores by looking down through a load and a
          TokenFactor to all stores visited.
      
          This improves the quality of the output SelectionDAG and the output
          Codegen (save perhaps for some ARM cases where we correctly construct
          wider loads, but then promote them to float operations which appear
          to require more expensive constant generation).
      
          Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
      
          Additional Minor Changes:
      
            1. Finishes removing unused AliasLoad code
      
            2. Unifies the chain aggregation in the merged stores across code
               paths
      
            3. Re-add the Store node to the worklist after calling
               SimplifyDemandedBits.
      
            4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
               arbitrary, but seems sufficient to not cause regressions in
               tests.
      
            5. Remove Chain dependencies of Memory operations on CopyfromReg
               nodes as these are captured by data dependence
      
            6. Forward loads-store values through tokenfactors containing
                {CopyToReg,CopyFromReg} Values.
      
            7. Peephole to convert buildvector of extract_vector_elt to
               extract_subvector if possible (see
               CodeGen/AArch64/store-merge.ll)
      
            8. Store merging for the ARM target is restricted to 32-bit as
               in some contexts invalid 64-bit operations are being
               generated. This can be removed once appropriate checks are
               added.
      
          This finishes the change Matt Arsenault started in r246307 and
          jyknight's original patch.
      
          Many tests required some changes as memory operations are now
          reorderable, improving load-store forwarding. One test in
          particular is worth noting:
      
            CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
            forwarding converts a load-store pair into a parallel store and
            a memory-realized bitcast of the same value. However, because we
            lose the sharing of the explicit and implicit store values we
            must create another local store. A similar transformation
            happens before SelectionDAG as well.
      
          Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
      
      llvm-svn: 293184
      de6516c4
    • Rafael Espindola's avatar
      Use shouldAssumeDSOLocal in classifyGlobalReference. · 82149a1a
      Rafael Espindola authored
      And teach shouldAssumeDSOLocal that ppc has no copy relocations.
      
      The resulting code handles a few more cases than before. For example, it
      knows that a weak symbol can be resolved to another .o file, but it
      will still be in the main executable.
      
      llvm-svn: 293180
      82149a1a
    • Simon Pilgrim's avatar
      027bb453
    • Daniil Fukalov's avatar
      [SCEV] Introduce add operation inlining limit · b09dac59
      Daniil Fukalov authored
      Inlining in getAddExpr() can cause abnormal computational time in some cases.
      New parameter -scev-addops-inline-threshold is introduced with default value 500.
      
      Reviewers: sanjoy
      
      Subscribers: mzolotukhin, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28812
      
      llvm-svn: 293176
      b09dac59
    • Simon Pilgrim's avatar
      [X86][SSE] Pull out target shuffle resolve code into helper. NFCI. · 3057fd53
      Simon Pilgrim authored
      Pulled out code that removed unused inputs from a target shuffle mask into a helper function to allow it to be reused in a future commit.
      
      llvm-svn: 293175
      3057fd53
    • Daniel Sanders's avatar
      Remove a '#if 0' that wasn't intended for commit in r293173. · f69fe686
      Daniel Sanders authored
      The '#if 0' contained the code I had intended to use but clang
      rejects it (possibly incorrectly).
      
      llvm-svn: 293174
      f69fe686
    • Daniel Sanders's avatar
      Attempt to fix windows buildbots after r293172. · b2224311
      Daniel Sanders authored
      llvm-svn: 293173
      b2224311
    • Daniel Sanders's avatar
      [globalisel] Re-factor ISel matchers into a hierarchy. NFC · dc662ff0
      Daniel Sanders authored
      Summary:
      This should make it possible to easily add everything needed to import all
      the existing SelectionDAG rules. It should also serve the likely
      kinds of GlobalISel rules (some of which are not currently representable
      in SelectionDAG) once we've nailed down the tablegen definition for that.
      
      The hierarchy is as follows:
        MatcherRule - A matching rule. Currently used to emit C++ ISel code but will
        |             also be used to emit test cases and tablegen definitions in the
        |             near future.
        |- Instruction(s) - Represents the instruction to be matched.
           |- Instruction Predicate(s) - Test the opcode, arithmetic flags, etc. of an
           |                             instruction.
           \- Operand(s) - Represents a particular operand of the instruction. In the
              |            future, there may be subclasses to test the same predicates
              |            on multiple operands (including for variadic instructions).
              \ Operand Predicate(s) - Test the type, register bank, etc. of an operand.
                                       This is where the ComplexPattern equivalent
                                       will be represented. It's also where
                                       nested-instruction matching will live as a
                                       predicate that follows the DefUse chain to the
                                       Def and tests a MatcherRule from that position.
      
      Support for multiple instruction matchers in a rule has been retained from
      the existing code but has been adjusted to assert when it is used.
      Previously it would silently drop all but the first instruction matcher.
      
      The tablegen-erated file is not functionally changed but has more
      parentheses and no longer attempts to format the if-statements since
      keeping track of the indentation is tricky in the presence of the matcher
      hierarchy. It would be nice to have CMake's tablegen() run the output
      through clang-format (when available) so we don't have to complicate
      TableGen with pretty-printing.
      
      It's also worth mentioning that this hierarchy will also be able to emit
      TableGen definitions and test cases in the near future. This is the reason
      for favouring explicit emit*() calls rather than the << operator.
      
      Reviewers: aditya_nandakumar, rovka, t.p.northover, qcolombet, ab
      
      Reviewed By: ab
      
      Subscribers: igorb, dberris, kristof.beyls, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28942
      
      llvm-svn: 293172
      dc662ff0
    • Valery Pykhtin's avatar
      [AMDGPU] Fix typo in GCNSchedStrategy · 75d1de90
      Valery Pykhtin authored
      Differential revision: https://reviews.llvm.org/D28980
      
      llvm-svn: 293171
      75d1de90
    • Simon Dardis's avatar
      Revert "[mips] N64 static relocation model support" · 5b67a4f7
      Simon Dardis authored
      This reverts commit r293164. There are multiple tests failing.
      
      llvm-svn: 293170
      5b67a4f7
    • Chandler Carruth's avatar
      [LV] Fix an issue where forming LCSSA in the place that we did would · 6f4ed077
      Chandler Carruth authored
      change the set of uniform instructions in the loop causing an assert
      failure.
      
      The problem is that the legalization checking also builds data
      structures mapping various facts about the loop body. The immediate
      cause was the set of uniform instructions. If these then change when
      LCSSA is formed, the data structures would already have been built and
      become stale. The included test case triggered an assert in loop
      vectorize that was reduced out of the new PM's pipeline.
      
      The solution is to form LCSSA early enough that no information is cached
      across the changes made. The only really obvious position is outside of
      the main logic to vectorize the loop. This also has the advantage of
      removing one case where forming LCSSA could mutate the loop but we
      wouldn't track that as a "Changed" state.
      
      If it is significantly advantageous to do some legalization checking
      prior to this, we can do a more careful positioning but it seemed best
      to just back off to a safe position first.
      
      llvm-svn: 293168
      6f4ed077
    • Simon Dardis's avatar
      [mips] N64 static relocation model support · 09e65efd
      Simon Dardis authored
      This patch makes one change to GOT handling and two changes to N64's
      relocation model handling. Furthermore, the jumptable encodings have
      been corrected for static N64.
      
      Big GOT handling is now done via a new SDNode MipsGotHi - this node is
      unconditionally lowered to an lui instruction.
      
      The first change to N64's relocation handling is the lifting of the
      restriction that N64 always uses PIC. Now it is possible to target static
      environments.
      
      The second change adds support for 64 bit symbols and enables them by
      default. Previously N64 had patterns for sym32 mode only. In this mode all
      symbols are assumed to have 32 bit addresses. sym32 mode support
      is selectable with attribute 'sym32'. A follow on patch for clang will
      add the necessary frontend parameter.
      
      This partially resolves PR/23485.
      
      Thanks to Brooks Davis for reporting the issue!
      
      Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris
      
      Differential Revision: https://reviews.llvm.org/D23652
      
      llvm-svn: 293164
      09e65efd
    • Diana Picus's avatar
      [ARM] GlobalISel: Load i1, i8 and i16 args from stack · 278c722e
      Diana Picus authored
      Add support for loading i1, i8 and i16 arguments from the stack, with or without
      the ABI extension flags.
      
      When the ABI extension flags are present, we load a 4-byte value, otherwise we
      preserve the size of the load and let the instruction selector replace it with a
      LDRB/LDRH. This generates the same thing as DAGISel.
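      
      The size rule described above can be sketched as a tiny helper. This is
      an assumption-laden illustration, not the ARM call lowering code:
      `stackLoadBytes` is a hypothetical name, and treating i1 as a one-byte
      load is an assumption of the sketch.
      
      ```cpp
      #include <cassert>
      
      // Number of bytes to load for a small integer argument on the stack.
      inline unsigned stackLoadBytes(unsigned ValueBits, bool HasExtensionFlag) {
        assert((ValueBits == 1 || ValueBits == 8 || ValueBits == 16) &&
               "sketch only covers i1, i8 and i16");
        if (HasExtensionFlag)
          return 4; // with an ABI extension flag, load the full 4-byte value
        // Otherwise keep the narrow size (assumed 1 byte for i1/i8, 2 for i16)
        // so that a later stage can pick a byte or halfword load.
        return (ValueBits + 7) / 8;
      }
      ```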
      
      Differential Revision: https://reviews.llvm.org/D27803
      
      llvm-svn: 293163
      278c722e
    • Alexey Bataev's avatar
      [SLP] Add one more reduction operation for extra argument test to make · 7a7510ea
      Alexey Bataev authored
      it vectorizable.
      
      llvm-svn: 293162
      7a7510ea
    • Chandler Carruth's avatar
      [PM] Use PoisoningVH correctly when merely deleting entries in a map · 41421df0
      Chandler Carruth authored
      with it.
      
      This code was dereferencing the PoisoningVH which isn't allowed once it
      is poisoned. But the code itself really doesn't need to access the
      pointer, it is just doing the safe stuff of clearing out data structures
      keyed on the pointer value.
      
      Change the code to use iterators to erase directly from a DenseMap. This
      is also substantially more efficient as it avoids lots of hashing and
      lookups to do the erasure. DenseMap supports iterating behind the
      iteration which is fairly easy to implement.
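      
      The erase-by-iterator pattern looks roughly like the sketch below. It
      uses `std::unordered_map` as a stand-in for DenseMap, whose interface
      differs, and `eraseDead` and `checkEraseDead` are hypothetical names. The
      keys are only compared as pointer values and never dereferenced, and no
      entry is hashed or looked up a second time:
      
      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <unordered_map>
      
      // Erases every entry whose mapped value equals Dead and returns the
      // number of entries removed.
      inline std::size_t eraseDead(std::unordered_map<const void *, int> &Map,
                                   int Dead) {
        std::size_t Removed = 0;
        for (auto It = Map.begin(); It != Map.end();) {
          if (It->second == Dead) {
            It = Map.erase(It); // erase returns the iterator after the entry
            ++Removed;
          } else {
            ++It;
          }
        }
        return Removed;
      }
      
      // Usage: two of three entries are marked dead and get erased.
      inline void checkEraseDead() {
        int A = 0, B = 0, C = 0;
        std::unordered_map<const void *, int> Map;
        Map[&A] = 1;
        Map[&B] = 2;
        Map[&C] = 1;
        assert(eraseDead(Map, 1) == 2);
        assert(Map.size() == 1 && Map.count(&B) == 1);
      }
      ```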
      
      Sadly, I don't have a test case here. I'm not even close and I don't
      know that I ever will be. The issue is that several of the tricky
      aspects of fixing this only show up when you cause the stack's
      SmallVector to be in *EXACTLY* the right location. I only ever got
      a reproduction for those with Clang, and only with *exactly* the right
      command line flags. Any adjustment, even to seemingly unrelated flags,
      would make partial and half-way solutions magically start to "work". In
      good news, all of this was caught with the LLVM test suite. Also, there
      is no *specific* code here that is untested, just that the old pattern
      of code won't immediately fail on any test case I've managed to
      contrive.
      
      llvm-svn: 293160
      41421df0
    • NAKAMURA Takumi's avatar
      Chapter3/KaleidoscopeJIT.h: Fix a warning. [-Wunused-lambda-capture] · 949d54eb
      NAKAMURA Takumi authored
      "this", aka class members, is not referred to in the body.
      
      llvm-svn: 293159
      949d54eb
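      
      For the -Wunused-lambda-capture fix above, a minimal sketch of the
      pattern, unrelated to the actual KaleidoscopeJIT.h code. `Scaler` and
      `checkScaler` are hypothetical: a lambda that never touches a member
      should not capture `this`, while one that reads a member must.
      
      ```cpp
      #include <cassert>
      
      struct Scaler {
        int Base;
      
        // The body only uses its parameter, so the capture list stays empty.
        // Writing [this] here is the kind of capture the warning reports.
        int twice(int X) const {
          auto Double = [](int V) { return 2 * V; };
          return Double(X);
        }
      
        // Base is a member, so this lambda does need to capture this.
        int offset(int X) const {
          auto AddBase = [this](int V) { return Base + V; };
          return AddBase(X);
        }
      };
      
      // Usage: both lambdas behave as expected.
      inline void checkScaler() {
        const Scaler S = {10};
        assert(S.twice(4) == 8);
        assert(S.offset(4) == 14);
      }
      ```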