  1. May 09, 2017
    • Reid Kleckner's avatar
      Use the frame index side table for byval and inalloca arguments · 45efcf0c
      Reid Kleckner authored
      Summary:
      For inalloca functions, this is a very common code pattern:
      
        %argpack = type <{ i32, i32, i32 }>
        define void @f(%argpack* inalloca %args) {
        entry:
          %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
          %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
          %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
          tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
          tail call void @llvm.dbg.declare(metadata i32* %b, ... "b")
          tail call void @llvm.dbg.declare(metadata i32* %c, ... "c")
      
      Even though these GEPs can be simplified to a constant offset from EBP
      or RSP, we don't do that at -O0, and each GEP is computed into a
      register. Registers used to compute argument addresses are typically
      spilled and clobbered very quickly after the initial computation, so
      live debug variable tracking loses information very quickly if we use
      DBG_VALUE instructions.
      
      This change moves processing of dbg.declare between argument lowering
      and basic block isel, so that we can ask if an argument has a frame
      index or not. If the argument lives in a register, as is the case for
      byval arguments on some targets, then we don't put it in the side table,
      and during ISel we emit DBG_VALUE instructions instead.
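
      The decision described above can be sketched as a small standalone
      model. This is not the SelectionDAG code: ToyArg, ToyDebugInfo and
      classifyDeclares are hypothetical names invented for illustration, and
      the frame index is modelled as a plain int.

      ```cpp
      #include <cassert>
      #include <map>
      #include <string>
      #include <vector>

      // Hypothetical description of one argument after argument lowering.
      struct ToyArg {
        std::string VarName;
        bool HasFrameIndex; // false when the argument lives in a register
        int FrameIndex;     // only meaningful when HasFrameIndex is true
      };

      // Hypothetical result: where each declared variable ends up.
      struct ToyDebugInfo {
        std::map<std::string, int> SideTable;   // variable -> frame index
        std::vector<std::string> NeedsDbgValue; // left for DBG_VALUE during ISel
      };

      // Arguments with a frame index go into the side table; the rest are
      // deferred so that instruction selection can emit DBG_VALUE for them.
      inline ToyDebugInfo classifyDeclares(const std::vector<ToyArg> &Args) {
        ToyDebugInfo Info;
        for (const ToyArg &A : Args) {
          if (A.HasFrameIndex)
            Info.SideTable[A.VarName] = A.FrameIndex;
          else
            Info.NeedsDbgValue.push_back(A.VarName);
        }
        return Info;
      }
      ```

      In this model an inalloca argument with a fixed stack slot lands in
      SideTable, while a byval argument passed in a register lands in
      NeedsDbgValue.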
      
      Reviewers: aprantl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32980
      
      llvm-svn: 302483
      45efcf0c
  2. May 08, 2017
    • Reid Kleckner's avatar
      Don't add DBG_VALUE instructions for static allocas in dbg.declare · bf828eed
      Reid Kleckner authored
      Summary:
      An llvm.dbg.declare of a static alloca is always added to the
      MachineFunction dbg variable map, so these values are entirely
      redundant. They survive all the way through codegen to be ignored by
      DWARF emission.
      
      This effectively reverts r113967.
      
      Two bugpoint-reduced test cases from 2012 broke as a result of this
      change. Despite my best efforts, I haven't been able to rewrite the test
      cases using dbg.value. I'm not too concerned about the lost coverage
      because these were reduced from the test-suite, which we still run.
      
      Reviewers: aprantl, dblaikie
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32920
      
      llvm-svn: 302461
      bf828eed
    • Dean Michael Berris's avatar
      [XRay] Custom event logging intrinsic · 9bcaed86
      Dean Michael Berris authored
      This patch introduces an LLVM intrinsic and a target opcode for custom event
      logging in XRay. Initially, its use case will be to allow users of XRay to log
      some type of string ("poor man's printf"). The target opcode compiles to a noop
      sled large enough to enable calling through to a runtime-determined relative
      function call. At runtime, when XRay is enabled, the sled is replaced by
      compiler-rt with a trampoline to the logic for creating the custom log entries.
      
      Future patches will implement the compiler-rt parts and clang-side support for
      emitting the IR corresponding to this intrinsic.
      
      Reviewers: timshen, dberris
      
      Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27503
      
      llvm-svn: 302405
      9bcaed86
  3. May 06, 2017
  4. May 05, 2017
    • Reid Kleckner's avatar
      Simplify dbg.value handling in SDISel with early returns · ac1a97b3
      Reid Kleckner authored
      No functional change other than improving dbgs logging accuracy on
      constant dbg values. Previously we would add things like "i32 42" as
      debug values, and then log that we were dropping the debug info, which
      is silly.
      
      Delete some dead code that was checking for static allocas. This
      remained after r207165, but served no purpose. Currently, static alloca
      dbg.values are always sent through the DanglingDebugInfoMap, and are
      usually made valid the first time the alloca is used.
      
      llvm-svn: 302267
      ac1a97b3
    • Craig Topper's avatar
      [KnownBits] Add wrapper methods for setting and clearing all bits in the... · f0aeee01
      Craig Topper authored
      [KnownBits] Add wrapper methods for setting and clearing all bits in the underlying APInts in KnownBits.
      
      This adds routines for resetting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.
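
      A minimal sketch of what such wrappers look like, assuming a fixed
      8-bit value instead of the arbitrary-width APInts the real struct
      wraps. ToyKnownBits and its method names are illustrative stand-ins,
      not taken from the commit text.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Toy stand-in for KnownBits over a fixed 8-bit value.
      struct ToyKnownBits {
        std::uint8_t Zero = 0; // a set bit means "this bit is known to be 0"
        std::uint8_t One = 0;  // a set bit means "this bit is known to be 1"

        // Forget everything: no bit is known.
        void resetAll() { Zero = 0; One = 0; }
        // Make the value known to be all zeros.
        void setAllZero() { Zero = 0xFF; One = 0; }
        // Make the value known to be all ones.
        void setAllOnes() { Zero = 0; One = 0xFF; }

        bool isUnknown() const { return Zero == 0 && One == 0; }
        bool isZero() const { return Zero == 0xFF; }
        bool isAllOnes() const { return One == 0xFF; }
      };
      ```

      A default-constructed ToyKnownBits reports isUnknown(); after
      setAllZero() it reports isZero(), and resetAll() returns it to the
      unknown state.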
      
      Differential Revision: https://reviews.llvm.org/D32637
      
      llvm-svn: 302262
      f0aeee01
  5. May 04, 2017
  6. May 03, 2017
  7. May 02, 2017
  8. May 01, 2017
    • Craig Topper's avatar
      [SelectionDAG] Use known ones to provide a better bound for the known zeros... · 6b1b630a
      Craig Topper authored
      [SelectionDAG] Use known ones to provide a better bound for the known zeros for CTTZ/CTLZ operations.
      
      This is the SelectionDAG version of D32521. If we know where at least one 1 is located in the input to these intrinsics, we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output.
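
      As a worked illustration of that bound for CTTZ, here is a sketch over
      32-bit values. The helper names are hypothetical and plain integers
      stand in for APInt. If the lowest known-one bit of the input is at
      position P, the trailing-zero count is at most P, so every result bit
      above the width needed to represent P is known zero.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Number of bits needed to represent N (0 when N == 0).
      inline unsigned bitsNeeded(unsigned N) {
        unsigned Bits = 0;
        while (N != 0) {
          ++Bits;
          N >>= 1;
        }
        return Bits;
      }

      // Index of the lowest set bit in Mask, or 32 if no bit is set.
      inline unsigned lowestSetBit(std::uint32_t Mask) {
        for (unsigned Idx = 0; Idx < 32; ++Idx)
          if ((Mask >> Idx) & 1u)
            return Idx;
        return 32;
      }

      // Given the input bits known to be 1, return the mask of result bits
      // of a 32-bit cttz that are known to be 0.
      inline std::uint32_t knownZeroOfCttz(std::uint32_t KnownOne) {
        unsigned MaxCount = lowestSetBit(KnownOne); // count can't exceed this
        unsigned LowBits = bitsNeeded(MaxCount);    // at most 6 for 32-bit input
        return ~std::uint32_t{0} << LowBits;
      }
      ```

      For example, a known one at bit 4 limits the count to the range 0..4,
      which fits in 3 bits, so knownZeroOfCttz(0x10) yields 0xFFFFFFF8.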
      
      I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets, usually because ADD was turned into OR based on this new information.
      
      I'll fix CTPOP in a future patch.
      
      Differential Revision: https://reviews.llvm.org/D32692
      
      llvm-svn: 301806
      6b1b630a
    • Amara Emerson's avatar
      Generalize the specialized flag-carrying SDNodes by moving flags into SDNode. · d28f0cd4
      Amara Emerson authored
      This removes BinaryWithFlagsSDNode, and flags are now all passed by value.
      
      Differential Revision: https://reviews.llvm.org/D32527
      
      llvm-svn: 301803
      d28f0cd4
    • Sanjay Patel's avatar
      [DAGCombiner] shrink/widen a vselect to match its condition operand size (PR14657) · ad13826a
      Sanjay Patel authored
      We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that
      patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check
      legality of the select that we want to produce.
      
      A few things to note:
      
      1. We can't wait until after legalization and do this generically because (at least in the x86
         tests from PR14657), we'll have PACKSS and bitcasts in the pattern.
      2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but
         that requires a closer look to make sure we don't end up worse.
      3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases. 
         That should be fixed next.
      4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple 
         legal vector sizes, but if there are other targets like that, we should add more tests.
      5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests):
         despite IR that is terrible for the target, this patch allows us to generate the optimal loop
         code because something post-ISEL is hoisting the splat extends above the vector loops.
      
      Differential Revision: https://reviews.llvm.org/D32620
      
      llvm-svn: 301781
      ad13826a
  9. Apr 30, 2017
  10. Apr 29, 2017
    • Craig Topper's avatar
      [KnownBits] Add methods for determining if the known bits represent a... · ca48af3c
      Craig Topper authored
      [KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state
      
      Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit.
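
      The sign-bit helpers can be pictured with a toy 8-bit model. This is a
      sketch only: ToySignBits is a hypothetical stand-in for the real
      APInt-based struct, and the assertions guarding against contradictory
      state are an assumption of this example.

      ```cpp
      #include <cassert>
      #include <cstdint>

      constexpr std::uint8_t kSignMask = 0x80; // top bit of an 8-bit value

      struct ToySignBits {
        std::uint8_t Zero = 0; // a set bit means "known to be 0"
        std::uint8_t One = 0;  // a set bit means "known to be 1"

        // The sign bit is known to be 1.
        bool isNegative() const { return (One & kSignMask) != 0; }
        // The sign bit is known to be 0.
        bool isNonNegative() const { return (Zero & kSignMask) != 0; }

        void makeNegative() {
          assert(!isNonNegative() && "sign bit already known to be 0");
          One |= kSignMask;
        }
        void makeNonNegative() {
          assert(!isNegative() && "sign bit already known to be 1");
          Zero |= kSignMask;
        }
      };
      ```

      Note that both queries return false when the sign bit is simply
      unknown, so isNonNegative() is not the negation of isNegative().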
      
      Reviewers: RKSimon, spatel, davide
      
      Reviewed By: RKSimon
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32651
      
      llvm-svn: 301747
      ca48af3c
  11. Apr 28, 2017
    • Reid Kleckner's avatar
      Make getParamAlignment use argument numbers · 859f8b54
      Reid Kleckner authored
      The method is called "get *Param* Alignment", and is only used for
      return values exactly once, so it should take argument indices, not
      attribute indices.
      
      Avoids confusing code like:
        IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
        Alignment  = CS->getParamAlignment(ArgIdx + 1);
      
      Add getRetAlignment to handle the one case in Value.cpp that wants the
      return value alignment.
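
      The off-by-one that this change hides can be modelled in a few lines.
      ToyCallSite is a hypothetical stand-in, and the slot layout (return
      value in slot 0, argument N in slot N + 1) follows the convention the
      surrounding commits describe.

      ```cpp
      #include <cassert>
      #include <vector>

      constexpr unsigned kReturnIndex = 0;   // attribute slot of the return value
      constexpr unsigned kFirstArgIndex = 1; // attribute slot of argument 0

      // Toy call site: alignments stored by attribute slot.
      struct ToyCallSite {
        std::vector<unsigned> AlignBySlot;

        // Takes an argument number; the +1 adjustment lives here, not in callers.
        unsigned getParamAlignment(unsigned ArgNo) const {
          return AlignBySlot[ArgNo + kFirstArgIndex];
        }
        // Dedicated accessor for the one caller that wants the return slot.
        unsigned getRetAlignment() const { return AlignBySlot[kReturnIndex]; }
      };
      ```

      With this shape a caller can pass the same ArgIdx to every
      per-argument query instead of mixing ArgIdx and ArgIdx + 1.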
      
      This is a potentially breaking change for out-of-tree backends that do
      their own call lowering.
      
      llvm-svn: 301682
      859f8b54
    • Matthias Braun's avatar
      TargetLowering: Add finalizeLowering() function; NFC · 744c215e
      Matthias Braun authored
      Adds a new method finalizeLowering to TargetLoweringBase. This is in
      preparation for an upcoming commit.
      
      This function is meant for target specific adjustments to
      MachineFrameInfo or register reservations.
      
      Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
      handling into the new function to prove the concept. As an added bonus,
      GlobalISel no longer misses the hasCopyImplyingStackAdjustment()
      handling with this.
      
      Differential Revision: https://reviews.llvm.org/D32621
      
      llvm-svn: 301679
      744c215e
    • Reid Kleckner's avatar
      Use Argument::hasAttribute and AttributeList::ReturnIndex more · 6652a52e
      Reid Kleckner authored
      This eliminates many extra 'Idx' induction variables in loops over
      arguments in CodeGen/ and Target/. It also reduces the number of places
      where we assume that ReturnIndex is 0 and that we should add one to
      argument numbers to get the corresponding attribute list index.
      
      NFC
      
      llvm-svn: 301666
      6652a52e
    • Craig Topper's avatar
      24db6b80
    • Jun Bum Lim's avatar
      [InlineCost] Improve the cost heuristic for Switch · 919f9e8d
      Jun Bum Lim authored
      Summary:
      The motivating example is shown below; it has 13 cases but only 2 distinct targets:
      
      ```
      lor.lhs.false2:                                   ; preds = %if.then
        switch i32 %Status, label %if.then27 [
          i32 -7012, label %if.end35
          i32 -10008, label %if.end35
          i32 -10016, label %if.end35
          i32 15000, label %if.end35
          i32 14013, label %if.end35
          i32 10114, label %if.end35
          i32 10107, label %if.end35
          i32 10105, label %if.end35
          i32 10013, label %if.end35
          i32 10011, label %if.end35
          i32 7008, label %if.end35
          i32 7007, label %if.end35
          i32 5002, label %if.end35
        ]
      ```
      which is compiled into a balanced binary tree like this on AArch64 (similar on X86)
      
      ```
      .LBB853_9:                              // %lor.lhs.false2
              mov     w8, #10012
              cmp             w19, w8
              b.gt    .LBB853_14
      // BB#10:                               // %lor.lhs.false2
              mov     w8, #5001
              cmp             w19, w8
              b.gt    .LBB853_18
      // BB#11:                               // %lor.lhs.false2
              mov     w8, #-10016
              cmp             w19, w8
              b.eq    .LBB853_23
      // BB#12:                               // %lor.lhs.false2
              mov     w8, #-10008
              cmp             w19, w8
              b.eq    .LBB853_23
      // BB#13:                               // %lor.lhs.false2
              mov     w8, #-7012
              cmp             w19, w8
              b.eq    .LBB853_23
              b       .LBB853_3
      .LBB853_14:                             // %lor.lhs.false2
              mov     w8, #14012
              cmp             w19, w8
              b.gt    .LBB853_21
      // BB#15:                               // %lor.lhs.false2
              mov     w8, #-10105
              add             w8, w19, w8
              cmp             w8, #9          // =9
              b.hi    .LBB853_17
      // BB#16:                               // %lor.lhs.false2
              orr     w9, wzr, #0x1
              lsl     w8, w9, w8
              mov     w9, #517
              and             w8, w8, w9
              cbnz    w8, .LBB853_23
      .LBB853_17:                             // %lor.lhs.false2
              mov     w8, #10013
              cmp             w19, w8
              b.eq    .LBB853_23
              b       .LBB853_3
      .LBB853_18:                             // %lor.lhs.false2
              mov     w8, #-7007
              add             w8, w19, w8
              cmp             w8, #2          // =2
              b.lo    .LBB853_23
      // BB#19:                               // %lor.lhs.false2
              mov     w8, #5002
              cmp             w19, w8
              b.eq    .LBB853_23
      // BB#20:                               // %lor.lhs.false2
              mov     w8, #10011
              cmp             w19, w8
              b.eq    .LBB853_23
              b       .LBB853_3
      .LBB853_21:                             // %lor.lhs.false2
              mov     w8, #14013
              cmp             w19, w8
              b.eq    .LBB853_23
      // BB#22:                               // %lor.lhs.false2
              mov     w8, #15000
              cmp             w19, w8
              b.ne    .LBB853_3
      ```
      However, the inline cost model estimates the cost to be linear with the number
      of distinct targets and the cost of the above switch is just 2 InstrCosts.
      The function containing this switch is then inlined about 900 times.
      
      This change uses the general way of switch lowering for the inline heuristic. It
      estimates the number of case clusters with the suitability check for a jump table
      or bit test. Considering the binary search tree built for the clusters, this
      change modifies the model to be linear with the size of the balanced binary
      tree. The model is off by default for now:
        -inline-generic-switch-cost=false
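
      To see why counting targets underestimates the switch above, here is a
      deliberately simplified sketch. It only merges consecutive case values
      that share a target into one cluster and ignores the jump-table and
      bit-test checks that the real lowering performs, so its cluster count
      is an upper bound for this example. All names are hypothetical.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstddef>
      #include <cstdint>
      #include <set>
      #include <utility>
      #include <vector>

      // (case value, id of the target block)
      using Case = std::pair<std::int64_t, int>;

      // What the old model scaled with: distinct non-default targets.
      inline std::size_t countDistinctTargets(const std::vector<Case> &Cases) {
        std::set<int> Targets;
        for (const Case &C : Cases)
          Targets.insert(C.second);
        return Targets.size();
      }

      // Simplified cluster count: a case extends the previous cluster only if
      // its value is exactly one greater and it jumps to the same target.
      inline std::size_t countRangeClusters(std::vector<Case> Cases) {
        std::sort(Cases.begin(), Cases.end());
        std::size_t Clusters = 0;
        for (std::size_t I = 0; I < Cases.size(); ++I) {
          bool Extends = I > 0 && Cases[I].second == Cases[I - 1].second &&
                         Cases[I].first == Cases[I - 1].first + 1;
          if (!Extends)
            ++Clusters;
        }
        return Clusters;
      }
      ```

      For the 13 cases in the example, all of which jump to %if.end35, this
      gives 1 distinct non-default target but 12 range clusters (only 7007
      and 7008 merge), which is much closer to the amount of compare-and-
      branch code in the assembly listing.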
      
      This change was originally proposed by Haicheng in D29870.
      
      Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier
      
      Reviewed By: hans
      
      Subscribers: joerg, aemerson, llvm-commits, rengolin
      
      Differential Revision: https://reviews.llvm.org/D31085
      
      llvm-svn: 301649
      919f9e8d
    • Simon Pilgrim's avatar
      [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR... · 7ae9419d
      Simon Pilgrim authored
      [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT (reapplied)
      
      Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element.
      
      llvm-svn: 301644
      7ae9419d
    • Craig Topper's avatar
      [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits · d0af7e8a
      Craig Topper authored
      This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
      
      This is largely a mechanical transformation from KnownZero to Known.Zero.
      
      Differential Revision: https://reviews.llvm.org/D32569
      
      llvm-svn: 301620
      d0af7e8a
    • Craig Topper's avatar
      [SelectionDAG] Use various APInt methods to reduce temporary APInt creation · 0e03e74e
      Craig Topper authored
      This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version.
      
      llvm-svn: 301618
      0e03e74e
    • Craig Topper's avatar
      [APInt] Use inplace shift methods where possible. NFCI · 24e71017
      Craig Topper authored
      llvm-svn: 301612
      24e71017
  12. Apr 27, 2017
  13. Apr 26, 2017
    • Sanjay Patel's avatar
      [DAGCombiner] add (sext i1 X), 1 --> zext (not i1 X) · a0547c3d
      Sanjay Patel authored
      Besides better codegen, the motivation is to be able to canonicalize this pattern 
      in IR (currently we don't) knowing that the backend is prepared for that.
      
      This may also allow removing code for special constant cases in 
      DAGCombiner::foldSelectOfConstants() that was added in D30180.
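
      The fold can be checked exhaustively, since an i1 has only two values.
      The sketch below models the extensions to i32 with ordinary integers;
      the function names are made up for illustration.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // sext i1 -> i32: true becomes -1 (all ones), false becomes 0.
      constexpr std::int32_t sextI1(bool X) { return X ? -1 : 0; }
      // zext i1 -> i32: true becomes 1, false becomes 0.
      constexpr std::int32_t zextI1(bool X) { return X ? 1 : 0; }

      // Pattern before the combine: add (sext i1 X), 1
      constexpr std::int32_t beforeFold(bool X) { return sextI1(X) + 1; }
      // Pattern after the combine: zext (not i1 X)
      constexpr std::int32_t afterFold(bool X) { return zextI1(!X); }

      // Both inputs agree, so the rewrite preserves the value.
      static_assert(beforeFold(false) == afterFold(false), "X = 0 must agree");
      static_assert(beforeFold(true) == afterFold(true), "X = 1 must agree");
      ```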
      
      Differential Revision: https://reviews.llvm.org/D31944
      
      llvm-svn: 301457
      a0547c3d
    • Craig Topper's avatar
      [ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits · b45eabcf
      Craig Topper authored
      This patch introduces a new KnownBits struct that wraps the two APInts used by computeKnownBits. This allows us to treat them as more of a unit.
      
      Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do something similar for the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
      
      I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. One place default constructed the APInts, so I added a default constructor for those cases.
      
      Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
      
      Differential Revision: https://reviews.llvm.org/D32376
      
      llvm-svn: 301432
      b45eabcf
    • Sanjay Patel's avatar
      [TargetLowering] fix isConstTrueVal to account for build vector truncation · e2ec05a6
      Sanjay Patel authored
      Build vectors have magical truncation powers, so we have things like this:
      
      v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
      v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
      
      If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find 
      truth when ZeroOrNegativeOneBooleanContent is the rule.
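
      A small model of the problem, assuming 32-bit splat constants and
      hypothetical helper names. Under ZeroOrNegativeOneBooleanContent,
      "true" means all ones at the element width, so the i32 constant 1 only
      looks true for v4i1 once it has been truncated to 1 bit.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // All-ones check on the untruncated 32-bit splat constant.
      inline bool isAllOnesNoTrunc(std::uint32_t SplatValue) {
        return SplatValue == ~std::uint32_t{0};
      }

      // All-ones check after truncating to the vector element width
      // (1 <= EltBits <= 32).
      inline bool isAllOnesAfterTrunc(std::uint32_t SplatValue, unsigned EltBits) {
        std::uint32_t Mask =
            EltBits >= 32 ? ~std::uint32_t{0} : (std::uint32_t{1} << EltBits) - 1u;
        return (SplatValue & Mask) == Mask;
      }
      ```

      For the two BUILD_VECTOR nodes above, isAllOnesAfterTrunc(1, 1) is
      true for the v4i1 case and isAllOnesAfterTrunc(1, 16) is false for the
      v4i16 case, while the untruncated check rejects both.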
      
      Differential Revision: https://reviews.llvm.org/D32505
      
      llvm-svn: 301408
      e2ec05a6
    • Ranjeet Singh's avatar
      Fix signed multiplication with overflow fallback. · acbd4e14
      Ranjeet Singh authored
      For targets that don't have ISD::MULHS or ISD::SMUL_LOHI for the type,
      and where the double-width type is illegal, the two operands are
      sign-extended to twice their size and then multiplied to check for
      overflow. The extended upper halves were mismatched, causing an
      incorrect result. This fixes the mismatch.
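
      The widen-and-multiply idea can be sketched in plain C++ for 32-bit
      operands. This is only a model of the arithmetic: the legalizer code
      builds the extended halves by hand because the double-width type is
      illegal there, whereas this sketch assumes a native 64-bit type is
      available. The function name is hypothetical.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <limits>

      // Returns true if the signed 32-bit product A * B overflows.
      inline bool smulOverflows32(std::int32_t A, std::int32_t B) {
        // Sign-extend both operands to twice the width; the 64-bit product
        // of two 32-bit values cannot itself overflow.
        std::int64_t Wide = static_cast<std::int64_t>(A) * static_cast<std::int64_t>(B);
        // Overflow iff the exact product does not fit back into 32 bits.
        return Wide < std::numeric_limits<std::int32_t>::min() ||
               Wide > std::numeric_limits<std::int32_t>::max();
      }
      ```

      For instance, 46340 * 46340 still fits in 32 bits, while
      46341 * 46341 and INT32_MIN * -1 do not.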
      
      A test was added for ARM V6-M where the bug was detected.
      
      Patch by James Duley.
      
      Differential Revision: https://reviews.llvm.org/D31807
      
      llvm-svn: 301404
      acbd4e14
    • Sanjay Patel's avatar
      [DAG] add FIXME comments for splat detection; NFC · a4b4e938
      Sanjay Patel authored
      llvm-svn: 301403
      a4b4e938
    • Sanjay Patel's avatar
      [DAG] fix formatting of isConstantSplat(); NFC · 7a8317c0
      Sanjay Patel authored
      llvm-svn: 301366
      7a8317c0