- May 09, 2017
-
-
Reid Kleckner authored
Summary: For inalloca functions, this is a very common code pattern: %argpack = type <{ i32, i32, i32 }> define void @f(%argpack* inalloca %args) { entry: %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0 %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1 %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2 tail call void @llvm.dbg.declare(metadata i32* %a, ... "a") tail call void @llvm.dbg.declare(metadata i32* %c, ... "b") tail call void @llvm.dbg.declare(metadata i32* %b, ... "c") Even though these GEPs can be simplified to a constant offset from EBP or RSP, we don't do that at -O0, and each GEP is computed into a register. Registers used to compute argument addresses are typically spilled and clobbered very quickly after the initial computation, so live debug variable tracking loses information very quickly if we use DBG_VALUE instructions. This change moves processing of dbg.declare between argument lowering and basic block isel, so that we can ask if an argument has a frame index or not. If the argument lives in a register as is the case for byval arguments on some targets, then we don't put it in the side table and during ISel we emit DBG_VALUE instructions. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32980 llvm-svn: 302483
-
- May 08, 2017
-
-
Reid Kleckner authored
Summary: An llvm.dbg.declare of a static alloca is always added to the MachineFunction dbg variable map, so these values are entirely redundant. They survive all the way through codegen to be ignored by DWARF emission. Effectively revert r113967 Two bugpoint-reduced test cases from 2012 broke as a result of this change. Despite my best efforts, I haven't been able to rewrite the test case using dbg.value. I'm not too concerned about the lost coverage because these were reduced from the test-suite, which we still run. Reviewers: aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32920 llvm-svn: 302461
-
Dean Michael Berris authored
This patch introduces an LLVM intrinsic and a target opcode for custom event logging in XRay. Initially, its use case will be to allow users of XRay to log some type of string ("poor man's printf"). The target opcode compiles to a noop sled large enough to enable calling through to a runtime-determined relative function call. At runtime, when X-Ray is enabled, the sled is replaced by compiler-rt with a trampoline to the logic for creating the custom log entries. Future patches will implement the compiler-rt parts and clang-side support for emitting the IR corresponding to this intrinsic. Reviewers: timshen, dberris Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D27503 llvm-svn: 302405
-
- May 06, 2017
-
-
Simon Pilgrim authored
Remove an extra canonicalization step if ISD::ABS is going to be used anyway. Updated x86 abs combine to check that we are lowering from both canonicalizations. llvm-svn: 302337
-
- May 05, 2017
-
-
Reid Kleckner authored
No functional change other than improving dbgs logging accuracy on constant dbg values. Previously we would add things like "i32 42" as debug values, and then log that we were dropping the debug info, which is silly. Delete some dead code that was checking for static allocas. This remained after r207165, but served no purpose. Currently, static alloca dbg.values are always sent through the DanglingDebugInfoMap, and are usually made valid the first time the alloca is used. llvm-svn: 302267
-
Craig Topper authored
[KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262
-
- May 04, 2017
-
-
Chad Rosier authored
Differential Revision: http://reviews.llvm.org/D32596 llvm-svn: 302153
-
Krzysztof Parzyszek authored
Patch by Wei-Ren Chen. Differential Revision: https://reviews.llvm.org/D32682 llvm-svn: 302148
-
Craig Topper authored
This is based on the same concept from ValueTracking's version of computeKnownBits. llvm-svn: 302110
-
Craig Topper authored
This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible. Differential Revision: https://reviews.llvm.org/D32784 llvm-svn: 302088
-
- May 03, 2017
-
-
Sanjay Patel authored
This is the DAG equivalent of https://reviews.llvm.org/D32255 , which will hopefully be committed again. The functionality (preferring a 'not' op) is already here in the DAG, so this is just intended to be a clean-up and performance improvement. llvm-svn: 302087
-
Amaury Sechet authored
Summary: Do the transform when the carry isn't used. It's a pattern exposed when legalizing large integers. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32755 llvm-svn: 302047
-
Tim Shen authored
Summary: This is the corresponding llvm change to D28037 to ensure no performance regression. Reviewers: bogner, kbarton, hfinkel, iteratee, echristo Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28329 llvm-svn: 301990
-
- May 02, 2017
-
-
Amaury Sechet authored
Summary: This is a common pattern that arise when legalizing large integers operations. Only do it when Y + 1 cannot overflow as this would change the carry behavior of uaddo . Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32687 llvm-svn: 301922
-
Amaury Sechet authored
Summary: Common pattern when legalizing large integers operations. Similar to D32687, when the carry isn't used. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Differential Revision: https://reviews.llvm.org/D32738 llvm-svn: 301919
-
Simon Pilgrim authored
PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats. This patch adds support for extension/truncation for both integer and float types. Differential Revision: https://reviews.llvm.org/D32391 llvm-svn: 301910
-
Simon Pilgrim authored
The existing code only looks at half of the tree when matching bswap + rol patterns ending in an OR tree (as opposed to a cascade). Patch originally introduced by Jim Lewis. Submitted on the behalf of Dinar Temirbulatov. Differential Revision: https://reviews.llvm.org/D32039 llvm-svn: 301907
-
- May 01, 2017
-
-
Craig Topper authored
[SelectionDAG] Use known ones to provide a better bound for the known zeros for CTTZ/CTLZ operations. This is the SelectionDAG version of D32521. If know where at least one 1 is located in the input to these intrinsics we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output. I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets. Usually because of turning ADD into OR based on this new information. I'll fix CTPOP in a future patch. Differential Revision: https://reviews.llvm.org/D32692 llvm-svn: 301806
-
Amara Emerson authored
This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803
-
Sanjay Patel authored
We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check legality of the select that we want to produce. A few things to note: 1. We can't wait until after legalization and do this generically because (at least in the x86 tests from PR14657), we'll have PACKSS and bitcasts in the pattern. 2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but that requires a closer look to make sure we don't end up worse. 3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases. That should be fixed next. 4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple legal vector sizes, but if there are other targets like that, we should add more tests. 5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests): despite IR that is terrible for the target, this patch allows us to generate the optimal loop code because something post-ISEL is hoisting the splat extends above the vector loops. Differential Revision: https://reviews.llvm.org/D32620 llvm-svn: 301781
-
- Apr 30, 2017
-
-
Amaury Sechet authored
Summary: As per discution on how to get better codegen an large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allow for more flexibility. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29872 llvm-svn: 301775
-
Craig Topper authored
[APInt] Replace calls to setBits with more specific calls to setBitsFrom and setLowBits where possible. llvm-svn: 301768
-
- Apr 29, 2017
-
-
Craig Topper authored
[KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit. Reviewers: RKSimon, spatel, davide Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32651 llvm-svn: 301747
-
- Apr 28, 2017
-
-
Reid Kleckner authored
The method is called "get *Param* Alignment", and is only used for return values exactly once, so it should take argument indices, not attribute indices. Avoids confusing code like: IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError); Alignment = CS->getParamAlignment(ArgIdx + 1); Add getRetAlignment to handle the one case in Value.cpp that wants the return value alignment. This is a potentially breaking change for out-of-tree backends that do their own call lowering. llvm-svn: 301682
-
Matthias Braun authored
Adds a new method finalizeLowering to TargetLoweringBase. This is in preparation for an upcoming commit. This function is meant for target specific adjustments to MachineFrameInfo or register reservations. Move the freezeRegisters() and the hasCopyImplyingStackAdjustment() handling into the new function to prove the concept. As an added bonus GlobalISel no longer missed the hasCopyImplyingStackAdjustment() handling with this. Differential Revision: https://reviews.llvm.org/D32621 llvm-svn: 301679
-
Reid Kleckner authored
This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666
-
Craig Topper authored
llvm-svn: 301656
-
Jun Bum Lim authored
Summary: The motivation example is like below which has 13 cases but only 2 distinct targets ``` lor.lhs.false2: ; preds = %if.then switch i32 %Status, label %if.then27 [ i32 -7012, label %if.end35 i32 -10008, label %if.end35 i32 -10016, label %if.end35 i32 15000, label %if.end35 i32 14013, label %if.end35 i32 10114, label %if.end35 i32 10107, label %if.end35 i32 10105, label %if.end35 i32 10013, label %if.end35 i32 10011, label %if.end35 i32 7008, label %if.end35 i32 7007, label %if.end35 i32 5002, label %if.end35 ] ``` which is compiled into a balanced binary tree like this on AArch64 (similar on X86) ``` .LBB853_9: // %lor.lhs.false2 mov w8, #10012 cmp w19, w8 b.gt .LBB853_14 // BB#10: // %lor.lhs.false2 mov w8, #5001 cmp w19, w8 b.gt .LBB853_18 // BB#11: // %lor.lhs.false2 mov w8, #-10016 cmp w19, w8 b.eq .LBB853_23 // BB#12: // %lor.lhs.false2 mov w8, #-10008 cmp w19, w8 b.eq .LBB853_23 // BB#13: // %lor.lhs.false2 mov w8, #-7012 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_14: // %lor.lhs.false2 mov w8, #14012 cmp w19, w8 b.gt .LBB853_21 // BB#15: // %lor.lhs.false2 mov w8, #-10105 add w8, w19, w8 cmp w8, #9 // =9 b.hi .LBB853_17 // BB#16: // %lor.lhs.false2 orr w9, wzr, #0x1 lsl w8, w9, w8 mov w9, #517 and w8, w8, w9 cbnz w8, .LBB853_23 .LBB853_17: // %lor.lhs.false2 mov w8, #10013 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_18: // %lor.lhs.false2 mov w8, #-7007 add w8, w19, w8 cmp w8, #2 // =2 b.lo .LBB853_23 // BB#19: // %lor.lhs.false2 mov w8, #5002 cmp w19, w8 b.eq .LBB853_23 // BB#20: // %lor.lhs.false2 mov w8, #10011 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_21: // %lor.lhs.false2 mov w8, #14013 cmp w19, w8 b.eq .LBB853_23 // BB#22: // %lor.lhs.false2 mov w8, #15000 cmp w19, w8 b.ne .LBB853_3 ``` However, the inline cost model estimates the cost to be linear with the number of distinct targets and the cost of the above switch is just 2 InstrCosts. The function containing this switch is then inlined about 900 times. This change use the general way of switch lowering for the inline heuristic. It etimate the number of case clusters with the suitability check for a jump table or bit test. Considering the binary search tree built for the clusters, this change modifies the model to be linear with the size of the balanced binary tree. The model is off by default for now : -inline-generic-switch-cost=false This change was originally proposed by Haicheng in D29870. Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier Reviewed By: hans Subscribers: joerg, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31085 llvm-svn: 301649
-
Simon Pilgrim authored
[DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT (reapplied) Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element. llvm-svn: 301644
-
Craig Topper authored
llvm-svn: 301626
-
Craig Topper authored
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620
-
Craig Topper authored
This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version. llvm-svn: 301618
-
Craig Topper authored
llvm-svn: 301612
-
- Apr 27, 2017
-
-
Sanjoy Das authored
Summary: The type of the target frame index is intptr, not the type of the value we're going to store into it. Without this change we crash in the attached test case when trying to type-legalize a TargetFrameIndex. Patchpoint lowering types the target frame index as intptr as well. Reviewers: reames, bogner, arsenm Subscribers: arsenm, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32256 llvm-svn: 301566
-
- Apr 26, 2017
-
-
Sanjay Patel authored
Besides better codegen, the motivation is to be able to canonicalize this pattern in IR (currently we don't) knowing that the backend is prepared for that. This may also allow removing code for special constant cases in DAGCombiner::foldSelectOfConstants() that was added in D30180. Differential Revision: https://reviews.llvm.org/D31944 llvm-svn: 301457
-
Craig Topper authored
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit. Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch. I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases. Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with. Differential Revision: https://reviews.llvm.org/D32376 llvm-svn: 301432
-
Sanjay Patel authored
Build vectors have magical truncation powers, so we have things like this: v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1> v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1> If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find truth when ZeroOrNegativeOneBooleanContent is the rule. Differential Revision: https://reviews.llvm.org/D32505 llvm-svn: 301408
-
Ranjeet Singh authored
For targets that don't have ISD::MULHS or ISD::SMUL_LOHI for the type and the double width type is illegal, then the two operands are sign extended to twice their size then multiplied to check for overflow. The extended upper halves were mismatched causing an incorrect result. This fixes the mismatch. A test was added for ARM V6-M where the bug was detected. Patch by James Duley. Differential Revision: https://reviews.llvm.org/D31807 llvm-svn: 301404
-
Sanjay Patel authored
llvm-svn: 301403
-
Sanjay Patel authored
llvm-svn: 301366
-