- Nov 02, 2011
-
-
Craig Topper authored
llvm-svn: 143529
-
- Nov 01, 2011
-
-
Eli Friedman authored
Teach the x86 backend a couple tricks for dealing with v16i8 sra by a constant splat value. Fixes PR11289. llvm-svn: 143498
-
Craig Topper authored
llvm-svn: 143455
-
- Oct 31, 2011
-
-
Craig Topper authored
llvm-svn: 143336
-
Craig Topper authored
llvm-svn: 143332
-
Craig Topper authored
llvm-svn: 143331
-
Nick Lewycky authored
-enable-dwarf-directory. llvm-svn: 143326
-
- Oct 30, 2011
-
-
Benjamin Kramer authored
X86: Emit logical shift by constant splat of <16 x i8> as a <8 x i16> shift and zero out the bits where zeros should've been shifted in. llvm-svn: 143315
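The lowering described in this commit can be modeled outside the compiler. The following is a minimal Python sketch, not LLVM code: the function names and the list-of-integers vector representation are assumptions made for illustration, and byte pairs are combined little-endian as on x86. It performs the shift on eight 16-bit lanes and then masks off the bits that crossed a byte boundary.

```python
def _as_words(lanes):
    # Reinterpret 16 byte lanes as eight little-endian 16-bit lanes.
    return [lanes[i] | (lanes[i + 1] << 8) for i in range(0, 16, 2)]


def _as_bytes(words, byte_mask):
    # Split each 16-bit lane back into two bytes and clear the bits that
    # leaked in from the neighbouring byte during the wide shift.
    out = []
    for w in words:
        out.append(w & 0xFF & byte_mask)
        out.append((w >> 8) & 0xFF & byte_mask)
    return out


def srl_v16i8(lanes, amt):
    """Logical right shift of 16 x i8 by a constant 0 <= amt < 8."""
    words = [w >> amt for w in _as_words(lanes)]
    return _as_bytes(words, 0xFF >> amt)


def shl_v16i8(lanes, amt):
    """Left shift of 16 x i8 by a constant 0 <= amt < 8."""
    words = [(w << amt) & 0xFFFF for w in _as_words(lanes)]
    return _as_bytes(words, (0xFF << amt) & 0xFF)
```

The mask is the same for every byte because the shift amount is a splat constant, which is why a single vector AND is enough after the wide shift.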
-
Craig Topper authored
Fix return type for X86 mpsadbw intrinsic. The instruction takes in a vector of 8-bit integers, but produces a vector of 16-bit integers. llvm-svn: 143313
-
Nadav Rotem authored
Fix PR11266. On x86: (shl V, 1) -> (add V, V). Hardware support for vector shifts is sparse, and in many cases we scalarize the result. Additionally, on Sandy Bridge padd is faster than shl. llvm-svn: 143311
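The rewrite is valid because shifting left by one and adding a value to itself agree under wrap-around arithmetic in every lane. A small Python sketch of that equivalence follows; the helper names and the default 32-bit lane width are assumptions for illustration only.

```python
def shl_by_one(lanes, bits=32):
    # Per-lane left shift by 1, truncated to the lane width.
    mask = (1 << bits) - 1
    return [(x << 1) & mask for x in lanes]


def add_to_self(lanes, bits=32):
    # Per-lane x + x with wrap-around at the lane width.
    mask = (1 << bits) - 1
    return [(x + x) & mask for x in lanes]
```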
-
Nadav Rotem authored
llvm-svn: 143307
-
- Oct 29, 2011
-
-
Nadav Rotem authored
If all of the inputs are zero/any_extended, create a new simple BV which can be further optimized by other BV optimizations. llvm-svn: 143297
-
Benjamin Kramer authored
llvm-svn: 143291
-
Dan Gohman authored
llvm-svn: 143262
-
- Oct 28, 2011
-
-
Dan Gohman authored
fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206
-
NAKAMURA Takumi authored
Don't assume APInt::getRawData() holds data in target-aware or host-compliant endianness. rawdata[0] holds the least significant i64, even on a big-endian host. FIXME: Add a testcase for a big-endian target. FIXME: Ditto on CompileUnit::addConstantFPValue()? llvm-svn: 143194
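To make the word-ordering point concrete, here is a hedged Python model of a multi-word integer whose word 0 is always the least significant 64 bits. It is not the LLVM implementation; `raw_words` and `emit_bytes` are hypothetical helpers. Bytes are extracted with shifts on word values, so the result depends only on the requested target byte order and never on the host.

```python
WORD_MASK = 0xFFFFFFFFFFFFFFFF


def raw_words(value, bit_width):
    """Split an unsigned value into 64-bit words, least significant first."""
    num_words = (bit_width + 63) // 64
    return [(value >> (64 * i)) & WORD_MASK for i in range(num_words)]


def emit_bytes(value, bit_width, little_endian):
    """Serialize the value for a target with the given byte order."""
    words = raw_words(value, bit_width)
    num_bytes = bit_width // 8
    # Byte i (counting from the least significant) lives in word i // 8.
    data = [(words[i // 8] >> (8 * (i % 8))) & 0xFF for i in range(num_bytes)]
    return bytes(data if little_endian else reversed(data))
```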
-
NAKAMURA Takumi authored
llvm-svn: 143189
-
Duncan Sands authored
it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188
-
Nick Lewycky authored
tools that read the debug info in the .o files by making the DIE sizes more consistent. llvm-svn: 143186
-
Dan Gohman authored
on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177
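One way to picture an operands-last visiting order is as a reversed post-order walk over operand edges, so that every node is seen before any of its operands. The Python sketch below illustrates only that ordering property; the dictionary-based DAG and the function name are assumptions, and it does not claim to reflect how LegalizeOps itself iterates.

```python
def operands_last_order(root, operands):
    """Return the nodes reachable from root so that each node precedes
    all of its operands (users first, operands last)."""
    post_order = []
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for op in operands.get(node, ()):
            visit(op)
        # In post-order a node is appended after all of its operands.
        post_order.append(node)

    visit(root)
    # Reversing post-order puts every user ahead of its operands.
    return post_order[::-1]
```

With this order, a node that uses a floating-point or vector constant is visited while the constant is still in its original form, which is the benefit the commit message describes.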
-
- Oct 27, 2011
-
-
Pete Cooper authored
llvm-svn: 143116
-
Nick Lewycky authored
llvm-svn: 143097
-
Eli Friedman authored
llvm-svn: 143095
-
- Oct 26, 2011
-
-
Rafael Espindola authored
Patch by Sanjoy Das. llvm-svn: 143066
-
Rafael Espindola authored
Patch by Sanjoy Das. llvm-svn: 143064
-
Rafael Espindola authored
MORESTACK_RET_RESTORE_R10; which are lowered to a RET and a RET followed by a MOV respectively. Having a fake instruction prevents the verifier from seeing a MachineBasicBlock end with a non-terminator (MOV). It also prevents the rather eccentric case of a MachineBasicBlock ending with RET but having successors nevertheless. Patch by Sanjoy Das. llvm-svn: 143062
-
- Oct 23, 2011
-
-
Chandler Carruth authored
discussions with Andy. Fundamentally, the previous algorithm is both counterproductive on several fronts and prioritizing things which aren't necessarily the most important: static branch prediction. The new algorithm uses the existing loop CFG structure information to walk through the CFG itself to lay out blocks. It coalesces adjacent blocks within the loop where the CFG allows based on the most likely path taken. Finally, it topologically orders the block chains that have been formed. This allows it to choose a (mostly) topologically valid ordering which still prioritizes fallthrough within the structural constraints. As a final twist in the algorithm, it does violate the CFG when it discovers a "hot" edge, that is an edge that is more than 4x hotter than the competing edges in the CFG. These are forcibly merged into a fallthrough chain. Future transformations that need to be added are rotation of loop exit conditions to be fallthrough, and better isolation of cold block chains. I'm also planning on adding statistics to model how well the algorithm does at laying out blocks based on the probabilities it receives. The old tests mostly still pass, and I have some new tests to add, but the nested loops are still behaving very strangely. This almost seems like working-as-intended as it rotated the exit branch to be fallthrough, but I'm not convinced this is actually the best layout. It is well supported by the probabilities for loops we currently get, but those are pretty broken for nested loops, so this may change later. llvm-svn: 142743
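The "hot" edge rule in this message can be sketched as a simple predicate. This is an illustration under assumptions, not the pass's code: the function name is hypothetical, edge weights are plain numbers, and "competing edges" is taken to mean whatever other edges the caller considers alternatives to the candidate.

```python
HOT_EDGE_FACTOR = 4  # "more than 4x hotter", per the commit message


def is_hot_edge(weight, competing_weights):
    """True if the candidate edge is more than 4x hotter than every
    competing edge. With no competitors the result is trivially True."""
    return all(weight > HOT_EDGE_FACTOR * w for w in competing_weights)
```

For example, an edge of weight 90 competing with edges of weight 10 and 20 qualifies, while an edge of weight 80 competing with one of weight 20 does not, because it is exactly 4x rather than more than 4x.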
-
- Oct 22, 2011
-
-
Nadav Rotem authored
SHL inserts zeros from the right, so even when the original sign_extend_inreg value was 1 bit wide, we still need an SRA. llvm-svn: 142724
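A small model shows why the arithmetic shift is required even for a 1-bit source. The Python sketch below expands sign_extend_inreg as a left shift followed by an arithmetic right shift on a fixed-width unsigned representation. The helper names are assumptions, and this is not the legalizer's code.

```python
def sra(x, amt, width):
    """Arithmetic right shift of a width-bit value held as an unsigned int."""
    mask = (1 << width) - 1
    sign = x >> (width - 1)
    x >>= amt
    if sign:
        # Replicate the sign bit into the vacated high bits.
        x |= (mask << (width - amt)) & mask
    return x


def sign_extend_inreg(x, from_bits, width):
    """Sign-extend the low from_bits of x to width bits via SHL then SRA."""
    mask = (1 << width) - 1
    amt = width - from_bits
    shifted = (x << amt) & mask  # SHL fills the low bits with zeros
    return sra(shifted, amt, width)
```

For a 1-bit value of 1 in an 8-bit lane, the SHL alone yields 0x80. Only the SRA turns that into 0xFF, the all-ones pattern that a sign-extended 1-bit value must have.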
-
- Oct 21, 2011
-
-
Nadav Rotem authored
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize. SetCC return type needs to be legalized via PromoteTargetBoolean. llvm-svn: 142660
-
Chandler Carruth authored
all x86 systems. Sorry for the breakage. llvm-svn: 142656
-
Nadav Rotem authored
2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1. llvm-svn: 142648
-
Chandler Carruth authored
it's a bit more plausible to use this instead of CodePlacementOpt. The code for this was shamelessly stolen from CodePlacementOpt, and then trimmed down a bit. There doesn't seem to be much utility in returning true/false from this pass as we may or may not have rewritten all of the blocks. Also, the statistic of counting how many loops were aligned doesn't seem terribly important so I removed it. If folks would like it to be included, I'm happy to add it back. This was probably the most egregious of the missing features, and now I'm going to start gathering some performance numbers and looking at specific loop structures that have different layout between the two. Test is updated to include both basic loop alignment and nested loop alignment. llvm-svn: 142645
-
Chandler Carruth authored
canonical example I used when developing it, and is one of the primary motivating real-world use cases for __builtin_expect (when buried under a macro). I'm working on more test cases here, but I'm trying to make sure both that the pass is doing the right thing with the test cases and that they aren't too brittle to changes elsewhere in the code generation pipeline. Feedback and/or suggestions on how to test this are very welcome. Especially feedback on whether testing the block comments is a good strategy; I couldn't find any good examples to steal from but all the other ideas I had were a lot uglier or more fragile. llvm-svn: 142644
-
Craig Topper authored
Remove the X86 BLSI, BLSMSK, and BLSR intrinsics and replace them with custom isel lowering code. llvm-svn: 142642
-
- Oct 20, 2011
-
-
- Oct 19, 2011
-
-
Nadav Rotem authored
When checking the availability of instructions using the TLI, a 'promoted' instruction IS available. It means that the value is bitcast to another type for which there is an operation. The correct check for the availability of an instruction is to check if it should be expanded. llvm-svn: 142542
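The distinction this commit draws can be sketched with a toy model of legalization actions. The enum and function below are illustrative assumptions in Python rather than the TargetLowering API, though the four action kinds mirror the Legal, Promote, Expand, and Custom categories used by LLVM's legalizer.

```python
from enum import Enum


class Action(Enum):
    LEGAL = 0    # natively supported for this type
    PROMOTE = 1  # performed on another type that has the operation
    EXPAND = 2   # must be broken into other operations
    CUSTOM = 3   # handled by target-specific lowering


def is_available(action):
    """An operation counts as available unless it has to be expanded."""
    return action is not Action.EXPAND
```

A check that accepted only `Action.LEGAL` would wrongly reject promoted operations, which is the situation the commit message describes.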
-
Nadav Rotem authored
llvm-svn: 142488
-
Craig Topper authored
llvm-svn: 142480
-
-
Nadav Rotem authored
llvm-svn: 142442
-