- Jan 11, 2012
-
-
Chandler Carruth authored
strange build bot failures that look like a miscompile into an infloop. I'll investigate this tomorrow, but I'd both like to know whether my patch is the culprit, and get the bots back to green. llvm-svn: 147945
-
Chandler Carruth authored
lots of lines of code. No functionality changed. llvm-svn: 147942
-
Chandler Carruth authored
SRL-rooted code. llvm-svn: 147941
-
Chandler Carruth authored
factor the differences that were hiding in one of them into its other caller, the SRL handling code. No change in behavior. llvm-svn: 147940
-
Chandler Carruth authored
mask+shift pairs at the beginning of the ISD::AND case block, and then hoist the final pattern into a helper function, simplifying and reflowing it appropriately. This should have no observable behavior change, but several simplifications fell out of this such as directly computing the new mask constant, etc. llvm-svn: 147939
-
Chandler Carruth authored
extracts and scaled addressing modes into its own helper function. No functionality changed here, just hoisting and layout fixes falling out of that hoisting. llvm-svn: 147937
-
Chandler Carruth authored
detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. This happens in real code as follows: unsigned x = my_accelerator_table[input >> 11]; Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4-bytes, which means this implicitly gets turned into (once lowered out of a GEP): *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2)); The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured. In order for all of this to work, one part of the DAG needs to be canonicalized *still further* than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern. llvm-svn: 147936
-
- Jan 09, 2012
-
-
Chandler Carruth authored
this substraction will result in small negative numbers at worst which become very large positive numbers on assignment and are thus caught by the <=4 check on the next line. The >0 check clearly intended to catch these as negative numbers. Spotted by inspection, and impossible to trigger given the shift widths that can be used. llvm-svn: 147773
-
- Nov 16, 2011
-
-
Pete Cooper authored
llvm-svn: 144811
-
- Nov 15, 2011
-
-
Pete Cooper authored
by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705
-
- Nov 03, 2011
-
-
Dan Gohman authored
across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660
-
- Oct 29, 2011
-
-
Dan Gohman authored
llvm-svn: 143262
-
- Oct 28, 2011
-
-
Dan Gohman authored
fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206
-
Duncan Sands authored
it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188
-
Dan Gohman authored
on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177
-
- Oct 08, 2011
-
-
Jakob Stoklund Olesen authored
In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot target all GR8 registers, only those in GR8_NOREX. TO enforce this, we ensure that all instructions using the EXTRACT_SUBREG are GR8_NOREX constrained. This fixes PR11088. llvm-svn: 141499
-
- Aug 01, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 136653
-
- Jul 13, 2011
-
-
Eli Friedman authored
Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile. <rdar://problem/9763308> llvm-svn: 135084
-
Eli Friedman authored
Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress. Part of <rdar://problem/9763308>. llvm-svn: 135079
-
- Jul 02, 2011
-
-
Eric Christopher authored
up the valid constant check earlier. rdar://9692967 llvm-svn: 134286
-
- Jun 30, 2011
-
-
Eric Christopher authored
we didn't have an opcode for 64-bit constant or expressions. Fixes rdar://9692967 llvm-svn: 134121
-
- May 20, 2011
-
-
Stuart Hastings authored
rdar://problem/8614450 llvm-svn: 131746
-
- May 17, 2011
-
-
Eric Christopher authored
llvm-svn: 131459
-
Eric Christopher authored
Finishes off rdar://8470697 llvm-svn: 131458
-
Eric Christopher authored
llvm-svn: 131457
-
Eric Christopher authored
llvm-svn: 131456
-
- May 11, 2011
-
-
Eric Christopher authored
Part of rdar://8470697 llvm-svn: 131200
-
Eric Christopher authored
Next up: xor and and. Part of rdar://8470697 llvm-svn: 131171
-
- Apr 23, 2011
-
-
Benjamin Kramer authored
llvm-svn: 130053
-
- Apr 22, 2011
-
-
Benjamin Kramer authored
X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039) This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is uint64_t foo(uint64_t x) { return (x&1) << 42; } which used to compile into bloated code: shlq $42, %rdi ## encoding: [0x48,0xc1,0xe7,0x2a] movabsq $4398046511104, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00] andq %rdi, %rax ## encoding: [0x48,0x21,0xf8] ret ## encoding: [0xc3] with this patch we can fold the immediate into the and: andq $1, %rdi ## encoding: [0x48,0x83,0xe7,0x01] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] shlq $42, %rax ## encoding: [0x48,0xc1,0xe0,0x2a] ret ## encoding: [0xc3] It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing that without making this code even more complicated. See the TODOs in the code. llvm-svn: 129990
-
- Feb 16, 2011
-
-
Stuart Hastings authored
other getNode() methods. Radar 9002173. llvm-svn: 125665
-
- Feb 13, 2011
-
-
Chris Lattner authored
have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470
-
- Jan 27, 2011
-
-
NAKAMURA Takumi authored
CALL64 marks %xmm* as dead. llvm-svn: 124354
-
- Jan 16, 2011
-
-
Chris Lattner authored
into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578
-
- Jan 14, 2011
-
-
Ted Kremenek authored
declaration and its assignments. Found by clang static analyzer. llvm-svn: 123486
-
- Jan 06, 2011
-
-
Bill Wendling authored
beginning of the "main" function. The assembler complains about the invalid suffix for the 'call' instruction. The right instruction is "callq __main". Patch by KS Sreeram! llvm-svn: 122933
-
- Dec 21, 2010
-
-
Chris Lattner authored
something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310
-
- Dec 05, 2010
-
-
Chris Lattner authored
backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935
-
- Oct 27, 2010
-
-
Dale Johannesen authored
memory, so a MachineMemOperand is useful (not propagated into the MachineInstr yet). No functional change except for dump output. llvm-svn: 117413
-
- Oct 06, 2010
-
-
Chris Lattner authored
(e.g. CMOVBE16rr instead of CMOVBErr16). llvm-svn: 115705
-