- Apr 21, 2011
- Evan Cheng authored
llvm-svn: 129884
- Apr 20, 2011
- Jakob Stoklund Olesen authored
On the x86-64 and thumb2 targets, some registers are more expensive to encode than others in the same register class. Add a CostPerUse field to the TableGen register description, and make it available from TRI->getCostPerUse. This represents the cost of a REX prefix or a 32-bit instruction encoding required by choosing a high register. Teach the greedy register allocator to prefer cheap registers for busy live ranges (as indicated by spill weight). llvm-svn: 129864
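A minimal sketch of how an allocator-side heuristic could consult the new hook; only TRI->getCostPerUse() is from this commit, while the helper name, header path, and tie-breaking policy are assumptions:

    // Prefer the register without an encoding penalty (e.g. a REX prefix on
    // x86-64 or a 32-bit-only encoding on Thumb2) when both are available.
    #include "llvm/Target/TargetRegisterInfo.h"  // header path of this era

    unsigned pickCheaperReg(const llvm::TargetRegisterInfo &TRI,
                            unsigned RegA, unsigned RegB) {
      // getCostPerUse() returns the extra per-use encoding cost; ties keep RegA.
      return TRI.getCostPerUse(RegB) < TRI.getCostPerUse(RegA) ? RegB : RegA;
    }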
- Justin Holewinski authored
used by Clang. To help Clang integration, the PTX target has been split into two targets, ptx32 and ptx64, depending on the desired pointer size.
- Add GCCBuiltin class to all intrinsics
- Split PTX target into ptx32 and ptx64
llvm-svn: 129851
- Che-Liang Chiou authored
Patched by Dan Bailey llvm-svn: 129848
- Che-Liang Chiou authored
Patched by Dan Bailey llvm-svn: 129847
- Che-Liang Chiou authored
Patched by Dan Bailey llvm-svn: 129846
- Nick Lewycky authored
llvm is built with signed chars, where an immediate such as 0xff would be sign extended to 64 bits, turning "cmp $0xff,%eax" into "cmp $0xffffffffffffffff,%eax". llvm-svn: 129845
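The underlying pitfall is ordinary C/C++ integer widening; here is a small standalone demonstration (mine, not from the commit) of how the host's char signedness changes what the byte 0xff becomes at 64 bits:

    #include <cstdint>
    #include <cstdio>

    int main() {
      // The same byte pattern widens differently depending on signedness:
      signed char   sc = (signed char)0xff;  // -1 on two's-complement hosts
      unsigned char uc = 0xff;               // always 255
      std::printf("%016llx\n", (unsigned long long)(int64_t)sc); // ffffffffffffffff
      std::printf("%016llx\n", (unsigned long long)(int64_t)uc); // 00000000000000ff
      return 0;
    }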
- Rafael Espindola authored
llvm-svn: 129844
- Daniel Dunbar authored
triple component. llvm-svn: 129838
- Johnny Chen authored
llvm-svn: 129837
- Apr 19, 2011
- Daniel Dunbar authored
predicates. llvm-svn: 129816
- Daniel Dunbar authored
llvm-svn: 129813
- Daniel Dunbar authored
llvm-svn: 129812
- Daniel Dunbar authored
llvm-svn: 129811
- Daniel Dunbar authored
llvm-svn: 129810
- Daniel Dunbar authored
llvm-svn: 129809
- Daniel Dunbar authored
llvm-svn: 129803
- Eric Christopher authored
llvm-svn: 129781
- Bob Wilson authored (rdar://8659675)
Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has a latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause additional pipeline stalls, so it's frequently better to simply codegen vmul + vadd.
2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad. But this isn't ideal either:
    vmul
    vadd
    vmla
Instead, we want to expand the second vmla:
    vmla
    vmul
    vadd
Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.
Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable:
A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute an fmul and an fmla.
C. Add additional isel checks for vmla, avoiding cases where vmla is feeding into fp instructions (except for the #3 exceptional case).
D. Add an ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.
Enable these fp vmlx codegen changes for Cortex-A9. llvm-svn: 129775
- Bob Wilson authored
(and add a false dependency) when it isn't dependent on the last CPSR-defining instruction. rdar://8928208 llvm-svn: 129773
- Bob Wilson authored
Add an avoidWriteAfterWrite() target hook to identify register classes that suffer from write-after-write hazards. For those register classes, try to avoid writing the same register in two consecutive instructions. This is currently disabled by default; we should not spill to avoid hazards! The command line flag -avoid-waw-hazard can be used to enable WAW avoidance. llvm-svn: 129772
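A rough sketch of how a scheduling pass could consult such a hook when choosing destination registers; only avoidWriteAfterWrite() comes from this commit, and the helper below is a hypothetical consumer:

    #include "llvm/Target/TargetRegisterInfo.h"

    // Returns true if CandidateReg should be skipped because writing it
    // immediately after PrevDefReg would create a WAW hazard in class RC.
    bool shouldSkipCandidate(const llvm::TargetRegisterInfo &TRI,
                             const llvm::TargetRegisterClass *RC,
                             unsigned PrevDefReg, unsigned CandidateReg) {
      return TRI.avoidWriteAfterWrite(RC) && CandidateReg == PrevDefReg;
    }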
- Bob Wilson authored
pipelines, at least on Cortex-A9. llvm-svn: 129771
- Bob Wilson authored
llvm-svn: 129770
- Eli Friedman authored
llvm-svn: 129765
- Chris Lattner authored
en masse for C++ PODs. On my C++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5%. llvm-svn: 129755
- Chris Lattner authored
llvm-svn: 129753
- Chris Lattner authored
when they are a truncate from something else. This eliminates fully half of all the fastisel rejections on a test C++ file I'm working with, which should make a substantial improvement for -O0 compiles of C++ code. This fixes rdar://9297003 - fast isel bails out on all functions taking bools. llvm-svn: 129752
- Chris Lattner authored
Before we would bail out on i1 arguments altogether; now we just bail on non-constant ones. Also, we used to emit extraneous code. E.g., test12 was:
    movb $0, %al
    movzbl %al, %edi
    callq _test12
and test13 was:
    movb $0, %al
    xorl %edi, %edi
    movb %al, 7(%rsp)
    callq _test13f
Now we get:
    movl $0, %edi
    callq _test12
and:
    movl $0, %edi
    callq _test13f
llvm-svn: 129751
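For illustration, the kind of -O0 source this affects; the declarations below are a hypothetical reconstruction matching the assembly labels in the message:

    // Passing constant bools to calls. Previously FastISel either bailed on
    // the i1 argument or emitted the movb/movzbl shuffle shown above; now it
    // materializes the constant directly into %edi.
    extern void test12(bool);
    extern void test13f(bool);

    void caller() {
      test12(false);   // movl $0, %edi ; callq _test12
      test13f(false);  // movl $0, %edi ; callq _test13f
    }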
- Chris Lattner authored
    testb $1, %al
    je LBB0_2
    ## BB#1: ## %if.then
    movb $0, %al
instead of:
    testb $1, %al
    jne LBB0_1
    jmp LBB0_2
    LBB0_1: ## %if.then
    movb $0, %al
how 'bout that. llvm-svn: 129749
- Chris Lattner authored (rdar://9297006)
a common cause of fast isel rejects on C++ code. llvm-svn: 129748
- Evan Cheng authored
is, it assumes addresses are 64-bit aligned (which should be the more common case). If an address is found not to be 64-bit aligned, getOperandLatency() adjusts the operand latency computation by one to compensate. rdar://9294833 llvm-svn: 129742
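Schematically, the adjustment reads like the standalone sketch below; the real logic lives inside the ARM getOperandLatency() implementation, and the function and parameter names here are assumptions:

    // Assume 64-bit (8-byte) alignment unless the access is known to be
    // less aligned, in which case charge one extra cycle of latency.
    int adjustLoadStoreLatency(int BaseCycles, unsigned KnownAlignBytes) {
      const bool KnownUnderAligned = KnownAlignBytes != 0 && KnownAlignBytes < 8;
      return KnownUnderAligned ? BaseCycles + 1 : BaseCycles;
    }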
- Evan Cheng authored
llvm-svn: 129738
- Apr 18, 2011
- Jim Grosbach authored
llvm-svn: 129723
- Eric Christopher authored
true on success and false on failure. Update callers. llvm-svn: 129722
- Sean Callanan authored
superclass variable is instantiated properly. llvm-svn: 129713
- Chris Lattner authored
the generated FastISel. X86 doesn't need to generate code to match ADD16ri8 since ADD16ri will do just fine. This is a small codesize win in the generated instruction selector. llvm-svn: 129692