- Dec 06, 2010
-
-
Devang Patel authored
This time for the .s file. llvm-svn: 121016
-
Rafael Espindola authored
as llc + llvm-mc. This time ELF is not changed, and I tested that llvm-gcc bootstraps on darwin10 using darwin9's assembler and linker. llvm-svn: 121006
-
Rafael Espindola authored
linux and darwin assemblers happy :-( llvm-svn: 121004
-
Rafael Espindola authored
llvm-svn: 121001
-
Rafael Espindola authored
that no relocations are used (on MachO). Fixes llc producing different output from llc + llvm-mc. llvm-svn: 121000
-
Frits van Bommel authored
llvm-svn: 120998
-
Chris Lattner authored
optimization. Consider:
  static void foo() { A = alloca ... }
  static void bar() { B = alloca ... call foo(); }
  void main() { bar() }
The inliner proceeds bottom-up, but let's pretend it decides not to inline foo into bar. When it gets to main, it inlines bar into main() and says "hey, I just inlined an alloca B into main, let's remember that." Then it keeps going and finds that main now contains a call to foo. It decides to inline foo into main and says "hey, foo has an alloca A, and I have an alloca B from another inlined call site, let's reuse it." The problem, of course, is that the lifetimes of A and B are nested, not disjoint. Unfortunately I can't create a reasonable testcase for this: the one in the PR is both huge and extremely sensitive, because minor tweaks end up causing foo to get inlined into bar too early. We already have tests for the basic alloca merging optimization, and this change does not break them. llvm-svn: 120995
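To illustrate the shape of the miscompile, here is a minimal LLVM IR sketch (2010-era syntax; the function and value names are hypothetical, not taken from the PR): if the inliner merges %A into %B's stack slot, the store through A clobbers B while B is still live.

  define i8 @caller() {
  entry:
    %B = alloca i8           ; inlined from bar()
    store i8 1, i8* %B       ; B is initialized
    %A = alloca i8           ; inlined from foo(); its lifetime is nested inside B's
    store i8 2, i8* %A       ; with the bad merge, this store lands in B's slot
    %v = load i8* %B         ; ...so this load would return 2 instead of 1
    ret i8 %v
  }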
-
Chris Lattner authored
llvm-svn: 120994
-
Chris Lattner authored
llvm-svn: 120993
-
Michael J. Spencer authored
llvm-svn: 120991
-
Michael J. Spencer authored
llvm-svn: 120989
-
Michael J. Spencer authored
llvm-svn: 120988
-
Michael J. Spencer authored
llvm-svn: 120987
-
Michael J. Spencer authored
llvm-svn: 120986
-
Michael J. Spencer authored
llvm-svn: 120985
-
Michael J. Spencer authored
implementation needs it for wchar_t and SmallVectorImpl in general. llvm-svn: 120984
-
Che-Liang Chiou authored
llvm-svn: 120982
-
Rafael Espindola authored
that on the ELF writer to detect a section we created. llvm-svn: 120981
-
Rafael Espindola authored
llvm-svn: 120980
-
Rafael Espindola authored
llvm-svn: 120979
-
Rafael Espindola authored
llvm-svn: 120978
-
Rafael Espindola authored
llvm-svn: 120977
-
Chris Lattner authored
memcpy's like:
  memcpy(A, B)
  memcpy(A, C)
we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get:
  memcpy(A, B)
  memcpy(A, A)
which is not correct to transform into:
  memcpy(A, A)
This patch was heavily influenced by Jakub Staszak's patch in PR8728. Thanks Jakub! llvm-svn: 120974
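A rough LLVM IR sketch of the pattern (2010-era intrinsic signature; the function name @copies and the size/alignment constants are made up for illustration). Unless alias analysis proves %A and %C are disjoint, the first call must stay:

  declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1)

  define void @copies(i8* %A, i8* %B, i8* %C) {
  entry:
    ; looks dead: %A is overwritten by the next memcpy...
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %A, i8* %B, i64 16, i32 1, i1 false)
    ; ...but if %C aliases %A, deleting the first call leaves memcpy(A, A)
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %A, i8* %C, i64 16, i32 1, i1 false)
    ret void
  }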
-
Chris Lattner authored
llvm-svn: 120973
-
Evan Cheng authored
llvm-svn: 120971
-
NAKAMURA Takumi authored
llvm-svn: 120966
-
Evan Cheng authored
llvm-svn: 120965
-
Evan Cheng authored
llvm-svn: 120964
-
- Dec 05, 2010
-
-
Cameron Zwarich authored
StrongPHIElimination. llvm-svn: 120961
-
Evan Cheng authored
difficult on current ARM implementations for a few reasons:
1. Even though a single vmla has a latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first few cycles (4? on Cortex-A8) can cause additional pipeline stalls, so it's frequently better to simply codegen vmul + vadd.
2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad, but this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4-cycle vmul stall, the second sequence is still 2 cycles faster.
Up to now, isel has simply avoided codegen'ing fp vmla / vmls. This works well enough, but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable:
A. Add the missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute both an fmul and an fmla.
C. Add additional isel checks for vmla, avoiding cases where vmla feeds into fp instructions (except for the exceptional case in #3).
D. Add an ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely they will trigger one of the special hazards.
Work in progress; only A+B are enabled. llvm-svn: 120960
-
Cameron Zwarich authored
llvm-svn: 120959
-
Frits van Bommel authored
Clarify some of the differences between indexing with getelementptr and indexing with insertvalue/extractvalue. llvm-svn: 120957
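As a rough illustration of the distinction being documented (hypothetical names; 2010-era IR syntax): getelementptr indexes through a pointer, so its first index steps over the pointer itself, while extractvalue starts inside an aggregate value and takes no leading 0.

  %pair = type { i32, i32 }

  define i32 @second_field(%pair* %p, %pair %v) {
  entry:
    %addr = getelementptr %pair* %p, i32 0, i32 1   ; leading 0, then field 1
    %a = load i32* %addr
    %b = extractvalue %pair %v, 1                   ; field 1 directly, no leading 0
    %sum = add i32 %a, %b
    ret i32 %sum
  }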
-
Frits van Bommel authored
Also add asserts that the indices are valid in InsertValueInst::init(). ExtractValueInst already asserts when constructed with invalid indices. llvm-svn: 120956
-
Cameron Zwarich authored
PHIElimination.h. llvm-svn: 120953
-
Cameron Zwarich authored
time, this method existed, but now PHIElimination uses the method of the same name on MachineBasicBlock. llvm-svn: 120952
-
Cameron Zwarich authored
function so that it can be shared with StrongPHIElimination. llvm-svn: 120951
-
Frits van Bommel authored
Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output. Internally, it now stores the ConstantInt*s as Constant*s, and actual undef values instead of nulls. llvm-svn: 120946
-
Frits van Bommel authored
llvm-svn: 120945
-
Frits van Bommel authored
(indirectbr (select cond, blockaddress(@fn, BlockA), blockaddress(@fn, BlockB))) into (br cond, BlockA, BlockB). llvm-svn: 120943
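A minimal before/after sketch of the transform, under the assumption that both blockaddress operands refer to the enclosing function (names are hypothetical):

  define void @fn(i1 %cond) {
  entry:
    %dest = select i1 %cond, i8* blockaddress(@fn, %BlockA), i8* blockaddress(@fn, %BlockB)
    indirectbr i8* %dest, [label %BlockA, label %BlockB]
    ; after the fold, the select and indirectbr above collapse to:
    ;   br i1 %cond, label %BlockA, label %BlockB
  BlockA:
    ret void
  BlockB:
    ret void
  }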
-
Chris Lattner authored
result. This allows us to compile:
  void *test12(long count) { return new int[count]; }
into:
test12:
        movl    $4, %ecx
        movq    %rdi, %rax
        mulq    %rcx
        movq    $-1, %rdi
        cmovnoq %rax, %rdi
        jmp     __Znam          ## TAILCALL
instead of:
test12:
        movl    $4, %ecx
        movq    %rdi, %rax
        mulq    %rcx
        seto    %cl
        testb   %cl, %cl
        movq    $-1, %rdi
        cmoveq  %rax, %rdi
        jmp     __Znam
Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. llvm-svn: 120936
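The IR feeding this presumably uses the umul.with.overflow intrinsic; a hypothetical sketch of the pattern (the function name and the element-size constant are illustrative, 2010-era syntax):

  declare { i64, i1 } @llvm.umul.with.overflow.i64(i64, i64)

  define i64 @array_bytes(i64 %count) {
  entry:
    %mul = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %count, i64 4)
    %size = extractvalue { i64, i1 } %mul, 0
    %ovf = extractvalue { i64, i1 } %mul, 1
    ; on overflow produce -1, an allocation size operator new[] cannot satisfy
    %bytes = select i1 %ovf, i64 -1, i64 %size
    ret i64 %bytes
  }

The cmovnoq in the improved output is exactly this select: keep %size when the multiply did not overflow, else keep -1.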
-