- Sep 13, 2011
-
-
Jakob Stoklund Olesen authored
SplitKit will soon need two copies of these data structures, and the algorithms will also be useful when LiveIntervalAnalysis becomes independent of LiveVariables. llvm-svn: 139572
-
- Sep 12, 2011
-
-
Bill Wendling authored
Splitting a landing pad takes considerable care because of PHIs and other nasties. The problem is that the jump table needs to jump to the landing pad block. However, the landing pad block can be jumped to only by an invoke instruction. So we clone the landingpad instruction into its own basic block, have the invoke jump to there. The landingpad instruction's basic block's successor is now the target for the jump table. But because of PHI nodes, we need to create another basic block for the jump table to jump to. This is definitely a hack, because the values for the PHI nodes may not be defined on the edge from the jump table. But that's okay, because the jump table is simply a construct to mimic what is happening in the CFG. So the values are mysteriously there, even though there is no value for the PHI from the jump table's edge (hence calling this a hack). llvm-svn: 139545
-
Jakob Stoklund Olesen authored
It has been enabled by default for a while, it was only there to allow performance comparisons. llvm-svn: 139501
-
Jakob Stoklund Olesen authored
SplitKit always computes a complement live range to cover the places where the original live range was live, but no explicit region has been allocated. Currently, the complement live range is created to be as small as possible - it never overlaps any of the regions. This minimizes register pressure, but if the complement is going to be spilled anyway, that is not very important. The spiller will eliminate redundant spills, and hoist others by making the spill slot live range overlap some of the regions created by splitting. Stack slots are cheap. This patch adds the interface to enable spill modes in SplitKit. In spill mode, SplitKit will assume that the complement is going to spill, so it will allow it to overlap regions in order to avoid back-copies. By doing some of the spiller's work early, the complement live range becomes simpler. In some cases, it can become much simpler because no extra PHI-defs are required. This will speed up both splitting and spilling. This is only the interface to enable spill modes, no implementation yet. llvm-svn: 139500
-
Jakob Stoklund Olesen authored
llvm-svn: 139498
-
- Sep 10, 2011
-
-
Richard Trieu authored
assert("error"); to: assert(0 && "error"); llvm-svn: 139449
-
Chris Lattner authored
llvm-svn: 139419
-
- Sep 09, 2011
-
-
Eli Friedman authored
Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407
-
Jakob Stoklund Olesen authored
In some cases such as interpreters using indirectbr, the CFG can be very complicated, and live range splitting may be forced to insert a large number of phi-defs. When that happens, traceSiblingValue can spend a lot of time zipping around in the CFG looking for defs and reloads. This patch causes more information to be cached in SibValues, and the cached values are used to terminate searches early. This speeds up spilling by 20x in one interpreter test case. For more typical code, this is just a 10% speedup of spilling. The previous version had bugs that caused miscompilations. They have been fixed. llvm-svn: 139378
-
Devang Patel authored
Directly point debug info to the stack slot of the arugment, instead of trying to keep track of vreg in which it the arugment is copied. The LiveDebugVariable can keep track of variable's ranges. llvm-svn: 139330
-
- Sep 07, 2011
-
-
Jakob Stoklund Olesen authored
It broke the self host and clang-x86_64-darwin10-RA. llvm-svn: 139259
-
Jakob Stoklund Olesen authored
In some cases such as interpreters using indirectbr, the CFG can be very complicated, and live range splitting may be forced to insert a large number of phi-defs. When that happens, traceSiblingValue can spend a lot of time zipping around in the CFG looking for defs and reloads. This patch causes more information to be cached in SibValues, and the cached values are used to terminate searches early. This speeds up spilling by 20x in one interpreter test case. For more typical code, this is just a 10% speedup of spilling. llvm-svn: 139247
-
James Molloy authored
Refactor instprinter and mcdisassembler to take a SubtargetInfo. Add -mattr= handling to llvm-mc. Reviewed by Owen Anderson. llvm-svn: 139237
-
Eli Friedman authored
Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures for atomic laod/store on ARM. (The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.) llvm-svn: 139221
-
Devang Patel authored
While sinking machine instructions, sink matching DBG_VALUEs also otherwise live debug variable pass will drop DBG_VALUEs on the floor. llvm-svn: 139208
-
- Sep 06, 2011
-
-
Duncan Sands authored
with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159
-
Duncan Sands authored
init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140
-
- Sep 03, 2011
-
-
Owen Anderson authored
If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert. The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users. No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem. Fixes <rdar://problem/10063365>. llvm-svn: 139059
-
- Sep 02, 2011
-
-
Jakob Stoklund Olesen authored
llvm-svn: 139019
-
Duncan Sands authored
llvm-svn: 139015
-
Dan Gohman authored
to be unreliable on platforms which require memcpy calls, and it is complicating broader legalize cleanups. It is hoped that these cleanups will make memcpy byval easier to implement in the future. llvm-svn: 138977
-
Benjamin Kramer authored
- On COFF the .lcomm directive has an alignment argument. - On ELF we fall back to .local + .comm Based on a patch by NAKAMURA Takumi. Fixes PR9337, PR9483 and PR10128. llvm-svn: 138976
-
- Sep 01, 2011
-
-
Jakob Stoklund Olesen authored
An instruction may define part of a register where the other bits are undefined. In that case, it is safe to rematerialize the instruction. For example: %vreg2:ssub_0<def> = VLDRS <cp#0>, 0, pred:14, pred:%noreg, %vreg2<imp-def> The extra <imp-def> operand indicates that the instruction does not read the other parts of the virtual register, so a remat is safe. This patch simply allows multiple def operands for the virtual register. It is MI->readsVirtualRegister() that determines if we depend on a previous value so remat is impossible. llvm-svn: 138953
-
Jakob Stoklund Olesen authored
The problem is fixed for all register allocators by r138944, so this patch is no longer necessary. <rdar://problem/10032939> llvm-svn: 138945
-
Jakob Stoklund Olesen authored
An instruction that redefines only part of a larger register can never be rematerialized since the virtual register value depends on the old value in other parts of the register. This was fixed for the inline spiller in r138794. This patch fixes the problem for all register allocators, and includes a small test case. <rdar://problem/10032939> llvm-svn: 138944
-
Evan Cheng authored
Teach MachineLICM reg pressure tracking code to deal with MVT::untyped. Sorry, I can't come up with a small test case. rdar://10043690 llvm-svn: 138934
-
Andrew Trick authored
Added canClobberReachingPhysRegUse() to handle a particular pattern in which a two-address instruction could be forced to interfere with EFLAGS, causing a compare to be unnecessarilly cloned. Fixes rdar://problem/5875261 llvm-svn: 138924
-
- Aug 31, 2011
-
-
David Greene authored
Stores sizes as uint64_t to avoid possible truncation. llvm-svn: 138901
-
Eli Friedman authored
llvm-svn: 138887
-
Eli Friedman authored
Fill in type legalization for MERGE_VALUES in all the various cases. Patch by Micah Villmow. (No testcase because the issue only showed up in an out-of-tree backend.) llvm-svn: 138877
-
Eli Friedman authored
Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM. llvm-svn: 138872
-
David Greene authored
Emit a repeated sequence of bytes using .zero. This saves an enormous amount of asm file space for certain programs. llvm-svn: 138864
-
Rafael Espindola authored
llvm-svn: 138858
-
- Aug 30, 2011
-
-
Rafael Espindola authored
X86. Modify the pass added in the previous patch to call this new code. This new prologues generated will call a libgcc routine (__morestack) to allocate more stack space from the heap when required Patch by Sanjoy Das. llvm-svn: 138812
-
Evan Cheng authored
Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to call a target hook to adjust the instruction. For ARM, this is used to adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC instructions have implicit def of CPSR (required since it now uses CPSR physical register dependency rather than "glue"). If the carry flag is used, then the target hook will *fill in* the optional operand with CPSR. Otherwise, the hook will remove the CPSR implicit def from the MachineInstr. llvm-svn: 138810
-
Bob Wilson authored
I don't currently have a good testcase for this; will try to get one tomorrow. <rdar://problem/10032939> llvm-svn: 138794
-
Jim Grosbach authored
llvm-svn: 138773
-
- Aug 28, 2011
-
-
Duncan Sands authored
when outputting them. With this, the entire LLVM testsuite passes when built with dragonegg. llvm-svn: 138724
-
- Aug 27, 2011
-
-
Bill Wendling authored
llvm-svn: 138697
-
- Aug 26, 2011
-
-
Bill Wendling authored
llvm-svn: 138664
-