- Jan 05, 2011
-
-
Jakob Stoklund Olesen authored
calculated. llvm-svn: 122912
-
Eric Christopher authored
llvm-svn: 122909
-
- Jan 04, 2011
-
-
Eric Christopher authored
llvm-svn: 122849
-
Jakob Stoklund Olesen authored
The analysis will be needed by both the greedy register allocator and the X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't change. This pass is very fast, usually showing up as 0.0% wall time. llvm-svn: 122832
-
Cameron Zwarich authored
makes getLeader() nonrecursive. llvm-svn: 122811
-
Cameron Zwarich authored
spent in StrongPHIElimination on 403.gcc. llvm-svn: 122803
-
Owen Anderson authored
llvm-svn: 122795
-
- Jan 03, 2011
-
-
Cameron Zwarich authored
a 28% speedup of MachineCSE time on 403.gcc. llvm-svn: 122735
-
- Jan 02, 2011
-
-
Chris Lattner authored
so that Dominators.h is *just* domtree. Also prune #includes a bit. llvm-svn: 122714
-
Benjamin Kramer authored
This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710
-
Benjamin Kramer authored
Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that doesn't support the multiply or it is slow (p4) if someone cares enough. Example code: void test(char *s, int a) { __builtin_memset(s, a, 4); } before: _test: ## @test movzbl 8(%esp), %eax movl %eax, %ecx shll $8, %ecx orl %eax, %ecx movl %ecx, %eax shll $16, %eax orl %ecx, %eax movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret after: _test: ## @test movzbl 8(%esp), %eax imull $16843009, %eax, %eax ## imm = 0x1010101 movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret llvm-svn: 122707
-
- Dec 30, 2010
-
-
Cameron Zwarich authored
with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on 403.gcc. llvm-svn: 122635
-
- Dec 29, 2010
-
-
Cameron Zwarich authored
llvm-svn: 122628
-
Cameron Zwarich authored
process those instructions that define phi sources. This is a 47% speedup of StrongPHIElimination compile time on 403.gcc. llvm-svn: 122627
-
Cameron Zwarich authored
llvm-svn: 122625
-
Cameron Zwarich authored
llvm-svn: 122617
-
Cameron Zwarich authored
in the most obvious way. llvm-svn: 122610
-
Cameron Zwarich authored
it relies on assumptions that may not be true in the future. llvm-svn: 122608
-
- Dec 28, 2010
-
-
Cameron Zwarich authored
we are only interested in the defs when discovering interferences. This is a 28% speedup running StrongPHIElimination on 403.gcc. llvm-svn: 122596
-
Duncan Sands authored
in this function, but the compiler was warning that it might be when doing a release build. llvm-svn: 122595
-
- Dec 27, 2010
-
-
Cameron Zwarich authored
llvm-svn: 122586
-
Cameron Zwarich authored
when running without the verifier, and I have not yet checked them to see if the new results are still correct. There are more verifier failures, but they all seem to be additional occurrences of verifier failures that occur with the existing PHIElimination pass. There are a few obvious issues with the code: 1) It doesn't properly update the register equivalence classes during copy insertion, and instead recomputes them before merging live intervals and renaming registers. I wanted to keep this first patch simple for debugging purposes, but it shouldn't be very hard to do this. 2) It doesn't mix the renaming and live interval merging with the copy insertion process, which leads to a lot of virtual register churn. Virtual registers and live intervals are created, only to later be merged into others. The code should be smarter and only create a new virtual register if there is no existing register in the same congruence class. 3) In one place the code uses a DenseMap per basic block, which is unnecessary heap allocation. There should be an inline storage version of DenseMap. I did a quick compile-time test of running llc on 403.gcc with and without StrongPHIElimination. It is slightly slower with StrongPHIElimination, because the small decrease in the coalescer runtime can't beat the increase in phi elimination runtime. Perhaps fixing the above performance issues will narrow the gap. I also haven't yet run any tests of the quality of the generated code. llvm-svn: 122582
-
Cameron Zwarich authored
valno verification. The "Different value live out of predecessor" check is incorrect in the case of phi-def valnos, so just skip that check for phi-def valnos and instead check that all of the valnos for predecessors have phi-kill. Fixes PR8863. llvm-svn: 122581
-
- Dec 24, 2010
-
-
Andrew Trick authored
llvm-svn: 122545
-
Andrew Trick authored
Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck. llvm-svn: 122544
-
Andrew Trick authored
DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about it's SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541
-
Andrew Trick authored
llvm-svn: 122539
-
Cameron Zwarich authored
llvm-svn: 122537
-
- Dec 23, 2010
-
-
Chris Lattner authored
llvm-svn: 122509
-
Chris Lattner authored
llvm-svn: 122507
-
Andrew Trick authored
llvm-svn: 122491
-
Andrew Trick authored
Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle. llvm-svn: 122474
-
Andrew Trick authored
llvm-svn: 122473
-
Andrew Trick authored
In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range. llvm-svn: 122472
-
Jeffrey Yasskin authored
new gcc warning that complains on self-assignments and self-initializations. llvm-svn: 122458
-
Benjamin Kramer authored
DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455
-
- Dec 22, 2010
-
-
Jakob Stoklund Olesen authored
pick the victim with the lowest total spill weight. llvm-svn: 122445
-
Jakob Stoklund Olesen authored
llvm-svn: 122444
-
Chris Lattner authored
loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392
-
Chris Lattner authored
unhanded cases faster and simplify code. llvm-svn: 122391
-