- Apr 21, 2011
Jakob Stoklund Olesen authored
TII::isTriviallyReMaterializable() shouldn't depend on any properties of the register being defined by the instruction. Rematerialization is going to create a new virtual register anyway. llvm-svn: 129882
- Apr 20, 2011
Jakob Stoklund Olesen authored
On the x86-64 and thumb2 targets, some registers are more expensive to encode than others in the same register class. Add a CostPerUse field to the TableGen register description, and make it available from TRI->getCostPerUse. This represents the cost of a REX prefix or a 32-bit instruction encoding required by choosing a high register. Teach the greedy register allocator to prefer cheap registers for busy live ranges (as indicated by spill weight). llvm-svn: 129864
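To make the preference concrete, here is a minimal standalone sketch. The CostPerUse field and TRI->getCostPerUse are from the commit; the surrounding harness (Reg, assignReg, HotThreshold) is hypothetical and only models the idea.

```cpp
#include <cstdint>
#include <vector>

// Toy model: each physical register carries a cost-per-use, e.g. 1 for
// x86-64 registers that need a REX prefix (R8-R15) and 0 otherwise.
struct Reg { const char *Name; uint8_t CostPerUse; };

// Hypothetical helper: for a busy live range (high spill weight), weigh
// encoding cost; for a cold one, take the first free register so cheap
// registers stay available for hot ranges. Assumes Free is non-empty.
const Reg *assignReg(const std::vector<Reg> &Free, float SpillWeight,
                     float HotThreshold) {
  if (SpillWeight < HotThreshold)
    return &Free.front();          // cold range: any register will do
  const Reg *Best = &Free.front();
  for (const Reg &R : Free)
    if (R.CostPerUse < Best->CostPerUse)
      Best = &R;                   // busy range: prefer the cheap encoding
  return Best;
}
```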
Rafael Espindola authored
llvm-svn: 129844
Eric Christopher authored
manually and pass all (now) 4 arguments to the mul libcall. Add a new ExpandLibCall for just this (copied gratuitously from type legalization). Fixes rdar://9292577 llvm-svn: 129842
Daniel Dunbar authored
triple component. llvm-svn: 129838
- Apr 19, 2011
Daniel Dunbar authored
- There is a minor semantic change here (evidenced by the test change) for Darwin triples that have no version component. I debated changing the default behavior of isOSVersionLT, but decided it made more sense for triples to be explicit. llvm-svn: 129802
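A hedged usage sketch of the resulting semantics; the header path and Triple/StringRef names are the LLVM API of that era, while isAtLeastDarwin10 is a made-up helper:

```cpp
#include "llvm/ADT/Triple.h"
using namespace llvm;

// With no version component, a Darwin triple now reports version 0, so
// isOSVersionLT answers "yes" to every comparison instead of assuming
// some default OS version.
bool isAtLeastDarwin10(StringRef TT) {
  Triple T(TT);
  // "x86_64-apple-darwin10" -> true; "x86_64-apple-darwin" -> false.
  return T.isOSDarwin() && !T.isOSVersionLT(10);
}
```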
Bob Wilson authored
Add an avoidWriteAfterWrite() target hook to identify register classes that suffer from write-after-write hazards. For those register classes, try to avoid writing the same register in two consecutive instructions. This is currently disabled by default. We should not spill to avoid hazards! The command line flag -avoid-waw-hazard can be used to enable WAW avoidance. llvm-svn: 129772
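A minimal sketch of the preference, not the LLVM code; pickDest and the integer register encoding are invented for illustration:

```cpp
#include <vector>

// Given the allocatable registers of a WAW-prone class in preference
// order, and the register defined by the previous instruction, prefer
// any register that breaks the back-to-back write. Crucially, if no
// alternative is free we accept the hazard; we never spill to avoid it.
// Assumes FreeRegs is non-empty.
int pickDest(const std::vector<int> &FreeRegs, int LastDef) {
  for (int R : FreeRegs)
    if (R != LastDef)
      return R;               // breaks the write-after-write pair
  return FreeRegs.front();    // hazard tolerated rather than spilling
}
```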
Jakob Stoklund Olesen authored
This means that the new register allocator can be used with 'clang -mllvm -regalloc=greedy'. llvm-svn: 129764
Eli Friedman authored
unnecessary work where possible. llvm-svn: 129763
Chris Lattner authored
en masse for C++ PODs. On my C++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5%. llvm-svn: 129755
- Apr 18, 2011
Eli Friedman authored
llvm-svn: 129720
Devang Patel authored
llvm-svn: 129715
Jakob Stoklund Olesen authored
the spilled register. This is quite common on ARM now that some stores have early-clobber defines. llvm-svn: 129714
Eric Christopher authored
registers for fast allocation a different way. This has us updating used registers only when we're using that exact register. Fixes rdar://9207598 llvm-svn: 129711
Chris Lattner authored
this fixes a few rejects on c++ iterator loops. llvm-svn: 129694
- Apr 17, 2011
Chris Lattner authored
2. Implement rdar://9289501 - fast isel should fold trivial multiplies to shifts.
3. Teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code.
llvm-svn: 129666
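Item 2 in source form, as a standalone sketch; mulByConst is illustrative, not the fast isel code, and __builtin_ctz assumes GCC/Clang:

```cpp
#include <cassert>
#include <cstdint>

// A multiply by a constant power of two is just a left shift.
uint32_t mulByConst(uint32_t X, uint32_t C) {
  if (C != 0 && (C & (C - 1)) == 0)   // C is a power of two
    return X << __builtin_ctz(C);     // shl by log2(C) replaces the mul
  return X * C;
}

int main() {
  assert(mulByConst(7, 8) == (7u << 3)); // 8 = 2^3
  assert(mulByConst(7, 6) == 42);        // not a power of two: real multiply
}
```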
Chris Lattner authored
less trivial things) into a dummy lea. Before we generated:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	ret

now we produce:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	ret

This is part of rdar://9289558
llvm-svn: 129662
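The source shape behind such a test is presumably just taking the address of an external global; this is a reconstruction from the assembly, not the actual test file:

```cpp
// Under PIC on Darwin x86-64, the address of an external global is
// loaded from the GOT; the old code then copied it through a lea.
extern int G;
int *test() { return &G; } // movq _G@GOTPCREL(%rip), %rax ; ret
```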
Chris Lattner authored
rdar://9289512 - The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like:

	cmpb	$0, 52(%rax)
	je	LBB4_2

instead of:

	movb	52(%rax), %cl
	cmpb	$0, %cl
	je	LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count.
llvm-svn: 129656
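A hypothetical source shape that produces the folded pattern; the struct layout is assumed so the byte lands at offset 52:

```cpp
struct S {
  char Pad[52];
  char Flag; // offset 52, matching "cmpb $0, 52(%rax)"
};

int touch(S *P) {
  // With the fold, the byte load feeds the compare directly: one cmpb
  // against memory plus a branch, no intermediate register copy.
  return P->Flag == 0 ? 1 : 2;
}
```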
Chris Lattner authored
which don't need to check for falling off the end of a block *and* end of phi nodes, since terminators are never phis. llvm-svn: 129655
Chris Lattner authored
rdar://9289583 - allowing us to fold the immediate into the 'and' in this case:

	int test1(int i) { return 8&i; }

llvm-svn: 129653
Eli Friedman authored
Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. llvm-svn: 129650
- Apr 16, 2011
Francois Pichet authored
For further information on this particular issue see: http://connect.microsoft.com/VisualStudio/feedback/details/520043/error-converting-from-null-to-a-pointer-type-in-std-pair llvm-svn: 129642
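The shape of the workaround, reconstructed from the linked report; the exact call sites in LLVM differ:

```cpp
#include <utility>

// MSVC rejects converting the macro NULL (an integral constant 0) to a
// pointer-typed std::pair member, so the null pointer must be spelled
// with an explicit pointer type.
// std::pair<int, char *> P(0, NULL);  // MSVC error: cannot convert NULL to char *
std::pair<int, char *> Q(0, static_cast<char *>(nullptr)); // accepted everywhere
```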
Benjamin Kramer authored
llvm-svn: 129639
Rafael Espindola authored
error in foo.o; no .eh_frame_hdr table will be created. llvm-svn: 129635
Evan Cheng authored
Fix divmod libcall lowering. Convert to {S|U}DIVREM first and then expand the node to a libcall. rdar://9280991 llvm-svn: 129633
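The source-level analogue of a DIVREM node; std::div is standard C++, and the backend's combined libcall plays the same role:

```cpp
#include <cassert>
#include <cstdlib>

int main() {
  // One combined operation yields both results, instead of lowering
  // '/' and '%' on the same operands to two separate libcalls.
  std::div_t QR = std::div(7, 3);
  assert(QR.quot == 2 && QR.rem == 1);
}
```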
Devang Patel authored
Introduce support to encode Objective-C property information in debugging information generated for an interface. llvm-svn: 129624
- Apr 15, 2011
Rafael Espindola authored
llvm-svn: 129600
Jakob Stoklund Olesen authored
The transferValues() function can now handle both singly and multiply defined values, as long as the resulting live range is known. Only rematerialized values have their live range recomputed by extendRange().

The updateSSA() function can now insert PHI values in bulk across multiple values in multiple target registers in one pass. The list of blocks received from transferValues() is in layout order, which seems to work well for the iterative algorithm. Blocks from extendRange() are still in reverse BFS order, but this function is used so rarely now that it doesn't matter.
llvm-svn: 129580
Jakob Stoklund Olesen authored
llvm-svn: 129579
Rafael Espindola authored
Change ELF systems to use CFI for producing the EH tables. This reduces the size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129571
Chris Lattner authored
Luis Felipe Strano Moraes! llvm-svn: 129558
NAKAMURA Takumi authored
It broke several builds. llvm-svn: 129557
- Apr 14, 2011
Owen Anderson authored
llvm-svn: 129522
Rafael Espindola authored
size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129518
Andrew Trick authored
This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet.

The primary motivation is to generate code closer to what people expect and rule out missed opportunities from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down by about 10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108.

Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump

Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion
llvm-svn: 129508
Chris Lattner authored
llvm-svn: 129503