- Apr 15, 2011
-
NAKAMURA Takumi authored
It broke several builds. llvm-svn: 129557
-
- Apr 14, 2011
-
Owen Anderson authored
llvm-svn: 129522
-
Rafael Espindola authored
size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129518
-
Andrew Trick authored
This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunities from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down by 10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with conditional jump. Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion. llvm-svn: 129508
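The cmp/jump fusion mentioned above is easiest to see on a small compare-and-branch. The sketch below is illustrative only (the function and the commented codegen are assumptions, not part of this change): hardware macro-op fusion only applies when the compare and the conditional jump end up adjacent, which is what the scheduler change preserves.

```cpp
// Hedged illustration (not from the LLVM tree): Intel/AMD cores can fuse an
// adjacent cmp+jcc pair into one micro-op.  If the scheduler hoists an
// unrelated instruction between them, the fusion opportunity is lost.
int select_larger(int x, int y) {
  if (x < y)        // ideally lowered to something like:  cmpl %esi, %edi
    return y;       //                                     jge  .LBB0_2  <- fused
  return x;         // an intervening mov would break the cmp/jge pairing
}
```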
-
Chris Lattner authored
llvm-svn: 129503
-
Owen Anderson authored
During post-legalization DAG combining, be careful to only create shifts where the RHS is of the legal type for the new operation. llvm-svn: 129484
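As a rough illustration of the constraint (a hedged sketch, not the actual r129484 change; the helper name and structure are assumptions): a combine that runs after type legalization must not introduce a shift whose amount operand still has an illegal type, because no further legalization pass will clean it up.

```cpp
// Hedged sketch only; not the DAGCombiner code itself.
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/Target/TargetLowering.h"
using namespace llvm;

static SDValue buildShiftIfLegal(SelectionDAG &DAG, const TargetLowering &TLI,
                                 SDValue Val, SDValue Amt, DebugLoc DL) {
  if (!TLI.isTypeLegal(Amt.getValueType()))
    return SDValue();   // post-legalization: bail out rather than create an illegal node
  return DAG.getNode(ISD::SHL, DL, Val.getValueType(), Val, Amt);
}
```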
-
- Apr 13, 2011
-
Devang Patel authored
Remove extra bytes that were added for gdb. We do not have a good pointer to the actual reason behind this FIXME. Spot checking suggests that newer gdb does not need this. llvm-svn: 129461
-
Jakob Stoklund Olesen authored
llvm-svn: 129442
-
Andrew Trick authored
Additional fixes: Do something reasonable for subtargets with generic itineraries by handling node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero-cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the node latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421
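The default-latency rule described above boils down to a small decision; the sketch below is a hedged paraphrase (not the actual ScheduleDAG code, and the parameter names are assumptions).

```cpp
// Plain nodes get unit latency; a node whose itinerary stage is explicitly
// zero cycles, or a TokenFactor chain node, gets zero latency.
static unsigned defaultLatency(bool IsTokenFactor, bool HasZeroCycleStage) {
  if (IsTokenFactor || HasZeroCycleStage)
    return 0;
  return 1;
}
```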
-
Eric Christopher authored
llvm-svn: 129417
-
Eric Christopher authored
registers for fast allocation. Fixes rdar://9207598 llvm-svn: 129408
-
Devang Patel authored
llvm-svn: 129407
-
Devang Patel authored
llvm-svn: 129406
-
Devang Patel authored
llvm-svn: 129405
-
Devang Patel authored
This mechanical patch moves type handling into CompileUnit from DwarfDebug. In case of multiple compile unit in one object file, each compile unit is responsible for its own set of type entries anyway. This refactoring makes this obvious. llvm-svn: 129402
-
Eric Christopher authored
llvm-svn: 129400
-
- Apr 12, 2011
-
Jakob Stoklund Olesen authored
Use a BitVector instead; we didn't need the smaller memory footprint anyway. This makes the greedy register allocator 10% faster. llvm-svn: 129390
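For context (a hedged usage sketch, not the change itself; the register counts and indices below are made up): llvm::BitVector stores one bit per index in a flat word array, so membership tests are a word load plus a mask, trading a little memory for speed compared with a sparse set.

```cpp
#include "llvm/ADT/BitVector.h"

bool demo() {
  llvm::BitVector LiveRegs(256);   // one bit per physical register
  LiveRegs.set(17);                // mark a register as live
  LiveRegs.reset(42);              // and another as dead
  return LiveRegs.test(17);        // O(1) word load + mask, no searching
}
```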
-
Andrew Trick authored
llvm-svn: 129385
-
Andrew Trick authored
UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383
-
Jakob Stoklund Olesen authored
This merges the behavior of splitSingleBlocks into splitAroundRegion, so the RS_Region and RS_Block register stages can be coalesced. That means the leftover intervals after region splitting go directly to spilling instead of a second pass of per-block splitting. llvm-svn: 129379
-
Jakob Stoklund Olesen authored
This makes it possible to target multiple registers in one pass. llvm-svn: 129374
-
Jakob Stoklund Olesen authored
llvm-svn: 129373
-
Devang Patel authored
llvm-svn: 129368
-
Devang Patel authored
llvm-svn: 129367
-
Eric Christopher authored
llvm-svn: 129334
-
Jakob Stoklund Olesen authored
when compiling many small functions. llvm-svn: 129321
-
Nick Lewycky authored
mean that it has to be a ConstantArray of ConstantStruct. We might have ConstantAggregateZero, at either level, so don't crash on that. Also, semi-deprecate the sentinel value. The linker isn't aware of sentinels, so we end up with the two lists appended, each with their "sentinels" on them. Different parts of LLVM treated sentinels differently, so make them all just ignore the single entry and continue on with the rest of the list. llvm-svn: 129307
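A hedged sketch of the defensive pattern this implies for consumers of such lists (illustrative only; the function name and loop body are assumptions, not code from r129307): handle ConstantAggregateZero at either level, and skip rather than trust any sentinel-looking entry.

```cpp
#include "llvm/Constants.h"
#include "llvm/Module.h"
using namespace llvm;

// Walk @llvm.global_ctors without assuming ConstantArray-of-ConstantStruct.
static void visitCtorList(Module &M) {
  GlobalVariable *GV = M.getNamedGlobal("llvm.global_ctors");
  if (!GV || !GV->hasInitializer())
    return;
  ConstantArray *CA = dyn_cast<ConstantArray>(GV->getInitializer());
  if (!CA)                       // e.g. ConstantAggregateZero: nothing to do
    return;
  for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i) {
    ConstantStruct *CS = dyn_cast<ConstantStruct>(CA->getOperand(i));
    if (!CS)                     // an all-zero element, or a stray sentinel
      continue;                  // ignore it and keep walking the list
    // ... use CS->getOperand(0) (priority) and CS->getOperand(1) (function)
  }
}
```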
-
- Apr 11, 2011
-
Jakob Stoklund Olesen authored
weight limit has been exceeded. llvm-svn: 129305
-
Bill Wendling authored
the 'unwind' instruction. However, later on that instruction was converted into a jump to the basic block it was located in, causing an infinite loop when we get there. It turns out, we get there if the _Unwind_Resume_or_Rethrow call returns (which it's not supposed to do). It returns if it cannot find a place to unwind to. Thus we would get what appears to be a "hang" when in reality it's just that the EH couldn't be propagated further along. Instead of infinitely looping (or calling `unwind', which none of our back-ends support, since it's lowered into nothing), call the @llvm.trap() intrinsic. This may not conform to specific rules of a particular language, but it's rather better than infinitely looping. <rdar://problem/9175843&9233582> llvm-svn: 129302
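A hedged sketch of the replacement described above (not the actual patch; the helper is an assumption): where the self-jump used to be, emit a call to the trap intrinsic and terminate the block with unreachable.

```cpp
#include "llvm/Intrinsics.h"
#include "llvm/Module.h"
#include "llvm/Support/IRBuilder.h"
using namespace llvm;

// Terminate a (not yet terminated) block with a trap instead of letting it
// loop back on itself.
static void insertTrap(BasicBlock *BB) {
  Module *M = BB->getParent()->getParent();
  IRBuilder<> Builder(BB);
  Builder.CreateCall(Intrinsic::getDeclaration(M, Intrinsic::trap));
  Builder.CreateUnreachable();
}
```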
-
Evan Cheng authored
Look past copies when determining whether hoisting would end up inserting more copies. rdar://9266679 llvm-svn: 129297
-
Jakob Stoklund Olesen authored
LiveIntervals::findLiveInMBBs has to do a full binary search for each segment. llvm-svn: 129292
-
Evan Cheng authored
llvm-svn: 129287
-
Jakob Stoklund Olesen authored
Both coalescing and register allocation already check aliases for interference, so these extra segments are only slowing us down. This speeds up both linear scan and the greedy register allocator. llvm-svn: 129283
-
Jakob Stoklund Olesen authored
This particularly helps with the initial transfer of fixed intervals. llvm-svn: 129277
-
Jakob Stoklund Olesen authored
llvm-svn: 129276
-
Jakob Stoklund Olesen authored
In particular, don't repeatedly recompute the PIC base live range after rematerialization. llvm-svn: 129275
-
Jay Foad authored
llvm-svn: 129271
-
- Apr 09, 2011
-
Chris Lattner authored
Switch lowering probably shouldn't be using FP for this. This resolves PR9581. llvm-svn: 129199
-
Jakob Stoklund Olesen authored
It is common for large live ranges to have few basic blocks with register uses and many live-through blocks without any uses. This approach grows the Hopfield network incrementally around the use blocks, completely avoiding interference checks for some of the live-through blocks. llvm-svn: 129188
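The general idea can be illustrated with a generic lazy region-growth loop; this is a hedged sketch under assumed names, not the SpillPlacement code: seed the active set with the blocks that actually use the register and only pull in neighbouring live-through blocks when an already-active block asks to grow, so blocks far from any use are never examined at all.

```cpp
#include <deque>
#include <set>
#include <vector>

std::set<int> growAroundUses(const std::vector<int> &UseBlocks,
                             const std::vector<std::vector<int>> &Neighbors,
                             bool (*wantsToGrow)(int Block)) {
  std::set<int> Active(UseBlocks.begin(), UseBlocks.end());
  std::deque<int> Work(UseBlocks.begin(), UseBlocks.end());
  while (!Work.empty()) {
    int B = Work.front();
    Work.pop_front();
    if (!wantsToGrow(B))                 // the network decided this block stays out
      continue;
    for (size_t i = 0; i < Neighbors[B].size(); ++i)
      if (Active.insert(Neighbors[B][i]).second)  // newly activated neighbour
        Work.push_back(Neighbors[B][i]);
  }
  return Active;                         // blocks whose interference was actually checked
}
```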
-
Jakob Stoklund Olesen authored
This doesn't require seeking in the live interval union, so it is very cheap. llvm-svn: 129187
-