- Apr 10, 2012
-
-
Chandler Carruth authored
the loop header has a non-loop predecessor which has been pre-fused into its chain due to unanalyzable branches. In this case, rotating the header into the body of the loop in order to place a loop exit at the bottom of the loop is a Very Bad Idea as it makes the loop non-contiguous. I'm working on a good test case for this, but it's a bit annoynig to craft. I should get one shortly, but I'm submitting this now so I can begin the (lengthy) performance analysis process. An initial run of LNT looks really, really good, but there is too much noise there for me to trust it much. llvm-svn: 154395
-
Anton Korobeynikov authored
This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394
-
David Chisnall authored
Patch by Dmitri Shubin! llvm-svn: 154391
-
Duncan Sands authored
rational number, eg as 2.5 rather than 5, 2. OK'd by Peter Collingbourne. llvm-svn: 154387
-
Andrew Trick authored
Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386
-
Andrew Trick authored
llvm-svn: 154385
-
Andrew Trick authored
Recent refactoring introduced a bug. Fix: added buildRegUnitSets. llvm-svn: 154382
-
Evan Cheng authored
llvm-svn: 154379
-
Evan Cheng authored
llvm-svn: 154378
-
Andrew Trick authored
Jakob's review. llvm-svn: 154377
-
Andrew Trick authored
llvm-svn: 154375
-
Andrew Trick authored
This is a new algorithm that finds sets of register units that can be used to model registers pressure. This handles arbitrary, overlapping register classes. Each register class is associated with a (small) list of pressure sets. These are the dimensions of pressure affected by the register class's liveness. llvm-svn: 154374
-
Andrew Trick authored
This is a new algorithm that associates registers with weighted register units to accuretely model their effect on register pressure. This handles registers with multiple overlapping subregisters. It is possible, but almost inconceivable that the algorithm fails to find an exact solution for a target description. If an exact solution cannot be found, an inexact, but reasonable solution will be chosen. llvm-svn: 154373
-
Andrew Trick authored
llvm-svn: 154372
-
Danil Malyshev authored
llvm-svn: 154371
-
Evan Cheng authored
legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370
-
Rafael Espindola authored
not fit in a i64. llvm-svn: 154364
-
Jim Grosbach authored
Generalized logic of r154141. llvm-svn: 154362
-
Lang Hames authored
llvm-svn: 154359
-
Bill Wendling authored
Revert the 'EnableInitializing' flag. There is debate on whether we should run that pass by default in LTO. llvm-svn: 154356
-
Bill Wendling authored
Apply the scope restrictions after parsing the command line options. There may be some which are used in that function. llvm-svn: 154348
-
- Apr 09, 2012
-
-
Akira Hatanaka authored
GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341
-
Chad Rosier authored
in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340
-
Lang Hames authored
This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when rescheduling instructions in TryInstructionTransform. Hopefully this will fix PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after the copy that unties the operands is emitted (this seems to be a more appropriate fix for that issue anyway). llvm-svn: 154338
-
Chad Rosier authored
llvm-svn: 154336
-
Eric Christopher authored
llvm-svn: 154329
-
David Blaikie authored
A couple of cases where we were accidentally creating constant conditions by something like "x == a || b" instead of "x == a || x == b". In one case a conditional & then unreachable was used - I transformed this into a direct assert instead. llvm-svn: 154324
-
Rafael Espindola authored
llvm-svn: 154322
-
Preston Gurd authored
original patch to add itineraries, to X86InstrArithmetc.td. llvm-svn: 154320
-
Duncan Sands authored
less accurate method. llvm-svn: 154319
-
Nadav Rotem authored
llvm-svn: 154313
-
Bill Wendling authored
llvm-svn: 154312
-
Nadav Rotem authored
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310
-
Craig Topper authored
Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309
-
Craig Topper authored
llvm-svn: 154308
-
Craig Topper authored
llvm-svn: 154307
-
Bill Wendling authored
llvm-svn: 154306
-
Craig Topper authored
llvm-svn: 154305
-
Chandler Carruth authored
x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is *not* using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304
-
Chandler Carruth authored
comprehensive testing of TLS codegen for x86. Convert all of the ones that were still using grep to use FileCheck. Remove some redundancies between them. Perhaps most interestingly expand the test cases so that they actually fully list the instruction snippet being tested. TLS operations are *very* narrowly defined, and so these seem reasonably stable. More importantly, the existing test cases already were crazy fine grained, expecting specific registers to be allocated. This just clarifies that no *other* instructions are expected, and fills in some crucial gaps that weren't being tested at all. This will make any subsequent changes to TLS much more clear during review. llvm-svn: 154303
-