- Apr 15, 2013
-
-
Nadav Rotem authored
llvm-svn: 179505
-
Nadav Rotem authored
llvm-svn: 179504
-
Jia Liu authored
llvm-svn: 179503
-
Hal Finkel authored
Now that the CR spilling issues have been resolved, we can remove the unmodeled-side-effect attributes from the comparison instructions (and also mark them as isCompare). By allowing these, by default, to have unmodeled side effects, we were hiding problems with CR spilling; but everything seems much happier now. llvm-svn: 179502
-
Hal Finkel authored
This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition registers, the spill location is specified relative to the stack pointer (SP + 8). However, this is not relative to the SP after the new stack frame is established, but instead relative to the caller's stack pointer (it is stored into the linkage area of the parent's stack frame). So, like with the link register, we don't directly spill the CRs with other callee-saved registers, but just mark them to be spilled during prologue generation. In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32). llvm-svn: 179500
-
Eric Christopher authored
This reverts commit r179497 and the accompanying commit as it broke random platforms that aren't osx. llvm-svn: 179499
-
Eric Christopher authored
llvm-svn: 179498
-
Eric Christopher authored
and use that as the default triple for the module and target data layout. llvm-svn: 179497
-
- Apr 14, 2013
-
-
Nico Rieck authored
llvm-svn: 179494
-
David Majnemer authored
One performs: (X == 13 | X == 14) -> X-13 <u 2 The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1 The problem is that there are certain values of C1 and C2 that trigger both transforms but the first one blocks out the second, this generates suboptimal code. Reordering the transforms should be better in every case and allows us to do interesting stuff like turn: %shr = lshr i32 %X, 4 %and = and i32 %shr, 15 %add = add i32 %and, -14 %tobool = icmp ne i32 %add, 0 into: %and = and i32 %X, 240 %tobool = icmp ne i32 %and, 224 llvm-svn: 179493
-
Nadav Rotem authored
llvm-svn: 179492
-
Benjamin Kramer authored
llvm-svn: 179483
-
Nadav Rotem authored
llvm-svn: 179480
-
Nadav Rotem authored
llvm-svn: 179479
-
Nadav Rotem authored
llvm-svn: 179478
-
Jakob Stoklund Olesen authored
Test case by llvm-stress. llvm-svn: 179477
-
Nadav Rotem authored
llvm-svn: 179476
-
Nadav Rotem authored
SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree. llvm-svn: 179475
-
Jakob Stoklund Olesen authored
For when 16 TB just isn't enough. llvm-svn: 179474
-
Jakob Stoklund Olesen authored
This is the default model for non-PIC 64-bit code. It supports text+data+bss linked anywhere in the low 16 TB of the address space. llvm-svn: 179473
-
Jakob Stoklund Olesen authored
64-bit code models need multiple relocations that can't be inferred from the opcode like they can in 32-bit code. llvm-svn: 179472
-
Jakob Stoklund Olesen authored
Constant pool entries are accessed exactly the same way as global variables. llvm-svn: 179471
-
Nadav Rotem authored
llvm-svn: 179470
-
Jakob Stoklund Olesen authored
This fixes the pic32 code model for SPARC v9. llvm-svn: 179469
-
Jakob Stoklund Olesen authored
SDNodes and MachineOperands get target flags representing the %hi() and %lo() assembly annotations that eventually become relocations. Also define flags to be used by the 64-bit code models. llvm-svn: 179468
-
Hal Finkel authored
Leaving MFCR has having unmodeled side effects is not enough to prevent unwanted instruction reordering post-RA. We could probably apply a stronger barrier attribute, but there is a better way: Add all (not just the first) CR to be spilled as live-in to the entry block, and add all CRs to the MFCR instruction as implicitly killed. Unfortunately, I don't have a small test case. llvm-svn: 179465
-
- Apr 13, 2013
-
-
Jakob Stoklund Olesen authored
Currently, only abs32 and pic32 are implemented. Add a test case for abs32 with 64-bit code. 64-bit PIC code is currently broken. llvm-svn: 179463
-
Jakob Stoklund Olesen authored
It doesn't seem like anybody is checking types this late in isel, so no test case. llvm-svn: 179462
-
Benjamin Kramer authored
There is a Constant with non-constant operands: blockaddress. llvm-svn: 179460
-
Benjamin Kramer authored
This is basically the same fix in three different places. We use a set to avoid walking the whole tree of a big ConstantExprs multiple times. For example: (select cmp, (add big_expr 1), (add big_expr 2)) We don't want to visit big_expr twice here, it may consist of thousands of nodes. The testcase exercises this by creating an insanely large ConstantExprs out of a loop. It's questionable if the optimizer should ever create those, but this can be triggered with real C code. Fixes PR15714. llvm-svn: 179458
-
Hal Finkel authored
For functions that need to spill CRs, and have dynamic stack allocations, the value of the SP during the restore is not what it was during the save, and so we need to use the FP in these cases (as for all of the other spills and restores, but the CR restore has a special code path because its reserved slot, like the link register, is specified directly relative to the adjusted SP). llvm-svn: 179457
-
Andrew Trick authored
The order of copies depends on queue order, which is not very stable. llvm-svn: 179456
-
Andrew Trick authored
llvm-svn: 179455
-
Andrew Trick authored
llvm-svn: 179453
-
Andrew Trick authored
llvm-svn: 179452
-
Andrew Trick authored
MI-Sched cleanup. If an instruction has no valid sched class, do not attempt to check for a variant. llvm-svn: 179451
-
Andrew Trick authored
The initial values were arbitrary. I want them to be more conservative. This represents the number of latency cycles hidden by OOO execution. In practice, I think it should be within a small factor of the complex floating point operation latency so the scheduler can make some attempt to hide latency even for smallish blocks. These are by no means the best values, just a starting point for tuning heuristics. Some benchmarks such as TSVC run faster with this lower value for SandyBridge. I haven't run anything on Haswell, but it's shouldn't be 2x SB. llvm-svn: 179450
-
Andrew Trick authored
The register allocator expects minimal physreg live ranges. Schedule physreg copies accordingly. This is slightly tricky when they occur in the middle of the scheduling region. For now, this is handled by rescheduling the copy when its associated instruction is scheduled. Eventually we may instead bundle them, but only if we can preserve the bundles as parallel copies during regalloc. llvm-svn: 179449
-
Andrew Trick authored
I need to handle this for the test case in my following scheduler commit. Work is already under way to redesign the mechanism for node order propagation because this case by case approach is unmaintainable. llvm-svn: 179448
-
Rafael Espindola authored
I hope this brings http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32 back. llvm-svn: 179446
-