- Apr 16, 2013
-
-
Nadav Rotem authored
SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562
-
- Apr 15, 2013
-
-
Tom Stellard authored
Instead of emitting config values in a predefined order, the code emitter will now emit a 32-bit register index followed by the 32-bit config value. llvm-svn: 179546
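The index/value layout described above can be sketched roughly as follows. This is only an illustration, not the emitter's actual code: the byte order (little-endian), the helper names `emit_config` and `read_config`, and the register indices are all assumptions.

```python
import struct

def emit_config(pairs):
    """Serialize (register_index, value) pairs as consecutive 32-bit words."""
    out = bytearray()
    for reg, value in pairs:
        # Assumption: both words are unsigned 32-bit, little-endian.
        out += struct.pack("<II", reg, value)
    return bytes(out)

def read_config(blob):
    """Decode the stream back into (register_index, value) pairs."""
    return [struct.unpack_from("<II", blob, off) for off in range(0, len(blob), 8)]

# Hypothetical register indices and values, for illustration only.
pairs = [(0x2E12, 0x00000004), (0x0001, 0xFFFFFFFF)]
blob = emit_config(pairs)
```

Because every value carries its own register index, a reader of this layout does not need to know a predefined emission order.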
-
Tom Stellard authored
llvm-svn: 179545
-
Tom Stellard authored
llvm-svn: 179544
-
Jim Grosbach authored
llvm-svn: 179542
-
Rafael Espindola authored
I will remove the isBigEndianHost function once I update clang.
The ifdef logic is designed to
* not use configure/cmake to avoid breaking -arch i686 -arch ppc.
* default to little endian
* be as small as possible
It looks like sys/endian.h is the preferred header on most modern BSD systems, but it is better to change this in a followup patch as machine/endian.h is available on FreeBSD, OpenBSD, NetBSD and OS X. llvm-svn: 179527
-
Andy Gibbs authored
This is a rework of the broken parts in r179373 which were subsequently reverted in r179374 due to incompatibility with C++98 compilers. This version should be ok under C++98. llvm-svn: 179520
-
Nadav Rotem authored
Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer. llvm-svn: 179508
-
Nadav Rotem authored
llvm-svn: 179505
-
Nadav Rotem authored
llvm-svn: 179504
-
Hal Finkel authored
Now that the CR spilling issues have been resolved, we can remove the unmodeled-side-effect attributes from the comparison instructions (and also mark them as isCompare). By allowing these, by default, to have unmodeled side effects, we were hiding problems with CR spilling; but everything seems much happier now. llvm-svn: 179502
-
Hal Finkel authored
This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition registers, the spill location is specified relative to the stack pointer (SP + 8). However, this is not relative to the SP after the new stack frame is established, but instead relative to the caller's stack pointer (it is stored into the linkage area of the parent's stack frame). So, like with the link register, we don't directly spill the CRs with other callee-saved registers, but just mark them to be spilled during prologue generation. In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32). llvm-svn: 179500
-
- Apr 14, 2013
-
-
Nico Rieck authored
llvm-svn: 179494
-
David Majnemer authored
One performs:
  (X == 13 | X == 14) -> X-13 <u 2
The other:
  (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
The problem is that there are certain values of C1 and C2 that trigger both transforms, but the first one blocks out the second, which generates suboptimal code. Reordering the transforms should be better in every case and allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0
into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224
llvm-svn: 179493
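A small model of the two rewrites and of the before/after IR can help check that they preserve meaning. The sketch below is not InstCombine code. It assumes 32-bit unsigned arithmetic, and it assumes the mask form applies only when C1 and C2 differ in exactly one bit that is clear in C1. All function names are made up for illustration.

```python
M = (1 << 32) - 1  # model i32 values as unsigned 32-bit integers

def range_form(x, lo, n):
    """(X == lo | ... | X == lo+n-1) rewritten as X - lo <u n."""
    return ((x - lo) & M) < n

def mask_form(a, c1, c2):
    """(A == C1 || A == C2) rewritten as (A & ~(C1 ^ C2)) == C1."""
    diff = c1 ^ c2
    # Assumed precondition: exactly one differing bit, and it is clear in C1.
    assert diff != 0 and diff & (diff - 1) == 0 and c1 & diff == 0
    return (a & ~diff & M) == c1

def before(x):
    """The original four-instruction sequence from the commit message."""
    shr = (x & M) >> 4
    and_ = shr & 15
    add = (and_ - 14) & M
    return add != 0

def after(x):
    """The two-instruction sequence produced after reordering."""
    return (x & 240) != 224

def same_everywhere(f, g, values):
    """Return True if f and g agree on every sampled value."""
    return all(f(v) == g(v) for v in values)

# Sample the low 12 bits exhaustively plus a few large values.
samples = list(range(1 << 12)) + [M, M - 15, 0x80000000, 0xDEADBEEF]
```

Calling `same_everywhere(before, after, samples)` compares the two sequences on the sampled inputs. It is a spot check, not a proof.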
-
Benjamin Kramer authored
llvm-svn: 179483
-
Nadav Rotem authored
llvm-svn: 179479
-
Nadav Rotem authored
llvm-svn: 179478
-
Jakob Stoklund Olesen authored
Test case by llvm-stress. llvm-svn: 179477
-
Nadav Rotem authored
SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree. llvm-svn: 179475
-
Jakob Stoklund Olesen authored
For when 16 TB just isn't enough. llvm-svn: 179474
-
Jakob Stoklund Olesen authored
This is the default model for non-PIC 64-bit code. It supports text+data+bss linked anywhere in the low 16 TB of the address space. llvm-svn: 179473
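As a quick arithmetic check, assuming "16 TB" here means 16 TiB, the limit corresponds to 44-bit addresses. The helper name below is hypothetical.

```python
LOW_16TB_LIMIT = 1 << 44  # assumption: 16 TB is read as 16 TiB = 2**44 bytes

def fits_low_16tb(addr):
    """True if addr lies in the low 16 TiB of the address space."""
    return 0 <= addr < LOW_16TB_LIMIT

TIB = 1024 ** 4  # bytes in one TiB
```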
-
Jakob Stoklund Olesen authored
64-bit code models need multiple relocations that can't be inferred from the opcode like they can in 32-bit code. llvm-svn: 179472
-
Jakob Stoklund Olesen authored
Constant pool entries are accessed exactly the same way as global variables. llvm-svn: 179471
-
Nadav Rotem authored
llvm-svn: 179470
-
Jakob Stoklund Olesen authored
This fixes the pic32 code model for SPARC v9. llvm-svn: 179469
-
Jakob Stoklund Olesen authored
SDNodes and MachineOperands get target flags representing the %hi() and %lo() assembly annotations that eventually become relocations. Also define flags to be used by the 64-bit code models. llvm-svn: 179468
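For readers unfamiliar with the annotations, here is a minimal sketch of how %hi() and %lo() split a 32-bit address on SPARC: the upper 22 bits go to a sethi instruction and the low 10 bits to a following or/add. It models only the arithmetic, not the target flags added by this commit, and the function names are illustrative.

```python
def hi22(addr):
    """Upper 22 bits of a 32-bit address, as selected by %hi()."""
    return (addr >> 10) & 0x3FFFFF

def lo10(addr):
    """Low 10 bits of an address, as selected by %lo()."""
    return addr & 0x3FF

def rebuild(addr):
    """Recombine the halves the way a sethi/or pair would."""
    return (hi22(addr) << 10) | lo10(addr)

example = 0x12345678  # arbitrary 32-bit address for illustration
```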
-
Hal Finkel authored
Leaving MFCR as having unmodeled side effects is not enough to prevent unwanted instruction reordering post-RA. We could probably apply a stronger barrier attribute, but there is a better way: Add all (not just the first) CR to be spilled as live-in to the entry block, and add all CRs to the MFCR instruction as implicitly killed. Unfortunately, I don't have a small test case. llvm-svn: 179465
-
- Apr 13, 2013
-
-
Jakob Stoklund Olesen authored
Currently, only abs32 and pic32 are implemented. Add a test case for abs32 with 64-bit code. 64-bit PIC code is currently broken. llvm-svn: 179463
-
Jakob Stoklund Olesen authored
It doesn't seem like anybody is checking types this late in isel, so no test case. llvm-svn: 179462
-
Benjamin Kramer authored
There is a Constant with non-constant operands: blockaddress. llvm-svn: 179460
-
Benjamin Kramer authored
This is basically the same fix in three different places. We use a set to avoid walking the whole tree of a big ConstantExpr multiple times. For example: (select cmp, (add big_expr 1), (add big_expr 2)) We don't want to visit big_expr twice here; it may consist of thousands of nodes. The testcase exercises this by creating an insanely large ConstantExpr out of a loop. It's questionable if the optimizer should ever create those, but this can be triggered with real C code. Fixes PR15714. llvm-svn: 179458
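The effect of the visited set can be illustrated with a toy expression graph. This is a sketch of the general technique, not the LLVM code: `Node` and `count_visits` are hypothetical names, and the graph mirrors the select example with a shared `big_expr` of 1000 nodes.

```python
class Node:
    """Toy expression node; operands may be shared between parents."""
    def __init__(self, name, *operands):
        self.name = name
        self.operands = operands

def count_visits(root, use_set):
    """Walk the expression iteratively and count how many nodes are processed."""
    visited = set()
    visits = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if use_set:
            if node in visited:
                continue  # shared subexpression was already walked
            visited.add(node)
        visits += 1
        stack.extend(node.operands)
    return visits

# Build big_expr as a chain of 1000 nodes.
big_expr = Node("leaf")
for _ in range(999):
    big_expr = Node("op", big_expr)

# (select cmp, (add big_expr 1), (add big_expr 2))
root = Node("select",
            Node("cmp"),
            Node("add", big_expr, Node("1")),
            Node("add", big_expr, Node("2")))
```

Without the set, the walk processes `big_expr` once per use. With it, each node is processed once however many parents share it. When shared nodes are nested, the unguarded walk can grow exponentially.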
-
Hal Finkel authored
For functions that need to spill CRs, and have dynamic stack allocations, the value of the SP during the restore is not what it was during the save, and so we need to use the FP in these cases (as for all of the other spills and restores, but the CR restore has a special code path because its reserved slot, like the link register, is specified directly relative to the adjusted SP). llvm-svn: 179457
-
Andrew Trick authored
llvm-svn: 179452
-
Andrew Trick authored
MI-Sched cleanup. If an instruction has no valid sched class, do not attempt to check for a variant. llvm-svn: 179451
-
Andrew Trick authored
The initial values were arbitrary. I want them to be more conservative. This represents the number of latency cycles hidden by OOO execution. In practice, I think it should be within a small factor of the complex floating point operation latency so the scheduler can make some attempt to hide latency even for smallish blocks. These are by no means the best values, just a starting point for tuning heuristics. Some benchmarks such as TSVC run faster with this lower value for SandyBridge. I haven't run anything on Haswell, but it shouldn't be 2x SB. llvm-svn: 179450
-
Andrew Trick authored
The register allocator expects minimal physreg live ranges. Schedule physreg copies accordingly. This is slightly tricky when they occur in the middle of the scheduling region. For now, this is handled by rescheduling the copy when its associated instruction is scheduled. Eventually we may instead bundle them, but only if we can preserve the bundles as parallel copies during regalloc. llvm-svn: 179449
-
Andrew Trick authored
I need to handle this for the test case in my following scheduler commit. Work is already under way to redesign the mechanism for node order propagation because this case-by-case approach is unmaintainable. llvm-svn: 179448
-
Akira Hatanaka authored
lowerINTRINSIC_WO_CHAIN into MipsSETargetLowering. No functionality changes. llvm-svn: 179444
-
Rafael Espindola authored
We are now able to handle big endian macho files in llvm-readobject. Thanks to David Fang for providing the object files. llvm-svn: 179440
-
Akira Hatanaka authored
llvm-svn: 179434
-