Commits · 5d393c416f86ca92254a61eb384a269030db6458 · Roger Ferrer / llvm-epi-0.8

Apr 15, 2013

SLPVectorizer: Add support for vectorizing trees that start at compare instructions. · 5d393c41
Nadav Rotem authored Apr 15, 2013
```
llvm-svn: 179504
```
5d393c41

Mark all PPC comparison instructions as not having side effects · 95e6ea69

Hal Finkel authored Apr 15, 2013

Now that the CR spilling issues have been resolved, we can remove the
unmodeled-side-effect attributes from the comparison instructions (and also
mark them as isCompare). By allowing these, by default, to have unmodeled side
effects, we were hiding problems with CR spilling; but everything seems much
happier now.

llvm-svn: 179502

95e6ea69

Fix PPC64 CR spill location for callee-saved registers · 6736988a

Hal Finkel authored Apr 15, 2013

This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition
registers, the spill location is specified relative to the stack pointer (SP +
8). However, this is not relative to the SP after the new stack frame is
established, but instead relative to the caller's stack pointer (it is stored
into the linkage area of the parent's stack frame).

So, like with the link register, we don't directly spill the CRs with other
callee-saved registers, but just mark them to be spilled during prologue
generation.

In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32).

llvm-svn: 179500

6736988a
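
The address arithmetic behind this fix can be sketched as follows. This is an illustration only, not code from the commit: the function names, the stack-pointer value, and the frame size are hypothetical, and the sketch assumes a downward-growing stack where establishing the frame subtracts the frame size from the SP.

```python
CR_SAVE_OFFSET = 8  # from the commit message: the slot is at (caller's SP + 8)


def cr_slot_address(caller_sp):
    """Address of the CR save slot in the parent's linkage area."""
    return caller_sp + CR_SAVE_OFFSET


def cr_slot_offset_from_new_sp(frame_size):
    """Offset of that same slot once the callee's frame is established.

    Assumes new_sp = caller_sp - frame_size (stack grows downward).
    """
    return frame_size + CR_SAVE_OFFSET


caller_sp = 0x7FFFF000  # hypothetical caller stack pointer
frame_size = 112        # hypothetical callee frame size in bytes
new_sp = caller_sp - frame_size

# Both ways of naming the slot agree; (new_sp + 8) would be a different address.
same_slot = (new_sp + cr_slot_offset_from_new_sp(frame_size)
             == cr_slot_address(caller_sp))
```

Treating the offset as relative to the new SP would place the save 112 bytes too low in this example, inside the callee's own frame rather than in the parent's linkage area.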

Apr 14, 2013

Use object file specific section type for initial text section · 334c7bc7
Nico Rieck authored Apr 14, 2013
```
llvm-svn: 179494
```
334c7bc7

Reorders two transforms that collide with each other · 1fae1955

David Majnemer authored Apr 14, 2013

One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that certain values of C1 and C2 trigger both
transforms, but the first one blocks out the second, which
generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224

llvm-svn: 179493

1fae1955
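
The two rewrites and the IR example in this commit message can be checked by brute force. The sketch below is not InstCombine code; the helper names are made up for illustration, and i32 wraparound is modeled with a 32-bit mask. The commit message does not spell out the precondition of the second rewrite, so the sketch assumes that C1 and C2 differ in exactly one bit and that C1 is the value with that bit clear.

```python
MASK32 = 0xFFFFFFFF  # model i32 wraparound arithmetic


def is_either(x, c1, c2):
    return x == c1 or x == c2


def range_form(x, lo):
    # (x == lo | x == lo + 1)  ->  (x - lo) <u 2
    return ((x - lo) & MASK32) < 2


def mask_form(a, c1, c2):
    # (a == c1 || a == c2)  ->  (a & ~(c1 ^ c2)) == c1
    return (a & ~(c1 ^ c2) & MASK32) == c1


def ir_before(x):
    shr = (x & MASK32) >> 4     # %shr = lshr i32 %X, 4
    and_ = shr & 15             # %and = and i32 %shr, 15
    add = (and_ - 14) & MASK32  # %add = add i32 %and, -14
    return add != 0             # %tobool = icmp ne i32 %add, 0


def ir_after(x):
    and_ = x & 240              # %and = and i32 %X, 240
    return and_ != 224          # %tobool = icmp ne i32 %and, 224


samples = list(range(1 << 12)) + [MASK32 - 1, MASK32]
range_ok = all(range_form(x, 13) == is_either(x, 13, 14) for x in samples)
mask_ok = all(mask_form(x, 14, 15) == is_either(x, 14, 15) for x in samples)
# 14 and 15 are adjacent and differ in one bit, so both rewrites apply.
both_ok = all(range_form(x, 14) == mask_form(x, 14, 15) for x in samples)
ir_ok = all(ir_before(x) == ir_after(x) for x in samples)
```

The pair 13 and 14 differs in two bits, so only the range form applies to it, while a pair such as 14 and 15 is the kind of input on which the two transforms collide.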

Miscellaneous cleanups for VecUtils.h · 7d62ea86
Benjamin Kramer authored Apr 14, 2013
```
llvm-svn: 179483
```
7d62ea86
SLP: Document the scalarization cost method. · 3403c115
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179479
```
3403c115
Document the decision to assume that the cost of floats is twice as much as integers. · 0db0690a
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179478
```
0db0690a
Use i32 for all SPARC shift amounts, even in 64-bit mode. · eed1072f
Jakob Stoklund Olesen authored Apr 14, 2013
```
Test case by llvm-stress.

llvm-svn: 179477
```
eed1072f

SLPVectorizer: Add support for trees that don't start at binary operators, and... · 54b413d1

Nadav Rotem authored Apr 14, 2013

SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree.

llvm-svn: 179475

54b413d1

Add support for the abs64 SPARC v9 code model. · c3c28f85
Jakob Stoklund Olesen authored Apr 14, 2013
```
For when 16 TB just isn't enough.

llvm-svn: 179474
```
c3c28f85

Add support for the SPARC v9 abs44 code model. · c8fc76b0

Jakob Stoklund Olesen authored Apr 14, 2013

This is the default model for non-PIC 64-bit code. It supports
text+data+bss linked anywhere in the low 16 TB of the address space.

llvm-svn: 179473

c8fc76b0
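
As a quick check of the "low 16 TB" figure: a 44-bit absolute address covers 2^44 bytes, which is 16 TiB. The helper below is a hypothetical illustration, not part of the SPARC backend.

```python
ABS44_BITS = 44
ABS44_LIMIT = 1 << ABS44_BITS  # first address outside the abs44 range
TIB = 1 << 40                  # bytes in one tebibyte


def fits_abs44(address):
    """True if an absolute address is representable in 44 bits."""
    return 0 <= address < ABS44_LIMIT


limit_in_tib = ABS44_LIMIT // TIB
```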

Use target flags for printing SPARC asm operands. · 2e64d7ab

Jakob Stoklund Olesen authored Apr 14, 2013

64-bit code models need multiple relocations that can't be inferred from
the opcode like they can in 32-bit code.

llvm-svn: 179472

2e64d7ab

Also put target flags on SPARC constant pool references. · e0fc832b
Jakob Stoklund Olesen authored Apr 14, 2013
```
Constant pool entries are accessed exactly the same way as global
variables.

llvm-svn: 179471
```
e0fc832b
SLPVectorizer: add initial support for reduction variable vectorization. · 0b9cf856
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179470
```
0b9cf856
Fix patterns for 64-bit pointers. · dc1ed578
Jakob Stoklund Olesen authored Apr 14, 2013
```
This fixes the pic32 code model for SPARC v9.

llvm-svn: 179469
```
dc1ed578

Add target flags to SPARC address operands. · 1fb08a8b

Jakob Stoklund Olesen authored Apr 14, 2013

SDNodes and MachineOperands get target flags representing the %hi() and
%lo() assembly annotations that eventually become relocations.

Also define flags to be used by the 64-bit code models.

llvm-svn: 179468

1fb08a8b
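
For readers unfamiliar with the annotations: on SPARC, %hi() selects the upper 22 bits of a 32-bit value and %lo() the lower 10 bits, so a `sethi` followed by an `or` rebuilds the full value. The sketch below only models that split; the function names and the sample address are hypothetical.

```python
def hi(value):
    """%hi(): upper 22 bits of a 32-bit value (what sethi materializes)."""
    return (value & 0xFFFFFFFF) >> 10


def lo(value):
    """%lo(): lower 10 bits of the value."""
    return value & 0x3FF


def sethi_then_or(value):
    reg = hi(value) << 10   # sethi %hi(value), reg
    return reg | lo(value)  # or    reg, %lo(value), reg


address = 0x12345678  # hypothetical symbol address
rebuilt = sethi_then_or(address)
```

Each half becomes its own relocation against the symbol, which is why the operand needs a flag saying which half it represents.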

Mark all PPC CR registers to be spilled as live-in and tag MFCR appropriately · 2f293915

Hal Finkel authored Apr 13, 2013

Leaving MFCR as having unmodeled side effects is not enough to prevent
unwanted instruction reordering post-RA. We could probably apply a stronger
barrier attribute, but there is a better way: Add all (not just the first) CR
to be spilled as live-in to the entry block, and add all CRs to the MFCR
instruction as implicitly killed.

Unfortunately, I don't have a small test case.

llvm-svn: 179465

2f293915

Apr 13, 2013

Define SPARC code models. · 15b3e900

Jakob Stoklund Olesen authored Apr 13, 2013

Currently, only abs32 and pic32 are implemented. Add a test case for
abs32 with 64-bit code. 64-bit PIC code is currently broken.

llvm-svn: 179463

15b3e900

Use the correct types when matching ADDRri patterns from frame indexes. · 6a0a3eb5
Jakob Stoklund Olesen authored Apr 13, 2013
```
It doesn't seem like anybody is checking types this late in isel, so no
test case.

llvm-svn: 179462
```
6a0a3eb5
GlobalDCE: Fix an oversight in my last commit that could lead to crashes. · adc1727c
Benjamin Kramer authored Apr 13, 2013
```
There is a Constant with non-constant operands: blockaddress.

llvm-svn: 179460
```
adc1727c

Fix a scalability issue with complex ConstantExprs. · 89ca4bc6

Benjamin Kramer authored Apr 13, 2013

This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExpr multiple times.

For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here, since it may consist of thousands
of nodes.

The testcase exercises this by creating an insanely large ConstantExpr out of
a loop. It's questionable if the optimizer should ever create those, but this
can be triggered with real C code. Fixes PR15714.

llvm-svn: 179458

89ca4bc6
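
The effect of the visited set can be seen on a toy expression graph. This is a minimal sketch of the general technique, not the LLVM code: the `Expr` class and the helpers are hypothetical, and the graph mimics the `(select cmp, (add big_expr 1), (add big_expr 2))` shape by sharing one subexpression between two operands at every level.

```python
class Expr:
    """Toy expression node; operands may be shared between users."""

    def __init__(self, name, operands=()):
        self.name = name
        self.operands = tuple(operands)


def count_visits_naive(root):
    """Walk the graph as if it were a tree, revisiting shared nodes."""
    visits = 0
    stack = [root]
    while stack:
        node = stack.pop()
        visits += 1
        stack.extend(node.operands)
    return visits


def count_visits_with_set(root):
    """Walk the graph once per node by remembering what was seen."""
    visited = set()
    visits = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        visits += 1
        stack.extend(node.operands)
    return visits


def build_shared_chain(depth):
    """Each level has two 'add' nodes that both use the previous level."""
    node = Expr("leaf")
    for i in range(depth):
        left = Expr(f"add{i}_1", (node,))
        right = Expr(f"add{i}_2", (node,))
        node = Expr(f"select{i}", (left, right))
    return node


root = build_shared_chain(10)
naive_visits = count_visits_naive(root)
set_visits = count_visits_with_set(root)
```

With ten levels the graph has 31 distinct nodes, but the naive walk touches 4093 because every level of sharing doubles the work; the set keeps the walk linear in the number of nodes.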

Spill and restore PPC CR registers using the FP when we have one · d85a04b3

Hal Finkel authored Apr 13, 2013

For functions that need to spill CRs, and have dynamic stack allocations, the
value of the SP during the restore is not what it was during the save, and so
we need to use the FP in these cases (as for all of the other spills and
restores, but the CR restore has a special code path because its reserved slot,
like the link register, is specified directly relative to the adjusted SP).

llvm-svn: 179457

d85a04b3

MI-Sched: DEBUG formatting. · 1f0bb69b
Andrew Trick authored Apr 13, 2013
```
llvm-svn: 179452
```
1f0bb69b
MI-Sched cleanup. If an instruction has no valid sched class, do not attempt... · be2bccbc
Andrew Trick authored Apr 13, 2013
```
MI-Sched cleanup. If an instruction has no valid sched class, do not attempt to check for a variant.

llvm-svn: 179451
```
be2bccbc

X86 machine model: reduce SandyBridge and Haswell ILPWindow. · f7fd6b9e

Andrew Trick authored Apr 13, 2013

The initial values were arbitrary. I want them to be more
conservative. This represents the number of latency cycles hidden by
OOO execution. In practice, I think it should be within a small factor
of the complex floating point operation latency so the scheduler can
make some attempt to hide latency even for smallish blocks.

These are by no means the best values, just a starting point for
tuning heuristics. Some benchmarks such as TSVC run faster with this
lower value for SandyBridge. I haven't run anything on Haswell, but
it shouldn't be 2x SB.

llvm-svn: 179450

f7fd6b9e

MI-Sched: schedule physreg copies. · e833e1cd

Andrew Trick authored Apr 13, 2013

The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.

llvm-svn: 179449

e833e1cd

Catch another case where SD fails to propagate node order. · 52b8387f

Andrew Trick authored Apr 13, 2013

I need to handle this for the test case in my following scheduler
commit.

Work is already under way to redesign the mechanism for node order
propagation because this case-by-case approach is unmaintainable.

llvm-svn: 179448

52b8387f

[mips] Move MipsTargetLowering::lowerINTRINSIC_W_CHAIN and · a6bbde58
Akira Hatanaka authored Apr 13, 2013
```
lowerINTRINSIC_WO_CHAIN into MipsSETargetLowering.

No functionality changes.

llvm-svn: 179444
```
a6bbde58

Finish templating MachObjectFile over endianness. · 9b709259

Rafael Espindola authored Apr 13, 2013

We are now able to handle big endian macho files in llvm-readobject. Thanks to
David Fang for providing the object files.

llvm-svn: 179440

9b709259

[mips] Reapply r179420 and r179421. · 2f08822f
Akira Hatanaka authored Apr 13, 2013
```
llvm-svn: 179434
```
2f08822f
[mips] Override TargetLoweringBase::isShuffleMaskLegal. · 48996b06
Akira Hatanaka authored Apr 13, 2013
```
llvm-svn: 179433
```
48996b06
[ms-inline asm] Simplify the logic by using parsePrimaryExpr. No functional · 43554eed
Chad Rosier authored Apr 12, 2013
```
change intended.  Test case previously added in r178568.
Part of rdar://13611297

llvm-svn: 179425
```
43554eed
Revert r179420 and r179421. · 8ed2892c
Akira Hatanaka authored Apr 12, 2013
```
llvm-svn: 179422
```
8ed2892c
[mips] Instruction selection patterns for carry-setting and using add · 931ad87f
Akira Hatanaka authored Apr 12, 2013
```
instructions.

llvm-svn: 179421
```
931ad87f
[mips] v4i8 and v2i16 add, sub and mul instruction selection patterns. · 8f41dd92
Akira Hatanaka authored Apr 12, 2013
```
llvm-svn: 179420
```
8f41dd92
Revert r179409 because it caused some warnings and some of the build bots fail. · 4e4d45e5
Nadav Rotem authored Apr 12, 2013
```
llvm-svn: 179418
```
4e4d45e5

Apr 12, 2013

InstCombine: Check the operand types before merging fcmp ord & fcmp ord. · e89c7050
Benjamin Kramer authored Apr 12, 2013
```
Fixes PR15737.

llvm-svn: 179417
```
e89c7050

SLPVectorizer: add support for vectorization of diamond shaped trees. We now... · 8543ba3e

Nadav Rotem authored Apr 12, 2013

SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. 

llvm-svn: 179414

8543ba3e
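
One way to picture the preliminary traversal is a walk over the operand graph that records, for each value, which instructions use it. The sketch below is an assumption about the general idea, not the SLPVectorizer implementation; the function name and the dictionary-based graph are hypothetical.

```python
def collect_multi_user_values(root, operands):
    """Walk operands from root and return values that have several users.

    operands maps each value name to the list of values it uses.
    The result maps each multi-user value to its sorted list of users.
    """
    users = {}
    seen = set()
    stack = [root]
    while stack:
        value = stack.pop()
        if value in seen:
            continue
        seen.add(value)
        for op in operands.get(value, []):
            users.setdefault(op, []).append(value)
            stack.append(op)
    return {v: sorted(u) for v, u in users.items() if len(u) > 1}


# A diamond: 'load' feeds both 'add' and 'sub', which meet again in 'mul'.
diamond = {
    "store": ["mul"],
    "mul": ["add", "sub"],
    "add": ["load"],
    "sub": ["load"],
}
shared = collect_multi_user_values("store", diamond)
```

Here `shared` identifies `load` as the value with two users, both of which sit inside the tree being considered.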

CostModel: increase the default cost of supported floating point operations... · 87a0af6e

Nadav Rotem authored Apr 12, 2013

CostModel: increase the default cost of supported floating point operations from 1 to 2. Fixed a few tests that changed because now the cost of one insert + a vector operation on two doubles is lower than two scalar operations on doubles.

llvm-svn: 179413

87a0af6e
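
The arithmetic behind the changed tests can be worked through directly. This sketch rests on two assumptions that the commit message does not state: the insert costs 1, and a vector operation on two doubles costs the same as one scalar floating point operation. The function names are hypothetical.

```python
def scalar_cost(fp_op_cost, lanes=2):
    """Cost of doing the operation once per double."""
    return lanes * fp_op_cost


def vector_cost(fp_op_cost, insert_cost=1):
    """Cost of one insert plus one vector operation on two doubles."""
    return insert_cost + fp_op_cost


# Old default (cost 1): 2 scalar vs 2 vector, so vectorizing is not cheaper.
old_vector_wins = vector_cost(1) < scalar_cost(1)
# New default (cost 2): 4 scalar vs 3 vector, so vectorizing is now cheaper.
new_vector_wins = vector_cost(2) < scalar_cost(2)
```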