Commits · a1e5e44eb35c1a3d4a64d34a7c73f3cea0c244d7 · Roger Ferrer / llvm-epi-0.8

Apr 15, 2013

Rename the slp-vectorizer clang/llvm flags. No functionality change. · a1e5e44e
Nadav Rotem authored Apr 15, 2013
```
llvm-svn: 179505
```
SLPVectorizer: Add support for vectorizing trees that start at compare instructions. · 5d393c41
Nadav Rotem authored Apr 15, 2013
```
llvm-svn: 179504
```
fix include path in doc Extending LLVM · f3076492
Jia Liu authored Apr 15, 2013
```
llvm-svn: 179503
```

Mark all PPC comparison instructions as not having side effects · 95e6ea69

Hal Finkel authored Apr 15, 2013

Now that the CR spilling issues have been resolved, we can remove the
unmodeled-side-effect attributes from the comparison instructions (and also
mark them as isCompare). By allowing these, by default, to have unmodeled side
effects, we were hiding problems with CR spilling; but everything seems much
happier now.

llvm-svn: 179502


Fix PPC64 CR spill location for callee-saved registers · 6736988a

Hal Finkel authored Apr 15, 2013

This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition
registers, the spill location is specified relative to the stack pointer (SP +
8). However, this is not relative to the SP after the new stack frame is
established, but instead relative to the caller's stack pointer (it is stored
into the linkage area of the parent's stack frame).
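The offset arithmetic described above can be sketched as follows. This is only an illustration: it assumes a downward-growing stack where the new SP equals the caller's SP minus the frame size, and the function name and the example frame size are hypothetical.

```python
# Sketch of the CR save-slot arithmetic; not the PPC backend's code.
CR_SAVE_OFFSET = 8  # from the commit message: caller's SP + 8


def cr_slot_offset_from_new_sp(frame_size):
    """Offset of the CR save slot relative to the callee's adjusted SP.

    Assumption: new_sp = caller_sp - frame_size (stack grows downward).
    """
    return frame_size + CR_SAVE_OFFSET


caller_sp = 0x7FFF_0000   # hypothetical address
frame_size = 128          # hypothetical frame size
new_sp = caller_sp - frame_size

# The slot lives in the caller's linkage area, not in the new frame.
slot_address = new_sp + cr_slot_offset_from_new_sp(frame_size)
```

Under these assumptions the slot address is the same whether it is computed from the caller's SP or from the adjusted SP, which is why the offset cannot be treated as an ordinary callee-saved spill slot inside the new frame.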

So, like with the link register, we don't directly spill the CRs with other
callee-saved registers, but just mark them to be spilled during prologue
generation.

In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32).

llvm-svn: 179500


Revert "Remove some unused triple and data layout." · 1f140317

Eric Christopher authored Apr 14, 2013

This reverts commit r179497 and the accompanying commit as it broke random platforms that aren't osx.

llvm-svn: 179499


Remove some unused triple and data layout. · 4eebd14a
Eric Christopher authored Apr 14, 2013
```
llvm-svn: 179498
```
If we've specified a triple on the command line then go ahead · e1876a2b
Eric Christopher authored Apr 14, 2013
```
and use that as the default triple for the module and target
data layout.

llvm-svn: 179497
```

Apr 14, 2013

Use object file specific section type for initial text section · 334c7bc7
Nico Rieck authored Apr 14, 2013
```
llvm-svn: 179494
```

Reorders two transforms that collide with each other · 1fae1955

David Majnemer authored Apr 14, 2013

One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that certain values of C1 and C2 trigger both
transforms, but the first one blocks out the second, which
generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224
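
As a sanity check, the two folds and the IR rewrite above can be brute-forced in a few lines. This is a sketch, not InstCombine code: the function names are hypothetical, the folds are checked at 8 bits, and the mask fold assumes that C1 ^ C2 is a single bit that is clear in C1.

```python
# Brute-force check of the two folds and the IR example (illustrative only).
MASK8 = 0xFF
MASK32 = 0xFFFFFFFF


def range_fold(x):
    # (X == 13 | X == 14)  ->  X-13 <u 2, with wrapping 8-bit subtraction
    return ((x - 13) & MASK8) < 2


def mask_fold(a, c1, c2):
    # (A == C1 || A == C2)  ->  (A & ~(C1 ^ C2)) == C1
    # Assumed precondition: C1 ^ C2 has one bit set, and that bit is clear in C1.
    return (a & ~(c1 ^ c2) & MASK8) == c1


def before(x):
    # %shr = lshr i32 %X, 4 ; %and = and %shr, 15 ; %add = add %and, -14
    shr = x >> 4
    and_ = shr & 15
    add = (and_ - 14) & MASK32
    return add != 0          # %tobool = icmp ne i32 %add, 0


def after(x):
    # %and = and i32 %X, 240 ; %tobool = icmp ne i32 %and, 224
    return (x & 240) != 224


folds_agree = all(before(x) == after(x) for x in range(1 << 12))
```

The pair (14, 15) is one example that satisfies both the range form and the single-bit mask form, which is the kind of overlap the commit message describes.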

llvm-svn: 179493


Make the command line triple match the module triple. · 6ebddae1
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179492
```
Miscellaneous cleanups for VecUtils.h · 7d62ea86
Benjamin Kramer authored Apr 14, 2013
```
llvm-svn: 179483
```
Document the SLP infrastructure. · efa56e18
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179480
```
SLP: Document the scalarization cost method. · 3403c115
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179479
```
Document the decision to assume that the cost of floats is twice as much as integers. · 0db0690a
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179478
```
Use i32 for all SPARC shift amounts, even in 64-bit mode. · eed1072f
Jakob Stoklund Olesen authored Apr 14, 2013
```
Test case by llvm-stress.

llvm-svn: 179477
```
Remove unused function attributes. · 029208ce
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179476
```

SLPVectorizer: Add support for trees that don't start at binary operators, and... · 54b413d1

Nadav Rotem authored Apr 14, 2013

SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree.

llvm-svn: 179475


Add support for the abs64 SPARC v9 code model. · c3c28f85
Jakob Stoklund Olesen authored Apr 14, 2013
```
For when 16 TB just isn't enough.

llvm-svn: 179474
```

Add support for the SPARC v9 abs44 code model. · c8fc76b0

Jakob Stoklund Olesen authored Apr 14, 2013

This is the default model for non-PIC 64-bit code. It supports
text+data+bss linked anywhere in the low 16 TB of the address space.
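
The 16 TB figure follows from a 44-bit address range. A small sketch of that arithmetic, assuming the "44" in abs44 is the number of absolute address bits (the helper name is hypothetical):

```python
# Address-range arithmetic for a 44-bit absolute code model (illustrative).
ABS44_BITS = 44            # assumption: abs44 means 44-bit absolute addresses
TIB = 1 << 40              # one tebibyte in bytes

abs44_limit = 1 << ABS44_BITS
abs44_size_in_tib = abs44_limit // TIB


def fits_abs44(address):
    """True if the address lies in the low 2**44 bytes of the address space."""
    return 0 <= address < abs44_limit
```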

llvm-svn: 179473


Use target flags for printing SPARC asm operands. · 2e64d7ab

Jakob Stoklund Olesen authored Apr 14, 2013

64-bit code models need multiple relocations that can't be inferred from
the opcode like they can in 32-bit code.

llvm-svn: 179472


Also put target flags on SPARC constant pool references. · e0fc832b
Jakob Stoklund Olesen authored Apr 14, 2013
```
Constant pool entries are accessed exactly the same way as global
variables.

llvm-svn: 179471
```
SLPVectorizer: add initial support for reduction variable vectorization. · 0b9cf856
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179470
```
Fix patterns for 64-bit pointers. · dc1ed578
Jakob Stoklund Olesen authored Apr 14, 2013
```
This fixes the pic32 code model for SPARC v9.

llvm-svn: 179469
```

Add target flags to SPARC address operands. · 1fb08a8b

Jakob Stoklund Olesen authored Apr 14, 2013

SDNodes and MachineOperands get target flags representing the %hi() and
%lo() assembly annotations that eventually become relocations.

Also define flags to be used by the 64-bit code models.
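
For readers unfamiliar with the annotations, here is a sketch of how %hi() and %lo() split a 32-bit address on SPARC: %hi() selects the upper 22 bits (the sethi immediate) and %lo() the low 10 bits. The Python function names are hypothetical and this models only the 32-bit case, not the 64-bit code-model flags.

```python
# Model of the 32-bit SPARC %hi()/%lo() split (illustrative only).

def hi22(value):
    """Upper 22 bits of a 32-bit value, as used by `sethi %hi(value), reg`."""
    return (value >> 10) & 0x3FFFFF


def lo10(value):
    """Low 10 bits of a value, as used by `or reg, %lo(value), reg`."""
    return value & 0x3FF


def materialize(value):
    """Rebuild the 32-bit value the way a sethi/or pair would."""
    reg = hi22(value) << 10      # sethi writes imm22 into bits 31..10
    return reg | lo10(value)     # or fills in bits 9..0
```

Because the two parts cannot be recovered from the opcode alone once more relocation kinds exist, carrying them as explicit target flags on the operand keeps the information attached to the address itself.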

llvm-svn: 179468


Mark all PPC CR registers to be spilled as live-in and tag MFCR appropriately · 2f293915

Hal Finkel authored Apr 13, 2013

Leaving MFCR as having unmodeled side effects is not enough to prevent
unwanted instruction reordering post-RA. We could probably apply a stronger
barrier attribute, but there is a better way: Add all (not just the first) CR
to be spilled as live-in to the entry block, and add all CRs to the MFCR
instruction as implicitly killed.

Unfortunately, I don't have a small test case.

llvm-svn: 179465


Apr 13, 2013

Define SPARC code models. · 15b3e900

Jakob Stoklund Olesen authored Apr 13, 2013

Currently, only abs32 and pic32 are implemented. Add a test case for
abs32 with 64-bit code. 64-bit PIC code is currently broken.

llvm-svn: 179463


Use the correct types when matching ADDRri patterns from frame indexes. · 6a0a3eb5
Jakob Stoklund Olesen authored Apr 13, 2013
```
It doesn't seem like anybody is checking types this late in isel, so no
test case.

llvm-svn: 179462
```
GlobalDCE: Fix an oversight in my last commit that could lead to crashes. · adc1727c
Benjamin Kramer authored Apr 13, 2013
```
There is a Constant with non-constant operands: blockaddress.

llvm-svn: 179460
```

Fix a scalability issue with complex ConstantExprs. · 89ca4bc6

Benjamin Kramer authored Apr 13, 2013

This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExpr multiple times.

For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here; it may consist of thousands of
nodes.

The testcase exercises this by creating an insanely large ConstantExpr out of
a loop. It's questionable whether the optimizer should ever create those, but
this can be triggered with real C code. Fixes PR15714.
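
The effect of the visited set can be sketched outside LLVM. The `Expr` class and helper names below are hypothetical stand-ins for ConstantExpr and its operand list; the sketch only shows why shared subexpressions make an unguarded walk exponential.

```python
# Walking an expression DAG with and without a visited set (illustrative).

class Expr:
    def __init__(self, name, *operands):
        self.name = name
        self.operands = operands


def count_visits(root, use_visited_set):
    """Count how many nodes a depth-first walk touches."""
    visits = 0
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if use_visited_set:
            if id(node) in seen:
                continue          # already walked this shared subexpression
            seen.add(id(node))
        visits += 1
        stack.extend(node.operands)
    return visits


def build_shared_chain(depth):
    """Each level uses the previous level twice, like (add big_expr, big_expr)."""
    expr = Expr("leaf")
    for level in range(depth):
        expr = Expr(f"add{level}", expr, expr)
    return expr


root = build_shared_chain(10)
naive_visits = count_visits(root, use_visited_set=False)
guarded_visits = count_visits(root, use_visited_set=True)
```

With ten levels of sharing the unguarded walk touches 2047 nodes, while the guarded walk touches each of the 11 distinct nodes once.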

llvm-svn: 179458


Spill and restore PPC CR registers using the FP when we have one · d85a04b3

Hal Finkel authored Apr 13, 2013

For functions that need to spill CRs, and have dynamic stack allocations, the
value of the SP during the restore is not what it was during the save, and so
we need to use the FP in these cases (as for all of the other spills and
restores, but the CR restore has a special code path because its reserved slot,
like the link register, is specified directly relative to the adjusted SP).

llvm-svn: 179457


Further generalize this scheduler test. · 3d957c0e
Andrew Trick authored Apr 13, 2013
```
The order of copies depends on queue order, which is not very stable.

llvm-svn: 179456
```
Fix a dyslexic regex. · e6f9fc0c
Andrew Trick authored Apr 13, 2013
```
llvm-svn: 179455
```
Add a missing REQUIRES: asserts · 88a1285b
Andrew Trick authored Apr 13, 2013
```
llvm-svn: 179453
```
MI-Sched: DEBUG formatting. · 1f0bb69b
Andrew Trick authored Apr 13, 2013
```
llvm-svn: 179452
```
MI-Sched cleanup. If an instruction has no valid sched class, do not attempt... · be2bccbc
Andrew Trick authored Apr 13, 2013
```
MI-Sched cleanup. If an instruction has no valid sched class, do not attempt to check for a variant.

llvm-svn: 179451
```

X86 machine model: reduce SandyBridge and Haswell ILPWindow. · f7fd6b9e

Andrew Trick authored Apr 13, 2013

The initial values were arbitrary. I want them to be more
conservative. This represents the number of latency cycles hidden by
OOO execution. In practice, I think it should be within a small factor
of the complex floating point operation latency so the scheduler can
make some attempt to hide latency even for smallish blocks.

These are by no means the best values, just a starting point for
tuning heuristics. Some benchmarks such as TSVC run faster with this
lower value for SandyBridge. I haven't run anything on Haswell, but
it shouldn't be 2x SB.

llvm-svn: 179450


MI-Sched: schedule physreg copies. · e833e1cd

Andrew Trick authored Apr 13, 2013

The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.

llvm-svn: 179449


Catch another case where SD fails to propagate node order. · 52b8387f

Andrew Trick authored Apr 13, 2013

I need to handle this for the test case in my following scheduler
commit.

Work is already under way to redesign the mechanism for node order
propagation because this case-by-case approach is unmaintainable.

llvm-svn: 179448


Add typenames to see if bot goes green. · 98c0eaec

Rafael Espindola authored Apr 13, 2013

I hope this brings http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32 back.

llvm-svn: 179446
