  1. Aug 07, 2013
  2. Aug 06, 2013
    • Tim Northover
      Refactor isInTailCallPosition handling · a4415854
      This change came about primarily because of two issues in the existing code.
      Neither of:
      
      define i64 @test1(i64 %val) {
        %in = trunc i64 %val to i32
        tail call i32 @ret32(i32 returned %in)
        ret i64 %val
      }
      
      define i32 @test2(i64 %val) {
        tail call i32 @ret32(i32 returned undef)
        ret i32 42
      }
      
      should be tail calls, and the function sameNoopInput is responsible. The main
      problem is that it is completely symmetric in the "tail call" and "ret" value,
      but in reality different things are allowed on each side.
      
      For these cases:
      1. Any truncation should lead to a larger value being generated by "tail call"
         than needed by "ret".
      2. Undef should only be allowed as a source for ret, not as a result of the
         call.
      
      Along the way I noticed that a mismatch between what this function treats as a
      valid truncation and what the backends see can lead to invalid calls as well
      (see x86-32 test case).
      
      This patch refactors the code so that instead of being based primarily on
      values which it recurses into when necessary, it starts by inspecting the type
      and considers each fundamental slot that the backend will see in turn. For
      example, given a pathological function that returned {{}, {{}, i32, {}}, i32}
      we would consider each "real" i32 in turn, and ask if it passes through
      unchanged. This is much closer to what the backend sees as a result of
      ComputeValueVTs.
      
      Aside from the bug fixes, this eliminates the recursion that's going on and, I
      believe, makes the bulk of the code significantly easier to understand. The
      trade-off is the nasty iterators needed to find the real types inside a
      returned value.
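      The flattening described above can be sketched as follows. This is a toy model, not the actual ComputeValueVTs code: the `Ty` struct and `flatten` function are illustrative names, standing in for LLVM's type hierarchy and value-type expansion.

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      // Toy stand-in for an LLVM type: either a scalar leaf (named, e.g. "i32")
      // or a struct of member types. Flattening yields the ordered list of
      // "real" fundamental slots the backend will see.
      struct Ty {
        std::string Scalar;      // non-empty => leaf type
        std::vector<Ty> Members; // used when Scalar is empty (a struct)
      };

      static void flatten(const Ty &T, std::vector<std::string> &Slots) {
        if (!T.Scalar.empty()) {
          Slots.push_back(T.Scalar);
          return;
        }
        for (const Ty &M : T.Members)
          flatten(M, Slots); // empty structs contribute no slots
      }

      int main() {
        // The pathological {{}, {{}, i32, {}}, i32} from the message above.
        Ty Empty{};
        Ty I32{"i32", {}};
        Ty Inner{"", {Empty, I32, Empty}};
        Ty Outer{"", {Empty, Inner, I32}};

        std::vector<std::string> Slots;
        flatten(Outer, Slots);
        // Only the two "real" i32 slots survive; each is then checked in turn.
        assert(Slots.size() == 2 && Slots[0] == "i32" && Slots[1] == "i32");
        return 0;
      }
      ```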
      
      llvm-svn: 187787
    • NAKAMURA Takumi
      e359e856
    • Eric Christopher
      Recommit previous cleanup with a fix for c++98 ambiguity. · 0062f2ed
      llvm-svn: 187752
    • Tom Stellard
      TargetLowering: Add getVectorIdxTy() function v2 · d42c5949
      This virtual function can be implemented by targets to specify the type
      to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT,
      INSERT_SUBVECTOR, EXTRACT_SUBVECTOR.  The default implementation returns
      the result from TargetLowering::getPointerTy().
      
      The previous code was using TargetLowering::getPointerTy() for vector
      indices, because this is guaranteed to be legal on all targets.  However,
      using TargetLowering::getPointerTy() can be a problem for targets with
      pointer sizes that differ across address spaces.  On such targets,
      when vectors need to be loaded or stored to an address space other than the
      default 'zero' address space (which is the address space assumed by
      TargetLowering::getPointerTy()), having an index that
      is a different size than the pointer can lead to inefficient
      pointer calculations (e.g. 64-bit adds for a 32-bit address space).
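      A minimal sketch of the hook's shape, using a toy class rather than the real TargetLowering API (the names `ToyTargetLowering`, `getVectorIdxWidth`, and the bit-width return type are all illustrative simplifications):

      ```cpp
      #include <cassert>

      // Toy model: a target picks the integer width used for vector-index
      // operands. The base-class default falls back to the pointer width,
      // mirroring the commit's default of TargetLowering::getPointerTy().
      struct ToyTargetLowering {
        virtual ~ToyTargetLowering() = default;
        virtual unsigned getPointerSizeInBits() const { return 64; }
        virtual unsigned getVectorIdxWidth() const {
          return getPointerSizeInBits(); // default: same as pointer type
        }
      };

      // A hypothetical target whose non-default address spaces use 32-bit
      // pointers overrides the hook so index arithmetic stays 32-bit.
      struct NarrowAddrSpaceTarget : ToyTargetLowering {
        unsigned getVectorIdxWidth() const override { return 32; }
      };

      int main() {
        ToyTargetLowering Default;
        NarrowAddrSpaceTarget Narrow;
        assert(Default.getVectorIdxWidth() == 64); // unchanged behavior
        assert(Narrow.getVectorIdxWidth() == 32);  // avoids 64-bit adds
        return 0;
      }
      ```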
      
      There is no intended functionality change with this patch.
      
      llvm-svn: 187748
    • Eric Christopher
      Revert "Use existing builtin hashing functions to make this routine more" · 432c99af
      This reverts commit r187745.
      
      llvm-svn: 187747
    • Eric Christopher
      Use existing builtin hashing functions to make this routine more simple. · d728355a
      
      llvm-svn: 187745
  3. Aug 05, 2013
  4. Aug 02, 2013
  5. Aug 01, 2013
  6. Jul 31, 2013
    • Eric Christopher
      Fix crashing on invalid inline asm with matching constraints. · e6656ac8
      For a testcase like the following:
      
       typedef unsigned long uint64_t;
      
       typedef struct {
         uint64_t lo;
         uint64_t hi;
       } blob128_t;
      
       void add_128_to_128(const blob128_t *in, blob128_t *res) {
         asm ("PAND %1, %0" : "+Q"(*res) : "Q"(*in));
       }
      
      where we'll fail to allocate the register for the output constraint,
      our matching input constraint will not find a register to match,
      and could try to search past the end of the current operands array.
      
      On the idea that we'd like to attempt to keep compilation going
      to find more errors in the module, change the error cases when
      we're visiting inline asm IR to return immediately and avoid
      trying to create a node in the DAG. This leaves us with only
      a single error message per inline asm instruction, but allows us
      to safely keep going in the general case.
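      The error-path pattern this describes can be sketched in miniature; this is a toy illustration, not the actual SelectionDAG visitor (the `Operand` struct and `matchInputToOutput` function are invented for the example):

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      struct Operand { std::string Constraint; bool HasRegister; };

      // Returns true on success. On failure it reports one diagnostic and
      // returns immediately -- the behavior the fix adopts -- instead of
      // building a node or indexing past the end of the operand array.
      static bool matchInputToOutput(const std::vector<Operand> &Ops,
                                     size_t MatchedIdx, std::string &Error) {
        if (MatchedIdx >= Ops.size()) { // would have walked off the array
          Error = "invalid matching constraint";
          return false;
        }
        if (!Ops[MatchedIdx].HasRegister) { // output never got a register
          Error = "could not allocate register for constraint";
          return false; // single error message, no DAG node created
        }
        return true;
      }

      int main() {
        std::vector<Operand> Ops = {{"+Q", /*HasRegister=*/false}};
        std::string Error;
        assert(!matchInputToOutput(Ops, 0, Error)); // unallocated output
        assert(!Error.empty());
        assert(!matchInputToOutput(Ops, 1, Error)); // out-of-range handled
        return 0;
      }
      ```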
      
      llvm-svn: 187470
    • Eric Christopher
      Reflow this to be easier to read. · 029af150
      llvm-svn: 187459
  7. Jul 30, 2013
  8. Jul 29, 2013
    • Nico Rieck
      Use proper section suffix for COFF weak symbols · 7fdaee8f
      32-bit symbols have "_" as global prefix, but when forming the name of
      COMDAT sections this prefix is ignored. The current behavior assumes that
      this prefix is always present which is not the case for 64-bit and names
      are truncated.
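      The idea behind the fix can be sketched as follows; this is an illustrative toy (the `comdatSuffix` helper is invented here), not the actual MC code:

      ```cpp
      #include <cassert>
      #include <string>

      // Form the COMDAT section suffix for a weak symbol: drop the "_" global
      // prefix only when the target actually uses one (32-bit COFF), instead
      // of unconditionally stripping the first character.
      static std::string comdatSuffix(const std::string &Sym,
                                      bool HasUnderscorePrefix) {
        if (HasUnderscorePrefix && !Sym.empty() && Sym[0] == '_')
          return Sym.substr(1);
        return Sym; // 64-bit COFF: no prefix, keep the full name
      }

      int main() {
        // 32-bit: "_foo" -> section suffix "foo".
        assert(comdatSuffix("_foo", /*HasUnderscorePrefix=*/true) == "foo");
        // 64-bit: "foo" stays "foo"; the old code would have truncated it.
        assert(comdatSuffix("foo", /*HasUnderscorePrefix=*/false) == "foo");
        return 0;
      }
      ```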
      
      llvm-svn: 187356
  9. Jul 27, 2013
  10. Jul 26, 2013
  11. Jul 25, 2013
    • Andrew Trick
      RegAllocGreedy comment. · f4b1ee34
      llvm-svn: 187141
    • Andrew Trick
      Evict local live ranges if they can be reassigned. · 8bb0a251
      The previous change to local live range allocation also suppressed
      eviction of local ranges. In rare cases, this could result in more
      expensive register choices. This commit actually revives a feature
      that I added long ago: check if live ranges can be reassigned before
      eviction. But now it only happens in rare cases of evicting a local
      live range because another local live range wants a cheaper register.
      
      The benefit is improved code size for some benchmarks on x86 and armv7.
      
      I measured no significant compile time increase and performance
      changes are noise.
      
      llvm-svn: 187140
    • Andrew Trick
      Allocate local registers in order for optimal coloring. · 8485257d
      Also avoid locals evicting locals just because they want a cheaper register.
      
      Problem: MI Sched knows exactly how many registers we have and assumes
      they can be colored. In cases where we have large blocks, usually from
      unrolled loops, greedy coloring fails. This is a source of
      "regressions" from the MI Scheduler on x86. I noticed this issue on
      x86 where we have long chains of two-address defs in the same live
      range. It's easy to see this in matrix multiplication benchmarks like
      IRSmk and even the unit test misched-matmul.ll.
      
      A fundamental difference between the LLVM register allocator and
      conventional graph coloring is that in our model a live range can't
      discover its neighbors, it can only verify its neighbors. That's why
      we initially went for greedy coloring and added eviction to deal with
      the hard cases. However, for singly defined and two-address live
      ranges, we can optimally color without visiting neighbors simply by
      processing the live ranges in instruction order.
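      The in-order idea can be sketched with a linear-scan-flavored toy; this is not the actual RegAllocGreedy implementation, just an illustration of why processing ranges by start point colors simple interval structures greedily:

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <climits>
      #include <vector>

      struct LiveRange { int Start, End, Reg = -1; };

      // Assign registers in instruction (start) order, taking the lowest
      // register that is free at each range's start; returns spill count.
      static int allocateInOrder(std::vector<LiveRange> &Ranges, int NumRegs) {
        std::sort(Ranges.begin(), Ranges.end(),
                  [](const LiveRange &A, const LiveRange &B) {
                    return A.Start < B.Start;
                  });
        std::vector<int> FreeAt(NumRegs, INT_MIN); // point each reg frees up
        int Spills = 0;
        for (LiveRange &LR : Ranges) {
          int Found = -1;
          for (int R = 0; R < NumRegs; ++R)
            if (FreeAt[R] <= LR.Start) { Found = R; break; }
          if (Found < 0) { ++Spills; continue; } // no register available
          LR.Reg = Found;
          FreeAt[Found] = LR.End;
        }
        return Spills;
      }

      int main() {
        std::vector<LiveRange> Ranges = {{0, 4}, {1, 3}, {4, 6}, {2, 5}};
        assert(allocateInOrder(Ranges, 3) == 0); // three registers suffice
        std::vector<LiveRange> Tight = {{0, 4}, {1, 3}, {2, 5}};
        assert(allocateInOrder(Tight, 2) == 1);  // one range cannot be colored
        return 0;
      }
      ```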
      
      Other beneficial side effects:
      
      It is much easier to understand and debug regalloc for large blocks
      when the live ranges are allocated in order. Yes, global allocation is
      still very confusing, but it's nice to be able to comprehend what
      happened locally.
      
      Heuristics could be added to bias register assignment based on
      instruction locality (think late register pairing, banks...).
      
      Intuitively this will make some test cases that are on the threshold
      of register pressure more stable.
      
      llvm-svn: 187139
    • Adrian Prantl
      typo. · e4daf52a
      llvm-svn: 187135
    • Andrew Trick
      MI Sched: Register pressure heuristics. · 401b6959
      Consider which set is being increased or decreased before comparing.
      
      llvm-svn: 187110