  1. May 01, 2013
  2. Apr 30, 2013
    • Hal Finkel's avatar
      LocalStackSlotAllocation improvements · 7153251a
      Hal Finkel authored
      First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.
      
      Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.
      
      Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and an r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll
      
      Jim has okayed this off-list.
      
      llvm-svn: 180799
      7153251a
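The quadratic-search removal above relies on the base registers being created in increasing order of local frame offset, so only the most recently created one can possibly serve a new reference. A minimal sketch of that check (illustrative names and types, not the actual LocalStackSlotAllocation code):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a virtual base register: the frame offset it
// anchors and the displacement range reachable from that offset.
struct BaseReg {
  int64_t Offset;
  int64_t Range;
};

// Return the index of a usable base register, or -1 if a new one must be
// created. Because registers are allocated in order of increasing offset,
// checking only the last element replaces a per-reference linear scan
// (quadratic overall) with a constant-time check.
int findUsableBaseReg(const std::vector<BaseReg> &Regs, int64_t RefOffset) {
  if (!Regs.empty()) {
    const BaseReg &Last = Regs.back();
    if (RefOffset >= Last.Offset && RefOffset < Last.Offset + Last.Range)
      return static_cast<int>(Regs.size()) - 1;
  }
  return -1; // caller creates a new base register
}
```

The sketch also shows why the ordering matters: an earlier base register could never cover an offset that the last one cannot, so nothing is lost by skipping the scan.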
    • Bill Wendling's avatar
      Emit the TLS initialization function pointers into the correct section. · fb7e32eb
      Bill Wendling authored
      The `llvm.tls_init_funcs' (created by the front-end) holds pointers to the TLS
      initialization functions. These need to be placed into the correct section so
      that they are run before `main()'.
      
      <rdar://problem/13733006>
      
      llvm-svn: 180737
      fb7e32eb
  3. Apr 27, 2013
    • Andrew Trick's avatar
      Generalize the MachineTraceMetrics public API. · 85058af6
      Andrew Trick authored
      Naturally, we should be able to pass in extra instructions, not just
      extra blocks.
      
      llvm-svn: 180667
      85058af6
    • Eric Christopher's avatar
      Use the target triple from the target machine rather than the module · 203e12bf
      Eric Christopher authored
      to determine whether or not we're on a darwin platform for debug code
      emitting.
      
      Solves the problem of a module with no triple on the command line and
      no triple in the module using non-gdb-ok features on darwin. Fix
      up the member-pointers test to check the correct things
      cross-platform (DW_FORM_flag is a good prefix).
      
      Unfortunately no testcase, because I have no idea how to test something
      without a triple on the command line and without a triple in the module
      yet check precisely on two platforms. Ideas welcome.
      
      llvm-svn: 180660
      203e12bf
  4. Apr 26, 2013
  5. Apr 25, 2013
  6. Apr 24, 2013
    • Andrew Trick's avatar
      MI Sched: eliminate local vreg copies. · 85a1d4cb
      Andrew Trick authored
      For now, we just reschedule instructions that use the copied vregs and
      let regalloc eliminate it. I would really like to eliminate the
      copies on-the-fly during scheduling, but we need a complete
      implementation of repairIntervalsInRange() first.
      
      The general strategy is for the register coalescer to eliminate as
      many global copies as possible and shrink live ranges to be
      extended-basic-block local. The coalescer should not have to worry
      about resolving local copies (e.g. it shouldn't attempt to reorder
      instructions). The scheduler is a much better place to deal with local
      interference. The coalescer side of this equation needs work.
      
      llvm-svn: 180193
      85a1d4cb
    • Andrew Trick's avatar
      Register Coalescing: add a flag to disable rescheduling. · 608a698c
      Andrew Trick authored
      When MachineScheduler is enabled, this functionality can be
      removed. Until then, provide a way to disable it for test cases and
      designing MachineScheduler heuristics.
      
      llvm-svn: 180192
      608a698c
    • Andrew Trick's avatar
      MI Sched: regpressure tracing. · 7c791a3d
      Andrew Trick authored
      llvm-svn: 180191
      7c791a3d
    • Eric Christopher's avatar
      Formatting. · 4eb5eb5b
      Eric Christopher authored
      llvm-svn: 180186
      4eb5eb5b
  7. Apr 23, 2013
    • Owen Anderson's avatar
      DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC... · 2d4cca35
      Owen Anderson authored
      DAGCombine should not aggressively fold SEXT(VSETCC(...)) into a wider VSETCC without first checking the target's vector boolean contents.
      This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents.  The included change
      fixes the PowerPC tests, and was OK'd by Hal.
      
      llvm-svn: 180129
      2d4cca35
    • Stephen Lin's avatar
      Add some constraints to use of 'returned': · 6c70dc78
      Stephen Lin authored
      1) Disallow 'returned' on a parameter that is also 'sret' (no sensible semantics, as far as I can tell).
      2) Conservatively disallow tail calls through 'returned' parameters that also are 'zext' or 'sext' (for consistency with treatment of other zero-extending and sign-extending operations in tail call position detection...can be revised later to handle situations that can be determined to be safe).
      
      This is a new attribute that is not yet used, so there is no impact.
      
      llvm-svn: 180118
      6c70dc78
    • Matt Arsenault's avatar
      Remove unused DwarfSectionOffsetDirective string · 034ca0fe
      Matt Arsenault authored
      The value isn't actually used, and setting it emits a COFF specific
      directive.
      
      llvm-svn: 180064
      034ca0fe
    • Eric Christopher's avatar
      Move C++ code out of the C headers and into either C++ headers · 04d4e931
      Eric Christopher authored
      or the C++ files themselves. This enables people to use
      just a C compiler to interoperate with LLVM.
      
      llvm-svn: 180063
      04d4e931
  8. Apr 22, 2013
    • Eli Bendersky's avatar
      Optimize MachineBasicBlock::getSymbol by caching the symbol. Since the symbol · 58b04b7e
      Eli Bendersky authored
      name computation is expensive, this helps save about 25% of the time spent in
      this function.
      
      llvm-svn: 180049
      58b04b7e
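The caching described above is the standard lazy-initialization pattern: compute the expensive symbol once on first use and hand back the cached result thereafter. A minimal sketch under assumed names (not the actual MachineBasicBlock code):

```cpp
#include <string>

// Illustrative stand-in for a basic block whose symbol name is expensive
// to compute. The cached pointer starts null; getSymbol() fills it on the
// first call and every later call returns the same cached object.
class Block {
  int Number;
  mutable std::string Storage;                 // owns the computed name
  mutable const std::string *CachedSym = nullptr;

public:
  explicit Block(int N) : Number(N) {}

  const std::string &getSymbol() const {
    if (!CachedSym) {
      // Stand-in for the expensive name computation.
      Storage = "BB" + std::to_string(Number);
      CachedSym = &Storage;
    }
    return *CachedSym;
  }
};
```

This trades one pointer of storage per block for skipping the recomputation on every call, which is where the quoted ~25% saving comes from.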
    • Rafael Espindola's avatar
      Clarify that llvm.used can contain aliases. · 74f2e46e
      Rafael Espindola authored
      Also add a check for llvm.used in the verifier and simplify clients now that
      they can assume they have a ConstantArray.
      
      llvm-svn: 180019
      74f2e46e
    • Eric Christopher's avatar
      Tidy. · 44c6aa67
      Eric Christopher authored
      llvm-svn: 180000
      44c6aa67
    • Eric Christopher's avatar
      Update comment. Whitespace. · 25e3509c
      Eric Christopher authored
      llvm-svn: 179999
      25e3509c
    • David Blaikie's avatar
      Revert "Revert "PR14606: debug info imported_module support"" · f55abeaf
      David Blaikie authored
      This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll
      
      I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
      though the debug info was clearly invalid on all of them, but this ought to fix
      it.
      
      llvm-svn: 179996
      f55abeaf
    • Jim Grosbach's avatar
      Legalize vector truncates by parts rather than just splitting. · 563983c8
      Jim Grosbach authored
      Rather than just splitting the input type and hoping for the best, apply
      a bit more cleverness. Just splitting the types until the source is
      legal often leads to an illegal result type, which is then widened and a
      scalarization step is introduced which leads to truly horrible code
      generation. With the loop vectorizer, these sorts of operations are much
      more common, and so it's worth extra effort to do them well.
      
      Add a legalization hook for the operands of a TRUNCATE node, which will
      be encountered after the result type has been legalized, but if the
      operand type is still illegal. If simple splitting of both types
      ends up with the result type of each half still being legal, just
      do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
      result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
      we can get more clever with power-two vectors. Specifically,
      split the input type, but also widen the result element size, then
      concatenate the halves and truncate again. For example on ARM, to
      perform a "%res = v8i8 trunc v8i32 %in" we transform to:
        %inlo = v4i32 extract_subvector %in, 0
        %inhi = v4i32 extract_subvector %in, 4
        %lo16 = v4i16 trunc v4i32 %inlo
        %hi16 = v4i16 trunc v4i32 %inhi
        %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
        %res = v8i8 trunc v8i16 %in16
      
      This allows instruction selection to generate three VMOVN instructions
      instead of a sequence of moves, stores and loads.
      
      Update the ARMTargetTransformInfo to take this improved legalization
      into account.
      
      Consider the simplified IR:
      
      define <16 x i8> @test1(<16 x i32>* %ap) {
        %a = load <16 x i32>* %ap
        %tmp = trunc <16 x i32> %a to <16 x i8>
        ret <16 x i8> %tmp
      }
      
      define <8 x i8> @test2(<8 x i32>* %ap) {
        %a = load <8 x i32>* %ap
        %tmp = trunc <8 x i32> %a to <8 x i8>
        ret <8 x i8> %tmp
      }
      
      Previously, we would generate the truly hideous:
      	.syntax unified
      	.section	__TEXT,__text,regular,pure_instructions
      	.globl	_test1
      	.align	2
      _test1:                                 @ @test1
      @ BB#0:
      	push	{r7}
      	mov	r7, sp
      	sub	sp, sp, #20
      	bic	sp, sp, #7
      	add	r1, r0, #48
      	add	r2, r0, #32
      	vld1.64	{d24, d25}, [r0:128]
      	vld1.64	{d16, d17}, [r1:128]
      	vld1.64	{d18, d19}, [r2:128]
      	add	r1, r0, #16
      	vmovn.i32	d22, q8
      	vld1.64	{d16, d17}, [r1:128]
      	vmovn.i32	d20, q9
      	vmovn.i32	d18, q12
      	vmov.u16	r0, d22[3]
      	strb	r0, [sp, #15]
      	vmov.u16	r0, d22[2]
      	strb	r0, [sp, #14]
      	vmov.u16	r0, d22[1]
      	strb	r0, [sp, #13]
      	vmov.u16	r0, d22[0]
      	vmovn.i32	d16, q8
      	strb	r0, [sp, #12]
      	vmov.u16	r0, d20[3]
      	strb	r0, [sp, #11]
      	vmov.u16	r0, d20[2]
      	strb	r0, [sp, #10]
      	vmov.u16	r0, d20[1]
      	strb	r0, [sp, #9]
      	vmov.u16	r0, d20[0]
      	strb	r0, [sp, #8]
      	vmov.u16	r0, d18[3]
      	strb	r0, [sp, #3]
      	vmov.u16	r0, d18[2]
      	strb	r0, [sp, #2]
      	vmov.u16	r0, d18[1]
      	strb	r0, [sp, #1]
      	vmov.u16	r0, d18[0]
      	strb	r0, [sp]
      	vmov.u16	r0, d16[3]
      	strb	r0, [sp, #7]
      	vmov.u16	r0, d16[2]
      	strb	r0, [sp, #6]
      	vmov.u16	r0, d16[1]
      	strb	r0, [sp, #5]
      	vmov.u16	r0, d16[0]
      	strb	r0, [sp, #4]
      	vldmia	sp, {d16, d17}
      	vmov	r0, r1, d16
      	vmov	r2, r3, d17
      	mov	sp, r7
      	pop	{r7}
      	bx	lr
      
      	.globl	_test2
      	.align	2
      _test2:                                 @ @test2
      @ BB#0:
      	push	{r7}
      	mov	r7, sp
      	sub	sp, sp, #12
      	bic	sp, sp, #7
      	vld1.64	{d16, d17}, [r0:128]
      	add	r0, r0, #16
      	vld1.64	{d20, d21}, [r0:128]
      	vmovn.i32	d18, q8
      	vmov.u16	r0, d18[3]
      	vmovn.i32	d16, q10
      	strb	r0, [sp, #3]
      	vmov.u16	r0, d18[2]
      	strb	r0, [sp, #2]
      	vmov.u16	r0, d18[1]
      	strb	r0, [sp, #1]
      	vmov.u16	r0, d18[0]
      	strb	r0, [sp]
      	vmov.u16	r0, d16[3]
      	strb	r0, [sp, #7]
      	vmov.u16	r0, d16[2]
      	strb	r0, [sp, #6]
      	vmov.u16	r0, d16[1]
      	strb	r0, [sp, #5]
      	vmov.u16	r0, d16[0]
      	strb	r0, [sp, #4]
      	ldm	sp, {r0, r1}
      	mov	sp, r7
      	pop	{r7}
      	bx	lr
      
      Now, however, we generate the much more straightforward:
      	.syntax unified
      	.section	__TEXT,__text,regular,pure_instructions
      	.globl	_test1
      	.align	2
      _test1:                                 @ @test1
      @ BB#0:
      	add	r1, r0, #48
      	add	r2, r0, #32
      	vld1.64	{d20, d21}, [r0:128]
      	vld1.64	{d16, d17}, [r1:128]
      	add	r1, r0, #16
      	vld1.64	{d18, d19}, [r2:128]
      	vld1.64	{d22, d23}, [r1:128]
      	vmovn.i32	d17, q8
      	vmovn.i32	d16, q9
      	vmovn.i32	d18, q10
      	vmovn.i32	d19, q11
      	vmovn.i16	d17, q8
      	vmovn.i16	d16, q9
      	vmov	r0, r1, d16
      	vmov	r2, r3, d17
      	bx	lr
      
      	.globl	_test2
      	.align	2
      _test2:                                 @ @test2
      @ BB#0:
      	vld1.64	{d16, d17}, [r0:128]
      	add	r0, r0, #16
      	vld1.64	{d18, d19}, [r0:128]
      	vmovn.i32	d16, q8
      	vmovn.i32	d17, q9
      	vmovn.i16	d16, q8
      	vmov	r0, r1, d16
      	bx	lr
      
      llvm-svn: 179989
      563983c8
  9. Apr 21, 2013
  10. Apr 20, 2013
  11. Apr 19, 2013