Commits · 68062617a6e4fd5e2fefa0ebb99a36e77dcb0cff · Roger Ferrer / llvm-epi-0.8

Apr 10, 2012

Make a somewhat subtle change in the logic of block placement. Sometimes · 68062617

Chandler Carruth authored Apr 10, 2012

the loop header has a non-loop predecessor which has been pre-fused into
its chain due to unanalyzable branches. In this case, rotating the
header into the body of the loop in order to place a loop exit at the
bottom of the loop is a Very Bad Idea as it makes the loop
non-contiguous.

I'm working on a good test case for this, but it's a bit annoying to
craft. I should get one shortly, but I'm submitting this now so I can
begin the (lengthy) performance analysis process. An initial run of LNT
looks really, really good, but there is too much noise there for me to
trust it much.

llvm-svn: 154395

68062617

Transform div to mul with reciprocal only when fp imm is legal. · 4d1220de
Anton Korobeynikov authored Apr 10, 2012
```
This fixes PR12516 and uncovers one weird problem in legalize (worked around).

llvm-svn: 154394
```
4d1220de
Use the correct section types on Solaris for unwind data on both x86 and x86-64. · bbec8720
David Chisnall authored Apr 10, 2012
```
Patch by Dmitri Shubin!

llvm-svn: 154391
```
bbec8720
Express the number of ULPs in fpaccuracy metadata as a real rather than a · af06b26c
Duncan Sands authored Apr 10, 2012
```
rational number, e.g. as 2.5 rather than 5, 2. OK'd by Peter Collingbourne.

llvm-svn: 154387
```
af06b26c

Fix 12513: Loop unrolling breaks with indirect branches. · 4442bfe5

Andrew Trick authored Apr 10, 2012

Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

llvm-svn: 154386

4442bfe5

whitespace · 4104ed9c
Andrew Trick authored Apr 10, 2012
```
llvm-svn: 154385
```
4104ed9c
Fix for register pressure tables. · 7d52db98
Andrew Trick authored Apr 10, 2012
```
Recent refactoring introduced a bug. Fix: added buildRegUnitSets.

llvm-svn: 154382
```
7d52db98
Add proper checks. · 07526249
Evan Cheng authored Apr 10, 2012
```
llvm-svn: 154379
```
07526249
Make the code slightly more palatable. · 136861d9
Evan Cheng authored Apr 10, 2012
```
llvm-svn: 154378
```
136861d9
Use std::includes instead of my own implementation. · 9002c315
Andrew Trick authored Apr 10, 2012
```
Jakob's review.

llvm-svn: 154377
```
9002c315
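
For readers unfamiliar with `std::includes`, here is a minimal sketch of the kind of subset test it replaces. The helper name `isSubset` and the use of `std::vector<unsigned>` are assumptions made for illustration; the commit message does not show the actual code that was changed.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper, not taken from the commit.
// Returns true when every element of `sub` also appears in `super`.
// Both ranges must already be sorted with the same ordering, because
// std::includes performs a single merge-style pass over them.
bool isSubset(const std::vector<unsigned> &super,
              const std::vector<unsigned> &sub) {
  return std::includes(super.begin(), super.end(), sub.begin(), sub.end());
}
```

With sorted inputs, `isSubset({1, 2, 3, 5}, {2, 5})` holds and `isSubset({1, 2, 3, 5}, {2, 4})` does not. An empty `sub` is always a subset.
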
Added a TargetRegisterInfo interface for accessing register pressure sets. · 31f64875
Andrew Trick authored Apr 10, 2012
```
llvm-svn: 154375
```
31f64875

Added register unit sets to the target description. · 739a0038

Andrew Trick authored Apr 10, 2012

This is a new algorithm that finds sets of register units that can be
used to model register pressure. This handles arbitrary, overlapping
register classes. Each register class is associated with a (small)
list of pressure sets. These are the dimensions of pressure affected
by the register class's liveness.

llvm-svn: 154374

739a0038

Added register unit weights to the target description. · 1d7a2c57

Andrew Trick authored Apr 10, 2012

This is a new algorithm that associates registers with weighted
register units to accurately model their effect on register
pressure. This handles registers with multiple overlapping
subregisters. It is possible, but almost inconceivable, that the
algorithm fails to find an exact solution for a target description. If
an exact solution cannot be found, an inexact, but reasonable solution
will be chosen.

llvm-svn: 154373

1d7a2c57

Fix header comment · 3a6e88dc
Andrew Trick authored Apr 10, 2012
```
llvm-svn: 154372
```
3a6e88dc
Add a constructor for DataRefImpl and remove excess initialization. · 549515e1
Danil Malyshev authored Apr 10, 2012
```
llvm-svn: 154371
```
549515e1

Fix a long standing tail call optimization bug. When a libcall is emitted · f8bad080

Evan Cheng authored Apr 10, 2012

the legalizer always uses the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370

f8bad080

Don't try to zExt just to check if an integer constant is zero, it might · 1d9672bd
Rafael Espindola authored Apr 10, 2012
```
not fit in an i64.

llvm-svn: 154364
```
1d9672bd
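
The pitfall can be sketched outside LLVM. The `WideInt` type below is a hypothetical stand-in for an arbitrary-width integer constant and is not the class touched by the commit, which the message does not name. It only illustrates why converting to a 64-bit value is the wrong way to ask "is this zero?" when the constant may be wider than 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical arbitrary-width unsigned integer, stored as little-endian
// 64-bit words. Assumed for illustration only.
struct WideInt {
  std::vector<uint64_t> words;

  // True when all words above the lowest one are zero.
  bool fitsInU64() const {
    for (std::size_t i = 1; i < words.size(); ++i)
      if (words[i] != 0)
        return false;
    return true;
  }

  // Zero-extending conversion. Only valid when the value fits in 64 bits.
  uint64_t toU64() const {
    assert(fitsInU64() && "value too wide for 64 bits");
    return words.empty() ? 0 : words[0];
  }

  // Width-independent zero test: never needs the value to fit in 64 bits.
  bool isZero() const {
    for (uint64_t w : words)
      if (w != 0)
        return false;
    return true;
  }
};
```

For a value such as 2^64 (words `{0, 1}`), `isZero()` answers correctly, while `toU64() == 0` would trip the assertion.
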
ARM LDR/LDRT has the same encoding collision as STR/STRT. · 8f99bc3a
Jim Grosbach authored Apr 10, 2012
```
Generalized logic of r154141.

llvm-svn: 154362
```
8f99bc3a
Test case for PR12495. · ec96cd06
Lang Hames authored Apr 09, 2012
```
llvm-svn: 154359
```
ec96cd06
Revert the 'EnableInitializing' flag. There is debate on whether we should run... · b5cedde6
Bill Wendling authored Apr 09, 2012
```
Revert the 'EnableInitializing' flag. There is debate on whether we should run that pass by default in LTO.

llvm-svn: 154356
```
b5cedde6

Apply the scope restrictions after parsing the command line options. There may... · 383fda29

Bill Wendling authored Apr 09, 2012

Apply the scope restrictions after parsing the command line options. There may be some which are used in that function.

llvm-svn: 154348

383fda29

Apr 09, 2012

Have TargetLowering::getPICJumpTableRelocBase return a node that points to the · 8483a6c4
Akira Hatanaka authored Apr 09, 2012
```
GOT if jump table uses 64-bit gp-relative relocation.

llvm-svn: 154341
```
8483a6c4

When performing a truncating store, it's possible to rearrange the data · e0e38f61

Chad Rosier authored Apr 09, 2012

in-register, such that we can use a single vector store rather than a
series of scalar stores.

For func_4_8 the generated code

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr

becomes

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr

I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.

This

	ldrh	r0, [r0, #4]
	strh	r0, [r1]

becomes

	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]

PR11158
rdar://10703339

llvm-svn: 154340

e0e38f61
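
The idea behind the first listing pair can be modelled in scalar C++. This is a conceptual sketch only, not the DAG combine itself: the function names are hypothetical, and the byte shuffle stands in for what `vuzp.8` does in-register before the single `vst1.32`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar form: one narrow store per lane, as in the "before" listing.
void truncStoreScalar(const std::array<uint16_t, 4> &lanes, uint8_t *dst) {
  for (std::size_t i = 0; i < lanes.size(); ++i)
    dst[i] = static_cast<uint8_t>(lanes[i]); // keep the low byte of each lane
}

// Packed form: gather the low bytes first, then write them with one
// 4-byte copy, as in the "after" listing.
void truncStorePacked(const std::array<uint16_t, 4> &lanes, uint8_t *dst) {
  uint8_t packed[4];
  for (std::size_t i = 0; i < lanes.size(); ++i)
    packed[i] = static_cast<uint8_t>(lanes[i]);
  std::memcpy(dst, packed, sizeof(packed)); // single store of all four bytes
}
```

Both functions leave the same four bytes at `dst`; only the number of stores differs. The second listing pair in the message shows the cost: when only one lane is needed, the extra shuffling is a pessimization.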

Patch r153892 for PR11861 apparently broke an external project (see PR12493). · 3ad11ff9

Lang Hames authored Apr 09, 2012

This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when
rescheduling instructions in TryInstructionTransform. Hopefully this will fix
PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after
the copy that unties the operands is emitted (this seems to be a more
appropriate fix for that issue anyway).

llvm-svn: 154338

3ad11ff9

Update comments and remove unnecessary isVolatile() check. · 99cbde9e
Chad Rosier authored Apr 09, 2012
```
llvm-svn: 154336
```
99cbde9e
Typo. · 132a9983
Eric Christopher authored Apr 09, 2012
```
llvm-svn: 154329
```
132a9983

Fix accidentally constant conditions found by uncommitted improvements to -Wconstant-conversion. · e6b6fae8

David Blaikie authored Apr 09, 2012

A couple of cases where we were accidentally creating constant conditions by
writing something like "x == a || b" instead of "x == a || x == b". In one case
a conditional followed by an unreachable was used; I transformed this into a
direct assert instead.

llvm-svn: 154324

e6b6fae8
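
A minimal sketch of the mistake described above, using hypothetical function names rather than the code touched by the commit. The buggy form parses as `(x == a) || b`, so any nonzero `b` makes the whole condition true regardless of `x`. When `b` is a compile-time constant, the condition becomes constant, which is what the warning detects.

```cpp
// Buggy: `b` is tested for truthiness on its own, not compared with `x`.
bool matchesEitherBuggy(int x, int a, int b) {
  return x == a || b;
}

// Intended: compare `x` against both candidates.
bool matchesEither(int x, int a, int b) {
  return x == a || x == b;
}
```

For example, with `x = 7`, `a = 1`, `b = 2`, the buggy version returns true while the intended version returns false.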

Pattern match a setcc of boolean value with 0 as a truncate. · 8f62b324
Rafael Espindola authored Apr 09, 2012
```
llvm-svn: 154322
```
8f62b324
This patch adds X86 instruction itineraries, which were missed by the · 2eec3672
Preston Gurd authored Apr 09, 2012
```
original patch to add itineraries, to X86InstrArithmetic.td.

llvm-svn: 154320
```
2eec3672
Clarify that fpaccuracy metadata is giving the compiler permission to use a · f1e1bb21
Duncan Sands authored Apr 09, 2012
```
less accurate method.

llvm-svn: 154319
```
f1e1bb21
Lower some x86 shuffle sequences to the vblend family of instructions. · fb7e2ae5
Nadav Rotem authored Apr 09, 2012
```
llvm-svn: 154313
```
fb7e2ae5
s/lto_codegen_whole_program_optimization/lto_codegen_set_whole_program_optimization/ · deffc42d
Bill Wendling authored Apr 09, 2012
```
llvm-svn: 154312
```
deffc42d
Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type. · b801ca39
Nadav Rotem authored Apr 09, 2012
```
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.

llvm-svn: 154310
```
b801ca39
Remove unnecessary type check when combining and/or/xor of swizzles. Move some... · 9c3da316
Craig Topper authored Apr 09, 2012
```
Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.

llvm-svn: 154309
```
9c3da316
Remove unnecessary 'else' on an 'if' that always returns · e5893f64
Craig Topper authored Apr 09, 2012
```
llvm-svn: 154308
```
e5893f64
Optimize code slightly. No functionality change. · e3ad4834
Craig Topper authored Apr 09, 2012
```
llvm-svn: 154307
```
e3ad4834
Add a hook to turn on the internalize pass through the LTO interface. · 8a49d049
Bill Wendling authored Apr 09, 2012
```
llvm-svn: 154306
```
8a49d049
Replace some explicit checks with asserts for conditions that should never happen. · 5894fe43
Craig Topper authored Apr 09, 2012
```
llvm-svn: 154305
```
5894fe43

Clean up and relax a restriction on the matching of global offsets into · 3779ac10

Chandler Carruth authored Apr 09, 2012

x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

llvm-svn: 154304

3779ac10

Fold 15 tiny test cases into a single file that implements the · 84b83426

Chandler Carruth authored Apr 09, 2012

comprehensive testing of TLS codegen for x86. Convert all of the ones
that were still using grep to use FileCheck. Remove some redundancies
between them.

Perhaps most interestingly, expand the test cases so that they actually
fully list the instruction snippet being tested. TLS operations are
*very* narrowly defined, and so these seem reasonably stable. More
importantly, the existing test cases already were crazy fine grained,
expecting specific registers to be allocated. This just clarifies that
no *other* instructions are expected, and fills in some crucial gaps
that weren't being tested at all.

This will make any subsequent changes to TLS much more clear during
review.

llvm-svn: 154303

84b83426