Commits · 53c5ddf0d7a9a37b6a73d078d42ecaeb8b76427f · Roger Ferrer / llvm-epi-0.8

Mar 19, 2013

[ms-inline asm] Move the size directive asm rewrite into the target specific · 120eefd1
Chad Rosier authored Mar 19, 2013
```
logic as a QOI cleanup.
rdar://13445327

llvm-svn: 177413
```
120eefd1
The Linker interface has some dead code after the cleanup in r172749 · 0f7fd36f
Eli Bendersky authored Mar 19, 2013
```
(and possibly others). The attached patch removes it, and tries to
update comments accordingly.

llvm-svn: 177406
```
0f7fd36f

Cleanup PPC64 unaligned i64 load/store · 66814863

Hal Finkel authored Mar 19, 2013

Remove an accidentally-added instruction definition and add a comment in the
test case. This is in response to a post-commit review by Bill Schmidt.

No functionality change intended.

llvm-svn: 177404

66814863

The testing to ensure a vector of zeros of type floating point isn't... · 298e4192

David Tweed authored Mar 19, 2013

The testing to ensure a vector of zeros of type floating point isn't misclassified as negative zero can be simplified, as pointed out by Duncan Sands.

llvm-svn: 177386

298e4192

Improve long vector sext/zext lowering on ARM · 227eb6fc

Renato Golin authored Mar 19, 2013

The ARM backend currently has poor codegen for long sext/zext
operations, such as v8i8 -> v8i32. This patch addresses this
by performing a custom expansion in ARMISelLowering. It also
adds/changes the cost of such lowering in ARMTTI.

This partially addresses PR14867.

Patch by Pete Couperus

llvm-svn: 177380

227eb6fc

Don't reserve R31 on PPC64 unless the frame pointer is needed · d9e10d51
Hal Finkel authored Mar 19, 2013
```
llvm-svn: 177379
```
d9e10d51

Revert "Cleanup some SCEV logic a bit." · f3a2544d

Andrew Trick authored Mar 19, 2013

This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32.

Just add a comment instead!

llvm-svn: 177377

f3a2544d

Cleanup some SCEV logic a bit. · de788665
Andrew Trick authored Mar 19, 2013
```
Make the code more obvious to scan-build and humans.

llvm-svn: 177375
```
de788665
Tighten up an internal LSR API that should check for NULL. · a1c01ba8
Andrew Trick authored Mar 19, 2013
```
No test case, but should fix a scan-build warning.

llvm-svn: 177374
```
a1c01ba8
Emit the linkage name instead of the function name, when available. This means · d6718633
Nick Lewycky authored Mar 19, 2013
```
that we'll prefer to emit the mangled C++ name (pending a clang change).

llvm-svn: 177371
```
d6718633

Fix a sign-extension bug in PPCCTRLoops · fc9aad64

Hal Finkel authored Mar 18, 2013

Don't sign extend the immediate value from the OR instruction in
an LIS/OR pair.

llvm-svn: 177361

fc9aad64
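
The sign-extension problem described in fc9aad64 can be illustrated with a small model of constant materialization. This is only a sketch: the function names are hypothetical and are not taken from PPCCTRLoops. It assumes the usual PowerPC semantics, where `lis` places a sign-extended 16-bit immediate in the upper halfword and `ori` ORs in a zero-extended 16-bit immediate.

```cpp
#include <cstdint>

// Hypothetical model of an LIS/ORI pair: the OR immediate is treated as an
// unsigned 16-bit value, which matches how `ori` behaves.
constexpr int64_t combineLisOri(int16_t hi, uint16_t lo) {
  return (static_cast<int64_t>(hi) * 65536) | static_cast<int64_t>(lo);
}

// Buggy variant for comparison: the OR immediate is sign-extended first,
// so any low halfword with bit 15 set corrupts the upper bits.
constexpr int64_t combineLisOriSignExtended(int16_t hi, uint16_t lo) {
  return (static_cast<int64_t>(hi) * 65536) |
         static_cast<int64_t>(static_cast<int16_t>(lo));
}
```

With `hi = 1` and `lo = 0x8000`, the first function yields 0x18000, while the sign-extending variant yields -32768.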

Move #include of BitVector from .h to .cpp file. · b6970267
Jakub Staszak authored Mar 18, 2013
```
Also remove unneeded #include and forward declaration.

llvm-svn: 177357
```
b6970267
Add some constantness. · 26ac8a7b
Jakub Staszak authored Mar 18, 2013
```
llvm-svn: 177356
```
26ac8a7b
Make method private. Keep coding standard. · bc421efd
Jakub Staszak authored Mar 18, 2013
```
llvm-svn: 177348
```
bc421efd
[ms-inline asm] Avoid emitting a redundant sizing directive, if we've already · 2707d534
Chad Rosier authored Mar 18, 2013
```
parsed one.  Test case coming shortly.
rdar://13446980

llvm-svn: 177347
```
2707d534
Change NULL to 0. · a0f3694a
Jakub Staszak authored Mar 18, 2013
```
llvm-svn: 177342
```
a0f3694a

Bill Wendling authored Mar 18, 2013

For each compile unit, we want to register a function that will flush that
compile unit. Otherwise, __gcov_flush() would only flush the counters within the
current compile unit, and not any outside of it.

PR15191 & <rdar://problem/13167507>

llvm-svn: 177340

c3cab816
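
The idea in c3cab816 (one flush function registered per compile unit, so that a single flush call reaches all of them) can be sketched as a small callback registry. All names here are hypothetical and chosen for illustration; this is not the actual profiling runtime interface.

```cpp
#include <vector>

// Hypothetical flush-callback type: each compile unit contributes one.
using FlushFn = void (*)();

// Registry shared by all compile units in the program.
inline std::vector<FlushFn> &flushRegistry() {
  static std::vector<FlushFn> registry;
  return registry;
}

// Called once per compile unit, for example from a static initializer.
inline void registerFlush(FlushFn fn) { flushRegistry().push_back(fn); }

// Flushes every registered compile unit, not only the caller's own.
inline void flushAll() {
  for (FlushFn fn : flushRegistry())
    fn();
}
```

Under this sketch, a global flush entry point would call `flushAll()` rather than writing out only the counters of the compile unit it happens to live in.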

Fix PPC unaligned 64-bit loads and stores · b09680b0

Hal Finkel authored Mar 18, 2013

PPC64 supports unaligned loads and stores of 64-bit values, but
in order to use the r+i forms, the offset must be a multiple of 4.
Unfortunately, this cannot always be determined by examining the
immediate itself because it might be available only via a TOC entry.

In order to get around this issue, we additionally predicate the
selection of the r+i form on the alignment of the load or store
(forcing it to be at least 4 in order to select the r+i form).

llvm-svn: 177338

b09680b0
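
The selection rule described in b09680b0 can be sketched as two predicates. The helper names are hypothetical and are not part of the PPC backend; the sketch only restates the two conditions from the commit message.

```cpp
#include <cstdint>

// Hypothetical check for when the immediate offset is visible: the r+i form
// needs an offset that is a multiple of 4.
constexpr bool offsetIsMultipleOf4(int64_t offset) {
  return offset % 4 == 0;
}

// Hypothetical conservative check for when the offset cannot be inspected
// (for example, it is only available via a TOC entry): require the load or
// store itself to be at least 4-byte aligned before selecting the r+i form.
constexpr bool canSelectRegPlusImm(unsigned alignmentInBytes) {
  return alignmentInBytes >= 4;
}
```

An access with alignment 2 would therefore fall back to the indexed (r+r) form even if its offset later turns out to be a multiple of 4.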

Mar 18, 2013

ARM cost model: Make some vector integer to float casts cheaper · ae0052f1

Arnold Schwaighofer authored Mar 18, 2013

The default logic marks them as too expensive.

For example, before this patch we estimated:
  cost of 16 for instruction:   %r = uitofp <4 x i16> %v0 to <4 x float>

While this translates to:
  vmovl.u16 q8, d16
  vcvt.f32.u32  q8, q8

All other costs are left to the values assigned by the fallback logic. These
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13445992

llvm-svn: 177334

ae0052f1

ARM cost model: Correct cost for some cheap float to integer conversions · 6c9c3a8b

Arnold Schwaighofer authored Mar 18, 2013

Fix cost of some "cheap" cast instructions. Before this patch we used to
estimate for example:
  cost of 16 for instruction:   %r = fptoui <4 x float> %v0 to <4 x i16>

While we would emit:
  vcvt.s32.f32  q8, q8
  vmovn.i32 d16, q8
  vuzp.8  d16, d17

All other costs are left to the values assigned by the fallback logic. These
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.

radar://13434072

llvm-svn: 177333

6c9c3a8b

Extend global merge pass to optionally consider global constant variables. · 8fc34097
Quentin Colombet authored Mar 18, 2013
```
Also add some checks to not merge globals used within landing pad instructions or marked as "used".

llvm-svn: 177331
```
8fc34097

Add SchedRW annotations to most of X86InstrSSE.td. · a5158c8f

Jakob Stoklund Olesen authored Mar 18, 2013

We hitch a ride with the existing OpndItins class that was used to add
instruction itinerary classes in the many multiclasses in this file.

Use the link provided by the X86FoldableSchedWrite.Folded to find the
right SchedWrite for folded loads.

llvm-svn: 177326

a5158c8f

Annotate X86 arithmetic instructions with SchedRW lists. · e2289b78

Jakob Stoklund Olesen authored Mar 18, 2013

This new-style scheduling information is going to replace the
instruction itineraries.

This also serves as a test case for Andy's fix in r177317.

llvm-svn: 177323

e2289b78

Check whether a pointer is non-null (isKnownNonNull) in isKnownNonZero. · 1217112d

Manman Ren authored Mar 18, 2013

This handles the case where we have an inbounds GEP with alloca as the pointer.
This fixes the regression in PR12750 and rdar://13286434.
Note that we can also fix this by handling some GEP cases in isKnownNonNull.

llvm-svn: 177321

1217112d

Fix 80-col. violations in PPCCTRLoops · e8f1cf47
Hal Finkel authored Mar 18, 2013
```
llvm-svn: 177296
```
e8f1cf47

Fix large count and negative constant count handling in PPCCTRLoops · 21f2a43a

Hal Finkel authored Mar 18, 2013

This commit fixes an assert that would occur on loops with large constant counts
(like looping for ((uint32_t) -1) iterations on PPC64). The existing code did
not handle counts that it computed to be negative (asserting instead), but
these can be created with valid inputs.

This bug was discovered by bugpoint while I was attempting to isolate a
completely different problem.

Also, in writing test cases for the negative-count problem, I discovered that
the ori/lis handling was broken (there was a typo which caused the logic that
was supposed to detect these pairs and extract the iteration count to always
fail). This has now also been corrected (and is covered by one of the new test
cases).

llvm-svn: 177295

21f2a43a
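
To see how a valid trip count such as `((uint32_t) -1)` from 21f2a43a can appear negative, consider the same 32-bit pattern read two ways. This is an illustrative sketch with hypothetical names. The commit message does not say exactly where the negative value arose, so the signed reinterpretation below is an assumption about one plausible route.

```cpp
#include <cstdint>

// The loop count from the commit message: 4294967295 iterations.
constexpr uint32_t kTripCount = static_cast<uint32_t>(-1);

// Reads the 32-bit pattern as a signed value before widening.
inline int64_t countAsSigned32(uint32_t n) {
  return static_cast<int32_t>(n);
}

// Widens the count as an unsigned value, preserving its magnitude.
inline int64_t countAsUnsigned(uint32_t n) {
  return static_cast<int64_t>(n);
}
```

Here `countAsSigned32(kTripCount)` is -1 while `countAsUnsigned(kTripCount)` is 4294967295, so code that asserts on a non-negative count can fire on a perfectly valid loop.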

Cleanup initial-value constants in PPCCTRLoops · 12337e4e

Hal Finkel authored Mar 18, 2013

Because the initial-value constants had not been added to the list
of instructions considered for DCE the resulting code had redundant
constant-materialization instructions.

llvm-svn: 177294

12337e4e

Fix integer comparison in DIEInteger::BestForm. · 7504cefa

Hans Wennborg authored Mar 18, 2013

The always-true "(int)Int == (signed)Int" comparison was found
while experimenting with a potential new Clang warning.

llvm-svn: 177290

7504cefa
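
The always-true comparison mentioned in 7504cefa can be reproduced in isolation. The function names below are hypothetical, and the second function shows only one possible repair; it is an assumption and not necessarily what the commit changed in `DIEInteger::BestForm`.

```cpp
#include <cstdint>

// Mirrors the comparison quoted in the commit message. `signed` is another
// spelling of `int`, so both sides perform the same conversion and the
// result is true for every input.
inline bool fitsIn32Tautology(uint64_t v) {
  return (int)v == (signed)v;
}

// One possible repair (an assumption): compare the truncated value against
// the full-width signed value, so out-of-range inputs are rejected.
inline bool fitsIn32(uint64_t v) {
  return (int64_t)(int32_t)v == (int64_t)v;
}
```

For `1ULL << 40` the first check still reports that the value fits in 32 bits, while the second does not.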

The optimization a + (-0.0f) -> a was being misapplied to a + (+0.0f) in the vector case (because · 5493feed

David Tweed authored Mar 18, 2013

we weren't differentiating floating-point zeroinitializers from other zero-initializers)
which was causing problems for code relying upon a + (+0.0f) to, eg, flush denormals to
0. Make the scalar and vector cases have the same behaviour.

llvm-svn: 177279

5493feed
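
The distinction in 5493feed between adding `-0.0f` and adding `+0.0f` follows from IEEE 754 signed-zero arithmetic, and a scalar check makes it concrete. The helper name is hypothetical, and the sketch assumes default floating-point semantics (round-to-nearest, no fast-math flags).

```cpp
#include <cmath>

// Returns true when x + addend reproduces x exactly, including the sign of
// a zero result. Only then is folding "x + addend -> x" safe.
inline bool addIsIdentity(float x, float addend) {
  float sum = x + addend;
  return sum == x && std::signbit(sum) == std::signbit(x);
}
```

Adding `-0.0f` preserves every input, including `-0.0f` itself, whereas `-0.0f + (+0.0f)` produces `+0.0f`, so the fold is not valid for a positive-zero addend.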

R600/SI: implement indirect addressing for SI · 2989ffca

Christian Konig authored Mar 18, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177277

2989ffca

R600/SI: add float vector types · 4a1b9c3b

Christian Konig authored Mar 18, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177276

4a1b9c3b

R600/SI: add shl pattern · 082a14a8

Christian Konig authored Mar 18, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177275

082a14a8

R600/SI: add BUFFER_LOAD_DWORD pattern · 7a14a47e

Christian Konig authored Mar 18, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177274

7a14a47e

R600/SI: implement SI.load.const intrinsic · 49374087

Christian Konig authored Mar 18, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177273

49374087

R600/SI: enable all S_LOAD and S_BUFFER_LOAD opcodes · 9c7afd11

Christian Konig authored Mar 18, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177272

9c7afd11

R600/SI: fix inserting waits for all defines · f1fd5fad

Christian Konig authored Mar 18, 2013



Unfortunately the previous fix for inserting waits for unordered
defines wasn't sufficient, because it's possible that even ordered
defines are only partially used (or not used at all).

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 177271

f1fd5fad

[asan] when creating string constants, set unnamed_addr and align 1 so that... · 10cc12f2

Kostya Serebryany authored Mar 18, 2013

[asan] when creating string constants, set unnamed_addr and align 1 so that equal strings are merged by the linker. Observed up to 1% binary size reduction. Thanks to Anton Korobeynikov for the suggestion.

llvm-svn: 177264

10cc12f2

Mark internal classes as POD-like to get better behavior out of · f74654d2
Chandler Carruth authored Mar 18, 2013
```
SmallVector and DenseMap.

This speeds up SROA by 25% on PR15412.

llvm-svn: 177259
```
f74654d2

TLS support for MinGW targets. · 3e7005f1

Anton Korobeynikov authored Mar 18, 2013

MinGW is almost completely compatible with MSVC, with the exception of the _tls_array global not being available.

Patch by David Nadlinger!

llvm-svn: 177257

3e7005f1

Windows TLS: Section name prefix to ensure correct order · 2810a0ab

Anton Korobeynikov authored Mar 18, 2013

The linker sorts the .tls$<xyz> sections by name, and we need
to make sure any extra sections we produce (e.g. for weak globals) 
always end up between .tls$AAA and .tls$ZZZ, even if the name 
starts with e.g. an underscore.

Patch by David Nadlinger!

llvm-svn: 177256

2810a0ab
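
The ordering concern in 2810a0ab can be demonstrated with a plain byte-wise sort of section names. This is a sketch only: the helper name and the `M` prefix are hypothetical, since the commit message does not state which prefix was chosen, and it assumes the linker's sort by name behaves like an ASCII comparison.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts section names the way a byte-wise comparison by name would.
inline std::vector<std::string> sortedSections(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  return names;
}
```

Because `_` (0x5F) sorts after `Z` (0x5A) in ASCII, a section named `.tls$_foo` lands after `.tls$ZZZ`. With a prefix that sorts between `A` and `Z`, such as the hypothetical `.tls$M_foo`, it stays between the two markers.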