Commits · 13ce07fa928ba20d6369195dd62bc3599172bf16 · Roger Ferrer / llvm-epi-0.8

Aug 28, 2010

Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like · 13ce07fa

Bob Wilson authored Aug 27, 2010

all the other LDM/STM instructions. This fixes asm printer crashes when
compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier. Much of the backend
was not aware of these special cases. The crashes occurred when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode. I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON. Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322

13ce07fa

tidy up test. · 954e9557
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112321
```
954e9557
no really, fix the test. · b8b7d526
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112317
```
b8b7d526
fix this test. It's not clear what it's really testing. · c8908b4c
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112316
```
c8908b4c

Enhance the shift propagator to handle the case when you have: · 6c1395f6

Chris Lattner authored Aug 27, 2010

A = shl x, 42
...
B = lshr ..., 38

which can be transformed into:
A = shl x, 4
...

iff we can prove that the would-be-shifted-in bits
are already zero.  This eliminates two shifts in the testcase
and allows elimination of the whole i128 chain in the real example.

llvm-svn: 112314

6c1395f6
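
A minimal Python sketch of the arithmetic behind this fold, assuming 64-bit values and the simplest case, where the `lshr` is applied directly to the `shl` result. The commit also handles intermediate logical operations (the `...` above), which are not modeled here. All helper names are hypothetical and are not taken from the LLVM sources.

```python
MASK64 = (1 << 64) - 1  # model i64 arithmetic (assumed width)

def shl(x, n):
    return (x << n) & MASK64

def lshr(x, n):
    return (x & MASK64) >> n

def original(x):
    # A = shl x, 42 ; B = lshr A, 38
    return lshr(shl(x, 42), 38)

def folded(x):
    # A = shl x, 4
    return shl(x, 4)

def fold_is_safe(x):
    # The lshr by 38 would shift zeros into the top 38 bits, so the
    # single shl is equivalent only when those bits are already zero.
    return folded(x) >> (64 - 38) == 0
```

For instance, `fold_is_safe(0x3FFFFF)` holds and both forms agree, while `1 << 22` sets a bit that the original sequence would have cleared, so the fold must be rejected there.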

Simplify. · f2855b14
Devang Patel authored Aug 27, 2010
```
llvm-svn: 112305
```
f2855b14

Implement a pretty general logical shift propagation · 18d7fc8f

Chris Lattner authored Aug 27, 2010

framework, which is good at ripping through bitfield
operations.  This generalizes a bunch of the existing
xforms that instcombine does, such as 
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304

18d7fc8f
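
The base identity that this framework generalizes, `(x << c) >> c -> and`, can be checked with a short Python sketch. It assumes a fixed bit width (32 here, chosen only for illustration) and a logical right shift; the function names are hypothetical.

```python
def shl_lshr(x, c, bits=32):
    # (x << c) >> c with wraparound at the assumed bit width.
    mask = (1 << bits) - 1
    return ((x << c) & mask) >> c

def as_and(x, c, bits=32):
    # Shifting left then logically right by c clears the top c bits,
    # which is the same as masking with the low (bits - c) bits set.
    return x & ((1 << (bits - c)) - 1)
```

Both functions agree for every `x` in the range of the chosen width and every `c` from 0 to `bits - 1`.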

Aug 27, 2010

Fix a comment typo. · aaff8f53
Bob Wilson authored Aug 27, 2010
```
llvm-svn: 112302
```
aaff8f53
Unsigned value cannot be < 0. · af371b49
Bob Wilson authored Aug 27, 2010
```
llvm-svn: 112300
```
af371b49
When merging adjacent operands, scan ahead and merge all equal · 15871f23
Dan Gohman authored Aug 27, 2010
```
adjacent operands at once, instead of just two at a time.

llvm-svn: 112299
```
15871f23
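
A minimal sketch of the scan-ahead idea, assuming the operands are already ordered so that equal ones sit next to each other. It collapses each run of equal operands into a single `(count, operand)` pair in one step instead of merging two at a time. The function name and the pair representation are hypothetical.

```python
def merge_adjacent(operands):
    merged = []
    i = 0
    while i < len(operands):
        j = i + 1
        # Scan ahead past every operand equal to operands[i].
        while j < len(operands) and operands[j] == operands[i]:
            j += 1
        # Record the whole run once, with its length as a multiplier.
        merged.append((j - i, operands[i]))
        i = j
    return merged
```

With this shape, `x + x + x` becomes one `(3, "x")` entry directly, rather than going through an intermediate `2*x + x`.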
Fix a couple of typos. · ec5030b2
Eric Christopher authored Aug 27, 2010
```
Patch by Cameron Esfahani!

llvm-svn: 112297
```
ec5030b2
remove some special shift cases that have been subsumed into the · 25a198e7
Chris Lattner authored Aug 27, 2010
```
more general simplify demanded bits logic.

llvm-svn: 112291
```
25a198e7

Make the {A,+,B}<L> + {C,+,D}<L> --> Other + {A+C,+,B+D}<L> · c866bf4f

Dan Gohman authored Aug 27, 2010

transformation collect all the addrecs with the same loop
and combine them at once rather than starting everything over
at the first chance.

llvm-svn: 112290

c866bf4f
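
A rough Python sketch of the identity in the title, modeling an affine addrec `{A,+,B}<L>` as a `(start, step, loop)` triple whose value at iteration `n` is `start + step * n`. It gathers every addrec that shares a loop and sums them in one pass. The representation and function names are assumptions for illustration only.

```python
from collections import defaultdict

def combine_addrecs(addrecs):
    """addrecs: list of (start, step, loop) triples modeling {start,+,step}<loop>."""
    sums = defaultdict(lambda: [0, 0])
    for start, step, loop in addrecs:
        # Collect all addrecs of the same loop before emitting anything.
        sums[loop][0] += start
        sums[loop][1] += step
    return [(s, t, loop) for loop, (s, t) in sums.items()]

def evaluate(addrec, n):
    start, step, _loop = addrec
    return start + step * n
```

Summing the originals at any iteration gives the same value as evaluating the combined addrec, since `(A + B*n) + (C + D*n) = (A+C) + (B+D)*n`.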

merge and filecheckize test · 606b76eb
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112289
```
606b76eb
merge two tests · c665156e
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112288
```
c665156e
Remove now unneeded command line flag that enables 'optimize compares.' · 6628431a
Bill Wendling authored Aug 27, 2010
```
llvm-svn: 112287
```
6628431a
Fix typos in comments. · 99d4cb86
Owen Anderson authored Aug 27, 2010
```
llvm-svn: 112286
```
99d4cb86

teach the truncation optimization that an entire chain of · 73984346

Chris Lattner authored Aug 27, 2010

computation can be truncated if it is fed by a sext/zext whose source type
doesn't have to be exactly equal to the truncation result type.

llvm-svn: 112285

73984346

Switch ScalarEvolution's main Value*->SCEV* map from std::map · 9bad2fb3
Dan Gohman authored Aug 27, 2010
```
to DenseMap.

llvm-svn: 112281
```
9bad2fb3
get this test passing on linux builders. · 7413e87b
Chris Lattner authored Aug 27, 2010
```
llvm-svn: 112280
```
7413e87b

Add an instcombine to clean up a common pattern produced · 90cd746e

Chris Lattner authored Aug 27, 2010

by the SRoA "promote to large integer" code, eliminating
some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms; clang is now able to compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm2
	addss	%xmm0, %xmm2
	movdqa	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	pshufd	$1, %xmm1, %xmm0
	addss	%xmm3, %xmm0
	ret

on x86-64, instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

This seems pretty close to optimal to me, at least without
using horizontal adds.  This also triggers in lots of other
code, including SPEC.

llvm-svn: 112278

90cd746e
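
The commit message does not show the resulting IR, so the following Python sketch only checks the underlying observation: for the `zext i16` / `lshr 8` / `trunc i8` sequence above, doing the shift in the original 16-bit width gives the same byte as widening to 32 bits first. The function names are hypothetical.

```python
def via_i32(x16):
    z = x16 & 0xFFFF             # zext i16 -> i32 (high 16 bits are zero)
    s = (z & 0xFFFFFFFF) >> 8    # lshr i32 %z, 8
    return s & 0xFF              # trunc i32 -> i8

def via_i16(x16):
    s = (x16 & 0xFFFF) >> 8      # lshr i16 %x, 8
    return s & 0xFF              # trunc i16 -> i8
```

Because the zero-extended high bits never reach the truncated result, the two agree for all 65536 possible inputs.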

Add alignment arguments to all the NEON load/store intrinsics. · edf722ad

Bob Wilson authored Aug 27, 2010

Update all the tests using those intrinsics and add support for
auto-upgrading bitcode files with the old versions of the intrinsics.

llvm-svn: 112271

edf722ad

Use LVI to eliminate conditional branches where we've tested a related... · 6ebbd923

Owen Anderson authored Aug 27, 2010

Use LVI to eliminate conditional branches where we've tested a related condition previously.  Update tests for this change.
This fixes PR5652.

llvm-svn: 112270

6ebbd923

Optimize SCEVComplexityCompare. Use a 3-way return instead of a 2-way · 2706567c

Dan Gohman authored Aug 27, 2010

return to avoid needing two calls to test for equivalence, and sort
addrecs by their degree before examining their operands.

llvm-svn: 112267

2706567c
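
A small Python sketch of the two ideas described here, under simplifying assumptions: expressions are modeled as tuples of integers, and the tuple length stands in for an addrec's degree. None of these names come from the LLVM sources.

```python
from functools import cmp_to_key

def less_than(a, b):
    # 2-way comparison: only answers "does a sort before b?"
    return a < b

def equivalent_2way(a, b):
    # Two calls are needed to establish equivalence.
    return not less_than(a, b) and not less_than(b, a)

def compare(a, b):
    # 3-way comparison: negative, zero, or positive in a single call.
    return (a > b) - (a < b)

def compare_addrecs(a, b):
    # Check the cheap property (length, standing in for degree) first.
    if len(a) != len(b):
        return len(a) - len(b)
    # Only then walk the operands.
    for x, y in zip(a, b):
        c = compare(x, y)
        if c != 0:
            return c
    return 0
```

A comparator like this can be passed to a sort via `cmp_to_key(compare_addrecs)`, and a zero result answers the equivalence question without a second call.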

Clarify a comment. · ca158413
Dan Gohman authored Aug 27, 2010
```
llvm-svn: 112266
```
ca158413
Parse " (Hidden)" and cope with it. · fa4a4705
Dan Gohman authored Aug 27, 2010
```
llvm-svn: 112265
```
fa4a4705
Default to looking for clang++ in the PATH, rather than trying to · 7e64c985
Dan Gohman authored Aug 27, 2010
```
guess a path that will work.

llvm-svn: 112264
```
7e64c985
Properly handle passing of FP stuff to varargs function on Win64: · c0b36921
Anton Korobeynikov authored Aug 27, 2010
```
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
```
c0b36921
MCELF: Port EmitInstruction changes from MachO streamer. Patch by Roman Divacky. · 1f601247
Benjamin Kramer authored Aug 27, 2010
```
llvm-svn: 112260
```
1f601247
MCELF: Always overwrite FixedValue. · 05e22982
Benjamin Kramer authored Aug 27, 2010
```
llvm-svn: 112259
```
05e22982

Fix the msvs 2010 build. · 788a6079

Michael J. Spencer authored Aug 27, 2010

The Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01
implements parts of C++0x based on the draft standard. An old version of
the draft had a bug that makes std::pair<T1*, T2*>(something, 0) fail to
compile. This is because the template<class U, class V> pair(U&& x, V&& y)
constructor is selected, even though it later fails to implicitly convert
U and V to first_type and second_type.

This has been fixed in n3090, but it seems that Microsoft is not going to
update msvc.

llvm-svn: 112257

788a6079

X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to... · 1844a71e

Daniel Dunbar authored Aug 27, 2010

X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler.

llvm-svn: 112250

1844a71e

Revert r112213. It is not needed. · b12ff599
Devang Patel authored Aug 26, 2010
```
llvm-svn: 112242
```
b12ff599
Simplify eliminateFrameIndex() interface back down now that PEI doesn't need · 6a770669
Jim Grosbach authored Aug 26, 2010
```
to try to re-use scavenged frame index reference registers. rdar://8277890

llvm-svn: 112241
```
6a770669
If node is not available then use FuncInfo.ValueMap to emit debug info for byval parameter. · ea134f56
Devang Patel authored Aug 26, 2010
```
llvm-svn: 112238
```
ea134f56

Remove the now obsolete frame index virtual re-use algorithm from PEI. Pre-RA · 2a1915d0

Jim Grosbach authored Aug 26, 2010

virtual base registers handle this function, and more. A bit more cleanup
to do on the interface to eliminateFrameIndex() after this.

llvm-svn: 112237

2a1915d0

filecheckize · c188b96b
Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112235
```
c188b96b
rename test. · 387d6bcd
Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112234
```
387d6bcd
optimize "integer extraction out of the middle of a vector" as produced · bfd22281
Chris Lattner authored Aug 26, 2010
```
by SRoA.  This is part of rdar://7892780, but needs another xform to
expose this.

llvm-svn: 112232
```
bfd22281

Aug 26, 2010
tidy up a bit. no functional change. · e82d5b4a
Jim Grosbach authored Aug 26, 2010
```
llvm-svn: 112228
```
e82d5b4a