- Aug 27, 2010
-
-
Owen Anderson authored
llvm-svn: 112286
-
Chris Lattner authored
computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. llvm-svn: 112285
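As a rough illustration (not from the commit itself; the function name and types are invented), this is the kind of pattern the transform targets: the shift is fed by a zext from i16 and then truncated to i8, so the whole computation can be narrowed even though the zext source type and the trunc result type differ.

    unsigned char high_byte(unsigned short x) {
        unsigned int wide = x;              /* zext i16 -> i32        */
        unsigned int shifted = wide >> 8;   /* arithmetic done in i32 */
        return (unsigned char)shifted;      /* trunc i32 -> i8        */
    }

After the transform, the shift can be performed directly on the narrower value instead of the zext-widened one.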
-
Dan Gohman authored
to DenseMap. llvm-svn: 112281
-
Chris Lattner authored
by the SRoA "promote to large integer" code, eliminating some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening, now clang is able to compile:

   struct S { float A, B, C, D; };
   float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

   _foo:                                   ## @foo
   ## BB#0:                                ## %entry
        pshufd  $1, %xmm0, %xmm2
        addss   %xmm0, %xmm2
        movdqa  %xmm1, %xmm3
        addss   %xmm2, %xmm3
        pshufd  $1, %xmm1, %xmm0
        addss   %xmm3, %xmm0
        ret

on x86-64, instead of:

   _foo:                                   ## @foo
   ## BB#0:                                ## %entry
        movd    %xmm0, %rax
        shrq    $32, %rax
        movd    %eax, %xmm2
        addss   %xmm0, %xmm2
        movapd  %xmm1, %xmm3
        addss   %xmm2, %xmm3
        movd    %xmm1, %rax
        shrq    $32, %rax
        movd    %eax, %xmm0
        addss   %xmm3, %xmm0
        ret

This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC.

llvm-svn: 112278
-
Bob Wilson authored
Update all the tests using those intrinsics and add support for auto-upgrading bitcode files with the old versions of the intrinsics. llvm-svn: 112271
-
Owen Anderson authored
Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652. llvm-svn: 112270
-
Dan Gohman authored
return to avoid needing two calls to test for equivalence, and sort addrecs by their degree before examining their operands. llvm-svn: 112267
-
Anton Korobeynikov authored
value should be copied to the corresponding shadow reg as well. Patch by Cameron Esfahani! llvm-svn: 112262
-
Benjamin Kramer authored
llvm-svn: 112260
-
Benjamin Kramer authored
llvm-svn: 112259
-
Daniel Dunbar authored
X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler. llvm-svn: 112250
-
Devang Patel authored
llvm-svn: 112242
-
Jim Grosbach authored
to try to re-use scavenged frame index reference registers. rdar://8277890 llvm-svn: 112241
-
Devang Patel authored
llvm-svn: 112238
-
Jim Grosbach authored
virtual base registers handle this function, and more. A bit more cleanup to do on the interface to eliminateFrameIndex() after this. llvm-svn: 112237
-
Chris Lattner authored
by SRoA. This is part of rdar://7892780, but needs another xform to expose this. llvm-svn: 112232
-
- Aug 26, 2010
-
-
Jim Grosbach authored
llvm-svn: 112228
-
Chris Lattner authored
is a vector to be a vector element extraction. This allows clang to compile:

   struct S { float A, B, C, D; };
   float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

   _foo:                                   ## @foo
   ## BB#0:                                ## %entry
        movd    %xmm0, %rax
        shrq    $32, %rax
        movd    %eax, %xmm2
        addss   %xmm0, %xmm2
        movapd  %xmm1, %xmm3
        addss   %xmm2, %xmm3
        movd    %xmm1, %rax
        shrq    $32, %rax
        movd    %eax, %xmm0
        addss   %xmm3, %xmm0
        ret

instead of:

   _foo:                                   ## @foo
   ## BB#0:                                ## %entry
        movd    %xmm0, %rax
        movd    %eax, %xmm0
        shrq    $32, %rax
        movd    %eax, %xmm2
        addss   %xmm0, %xmm2
        movd    %xmm1, %rax
        movd    %eax, %xmm1
        addss   %xmm2, %xmm1
        shrq    $32, %rax
        movd    %eax, %xmm0
        addss   %xmm1, %xmm0
        ret

... eliminating half of the horribleness.

llvm-svn: 112227
-
Jim Grosbach authored
still having a significant effect. It shouldn't be now that the pre-RA virtual base reg stuff is in. Assuming that's validated by the nightly testers, we can simplify a lot of the PEI frame index code. llvm-svn: 112220
-
Bruno Cardoso Lopes authored
llvm-svn: 112218
-
Devang Patel authored
llvm-svn: 112216
-
Devang Patel authored
llvm-svn: 112215
-
Devang Patel authored
llvm-svn: 112213
-
Bob Wilson authored
llvm-svn: 112208
-
Devang Patel authored
Do not forget to resolve dangling debug info in the case where the virtual register used for a value is initialized after a dbg intrinsic is seen. llvm-svn: 112207
-
Bill Wendling authored
llvm-svn: 112206
-
Benjamin Kramer authored
llvm-svn: 112203
-
Bob Wilson authored
llvm-svn: 112202
-
Owen Anderson authored
llvm-svn: 112198
-
Benjamin Kramer authored
llvm-svn: 112197
-
Jim Grosbach authored
encodable as a 16-bit wide instruction. llvm-svn: 112195
-
Dan Gohman authored
llvm-svn: 112191
-
Dan Gohman authored
fix: add a flag to MapValue and friends which indicates whether any module-level mappings are being made. In the common case of inlining, no module-level mappings are needed, so MapValue doesn't need to examine non-function-local metadata, which can be very expensive in the case of a large module with really deep metadata (e.g. a large C++ program compiled with -g). This flag is a little awkward; perhaps eventually it can be moved into the ClonedCodeInfo class. llvm-svn: 112190
-
Benjamin Kramer authored
llvm-svn: 112189
-
Benjamin Kramer authored
Do unsigned char comparisons in StringRef::compare_lower to be more consistent with compare in corner cases. llvm-svn: 112185
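A minimal sketch of the issue, assuming a simplified stand-in for the real code (compare_lower_sketch is hypothetical, not the LLVM implementation): with plain (signed) char, bytes >= 0x80 compare as negative and sort before ASCII letters, which disagrees with the byte-wise unsigned ordering used by compare.

    #include <ctype.h>

    static int compare_lower_sketch(const char *a, const char *b, unsigned n) {
        for (unsigned i = 0; i < n; ++i) {
            /* force unsigned char so bytes >= 0x80 sort after ASCII, */
            /* matching the ordering StringRef::compare uses          */
            unsigned char la = (unsigned char)tolower((unsigned char)a[i]);
            unsigned char lb = (unsigned char)tolower((unsigned char)b[i]);
            if (la != lb)
                return la < lb ? -1 : 1;
        }
        return 0;
    }

With a signed char comparison, a byte like 0xC3 would sort before 'a'; with unsigned char it sorts after, consistent with compare.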
-
Bill Wendling authored
comparison with 0. These two pieces of code should give identical results:

  rsbs r1, r1, 0
  cmp  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

and:

  cmn  r0, r1
  mov  r0, #0
  it   ls
  mov  r0, #1

However, the CMN gives the *opposite* result when r1 is 0. This is because the carry flag is set in the CMP case but not in the CMN case. In short, the CMP instruction doesn't perform a truncate of the (logical) NOT of 0 plus the value of r0 and the carry bit (because the "carry bit" parameter to AddWithCarry is defined as 1 in this case, the carry flag will always be set when r0 >= 0). The CMN instruction doesn't perform a NOT of 0, so there is never a "carry" when this AddWithCarry is performed (because the "carry bit" parameter to AddWithCarry is defined as 0).

The AddWithCarry in the CMP case seems to be relying upon the identity:

  ~x + 1 = -x

However, when x is 0 and unsigned, this doesn't hold:

   x = 0
  ~x = 0xFFFF FFFF
  ~x + 1 = 0x1 0000 0000
  (-x = 0) != (0x1 0000 0000 = ~x + 1)

Therefore, we should disable *all* versions of CMN, especially when comparing against zero, until we can limit when the CMN instruction is used (when we know that the RHS is not 0) or when we have a hardware fix for this. (See the ARM docs for the "AddWithCarry" pseudo-code.)

This is related to <rdar://problem/7569620>.

llvm-svn: 112176
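To make the carry-flag difference concrete, here is a small C model of the AddWithCarry pseudo-code (a sketch based on the description above, not the actual backend code): for r1 == 0, CMP r0, (0 - r1) yields carry = 1 while CMN r0, r1 yields carry = 0, so a following "it ls" (taken when C is clear or Z is set) selects opposite results.

    #include <stdint.h>
    #include <stdio.h>

    /* Returns the carry-out; *result gets the low 32 bits of x + y + carry_in. */
    static int add_with_carry(uint32_t x, uint32_t y, int carry_in, uint32_t *result) {
        uint64_t sum = (uint64_t)x + (uint64_t)y + (uint64_t)carry_in;
        *result = (uint32_t)sum;
        return (int)(sum >> 32);
    }

    int main(void) {
        uint32_t r0 = 5, r1 = 0, res;
        /* CMP r0, (0 - r1) is AddWithCarry(r0, ~(0 - r1), 1) */
        int carry_cmp = add_with_carry(r0, ~(uint32_t)(0u - r1), 1, &res);
        /* CMN r0, r1 is AddWithCarry(r0, r1, 0) */
        int carry_cmn = add_with_carry(r0, r1, 0, &res);
        printf("CMP carry = %d, CMN carry = %d\n", carry_cmp, carry_cmn); /* prints 1 vs 0 */
        return 0;
    }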
-
Chris Lattner authored
llvm-svn: 112175
-
Chris Lattner authored
llvm-svn: 112171
-
Bob Wilson authored
llvm-svn: 112170
-
Chris Lattner authored
apparently try to support. llvm-svn: 112168
-