- Aug 28, 2010
-
NAKAMURA Takumi authored
This is a test for git svn dcommit. llvm-svn: 112389
-
Chris Lattner authored
times. This patch causes llc and llvm-mc (which both default to verbose-asm) to print out comments after a few common shuffle instructions which indicate the shuffle mask, e.g.:

    insertps $113, %xmm3, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm3[1]
    unpcklps %xmm1, %xmm0       ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
    pshufd $1, %xmm1, %xmm1     ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the shuffle mask) separate from the printing logic. I plan to move the extraction part out somewhere else at some point for other parts of the x86 backend that want to introspect on the behavior of shuffles.

llvm-svn: 112387
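The printed mask can be recovered mechanically from the instruction's immediate. As a sketch of the extraction side for insertps (decodeInsertPS is a hypothetical name; the bit layout is insertps's documented imm8 encoding):

    #include <cstdio>

    // insertps imm8: bits [7:6] pick the source element, bits [5:4] the
    // destination element, bits [3:0] zero individual lanes of the result.
    void decodeInsertPS(unsigned Imm) {
      unsigned SrcElt = (Imm >> 6) & 3;
      unsigned DstElt = (Imm >> 4) & 3;
      unsigned ZMask = Imm & 0xF;
      for (unsigned Lane = 0; Lane != 4; ++Lane) {
        if (ZMask & (1u << Lane))
          printf("lane %u = zero\n", Lane);
        else if (Lane == DstElt)
          printf("lane %u = src[%u]\n", Lane, SrcElt);
        else
          printf("lane %u = dst[%u]\n", Lane, Lane);
      }
    }

Feeding it 113 (0x71) reproduces the first comment above: lane 0 is zeroed, lanes 1 and 2 come from the destination register, and lane 3 takes element 1 of the source.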
-
Chris Lattner authored
llvm-svn: 112386
-
Chris Lattner authored
llvm-svn: 112385
-
NAKAMURA Takumi authored
And this is my first test commit. llvm-svn: 112384
-
Dan Gohman authored
llvm-svn: 112382
-
Dan Gohman authored
llvm-svn: 112381
-
Chris Lattner authored
insertp[sd] $0, which is a noop. Before:

    _f32: ## @f32
        pshufd $1, %xmm1, %xmm2
        pshufd $1, %xmm0, %xmm3
        addss %xmm2, %xmm3
        addss %xmm1, %xmm0
        ## kill: XMM0<def> XMM0<kill> XMM0<def>
        insertps $0, %xmm0, %xmm0
        insertps $16, %xmm3, %xmm0
        ret

After:

    _f32: ## @f32
        movdqa %xmm0, %xmm2
        addss %xmm1, %xmm2
        pshufd $1, %xmm1, %xmm1
        pshufd $1, %xmm0, %xmm3
        addss %xmm1, %xmm3
        movdqa %xmm2, %xmm0
        insertps $16, %xmm3, %xmm0
        ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
-
Chris Lattner authored
when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4-element vector are defined. For example, on:

    _Complex float f32(_Complex float A, _Complex float B) {
      return A+B;
    }

We used to produce (with SSE2, SSE4.1+ uses insertps):

    _f32: ## @f32
        movdqa %xmm0, %xmm2
        addss %xmm1, %xmm2
        pshufd $16, %xmm2, %xmm2
        pshufd $1, %xmm1, %xmm1
        pshufd $1, %xmm0, %xmm0
        addss %xmm1, %xmm0
        pshufd $16, %xmm0, %xmm1
        movdqa %xmm2, %xmm0
        unpcklps %xmm1, %xmm0
        ret

We now produce:

    _f32: ## @f32
        movdqa %xmm0, %xmm2
        addss %xmm1, %xmm2
        pshufd $1, %xmm1, %xmm1
        pshufd $1, %xmm0, %xmm3
        addss %xmm1, %xmm3
        movaps %xmm2, %xmm0
        unpcklps %xmm3, %xmm0
        ret

This implements rdar://8368414

llvm-svn: 112378
-
Chris Lattner authored
a new EltStride variable instead of reusing the NumElems variable for a non-obvious purpose. No functionality change. llvm-svn: 112377
-
Michael J. Spencer authored
According to the Microsoft documentation here: http://msdn.microsoft.com/en-us/library/ms724284%28VS.85%29.aspx this cast used in lib/System/Win32/Path.inc:

    __int64 ft = *reinterpret_cast<__int64*>(&fi.ftLastWriteTime);

should not be done. The documentation says: "Do not cast a pointer to a FILETIME structure to either a ULARGE_INTEGER* or __int64* value because it can cause alignment faults on 64-bit Windows."

llvm-svn: 112376
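For reference, the pattern Microsoft documents as safe is to copy the two 32-bit halves of the FILETIME into a ULARGE_INTEGER instead of casting the pointer. A minimal sketch (filetimeToInt64 is a hypothetical helper name; the types are standard Win32):

    #include <windows.h>

    // Copy the FILETIME halves into a ULARGE_INTEGER rather than
    // reinterpret_cast'ing the FILETIME*, which the documentation warns
    // can cause alignment faults on 64-bit Windows.
    unsigned __int64 filetimeToInt64(const FILETIME &FT) {
      ULARGE_INTEGER UI;
      UI.LowPart = FT.dwLowDateTime;   // low 32 bits of the tick count
      UI.HighPart = FT.dwHighDateTime; // high 32 bits
      return UI.QuadPart;              // 100-ns intervals since Jan 1, 1601
    }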
-
Chris Lattner authored
and hasn't kept up with ToT. Approved by Anton. llvm-svn: 112375
-
Benjamin Kramer authored
llvm-svn: 112364
-
Benjamin Kramer authored
llvm-svn: 112363
-
Bob Wilson authored
llvm-svn: 112357
-
Chris Lattner authored
being actively maintained, improved, or extended. llvm-svn: 112356
-
Chris Lattner authored
I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on IRC. llvm-svn: 112355
-
Chris Lattner authored
llvm-svn: 112354
-
Chris Lattner authored
llvm-svn: 112353
-
Chris Lattner authored
llvm-svn: 112352
-
Chris Lattner authored
llvm-svn: 112351
-
Chris Lattner authored
llvm-svn: 112350
-
Chris Lattner authored
llvm-svn: 112349
-
Bruno Cardoso Lopes authored
Also teach this logic how to handle target-specific shuffles if needed; this is necessary while searching recursively for zeroed scalar elements in vector shuffle operands. llvm-svn: 112348
-
Chris Lattner authored
like this:

    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.B;
      A.A = 42;
      return A;
    }

we now generate:

    _bar: ## @bar
    ## BB#0: ## %entry
        movq _g@GOTPCREL(%rip), %rax
        movss 12(%rax), %xmm0
        pshufd $16, %xmm0, %xmm0
        movss 4(%rax), %xmm2
        movss 8(%rax), %xmm1
        pshufd $16, %xmm1, %xmm1
        unpcklps %xmm0, %xmm1
        addss LCPI1_0(%rip), %xmm2
        pshufd $16, %xmm2, %xmm2
        movss LCPI1_1(%rip), %xmm0
        pshufd $16, %xmm0, %xmm0
        unpcklps %xmm2, %xmm0
        ret

instead of:

    _bar: ## @bar
    ## BB#0: ## %entry
        movq _g@GOTPCREL(%rip), %rax
        movss 12(%rax), %xmm0
        pshufd $16, %xmm0, %xmm0
        movss 4(%rax), %xmm2
        movss 8(%rax), %xmm1
        pshufd $16, %xmm1, %xmm1
        unpcklps %xmm0, %xmm1
        addss LCPI1_0(%rip), %xmm2
        movd %xmm2, %eax
        shlq $32, %rax
        addq $1109917696, %rax ## imm = 0x42280000
        movd %rax, %xmm0
        ret

llvm-svn: 112345
-
Duncan Sands authored
they hit the rest of the system. llvm-svn: 112344
-
Chris Lattner authored
element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 ABI. We now compile something like this:

    struct S { float A, B, C, D; };
    struct S g;
    struct S bar() {
      struct S A = g;
      ++A.A;
      ++A.C;
      return A;
    }

into all nice vector operations:

    _bar: ## @bar
    ## BB#0: ## %entry
        movq _g@GOTPCREL(%rip), %rax
        movss LCPI1_0(%rip), %xmm1
        movss (%rax), %xmm0
        addss %xmm1, %xmm0
        pshufd $16, %xmm0, %xmm0
        movss 4(%rax), %xmm2
        movss 12(%rax), %xmm3
        pshufd $16, %xmm2, %xmm2
        unpcklps %xmm2, %xmm0
        addss 8(%rax), %xmm1
        pshufd $16, %xmm1, %xmm1
        pshufd $16, %xmm3, %xmm2
        unpcklps %xmm2, %xmm1
        ret

instead of icky integer operations:

    _bar: ## @bar
        movq _g@GOTPCREL(%rip), %rax
        movss LCPI1_0(%rip), %xmm1
        movss (%rax), %xmm0
        addss %xmm1, %xmm0
        movd %xmm0, %ecx
        movl 4(%rax), %edx
        movl 12(%rax), %esi
        shlq $32, %rdx
        addq %rcx, %rdx
        movd %rdx, %xmm0
        addss 8(%rax), %xmm1
        movd %xmm1, %eax
        shlq $32, %rsi
        addq %rax, %rsi
        movd %rsi, %xmm1
        ret

This resolves rdar://8360454

llvm-svn: 112343
-
Dan Gohman authored
doesn't currently support dealing with this. llvm-svn: 112341
-
Dan Gohman authored
llvm-svn: 112340
-
Dan Gohman authored
llvm-svn: 112337
-
Bob Wilson authored
llvm-svn: 112336
-
Benjamin Kramer authored
llvm-svn: 112332
-
Bob Wilson authored
the special values that for ARM would be used with IB or DA modes. Fall through and consider materializing a new base address if it would be profitable. llvm-svn: 112329
-
Owen Anderson authored
Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and selects. This pass addresses the missed optimizations from PR2581 and PR4420. llvm-svn: 112325
-
Owen Anderson authored
llvm-svn: 112323
-
Bob Wilson authored
all the other LDM/STM instructions. This fixes asm printer crashes when compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run with -O0 to check this in the future.

Prior to this change VLDM/VSTM used addressing mode #5, but not really. The offset field was used to hold a count of the number of registers being loaded or stored, and the AM5 opcode field was expanded to specify the IA or DB mode, instead of the standard ADD/SUB specifier. Much of the backend was not aware of these special cases. The crashes occurred when rewriting a frame index caused the AM5 offset field to be changed so that it did not have a valid submode. I don't know exactly what changed to expose this now. Maybe we've never done much with -O0 and NEON. Regardless, there's no longer any reason to keep a count of the VLDM/VSTM registers, so we can use addressing mode #4 and clean things up in a lot of places.

llvm-svn: 112322
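To illustrate the kind of overloading being removed, here is a hypothetical C++ sketch of the packed-operand idea (the names and bit layout are illustrative only, not the actual ARMAddressingModes.h API):

    // Hypothetical AM5-style packed operand: the low 8 bits normally hold
    // an immediate offset, and the bits above select the mode. The old
    // VLDM/VSTM hack reused the offset bits for a *register count* and
    // widened the mode bits to encode IA/DB submodes, which code that
    // expected a plain ADD/SUB offset did not know how to handle.
    enum SubMode { Add = 0, Sub = 1, IA = 2, DB = 3 };

    unsigned packAM5(SubMode M, unsigned OffsetOrCount) {
      return (static_cast<unsigned>(M) << 8) | (OffsetOrCount & 0xFF);
    }
    SubMode unpackAM5SubMode(unsigned Packed) {
      return static_cast<SubMode>(Packed >> 8);
    }
    unsigned unpackAM5Offset(unsigned Packed) {
      return Packed & 0xFF;
    }

Under this scheme, a frame-index rewrite that blindly updates the low bits can leave the high bits naming an invalid submode, which matches the crash mode described above.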
-
Chris Lattner authored
llvm-svn: 112321
-
Chris Lattner authored
llvm-svn: 112317
-
Chris Lattner authored
llvm-svn: 112316
-
Chris Lattner authored
    A = shl x, 42
    ...
    B = lshr ..., 38

which can be transformed into:

    A = shl x, 4
    ...

iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows elimination of the whole i128 chain in the real example.

llvm-svn: 112314
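The arithmetic can be sanity-checked in plain C++ for the simplest case, where the lshr consumes the shl directly: when the bits the lshr would clear are already zero, the shl/lshr pair and the single smaller shl agree bit for bit (the shift amounts mirror the message):

    #include <cassert>
    #include <cstdint>

    int main() {
      // x is known to fit in its low 22 bits, so shl by 42 loses nothing
      // and the lshr by 38 only clears bits that were already zero.
      uint64_t x = 0x1234;
      uint64_t pair = (x << 42) >> 38; // A = shl x, 42; B = lshr A, 38
      uint64_t single = x << 4;        // the transformed A = shl x, 4
      assert(pair == single);
      return 0;
    }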
-