Commits · 3f40c69083d07e816690bb810ba2d9ff40546a8a · Lorenzo Albano / LLVM bpEVL

May 13, 2008
- On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16. · 3f40c690
  Evan Cheng authored May 13, 2008
  
  llvm-svn: 51019
  3f40c690
- Xform bitconvert(build_pair(load a, load b)) to a single load if the load... · b980f6fb
  Evan Cheng authored May 12, 2008
  
  Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other. llvm-svn: 51008
  b980f6fb
- New test for tail merging · e6942c31
  Dale Johannesen authored May 12, 2008
  
  llvm-svn: 51007
  e6942c31
May 10, 2008
- When transforming a vector_shuffle to a load, the base address must not be an undef. · 71b9afb0
  Evan Cheng authored May 10, 2008
  
  llvm-svn: 50940
  71b9afb0
- Add nounwind. · 9c4d6851
  Evan Cheng authored May 10, 2008
  
  llvm-svn: 50931
  9c4d6851
- If all sources of a PHI node are defined by an implicit_def, just emit an... · bec201fa
  Evan Cheng authored May 10, 2008
  
  If all sources of a PHI node are defined by an implicit_def, just emit an implicit_def instead of a copy. llvm-svn: 50927
  bec201fa
- Add a pattern to move the low element of a v4f32 and zero-extend the rest. · 867af267
  Evan Cheng authored May 09, 2008
  
  llvm-svn: 50922
  867af267
May 09, 2008
- Handle a few more cases of folding load i64 into xmm and zero top bits. · 961339bb
  Evan Cheng authored May 09, 2008
  
  Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch. llvm-svn: 50918
  961339bb
- Simplify test. · 0352e63e
  Evan Cheng authored May 09, 2008
  
  llvm-svn: 50911
  0352e63e
- Use movq to move low half of XMM register and zero-extend the rest. · 0360ecbe
  Evan Cheng authored May 08, 2008
  
  llvm-svn: 50874
  0360ecbe
May 08, 2008
- Handle vector move / load which zero the destination register top bits (i.e.... · 78af38c3
  Evan Cheng authored May 08, 2008
  
  Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. llvm-svn: 50838
  78af38c3
- Add nounwind. · 0d6311d4
  Evan Cheng authored May 07, 2008
  
  llvm-svn: 50837
  0d6311d4
May 07, 2008

Yet another nasty spiller bug. · 7ca4a67c

Evan Cheng authored May 07, 2008

%ecx = op
store %cl<kill>, (addr)
(addr) = op %al

It's not safe to unfold the last operand and eliminate store even though %cl is marked kill. It's a sub-register use which means one of its super-register(s) may be used below.

llvm-svn: 50794

7ca4a67c
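
The rule in the message above can be modelled roughly as follows. This is a simplified, conservative sketch and not the actual spiller patch: the `SUPER_REGISTERS` table and the function name `kill_allows_store_elimination` are hypothetical, and the table covers only a few x86-32 registers for illustration.

```python
# Hypothetical, partial x86-32 super-register table (illustration only).
SUPER_REGISTERS = {
    "al": ("ax", "eax"),
    "cl": ("cx", "ecx"),
    "ax": ("eax",),
    "cx": ("ecx",),
    "eax": (),
    "ecx": (),
}


def kill_allows_store_elimination(stored_reg: str, is_kill: bool) -> bool:
    """Return True only when a kill flag is enough to drop the store.

    A kill on a sub-register (e.g. %cl) says nothing about its
    super-registers (%cx, %ecx), which may still be used below, so this
    model refuses to eliminate the store in that case.
    """
    if not is_kill:
        return False
    # Conservative assumption: any register with a super-register is unsafe.
    return not SUPER_REGISTERS[stored_reg]
```

Under this model, `store %cl<kill>, (addr)` keeps its store, while a killed full-width `%ecx` would not block the elimination.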

- Use target triple in tests, not 'realign-stack=0' option. Per request. · f5d2c3b4
  Anton Korobeynikov authored May 06, 2008
  
  llvm-svn: 50778
  f5d2c3b4

May 06, 2008
- Fix PR2287. Darwin passes MMX values in registers in 64-bit mode; Linux does not. · ef3faa1b
  Evan Cheng authored May 06, 2008
  
  llvm-svn: 50716
  ef3faa1b
May 05, 2008
- Added additional atomic intrinsics: and, or, xor, min, and max. · 3e58393c
  Mon P Wang authored May 05, 2008
  
  llvm-svn: 50663
  3e58393c
- no need for eh info · 9c0c60d0
  Chris Lattner authored May 05, 2008
  
  llvm-svn: 50658
  9c0c60d0
- Add AsmPrinter support for emitting a directive to declare that · bcde1722
  Dan Gohman authored May 05, 2008
  
  the code being generated does not require an executable stack. Also, add target-specific code to make use of this on Linux on x86. llvm-svn: 50634
  bcde1722
May 04, 2008

Select vector shift with non-immediate i32 shift amount operand by first... · d9481366

Evan Cheng authored May 04, 2008

Select vector shift with non-immediate i32 shift amount operand by first moving the operand into the right register.

llvm-svn: 50619

d9481366

May 03, 2008

Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This... · cdf22f29

Evan Cheng authored May 03, 2008

Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allows us to simplify the horribly complicated matching code.

llvm-svn: 50601

cdf22f29

May 02, 2008
- specify an arch for non-x86 hosts. · 34931aff
  Chris Lattner authored May 02, 2008
  
  llvm-svn: 50576
  34931aff
May 01, 2008

don't randomly miscompile seto/setuo just because we are in · d4b2a67c

Chris Lattner authored May 01, 2008

ffastmath mode.  This fixes rdar://5902801, a miscompilation
of gcc.dg/builtins-8.c.

Bill, please pull this into Tak.

llvm-svn: 50523

d4b2a67c

Apr 30, 2008
- Really commit the test checking the argument lowering behaviour on x86-64 :). · 68988d13
  Arnold Schwaighofer authored Apr 30, 2008
  
  llvm-svn: 50478
  68988d13
Apr 29, 2008

make the vector conversion magic handle multiple results. · 5c88f7b1

Chris Lattner authored Apr 29, 2008

We now compile test2/test3 to:

_test2:
	## InlineAsm Start
	set %xmm0, %xmm1
	## InlineAsm End
	addps	%xmm1, %xmm0
	ret
_test3:
	## InlineAsm Start
	set %xmm0, %xmm1
	## InlineAsm End
	paddd	%xmm1, %xmm0
	ret

as expected.

llvm-svn: 50389

5c88f7b1

add support for multiple return values in inline asm. This is a step · f9a49c43

Chris Lattner authored Apr 29, 2008

towards PR2094.  It now compiles the attached .ll file to:

_sad16_sse2:
	movslq	%ecx, %rax
	## InlineAsm Start
	%ecx %rdx %rax %rax %r8d %rdx %rsi
	## InlineAsm End
	## InlineAsm Start
	set %eax
	## InlineAsm End
	ret

which is pretty decent for a 3 output, 4 input asm.

llvm-svn: 50386

f9a49c43

Another extract_subreg coalescing bug. · 11b98b66

Evan Cheng authored Apr 29, 2008

e.g.
vr1024<2> extract_subreg vr1025, 2
If vr1024 does not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64.

llvm-svn: 50385

11b98b66

- Add -march=x86. · 73c3b474
  Evan Cheng authored Apr 28, 2008
  
  llvm-svn: 50380
  73c3b474
- Test case. · 315e3cb9
  Evan Cheng authored Apr 28, 2008
  
  llvm-svn: 50377
  315e3cb9

Apr 27, 2008

Implement a significant optimization for inline asm: · 22379734

Chris Lattner authored Apr 27, 2008

When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible.  This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:

void test () {
  asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}

Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.

Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??

Incidentally, this was the todo in 
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll

Please do NOT pull this into Tak.

llvm-svn: 50315

22379734
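
The selection rule described in this message can be sketched as below. It is a minimal model, not LLVM's code: the function name `choose_constraint` is hypothetical, and the fallback (take the first non-`i` letter) is an assumption made only so the sketch is complete.

```python
def choose_constraint(options: str, operand_is_constant: bool) -> str:
    """Pick one letter from a multi-option constraint string such as "imr".

    Prefer the immediate constraint 'i' whenever the operand is a
    constant, so no register or memory slot is spent on it.
    """
    if "i" in options and operand_is_constant:
        return "i"
    # Assumed fallback: first remaining option, in the order written.
    for letter in options:
        if letter != "i":
            return letter
    raise ValueError("no usable constraint for a non-constant operand")
```

For the `"imr" (42)` operand in the example, this picks `i`, which keeps the `%c0` modifier valid because the operand is never placed in memory.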

Apr 25, 2008

- Feedback from chris · 98f0898e
  Nate Begeman authored Apr 25, 2008
  
  llvm-svn: 50305
  98f0898e
- Add a testcase for the recent "handle variable vector insert elt in mem" patch · f10b493f
  Nate Begeman authored Apr 25, 2008
  
  llvm-svn: 50303
  f10b493f
- Update tests. · 402572a1
  Evan Cheng authored Apr 25, 2008
  
  llvm-svn: 50293
  402572a1
- Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers. · ccde6dd0
  Evan Cheng authored Apr 25, 2008
  
  llvm-svn: 50289
  ccde6dd0

MMX argument passing fixes: · df38b35a

Evan Cheng authored Apr 25, 2008

On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].
On Darwin / Linux x86-32, v1i64 values are passed in memory.
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.

llvm-svn: 50257

df38b35a
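
The four rules above amount to a small lookup, sketched here for reference. The function name `mmx_arg_location` and the target labels `"x86-32"` and `"darwin-x86-64"` are hypothetical; only the type-to-location mapping comes from the message.

```python
# MMX vector types with more than one element, as listed in the message.
MULTI_ELEMENT_MMX_TYPES = {"v8i8", "v4i16", "v2i32"}


def mmx_arg_location(target: str, vector_type: str) -> str:
    """Return where an MMX-typed argument is passed, per the rules above."""
    if vector_type not in MULTI_ELEMENT_MMX_TYPES and vector_type != "v1i64":
        raise ValueError(f"not an MMX vector type: {vector_type}")
    multi = vector_type in MULTI_ELEMENT_MMX_TYPES
    if target == "x86-32":  # Darwin / Linux x86-32
        return "MM[0-2]" if multi else "memory"
    if target == "darwin-x86-64":
        return "XMM[0-7]" if multi else "64-bit GPR"
    # Other targets are not covered by this commit message.
    raise ValueError(f"target not covered: {target}")
```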

Loosen up an assertion to allow intrinsics. I really have no · 741c7a3b

Chris Lattner authored Apr 25, 2008

idea what this code (findNonImmUse) does, so I'm only guessing 
that this is the right thing.  It would be really really nice
if this had comments and perhaps switched to SmallPtrSet
(hint hint) :)

This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c

llvm-svn: 50252

741c7a3b

Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not... · 9165e165

Evan Cheng authored Apr 25, 2008

Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.

llvm-svn: 50239

9165e165
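
A small simulation may make the offset issue clearer. It is a sketch under assumptions, not the lowering code itself: `lower_memcpy` and its `unit` parameter are hypothetical names, and byte slices stand in for the `rep` instruction and the residual copy.

```python
def lower_memcpy(dst: bytearray, src: bytes, size: int, unit: int = 4) -> None:
    """Copy `size` bytes as a bulk `rep`-style copy plus a residual copy."""
    count, tail = divmod(size, unit)
    bulk = count * unit

    # Bulk part: what the rep instruction handles, `unit` bytes at a time.
    dst[0:bulk] = src[0:bulk]

    if tail:
        # The base addresses are already advanced past the bulk part...
        adjusted_dst = bulk
        adjusted_src = bulk
        # ...so the residual copy must use offset 0, not `bulk` again.
        offset = 0
        start_d = adjusted_dst + offset
        start_s = adjusted_src + offset
        dst[start_d:start_d + tail] = src[start_s:start_s + tail]
```

Setting `offset = bulk` here would skip past the trailing bytes, which is the kind of double adjustment the fix avoids.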

Apr 23, 2008
- Disable stack realignment for these tests · dd4ef2e3
  Anton Korobeynikov authored Apr 23, 2008
  
  llvm-svn: 50172
  dd4ef2e3
- Fix test because ABI stack alignment dropped to 'normal' value · c3ada5c9
  Anton Korobeynikov authored Apr 23, 2008
  
  llvm-svn: 50171
  c3ada5c9
- Fix test, instruction count is valid only if stack is not realigned · 955a8a91
  Anton Korobeynikov authored Apr 23, 2008
  
  llvm-svn: 50170
  955a8a91
Apr 22, 2008

Implement an x86-64 ABI detail of passing structs by hidden first · f166d2d0

Dan Gohman authored Apr 21, 2008

argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.

Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.

llvm-svn: 50075

f166d2d0
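
The ABI detail can be pictured with a toy register model. This is only an illustration: `return_large_struct`, the `regs` dictionary, and the `memory` dictionary are hypothetical stand-ins for the callee, the register file, and the caller's stack.

```python
def return_large_struct(regs: dict, memory: dict, fields: tuple) -> None:
    """Model a callee that returns a large struct via a hidden first argument."""
    hidden_ptr = regs["rdi"]      # caller passes the result address in %rdi
    memory[hidden_ptr] = fields   # callee writes the struct through it
    regs["rax"] = hidden_ptr      # on exit, the incoming %rdi is copied to %rax
```

After the call, the caller can find the struct through `%rax` without having to preserve its own copy of the pointer.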