Commits · fa1472fd552d726d691c708f5dc6628392e13c83 · Roger Ferrer / llvm-epi-0.8

Sep 06, 2008

Fix for PR2687: Add patterns to match sint_to_fp and fp_to_sint for <2 x · a9c52c82

Eli Friedman authored Sep 05, 2008

i32>.  This is a little messy, but it works.

We should really get rid of the intrinsics, though, since they map
perfectly well to standard LLVM instructions.

llvm-svn: 55864

a9c52c82

Aug 28, 2008
- FsFLD0S{S|D} and V_SETALLONES are as cheap as moves. · 97af20f8
  Evan Cheng authored Aug 28, 2008
```
llvm-svn: 55466
```
  97af20f8
Aug 20, 2008
- Tablegen generated code already tests the opcode value, so it's not · 8823b0d2
  Dan Gohman authored Aug 20, 2008
```
necessary to use dyn_cast in these predicates.

llvm-svn: 55055
```
  8823b0d2
Aug 08, 2008
- Add an EXTRACTPSmr pattern to match the pattern that · 4e2f3ace
  Dan Gohman authored Aug 08, 2008
```
X86ISelLowering creates.

llvm-svn: 54544
```
  4e2f3ace
Aug 06, 2008
- Fix PR2620: Fix X86cmppd selection code so it expects operands to be v2f64. · 7823a411
  Evan Cheng authored Aug 05, 2008
```
llvm-svn: 54376
```
  7823a411
Jul 17, 2008
- Fix a typo in last commit · 3a2147aa
  Nate Begeman authored Jul 17, 2008
```
llvm-svn: 53720
```
  3a2147aa
- SSE codegen for vsetcc nodes · 55b7becb
  Nate Begeman authored Jul 17, 2008
```
llvm-svn: 53719
```
  55b7becb
Jul 10, 2008
- Fix for PR2472. Use movss to set lower 32-bits of a zero XMM vector. · 71b73984
  Evan Cheng authored Jul 10, 2008
```
llvm-svn: 53386
```
  71b73984
Jun 16, 2008
- Horizontal-add instructions are not commutative. · a5e30076
  Evan Cheng authored Jun 16, 2008
```
llvm-svn: 52363
```
  a5e30076
- mpsadbw is commutable. · b90be27f
  Evan Cheng authored Jun 16, 2008
```
llvm-svn: 52352
```
  b90be27f
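Why the horizontal-add commit above matters can be seen in a small scalar model. This is an illustrative sketch, not LLVM or backend code: `V4`, `haddps`, and `operandOrderMatters` are hypothetical names, and only the four-lane single-precision form is modeled. The result takes its low half from the first operand and its high half from the second, so swapping the operands swaps the halves.

```cpp
#include <array>
#include <cassert>

// Four single-precision lanes; index 0 is the lowest lane.
using V4 = std::array<float, 4>;

// Scalar model of HADDPS: the low two lanes come from pairs of `a`,
// the high two lanes come from pairs of `b`.
inline V4 haddps(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// Returns true when swapping the operands changes the result, which is
// the situation in which commuting the instruction would be incorrect.
inline bool operandOrderMatters(const V4 &a, const V4 &b) {
  return haddps(a, b) != haddps(b, a);
}
```

With `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}`, the model gives `{3, 7, 30, 70}` for `haddps(a, b)` and `{30, 70, 3, 7}` for `haddps(b, a)`.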
Jun 13, 2008

Disable some DAG combiner optimizations that may be · 8651e9c5

Duncan Sands authored Jun 13, 2008

wrong for volatile loads and stores.  In fact this
is almost all of them!  There are three types of
problems:

(1) It is wrong to change the width of a volatile
memory access.  These may be used to do memory mapped
i/o, in which case a load can have an effect even if
the result is not used.  Consider loading an i32 but
only using the lower 8 bits.  It is wrong to change
this into a load of an i8, because you are no longer
tickling the other three bytes.  It is also unwise to
make a load/store wider.  For example, changing an i16
load into an i32 load is wrong no matter how aligned
things are, since the fact of loading an additional
2 bytes can have i/o side-effects.

(2) It is wrong to change the number of volatile
load/stores: they may be counted by the hardware.

(3) It is wrong to change a volatile load/store that
requires one memory access into one that requires
several.  For example on x86-32, you can store a double
in one processor operation, but to store an i64
requires two (two i32 stores).  In a multi-threaded
program you may want to bitcast an i64 to a double and
store as a double because that will occur atomically,
and be indivisible to other threads.  So it would be
wrong to convert the store-of-double into a store of
an i64, because this will become two i32 stores - no
longer atomic.

My policy here is to say that the number of processor
operations for an illegal operation is undefined.  So
it is alright to change a store of an i64 (requires at
least two stores; but could be validly lowered to
memcpy for example) into a store of double (one
processor op).  In short, if the new store is legal and
has the same size then I say that the transform is ok.
It would also be possible to say that transforms are
always ok if before they were illegal, whether after
they are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.

However this exposed an interesting thing - on x86-32
a store of i64 is considered legal!  That is because
operations are marked legal by default, regardless of
whether the type is legal or not.  In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation.  However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal.  So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before.  This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.

llvm-svn: 52254

8651e9c5
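The policy in the message above ("if the new store is legal and has the same size then I say that the transform is ok") can be written down as a small predicate. This is a standalone sketch and not the DAG combiner's code: `StoreDesc` and `mayRewriteStore` are hypothetical names, and "legal" is reduced to a single flag supplied by the caller.

```cpp
#include <cassert>

// Hypothetical, simplified description of a store; not an LLVM type.
struct StoreDesc {
  unsigned SizeInBits;  // width of the memory access
  bool IsLegal;         // assumed: the target supports this access directly
};

// Models the stated policy: a volatile store may be replaced only by a
// store that is legal and has exactly the same width. Non-volatile
// stores are not restricted by this rule.
inline bool mayRewriteStore(bool IsVolatile, const StoreDesc &Old,
                            const StoreDesc &New) {
  assert(Old.SizeInBits > 0 && New.SizeInBits > 0 && "empty access");
  if (!IsVolatile)
    return true;
  return New.IsLegal && New.SizeInBits == Old.SizeInBits;
}
```

Under this model, rewriting a volatile i64 store (illegal on x86-32) as a 64-bit double store (legal) is accepted, while the reverse direction, narrowing an i32 access to i8, and widening an i16 access to i32 are all rejected.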

May 29, 2008
- Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. · 5e28227d
  Evan Cheng authored May 29, 2008
```
llvm-svn: 51667
```
  5e28227d
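As an illustration of "shift and insert zero", here is a scalar model of the whole-register byte shifts `pslldq` and `psrldq`. It is a sketch only: `Bytes16` and the two function names are hypothetical, the 64-bit-lane variants (`psllq` / `psrlq`) are not modeled, and nothing here reflects how the backend patterns are actually written.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One 128-bit register as 16 bytes; index 0 is the least significant byte.
using Bytes16 = std::array<std::uint8_t, 16>;

// Model of PSLLDQ: move every byte up by n positions, filling with zeros.
inline Bytes16 pslldq(const Bytes16 &v, std::size_t n) {
  assert(n <= 255 && "the shift count is an 8-bit immediate");
  Bytes16 r{};  // all bytes start as zero
  for (std::size_t i = n; i < 16; ++i)
    r[i] = v[i - n];
  return r;
}

// Model of PSRLDQ: move every byte down by n positions, filling with zeros.
inline Bytes16 psrldq(const Bytes16 &v, std::size_t n) {
  assert(n <= 255 && "the shift count is an 8-bit immediate");
  Bytes16 r{};
  for (std::size_t i = 0; i + n < 16; ++i)
    r[i] = v[i + n];
  return r;
}
```

Shifting by 16 or more bytes leaves only zeros in either direction, since neither loop copies anything.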
May 28, 2008
- Fix the encoding for two more "rm" instructions that were using MRMSrcReg. · 68bddb89
  Dan Gohman authored May 28, 2008
```
llvm-svn: 51630
```
  68bddb89
- Fixed X86 encoding error CVTPS2PD and CVTPD2PS when the source operand · 5e3faf23
  Mon P Wang authored May 28, 2008
```
is a memory location

llvm-svn: 51626
```
  5e3faf23
May 24, 2008
- Eliminate x86.sse2.punpckh.qdq and x86.sse2.punpckl.qdq. · 91a2e56b
  Evan Cheng authored May 24, 2008
```
llvm-svn: 51533
```
  91a2e56b
- Eliminate x86.sse2.movs.d, x86.sse2.shuf.pd, x86.sse2.unpckh.pd, and... · 2146270c
  Evan Cheng authored May 24, 2008
```
Eliminate x86.sse2.movs.d, x86.sse2.shuf.pd, x86.sse2.unpckh.pd, and x86.sse2.unpckl.pd intrinsics. These will be lowered into shuffles.

llvm-svn: 51531
```
  2146270c
- Remove x86.sse2.loadh.pd and x86.sse2.loadl.pd. These will be lowered into... · 6f8cfac7
  Evan Cheng authored May 24, 2008
```
Remove x86.sse2.loadh.pd and x86.sse2.loadl.pd. These will be lowered into load and shuffle instructions.

llvm-svn: 51522
```
  6f8cfac7
May 23, 2008
- Use movlps / movhps to modify low / high half of 16-byte memory location. · 04d24edc
  Evan Cheng authored May 23, 2008
```
llvm-svn: 51501
```
  04d24edc
- Fix a duplicated pattern. · 01b7fffb
  Evan Cheng authored May 23, 2008
```
llvm-svn: 51490
```
  01b7fffb
- Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add · 3388d022
  Dan Gohman authored May 23, 2008
```
load-folding table entries for PMULDQ and PMULLD.

llvm-svn: 51489
```
  3388d022
- Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many... · f3be7a7e
  Evan Cheng authored May 23, 2008
```
Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which are missing the proper alignment checks.
Also fixed some 80 col. violations.

llvm-svn: 51462
```
  f3be7a7e
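The alignment condition behind the rcpps fix above can be sketched as a plain address test. This is an illustrative helper, not the predicate used in X86InstrSSE.td; `isAligned16` is a hypothetical name. It assumes the usual SSE rule that a packed instruction may read its operand straight from memory only when the address is a multiple of 16.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if p sits on a 16-byte boundary, i.e. the low four
// address bits are all zero.
inline bool isAligned16(const void *p) {
  assert(p != nullptr && "need a valid address");
  return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}
```

For a buffer declared as `alignas(16) float buf[8]`, `isAligned16(buf)` holds, while `buf + 1` (4 bytes further on) fails the test, so a load from that address would have to stay a separate unaligned load instead of being folded.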
May 22, 2008
- Add missing patterns. · 53963b77
  Evan Cheng authored May 22, 2008
```
llvm-svn: 51435
```
  53963b77
May 20, 2008
- movsd and movq do not require 16-byte alignment. This fixes vec_set-5.ll on Linux. · f945f943
  Evan Cheng authored May 20, 2008
```
llvm-svn: 51327
```
  f945f943
May 13, 2008
- Fix one more encoding bug. · 6645714f
  Nate Begeman authored May 13, 2008
```
llvm-svn: 51057
```
  6645714f
- Fix an encoding error in the psrad xmm, imm8 instruction. · 50f7ef30
  Nate Begeman authored May 13, 2008
```
llvm-svn: 51020
```
  50f7ef30
- Teach Legalize how to scalarize VSETCC · b87e63a7
  Nate Begeman authored May 12, 2008
```
Teach X86 a few more vsetcc patterns.  Custom lowering for unsupported ones is next.

llvm-svn: 51009
```
  b87e63a7
May 12, 2008
- Initial X86 codegen support for VSETCC. · d875c3e2
  Nate Begeman authored May 12, 2008
```
llvm-svn: 51000
```
  d875c3e2
May 10, 2008
- Some clean up. · da2587ce
  Evan Cheng authored May 10, 2008
```
llvm-svn: 50929
```
  da2587ce
- Add a pattern to move the low element of a v4f32 and zero extend the rest. · 867af267
  Evan Cheng authored May 09, 2008
```
llvm-svn: 50922
```
  867af267
May 09, 2008
- Handle a few more cases of folding load i64 into xmm and zero top bits. · 961339bb
  Evan Cheng authored May 09, 2008
```
Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch.

llvm-svn: 50918
```
  961339bb
- Use movq to move low half of XMM register and zero-extend the rest. · 0360ecbe
  Evan Cheng authored May 08, 2008
```
llvm-svn: 50874
```
  0360ecbe
May 08, 2008

Handle vector move / load which zero the destination register top bits (i.e.... · 78af38c3

Evan Cheng authored May 08, 2008

Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine.

llvm-svn: 50838

78af38c3
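The zeroing behaviour that the message above relies on can be shown with a scalar model of `movss`. This is a sketch for illustration, not the X86-specific DAG combine itself: `V4`, `movssLoad`, and `movssReg` are hypothetical names, and only the single-precision case is modeled. The memory form clears the upper lanes, while the register form keeps them.

```cpp
#include <array>
#include <cassert>

// Four single-precision lanes; index 0 is the lowest lane.
using V4 = std::array<float, 4>;

// Model of MOVSS xmm, m32: lane 0 is loaded, lanes 1..3 are zeroed.
inline V4 movssLoad(const float *addr) {
  assert(addr != nullptr && "need a valid address");
  return {*addr, 0.0f, 0.0f, 0.0f};
}

// Model of MOVSS xmm, xmm: lane 0 is replaced, lanes 1..3 of dst are kept.
inline V4 movssReg(const V4 &dst, const V4 &src) {
  return {src[0], dst[1], dst[2], dst[3]};
}
```

Applying `movssReg` to an all-zero destination produces the same value as the load form, which is the shape of result that a "low element, rest zero" vector needs.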

May 03, 2008

Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This... · cdf22f29

Evan Cheng authored May 03, 2008

Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allows us to simplify the horribly complicated matching code.

llvm-svn: 50601

cdf22f29

May 02, 2008
- 80 column violation. · 4f9cd918
  Evan Cheng authored May 02, 2008
```
llvm-svn: 50575
```
  4f9cd918
Apr 20, 2008
- A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2. · 470ab00c
  Chris Lattner authored Apr 20, 2008
```
llvm-svn: 49986
```
  470ab00c
Apr 16, 2008
- Add support for the form of the SSE41 extractps instruction that · d43d3bee
  Dan Gohman authored Apr 16, 2008
```
puts its result in a 32-bit GPR.

llvm-svn: 49762
```
  d43d3bee
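What "puts its result in a 32-bit GPR" means in practice is that the selected lane's bits are copied unchanged into an integer register, with no float-to-integer conversion. The scalar model below is a sketch under that reading: `extractps` here is a hypothetical helper, not an intrinsic, and it assumes `float` is a 32-bit IEEE 754 value.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Scalar model of EXTRACTPS with a GPR destination: returns the raw
// 32-bit pattern of the selected lane.
inline std::uint32_t extractps(const std::array<float, 4> &v, unsigned lane) {
  assert(lane < 4 && "lane index out of range");
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "this model assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &v[lane], sizeof(bits));  // bit copy, no conversion
  return bits;
}
```

For a lane holding `1.0f`, the model returns `0x3F800000` and not the integer 1.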
Apr 10, 2008

Fix the x86-64 side of PR2108 by adding a v2f64 version of · ad753024

Chris Lattner authored Apr 10, 2008

MOVZQI2PQIrr.  This would be better handled as a dag combine 
(with the goal of eliminating the bitconvert) but I don't know
how to do that safely.  Thoughts welcome.

llvm-svn: 49463

ad753024

Apr 05, 2008
- Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps. · f77b5ef3
  Evan Cheng authored Apr 05, 2008
```
llvm-svn: 49244
```
  f77b5ef3
Mar 26, 2008
- Fix some SSE4.1 instruction encoding bugs. · 29206360
  Evan Cheng authored Mar 26, 2008
```
llvm-svn: 48815
```
  29206360
Mar 24, 2008

- SSE4.1 extractps extracts an f32 into a gr32 register. Very useful! Not. Fix... · 615488ab

Evan Cheng authored Mar 24, 2008

- SSE4.1 extractps extracts an f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teach lowering code to use it only when the only use is a store instruction.

llvm-svn: 48746

615488ab