Commits · fb44980b416fac959c8584a5fd1c8a4b5d6d25ed · Roger Ferrer / llvm-epi-0.8

Feb 02, 2012
- Optimization for SIGN_EXTEND operation on AVX. · fb44980b
  Elena Demikhovsky authored Feb 02, 2012
```
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.

llvm-svn: 149600
```
  fb44980b
Feb 01, 2012
- Optimization for "truncate" operation on AVX. · 0e48c70b
  Elena Demikhovsky authored Feb 01, 2012
```
Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with a set of shuffles.

llvm-svn: 149485
```
  0e48c70b
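The idea behind truncating with shuffles can be illustrated with a scalar model. This is only a sketch of the general principle, not the lowering code from the commit: on a little-endian target, the low half of each 64-bit element is the even-numbered 32-bit lane, so reinterpreting v4i64 as v8i32 and selecting lanes <0, 2, 4, 6> gives the truncated vector. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model (assumes a little-endian host, as on x86). */
void trunc_v4i64_to_v4i32(const uint64_t in[4], uint32_t out[4]) {
    uint32_t lanes[8];
    memcpy(lanes, in, sizeof lanes);        /* reinterpret v4i64 as v8i32 */
    for (int i = 0; i < 4; ++i) {
        out[i] = lanes[2 * i];              /* shuffle mask <0, 2, 4, 6> */
        assert(out[i] == (uint32_t)in[i]);  /* agrees with plain truncation */
    }
}
```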
Jan 30, 2012

Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic... · ca29bcfc

Craig Topper authored Jan 30, 2012

Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target-specific nodes.

llvm-svn: 149216

ca29bcfc

Jan 23, 2012
- Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching. · 0b7ad76b
  Craig Topper authored Jan 22, 2012
```
llvm-svn: 148670
```
  0b7ad76b
Jan 22, 2012

Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86... · bd488437

Craig Topper authored Jan 22, 2012

Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types, simplifying opcode selection and pattern matching.

llvm-svn: 148667

bd488437

Add target specific ISD node types for SSE/AVX vector shuffle instructions and... · 09462641

Craig Topper authored Jan 22, 2012

Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.

llvm-svn: 148664

09462641

Remove unused X86 ISD node type defines. · cb3433cd
Craig Topper authored Jan 22, 2012
```
llvm-svn: 148644
```
cb3433cd

Jan 19, 2012
- Merge 128-bit and 256-bit SHUFPS/SHUFPD handling. · 80576e8d
  Craig Topper authored Jan 19, 2012
```
llvm-svn: 148466
```
  80576e8d
Jan 08, 2012
- Reverted commit #147601 upon Evan's request. · 540651cf
  Victor Umansky authored Jan 08, 2012
```
llvm-svn: 147748
```
  540651cf
Jan 05, 2012

Peephole optimization of ptest-conditioned branch in X86 arch. Performs... · 9255b6d9

Victor Umansky authored Jan 05, 2012

Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.

Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
llvm-svn: 147601

9255b6d9
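For context, the two flags that the ptestz/ptestc intrinsics read can be modelled in scalar C. This sketch only shows what PTEST computes; it does not reproduce the peephole itself. The `v128` struct and the function names are hypothetical stand-ins for a 128-bit register and the intrinsics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint64_t lo, hi; } v128;  /* hypothetical 128-bit value */

/* ZF after PTEST: set when (a AND b) is all zeros. */
bool ptestz(v128 a, v128 b) {
    return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

/* CF after PTEST: set when ((NOT a) AND b) is all zeros. */
bool ptestc(v128 a, v128 b) {
    return ((~a.lo & b.lo) | (~a.hi & b.hi)) == 0;
}

/* Worked example: disjoint bits set ZF, a superset mask sets CF. */
void ptest_example(void) {
    v128 a = {0xF0, 0}, b = {0x0F, 0}, c = {0xFF, 0};
    assert(ptestz(a, b) && !ptestc(a, b));
    assert(!ptestz(c, b) && ptestc(c, b));
}
```

A branch on either result depends only on one flag, which is why a ptest followed directly by a conditional jump is sufficient.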

Jan 01, 2012
- Merge X86 SHUFPS and SHUFPD node types. · 6e54ba7e
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147394
```
  6e54ba7e
Dec 24, 2011

Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the · 7e9453e9

Chandler Carruth authored Dec 24, 2011

X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improved on in this
type of code.

llvm-svn: 147244

7e9453e9
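The idiom quoted in the message above can be written out as a small C sketch. It assumes a GCC- or Clang-compatible compiler for `__builtin_clz`; the function names are hypothetical. Both forms return the index of the most significant set bit, because the bit width minus one is all ones in its low bits, so xor and subtraction agree for every valid leading-zero count.

```c
#include <assert.h>
#include <limits.h>

/* The xor form quoted in the commit message. */
unsigned msb_index_xor(unsigned x) {
    assert(x != 0);  /* __builtin_clz(0) is undefined */
    return (unsigned)(sizeof(x) * CHAR_BIT - 1) ^ (unsigned)__builtin_clz(x);
}

/* The "more naive" form that subtracts instead of using an xor. */
unsigned msb_index_sub(unsigned x) {
    assert(x != 0);
    return (unsigned)(sizeof(x) * CHAR_BIT - 1) - (unsigned)__builtin_clz(x);
}
```

Whether a given compiler version turns either form into a single 'bsr' is exactly what the commit is about, so the generated code should be inspected rather than assumed.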

Dec 17, 2011
- Remove an unused X86ISD node type. · a913dde0
  Craig Topper authored Dec 17, 2011
```
llvm-svn: 146833
```
  a913dde0
Dec 16, 2011

Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is... · a4d411cb

Craig Topper authored Dec 16, 2011

Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes.

llvm-svn: 146726

a4d411cb

Dec 11, 2011

Remove some remnants of the old palign pattern fragment that were still hanging... · 1fdfec63

Craig Topper authored Dec 11, 2011

Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast.

llvm-svn: 146344

1fdfec63

Dec 06, 2011
- Merge floating point and integer UNPCK X86ISD node types. · 8d4ba198
  Craig Topper authored Dec 06, 2011
```
llvm-svn: 145926
```
  8d4ba198
Nov 30, 2011
- Merge VPERM2F128/VPERM2I128 ISD node types. · 0a672eaf
  Craig Topper authored Nov 30, 2011
```
llvm-svn: 145485
```
  0a672eaf
- Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node... · bafd224c
  Craig Topper authored Nov 30, 2011
```
Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.

llvm-svn: 145483
```
  bafd224c
Nov 28, 2011

Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge... · 818a983e

Craig Topper authored Nov 28, 2011

Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.

llvm-svn: 145238

818a983e

Nov 26, 2011

Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD.... · 51280d56

Craig Topper authored Nov 26, 2011

Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonicalization that occurs when shuffle nodes are created.

llvm-svn: 145153

51280d56

Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to... · 7704bd7a

Craig Topper authored Nov 26, 2011

Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.

llvm-svn: 145148

7704bd7a

Nov 24, 2011

Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit... · d65a4444

Craig Topper authored Nov 24, 2011

Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.

llvm-svn: 145126

d65a4444

Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse... · d2646674

Craig Topper authored Nov 24, 2011

Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.

llvm-svn: 145125

d2646674

Nov 21, 2011
- Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled. · 6270d072
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145028
```
  6270d072
- Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled. · 669199ca
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145026
```
  669199ca
Nov 19, 2011
- Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from... · f984efbf
  Craig Topper authored Nov 19, 2011
```
Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.

llvm-svn: 144989
```
  f984efbf
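As a rough illustration of what "add of appropriate shuffle vectors" means for the 128-bit case, the following scalar model builds a horizontal add from two shuffles of the same inputs. The helper names are hypothetical, and the masks shown are the textbook ones for a 4 x i32 horizontal add rather than anything copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Two-input shuffle: indices 0..3 select from a, 4..7 select from b. */
void shuffle_v4i32(const uint32_t a[4], const uint32_t b[4],
                   const int mask[4], uint32_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        assert(mask[i] >= 0 && mask[i] < 8);
        out[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
    }
}

/* add(shuffle<0,2,4,6>, shuffle<1,3,5,7>) equals a horizontal add:
   {a0+a1, a2+a3, b0+b1, b2+b3}. */
void hadd_via_shuffles(const uint32_t a[4], const uint32_t b[4],
                       uint32_t out[4]) {
    static const int even[4] = {0, 2, 4, 6};
    static const int odd[4]  = {1, 3, 5, 7};
    uint32_t lhs[4], rhs[4];
    shuffle_v4i32(a, b, even, lhs);
    shuffle_v4i32(a, b, odd, rhs);
    for (int i = 0; i < 4; ++i) {
        out[i] = lhs[i] + rhs[i];
    }
}
```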
- Collapse X86 PSIGNB/PSIGNW/PSIGND node types. · 81390be0
  Craig Topper authored Nov 19, 2011
```
llvm-svn: 144988
```
  81390be0
Oct 27, 2011
- Rename NonScalarIntSafe to something more appropriate. · 58dba012
  Lang Hames authored Oct 26, 2011
```
llvm-svn: 143080
```
  58dba012
Oct 21, 2011
- Remove intrinsics for X86 BLSI, BLSMSK, and BLSR instructions and replace with... · 039a7906
  Craig Topper authored Oct 21, 2011
```
Remove intrinsics for X86 BLSI, BLSMSK, and BLSR instructions and replace with custom isel lowering code.

llvm-svn: 142642
```
  039a7906
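For reference, the three BMI instructions named above compute simple bit idioms, which is what makes them recognizable without dedicated intrinsics. The sketch below states those idioms in plain C; the function names are hypothetical, and how the isel lowering code actually detects the patterns is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* BLSI: isolate the lowest set bit. */
uint32_t blsi(uint32_t x)   { return x & (0u - x); }

/* BLSMSK: mask up to and including the lowest set bit. */
uint32_t blsmsk(uint32_t x) { return x ^ (x - 1u); }

/* BLSR: reset (clear) the lowest set bit. */
uint32_t blsr(uint32_t x)   { return x & (x - 1u); }

/* Worked example on 0x58 (binary 0101 1000). */
void bls_example(void) {
    assert(blsi(0x58u)   == 0x08u);
    assert(blsmsk(0x58u) == 0x0Fu);
    assert(blsr(0x58u)   == 0x50u);
}
```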
Oct 14, 2011
- Add X86 ANDN instruction. Including instruction selection. · 965de2c1
  Craig Topper authored Oct 14, 2011
```
llvm-svn: 141947
```
  965de2c1
Sep 22, 2011

Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from · 0e4fcb8e

Duncan Sands authored Sep 22, 2011

floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.

llvm-svn: 140332

0e4fcb8e

Sep 11, 2011

CR fixes per Bruno's request. · b873b187

Nadav Rotem authored Sep 11, 2011

Undo the changes from r139285 which added custom lowering to vselect.
Add tablegen lowering for vselect.

llvm-svn: 139479

b873b187

Sep 09, 2011

Implement vector-select support for avx256. Refactor the vblend implementation... · de838dae

Nadav Rotem authored Sep 09, 2011

Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type

llvm-svn: 139400

de838dae

Sep 08, 2011
- Add X86-SSE4 codegen support for vector-select. · 2550ba2a
  Nadav Rotem authored Sep 08, 2011
```
llvm-svn: 139285
```
  2550ba2a
Sep 06, 2011

Fix comment. Noticed by Duncan. · 9d96c942
Rafael Espindola authored Sep 06, 2011
```
llvm-svn: 139161
```
9d96c942

Add codegen support for vector select (in the IR this means a select · f2641e1b

Duncan Sands authored Sep 06, 2011

with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159

f2641e1b

Split the init.trampoline intrinsic, which currently combines GCC's · a098436b

Duncan Sands authored Sep 06, 2011

init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.

llvm-svn: 139140

a098436b

Aug 30, 2011

Adds support for variable sized allocas. For a variable sized alloca, · 94d32536

Rafael Espindola authored Aug 30, 2011

code is inserted to first check if the current stacklet has enough
space. If so, space is allocated by simply decrementing the stack
pointer. Otherwise a runtime routine (__morestack_allocate_stack_space
in libgcc) is called which allocates the required memory from the
heap.

Patch by Sanjoy Das.

llvm-svn: 138818

94d32536

Adds a SelectionDAG node X86SegAlloca which will be custom lowered · 33530176

Rafael Espindola authored Aug 30, 2011

from DYNAMIC_STACKALLOC.

Two new pseudo instructions (SEG_ALLOCA_32 and SEG_ALLOCA_64) which
will match X86SegAlloca (based on word size) are also added.  They
will be custom emitted to inject the actual stack handling code.

Patch by Sanjoy Das.

llvm-svn: 138814

33530176

Aug 26, 2011
- Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction. · 5e570427
  Eli Friedman authored Aug 26, 2011
```
llvm-svn: 138660
```
  5e570427