Commits · aa58397b3ce17ec4fed008150911111c7025fc91 · Roger Ferrer / llvm-epi-0.8

May 25, 2012

Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall · aa58397b

Justin Holewinski authored May 25, 2012

to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
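
The effect of passing one struct instead of many values can be sketched roughly as follows. This is a toy model, not LLVM's declaration: `CallInfo`, `lowerCallOld`, `lowerCall`, and every field name are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parameter bundle; the field names are illustrative only.
struct CallInfo {
  std::string Callee;
  std::vector<int> Args;
  bool IsVarArg = false;
  bool IsTailCall = false;
  // Added later for one target. Existing targets never read it, so their
  // lowering code needs no change.
  bool NeedsPrototype = false;
};

// Old style: adding a parameter means touching this signature in every target.
int lowerCallOld(const std::string &Callee, const std::vector<int> &Args,
                 bool IsVarArg, bool IsTailCall) {
  (void)Callee;
  (void)IsVarArg;
  // Toy "lowering": one node per argument, one for the call, and one for
  // the return unless the call is a tail call.
  return static_cast<int>(Args.size()) + 1 + (IsTailCall ? 0 : 1);
}

// New style: the signature stays fixed while CallInfo grows.
int lowerCall(const CallInfo &CI) {
  return lowerCallOld(CI.Callee, CI.Args, CI.IsVarArg, CI.IsTailCall);
}
```

Setting `NeedsPrototype` on a `CallInfo` does not change what `lowerCall` returns here, which is the point: a target that does not care about a new field is unaffected by its addition.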

NV_CONTRIB

llvm-svn: 157479

aa58397b

Apr 27, 2012

X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures. · 913da4b2

Benjamin Kramer authored Apr 27, 2012

* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.
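
The FPSW to EFLAGS transfer described in the list above can be modeled in plain C++. This is a sketch of the architectural behavior only (FNSTSW stores the status word in %ax, SAHF loads %ah into the flags), not the DAG nodes the patch builds; the function and constant names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// x87 status word condition-code bits, as FNSTSW leaves them in %ax.
constexpr uint16_t C0 = 1u << 8;
constexpr uint16_t C2 = 1u << 10;
constexpr uint16_t C3 = 1u << 14;

// EFLAGS bits that SAHF can set (low byte of EFLAGS).
constexpr uint8_t CF = 1u << 0;
constexpr uint8_t PF = 1u << 2;
constexpr uint8_t ZF = 1u << 6;

// %ah is the high byte of %ax, so extracting it is a right shift by 8
// followed by a truncation to 8 bits.
constexpr uint8_t extractAH(uint16_t AX) {
  return static_cast<uint8_t>(AX >> 8);
}

// SAHF copies bits 7, 6, 4, 2 and 0 of %ah into SF, ZF, AF, PF and CF.
constexpr uint8_t sahf(uint8_t AH) {
  return static_cast<uint8_t>(AH & 0xD5);
}
```

After the shift, C0 lands on CF, C2 on PF, and C3 on ZF, so an unordered FUCOM result (C3 = C2 = C0 = 1) sets ZF, PF, and CF together.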

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704

913da4b2

Apr 16, 2012
- Merge vpermps/vpermd and vpermpd/vpermq SD nodes. · b86fa404
  Craig Topper authored Apr 16, 2012
```
llvm-svn: 154782
```
  b86fa404
Apr 15, 2012
- Added VPERM optimization for AVX2 shuffles · 779a72b4
  Elena Demikhovsky authored Apr 15, 2012
```
llvm-svn: 154761
```
  779a72b4
Apr 14, 2012
- Fix X86 codegen for 'atomicrmw nand' to generate *x = ~(*x & y), not *x = ~*x & y. · 3e8f1f6a
  Richard Smith authored Apr 13, 2012
```
llvm-svn: 154705
```
  3e8f1f6a
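
The difference between the two expressions in the commit title is easy to check by hand. The sketch below compares them and shows one way to express the intended read-modify-write with a compare-exchange loop; the function names are made up for this example and the loop is not the code LLVM emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Intended semantics of 'atomicrmw nand': *x = ~(*x & y).
constexpr uint32_t nandCorrect(uint32_t X, uint32_t Y) { return ~(X & Y); }

// What the buggy codegen computed instead: *x = ~*x & y.
constexpr uint32_t nandBuggy(uint32_t X, uint32_t Y) { return ~X & Y; }

// Atomic nand via a compare-exchange loop. Returns the old value, as
// atomicrmw does.
uint32_t atomicNand(std::atomic<uint32_t> &X, uint32_t Y) {
  uint32_t Old = X.load();
  // On failure compare_exchange_weak reloads Old, so the new value is
  // recomputed from the fresh contents on every retry.
  while (!X.compare_exchange_weak(Old, nandCorrect(Old, Y))) {
  }
  return Old;
}
```

For `x = 0xF0` and `y = 0x3C`, the correct form yields `0xFFFFFFCF` while the buggy form yields `0x0C`.
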
Apr 11, 2012

Reapply 154396 after fixing a test. · 9bc178ac

Nadav Rotem authored Apr 11, 2012

Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
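
The register-versus-immediate distinction can be modeled on four integer lanes. This is a simplified sketch: the lane type, the helper names, and the folding function are assumptions for illustration, and real blend instructions operate on vector registers rather than `std::array`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4 = std::array<int32_t, 4>;

// Variable blend (blendv style): the sign bit of each mask lane, held in a
// register at run time, selects the lane from B instead of A.
V4 blendVariable(const V4 &A, const V4 &B, const V4 &Mask) {
  V4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = Mask[I] < 0 ? B[I] : A[I];
  return R;
}

// Immediate blend (vblend style): bit I of a compile-time constant selects
// lane I from B instead of A.
V4 blendImmediate(const V4 &A, const V4 &B, uint8_t Imm) {
  V4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = ((Imm >> I) & 1) ? B[I] : A[I];
  return R;
}

// A shuffle mask is a compile-time constant, so a known mask can be folded
// into the immediate form.
uint8_t maskToImmediate(const V4 &Mask) {
  uint8_t Imm = 0;
  for (std::size_t I = 0; I < 4; ++I)
    if (Mask[I] < 0)
      Imm |= static_cast<uint8_t>(1u << I);
  return Imm;
}
```

With the mask `{-1, 0, 0, -1}` the immediate is `0b1001`, and both forms pick lanes 0 and 3 from the second operand.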

llvm-svn: 154483

9bc178ac

Apr 10, 2012

Temporarily revert this patch to see if it brings the buildbots back. · 65ada95b
Eric Christopher authored Apr 10, 2012
```
llvm-svn: 154425
```
65ada95b

Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. · f934f917

Nadav Rotem authored Apr 10, 2012

blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.

llvm-svn: 154396

f934f917

Fix a long standing tail call optimization bug. When a libcall is emitted · f8bad080

Evan Cheng authored Apr 10, 2012

the legalizer always uses the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. it is a call, load, or
store), use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370

f8bad080

Apr 09, 2012
- Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type. · b801ca39
  Nadav Rotem authored Apr 09, 2012
```
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.

llvm-svn: 154310
```
  b801ca39
Apr 04, 2012

Always compute all the bits in ComputeMaskedBits. · ba0a6cab

Rafael Espindola authored Apr 04, 2012

This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011

ba0a6cab

Feb 28, 2012

Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call. · 65f9d19c
Evan Cheng authored Feb 28, 2012
```
llvm-svn: 151645
```
65f9d19c

Revert r151623 "Some ARM implementations, e.g. A-series, do return stack... · ee7b8993

Daniel Dunbar authored Feb 28, 2012

Revert r151623 "Some ARM implementations, e.g. A-series, do return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.

llvm-svn: 151630

ee7b8993

Some ARM implementations, e.g. A-series, do return stack prediction. That is, · 87c7b09d

Evan Cheng authored Feb 28, 2012

the processor keeps a return address stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.

Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt the RAS and cause 100% return misprediction, so LLVM should use an
unconditional branch instead, i.e.
  mov lr, pc
  b _foo
The "mov lr, pc" is issued in order to get a proper backtrace.

rdar://8979299

llvm-svn: 151623

87c7b09d

Feb 25, 2012

Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2... · bdf94879

NAKAMURA Takumi authored Feb 25, 2012

Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2 lowering for i386-*-win32 targets. Patch by Joe Groff.

[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.

llvm-svn: 151432

bdf94879

Feb 24, 2012
- Add WIN_FTOL_* pseudo-instructions to model the unique calling convention · 248d65e7
  Michael J. Spencer authored Feb 24, 2012
```
used by the Win32 _ftol2 runtime function. Patch by Joe Groff!

llvm-svn: 151382
```
  248d65e7
Feb 22, 2012

Make all pointers to TargetRegisterClass const since they are all pointers to... · 760b134f

Craig Topper authored Feb 22, 2012

Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.

llvm-svn: 151134

760b134f

Feb 19, 2012
- Make a bunch of X86ISelLowering shuffle functions static now that they are no... · 3e5c04e4
  Craig Topper authored Feb 19, 2012
```
Make a bunch of X86ISelLowering shuffle functions static now that they are no longer needed by isel.

llvm-svn: 150908
```
  3e5c04e4
Feb 05, 2012

Add target specific node for PMULUDQ. Change patterns to use it and custom... · 1d471e31

Craig Topper authored Feb 05, 2012

Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.

llvm-svn: 149807

1d471e31

Feb 02, 2012
- Optimization for SIGN_EXTEND operation on AVX. · fb44980b
  Elena Demikhovsky authored Feb 02, 2012
```
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.

llvm-svn: 149600
```
  fb44980b
Feb 01, 2012
- Optimization for "truncate" operation on AVX. · 0e48c70b
  Elena Demikhovsky authored Feb 01, 2012
```
Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with a set of shuffles.

llvm-svn: 149485
```
  0e48c70b
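
Why a truncation can be expressed as a shuffle: on a little-endian target, viewing a v4i64 as a v8i32 puts the low half of each 64-bit element in the even 32-bit lane. The sketch below models that on `std::array`; the helper names are hypothetical, and the single `{0, 2, 4, 6}` shuffle shown here is a conceptual model, not necessarily the instruction sequence the commit emits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4i64 = std::array<uint64_t, 4>;
using V8i32 = std::array<uint32_t, 8>;
using V4i32 = std::array<uint32_t, 4>;

// Model of a bitcast v4i64 -> v8i32 with x86 (little-endian) lane order:
// lane 2*I holds the low half of element I, lane 2*I+1 the high half.
V8i32 bitcastToV8i32(const V4i64 &V) {
  V8i32 Out{};
  for (std::size_t I = 0; I < 4; ++I) {
    Out[2 * I] = static_cast<uint32_t>(V[I] & 0xFFFFFFFFu);
    Out[2 * I + 1] = static_cast<uint32_t>(V[I] >> 32);
  }
  return Out;
}

// Truncation as a shuffle: keep the even lanes, i.e. shuffle mask {0,2,4,6}.
V4i32 truncateByShuffle(const V4i64 &V) {
  const V8i32 Lanes = bitcastToV8i32(V);
  return {Lanes[0], Lanes[2], Lanes[4], Lanes[6]};
}
```

The result matches an element-wise `static_cast<uint32_t>` of each 64-bit input element.
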
Jan 30, 2012

Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic... · ca29bcfc

Craig Topper authored Jan 30, 2012

Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target specific nodes.

llvm-svn: 149216

ca29bcfc

Jan 23, 2012
- Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching. · 0b7ad76b
  Craig Topper authored Jan 22, 2012
```
llvm-svn: 148670
```
  0b7ad76b
Jan 22, 2012

Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86... · bd488437

Craig Topper authored Jan 22, 2012

Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types, simplifying opcode selection and pattern matching.

llvm-svn: 148667

bd488437

Add target specific ISD node types for SSE/AVX vector shuffle instructions and... · 09462641

Craig Topper authored Jan 22, 2012

Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.

llvm-svn: 148664

09462641

Remove unused X86 ISD node type defines. · cb3433cd
Craig Topper authored Jan 22, 2012
```
llvm-svn: 148644
```
cb3433cd

Jan 19, 2012
- Merge 128-bit and 256-bit SHUFPS/SHUFPD handling. · 80576e8d
  Craig Topper authored Jan 19, 2012
```
llvm-svn: 148466
```
  80576e8d
Jan 08, 2012
- Reverted commit #147601 upon Evan's request. · 540651cf
  Victor Umansky authored Jan 08, 2012
```
llvm-svn: 147748
```
  540651cf
Jan 05, 2012

Peephole optimization of ptest-conditioned branch in X86 arch. Performs... · 9255b6d9

Victor Umansky authored Jan 05, 2012

Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.

Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
llvm-svn: 147601

9255b6d9

Jan 01, 2012
- Merge X86 SHUFPS and SHUFPD node types. · 6e54ba7e
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147394
```
  6e54ba7e
Dec 24, 2011

Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the · 7e9453e9

Chandler Carruth authored Dec 24, 2011

X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.
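
The identity behind that idiom can be checked with a small portable model. For a nonzero 32-bit value the leading-zero count lies in 0..31, and XOR with 31 equals subtraction from 31 on that range, which is the index of the most significant set bit. The helper names below are invented for this sketch, and `clz32` is a portable stand-in for `__builtin_clz`.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for __builtin_clz on a 32-bit value. Like
// CTLZ_ZERO_UNDEF, it has no defined result for zero, so zero is rejected.
unsigned clz32(uint32_t X) {
  assert(X != 0 && "leading-zero count of zero is undefined here");
  unsigned N = 0;
  while (!(X & 0x80000000u)) {
    X <<= 1;
    ++N;
  }
  return N;
}

// Index of the most significant set bit: the value 'bsr' computes directly.
unsigned msbIndex(uint32_t X) {
  assert(X != 0);
  unsigned I = 0;
  while (X >>= 1)
    ++I;
  return I;
}

// The idiom from the commit message: (sizeof(x)*8 - 1) ^ clz(x).
unsigned msbViaXor(uint32_t X) {
  return static_cast<unsigned>(sizeof(X) * 8 - 1) ^ clz32(X);
}

// The "more naive" variant that subtracts instead of using an xor.
unsigned msbViaSub(uint32_t X) {
  return static_cast<unsigned>(sizeof(X) * 8 - 1) - clz32(X);
}
```

Both variants agree with `msbIndex` for every nonzero input, which is why an xor wrapped around a `bsr` and the source-level xor can cancel when they are matched early enough.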

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improved on in this
type of code.

llvm-svn: 147244

7e9453e9

Dec 17, 2011
- Remove an unused X86ISD node type. · a913dde0
  Craig Topper authored Dec 17, 2011
```
llvm-svn: 146833
```
  a913dde0
Dec 16, 2011

Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is... · a4d411cb

Craig Topper authored Dec 16, 2011

Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes.

llvm-svn: 146726

a4d411cb

Dec 11, 2011

Remove some remnants of the old palign pattern fragment that were still hanging... · 1fdfec63

Craig Topper authored Dec 11, 2011

Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast.

llvm-svn: 146344

1fdfec63

Dec 06, 2011
- Merge floating point and integer UNPCK X86ISD node types. · 8d4ba198
  Craig Topper authored Dec 06, 2011
```
llvm-svn: 145926
```
  8d4ba198
Nov 30, 2011
- Merge VPERM2F128/VPERM2I128 ISD node types. · 0a672eaf
  Craig Topper authored Nov 30, 2011
```
llvm-svn: 145485
```
  0a672eaf
- Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node... · bafd224c
  Craig Topper authored Nov 30, 2011
```
Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.

llvm-svn: 145483
```
  bafd224c
Nov 28, 2011

Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge... · 818a983e

Craig Topper authored Nov 28, 2011

Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.

llvm-svn: 145238

818a983e

Nov 26, 2011

Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD.... · 51280d56

Craig Topper authored Nov 26, 2011

Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to the canonicalization that occurs when shuffle nodes are created.

llvm-svn: 145153

51280d56

Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to... · 7704bd7a

Craig Topper authored Nov 26, 2011

Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.

llvm-svn: 145148

7704bd7a