Commits · 7f5088e6defece71bf3153bf037531f68f88d565 · Roger Ferrer / llvm-epi-0.8

Apr 17, 2010
- a bunch of ssse3 instructions are misencoded to think they have an · 7f5088e6
  Chris Lattner authored Apr 17, 2010
```
i8 field when they really do not.  This fixes rdar://7840289

llvm-svn: 101629
```
Apr 15, 2010
- Allow lowering for palignr instructions for mmx sized vectors. Add · eabc9623
  Eric Christopher authored Apr 15, 2010
```
patterns to handle the lowering.

llvm-svn: 101331
```
Apr 08, 2010
- mpsadbw is not commutative. · c0f63cf7
  Eric Christopher authored Apr 08, 2010
```
Fixes PR3440.

llvm-svn: 100736
```
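For context on why operand order matters for mpsadbw: as the instruction is usually described, the first operand supplies an 11-byte sliding window and the second supplies one fixed 4-byte block, so swapping them changes the result. A minimal Python sketch of that behaviour follows; the function name `mpsadbw` and the list-of-bytes representation are illustrative assumptions, not LLVM code.

```python
def mpsadbw(dst, src, imm8):
    """Scalar model of SSE4.1 MPSADBW on two 16-byte sequences (illustrative)."""
    a_off = 4 * ((imm8 >> 2) & 1)  # start of the 11-byte sliding window in dst
    b_off = 4 * (imm8 & 3)         # start of the fixed 4-byte block in src
    # Each of the 8 result words is a sum of 4 absolute byte differences.
    return [
        sum(abs(dst[a_off + i + k] - src[b_off + k]) for k in range(4))
        for i in range(8)
    ]


a = list(range(16))  # bytes 0..15
b = [0] * 16

forward = mpsadbw(a, b, 0)   # window slides over a
swapped = mpsadbw(b, a, 0)   # window slides over b instead
```

With these inputs `forward` and `swapped` differ, which is why treating the instruction as commutative would let the compiler swap operands incorrectly.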
Apr 03, 2010
- Rewrite aesimc handling. It only takes a single input and has a single · 000e502e
  Eric Christopher authored Apr 02, 2010
```
dest.

llvm-svn: 100252
```
Apr 02, 2010

Separate out the AES-NI instructions from the SSE4.2 instructions. Add · 2ef63183

Eric Christopher authored Apr 02, 2010

a new subtarget option for AES and check for the support.  Add "westmere"
line of processors and add AES-NI support to the core i7.

Add a couple of TODOs for information I couldn't verify.

llvm-svn: 100231


Apr 01, 2010

Add aeskeygenassist intrinsic and rename all of the aes intrinsics to · 9002ac5d

Eric Christopher authored Apr 01, 2010

aes instead of sse4.2.  Add a brief todo for a subtarget flag and rework
the aeskeygenassist instruction to more closely match the docs.

llvm-svn: 100078


Mar 31, 2010
- Replace V_SET0 with variants for each SSE execution domain. · 9986ba95
  Jakob Stoklund Olesen authored Mar 31, 2010
```
llvm-svn: 99975
```
- V_SETALLONES is an integer instruction. · 3493398f
  Jakob Stoklund Olesen authored Mar 30, 2010
```
Since it is just a pxor in disguise, we should probably expand it to a full
polymorphic triple.

llvm-svn: 99953
```
Mar 30, 2010
- Remove the pmulld intrinsic and autoupdate it as a vector multiply. · 6ad81677
  Eric Christopher authored Mar 30, 2010
```
Rewrite the pmulld patterns, and make sure that they fold in loads of
arguments into the instruction.

llvm-svn: 99910
```
Mar 29, 2010
- We'll never match these as instructions, just as intrinsics so remove · 9bdadf0d
  Eric Christopher authored Mar 29, 2010
```
the SDNodes.

llvm-svn: 99835
```
Mar 28, 2010
- zap an extra line that Eli noticed! · 11f85ccf
  Chris Lattner authored Mar 28, 2010
```
llvm-svn: 99770
```
- remove a pattern with no testcase that doesn't appear to be · 505849d2
  Chris Lattner authored Mar 28, 2010
```
matchable: it seems like it would always constant fold.

llvm-svn: 99758
```
- fix some modelling problems exposed by a patch I'm working on. bsr/bsf/ptest · ec5fe658
  Chris Lattner authored Mar 28, 2010
```
nodes all have an EFLAGS result when made by isel lowering.

llvm-svn: 99736
```
Mar 25, 2010

Tag SSE2 integer instructions as SSEPackedInt. · 3758ff91
Jakob Stoklund Olesen authored Mar 25, 2010
```
llvm-svn: 99540
```
Reapply Kevin's change 94440, now that Chris has fixed the limitation on · e543e7fc
Bob Wilson authored Mar 25, 2010
```
opcode values fitting in one byte (svn r99494).

llvm-svn: 99514
```

Speculatively revert this to see if it fixes buildbot failures. · 5b2da69f

Bob Wilson authored Mar 24, 2010

--- Reverse-merging r99440 into '.':
U    test/MC/AsmParser/X86/x86_32-bit_cat.s
U    test/MC/AsmParser/X86/x86_32-encoding.s
U    include/llvm/IntrinsicsX86.td
U    include/llvm/CodeGen/SelectionDAGNodes.h
U    lib/Target/X86/X86InstrSSE.td
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 99450


Mar 24, 2010
- Added the Advanced Encryption Standard (AES) Instructions. · f5584a73
  Kevin Enderby authored Mar 24, 2010
```
llvm-svn: 99440
```
Mar 19, 2010

Fixed the encoding problems of the crc32 instructions. All had the Operand size · cf0843ed

Kevin Enderby authored Mar 19, 2010

override prefix and only the r/m16 forms should have had that.  Also for variant
one, the AT&T syntax, added suffixes to all forms.  Also added the missing
64-bit form for 'CRC32 r64, r/m8'.  Plus added test cases for all forms and
tweaked one test case to add the needed suffixes.

llvm-svn: 98980

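The prefix rule described in this commit can be sketched as a small Python helper that returns the bytes preceding the ModRM byte for each crc32 form. The helper name `crc32_encoding_prefix` is hypothetical, the byte values follow the commonly documented encoding as I understand it, and the sketch assumes no extended registers (so REX is either absent or plain REX.W).

```python
def crc32_encoding_prefix(dst_bits, src_bits):
    """Prefix and opcode bytes (before ModRM) for 'crc32 dst, src' (illustrative)."""
    valid_forms = {(32, 8), (32, 16), (32, 32), (64, 8), (64, 64)}
    if (dst_bits, src_bits) not in valid_forms:
        raise ValueError("no such CRC32 form")
    out = []
    if src_bits == 16:
        out.append(0x66)   # operand-size override: only the r/m16 form gets it
    out.append(0xF2)       # mandatory prefix shared by every form
    if dst_bits == 64:
        out.append(0x48)   # REX.W, assuming no extended registers are used
    # Opcode F0 selects the r/m8 source; F1 covers r/m16, r/m32 and r/m64.
    out += [0x0F, 0x38, 0xF0 if src_bits == 8 else 0xF1]
    return bytes(out)


word_form = crc32_encoding_prefix(32, 16)
long_form = crc32_encoding_prefix(32, 32)
quad_byte_form = crc32_encoding_prefix(64, 8)
```

Only `word_form` starts with 0x66; `quad_byte_form` corresponds to the previously missing 'CRC32 r64, r/m8' form.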

Now that tblgen can handle matching implicit defs of instructions · 83facb08

Chris Lattner authored Mar 19, 2010

to input patterns, we can fix X86ISD::CMP and X86ISD::BT as taking
two inputs (which have to be the same type) and *returning an i32*.
This is how the SDNodes get made in the graph, but we weren't able
to model it this way due to deficiencies in the pattern language.

Now we can change things like this:

 def UCOM_FpIr80: FpI_<(outs), (ins RFP80:$lhs, RFP80:$rhs), CompareFP,
-                  [(X86cmp RFP80:$lhs, RFP80:$rhs),
-                   (implicit EFLAGS)]>; // CC = ST(0) cmp ST(i)
+                  [(set EFLAGS, (X86cmp RFP80:$lhs, RFP80:$rhs))]>;

and fix terrible crimes like this:

-def : Pat<(parallel (X86cmp GR8:$src1, 0), (implicit EFLAGS)),
+def : Pat<(X86cmp GR8:$src1, 0),
           (TEST8rr GR8:$src1, GR8:$src1)>;

This relies on matching the result of TEST8rr (which is EFLAGS, which is
an implicit def) to the result of X86cmp, an i32.

llvm-svn: 98903


Mar 15, 2010
- fix a few more ambiguous types. · 26e62737
  Chris Lattner authored Mar 15, 2010
```
llvm-svn: 98531
```
Mar 08, 2010
- fix some more ambiguous patterns, remove another nontemporalstore · d8045649
  Chris Lattner authored Mar 08, 2010
```
pattern which is broken (source and address swapped).

llvm-svn: 97958
```
- remove a non-temporal store pattern which is not tested and · ca8d590c
  Chris Lattner authored Mar 08, 2010
```
could never have matched because the operand list was backwards.

llvm-svn: 97933
```
Feb 28, 2010

Implement XMM subregs. · bdd6405f

Dan Gohman authored Feb 28, 2010

Extracting the low element of a vector is now done with EXTRACT_SUBREG,
and the zero-extension performed by load movss is now modeled with
SUBREG_TO_REG, and so on.

Register-to-register movss and movsd are no longer considered copies;
they are two-address instructions which insert a scalar into a vector.

llvm-svn: 97354


The mayHaveSideEffects flag is no longer used. · 8c5d683a
Dan Gohman authored Feb 27, 2010
```
llvm-svn: 97348
```

Feb 26, 2010
- Delete a bunch of redundant predicates. · 9300486d
  Dan Gohman authored Feb 26, 2010
```
llvm-svn: 97201
```
Feb 23, 2010
- remove a bunch of dead named arguments in input patterns, · d1708923
  Chris Lattner authored Feb 23, 2010
```
though some look dubious afaict, these are all ok.

llvm-svn: 96899
```
Feb 18, 2010
- add a missing type cast. · fd47c797
  Chris Lattner authored Feb 18, 2010
```
llvm-svn: 96574
```
Feb 16, 2010

· 9641d068

David Greene authored Feb 16, 2010

Add support for emitting non-temporal stores for DAGs marked
non-temporal.  Fix from r96241 for botched encoding of MOVNTDQ.

Add documentation for !nontemporal metadata.

Add a simpler movnt testcase.

llvm-svn: 96386


Feb 15, 2010
- revert r96241. It breaks two regression tests, isn't documented, · bcbaaba5
  Chris Lattner authored Feb 15, 2010
```
and the testcase needs improvement.

llvm-svn: 96265
```
- · 63cedef7
  David Greene authored Feb 15, 2010
```
Add support for emitting non-temporal stores for DAGs marked
non-temporal.

llvm-svn: 96241
```
Feb 13, 2010
- Remove special cases for [LM]FENCE, MONITOR and MWAIT from · 064e9263
  Chris Lattner authored Feb 12, 2010
```
encoder and decoder by using new MRM_ forms.

llvm-svn: 96048
```
Feb 12, 2010

Add a missing pattern for movhps so that we get: · c780af64

Nate Begeman authored Feb 12, 2010

movq	(%ecx,%edx,2), %xmm2
movhps	(%ecx,%eax,2), %xmm2

rather than:

movq     (%eax, %edx, 2), %xmm2		
movq     (%eax, %ebx, 2), %xmm3		
movlhps  %xmm3, %xmm2			

Testcase forthcoming.

llvm-svn: 95948

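The two instruction sequences in this commit message produce the same register contents, which a byte-level Python model can check. The helper names below are illustrative assumptions; registers are modelled as 16-byte values with the low half first.

```python
def movq_load(mem, addr):
    """movq mem, xmm: load 8 bytes into the low half and zero the high half."""
    return mem[addr:addr + 8] + bytes(8)


def movhps_load(xmm, mem, addr):
    """movhps mem, xmm: load 8 bytes into the high half, keep the low half."""
    return xmm[:8] + mem[addr:addr + 8]


def movlhps(dst, src):
    """movlhps src, dst: copy the low half of src into the high half of dst."""
    return dst[:8] + src[:8]


mem = bytes(range(32))

# New sequence: movq + movhps (two instructions, one register).
short_seq = movhps_load(movq_load(mem, 0), mem, 16)

# Old sequence: movq + movq + movlhps (three instructions, two registers).
long_seq = movlhps(movq_load(mem, 0), movq_load(mem, 16))
```

Both sequences leave the same 16 bytes in the destination, so the added pattern saves one instruction and one temporary register.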

Feb 10, 2010
- Fix the encoding of the movntdqa X86 instruction. It was missing the 0x66 · a7c1d6cf
  Kevin Enderby authored Feb 10, 2010
```
prefix which is part of the opcode encoding.

llvm-svn: 95729
```
Feb 05, 2010
- really kill off the last MRMInitReg inst, remove logic from encoder. · 86bd1942
  Chris Lattner authored Feb 05, 2010
```
llvm-svn: 95437
```
- lower the last of the MRMInitReg instructions in MCInstLower. · e96d534c
  Chris Lattner authored Feb 05, 2010
```
llvm-svn: 95435
```
Jan 11, 2010

· 206351a1

David Greene authored Jan 11, 2010

Implement a feature (-vector-unaligned-mem) to allow targets to
ignore alignment requirements for SIMD memory operands.  This
is useful on architectures like the AMD 10h that do not trap on
unaligned references if a status bit is twiddled at startup time.

llvm-svn: 93151


Dec 22, 2009

Remove target attribute break-sse-dep. Instead, do not fold load into sse... · 71d7eaa8

Evan Cheng authored Dec 22, 2009

Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size.

llvm-svn: 91910


Dec 18, 2009

On recent Intel u-arch's, folding loads into some unary SSE instructions can · 4cf30b72

Evan Cheng authored Dec 18, 2009

be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.

movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0

instead of
cvtss2sd (%rdi), %xmm0

An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0

llvm-svn: 91672

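The dependency argument in this commit can be illustrated with a deliberately simplified Python model. A register is split into the part a scalar instruction writes (`low`) and everything else (`high`); the type and function names are hypothetical, and real lane widths are ignored.

```python
from typing import NamedTuple


class Xmm(NamedTuple):
    low: float  # the part the scalar instruction writes
    high: int   # the remaining upper bits, kept opaque in this model


def movss_load(mem_value: float) -> Xmm:
    # A loading movss zeroes the upper bits, so the old register is not an input.
    return Xmm(low=mem_value, high=0)


def xorps_self() -> Xmm:
    # xorps %xmm0, %xmm0 clears the register regardless of its old contents.
    return Xmm(low=0.0, high=0)


def cvtss2sd(dst: Xmm, src_value: float) -> Xmm:
    # Only the low part is written; the upper bits are carried over from dst,
    # which is why the old dst value is a (false) input dependency.
    return Xmm(low=float(src_value), high=dst.high)


stale = Xmm(low=9.0, high=0xDEAD)         # left over from unrelated earlier code

folded = cvtss2sd(stale, 1.5)             # cvtss2sd (%rdi), %xmm0
loaded = movss_load(1.5)
unfolded = cvtss2sd(loaded, loaded.low)   # movss (%rdi), %xmm0 ; cvtss2sd %xmm0, %xmm0
cleared = cvtss2sd(xorps_self(), 1.5)     # xorps %xmm0, %xmm0 ; cvtss2sd (%rdi), %xmm0
```

In the folded form the stale upper bits survive, so the instruction has to wait for whatever produced them; the unfolded and cleared forms start from a freshly written register.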

Instruction fixes, added instructions, and AsmString changes in the · 04d8cb74

Sean Callanan authored Dec 18, 2009

X86 instruction tables.

Also (while I was at it) cleaned up the X86 tables, removing tabs and
80-column violations.

This patch was reviewed by Chris Lattner, but please let me know if
there are any problems.

* X86*.td
	Removed tabs and fixed 80-column violations

* X86Instr64bit.td
	(IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW)
		Added
	(CALL, CMOV) Added qualifiers
	(JMP) Added PC-relative jump instruction
	(POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate
		that it is 64-bit only (ambiguous since it has no
		REX prefix)
	(MOV) Added rr form going the other way, which is encoded
		differently
	(MOV) Changed immediates to offsets, which is more correct;
		also fixed MOV64o64a to have a 64-bit offset
	(MOV) Fixed qualifiers
	(MOV) Added debug-register and condition-register moves
	(MOVZX) Added more forms
	(ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which
		(as with MOV) are encoded differently
	(ROL) Made REX.W required
	(BT) Uncommented mr form for disassembly only
	(CVT__2__) Added several missing non-intrinsic forms
	(LXADD, XCHG) Reordered operands to make more sense for
		MRMSrcMem
	(XCHG) Added register-to-register forms
	(XADD, CMPXCHG, XCHG) Added non-locked forms
* X86InstrSSE.td
	(CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ)
		Added
* X86InstrFPStack.td
	(COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP,
	 FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X,
	 FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM,
	 FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE,
	 FXRSTOR)
		Added
	(FCOM, FCOMP) Added qualifiers
	(FSTENV, FSAVE, FSTSW) Fixed opcode names
	(FNSTSW) Added implicit register operand
* X86InstrInfo.td
	(opaque512mem) Added for FXSAVE/FXRSTOR
	(offset8, offset16, offset32, offset64) Added for MOV
	(NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR,
	 LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS,
	 LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT,
	 LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC,
	 CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC,
	 SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL,
	 VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD,
	 VMWRITE, VMXOFF, VMXON) Added
	(NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
	(JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL,
	 JGE, JLE, JG, JCXZ) Added 32-bit forms
	(MOV) Changed some immediate forms to offset forms
	(MOV) Added reversed reg-reg forms, which are encoded
		differently
	(MOV) Added debug-register and condition-register moves
	(CMOV) Added qualifiers
	(AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
	(BT) Uncommented memory-register forms for disassembler
	(MOVSX, MOVZX) Added forms
	(XCHG, LXADD) Made operand order make sense for MRMSrcMem
	(XCHG) Added register-register forms
	(XADD, CMPXCHG) Added unlocked forms
* X86InstrMMX.td
	(MMX_MOVD, MMX_MOVQ) Added forms
* X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table
	change

* X86RegisterInfo.td: Added debug and condition register sets
* x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
* peep-test-3.ll: Fixed testcase to reflect test qualifier
* cmov.ll: Fixed testcase to reflect cmov qualifier
* loop-blocks.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
* 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call
  qualifier
* x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
* live-out-reg-info.ll: Fixed testcase to reflect test qualifier
* tail-opts.ll: Fixed testcase to reflect call qualifiers
* x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
* bss-pagealigned.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
* widen_load-1.ll: Fixed testcase to reflect call qualifier

llvm-svn: 91638


Dec 09, 2009

Optimize splat of a scalar load into a shuffle of a vector load when it's legal. e.g. · 493b882f

Evan Cheng authored Dec 09, 2009

vector_shuffle (scalar_to_vector (i32 load (ptr + 4))), undef, <0, 0, 0, 0>
=>
vector_shuffle (v4i32 load ptr), undef, <1, 1, 1, 1>

iff ptr is 16-byte aligned (or can be made into 16-byte aligned).

llvm-svn: 90984

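The rewrite in this commit can be checked with a small Python model in which memory is a list of 32-bit elements, so `ptr + 4` bytes corresponds to element index 1. The function names are illustrative assumptions, and undef lanes are modelled as `None`.

```python
def vector_shuffle(vec, mask):
    """Pick lanes of vec according to mask (second operand is undef, so omitted)."""
    return [vec[i] for i in mask]


def splat_via_scalar_load(mem, elt):
    """Original form: scalar load, scalar_to_vector, then splat lane 0."""
    scalar = mem[elt]
    return vector_shuffle([scalar, None, None, None], [0, 0, 0, 0])


def splat_via_vector_load(mem, elt):
    """Rewritten form: one aligned 4-element load, then splat the matching lane."""
    base = elt - elt % 4   # start of the enclosing aligned vector
    lane = elt % 4         # lane that holds the wanted scalar
    return vector_shuffle(mem[base:base + 4], [lane] * 4)


mem = [10, 11, 12, 13]

before = splat_via_scalar_load(mem, 1)   # i32 load (ptr + 4), mask <0, 0, 0, 0>
after = splat_via_vector_load(mem, 1)    # v4i32 load ptr, mask <1, 1, 1, 1>
```

Both forms yield the same splat; the second is only legal when the enclosing vector is 16-byte aligned (or can be made so), as the commit message notes.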