Commits · c1215324a308b82f73049cf47c59dbb628f2aefc · Roger Ferrer / llvm-epi-0.8

Jan 03, 2012
- Intel style asm variant does not need '%' prefix. · c1215324
  Devang Patel authored Jan 03, 2012
```
llvm-svn: 147453
```
  c1215324
Jan 02, 2012

Miscellaneous shuffle lowering cleanup. No functional changes. Primarily... · 5bacb7e9

Craig Topper authored Jan 02, 2012

Miscellaneous shuffle lowering cleanup. No functional changes. Primarily converting the indexing loops to unsigned to be consistent across functions.

llvm-svn: 147430

5bacb7e9

Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also... · 53d55964

Craig Topper authored Jan 02, 2012

Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection.

llvm-svn: 147428

53d55964

Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend... · 6c7a0e6c

Nadav Rotem authored Jan 02, 2012

Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit.

llvm-svn: 147426

6c7a0e6c
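The reasoning in this commit can be illustrated with a scalar model of a single blend lane. This is only a sketch: `blendLane`, `signExtendBit`, `shiftToTopBit`, and `blendsAgree` are hypothetical helper names, not LLVM code, and the 32-bit lane width is an assumption chosen for illustration. It shows that a mask built by sign extension and a mask built by a left shift select the same input when only the top bit is inspected.

```cpp
#include <cstdint>
#include <initializer_list>

// Scalar model of one lane of a variable blend (an assumption for
// illustration): only the top bit of the mask lane picks the input.
inline uint32_t blendLane(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & 0x80000000u) ? b : a;
}

// Sign-extend a 1-bit condition to a full lane: 0 or 0xFFFFFFFF.
inline uint32_t signExtendBit(bool x) { return x ? 0xFFFFFFFFu : 0u; }

// Shift the condition into the top bit instead: 0 or 0x80000000.
inline uint32_t shiftToTopBit(bool x) {
  return static_cast<uint32_t>(x) << 31;
}

// Returns true when both mask constructions select the same input
// for either value of the condition bit.
inline bool blendsAgree(uint32_t a, uint32_t b) {
  for (bool x : {false, true}) {
    if (blendLane(signExtendBit(x), a, b) !=
        blendLane(shiftToTopBit(x), a, b))
      return false;
  }
  return true;
}
```

In this model the shift form needs one operation where a shift-based sign extension would need two, which is presumably the saving the commit is after.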

Jan 01, 2012
- Allow CRC32 instructions to be selected when AVX is enabled. · b9109844
  Craig Topper authored Jan 01, 2012
```
llvm-svn: 147411
```
  b9109844
- Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is... · 1c064e0a
  Craig Topper authored Jan 01, 2012
```
Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.

llvm-svn: 147409
```
  1c064e0a
- X86Disassembler: Fix undefined behavior found by GCC 4.6 · 47aecca5
  Benjamin Kramer authored Jan 01, 2012
```
llvm-svn: 147404
```
  47aecca5
- Merge X86 SHUFPS and SHUFPD node types. · 6e54ba7e
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147394
```
  6e54ba7e
- Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load. · d51092d9
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147393
```
  d51092d9
- Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a... · 0e796fee
  Craig Topper authored Dec 31, 2011
```
Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.

llvm-svn: 147392
```
  0e796fee
Dec 30, 2011

Cleanup Mips code and rename some variables. Patch by Jack Carter · cd1d447d
Bruno Cardoso Lopes authored Dec 30, 2011
```
llvm-svn: 147383
```
cd1d447d

Improve Mips JIT. · d5b2834f

Bruno Cardoso Lopes authored Dec 30, 2011

Implement encoder methods getJumpTargetOpValue and getBranchTargetOpValue
for jmptarget and brtarget Mips tablegen operand types in the code emitter
for old-style JIT. Rename the pc relative relocation for branches - new
name is Mips::reloc_mips_pc16.

Patch by Sasa Stankovic

llvm-svn: 147382

d5b2834f

Make FMA4 imply AVX so that YMM registers would be available. Necessitates... · a5d1fc2c

Craig Topper authored Dec 30, 2011

Make FMA4 imply AVX so that YMM registers would be available. Necessitates removing FMA4 from Bulldozer CPU types since it would enable AVX code generation implicitly. Also make SSE4A imply SSE3. Without some level of SSE implied, XMM registers wouldn't be legal.

llvm-svn: 147369

a5d1fc2c
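The idea of one feature implying another can be sketched as a transitive closure over an "implies" table. This is a minimal illustration, not LLVM's subtarget feature machinery: `FeatureMap`, `exampleImplications`, and `expandFeatures` are hypothetical names, and the `sse3` to `sse2` edge is added here only to show transitivity. The first two edges are the ones named in the commit message.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using FeatureMap = std::map<std::string, std::vector<std::string>>;

// Illustrative subset of "implies" edges (hypothetical table).
inline FeatureMap exampleImplications() {
  return {
      {"fma4", {"avx"}},   // from the commit: FMA4 implies AVX
      {"sse4a", {"sse3"}}, // from the commit: SSE4A implies SSE3
      {"sse3", {"sse2"}},  // assumption, included to show transitivity
  };
}

// Expand a requested feature list into everything it implies,
// using a simple worklist so chains of implications are followed.
inline std::set<std::string>
expandFeatures(const FeatureMap &implies,
               const std::vector<std::string> &requested) {
  std::set<std::string> enabled;
  std::vector<std::string> worklist(requested);
  while (!worklist.empty()) {
    std::string feature = worklist.back();
    worklist.pop_back();
    if (!enabled.insert(feature).second)
      continue; // already processed
    auto it = implies.find(feature);
    if (it != implies.end())
      worklist.insert(worklist.end(), it->second.begin(), it->second.end());
  }
  return enabled;
}
```

Under this model, listing `fma4` on a CPU type would switch on `avx` as well, which is why the commit has to drop FMA4 from the Bulldozer definitions to avoid enabling AVX code generation implicitly.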

Add disassembler support for VPERMIL2PD and VPERMIL2PS. · 2ba766ae
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147368
```
2ba766ae
Add FMA4 instructions to disassembler. · 03a0beda
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147367
```
03a0beda

Separate the concept of having memory access in operand 4 from the concept of... · cd93de93

Craig Topper authored Dec 30, 2011

Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructions. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation.

llvm-svn: 147366

cd93de93

Combine FMA4 SS/SD patterns with the instruction definitions. · c0f9bcb5
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147365
```
c0f9bcb5
Combine FMA4 PS/PD patterns with the instruction definitions. · 51fe43fc
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147364
```
51fe43fc

Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to... · 6c08930c

Craig Topper authored Dec 30, 2011

Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.

llvm-svn: 147361

6c08930c

Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size,... · 2ca79b9d

Craig Topper authored Dec 30, 2011

Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.

llvm-svn: 147360

2ca79b9d

Cleanup stack/frame register define/kill states. This fixes two bugs: · 692d1fb3

Hal Finkel authored Dec 30, 2011

1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test).

2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this.

llvm-svn: 147359

692d1fb3

Dec 29, 2011
- Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 instructions. · d773607e
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147353
```
  d773607e
- Expose FMA3 instructions to the disassembler. · 8cab06a2
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147351
```
  8cab06a2
- Make FMA3 imply that AVX is enabled. Particularly because 256-bit types... · e1bd0512
  Craig Topper authored Dec 29, 2011
```
Make FMA3 imply that AVX is enabled. Particularly because 256-bit types aren't valid unless AVX is enabled.

llvm-svn: 147349
```
  e1bd0512
- Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit. · dd286a52
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147348
```
  dd286a52
- Add FeaturePOPCNT to all CPU types that lost it when it was removed from SSE42/SSE4A in r147339. · a060afb5
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147347
```
  a060afb5
- Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled... · 97f05c57
  Craig Topper authored Dec 29, 2011
```
Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet.

llvm-svn: 147345
```
  97f05c57
- Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along... · 1559123c
  Craig Topper authored Dec 29, 2011
```
Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES. Since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior though since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms.

llvm-svn: 147344
```
  1559123c
- Remove the separate explicit AES instruction patterns. They are equivalent to... · 9e61291b
  Craig Topper authored Dec 29, 2011
```
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.

llvm-svn: 147342
```
  9e61291b
- Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on... · 7bd3305f
  Craig Topper authored Dec 29, 2011
```
Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A.

llvm-svn: 147339
```
  7bd3305f
- Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8. · 0fdf720d
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147337
```
  0fdf720d
- Remove some elses after returns. · 862c9b65
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147336
```
  862c9b65
- Remove trailing spaces. Fix an assert to use && instead of || before string.... · 274e20a4
  Craig Topper authored Dec 29, 2011
```
Remove trailing spaces. Fix an assert to use && instead of || before string. Add same assert on similar code path.

llvm-svn: 147335
```
  274e20a4
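The `&&` versus `||` fix in 274e20a4 concerns the common `assert(cond && "message")` idiom. The sketch below models the two forms as plain functions so the difference is visible without aborting the program. `checkWithOr` and `checkWithAnd` are hypothetical names, and the message is passed as a pointer only to stand in for the string literal that would appear inside the assert.

```cpp
// In assert(cond && "msg"), the string literal is a non-null pointer,
// so it converts to true and the result is just `cond`.
inline bool checkWithAnd(bool cond, const char *msg) {
  return cond && msg;
}

// With || the non-null message makes the whole expression true,
// so an assert written this way can never fire.
inline bool checkWithOr(bool cond, const char *msg) {
  return cond || msg;
}
```

For a failing condition, `checkWithAnd(false, "bad index")` yields false while `checkWithOr(false, "bad index")` yields true, which is why the `||` form silently disables the check.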
Dec 28, 2011
- Fix type-checking for load transformation which is not legal on floating-point types. PR11674. · 3a01ddb7
  Eli Friedman authored Dec 28, 2011
```
llvm-svn: 147323
```
  3a01ddb7
- Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR. · b3515a8d
  Elena Demikhovsky authored Dec 28, 2011
```
Matching MOVLP mask for AVX (256-bit vectors) was wrong.
The failure was detected by conformance tests.

llvm-svn: 147308
```
  b3515a8d
Dec 27, 2011

Clean up some Release build warnings. · b668401b
Benjamin Kramer authored Dec 27, 2011
```
llvm-svn: 147289
```
b668401b

Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for... · df34d152

Craig Topper authored Dec 27, 2011

Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't.

llvm-svn: 147287

df34d152

Dec 25, 2011
- Sparc: Implement emitFrameIndexDebugValue and getDebugValueLocation hooks. · 1fc8263b
  Venkatraman Govindaraju authored Dec 25, 2011
```
llvm-svn: 147269
```
  1fc8263b
Dec 24, 2011

Section relative fixups are a coff concept, not a x86 one. Replace the · a56ab0ed
Rafael Espindola authored Dec 24, 2011
```
x86 specific reloc_coff_secrel32 with a generic FK_SecRel_4.

llvm-svn: 147252
```
a56ab0ed

Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the · a3d54fe0

Chandler Carruth authored Dec 24, 2011

LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding since the fix-ups necessary are just as complex for
either promoted type.

We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation, and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]

llvm-svn: 147251

a3d54fe0
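The BSR argument in a3d54fe0 can be worked through with a small software model. This is a sketch under stated assumptions: `bsr`, `ctlz8Promoted`, and `ctlz8Direct` are hypothetical helpers rather than anything in the X86 backend, and the input is assumed to be nonzero because BSR leaves its result undefined for zero.

```cpp
#include <cstdint>

// Software model of BSR: index of the highest set bit.
// The input must be nonzero.
inline unsigned bsr(uint32_t v) {
  unsigned idx = 0;
  while (v >>= 1)
    ++idx;
  return idx;
}

// Promoted path: compute a 32-bit CTLZ as (bsr ^ 31), then subtract
// the 24 extra leading zeros introduced by widening i8 to i32.
inline unsigned ctlz8Promoted(uint8_t x) { return (bsr(x) ^ 31u) - 24u; }

// Custom path: knowing the value is 8 bits wide, xor against 7 directly.
inline unsigned ctlz8Direct(uint8_t x) { return bsr(x) ^ 7u; }
```

Both paths give the same count for every nonzero 8-bit input, but the direct form needs a single xor where the promoted form needs an xor followed by a subtraction, which matches the "perfect xor" described in the message.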