- Jan 01, 2012
- Benjamin Kramer authored
llvm-svn: 147404
- Craig Topper authored
llvm-svn: 147394
- Craig Topper authored
llvm-svn: 147393
- Craig Topper authored
Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.
llvm-svn: 147392
- Dec 30, 2011
- Bruno Cardoso Lopes authored
llvm-svn: 147383
- Bruno Cardoso Lopes authored
Implement the encoder methods getJumpTargetOpValue and getBranchTargetOpValue for the jmptarget and brtarget Mips tablegen operand types in the code emitter for the old-style JIT. Rename the pc-relative relocation for branches; the new name is Mips::reloc_mips_pc16. Patch by Sasa Stankovic.
llvm-svn: 147382
- Craig Topper authored
Make FMA4 imply AVX so that YMM registers are available. This necessitates removing FMA4 from the Bulldozer CPU types, since it would otherwise enable AVX code generation implicitly. Also make SSE4A imply SSE3: without some level of SSE implied, XMM registers wouldn't be legal.
llvm-svn: 147369
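A minimal sketch of the feature-implication mechanism this relies on (hypothetical table and names; LLVM's real data lives in the X86.td SubtargetFeature definitions): enabling one feature transitively enables everything it implies, which is exactly why FMA4 had to come off the Bulldozer CPU definitions once it implied AVX.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical implication table; illustrative only.
static const std::map<std::string, std::vector<std::string>> Implies = {
    {"fma4", {"avx"}}, {"sse4a", {"sse3"}}, {"avx", {"sse42"}}};

// Turning on a feature recursively turns on everything it implies, so a
// CPU listing "fma4" would implicitly get "avx" -- the effect the commit
// avoids by removing FMA4 from the Bulldozer definitions.
void enableFeature(const std::string &F, std::set<std::string> &Enabled) {
  if (!Enabled.insert(F).second)
    return; // already enabled
  auto It = Implies.find(F);
  if (It == Implies.end())
    return;
  for (const std::string &Dep : It->second)
    enableFeature(Dep, Enabled);
}
```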
- Craig Topper authored
llvm-svn: 147368
- Craig Topper authored
llvm-svn: 147367
- Craig Topper authored
Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructions. Removes ORing of W-bits in the encoder and will similarly simplify the disassembler implementation.
llvm-svn: 147366
- Craig Topper authored
llvm-svn: 147365
- Craig Topper authored
llvm-svn: 147364
- Craig Topper authored
Change FMA4 memory forms to use memopv* instead of alignedloadv*. There is no need to force alignment on these instructions. Add a couple of testcases for the memory forms.
llvm-svn: 147361
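To illustrate the difference at the source level (a hedged sketch using the FMA4 compiler intrinsics rather than the .td patterns themselves): memopv*-style patterns allow a load with no alignment guarantee to be folded into the instruction, where alignedloadv* would refuse.

```cpp
#include <x86intrin.h> // FMA4 intrinsics; build with -mfma4

// The load from 'p' carries no 16-byte alignment guarantee; with memopv*
// patterns the compiler may still fold it into the FMA4 memory form.
__m128 fma4_unaligned(__m128 a, __m128 b, const float *p) {
  __m128 c = _mm_loadu_ps(p);  // unaligned 128-bit load
  return _mm_macc_ps(a, b, c); // a*b + c
}
```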
- Craig Topper authored
Fix the load size for FMA4 SS/SD instructions. They need to use the f32 and f64 sizes, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.
llvm-svn: 147360
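Roughly what the fix means in user terms (a sketch with intrinsics; function and parameter names are illustrative): the SS form should read only a 32-bit float from memory even though the intrinsic's operands are full vectors.

```cpp
#include <x86intrin.h> // FMA4 intrinsics; build with -mfma4

// Only 4 bytes are read from 'p', matching the f32 load size in the
// pattern, while the intrinsic itself still traffics in __m128 values.
__m128 fma4_scalar(__m128 a, __m128 b, const float *p) {
  __m128 c = _mm_load_ss(p);   // 32-bit load, upper lanes zeroed
  return _mm_macc_ss(a, b, c); // a*b + c in lane 0 only
}
```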
- Hal Finkel authored
1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test).
2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this.
llvm-svn: 147359
- Dec 29, 2011
- Craig Topper authored
llvm-svn: 147353
- Craig Topper authored
llvm-svn: 147351
- Craig Topper authored
Make FMA3 imply that AVX needs to be enabled, particularly because 256-bit types aren't valid unless AVX is enabled.
llvm-svn: 147349
- Craig Topper authored
llvm-svn: 147348
- Craig Topper authored
llvm-svn: 147347
- Craig Topper authored
Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior, since the CLMUL instructions don't have patterns yet.
llvm-svn: 147345
- Craig Topper authored
Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES, since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior, though, since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms.
llvm-svn: 147344
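For context, a small example of why the predicate matters (assumes the standard AES-NI intrinsics): the instruction operates on XMM registers holding integer data, a register class that only becomes legal once SSE2 is enabled.

```cpp
#include <wmmintrin.h> // AES-NI intrinsics; build with -maes (implies SSE2)

// One AES encryption round: both operands are integer-typed XMM values,
// which is exactly what requires SSE2 alongside the AES feature.
__m128i aes_round(__m128i state, __m128i round_key) {
  return _mm_aesenc_si128(state, round_key);
}
```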
- Craig Topper authored
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
llvm-svn: 147342
- Craig Topper authored
Make SSE4.2 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A.
llvm-svn: 147339
- Craig Topper authored
llvm-svn: 147337
- Craig Topper authored
llvm-svn: 147336
- Craig Topper authored
Remove trailing spaces. Fix an assert to use && instead of || before the string. Add the same assert on a similar code path.
llvm-svn: 147335
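The assert fix is worth spelling out, since it's an easy bug to write:

```cpp
#include <cassert>

void example(bool ok) {
  // Correct: the string literal documents the failure without changing
  // the condition.
  assert(ok && "something went wrong");

  // Broken: a non-null string literal is always truthy, so the whole
  // expression is true and the assert can never fire.
  // assert(ok || "something went wrong");
}
```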
- Dec 28, 2011
- Eli Friedman authored
llvm-svn: 147323
- Elena Demikhovsky authored
Matching the MOVLP mask for AVX (256-bit vectors) was wrong. The failure was detected by conformance tests.
llvm-svn: 147308
- Dec 27, 2011
- Benjamin Kramer authored
llvm-svn: 147289
- Craig Topper authored
Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine, since the enabling of the combine is conditional on it, but the function itself isn't.
llvm-svn: 147287
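The known-bits fact being modeled, shown with the SSE2 variant (a sketch; the commit's code works on the SelectionDAG node, not on intrinsics): pmovmskb packs one bit per input byte, so the upper bits of the i32 result are always zero.

```cpp
#include <emmintrin.h> // SSE2

// _mm_movemask_epi8 produces one bit per byte of a 16-byte vector, so
// bits 16..31 of the result are known zero -- the kind of fact
// computeMaskedBits tracks; the AVX2 form covers 32 bytes instead.
bool upper_bits_zero(__m128i v) {
  int mask = _mm_movemask_epi8(v);
  return (mask >> 16) == 0; // always true
}
```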
- Dec 25, 2011
- Venkatraman Govindaraju authored
llvm-svn: 147269
- Dec 24, 2011
- Rafael Espindola authored
Replace the x86-specific reloc_coff_secrel32 with a generic FK_SecRel_4.
llvm-svn: 147252
- Chandler Carruth authored
LZCNT instructions are available. Force promotion to i32 to get a smaller encoding, since the fix-ups necessary are just as complex for either promoted type. We can't do standard promotion for CTLZ when lowering through BSR because it results in poor code surrounding the 'xor' at the end of this instruction. Essentially, if we promote the entire CTLZ node to i32, we end up doing the xor on a 32-bit CTLZ implementation, and then subtracting appropriately to get back to an i8 value. Instead, our custom logic just uses the knowledge of the incoming size to compute a perfect xor. I'd love to know of a way to fix this, but so far I'm drawing a blank. I suspect the legalizer could be more clever and/or it could collude with the DAG combiner, but how... ;]
llvm-svn: 147251
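The "perfect xor" in question, sketched in plain C++ terms: bsr returns the MSB index, and for an 8-bit input that index never exceeds 7, so 7 - idx and 7 ^ idx coincide; the promote-everything-to-i32 route would instead need a subtract of 24 after the 32-bit xor.

```cpp
#include <cassert>
#include <cstdint>

// ctlz of an 8-bit value via a single xor on the bsr result.
unsigned ctlz8(uint8_t x) {
  assert(x != 0 && "bsr/clz are undefined for zero input");
  unsigned msb = 31u - __builtin_clz(x); // what x86 'bsr' computes
  return 7u ^ msb; // msb <= 7 here, so 7 ^ msb == 7 - msb
}
```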
- Chandler Carruth authored
inspection earlier.
llvm-svn: 147250
- Benjamin Kramer authored
llvm-svn: 147247
- Chandler Carruth authored
'bsf' instructions here. This one is actually debatable to my eyes. It's not clear that any chip implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding. Still, this restores the old behavior with 'tzcnt' enabled for now.
llvm-svn: 147246
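The behavioral difference at stake, in portable form (a sketch): 'tzcnt' is defined for a zero input, returning the operand width, while 'bsf' leaves its destination undefined, so a branch-free count-trailing-zeros needs the former.

```cpp
#include <cstdint>

// tzcnt semantics: __builtin_ctz matches bsf/tzcnt for nonzero inputs,
// but only tzcnt defines the zero case (returning 32 for a 32-bit operand).
unsigned cttz32(uint32_t x) {
  return x ? __builtin_ctz(x) : 32u;
}
```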
- Chandler Carruth authored
X86ISelLowering C++ code. Because this is lowered via an xor wrapped around a bsr, we want the dagcombine which runs after isel lowering to have a chance to clean things up. In particular, it is very common to see code which looks like:
(sizeof(x)*8 - 1) ^ __builtin_clz(x)
which is trying to compute the most significant bit of 'x'. That's actually the value computed directly by the 'bsr' instruction, but if we match it too late, we'll get completely redundant xor instructions. The more naive code for the above (subtracting rather than using an xor) still isn't handled correctly due to the dagcombine getting confused.
Also, while here, fix an issue spotted by inspection: we should have been expanding the zero-undef variants to the normal variants when there is an 'lzcnt' instruction. Do so, and test for this. We don't want to generate unnecessary 'bsr' instructions.
These two changes fix some regressions in encoding and decoding benchmarks. However, there is still a *lot* to be improved on in this type of code.
llvm-svn: 147244
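The idiom from the message, in runnable form: the xor expression computes the index of the most significant set bit, which is exactly bsr's result, so matching it early lets the two xors cancel.

```cpp
#include <cassert>
#include <cstdint>

// (sizeof(x)*8 - 1) ^ clz(x) == 31 - clz(x) == index of the most
// significant set bit -- precisely what 'bsr' returns, so the combine
// can drop both xors and keep the bare instruction.
unsigned msb_index(uint32_t x) {
  assert(x != 0 && "clz/bsr undefined for zero");
  return (sizeof(x) * 8 - 1) ^ __builtin_clz(x);
}
```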
- Jakob Stoklund Olesen authored
llvm-svn: 147238
- Akira Hatanaka authored
loadRegFromStackSlot.
llvm-svn: 147235