- Oct 16, 2013
-
-
Rafael Espindola authored
We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before. llvm-svn: 192760
-
Rafael Espindola authored
No functionality change, but exposes the API so that codegen can use it too. Patch by Katya Romanova. llvm-svn: 192757
-
Andrew Trick authored
This changes the SelectionDAG scheduling preference to source order. Soon, the SelectionDAG scheduler can be bypassed saving a nice chunk of compile time. Performance differences that result from this change are often a consequence of register coalescing. The register coalescer is far from perfect. Bugs can be filed for deficiencies. On x86 SandyBridge/Haswell, the source order schedule is often preserved, particularly for small blocks. Register pressure is generally improved over the SD scheduler's ILP mode. However, we are still able to handle large blocks that require latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also attempts to discover the critical path in single-block loops and adjust heuristics accordingly. The MI scheduler relies on the new machine model. This is currently unimplemented for AVX, so we may not be generating the best code yet. Unit tests are updated so they don't depend on SD scheduling heuristics. llvm-svn: 192750
-
- Oct 15, 2013
-
-
Michael Liao authored
- Type of index used in extract_vector_elt or insert_vector_elt supposes to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd better to truncate (or zero-extend in case it's changed later) it to mask element type to guarantee they are matching instead of asserting that. llvm-svn: 192722
-
Michael Liao authored
- Lower signed division by constant powers-of-2 to target-independent DAG operators instead of target-dependent ones to support them better on targets where vector types are legal but shift operators on that types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16> though <16 x i16> is a legal type. llvm-svn: 192721
-
Craig Topper authored
Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. llvm-svn: 192672
-
Quentin Colombet authored
through bitcast, ptrtoint, and inttoptr instructions. This is valid only if the related instructions are in that same basic block, otherwise we may reference variables that were not live accross basic blocks resulting in undefined virtual registers. The bug was exposed when both SDISel and FastISel were used within the same function, i.e., one basic block is issued with FastISel and another with SDISel, as demonstrated with the testcase. <rdar://problem/15192473> llvm-svn: 192636
-
Andrew Trick authored
This pass is needed to break false dependencies. Without it, unlucky register assignment can result in wild (5x) swings in performance. This pass was trying to handle AVX but not getting it right. AVX doesn't have partial register defs, it has unused register reads in which the high bits of a source operand are copied into the unused bits of the dest. Fixing this requires conservative liveness analysis. This is awkard because the pass already has its own pseudo-liveness. However, proper liveness is expensive, and we would like to use a generic utility to compute it. The fix only invokes liveness on-demand. It is rare to detect a case that needs undef-read dependence breaking, but when it happens, it can be needed many times within a very large block. I think the existing heuristic which uses a register window of 16 is too conservative for loop-carried false dependencies. If the loop is a reduction. The out-of-order engine may be able to execute several loop iterations in parallel. However, I'll leave this tuning exercise for next time. llvm-svn: 192635
-
Andrew Trick authored
llvm-svn: 192633
-
- Oct 14, 2013
-
-
Eric Christopher authored
a) x86-64 TLS has been documented b) the code path should use movq for the correct relocation to be generated. I've also added a fixme for the test case that we should improve the code generated, it should look something like is documented in the tls abi document. llvm-svn: 192631
-
Eric Christopher authored
llvm-svn: 192630
-
Eric Christopher authored
llvm-svn: 192629
-
Elena Demikhovsky authored
The alignment of allocated space was wrong, see Bugzila 17345. Done by Zvi Rackover <zvi.rackover@intel.com>. llvm-svn: 192573
-
Craig Topper authored
llvm-svn: 192568
-
Craig Topper authored
Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps instructions to parse either GR32 or GR64 without resorting to duplicating instructions. llvm-svn: 192567
-
Craig Topper authored
Add disassembler support for SSE4.1 register/register form of PEXTRW. There is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding. llvm-svn: 192566
-
Craig Topper authored
Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the disassembler tables. Add PINSRWrr64i to complement the AVX version. llvm-svn: 192565
-
Craig Topper authored
Don't use 64-bit versions of MOVMSKPD in CodeGen. The instructions only produce a 1-bit result so we can just use SUBREG_TO_REG to extend the 32-bit versions. llvm-svn: 192562
-
- Oct 12, 2013
-
-
Craig Topper authored
llvm-svn: 192525
-
Craig Topper authored
llvm-svn: 192522
-
- Oct 10, 2013
-
-
Craig Topper authored
llvm-svn: 192341
-
Craig Topper authored
llvm-svn: 192340
-
- Oct 09, 2013
-
-
Elena Demikhovsky authored
llvm-svn: 192283
-
Andrew Trick authored
This was only working because AVX had cheaper rules in all cases. I'm sure there are other places in this file where predicates are missing. llvm-svn: 192276
-
Craig Topper authored
Replace a couple instructions with patterns referring to other instructions with same encoding and operands. Mark a couple other instructions as CodeGenOnly since we have FR and VR instructions and only one of them is needed by the assembler/disassembler. llvm-svn: 192274
-
Craig Topper authored
Use AVX512PIi8 for the alt forms of vcmp instructions. This adds the TB prefix and keeps the mnemonic from starting with an extra 'v' llvm-svn: 192272
-
Craig Topper authored
Mark some instructions as CodeGenOnly since they aren't needed by the assembler or disassembler. Disassembler already filtered them, but asm parser still had them in its tables. llvm-svn: 192271
-
Craig Topper authored
Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This way the asm parser will pick the right one based on the mode. Instruction selection already did the right thing based on the pointer size. llvm-svn: 192266
-
- Oct 08, 2013
-
-
Rafael Espindola authored
This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181
-
Craig Topper authored
Remove unneeded MMX instruction definition by moving pattern to an equivalent instruction definition and removing the filtering from the disassembler table building. llvm-svn: 192175
-
Craig Topper authored
Remove some instructions that existed to provide aliases to the assembler. Can be done with InstAlias instead. Unfortunately, this was causing printer to use 'vmovq' or 'vmovd' based on what was parsed. To cleanup the inconsistencies convert all 'vmovd' with 64-bit registers to 'vmovq', but provide an alias so that 'vmovd' will still parse. llvm-svn: 192171
-
- Oct 07, 2013
-
-
Benjamin Kramer authored
Fixes PR17495, where an i24 triggered this code. It's intended to optimize i64 loads on 32 bit x86. llvm-svn: 192123
-
Rafael Espindola authored
They haven't been used for a long time. Patch by MathOnNapkins. llvm-svn: 192099
-
Craig Topper authored
Remove some instructions that seem to only exist to trick the filtering checks in the disassembler table creation. Just fix up the filter to let the real instruction through instead. llvm-svn: 192090
-
Craig Topper authored
llvm-svn: 192089
-
Craig Topper authored
Teach X86 asm parser that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not. This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior and instruction selection already does this. llvm-svn: 192088
-
Craig Topper authored
llvm-svn: 192086
-
- Oct 06, 2013
-
-
Benjamin Kramer authored
Regalloc can emit unaligned spills nowadays, but we can't fold the spills into SSE ops if we can't guarantee alignment. PR12250. llvm-svn: 192064
-
Elena Demikhovsky authored
Fixed load folding in VPERM2I instruction. llvm-svn: 192063
-
Elena Demikhovsky authored
in case of BLEND and added VSHUFPS patterns. llvm-svn: 192055
-