Commits · 6174e5ee681991af5a7dc907c3e68fbe12968251 · Roger Ferrer / llvm-epi-0.8

Oct 16, 2013

Add a MCAsmInfoELF class and factor some code into it. · 43c4e24f
Rafael Espindola authored Oct 16, 2013
```
We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before.

llvm-svn: 192760
```
43c4e24f

Move .ident handling to MCStreamer. · 5645bade

Rafael Espindola authored Oct 16, 2013

No functionality change, but exposes the API so that codegen can use it too.

Patch by Katya Romanova.

llvm-svn: 192757

5645bade

Enable MI Sched for x86. · e97d8d6d

Andrew Trick authored Oct 15, 2013

This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed, saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

llvm-svn: 192750

e97d8d6d

Oct 15, 2013

Fix PR17546 · ad71659d

Michael Liao authored Oct 15, 2013

- The type of the index used in extract_vector_elt or insert_vector_elt is
  supposed to be TLI.getVectorIdxTy(), which is the pointer type on most
  targets. It is better to truncate it (or zero-extend it, in case that
  changes later) to the mask element type to guarantee that they match,
  instead of asserting that they do.

llvm-svn: 192722

ad71659d

Fix PR16807 · 8ba06821

Michael Liao authored Oct 15, 2013

- Lower signed division by constant powers of 2 to target-independent
  DAG operators instead of target-dependent ones, to support them better
  on targets where vector types are legal but shift operators on those
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.

llvm-svn: 192721

8ba06821
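
The lowering described in this commit rests on a well-known identity: signed division by a power of two can be written with shifts and an add only. The scalar sketch below illustrates that identity; it is not the code from the patch. The function name `sdivPow2` is hypothetical, and the exact DAG node sequence the patch emits is an assumption on our part.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch (hypothetical helper, not LLVM code):
// signed division of a 32-bit value by 2^k, rounding toward zero,
// using only shifts and an add. Valid for 1 <= k <= 31.
// Note: right-shifting a negative signed value is arithmetic on all
// mainstream compilers and is guaranteed to be so since C++20.
int32_t sdivPow2(int32_t x, unsigned k) {
  assert(k >= 1 && k <= 31);
  // Arithmetic shift: all ones if x is negative, zero otherwise.
  int32_t sign = x >> 31;
  // Logical shift keeps the low k bits of the sign mask:
  // bias is 2^k - 1 for negative x and 0 otherwise.
  uint32_t bias = static_cast<uint32_t>(sign) >> (32 - k);
  // Adding the bias makes the final arithmetic shift round toward zero.
  return (x + static_cast<int32_t>(bias)) >> k;
}
```

For example, `sdivPow2(-7, 1)` yields `-3`, matching `-7 / 2` in C++, whereas a plain arithmetic shift would give `-4`.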

Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from... · ef9e993e

Craig Topper authored Oct 15, 2013

Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.

llvm-svn: 192672

ef9e993e
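
A minimal software model of why the two intrinsics are interchangeable, assuming the usual description of the SSE4.2 CRC32 instruction as a CRC-32C (Castagnoli) update with no implicit inversion. The function names `crc32c_u8` and `crc32c_64_8` are hypothetical and stand in for the 32-bit intrinsic and the removed 64-bit one; the trunc/zext wrapper mirrors the AutoUpgrade described in the message, but it is a sketch, not the LLVM implementation.

```cpp
#include <cstdint>

// Hypothetical software model of one byte step of the CRC32 instruction:
// bitwise CRC-32C update using the reflected polynomial 0x82F63B78.
uint32_t crc32c_u8(uint32_t crc, uint8_t data) {
  crc ^= data;
  for (int i = 0; i < 8; ++i) {
    // If the low bit is set, shift and fold in the polynomial.
    crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc;
}

// Hypothetical model of the removed 64-bit form, expressed through the
// 32-bit one: truncate the accumulator, update, then zero-extend.
uint64_t crc32c_64_8(uint64_t crc, uint8_t data) {
  uint32_t narrow = static_cast<uint32_t>(crc);            // trunc i64 -> i32
  return static_cast<uint64_t>(crc32c_u8(narrow, data));   // zext i32 -> i64
}
```

Under this model the upper 32 bits of the 64-bit accumulator never influence the result, which is the sense in which the two forms have no functional difference.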

[X86][FastISel] During X86 fastisel, the address of indirect call was resolved · 778dba1d

Quentin Colombet authored Oct 14, 2013

through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in the same basic block, otherwise
we may reference variables that were not live across basic blocks,
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636

778dba1d

Fix the ExecutionDepsFix pass to handle AVX instructions. · b6d56be6

Andrew Trick authored Oct 14, 2013

This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.

Fixing this requires conservative liveness analysis. This is awkward
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.

I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction, the out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.

llvm-svn: 192635

b6d56be6

whitespace · 8460a3bf
Andrew Trick authored Oct 14, 2013
```
llvm-svn: 192633
```
8460a3bf

Oct 14, 2013

Revert part of a fix from 2010, changes since then: · 74002574

Eric Christopher authored Oct 14, 2013

a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a FIXME to the test case noting that we should improve
the generated code; it should look something like what is documented
in the TLS ABI document.

llvm-svn: 192631

74002574

Reformat this routine slightly. · 755711e5
Eric Christopher authored Oct 14, 2013
```
llvm-svn: 192630
```
755711e5

Remove some extraneous whitespace. · 584d71c6
Eric Christopher authored Oct 14, 2013
```
llvm-svn: 192629
```
584d71c6

Fixed a bug in dynamic allocation memory on stack. · 82a46ebe

Elena Demikhovsky authored Oct 14, 2013

The alignment of the allocated space was wrong; see Bugzilla 17345.

Done by Zvi Rackover <zvi.rackover@intel.com>.

llvm-svn: 192573

82a46ebe

Create classes to reduce the size of the tablegen entries for the CRC32 instructions. · d7abdb6f
Craig Topper authored Oct 14, 2013
```
llvm-svn: 192568
```
d7abdb6f

Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps... · a422b09a

Craig Topper authored Oct 14, 2013

Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps instructions to parse either GR32 or GR64 without resorting to duplicating instructions.

llvm-svn: 192567

a422b09a

Add disassembler support for SSE4.1 register/register form of PEXTRW. There is... · 44322088

Craig Topper authored Oct 14, 2013

Add disassembler support for SSE4.1 register/register form of PEXTRW. There is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding.

llvm-svn: 192566

44322088

Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the... · 7158745e

Craig Topper authored Oct 14, 2013

Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the disassembler tables. Add PINSRWrr64i to complement the AVX version.

llvm-svn: 192565

7158745e

Don't use 64-bit versions of MOVMSKPD in CodeGen. The instructions only... · c4a5a3f6

Craig Topper authored Oct 14, 2013

Don't use 64-bit versions of MOVMSKPD in CodeGen. The instructions only produce a 1-bit result so we can just use SUBREG_TO_REG to extend the 32-bit versions.

llvm-svn: 192562

c4a5a3f6

Oct 12, 2013

Remove more filters from the disassembler. Mark some AVX512 instructions as CodeGenOnly. · 88adf2a4
Craig Topper authored Oct 12, 2013
```
llvm-svn: 192525
```
88adf2a4

Mark some more instructions as CodeGenOnly. Remove filters from the disassembler. · aab53e77
Craig Topper authored Oct 12, 2013
```
llvm-svn: 192522
```
aab53e77

Oct 10, 2013

Allow non-AVX form of pmovmskb to take a GR64 operand. · 5fb5bd33
Craig Topper authored Oct 10, 2013
```
llvm-svn: 192341
```
5fb5bd33

Remove duplicate instructions. · 3ada6dea
Craig Topper authored Oct 10, 2013
```
llvm-svn: 192340
```
3ada6dea

Oct 09, 2013

AVX-512: Added VRCP28 and VRSQRT28 instructions and intrinsics. · a3a71408
Elena Demikhovsky authored Oct 09, 2013
```
llvm-svn: 192283
```
a3a71408

Add missing HasAVX512 predicate. · 15a47743

Andrew Trick authored Oct 09, 2013

This was only working because AVX had cheaper rules in all cases.
I'm sure there are other places in this file where predicates are missing.

llvm-svn: 192276

15a47743

Replace a couple instructions with patterns referring to other instructions... · a5f628ce

Craig Topper authored Oct 09, 2013

Replace a couple instructions with patterns referring to other instructions with same encoding and operands. Mark a couple other instructions as CodeGenOnly since we have FR and VR instructions and only one of them is needed by the assembler/disassembler.

llvm-svn: 192274

a5f628ce

Use AVX512PIi8 for the alt forms of vcmp instructions. This adds the TB prefix... · a328ee42

Craig Topper authored Oct 09, 2013

Use AVX512PIi8 for the alt forms of vcmp instructions. This adds the TB prefix and keeps the mnemonic from starting with an extra 'v'

llvm-svn: 192272

a328ee42

Mark some instructions as CodeGenOnly since they aren't needed by the... · 49d33198

Craig Topper authored Oct 09, 2013

Mark some instructions as CodeGenOnly since they aren't needed by the assembler or disassembler. Disassembler already filtered them, but asm parser still had them in its tables.

llvm-svn: 192271

49d33198

Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This... · bc749db9

Craig Topper authored Oct 09, 2013

Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This way the asm parser will pick the right one based on the mode. Instruction selection already did the right thing based on the pointer size.

llvm-svn: 192266

bc749db9

Oct 08, 2013

Add a MCTargetStreamer interface. · a17151ad

Rafael Espindola authored Oct 08, 2013

This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.

The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.

I will send an email to llvmdev with instructions on how to use this.

llvm-svn: 192181

a17151ad

Remove unneeded MMX instruction definition by moving pattern to an equivalent... · a984729f

Craig Topper authored Oct 08, 2013

Remove unneeded MMX instruction definition by moving pattern to an equivalent instruction definition and removing the filtering from the disassembler table building.

llvm-svn: 192175

a984729f

Remove some instructions that existed to provide aliases to the assembler. Can... · 72c8cd7b

Craig Topper authored Oct 08, 2013

Remove some instructions that existed to provide aliases to the assembler. Can be done with InstAlias instead. Unfortunately, this was causing the printer to use 'vmovq' or 'vmovd' based on what was parsed. To clean up the inconsistencies, convert all 'vmovd' with 64-bit registers to 'vmovq', but provide an alias so that 'vmovd' will still parse.

llvm-svn: 192171

72c8cd7b

Oct 07, 2013

X86: Fix type check. Just because an integer type is illegal doesn't mean it's i64. · 7b5e1594
Benjamin Kramer authored Oct 07, 2013
```
Fixes PR17495, where an i24 triggered this code. It's intended to
optimize i64 loads on 32 bit x86.

llvm-svn: 192123
```
7b5e1594

Remove getEHExceptionRegister and getEHHandlerRegister. · e90fd9c5
Rafael Espindola authored Oct 07, 2013
```
They haven't been used for a long time. Patch by MathOnNapkins.

llvm-svn: 192099
```
e90fd9c5

Remove some instructions that seem to only exist to trick the filtering checks... · 07ad1b23

Craig Topper authored Oct 07, 2013

Remove some instructions that seem to only exist to trick the filtering checks in the disassembler table creation. Just fix up the filter to let the real instruction through instead.

llvm-svn: 192090

07ad1b23

Remove FsMOVAPSrr and friends. They have no patterns and are no longer selected anywhere. · 68d2546e
Craig Topper authored Oct 07, 2013
```
llvm-svn: 192089
```
68d2546e

Teach X86 asm parser that VMOVAPSrr and other VEX-encoded register to register... · a0e0735e

Craig Topper authored Oct 07, 2013

Teach X86 asm parser that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not.

This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior and instruction selection already does this.

llvm-svn: 192088

a0e0735e
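
The form switch described in this commit follows from how the VEX prefix extends register numbers: the 2-byte prefix carries only the R bit (extending ModRM.reg), while the B bit (extending ModRM.rm) exists only in the 3-byte prefix. The sketch below models that decision for a register-to-register move. The names `MoveForm`, `needsThreeByteVex`, and `pickForm` are hypothetical and this is not the asm parser's code; it also assumes an opcode in the 0F map with no VEX.W requirement, as is the case for VMOVAPS.

```cpp
#include <cassert>

// Hypothetical model of the two encodings of a reg-to-reg move.
//   MRMSrcReg:  ModRM.reg = destination, ModRM.rm = source
//   MRMDestReg: ModRM.reg = source,      ModRM.rm = destination
enum class MoveForm { MRMSrcReg, MRMDestReg };

// The 2-byte VEX prefix can extend only ModRM.reg (VEX.R). If the
// register placed in ModRM.rm is one of registers 8-15, VEX.B is
// needed, which forces the 3-byte prefix.
bool needsThreeByteVex(MoveForm form, unsigned dst, unsigned src) {
  unsigned rm = (form == MoveForm::MRMSrcReg) ? src : dst;
  return rm >= 8;
}

// Pick the form that allows the shorter prefix: swap to MRMDestReg only
// when the source is an extended register and the destination is not.
MoveForm pickForm(unsigned dst, unsigned src) {
  assert(dst < 16 && src < 16);
  if (src >= 8 && dst < 8)
    return MoveForm::MRMDestReg;
  return MoveForm::MRMSrcReg;
}
```

With a destination of register 0 and a source of register 8, `pickForm` selects `MRMDestReg`, and `needsThreeByteVex` then reports that the 2-byte prefix suffices. When both registers are extended, neither form avoids the 3-byte prefix.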

Add disassembler support for long encodings for INC/DEC in 32-bit mode. · 2658d897
Craig Topper authored Oct 07, 2013
```
llvm-svn: 192086
```
2658d897

Oct 06, 2013

X86: Don't fold spills into SSE operations if the stack is unaligned. · 858a3880
Benjamin Kramer authored Oct 06, 2013
```
Regalloc can emit unaligned spills nowadays, but we can't fold the
spills into SSE ops if we can't guarantee alignment. PR12250.

llvm-svn: 192064
```
858a3880

AVX-512: added scalar convert instructions and intrinsics. · 2e408aef
Elena Demikhovsky authored Oct 06, 2013
```
Fixed load folding in VPERM2I instruction.

llvm-svn: 192063
```
2e408aef

AVX-512: fixed shuffle lowering · 462a2d23
Elena Demikhovsky authored Oct 06, 2013
```
in case of BLEND and added VSHUFPS patterns.

llvm-svn: 192055
```
462a2d23