Commits · 7de05027a56b80c5a9a3e5bfffb6431384e527e3 · Roger Ferrer / llvm-epi-0.8

Oct 02, 2011

Special case disassembler handling of REX.B prefix on NOP instruction to... · 21c33657

Craig Topper authored Oct 02, 2011

Special case disassembler handling of REX.B prefix on NOP instruction to decode as XCHG R8D, EAX instead. Fixes PR10344.

llvm-svn: 140971

21c33657
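The special case above can be sketched as follows. This is a minimal illustration, not the LLVM disassembler's code: `decodeOpcode90` is a hypothetical helper that only handles an optional single REX prefix followed by opcode 0x90 in 64-bit mode. The REX.W + REX.B result is an extrapolation from the x86-64 encoding rules and is not stated in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not LLVM's decoder. Classifies a 64-bit-mode
// encoding made of an optional REX prefix followed by opcode 0x90.
std::string decodeOpcode90(const std::uint8_t *Bytes, std::size_t Size) {
  assert((Bytes != nullptr || Size == 0) && "null buffer with nonzero size");

  std::size_t I = 0;
  bool RexW = false, RexB = false;
  // REX prefixes occupy 0x40..0x4F; bit 3 is W, bit 0 is B.
  if (I < Size && (Bytes[I] & 0xF0) == 0x40) {
    RexW = (Bytes[I] & 0x08) != 0;
    RexB = (Bytes[I] & 0x01) != 0;
    ++I;
  }
  if (I >= Size || Bytes[I] != 0x90)
    return "<not opcode 0x90>";

  // Without REX.B, 0x90 keeps its NOP meaning.
  if (!RexB)
    return "NOP";
  // With REX.B the register field selects R8, so this is a real exchange.
  return RexW ? "XCHG R8, RAX" : "XCHG R8D, EAX";
}
```

Under these assumptions, the byte sequence `41 90` yields `XCHG R8D, EAX`, while a bare `90` still yields `NOP`.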

Oct 01, 2011

Fix disassembling of INVEPT and INVVPID to take operands · d07a59f2

Craig Topper authored Oct 01, 2011

llvm-svn: 140955

d07a59f2

Fix disassembler handling of CRC32 which is an odd instruction that uses 0xf2... · 88cb33e0

Craig Topper authored Oct 01, 2011

Fix disassembler handling of CRC32 which is an odd instruction that uses 0xf2 as an opcode extension and allows the opsize prefix. This necessitated adding IC_XD_OPSIZE and IC_64BIT_XD_OPSIZE contexts. Unfortunately, this increases the size of the disassembler tables. Fixes PR10702.

llvm-svn: 140954

88cb33e0

Store sub-class lists as a bit vector. · 237dceff

Jakob Stoklund Olesen authored Sep 30, 2011

This uses less memory and it reduces the complexity of sub-class
operations:

- hasSubClassEq() and friends become O(1) instead of O(N).

- getCommonSubClass() becomes O(N) instead of O(N^2).

In the future, TableGen will infer register classes.  This makes it
cheap to add them.

llvm-svn: 140898

237dceff
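A minimal sketch of the idea, assuming a fixed upper bound on the number of register classes and assuming classes are numbered so that larger classes get lower IDs. `RegClass`, `makeClass`, and `MaxClasses` are hypothetical names for illustration; this is not the TableGen-generated code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

constexpr std::size_t MaxClasses = 64; // hypothetical upper bound

struct RegClass {
  unsigned ID;
  // Bit J is set when class J is a sub-class of this class (or this class).
  std::bitset<MaxClasses> SubClassMask;

  // A single bit test instead of a walk over a sub-class list.
  bool hasSubClassEq(const RegClass &RC) const {
    return SubClassMask.test(RC.ID);
  }
};

// Hypothetical constructor helper: builds a class from its sub-class IDs.
RegClass makeClass(unsigned ID, std::initializer_list<unsigned> SubIDs) {
  assert(ID < MaxClasses && "class ID out of range");
  RegClass RC{ID, {}};
  RC.SubClassMask.set(ID);
  for (unsigned Sub : SubIDs) {
    assert(Sub < MaxClasses && "sub-class ID out of range");
    RC.SubClassMask.set(Sub);
  }
  return RC;
}

// One pass over the masks: intersect them and return the first common ID,
// or -1 when the two classes share no sub-class.
int getCommonSubClass(const RegClass &A, const RegClass &B) {
  const std::bitset<MaxClasses> Common = A.SubClassMask & B.SubClassMask;
  for (std::size_t I = 0; I < MaxClasses; ++I)
    if (Common.test(I))
      return static_cast<int>(I);
  return -1;
}
```

With the numbering assumption above, the first set bit of the intersection is the largest common sub-class, which is why a linear scan is enough.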

Sep 29, 2011

Expand the x86 V_SET0* pseudos right after register allocation. · dd1904e7

Jakob Stoklund Olesen authored Sep 29, 2011

This also makes it possible to reduce the number of pseudo instructions
and get rid of the encoding information.

llvm-svn: 140776

dd1904e7

Sep 28, 2011

- PR11033: Make sure we don't generate PCMPGTQ and PCMPEQQ if the target CPU does not support them. · 2fb357a5
  Eli Friedman authored Sep 28, 2011
```
llvm-svn: 140723
```
  2fb357a5
- Rename SSEDomainFix -> lib/CodeGen/ExecutionDepsFix. · 934b7d76
  Jakob Stoklund Olesen authored Sep 28, 2011
```
I'll clean up the source in the next commit.

llvm-svn: 140663
```
  934b7d76
- Remove X86-dependent stuff from SSEDomainFix. · 30c81124
  Jakob Stoklund Olesen authored Sep 27, 2011
```
This also enables domain swizzling for AVX code which required a few
trivial test changes.

The pass will be moved to lib/CodeGen shortly.

llvm-svn: 140659
```
  30c81124
- Promote the X86 Get/SetSSEDomain functions to TargetInstrInfo. · b48c994c
  Jakob Stoklund Olesen authored Sep 27, 2011
```
I am going to unify the SSEDomainFix and NEONMoveFix passes into a
single target independent pass.  They are essentially doing the same
thing.

llvm-svn: 140652
```
  b48c994c

Sep 26, 2011

- Fix VEX decoding in i386 mode. Fixes PR11008. · 45faba98
  Craig Topper authored Sep 26, 2011
```
llvm-svn: 140515
```
  45faba98

Sep 24, 2011

- Only run MF.verify() with EXPENSIVE_CHECKS=1. · 55cf2ed1
  Jakob Stoklund Olesen authored Sep 24, 2011
```
llvm-svn: 140441
```
  55cf2ed1

Sep 23, 2011

Implement Chris's suggestion of legalizing the various SSE and AVX · a54fd541

Duncan Sands authored Sep 23, 2011

hadd/hsub intrinsics into the new fhadd/fhsub X86 node.

llvm-svn: 140383

a54fd541

PR10991: make fast-isel correctly check whether accessing a global through an... · 87c844cd

Eli Friedman authored Sep 22, 2011

PR10991: make fast-isel correctly check whether accessing a global through an alias involves thread-local storage.  (I'm not entirely sure how this is supposed to work, but this patch makes fast-isel consistent with the normal isel path.)

llvm-svn: 140355

87c844cd

Add support for GR32 <-> FR32 cross class copies. · f05864ad

Jakob Stoklund Olesen authored Sep 22, 2011

We already support GR64 <-> VR128 copies.  All of these copies break
partial register dependencies by zeroing the high part of the target
register.

llvm-svn: 140348

f05864ad

Sep 22, 2011

Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from · 0e4fcb8e

Duncan Sands authored Sep 22, 2011

floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.

llvm-svn: 140332

0e4fcb8e

Fix register printing in disassembling of push/pop of segment registers and... · 6d1872b7

Craig Topper authored Sep 22, 2011

Fix register printing in disassembling of push/pop of segment registers and in/out in Intel syntax mode. Fixes PR10960.

llvm-svn: 140299

6d1872b7

The SSE version differences for fmin/fmax are more involved than I thought. · cfd26cd7

Benjamin Kramer authored Sep 22, 2011

- x87: no min or max.
- SSE1: min/max for single precision scalars and vectors.
- SSE2: min/max for single and double precision scalars and vectors.
- AVX: as SSE2, but also supports the wider ymm vectors. (this is covered by the isTypeLegal check)

llvm-svn: 140296

cfd26cd7
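The support matrix in this commit message can be written as a small predicate. This is an illustrative sketch only: `SSELevel`, `FPType`, and `canFormMinMax` are hypothetical names, and the actual change relies on the isTypeLegal check for the wider ymm types rather than an explicit AVX case.

```cpp
#include <cassert>

// Hypothetical feature levels, ordered from least to most capable.
enum class SSELevel { NoSSE, SSE1, SSE2, AVX };
// Hypothetical type tags: scalars, 128-bit vectors, 256-bit vectors.
enum class FPType { F32, F64, V4F32, V2F64, V8F32, V4F64 };

// Returns true when a min/max node could be formed for Ty at this level.
bool canFormMinMax(SSELevel Level, FPType Ty) {
  switch (Ty) {
  case FPType::F32:
  case FPType::V4F32:
    return Level >= SSELevel::SSE1; // single precision needs SSE1
  case FPType::F64:
  case FPType::V2F64:
    return Level >= SSELevel::SSE2; // double precision needs SSE2
  case FPType::V8F32:
  case FPType::V4F64:
    return Level >= SSELevel::AVX;  // ymm vectors need AVX
  }
  assert(0 && "unknown FPType");
  return false;
}
```

The x87-only case maps to `SSELevel::NoSSE`, for which the predicate is false for every type.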

X86: Don't form min/max nodes if the target is missing SSE. · dc397a64

Benjamin Kramer authored Sep 22, 2011

llvm-svn: 140294

dc397a64

Sep 21, 2011

- X86Disassembler: if verbose logging is going to nulls(), disable logging completely. · e5e189f6
  Benjamin Kramer authored Sep 21, 2011
```
Otherwise we'll spend a ridiculous amount of time pretty printing debug output and then discarding it.

llvm-svn: 140276
```
  e5e189f6
- fix comment · 50f123d8
  Nadav Rotem authored Sep 21, 2011
```
llvm-svn: 140258
```
  50f123d8
- Insert a sanity check on the combining of x86 truncing-store nodes. This comes... · c1cd8506
  Nadav Rotem authored Sep 21, 2011
```
Insert a sanity check on the combining of x86 truncing-store nodes. This comes to replace the problematic check that was removed in r139995.

llvm-svn: 140246
```
  c1cd8506
- Change: · a318b8dc
  Richard Trieu authored Sep 21, 2011
```
  assert(!"error message");

To:

  assert(0 && "error message");

which is more consistent across the code base.

llvm-svn: 140234
```
  a318b8dc
- In the disassembler C API, be careful not to confuse the comment streamer that... · 69fa8ffe
  Owen Anderson authored Sep 21, 2011
```
In the disassembler C API, be careful not to confuse the comment streamer that the disassembler outputs annotations on with the streamer that the InstPrinter will print them on.

llvm-svn: 140217
```
  69fa8ffe
- Revert r140097, working on a better approach · 8058234b
  Bruno Cardoso Lopes authored Sep 20, 2011
```
llvm-svn: 140203
```
  8058234b
- Simplify max/minp[s|d] dagcombine matching · f7638e1e
  Bruno Cardoso Lopes authored Sep 20, 2011
```
llvm-svn: 140199
```
  f7638e1e

Sep 20, 2011

- Tidy up a bit more, fix tab and remove trailing whitespaces · 60aa85b6
  Bruno Cardoso Lopes authored Sep 20, 2011
```
llvm-svn: 140186
```
  60aa85b6
- The wrong relocation was being emitted for several SSSE3 instructions. · 33e91a6c
  Bruno Cardoso Lopes authored Sep 20, 2011
```
This fixes PR10963. Thanks to Benjamin for finding the wrong tablegen
declaration.

llvm-svn: 140184
```
  33e91a6c
- Tidy up code! · 05f3f493
  Bruno Cardoso Lopes authored Sep 20, 2011
```
llvm-svn: 140183
```
  05f3f493
- Extend changes from r139986 to produce 256-bit AVX minps/minpd/maxps/maxpd. · 68c92d86
  Craig Topper authored Sep 20, 2011
```
llvm-svn: 140140
```
  68c92d86
- Fix PR10949. Fix the encoding of VMOVPQIto64rr. · c4398d2c
  Bruno Cardoso Lopes authored Sep 19, 2011
```
llvm-svn: 140098
```
  c4398d2c
- Based on the small opt Zvi's patch was trying to achieve, eliminate · 51792dcc
  Bruno Cardoso Lopes authored Sep 19, 2011
```
128-bit undef subvector insertion into a 256-bit vector

llvm-svn: 140097
```
  51792dcc

Sep 19, 2011

- Match X86ISD::FSETCCsd and X86ISD::FSETCCss while in AVX mode. This fixes · d4a3d452
  Bruno Cardoso Lopes authored Sep 19, 2011
```
PR10955 and PR10948.

llvm-svn: 140069
```
  d4a3d452

Sep 18, 2011

- Fix typos in my prev commit, found by Tobi. · 763c11cc
  Nadav Rotem authored Sep 18, 2011
```
llvm-svn: 140003
```
  763c11cc
- setOperationAction should be done on the return value of the type, not the operands. · 261a10a0
  Nadav Rotem authored Sep 18, 2011
```
llvm-svn: 140001
```
  261a10a0
- When promoting integer vectors we often create ext-loads. This patch adds a · 7ae11279
  Nadav Rotem authored Sep 18, 2011
```
dag-combine optimization to implement the ext-load efficiently (using shuffles).

For example the type <4 x i8> is stored in memory as i32, but it needs to
find its way into a <4 x i32> register. Previously we scalarized the memory
access, now we use shuffles.

llvm-svn: 139995
```
  7ae11279
- Fix typo by changing Lower256IntVETCC to Lower256IntVSETCC. · d9d01917
  Craig Topper authored Sep 18, 2011
```
llvm-svn: 139993
```
  d9d01917

Sep 17, 2011

Synthesize x86 max/min instructions also for vectors (i.e. produce · f2b8c854

Duncan Sands authored Sep 17, 2011

maxps and maxpd).  This broke the sse41-blend.ll testcase by causing
maxpd to be produced rather than a cmp+blend pair, which is the reason
I tweaked it.  Gives a small speedup on doduc with dragonegg when the
GCC vectorizer is used.

llvm-svn: 139986

f2b8c854

Describe more AVX 128-bit convert instructions without patterns to have · 4641efe3

Bruno Cardoso Lopes authored Sep 16, 2011

mayLoad = 1

llvm-svn: 139973

4641efe3

Add mayLoad attribute to AVX convert instructions, since none of them · 5389ed5d

Bruno Cardoso Lopes authored Sep 16, 2011

are declared with load patterns. This fixes the crash in PR10941. No testcases,
since a fold is triggered and then converted back to the register form
afterwards.

llvm-svn: 139953

5389ed5d

Sep 16, 2011

Fix PR10884. · 2d406f02

Bruno Cardoso Lopes authored Sep 16, 2011

This PR basically reports a problem where a crash in generated code
happened due to %rbp being clobbered:

  pushq %rbp
  movq  %rsp, %rbp
  ....
  vmovmskps %ymm12, %ebp
  ....
  movq  %rbp, %rsp
  popq  %rbp
  ret

Since Eric's r123367 commit, the default stack alignment for x86 32-bit
has changed to be 16 bytes. Since then, the MaxStackAlignmentHeuristicPass
hasn't really been used, but with AVX it becomes useful again, since per
ABI compliance we don't always align the stack to 256 bits, but only when
there are 256-bit incoming arguments.

ReserveFP was only used by this pass, but there's no RA target hook that
uses getReserveFP() to check for the presence of FP (since nothing was
triggering the pass to run, the uses of getReserveFP() were removed
over time without being noticed). Change this pass to use
setForceFramePointer, which is properly called by the MachineFunction
hasFP method.

The testcase is very big and dependent on RA, not sure if it's worth
adding to test/CodeGen/X86.

llvm-svn: 139939

2d406f02