Commits · bcf15388abaed2a55ae772fd2ee5872662d13a8e · Roger Ferrer / llvm-epi-0.8

Dec 25, 2008
- Add a simple pattern for matching 'bt'. · 2a7c9886
  Chris Lattner authored Dec 25, 2008
```
llvm-svn: 61426
```
- translateX86CC can never fail. Simplify it based on this. · 8175f27d
  Chris Lattner authored Dec 24, 2008
```
llvm-svn: 61423
```
Dec 24, 2008
- indentation · 4b46b74e
  Chris Lattner authored Dec 24, 2008
```
llvm-svn: 61407
```
- simplify some control flow and reduce indentation, no functionality change. · e9988b66
  Chris Lattner authored Dec 23, 2008
```
llvm-svn: 61404
```
Dec 23, 2008

Add instruction patterns and encodings for the x86 bt instructions. · 25a767d7
Dan Gohman authored Dec 23, 2008
```
llvm-svn: 61400
```
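
For context, `bt` tests a single bit of its operand and copies it into the carry flag. The sketch below shows the kind of shift-and-mask idiom such a pattern is meant to catch. The helper name `testBit` is hypothetical, and whether a given compiler version actually selects `bt` for it is an assumption, not something this log states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when bit `n` of `x` is set.
// The shift/and/compare sequence is a candidate for a single x86 `bt`.
constexpr bool testBit(std::uint32_t x, unsigned n) {
  // Masking `n` mirrors the register form of `bt`, which takes the bit
  // index modulo the operand width (32 here).
  return ((x >> (n & 31u)) & 1u) != 0;
}

// Small usage example.
inline void exampleTestBit() {
  assert(testBit(0x4u, 2));   // bit 2 is set
  assert(!testBit(0x4u, 1));  // bit 1 is clear
}
```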

Clean up the atomic opcodes in SelectionDAG. · 12f24904

Dan Gohman authored Dec 23, 2008

This removes all the _8, _16, _32, and _64 opcodes and replaces each
group with an unsuffixed opcode. The MemoryVT field of the AtomicSDNode
is now used to carry the size information. In tablegen, the size-specific
opcodes are replaced by size-independent opcodes that utilize the
ability to compose them with predicates.

This shrinks the per-opcode tables and makes the code that handles
atomics much more concise.

llvm-svn: 61389

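
The shape of this cleanup can be sketched with toy types. Everything below (`AtomicOp`, `MemVT`, `AtomicNode`, the helper names) is a hypothetical stand-in rather than the real LLVM classes: one unsuffixed opcode per operation, with the access size carried in a separate field and checked by a predicate.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins, NOT the real LLVM types.
enum class AtomicOp { Swap, LoadAdd, LoadSub };  // one opcode per operation
enum class MemVT : std::uint8_t { i8 = 8, i16 = 16, i32 = 32, i64 = 64 };

struct AtomicNode {
  AtomicOp op;
  MemVT memVT;  // plays the role of the MemoryVT field: carries the size
};

// Size-dependent logic reads the size from the node, not from the opcode.
inline unsigned accessSizeInBits(const AtomicNode &n) {
  return static_cast<unsigned>(n.memVT);
}

// A size-independent opcode composed with a size predicate, in the spirit
// of the tablegen change described above.
inline bool isAtomicLoadAdd32(const AtomicNode &n) {
  return n.op == AtomicOp::LoadAdd && n.memVT == MemVT::i32;
}

// Small usage example.
inline void exampleAtomicNode() {
  const AtomicNode n{AtomicOp::LoadAdd, MemVT::i32};
  assert(accessSizeInBits(n) == 32);
  assert(isAtomicLoadAdd32(n));
}
```

With four sizes per operation, a table keyed on the opcode alone needs a quarter of the entries, which is presumably where the shrinkage mentioned in the commit comes from.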

Fixed code generation for v8i16 and v16i8 splats on X86. · ec95070c
Mon P Wang authored Dec 23, 2008
```
Fixed lowering of v8i16 shuffles when we fall back to extract/insert.

llvm-svn: 61365
```

Dec 18, 2008
- Fixed x86 code generation of multiply for v2i64. It was incorrect for SSE4.1. · 998fd29c
  Mon P Wang authored Dec 18, 2008
```
llvm-svn: 61211
```
Dec 12, 2008

- Use patterns instead of creating completely new instruction matching patterns, · c4499feb

Bill Wendling authored Dec 12, 2008

  which are identical to the original patterns.

- Change the multiply with overflow so that we distinguish between signed and
  unsigned multiplication. Currently, unsigned multiplication with overflow
  isn't working!

llvm-svn: 60963


Added support for SELECT v8i8 v4i16 for X86 (MMX) · 9c2d26d2
Mon P Wang authored Dec 12, 2008
```
Added support for TRUNC v8i16 to v8i8 for X86 (MMX)

llvm-svn: 60916
```

Redo the arithmetic with overflow architecture. I was changing the semantics of · 1a317678

Bill Wendling authored Dec 12, 2008

ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with an associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.

Similar for SUB and MUL instructions.

llvm-svn: 60915

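
As a reference for what the two flags mean, here is a minimal portable sketch of add-with-overflow semantics: a signed variant reporting overflow (the condition `jo` tests) and an unsigned variant reporting carry. The struct and function names are illustrative and are not taken from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Toy result pairs; the names are illustrative, not LLVM types.
struct SAddResult { std::int32_t value; bool overflow; };
struct UAddResult { std::uint32_t value; bool carry; };

// Signed add with overflow: the flag corresponds to the x86 overflow flag.
inline SAddResult saddo(std::int32_t a, std::int32_t b) {
  const std::uint32_t ua = static_cast<std::uint32_t>(a);
  const std::uint32_t ub = static_cast<std::uint32_t>(b);
  const std::uint32_t sum = ua + ub;  // unsigned add wraps, no undefined behaviour
  // Overflow iff the sum's sign differs from the sign of both operands.
  const bool ovf = (((ua ^ sum) & (ub ^ sum)) >> 31) != 0;
  // Converting back assumes two's complement (guaranteed since C++20).
  return SAddResult{static_cast<std::int32_t>(sum), ovf};
}

// Unsigned add with overflow: the flag corresponds to the x86 carry flag.
inline UAddResult uaddo(std::uint32_t a, std::uint32_t b) {
  const std::uint32_t sum = a + b;
  return UAddResult{sum, sum < a};  // wrapped iff the sum is below an operand
}

// Small usage example.
inline void exampleAddWithOverflow() {
  assert(saddo(INT32_MAX, 1).overflow);
  assert(!saddo(-1, 1).overflow);
  assert(uaddo(0xFFFFFFFFu, 1u).carry);
}
```

On x86 the hardware add already sets both flags, so a lowering that reads them directly avoids recomputing the sign comparison.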

Dec 10, 2008
- Whitespace changes. · f482f379
  Bill Wendling authored Dec 10, 2008
```
llvm-svn: 60826
```
Dec 09, 2008

Add sub/mul overflow intrinsics. This currently doesn't have a · db8ec2d7

Bill Wendling authored Dec 09, 2008

target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!

llvm-svn: 60800


Dec 05, 2008

Make LoopStrengthReduce smarter about hoisting things out of · 9efd2ce5

Dale Johannesen authored Dec 05, 2008

loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608


Dec 03, 2008
- Refactor code. No functionality change. · 501089f6
  Evan Cheng authored Dec 03, 2008
```
llvm-svn: 60478
```
- CC should only be a ConstantSDNode at this point. Just use 'cast' instead of 'dyn_cast'. · f8d1ef98
  Bill Wendling authored Dec 03, 2008
```
llvm-svn: 60477
```
Dec 02, 2008

Second stab at target-dependent lowering of everyone's favorite nodes: [SU]ADDO · 30e9dc81

Bill Wendling authored Dec 02, 2008

- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
  EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
  depending on the type of ADDO requested.

- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
  COND_C set.

llvm-svn: 60388


Dec 01, 2008

There are no longer any places that require a · 3d960941

Duncan Sands authored Dec 01, 2008

MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.

llvm-svn: 60349


Change the interface to the type legalization method · 6ed40141

Duncan Sands authored Dec 01, 2008

ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.

llvm-svn: 60348


Nov 27, 2008
- Comment out code that isn't entirely correct. · 128f032c
  Bill Wendling authored Nov 27, 2008
```
llvm-svn: 60156
```
Nov 26, 2008

Generate something sensible for an [SU]ADDO op when the overflow/carry flag is · 751a694a

Bill Wendling authored Nov 26, 2008

the conditional for the BRCOND statement. For instance, it will generate:

    addl %eax, %ecx
    jo LOF

instead of

    addl %eax, %ecx
    ; About 10 instructions to compare the signs of LHS, RHS, and sum.
    jl LOF

llvm-svn: 60123


Nov 24, 2008
- - Make lowering of "add with overflow" customizable by back-ends. · 66835479
  Bill Wendling authored Nov 24, 2008
```
- Mark "add with overflow" as having a custom lowering for X86. Give it a null
  lowering representation for now.

llvm-svn: 59971
```
- Added missing description for -disable-mmx option. · 35a70ec1
  Mon P Wang authored Nov 24, 2008
```
llvm-svn: 59929
```
Nov 23, 2008
- Rename SetCCResultContents to BooleanContents. In · 8d6e2e13
  Duncan Sands authored Nov 23, 2008
```
practice these booleans are mostly produced by SetCC,
however the concept is more general.

llvm-svn: 59911
```
- Added -disable-mmx using a patch from Preston Gurd. · 0aa8f0a5
  Mon P Wang authored Nov 23, 2008
```
llvm-svn: 59901
```
Nov 13, 2008

Extend InlineAsm::C_Register to allow multiple specific registers · bee1ad97

Dale Johannesen authored Nov 13, 2008

(actually, code already all worked, only the comment
changed).  Use this to implement 'A' constraint on x86.
Fixes PR 1779.

llvm-svn: 59266


Nov 06, 2008
- Widening cleanup · 9a8d60a7
  Mon P Wang authored Nov 06, 2008
```
llvm-svn: 58796
```
Nov 05, 2008
- Indentation. · 3cd5e8c9
  Evan Cheng authored Nov 05, 2008
```
llvm-svn: 58750
```
Oct 31, 2008
- Use MOVSSmr instead of EXTRACTPSmr in the case of extracting · 99cdf889
  Dan Gohman authored Oct 31, 2008
```
vector element 0 for a store, as it's smaller and faster.

llvm-svn: 58483
```
Oct 30, 2008
- Add initial support for vector widening. Logic is set to widen for X86. · 58c3794c
  Mon P Wang authored Oct 30, 2008
```
One will only see an effect if LegalizeTypes is not active.  Will move
support to LegalizeTypes soon.

llvm-svn: 58426
```
Oct 28, 2008

Fix a nasty miscompilation of 176.gcc on linux/x86 where we synthesized · 38461f6b

Chris Lattner authored Oct 28, 2008

a memset using 16-byte XMM stores, but where the stack realignment code
didn't work.  Until it does (PR2962) disable use of xmm regs in memcpy
and memset formation for linux and other targets with insufficiently
aligned stacks.

This is part of PR2888

llvm-svn: 58317


Oct 24, 2008

Fix translateX86CC: if SetCCOpcode is SETULE and · 014f5bba

Duncan Sands authored Oct 24, 2008

LHS is a foldable load, then LHS and RHS are swapped
and SetCCOpcode is changed to SETUGT.  But the later
code is expecting operands to be the wrong way round
for SETUGT, but they are not in this case, resulting
in an inverted compare.  The solution is to move the
load normalization before the correction for SETUGT.
This bug was tickled by LegalizeTypes which happened
to legalize the testcase slightly differently to
LegalizeDAG.

llvm-svn: 58092


Oct 22, 2008
- Remove allocation of unused stack slot. · f6655a9e
  Dale Johannesen authored Oct 22, 2008
```
llvm-svn: 57987
```
- Get this working with LegalizeTypes: (1) don't · 5ee1dde8
  Duncan Sands authored Oct 22, 2008
```
assume that i64 has been turned into a BUILD_PAIR
node (when called from LegalizeTypes this hasn't
happened yet) and don't use a vector shuffle mask
with an illegal element type.

llvm-svn: 57972
```
- Adjust comments for pedantic satisfaction. · cf4607fc
  Dale Johannesen authored Oct 22, 2008
```
llvm-svn: 57940
```
- Add comments to explain uint64->f64 algorithm, · 3d7ece1a
  Dale Johannesen authored Oct 21, 2008
```
well, sort of.  (Algorithm by Ian Ollmann.)

llvm-svn: 57932
```
Oct 21, 2008

Add an SSE2 algorithm for uint64->f64 conversion. · 28929589

Dale Johannesen authored Oct 21, 2008

The same one Apple gcc uses, faster.  Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.

llvm-svn: 57926

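
The commit does not spell the algorithm out. One well-known formulation of a branch-free uint64->f64 conversion splits the input into 32-bit halves and embeds each half in the mantissa of a large power of two. The scalar C++ sketch below illustrates that idea. That it matches the exact SSE2 sequence this commit emits is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as an IEEE-754 double.
inline double bitsToDouble(std::uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// Hypothetical scalar model of the split-halves conversion.
inline double uint64ToDouble(std::uint64_t x) {
  // 0x4330... is 2^52; OR-ing in the low half gives exactly 2^52 + lo.
  const double lo = bitsToDouble(0x4330000000000000ULL | (x & 0xFFFFFFFFULL));
  // 0x4530... is 2^84; OR-ing in the high half gives exactly 2^84 + hi * 2^32.
  const double hi = bitsToDouble(0x4530000000000000ULL | (x >> 32));
  // Bit pattern of 2^84 + 2^52, the combined bias to remove.
  const double bias = bitsToDouble(0x4530000000100000ULL);
  // The subtraction is exact; the final addition rounds once.
  return (hi - bias) + lo;
}

// Small usage example.
inline void exampleUint64ToDouble() {
  assert(uint64ToDouble(0) == 0.0);
  assert(uint64ToDouble(1) == 1.0);
  assert(uint64ToDouble(1ULL << 63) == 9223372036854775808.0);
}
```

The sketch assumes IEEE-754 doubles evaluated in double precision, as with SSE2 arithmetic. Because only the last addition rounds, the result is correctly rounded for every input, including values above 2^63 that a signed conversion cannot handle directly.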

Don't create TargetGlobalAddress nodes with offsets that don't fit · 269246b0

Dan Gohman authored Oct 21, 2008

in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.

Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.

llvm-svn: 57885

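
The two checks described here can be sketched as small helpers. The names `fitsInInt32` and `signExtend` are hypothetical and this is not the code from the commit: the first tests whether an offset fits a 32-bit signed displacement, the second sign-extends a constant from a given bit width such as the pointer size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does `offset` fit a 32-bit signed displacement field?
constexpr bool fitsInInt32(std::int64_t offset) {
  return offset >= INT32_MIN && offset <= INT32_MAX;
}

// Hypothetical helper: sign-extend the low `bits` bits of `value`
// (1 <= bits <= 64), e.g. with bits set to the pointer size.
inline std::int64_t signExtend(std::uint64_t value, unsigned bits) {
  if (bits >= 64) return static_cast<std::int64_t>(value);
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  value &= mask;
  // Flipping and subtracting the sign bit propagates it into the upper bits.
  // The final conversion assumes two's complement (guaranteed since C++20).
  return static_cast<std::int64_t>((value ^ sign) - sign);
}

// Small usage example.
inline void exampleOffsets() {
  assert(fitsInInt32(2147483647LL));
  assert(!fitsInInt32(2147483648LL));
  assert(signExtend(0xFFFFFFFFULL, 32) == -1);
}
```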

Optimized FCMP_OEQ and FCMP_UNE for x86. · 97d95d6d

Dan Gohman authored Oct 21, 2008

Where previously LLVM might emit code like this:

        ucomisd %xmm1, %xmm0
        setne   %al
        setp    %cl
        orb     %al, %cl
        jne     .LBB4_2

it now emits this:

        ucomisd %xmm1, %xmm0
        jne     .LBB4_2
        jp      .LBB4_2

It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.

To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replaces them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.

Also, X86InstrInfo.cpp's branch analysis and transform code now
knows how to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.

llvm-svn: 57873

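
Why two branches suffice: after `ucomisd`, `jne` is taken when the operands are ordered and different, and `jp` is taken when they are unordered (at least one NaN). Together that is FCMP_UNE, and FCMP_OEQ is its negation. A minimal C++ model of the two predicates follows; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// FCMP_UNE: true if the operands are unordered or not equal.
inline bool fcmpUNE(double a, double b) {
  if (a < b || a > b) return true;                  // ordered and different: the `jne` case
  if (std::isnan(a) || std::isnan(b)) return true;  // unordered: the `jp` case
  return false;
}

// FCMP_OEQ: true only if the operands are ordered and equal.
inline bool fcmpOEQ(double a, double b) { return !fcmpUNE(a, b); }

// Small usage example.
inline void exampleFcmp() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(fcmpUNE(nan, 1.0));
  assert(fcmpOEQ(0.0, -0.0));  // +0 and -0 compare equal
  assert(!fcmpOEQ(nan, nan));
}
```

In C++ source, `a != b` on doubles already has FCMP_UNE semantics and `a == b` has FCMP_OEQ semantics, so these are the predicates produced by ordinary floating-point equality tests.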

Oct 20, 2008

Have X86 custom lowering for LegalizeTypes use · 1d20ab57

Duncan Sands authored Oct 20, 2008

LowerOperation if it doesn't know what else to do.
This method should probably be factorized some,
but this is good enough for the moment.  Have
LowerATOMIC_BINARY_64 use EXTRACT_ELEMENT rather
than assuming the operand is a BUILD_PAIR (if it
is then getNode will automagically simplify the
EXTRACT_ELEMENT).  This way LowerATOMIC_BINARY_64
is usable from LegalizeTypes.

llvm-svn: 57831
