Commits · d952b247d14b936321c47819a4b674fb02f10a0c · Roger Ferrer / llvm-epi-0.8

Nov 05, 2008

Add more vector move low and zero-extend patterns. · 27889ab2
Evan Cheng authored Nov 05, 2008
```
llvm-svn: 58752
```
27889ab2
Indentation. · 3cd5e8c9
Evan Cheng authored Nov 05, 2008
```
llvm-svn: 58750
```
3cd5e8c9

Eliminate the ISel priority queue, which used the topological order for a · f14b77eb

Dan Gohman authored Nov 05, 2008

priority function. Instead, just iterate over the AllNodes list, which is
already in topological order. This eliminates a fair amount of bookkeeping,
and speeds up the isel phase by about 15% on many testcases.

The impact on most targets is that AddToISelQueue calls can be simply removed.

In the x86 target, there are two additional notable changes.

The rule-bending AND+SHIFT optimization in MatchAddress that creates new
pre-isel nodes during isel is now a little more verbose, but more robust.
Instead of either creating an invalid DAG or creating an invalid topological
sort, as it has historically done, it can now just insert the new nodes into
the node list at a position where they will be consistent with the topological
ordering.

Also, the address-matching code has logic that checked to see if a node was
"already selected". However, when a node is selected, it has all its uses
taken away via ReplaceAllUsesWith or equivalent, so it won't recieve any
further visits from MatchAddress. This code is now removed.

llvm-svn: 58748

f14b77eb

Rename isGVLazyPtr to isGVNonLazyPtr relocation. This represents Mac OS X · 132de198
Evan Cheng authored Nov 05, 2008
```
indirect gv reference. Please don't call it lazy.

llvm-svn: 58746
```
132de198

Nov 04, 2008
- The ANDMask node folds to a constant, and isn't the node that needs to · b9110e7f
  Dan Gohman authored Nov 03, 2008
```
have its node id set. The new and and shift nodes are the nodes that need
the IDs. This fixes PR2982.

llvm-svn: 58655
```
  b9110e7f
Nov 03, 2008
- Refactor various TargetAsmInfo subclasses' TargetMachine members away · ac41d9f5
  Dan Gohman authored Nov 03, 2008
```
adding a TargetMachine member to the base TargetAsmInfo class instead.

llvm-svn: 58624
```
  ac41d9f5
Oct 31, 2008
- x86_64 rip-relative and magic mode address · ef89465c
  Mon P Wang authored Oct 31, 2008
```
llvm-svn: 58528
```
  ef89465c
- Revert r58489. It isn't correct for all cases. · d2bc1338
  Bill Wendling authored Oct 31, 2008
```
llvm-svn: 58523
```
  d2bc1338
- Change x86 register allocation ordering to match that of gcc. Otherwise some... · 7e8202fc
  Evan Cheng authored Oct 31, 2008
```
Change x86 register allocation ordering to match that of gcc. Otherwise some tools get confused by prologue generated by llvm.

llvm-svn: 58517
```
  7e8202fc
- Don't skip over all "terminator" instructions when determining where to put the · 6d70df0b
  Bill Wendling authored Oct 31, 2008
```
callee-saved restore code. It could skip over conditional jumps
accidentally. Instead, just skip the "return" instructions.

llvm-svn: 58489
```
  6d70df0b
- Use MOVSSmr instead of EXTRACTPSmr in the case of extracting · 99cdf889
  Dan Gohman authored Oct 31, 2008
```
vector element 0 for a store, as it's smaller and faster.

llvm-svn: 58483
```
  99cdf889
Oct 30, 2008
- Add initial support for vector widening. Logic is set to widen for X86. · 58c3794c
  Mon P Wang authored Oct 30, 2008
```
One will only see an effect if legalizetype is not active.  Will move
support to LegalizeType soon.

llvm-svn: 58426
```
  58c3794c
- Fix PEXTRQ encoding · 534ac08e
  Nate Begeman authored Oct 29, 2008
```
llvm-svn: 58403
```
  534ac08e
Oct 28, 2008

Fix a nasty miscompilation of 176.gcc on linux/x86 where we synthesized · 38461f6b

Chris Lattner authored Oct 28, 2008

a memset using 16-byte XMM stores, but where the stack realignment code
didn't work.  Until it does (PR2962) disable use of xmm regs in memcpy
and memset formation for linux and other targets with insufficiently
aligned stacks.

This is part of PR2888

llvm-svn: 58317

38461f6b

Oct 27, 2008

· ce2a9381

David Greene authored Oct 27, 2008

Have TableGen emit setSubgraphColor calls under control of a -gen-debug
flag.  Then in a debugger developers can set breakpoints at these calls
to see waht is about to be selected and what the resulting subgraph
looks like.  This really helps when debugging instruction selection.

llvm-svn: 58278

ce2a9381

For now, don't split live intervals around x87 stack register barriers.... · f7137229

Evan Cheng authored Oct 27, 2008

For now, don't split live intervals around x87 stack register barriers. FpGET_ST0_80 must be right after a call instruction (and ADJCALLSTACKUP) so we need to find a way to prevent reload of x87 registers between them.

llvm-svn: 58230

f7137229

Oct 25, 2008

Move the code that adds the DeadMachineInstructionElimPass from · 19145317

Dan Gohman authored Oct 25, 2008

target-independent code to target-specific code. This prevents it
from running on targets that aren't using fast-isel.

In addition to saving compile time, this addresses the problem
that not all targets are prepared for it. In order to use this
pass, all instructions must declare all their fixed uses and
defs of physical registers.

llvm-svn: 58144

19145317

Support for allocation of TLS variables in the JIT. Allocation of a global · 5457ce9a

Nicolas Geoffray authored Oct 25, 2008

variable is moved to the execution engine. The JIT calls the TargetJITInfo
to allocate thread local storage. Currently, only linux/x86 knows how to
allocate thread local global variables.

llvm-svn: 58142

5457ce9a

Generate code for TLS instructions. · db30612f
Nicolas Geoffray authored Oct 25, 2008
```
llvm-svn: 58141
```
db30612f

Oct 24, 2008

move the note to the correct README · 33986d8f
Torok Edwin authored Oct 24, 2008
```
llvm-svn: 58104
```
33986d8f

Fix translateX86CC: if SetCCOpcode is SETULE and · 014f5bba

Duncan Sands authored Oct 24, 2008

LHS is a foldable load, then LHS and RHS are swapped
and SetCCOpcode is changed to SETUGT.  But the later
code is expecting operands to be the wrong way round
for SETUGT, but they are not in this case, resulting
in an inverted compare.  The solution is to move the
load normalization before the correction for SETUGT.
This bug was tickled by LegalizeTypes which happened
to legalize the testcase slightly differently to
LegalizeDAG.

llvm-svn: 58092

014f5bba

Fix constant-offset emission for x86-64 absolute addresses. This · 712886f5
Dan Gohman authored Oct 24, 2008
```
fixes a bunch of test-suite JIT failures on x86-64 in
-relocation-model=static mode.

llvm-svn: 58066
```
712886f5

Oct 22, 2008
- Remove allocation of unused stack slot. · f6655a9e
  Dale Johannesen authored Oct 22, 2008
```
llvm-svn: 57987
```
  f6655a9e
- Get this working with LegalizeTypes: (1) don't · 5ee1dde8
  Duncan Sands authored Oct 22, 2008
```
assume that i64 has been turned into a BUILD_PAIR
node (when called from LegalizeTypes this hasn't
happened yet) and don't use a vector shuffle mask
with an illegal element type.

llvm-svn: 57972
```
  5ee1dde8
- CMake: Turned some libraries into partially linked objects. Corrected · f3c03b02
  Oscar Fuentes authored Oct 22, 2008
```
names of LLVMCore and ARMCodeGen.

llvm-svn: 57943
```
  f3c03b02
- Adjust comments for pedantic satisfaction. · cf4607fc
  Dale Johannesen authored Oct 22, 2008
```
llvm-svn: 57940
```
  cf4607fc
- Add comments to explain uint64->f64 algorithm, · 3d7ece1a
  Dale Johannesen authored Oct 21, 2008
```
well, sort of.  (Algorithm by Ian Ollmann.)

llvm-svn: 57932
```
  3d7ece1a
Oct 21, 2008

Add an SSE2 algorithm for uint64->f64 conversion. · 28929589

Dale Johannesen authored Oct 21, 2008

The same one Apple gcc uses, faster.  Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.

llvm-svn: 57926

28929589

Implement the optimized FCMP_OEQ/FCMP_UNE code for x86 fast-isel. · 4ddf7a4c
Dan Gohman authored Oct 21, 2008
```
llvm-svn: 57915
```
4ddf7a4c

Don't create TargetGlobalAddress nodes with offsets that don't fit · 269246b0

Dan Gohman authored Oct 21, 2008

in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.

Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.

llvm-svn: 57885

269246b0

Optimized FCMP_OEQ and FCMP_UNE for x86. · 97d95d6d

Dan Gohman authored Oct 21, 2008

Where previously LLVM might emit code like this:

        ucomisd %xmm1, %xmm0
        setne   %al
        setp    %cl
        orb     %al, %cl
        jne     .LBB4_2

it now emits this:

        ucomisd %xmm1, %xmm0
        jne     .LBB4_2
        jp      .LBB4_2

It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.

To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replace them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.

Also, X86InstrInfo.cpp's branch analysis and transform code now
knows now to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.

llvm-svn: 57873

97d95d6d

When the coalescer is doing rematerializing, have it remove · c835458d

Dan Gohman authored Oct 21, 2008

the copy instruction from the instruction list before asking the
target to create the new instruction. This gets the old instruction
out of the way so that it doesn't interfere with the target's
rematerialization code. In the case of x86, this helps it find
more cases where EFLAGS is not live.

Also, in the X86InstrInfo.cpp, teach isSafeToClobberEFLAGS to check
to see if it reached the end of the block after scanning each
instruction, instead of just before. This lets it notice when the
end of the block is only two instructions away, without doing any
additional scanning.

These changes allow rematerialization to clobber EFLAGS in more
cases, for example using xor instead of mov to set the return value
to zero in the included testcase.

llvm-svn: 57872

c835458d

Oct 20, 2008

Have X86 custom lowering for LegalizeTypes use · 1d20ab57

Duncan Sands authored Oct 20, 2008

LowerOperation if it doesn't know what else to do.
This methods should probably be factorized some,
but this is good enough for the moment.  Have
LowerATOMIC_BINARY_64 use EXTRACT_ELEMENT rather
than assuming the operand is a BUILD_PAIR (if it
is then getNode will automagically simplify the
EXTRACT_ELEMENT).  This way LowerATOMIC_BINARY_64
usable from LegalizeTypes.

llvm-svn: 57831

1d20ab57

Oct 18, 2008

Teach DAGCombine to fold constant offsets into GlobalAddress nodes, · 2fe6bee5

Dan Gohman authored Oct 18, 2008

and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)

This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.

This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.

Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.

The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.

llvm-svn: 57748

2fe6bee5

Oct 17, 2008
- This is done. · b1d8d6ec
  Dan Gohman authored Oct 17, 2008
```
llvm-svn: 57733
```
  b1d8d6ec
- Add implicit defs of XMM8 to XMM15 on 32-bit call instructions. While this is... · 0fcc89b5
  Evan Cheng authored Oct 17, 2008
```
Add implicit defs of XMM8 to XMM15 on 32-bit call instructions. While this is not technically true, it tells tblgen that these instructions "clobber" the entire XMM register file.

llvm-svn: 57723
```
  0fcc89b5
- add support for 128 bit inputs on both x86-64 and x86-32. · 8e2ef196
  Chris Lattner authored Oct 17, 2008
```
llvm-svn: 57709
```
  8e2ef196
- Fix a bug where the x86 backend would reject 64-bit r constraints when · c7e65f43
  Chris Lattner authored Oct 17, 2008
```
in 32-bit mode instead of assigning a register pair.  This has nothing to
do with PR2356, but I happened to notice it while working on it.

llvm-svn: 57704
```
  c7e65f43
- Fix lfence and mfence encoding. These look like MRM5r and MRM6r instructions... · 27c37022
  Evan Cheng authored Oct 17, 2008
```
Fix lfence and mfence encoding. These look like MRM5r and MRM6r instructions except they do not have any operands. The RegModRM byte is encoded with register number 0.

llvm-svn: 57692
```
  27c37022
- getX86RegNum has long been moved to X86RegisterInfo. · 9e23d746
  Evan Cheng authored Oct 17, 2008
```
llvm-svn: 57691
```
  9e23d746