Commits · 610634fe391ca47be2cdee7f9402a5b6ff7364a4 · Roger Ferrer / llvm-epi-0.8

Oct 30, 2008
- Resolve bug 2947: vararg-marked functions must spill registers R3-R79 to stack · 487c4341
  Scott Michel authored Oct 30, 2008
```
so that va_start/va_arg/et al. will walk arguments correctly for Cell SPU.

N.B.: Because neither clang nor llvm-gcc-4.2 can be built for CellSPU, this is
still unexercised code.

llvm-svn: 58415
```
  487c4341
- Correct way to handle CONSTPOOL_ENTRY instructions. · 19d64ba8
  Evan Cheng authored Oct 29, 2008
```
llvm-svn: 58409
```
  19d64ba8
- Add debugging support. · de9dbc55
  Evan Cheng authored Oct 29, 2008
```
llvm-svn: 58408
```
  de9dbc55
- Fix PEXTRQ encoding · 534ac08e
  Nate Begeman authored Oct 29, 2008
```
llvm-svn: 58403
```
  534ac08e
Oct 29, 2008

Add a RM pseudoreg for the rounding mode, which · 98aa9d3e

Dale Johannesen authored Oct 29, 2008

allows ppcf128->int conversion to work with
DeadInstructionElimination.  This is now turned
off but RM is harmless.  It does not do a complete
job of modeling the rounding mode.

Revert marking MFCR as using all 7 CR subregisters;
while correct, this caused the problem in PR 2964,
plus the local RA crash noted in the comments.
This was needed to make DeadInstructionElimination work,
but as we are not running that, it is backed out
for now.  Eventually it should go back in and the
other problems fixed where they're broken.

llvm-svn: 58391

98aa9d3e

Oct 28, 2008

Support for constant islands in the ARM JIT. · ff2b4948

Jim Grosbach authored Oct 28, 2008

Since the ARM constant pool handling supersedes the standard LLVM constant
pool entirely, the JIT emitter does not allocate space for the constants,
nor initialize the memory. The constant pool is considered part of the 
instruction stream.

Likewise, when resolving relocations into the constant pool, a hook into
the target back end is used to resolve from the constant ID# to the
address where the constant is stored.

For now, the support in the ARM emitter is limited to 32-bit integers. Future
patches will expand this to the full range of constants necessary.

llvm-svn: 58338

ff2b4948

Fix darwin ppc llvm-gcc build breakage: intercept · 4068a7f3

Duncan Sands authored Oct 28, 2008

ppcf128 to i32 conversion and expand it into a code
sequence like in LegalizeDAG.  This needs custom
ppc lowering of FP_ROUND_INREG, so turn that on and
make it work with LegalizeTypes.  Probably PPC should
simply custom lower the original conversion.

llvm-svn: 58329

4068a7f3

Fix a nasty miscompilation of 176.gcc on linux/x86 where we synthesized · 38461f6b

Chris Lattner authored Oct 28, 2008

a memset using 16-byte XMM stores, but where the stack realignment code
didn't work.  Until it does (PR2962) disable use of xmm regs in memcpy
and memset formation for linux and other targets with insufficiently
aligned stacks.

This is part of PR2888

llvm-svn: 58317

38461f6b

Oct 27, 2008

· ce2a9381

David Greene authored Oct 27, 2008

Have TableGen emit setSubgraphColor calls under control of a -gen-debug
flag.  Then in a debugger developers can set breakpoints at these calls
to see what is about to be selected and what the resulting subgraph
looks like.  This really helps when debugging instruction selection.

llvm-svn: 58278

ce2a9381

For now, don't split live intervals around x87 stack register barriers.... · f7137229

Evan Cheng authored Oct 27, 2008

For now, don't split live intervals around x87 stack register barriers. FpGET_ST0_80 must be right after a call instruction (and ADJCALLSTACKUP) so we need to find a way to prevent reload of x87 registers between them.

llvm-svn: 58230

f7137229

Oct 25, 2008

Move the code that adds the DeadMachineInstructionElimPass from · 19145317

Dan Gohman authored Oct 25, 2008

target-independent code to target-specific code. This prevents it
from running on targets that aren't using fast-isel.

In addition to saving compile time, this addresses the problem
that not all targets are prepared for it. In order to use this
pass, all instructions must declare all their fixed uses and
defs of physical registers.

llvm-svn: 58144

19145317

Support for allocation of TLS variables in the JIT. Allocation of a global · 5457ce9a

Nicolas Geoffray authored Oct 25, 2008

variable is moved to the execution engine. The JIT calls the TargetJITInfo
to allocate thread local storage. Currently, only linux/x86 knows how to
allocate thread local global variables.

llvm-svn: 58142

5457ce9a

Generate code for TLS instructions. · db30612f
Nicolas Geoffray authored Oct 25, 2008
```
llvm-svn: 58141
```
db30612f
CMake: lib/Target/ARM/AsmPrinter/CMakeLists.txt added. · 9ba4650b
Oscar Fuentes authored Oct 25, 2008
```
llvm-svn: 58133
```
9ba4650b
Mark MFCR as reading all condition code registers. · 71f361e7
Dale Johannesen authored Oct 24, 2008
```
Prevents some more overzealous deletions (mostly
in AltiVec code).

llvm-svn: 58121
```
71f361e7

Oct 24, 2008

Rewrite logic to figure out whether LR needs to · 3863f8e7

Dale Johannesen authored Oct 24, 2008

be saved/restored in the prolog/epilog.  We need
to do this iff something in the function stores
into it.

llvm-svn: 58116

3863f8e7

move the note to the correct README · 33986d8f
Torok Edwin authored Oct 24, 2008
```
llvm-svn: 58104
```
33986d8f
add note about va_arg code on x86 and x86-64 · fcaae546
Torok Edwin authored Oct 24, 2008
```
llvm-svn: 58103
```
fcaae546

Fix translateX86CC: if SetCCOpcode is SETULE and · 014f5bba

Duncan Sands authored Oct 24, 2008

LHS is a foldable load, then LHS and RHS are swapped
and SetCCOpcode is changed to SETUGT.  But the later
code is expecting operands to be the wrong way round
for SETUGT, but they are not in this case, resulting
in an inverted compare.  The solution is to move the
load normalization before the correction for SETUGT.
This bug was tickled by LegalizeTypes which happened
to legalize the testcase slightly differently to
LegalizeDAG.

llvm-svn: 58092

014f5bba

Fix constant-offset emission for x86-64 absolute addresses. This · 712886f5
Dan Gohman authored Oct 24, 2008
```
fixes a bunch of test-suite JIT failures on x86-64 in
-relocation-model=static mode.

llvm-svn: 58066
```
712886f5

Oct 23, 2008
- Mark defs and uses of CTR and LR correctly. · e395d786
  Dale Johannesen authored Oct 23, 2008
```
Prevents DeadMachineInstructionElim from thinking
things like MTCTR are dead (fixes massive
testsuite breakage at -O0).

llvm-svn: 58043
```
  e395d786
- remove extraneous #ifdef's · 1ecf1fd5
  Jim Grosbach authored Oct 22, 2008
```
llvm-svn: 58006
```
  1ecf1fd5
Oct 22, 2008
- Remove allocation of unused stack slot. · f6655a9e
  Dale Johannesen authored Oct 22, 2008
```
llvm-svn: 57987
```
  f6655a9e
- Get this working with LegalizeTypes: (1) don't · 5ee1dde8
  Duncan Sands authored Oct 22, 2008
```
assume that i64 has been turned into a BUILD_PAIR
node (when called from LegalizeTypes this hasn't
happened yet) and (2) don't use a vector shuffle mask
with an illegal element type.

llvm-svn: 57972
```
  5ee1dde8
- Fix PR2907 by digging through constant expressions to find FP constants that · 35b40f8c
  Chris Lattner authored Oct 22, 2008
```
are their operands.

llvm-svn: 57956
```
  35b40f8c
- CMake: Turned some libraries into partially linked objects. Corrected · f3c03b02
  Oscar Fuentes authored Oct 22, 2008
```
names of LLVMCore and ARMCodeGen.

llvm-svn: 57943
```
  f3c03b02
- Adjust comments for pedantic satisfaction. · cf4607fc
  Dale Johannesen authored Oct 22, 2008
```
llvm-svn: 57940
```
  cf4607fc
- Add comments to explain uint64->f64 algorithm, · 3d7ece1a
  Dale Johannesen authored Oct 21, 2008
```
well, sort of.  (Algorithm by Ian Ollmann.)

llvm-svn: 57932
```
  3d7ece1a
Oct 21, 2008

Add an SSE2 algorithm for uint64->f64 conversion. · 28929589

Dale Johannesen authored Oct 21, 2008

The same one Apple gcc uses, faster.  Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.

llvm-svn: 57926

28929589
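
The commit message above does not spell out the algorithm. As a rough illustration only, the following scalar C++ sketch shows one widely known way to convert an unsigned 64-bit integer to a double using magic exponent constants; that the commit uses this exact scheme is an assumption, and the helper names `bitsToDouble` and `uint64ToDouble` are hypothetical. An SSE2 version would perform the two subtractions in parallel in one vector register.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as an IEEE-754 double.
static double bitsToDouble(std::uint64_t bits) {
  double d;
  static_assert(sizeof d == sizeof bits, "double must be 64 bits");
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// Scalar model of a magic-constant uint64 -> double conversion.
// Hypothetical helper; not taken from the LLVM sources.
double uint64ToDouble(std::uint64_t x) {
  const std::uint64_t kBits2Pow52 = 0x4330000000000000ULL; // bit pattern of 2^52
  const std::uint64_t kBits2Pow84 = 0x4530000000000000ULL; // bit pattern of 2^84

  // OR-ing the low 32 bits into the mantissa of 2^52 gives exactly 2^52 + lo.
  double lo = bitsToDouble(kBits2Pow52 | (x & 0xFFFFFFFFULL));
  // OR-ing the high 32 bits into the mantissa of 2^84 gives exactly
  // 2^84 + hi * 2^32.
  double hi = bitsToDouble(kBits2Pow84 | (x >> 32));

  // Both subtractions are exact, so only the final addition rounds.
  double result = (hi - bitsToDouble(kBits2Pow84)) +
                  (lo - bitsToDouble(kBits2Pow52));
  assert(result >= 0.0); // an unsigned input never converts to a negative value
  return result;
}
```

Because the value is rounded only once, in the final addition, extreme inputs such as `UINT64_MAX` round to the nearest double under the default rounding mode.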

Implement the optimized FCMP_OEQ/FCMP_UNE code for x86 fast-isel. · 4ddf7a4c
Dan Gohman authored Oct 21, 2008
```
llvm-svn: 57915
```
4ddf7a4c
use pre-UAL mnemonics for push/pop for compilation callback function · cfebc18d
Jim Grosbach authored Oct 21, 2008
```
llvm-svn: 57911
```
cfebc18d
Disable constant-offset folding for PowerPC, as the PowerPC target · c14e5227
Dan Gohman authored Oct 21, 2008
```
isn't yet prepared for it.

llvm-svn: 57886
```
c14e5227

Don't create TargetGlobalAddress nodes with offsets that don't fit · 269246b0

Dan Gohman authored Oct 21, 2008

in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.

Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.

llvm-svn: 57885

269246b0
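
The two checks described above can be sketched in plain C++. This is an illustration under stated assumptions, not the code from the commit: `fitsInSigned32` and `signExtend` are hypothetical helper names, and the bit width passed to `signExtend` stands in for either the pointer size or the size of the GlobalAddress node.

```cpp
#include <cassert>
#include <cstdint>

// True if the offset is representable in a signed 32-bit field.
bool fitsInSigned32(std::int64_t offset) {
  return offset >= INT32_MIN && offset <= INT32_MAX;
}

// Sign-extend the low `bits` bits of `value` to 64 bits (1 <= bits <= 64).
std::int64_t signExtend(std::uint64_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  const std::uint64_t signBit = 1ULL << (bits - 1);
  const std::uint64_t mask = (bits == 64) ? ~0ULL : ((1ULL << bits) - 1);
  value &= mask;
  // Flipping and then subtracting the sign bit propagates it upward.
  return static_cast<std::int64_t>((value ^ signBit) - signBit);
}
```

For example, the pattern `0xFFFFFFFF` sign-extended at 32 bits becomes -1, while at a 64-bit pointer size it stays 4294967295, which `fitsInSigned32` rejects. Extending at the wrong width would therefore make an out-of-range offset look like a small negative one.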

Optimized FCMP_OEQ and FCMP_UNE for x86. · 97d95d6d

Dan Gohman authored Oct 21, 2008

Where previously LLVM might emit code like this:

        ucomisd %xmm1, %xmm0
        setne   %al
        setp    %cl
        orb     %al, %cl
        jne     .LBB4_2

it now emits this:

        ucomisd %xmm1, %xmm0
        jne     .LBB4_2
        jp      .LBB4_2

It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.

To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replaces them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.

Also, X86InstrInfo.cpp's branch analysis and transform code now
knows how to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.

llvm-svn: 57873

97d95d6d
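
To see why the two sequences shown above are equivalent, the following C++ sketch models the flags that `ucomisd` sets (ZF, PF, and CF are all set for an unordered compare) and evaluates both sequences. It is a simplified model for illustration; `Flags`, `ucomisd`, `branchTakenOld`, `branchTakenNew`, and `checkEquivalent` are hypothetical names, not LLVM code.

```cpp
#include <cassert>
#include <cmath>

// Subset of EFLAGS written by ucomisd.
struct Flags {
  bool zf, pf, cf;
};

// Model of "ucomisd b, a" in AT&T syntax: compare a against b.
Flags ucomisd(double a, double b) {
  if (std::isnan(a) || std::isnan(b)) return {true, true, true}; // unordered
  if (a == b) return {true, false, false};
  if (a < b) return {false, false, true};
  return {false, false, false}; // a > b
}

// Old sequence: setne %al; setp %cl; orb %al, %cl; jne .LBB4_2
bool branchTakenOld(double a, double b) {
  Flags f = ucomisd(a, b);
  bool al = !f.zf; // setne
  bool cl = f.pf;  // setp
  cl = cl || al;   // orb %al, %cl
  return cl;       // jne is taken when the or result is nonzero
}

// New sequence: jne .LBB4_2; jp .LBB4_2
bool branchTakenNew(double a, double b) {
  Flags f = ucomisd(a, b);
  if (!f.zf) return true; // jne
  if (f.pf) return true;  // jp
  return false;
}

// Both sequences implement FCMP_UNE: true if unordered or not equal.
void checkEquivalent(double a, double b) {
  assert(branchTakenOld(a, b) == branchTakenNew(a, b));
  assert(branchTakenNew(a, b) == !(a == b));
}
```

Calling `checkEquivalent` with any pair of doubles, including NaN, exercises both paths; the second branch (`jp`) only matters for the unordered case, where ZF is also set.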

When the coalescer is doing rematerializing, have it remove · c835458d

Dan Gohman authored Oct 21, 2008

the copy instruction from the instruction list before asking the
target to create the new instruction. This gets the old instruction
out of the way so that it doesn't interfere with the target's
rematerialization code. In the case of x86, this helps it find
more cases where EFLAGS is not live.

Also, in the X86InstrInfo.cpp, teach isSafeToClobberEFLAGS to check
to see if it reached the end of the block after scanning each
instruction, instead of just before. This lets it notice when the
end of the block is only two instructions away, without doing any
additional scanning.

These changes allow rematerialization to clobber EFLAGS in more
cases, for example using xor instead of mov to set the return value
to zero in the included testcase.

llvm-svn: 57872

c835458d

Oct 20, 2008

Update the stub and callback code to handle lazy compilation. The stub · 9396051e

Jim Grosbach authored Oct 20, 2008

is re-written by the callback to branch directly to the compiled code
in future invocations.

Added back in range-based memory permission functions for the updating of
the stub on Darwin.

llvm-svn: 57846

9396051e

Have X86 custom lowering for LegalizeTypes use · 1d20ab57

Duncan Sands authored Oct 20, 2008

LowerOperation if it doesn't know what else to do.
These methods should probably be factorized some,
but this is good enough for the moment.  Have
LowerATOMIC_BINARY_64 use EXTRACT_ELEMENT rather
than assuming the operand is a BUILD_PAIR (if it
is then getNode will automagically simplify the
EXTRACT_ELEMENT).  This way LowerATOMIC_BINARY_64 is
usable from LegalizeTypes.

llvm-svn: 57831

1d20ab57

Oct 18, 2008

Teach DAGCombine to fold constant offsets into GlobalAddress nodes, · 2fe6bee5

Dan Gohman authored Oct 18, 2008

and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)

This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.

This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.

Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.

The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.

llvm-svn: 57748

2fe6bee5

Oct 17, 2008
- This is now partly done. · 209fc264
  Dan Gohman authored Oct 17, 2008
```
llvm-svn: 57734
```
  209fc264
- This is done. · b1d8d6ec
  Dan Gohman authored Oct 17, 2008
```
llvm-svn: 57733
```
  b1d8d6ec