Commits · d51f196ff5d18ef794c3064b1aa2bd7f16d771ed · Roger Ferrer / llvm-epi-0.8

Mar 31, 2009
- Minor top-level comment fix. · d51f196f
  Dan Gohman authored Mar 31, 2009
```
llvm-svn: 68113
```
  d51f196f
- Fix live-out reg logic to not insert over-aggressive AssertZExt · 97a20b8d
  Dan Gohman authored Mar 31, 2009
```
instructions. This fixes lua.

llvm-svn: 68083
```
  97a20b8d
Mar 29, 2009
- Fix PR3899: add support for extracting floats from vectors · d21581ea
  Duncan Sands authored Mar 29, 2009
```
when using -soft-float.
Based on a patch by Jakob Stoklund Olesen.

llvm-svn: 67996
```
  d21581ea
Mar 28, 2009

Make check in CheckTailCallReturnConstraints for ignorable instructions between · e622cbf3
Arnold Schwaighofer authored Mar 28, 2009
```
a CALL and a RET node more generic. Add a test for tail calls with a void
return.

llvm-svn: 67943
```
e622cbf3

Enable tail call optimization for functions that return a struct (bug 3664)... · 83d5420d

Arnold Schwaighofer authored Mar 28, 2009

Enable tail call optimization for functions that return a struct (bug 3664) and for functions that return types that need extending (e.g. i1).

llvm-svn: 67934

83d5420d

Optimize some 64-bit multiplication by constants into two lea's or one lea +... · fd81c73c

Evan Cheng authored Mar 28, 2009

Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g.
x * 40
=>
shlq    $3, %rdi
leaq    (%rdi,%rdi,4), %rax

This has the added benefit of allowing more multiplies to be folded into addressing modes. e.g.
a * 24 + b
=>
leaq    (%rdi,%rdi,2), %rax
leaq    (%rsi,%rax,8), %rax
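
Both rewrites rely on an x86 `lea` computing `base + index * scale` with a scale of 1, 2, 4, or 8, so `x * 3`, `x * 5`, and `x * 9` each cost a single `lea`. Below is a minimal Python sketch of that arithmetic, assuming 64-bit wrap-around; the helper names `lea`, `mul40`, and `mul24_add` are illustrative and are not part of LLVM.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register wrap-around


def lea(base, index, scale):
    """Model `leaq (base,index,scale)`: base + index * scale, mod 2**64."""
    assert scale in (1, 2, 4, 8)
    return (base + index * scale) & MASK64


def mul40(x):
    """x * 40 as `shlq $3` followed by `leaq (%rdi,%rdi,4)`."""
    x = (x << 3) & MASK64  # x * 8
    return lea(x, x, 4)    # (x * 8) * 5


def mul24_add(a, b):
    """a * 24 + b as two leas, folding the add into the second one."""
    t = lea(a, a, 2)       # a * 3
    return lea(b, t, 8)    # b + (a * 3) * 8
```

The second helper shows why the decomposition helps addressing-mode folding: the final scale-by-8 and the add of `b` share one instruction.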

llvm-svn: 67917

fd81c73c

Fix what surely must be a copy+pasto. · 2785e4be
Dan Gohman authored Mar 27, 2009
```
llvm-svn: 67881
```
2785e4be
Initialize LiveOutInfo's APInt members to zero, as APInt's · 6d758764
Dan Gohman authored Mar 27, 2009
```
default constructor produces an uninitialized APInt.
This fixes PR3896.

llvm-svn: 67879
```
6d758764

Mar 26, 2009
- Pull transform from target-dependent code into target-independent code. · aa28be65
  Bill Wendling authored Mar 26, 2009
```
llvm-svn: 67742
```
  aa28be65
Mar 25, 2009

Revert 67132. This is breaking some objective-c apps. · 2e9f42be

Evan Cheng authored Mar 25, 2009

Also fixes SDISel so it *does not* force promote return value if the function is not marked signext / zeroext.

llvm-svn: 67701

2e9f42be

When optimizing with debug info, don't keep the · eb1646d2

Dale Johannesen authored Mar 25, 2009

stoppoint nodes around until Legalize; doing this
imposed an ordering on a sequence of loads that
came from different lines, interfering with scheduling.

llvm-svn: 67692

eb1646d2

Mar 24, 2009
- more tidying: name the components of PhysReg in the case when · c35847e1
  Chris Lattner authored Mar 24, 2009
```
the target constraint specifies a specific physreg.

llvm-svn: 67618
```
  c35847e1
- Tidy a bit more. · 42eceb34
  Chris Lattner authored Mar 24, 2009
```
llvm-svn: 67617
```
  42eceb34
- simplify this code a bit now that "allocation to a vreg class" can never · 246eda43
  Chris Lattner authored Mar 24, 2009
```
fail.

llvm-svn: 67616
```
  246eda43
- Minor compile-time optimization; don't bother checking · f3746cbc
  Dan Gohman authored Mar 24, 2009
```
canClobberPhysRegDefs if the successor node doesn't
clobber any physical registers.

llvm-svn: 67587
```
  f3746cbc
- Add a pre-pass to the burr-list scheduler which makes adjustments to · 9a658d72
  Dan Gohman authored Mar 24, 2009
```
help out the register pressure reduction heuristics in the case of
nodes with multiple uses. Currently this uses very conservative
heuristics, so it doesn't have a broad impact, but in cases where it
does help it can make a big difference.

llvm-svn: 67586
```
  9a658d72
Mar 23, 2009

When unfolding a load during scheduling, the new operator node has · ed0e8d44

Dan Gohman authored Mar 23, 2009

a data dependency on the load node, so it really needs a
data-dependence edge to the load node, even if the load previously
existed.

And add a few comments.

llvm-svn: 67554

ed0e8d44

Don't set SUnit::hasPhysRegDefs to true unless the defs · f477262e
Dan Gohman authored Mar 23, 2009
```
actually have uses, which reflects the way it's used.

llvm-svn: 67540
```
f477262e

Fix canClobberPhysRegDefs to check all SDNodes grouped together · a366da1b

Dan Gohman authored Mar 23, 2009

in an SUnit, instead of just the first one. This fix is needed
by some upcoming scheduler changes.

llvm-svn: 67531

a366da1b

Add a new bit to SUnit to record whether a node has implicit physreg · 52c278e5
Dan Gohman authored Mar 23, 2009
```
defs, regardless of whether they are actually used.

llvm-svn: 67528
```
52c278e5
Now that errs() is properly non-buffered, there's no need to · 4f2fea1a
Dan Gohman authored Mar 23, 2009
```
explicitly flush it.

llvm-svn: 67526
```
4f2fea1a

Model inline asm constraint which ties an input to an output register as... · 968c3b0d

Evan Cheng authored Mar 23, 2009

Model an inline asm constraint which ties an input to an output register as a machine operand TIED_TO constraint. This eliminates the need to pre-allocate registers for these operands. It also allows the register allocator to eliminate the unneeded copies.

llvm-svn: 67512

968c3b0d

Mar 20, 2009
- Simplify this code; use a while instead of an if and a do-while. · 3bdc4bdb
  Dan Gohman authored Mar 20, 2009
```
llvm-svn: 67400
```
  3bdc4bdb
- For inline asm output operand that matches an input. Encode the input operand... · 2e55923f
  Evan Cheng authored Mar 20, 2009
```
For an inline asm output operand that matches an input, encode the input operand index in the high bits.

llvm-svn: 67387
```
  2e55923f
- Fixed the comment. No functionality change. · e9759c45
  Sanjiv Gupta authored Mar 20, 2009
```
llvm-svn: 67370
```
  e9759c45
Mar 18, 2009
- Added missing support for widening when splitting a unary op (PR3683) · 32c8074b
  Mon P Wang authored Mar 18, 2009
```
and expanding a bit convert (PR3711).  In both cases, we extract the
valid part of the widened vector and then do the conversion.

llvm-svn: 67175
```
  32c8074b
- Don't force promotion of return arguments on the callee. · 4606b121
  Rafael Espindola authored Mar 17, 2009
```
Some architectures (like x86) don't require it.
This fixes bug 3779.

llvm-svn: 67132
```
  4606b121
Mar 17, 2009

Fix codegen to compute the size of an allocation by multiplying the · 2363d0b8

Chris Lattner authored Mar 17, 2009

size by the array amount as an i32 value instead of promoting from
i32 to i64 then doing the multiply.  Not doing this broke wrap-around
assumptions that the optimizers (validly) made.  The ultimate real
fix for this is to introduce an i64 version of alloca and remove mallocinst.
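
The difference between the two multiplication orders only shows up when the 32-bit product overflows. Here is a minimal Python sketch of that case, with illustrative function names that do not come from LLVM. The operands are chosen to be non-negative as i32 values, so the result does not depend on whether the promotion is a sign or zero extension.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def alloc_size_i32(elem_size, count):
    """Multiply as i32 (wrapping at 2**32), then widen the result."""
    return (elem_size * count) & MASK32


def alloc_size_promoted(elem_size, count):
    """Widen both i32 operands to i64 first, then multiply."""
    return ((elem_size & MASK32) * (count & MASK32)) & MASK64


# 4-byte elements, count just past 2**30: the i32 product wraps around.
wrapped = alloc_size_i32(4, 0x40000001)
promoted = alloc_size_promoted(4, 0x40000001)
```

With these inputs the i32 product wraps to 4, while the promoted product is 0x100000004, so code that assumed 32-bit wrap-around would disagree with the promoted computation.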

This fixes PR3829

llvm-svn: 67093

2363d0b8

Fix a problem with DAGCombine where we were building an illegal build · 523c0852

Mon P Wang authored Mar 17, 2009

vector shuffle mask. Forced the mask to be built using i32.  Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.

llvm-svn: 67076

523c0852

Mar 14, 2009

Avoid doing the transformation c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4 · c8671563

Mon P Wang authored Mar 14, 2009

if FPConstant is legal because if the FPConstant doesn't need to be stored
in a constant pool, the transformation is unlikely to be profitable.
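
For reference, the transformation being restricted here replaces a select between two FP constants with an indexed load from a two-entry table. Below is a rough Python model, assuming 4-byte floats and a condition that is 0 or 1; `TABLE`, `select_via_load`, and `select_direct` are names made up for this sketch.

```python
import struct

# Two float32 constants laid out back to back, false value first.
TABLE = struct.pack("<2f", 2.0, 1.0)


def select_via_load(c):
    """Model `load {2.0, 1.0} + c*4`: c is 0 or 1, each entry is 4 bytes."""
    (value,) = struct.unpack_from("<f", TABLE, int(bool(c)) * 4)
    return value


def select_direct(c):
    """The original `c ? 1.0 : 2.0`."""
    return 1.0 if c else 2.0
```

The table form pays for a memory load, which is why it is unlikely to win when the target can materialize the constants without a constant pool.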

llvm-svn: 66994

c8671563

Improve FastISel's handling of truncates to i1, and implement · a62e4ab6

Dan Gohman authored Mar 13, 2009

ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.

llvm-svn: 66988

a62e4ab6

Mar 13, 2009

Fix FastISel's assumption that i1 values are always zero-extended · c0bb9595

Dan Gohman authored Mar 13, 2009

by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
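
The hazard can be modeled in a few lines of Python. This is a sketch of the general problem, not of FastISel's code: it assumes an i1 stored in a wider register whose upper bits are unspecified, and all names are illustrative.

```python
def zext_i1(reg):
    """Explicit zero extension of an i1: keep only bit 0."""
    return reg & 1


def is_true_unsafe(reg):
    """Tests the whole register, assuming the upper bits are already zero."""
    return reg != 0


def is_true_safe(reg):
    """Zero-extends first, so garbage in the upper bits is ignored."""
    return zext_i1(reg) != 0


# An i1 value of 0 held in an 8-bit register with garbage above bit 0.
garbage_false = 0xFE
```

For `garbage_false`, the unsafe test reports true while the safe test reports false, which is the kind of mismatch an explicit zero extension prevents.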

llvm-svn: 66941

c0bb9595

Fix some significant problems with constant pools that resulted in unnecessary... · 1fb8aedd

Evan Cheng authored Mar 13, 2009

Fix some significant problems with constant pools that resulted in unnecessary padding between constant pool entries, larger-than-necessary alignments (e.g. 8-byte alignment for .literal4 sections), and potentially other issues.

1. The ConstantPoolSDNode alignment field is the log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. The MachineConstantPool alignment field is also a log2 value.
3. However, some places create ConstantPoolSDNodes with an alignment value rather than a log2 value. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when the entries are created. However, the asm printer groups entries by section, so those offsets are no longer valid, yet the asm printer still uses them to determine the size of the padding between entries.
5. The asm printer uses an expensive data structure, multimap, to track constant pool entries by section.
6. The asm printer iterates over a SmallPtrSet when it emits constant pool entries. This is non-deterministic.

Solutions:
1. The ConstantPoolSDNode alignment field now keeps a non-log2 value.
2. The MachineConstantPool alignment field now also keeps a non-log2 value.
3. Functions that create ConstantPool nodes now pass in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field; it is replaced with an alignment field. Offsets are no longer computed when constant pool entries are created. They are computed on the fly in the asm printer and the JIT.
5. The asm printer uses a cheaper data structure to group constant pool entries.
6. The asm printer computes entry offsets after grouping is done.
7. The JIT code now computes entry offsets on the fly.
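
Computing offsets on the fly amounts to walking the entries of one section in emission order and rounding the running offset up to each entry's alignment. A minimal Python sketch follows, assuming alignments are stored in bytes (non-log2) and are powers of two; `align_to` and `layout` are hypothetical helpers, not LLVM functions.

```python
def align_to(offset, alignment):
    """Round offset up to a multiple of alignment (bytes, power of two)."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (offset + alignment - 1) & ~(alignment - 1)


def layout(entries):
    """entries: (size, alignment) pairs in bytes, already grouped by section.

    Returns the offset of each entry, computed in emission order.
    """
    offsets = []
    offset = 0
    for size, alignment in entries:
        offset = align_to(offset, alignment)  # padding is inserted here
        offsets.append(offset)
        offset += size
    return offsets
```

For example, `layout([(4, 4), (16, 16), (8, 8)])` places the 16-byte entry at offset 16, so the padding depends on the final grouping, which is why offsets recorded at creation time become stale.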

llvm-svn: 66875

1fb8aedd

Oops...I committed too much. · fa54bc20
Bill Wendling authored Mar 13, 2009
```
llvm-svn: 66867
```
fa54bc20
Temporarily XFAIL this test. · b02eadf6
Bill Wendling authored Mar 13, 2009
```
llvm-svn: 66866
```
b02eadf6
Fix a typo in a comment. · a19c662a
Dan Gohman authored Mar 12, 2009
```
llvm-svn: 66843
```
a19c662a

Mar 12, 2009

Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" · 4147f08e

Chris Lattner authored Mar 12, 2009

related transformations out of target-specific dag combine into the
ARM backend.  These were added by Evan in r37685 with no testcases
and only seem to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
with the recently added cp constant select optimization, but is a
very general xform.  For example, we now compile the second example
in const-select.ll to:

_test:
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta    %al
        movzbl  %al, %eax
        movl    4(%esp), %ecx
        movsbl  (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl    4(%esp), %eax
        leal    4(%eax), %ecx
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe  %eax, %ecx
        movsbl  (%ecx), %eax
        ret

This passes multisource and dejagnu.
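
The two families of rewrites mentioned in this message are plain value identities. The Python sketch below checks them on small inputs; the function names are made up for illustration and the code says nothing about how the DAG combines are implemented.

```python
def select_pow2(cond, shift):
    """cond ? (1 << shift) : 0, written as a select."""
    return (1 << shift) if cond else 0


def zext_shift(cond, shift):
    """The rewritten form: zext(cond) << shift, with no select."""
    return int(bool(cond)) << shift


def add_of_select(cc, c, x):
    """(add (select cc, 0, c), x)"""
    return (0 if cc else c) + x


def select_of_add(cc, c, x):
    """(select cc, x, (add x, c))"""
    return x if cc else x + c
```

In the `_test` listing above, the shifted condition is then folded into the scaled-index addressing mode of the final `movsbl`, which removes the `cmovbe`.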

llvm-svn: 66779

4147f08e

Enable Chris' value propagation change. It make available known sign, zero,... · 44659546

Evan Cheng authored Mar 12, 2009

Enable Chris' value propagation change. It makes known sign, zero, and one bits information available for values that are live out of basic blocks. The goal is to eliminate unnecessary sext, zext, and truncate operations on values that are live-in to blocks. This does not handle PHI nodes yet.

llvm-svn: 66777

44659546

Mar 11, 2009

reapply my previous patch (r66358) with a tweak to set the · 43d6377f

Chris Lattner authored Mar 11, 2009

alignment of the generated constant pool entry to the
desired alignment of a type.  If we don't do this, we end up
trying to do movsd from 4-byte alignment memory.  This fixes
450.soplex and 456.hmmer.

llvm-svn: 66641

43d6377f

Mar 10, 2009
- Revert 66358 for now. It's breaking povray, 450.soplex, and 456.hmmer on x86 / Darwin. · aa887653
  Evan Cheng authored Mar 10, 2009
```
llvm-svn: 66574
```
  aa887653