Commits · 2bde90b3ee6a391dc6902d2339e9e74485be2b90 · Roger Ferrer / llvm-epi-0.8

Jun 03, 2009
- Remove unnecessary #includes. · 11231d0c
  Dan Gohman authored Jun 03, 2009
```
llvm-svn: 72782
```
  11231d0c
Jun 02, 2009

Revert 72707 and 72709, for the moment. · 5234d379
Dale Johannesen authored Jun 02, 2009
```
llvm-svn: 72712
```
5234d379

Make the implicit inputs and outputs of target-independent · 0b8ca792

Dale Johannesen authored Jun 01, 2009

ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to)
instead of MVT::Flag.  Remove CARRY_FALSE in favor of 0; adjust
all target-independent code to use this format.

Most targets will still produce a Flag-setting target-dependent
version when selection is done.  X86 is converted to use i32
instead, which means TableGen needs to produce different code
in xxxGenDAGISel.inc.  This keys off the new supportsHasI1 bit
in xxxInstrInfo, currently set only for X86; in principle this
is temporary and should go away when all other targets have
been converted.  All relevant X86 instruction patterns are
modified to represent setting and using EFLAGS explicitly.  The
same can be done on other targets.

The immediate behavior change is that an ADC/ADD pair are no
longer tightly coupled in the X86 scheduler; they can be
separated by instructions that don't clobber the flags (MOV).
I will soon add some peephole optimizations based on using
other instructions that set the flags to feed into ADC.

llvm-svn: 72707

0b8ca792

May 30, 2009
- Untabification. · 09f17a84
  Bill Wendling authored May 30, 2009
```
llvm-svn: 72604
```
  09f17a84
May 28, 2009

Added optimization that narrow load / op / store and the 'op' is a bit... · a9cda8ab

Evan Cheng authored May 28, 2009

Added optimization that narrow load / op / store and the 'op' is a bit twiddling instruction and its second operand is an immediate. If bits that are touched by 'op' can be done with a narrower instruction, reduce the width of the load and store as well. This happens a lot with bitfield manipulation code.
e.g.
orl     $65536, 8(%rax)
=>
orb     $1, 10(%rax)

Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization.

llvm-svn: 72507

a9cda8ab

May 27, 2009
- Ger rid of some dead code. · a56159b7
  Eli Friedman authored May 27, 2009
```
llvm-svn: 72494
```
  a56159b7
- Don't abuse the quirky behavior of LegalizeDAG for XINT_TO_FP and · acb851a8
  Eli Friedman authored May 27, 2009
```
FP_TO_XINT.  Necessary for some cleanups I'm working on.  Updated 
from the previous version (r72431) to fix a bug and make some things a 
bit clearer.

llvm-svn: 72445
```
  acb851a8
May 26, 2009
- Back out r72431, it is causing a number of compilation crashes with clang. · d96b1178
  Daniel Dunbar authored May 26, 2009
```
llvm-svn: 72436
```
  d96b1178
- Don't abuse the quirky behavior of LegalizeDAG for XINT_TO_FP and · 8c7bff96
  Eli Friedman authored May 26, 2009
```
FP_TO_XINT.  Necessary for some cleanups I'm working on. 

llvm-svn: 72431
```
  8c7bff96
May 24, 2009
- Make the X86 backend mark EXTRACT_SUBVECTOR as Expand, at least for the · 2199ed39
  Eli Friedman authored May 23, 2009
```
moment.

llvm-svn: 72350
```
  2199ed39
May 23, 2009

Make the x86 backend custom-lower UINT_TO_FP and FP_TO_UINT on 32-bit · dfe4f253

Eli Friedman authored May 23, 2009

systems instead of attempting to promote them to a 64-bit SINT_TO_FP or 
FP_TO_SINT.  This is in preparation for removing the type legalization 
code from LegalizeDAG: once type legalization is gone from LegalizeDAG, 
it won't be able to handle the i64 operand/result correctly.

This isn't quite ideal, but I don't think any other operation for any 
target ends up in this situation, so treating this case specially seems 
reasonable.

llvm-svn: 72324

dfe4f253

May 13, 2009
- Run code placement optimization for targets that want it (arm and x86 for now). · ab0d2339
  Evan Cheng authored May 13, 2009
```
llvm-svn: 71726
```
  ab0d2339
May 08, 2009
- Fix PR4152: asm constraint validation happens before dag combine, so we · f1d9b914
  Chris Lattner authored May 08, 2009
```
need to work a bit to combine things like (x+c1+c2) into x+c3.

llvm-svn: 71232
```
  f1d9b914
Apr 30, 2009
- Fix infinite recursion in the C++ code which handles movddup by making it unnecessary. · 7e6e3527
  Nate Begeman authored Apr 29, 2009
```
llvm-svn: 70425
```
  7e6e3527
Apr 29, 2009
- Implement review feedback for vector shuffle work. · 5f829d89
  Nate Begeman authored Apr 29, 2009
```
llvm-svn: 70372
```
  5f829d89
Apr 27, 2009

2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan. · 8d6d4b92

Nate Begeman authored Apr 27, 2009

PR2957

ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

llvm-svn: 70225

8d6d4b92

Apr 24, 2009

Fix PR 4004 by including the call to __tls_get_addr in X86tlsaddr. This is not · c1396a23
Rafael Espindola authored Apr 24, 2009
```
very elegant, but neither is the tls specification :-(

llvm-svn: 69968
```
c1396a23
Revert 69952. Causes testsuite failures on linux x86-64. · b93db668
Rafael Espindola authored Apr 24, 2009
```
llvm-svn: 69967
```
b93db668

PR2957 · bb881d66

Nate Begeman authored Apr 24, 2009

ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask. A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.

llvm-svn: 69952

bb881d66

Apr 21, 2009
- Get rid of what looks like a copy-and-pasted typo. · 7ce5cc6b
  Duncan Sands authored Apr 21, 2009
```
Spotted by gcc-4.5.

llvm-svn: 69673
```
  7ce5cc6b
Apr 20, 2009

Move duplicated AddLiveIn function from X86 and ARM backends to be a method · f8b85477

Bob Wilson authored Apr 20, 2009

in the MachineFunction class, renaming it to addLiveIn for consistency with
the same method in MachineBasicBlock.  Thanks for Anton for suggesting this.

llvm-svn: 69615

f8b85477

Apr 17, 2009

For general dynamic TLS access we must use · 355fe12c

Rafael Espindola authored Apr 17, 2009

leaq	foo@TLSGD(%rip), %rdi

as part of the instruction sequence. Using a register other than %rdi and then
copying it to %rdi is not valid.

llvm-svn: 69350

355fe12c

Apr 13, 2009
- X86-64 TLS support for local exec and initial exec. · 6d6c6043
  Rafael Espindola authored Apr 13, 2009
```
llvm-svn: 68947
```
  6d6c6043
Apr 10, 2009
- Remove the obsolete SelectionDAG::getNodeValueTypes and simplify · de912e24
  Dan Gohman authored Apr 09, 2009
```
code that uses it by using SelectionDAG::getVTList instead.

llvm-svn: 68744
```
  de912e24
Apr 09, 2009
- Fix grammaros in comments. · f1545486
  Dan Gohman authored Apr 09, 2009
```
llvm-svn: 68666
```
  f1545486
Apr 08, 2009

Re-apply 68552. · 3b2df10c

Rafael Espindola authored Apr 08, 2009

Tested by bootstrapping llvm-gcc and using that to build llvm.

llvm-svn: 68645

3b2df10c

Avoid a hard coded constant. · d173f423
Rafael Espindola authored Apr 08, 2009
```
llvm-svn: 68603
```
d173f423

Implement support for using modeling implicit-zero-extension on x86-64 · ad3e549a

Dan Gohman authored Apr 08, 2009

with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.

This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.

Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.

Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.

Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.

Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.

llvm-svn: 68576

ad3e549a

Temporarily revert r68552. This was causing a failure in the self-hosting LLVM · 4aa25b79

Bill Wendling authored Apr 07, 2009

builds.

--- Reverse-merging (from foreign repository) r68552 into '.':
U    test/CodeGen/X86/tls8.ll
U    test/CodeGen/X86/tls10.ll
U    test/CodeGen/X86/tls2.ll
U    test/CodeGen/X86/tls6.ll
U    lib/Target/X86/X86Instr64bit.td
U    lib/Target/X86/X86InstrSSE.td
U    lib/Target/X86/X86InstrInfo.td
U    lib/Target/X86/X86RegisterInfo.cpp
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86CodeEmitter.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86InstrInfo.h
U    lib/Target/X86/X86ISelDAGToDAG.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
U    lib/Target/X86/X86ISelLowering.h
U    lib/Target/X86/X86InstrInfo.cpp
U    lib/Target/X86/X86InstrBuilder.h
U    lib/Target/X86/X86RegisterInfo.td

llvm-svn: 68560

4aa25b79

Apr 07, 2009

Reduce code duplication on the TLS implementation. · 1edda067

Rafael Espindola authored Apr 07, 2009

This introduces a small regression on the generated code
quality in the case we are just computing addresses, not
loading values.

Will work on it and on X86-64 support.

llvm-svn: 68552

1edda067

Apr 03, 2009
- Added a x86 dag combine to increase the chances to use a · 9c186c5d
  Mon P Wang authored Apr 03, 2009
```
movq for v2i64 on x86-32.

llvm-svn: 68368
```
  9c186c5d
Apr 02, 2009
- silence warning in release-asserts build. · d2eb0a63
  Chris Lattner authored Apr 01, 2009
```
llvm-svn: 68253
```
  d2eb0a63
Mar 31, 2009
- i128 shift libcalls are not available on x86. · d9d6e427
  Evan Cheng authored Mar 31, 2009
```
llvm-svn: 68133
```
  d9d6e427
Mar 30, 2009

When optimzing a mul by immediate into two, the resulting mul's should get a... · a84a3188

Evan Cheng authored Mar 30, 2009

When optimzing a mul by immediate into two, the resulting mul's should get a x86 specific node to avoid dag combiner from hacking on them further.

llvm-svn: 68066

a84a3188

Mar 28, 2009

Have only one definition of X86AddrNumOperands. · 6ff3dabb
Rafael Espindola authored Mar 28, 2009
```
llvm-svn: 67949
```
6ff3dabb

Optimize some 64-bit multiplication by constants into two lea's or one lea +... · fd81c73c

Evan Cheng authored Mar 28, 2009

Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g.
x * 40
=>
shlq    $3, %rdi
leaq    (%rdi,%rdi,4), %rax

This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq    (%rdi,%rdi,2), %rax
leaq    (%rsi,%rax,8), %rax

llvm-svn: 67917

fd81c73c

Mar 27, 2009
- I am trying to add a segment to the X86 addresses matching to · e7280193
  Rafael Espindola authored Mar 27, 2009
```
improve TLS support (see http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090309/075220.html), but that code is VERY brittle.

This patch just makes it a bit more resistant.

llvm-svn: 67843
```
  e7280193
- -no-implicit-float means explicit fp operations are legal. · d88ebc35
  Evan Cheng authored Mar 26, 2009
```
llvm-svn: 67784
```
  d88ebc35
Mar 26, 2009

Pull transform from target-dependent code into target-independent code. · aa28be65
Bill Wendling authored Mar 26, 2009
```
llvm-svn: 67742
```
aa28be65

Match this pattern so that we can generate simpler code: · 94f299f2

Bill Wendling authored Mar 26, 2009

  %a = ...
  %b = and i32 %a, 2
  %c = srl i32 %b, 1
  %d = br i32 %c, 

into

  %a = ...
  %b = and %a, 2
  %c = X86ISD::CMP %b, 0
  %d = X86ISD::BRCOND %c ...

This applies only when the AND constant value has one bit set and the SRL
constant is equal to the log2 of the AND constant. The back-end is smart enough
to convert the result into a TEST/JMP sequence.

llvm-svn: 67728

94f299f2