Commits · 09f5be814639e29bebdf818cff7baaa5a81cb22c · Roger Ferrer / llvm-epi-0.8

Mar 30, 2009
- Do not propagate ELF-specific stuff (data.rel) into other targets. This... · 7c5f3c40
  Anton Korobeynikov authored Mar 30, 2009
```
Do not propagate ELF-specific stuff (data.rel) into other targets. This simplifies code and also ensures correctness.

llvm-svn: 68032
```
  7c5f3c40
- Add data.rel stuff · c247fd39
  Anton Korobeynikov authored Mar 30, 2009
```
llvm-svn: 68031
```
  c247fd39
Mar 28, 2009

Use array_lengthof · 1f11c3c3
Rafael Espindola authored Mar 28, 2009
```
llvm-svn: 67950
```
1f11c3c3
Have only one definition of X86AddrNumOperands. · 6ff3dabb
Rafael Espindola authored Mar 28, 2009
```
llvm-svn: 67949
```
6ff3dabb
Make code a bit less brittle by not hardcoding the number · c2a17d30
Rafael Espindola authored Mar 28, 2009
```
of operands in an address in so many places.

llvm-svn: 67945
```
c2a17d30

Optimize some 64-bit multiplication by constants into two lea's or one lea +... · fd81c73c

Evan Cheng authored Mar 28, 2009

Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g.
x * 40
=>
shlq    $3, %rdi
leaq    (%rdi,%rdi,4), %rax

This has the added benefit of allowing more multiplies to be folded into addressing modes. e.g.
a * 24 + b
=>
leaq    (%rdi,%rdi,2), %rax
leaq    (%rsi,%rax,8), %rax

llvm-svn: 67917

fd81c73c
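
The arithmetic behind the two examples above can be checked with a small model. This is only a sketch: `lea`, `mul40`, `mul24_add` and `decompose` are hypothetical helper names, and `decompose` covers just the "one lea + shl" case, not the selection logic actually used in the backend.

```python
MASK64 = (1 << 64) - 1  # model 64-bit wraparound


def lea(base, index, scale, disp=0):
    """Model leaq disp(base,index,scale): base + index*scale + disp, mod 2**64."""
    assert scale in (1, 2, 4, 8)
    return (base + index * scale + disp) & MASK64


def mul40(x):
    """x * 40 as one shift plus one lea."""
    x = (x << 3) & MASK64   # shlq $3, %rdi          -> 8*x
    return lea(x, x, 4)     # leaq (%rdi,%rdi,4)     -> 8*x + 4*(8*x) = 40*x


def mul24_add(a, b):
    """a * 24 + b as two leas."""
    t = lea(a, a, 2)        # leaq (%rdi,%rdi,2)     -> 3*a
    return lea(b, t, 8)     # leaq (%rsi,%rax,8)     -> b + 8*(3*a)


def decompose(c):
    """Return (lea_factor, shift) with c == lea_factor << shift, or None.

    lea_factor is one of 3, 5, 9, the multipliers a single
    lea (%r,%r,scale) can produce with scale 2, 4 or 8.
    """
    for factor in (3, 5, 9):
        if c % factor == 0:
            q = c // factor
            if q > 0 and q & (q - 1) == 0:  # q is a power of two
                return factor, q.bit_length() - 1
    return None
```

For instance, `decompose(40)` yields factor 5 and shift 3, which matches the `shlq $3` / `leaq (%rdi,%rdi,4)` pair in the message.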

Mar 27, 2009
- Avoid hardcoding that X86 addresses have 4 operands. · 705f2a6c
  Rafael Espindola authored Mar 27, 2009
```
llvm-svn: 67848
```
  705f2a6c
- Use less hard coded constants to make the code less brittle. · 22781543
  Rafael Espindola authored Mar 27, 2009
```
llvm-svn: 67846
```
  22781543
- I am trying to add a segment to the X86 addresses matching to · e7280193
  Rafael Espindola authored Mar 27, 2009
```
improve TLS support (see http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090309/075220.html), but that code is VERY brittle.

This patch just makes it a bit more resistant.

llvm-svn: 67843
```
  e7280193
- -no-implicit-float means explicit fp operations are legal. · d88ebc35
  Evan Cheng authored Mar 26, 2009
```
llvm-svn: 67784
```
  d88ebc35
Mar 26, 2009

Pull transform from target-dependent code into target-independent code. · aa28be65
Bill Wendling authored Mar 26, 2009
```
llvm-svn: 67742
```
aa28be65

Match this pattern so that we can generate simpler code: · 94f299f2

Bill Wendling authored Mar 26, 2009

  %a = ...
  %b = and i32 %a, 2
  %c = srl i32 %b, 1
  %d = br i32 %c, 

into

  %a = ...
  %b = and %a, 2
  %c = X86ISD::CMP %b, 0
  %d = X86ISD::BRCOND %c ...

This applies only when the AND constant value has one bit set and the SRL
constant is equal to the log2 of the AND constant. The back-end is smart enough
to convert the result into a TEST/JMP sequence.

llvm-svn: 67728

94f299f2
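
A quick way to see why the stated precondition matters is to compare the two branch conditions directly. The sketch below uses hypothetical function names and plain integers in place of DAG nodes.

```python
def original_branch(a, and_const, shift):
    """Branch condition before the rewrite: ((a & C) >> S) != 0."""
    return ((a & and_const) >> shift) != 0


def rewritten_branch(a, and_const):
    """Branch condition after the rewrite: (a & C) != 0, i.e. TEST + JMP."""
    return (a & and_const) != 0


def pattern_applies(and_const, shift):
    """True when C has exactly one bit set and S == log2(C)."""
    single_bit = and_const > 0 and and_const & (and_const - 1) == 0
    return single_bit and shift == and_const.bit_length() - 1
```

With `and_const = 2` and `shift = 1` the two conditions agree for every input. With `and_const = 6` and `shift = 2` they differ for `a = 2`, which is why the rewrite is restricted to the single-bit case.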

Doxygen-ify comments. · 189d6718
Bill Wendling authored Mar 26, 2009
```
llvm-svn: 67727
```
189d6718

Mar 25, 2009
- CodeGen still defaults to non-verbose asm, but llc now overrides it and defaults to verbose. · 5e5a63cf
  Evan Cheng authored Mar 25, 2009
```
llvm-svn: 67668
```
  5e5a63cf
- Don't print global names twice with -asm-verbose. · 9966403e
  Evan Cheng authored Mar 25, 2009
```
llvm-svn: 67667
```
  9966403e
Mar 24, 2009

I was convinced that it's ok to allow a second i8 return value · efd2d44a

Dan Gohman authored Mar 24, 2009

to be returned in DL. LLVM's multiple-return-value support is
not ABI-conforming; front-ends that wish to have code emitted
that conforms to an ABI are currently expected to make
arrangements for this on their own rather than assuming that
multiple-return-values will automatically do the right thing.
This commit doesn't fundamentally change this situation.

llvm-svn: 67588

efd2d44a

Do not emit comments unless -asm-verbose. · a774a992
Evan Cheng authored Mar 24, 2009
```
llvm-svn: 67580
```
a774a992

Mar 23, 2009
- Correct some comments. Operand numbers start at 0. · 4a683478
  Dan Gohman authored Mar 23, 2009
```
llvm-svn: 67518
```
  4a683478
- Model inline asm constraint which ties an input to an output register as... · 968c3b0d
  Evan Cheng authored Mar 23, 2009
```
Model an inline asm constraint that ties an input to an output register as a machine operand TIED_TO constraint. This eliminates the need to pre-allocate registers for these, and it also allows the register allocator to eliminate the unneeded copies.

llvm-svn: 67512
```
  968c3b0d
- Fix a grammaro in a comment that Bill noticed. · 772de0ae
  Dan Gohman authored Mar 23, 2009
```
llvm-svn: 67507
```
  772de0ae
- Add comments explaining why there's only one register for · 70d9929d
  Dan Gohman authored Mar 23, 2009
```
i8 return values.

llvm-svn: 67502
```
  70d9929d
Mar 19, 2009
- Remove strange extra semicolons. · bfd4ad67
  Nick Lewycky authored Mar 19, 2009
```
llvm-svn: 67287
```
  bfd4ad67
Mar 18, 2009

Disable the "call to immediate" optimization on x86-64. It is · a6bed3e9

Chris Lattner authored Mar 18, 2009

not safe in general because the immediate could be an arbitrary
value that does not fit in a 32-bit pcrel displacement.  
Conservatively fall back to loading the value into a register
and calling through it.

We still do the optimization on X86-32.

llvm-svn: 67142

a6bed3e9
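
The range problem can be illustrated with the displacement arithmetic of a direct `call rel32`, which is encoded as opcode `E8` followed by a 4-byte displacement relative to the end of the instruction. The function names below are hypothetical, and the sketch assumes the call site address is known, which a compiler generally cannot assume.

```python
CALL_REL32_SIZE = 5  # opcode E8 plus a 4-byte displacement


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return -(1 << 31) <= value < (1 << 31)


def call_displacement(call_addr, target):
    """Displacement is measured from the end of the call instruction."""
    return target - (call_addr + CALL_REL32_SIZE)


def can_call_direct(call_addr, target):
    """True if target is reachable from call_addr with a rel32 call."""
    return fits_int32(call_displacement(call_addr, target))
```

In a 64-bit address space an arbitrary immediate target can lie more than 2 GiB from the call site, so `can_call_direct` may be false there.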

Mar 17, 2009
- Recognize bswapl as bswap too. · d6e571b2
  Dan Gohman authored Mar 17, 2009
```
llvm-svn: 67072
```
  d6e571b2
- Recognize "bswapq" as an alternate spelling for the bswap instruction. · 77a9279d
  Dan Gohman authored Mar 17, 2009
```
llvm-svn: 67071
```
  77a9279d
Mar 14, 2009

Use %rip-relative addressing on x86-64 whenever practical, as · f98cd1b4
Dan Gohman authored Mar 14, 2009
```
it has a smaller encoding than absolute addressing.

llvm-svn: 67002
```
f98cd1b4

Don't forgo folding of loads into 64-bit adds when the other · 2293eb60

Dan Gohman authored Mar 14, 2009

operand is a signed 32-bit immediate. Unlike with the 8-bit
signed immediate case, it isn't actually smaller to fold a
32-bit signed immediate instead of a load. In fact, it's
larger in the case of 32-bit unsigned immediates, because
they can be materialized with movl instead of movq.

llvm-svn: 67001

2293eb60

Improve FastISel's handling of truncates to i1, and implement · a62e4ab6

Dan Gohman authored Mar 13, 2009

ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.

llvm-svn: 66988

a62e4ab6

Mar 13, 2009

Fix FastISel's assumption that i1 values are always zero-extended · c0bb9595

Dan Gohman authored Mar 13, 2009

by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.

llvm-svn: 66941

c0bb9595
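
The hazard described above can be modeled with plain integers. This is a sketch under the assumption that an i1 lives in an 8-bit register whose upper seven bits may hold anything, and that an explicit zero extension amounts to masking with 1; the helper names are hypothetical.

```python
def trunc_to_i1_raw(value8):
    """Model a truncate to i1 that keeps whatever is in the upper 7 bits."""
    return value8 & 0xFF


def zext_i1(value8):
    """Explicit zero extension: keep only the low bit."""
    return value8 & 1
```

A byte such as `0x02` carries an i1 value of 0, but testing the whole byte for non-zero would treat it as true. Masking first gives the intended result.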

add 8 and 16 bit TLS moves. · 997b74ac
Rafael Espindola authored Mar 13, 2009
```
add a fixme note on how to remove code duplication.

llvm-svn: 66932
```
997b74ac
Improve sext and zext of TLS variables. · 71144973
Rafael Espindola authored Mar 13, 2009
```
llvm-svn: 66922
```
71144973

generalize this code so that fast isel handles integer truncates to i1, which · 3fb71c8f

Chris Lattner authored Mar 13, 2009

codegen to the same thing as integer truncates to i8 (the top bits are 
just undefined).  This implements rdar://6667338

llvm-svn: 66902

3fb71c8f

These instructions have special lowering that may lower them to SSE · 798fd56d
Bill Wendling authored Mar 13, 2009
```
instructions. Prevent that if we don't want implicit uses of SSE.

llvm-svn: 66877
```
798fd56d

Fix some significant problems with constant pools that resulted in unnecessary... · 1fb8aedd

Evan Cheng authored Mar 13, 2009

Fix some significant problems with constant pools that resulted in unnecessary padding between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.

1. The ConstantPoolSDNode alignment field is the log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. The MachineConstantPool alignment field is also a log2 value.
3. However, some places create ConstantPoolSDNode with an alignment value rather than a log2 value. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when the entries are created. However, the asm printer groups entries by section, so the offsets are no longer valid, yet the asm printer still uses them to determine the size of padding between entries.
5. The asm printer uses an expensive data structure, multimap, to track constant pool entries by section.
6. The asm printer iterates over a SmallPtrSet when emitting constant pool entries. This is non-deterministic.

Solutions:
1. The ConstantPoolSDNode alignment field is changed to keep the non-log2 value.
2. The MachineConstantPool alignment field is also changed to keep the non-log2 value.
3. Functions that create ConstantPool nodes now pass in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field; it is replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in the asm printer and JIT.
5. The asm printer uses a cheaper data structure to group constant pool entries.
6. The asm printer computes entry offsets after grouping is done.
7. The JIT code is changed to compute entry offsets on the fly.

llvm-svn: 66875

1fb8aedd
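
Why offsets have to be computed after grouping can be seen with a small layout model. The sketch below is not the asm printer's code: `align_to` and `layout` are hypothetical helpers, and entries are reduced to `(size, alignment)` pairs in bytes.

```python
def align_to(offset, alignment):
    """Round offset up to a multiple of alignment (a power of two, in bytes)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def layout(entries):
    """Compute offsets for entries that share one section.

    entries is a list of (size, alignment) pairs in bytes.
    Returns (offsets, total_size).
    """
    offsets = []
    offset = 0
    for size, alignment in entries:
        offset = align_to(offset, alignment)  # padding is inserted here
        offsets.append(offset)
        offset += size
    return offsets, offset


# Offsets computed at creation time, with all entries in one list.
creation_order = [(4, 4), (16, 16), (4, 4)]
stale_offsets, _ = layout(creation_order)

# After grouping, the two 4-byte literals share a section by themselves.
grouped_literal4 = [(4, 4), (4, 4)]
fresh_offsets, fresh_size = layout(grouped_literal4)
```

Here the second 4-byte literal has a stale offset of 32 but belongs at offset 4 once the 16-byte entry has moved to another section, so padding derived from the stale offsets would be wrong. The log2 confusion is similar in spirit: a byte alignment of 16 misread as a log2 value would mean `1 << 16` bytes.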

generalize the previous code to use the full generality of LEA · 99cc1337

Chris Lattner authored Mar 13, 2009

for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this).  On the example, we now
generate:

_test:
	movl	4(%esp), %eax
	cmpl	$42, (%eax)
	setl	%al
	movzbl	%al, %eax
	leal	4(%eax,%eax,8), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	movl	$4, %ecx
	movl	$13, %eax
	cmovg	%ecx, %eax
	ret

llvm-svn: 66869

99cc1337
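
The `leal 4(%eax,%eax,8), %eax` in the new output computes `4 + 9 * cond`, which selects between 13 and 4 without a conditional move. A minimal sketch of that identity follows; the function name and the set of multipliers are assumptions based on what a single LEA can encode, not a statement of which cases the commit handles.

```python
MASK32 = 0xFFFFFFFF

# Multipliers reachable with one LEA: index*scale (1, 2, 4, 8)
# or base + index*scale with base == index (2, 3, 5, 9).
LEA_MULTIPLIERS = (1, 2, 3, 4, 5, 8, 9)


def select_via_lea(cond, true_val, false_val):
    """Compute cond ? true_val : false_val as false_val + zext(cond) * diff.

    Returns None when the difference is not a single-LEA multiplier.
    """
    diff = true_val - false_val
    if diff not in LEA_MULTIPLIERS:
        return None
    z = 1 if cond else 0                   # setcc + movzbl
    return (false_val + z * diff) & MASK32  # leal false_val(%eax,%eax,diff-1)
```

For the listing above, `select_via_lea(x < 42, 13, 4)` reproduces both outcomes.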

optimize the case of cond ? 42 : 41 and friends. This compiles the · 4be6df5d

Chris Lattner authored Mar 13, 2009

example to:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	setg	%al
	movzbl	%al, %eax
	orl	$4294967294, %eax
	ret

instead of:

	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	movl	$4294967294, %ecx
	movl	$4294967295, %eax
	cmova	%ecx, %eax
	ret

which is smaller in code size and faster. rdar://6668608

llvm-svn: 66868

4be6df5d
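
Selecting between two adjacent constants reduces to adding or OR-ing the zero-extended condition into the lower constant. The sketch below uses hypothetical names and 32-bit wraparound; the OR form is valid only when the lower constant has its low bit clear, as with 4294967294 in the listing.

```python
MASK32 = 0xFFFFFFFF


def select_adjacent_add(cond, low):
    """cond ? low + 1 : low, computed as low + zext(cond)."""
    return (low + (1 if cond else 0)) & MASK32


def select_adjacent_or(cond, low):
    """Same selection using OR; requires the low bit of low to be clear."""
    assert low & 1 == 0
    return (low | (1 if cond else 0)) & MASK32
```

With `low = 41` the add form gives 42 or 41, and with `low = 4294967294` both forms give 4294967295 or 4294967294.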

Enhance address-mode folding of ISD::ADD to handle cases where the · a1d92423

Dan Gohman authored Mar 13, 2009

operands can't both be fully folded at the same time. For example,
in the included testcase, a global variable is being added with
an add of two values. The global variable wants RIP-relative
addressing, so it can't share the address with another base
register, but it's still possible to fold the initial add.

llvm-svn: 66865

a1d92423

Mar 12, 2009

Re-apply 66024 with fixes: 1. Fixed indirect call to immediate address... · 2a332aa8

Evan Cheng authored Mar 12, 2009

Re-apply 66024 with fixes: 1. Fixed indirect call to immediate address assembly. 2. Fixed JIT encoding by making the address pc-relative.

llvm-svn: 66803

2a332aa8

Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" · 4147f08e

Chris Lattner authored Mar 12, 2009

related transformations out of target-specific dag combine into the
ARM backend.  These were added by Evan in r37685 with no testcases
and only seem to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
with the recently added cp constant select optimization, but is a
very general xform.  For example, we now compile the second example
in const-select.ll to:

_test:
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta    %al
        movzbl  %al, %eax
        movl    4(%esp), %ecx
        movsbl  (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl    4(%esp), %eax
        leal    4(%eax), %ecx
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe  %eax, %ecx
        movsbl  (%ecx), %eax
        ret

This passes multisource and dejagnu.

llvm-svn: 66779

4147f08e
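
The `cond ? 8 : 0` rewrite and the folded address in the new listing both rest on treating the condition as a 0/1 integer. A minimal sketch, with hypothetical helper names and plain integers standing in for registers:

```python
def select_pow2_or_zero(cond, value):
    """cond ? value : 0 for a power-of-two value, as zext(cond) << log2(value)."""
    assert value > 0 and value & (value - 1) == 0
    shift = value.bit_length() - 1
    return (1 if cond else 0) << shift


def select_address(cond, base, step=4):
    """cond ? base + step : base, as base + zext(cond) * step.

    This mirrors movsbl (%ecx,%eax,4), where %eax holds the 0/1 condition.
    """
    return base + (1 if cond else 0) * step
```

`select_pow2_or_zero(cond, 8)` returns 8 or 0 without a branch or conditional move, and `select_address` shows how the scaled index absorbs the choice between `p` and `p + 4`.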

improve comment. · a492d29c
Chris Lattner authored Mar 12, 2009
```
llvm-svn: 66778
```
a492d29c