Commits · f3c6d6def51dd083aedf8f043db0c1ba69353b01 · Roger Ferrer / llvm-epi-0.8

Jun 18, 2011
- Remove unused but set variables. · 25e17b0f
  Benjamin Kramer authored Jun 18, 2011
```
llvm-svn: 133347
```
  25e17b0f
- Delete unneeded allocation order override. · b68fee1e
  Jakob Stoklund Olesen authored Jun 18, 2011
```
llvm-svn: 133331
```
  b68fee1e
- Switch ARM to using AltOrders instead of MethodBodies. · 831ae010
  Jakob Stoklund Olesen authored Jun 18, 2011
```
This slightly changes the GPR allocation order on Darwin where R9 is not
a callee-saved register:

Before: %R0 %R1 %R2 %R3 %R12 %R9 %LR %R4 %R5 %R6 %R8 %R10 %R11
After:  %R0 %R1 %R2 %R3 %R9 %R12 %LR %R4 %R5 %R6 %R8 %R10 %R11
llvm-svn: 133326
```
  831ae010
- Switch x86 to using AltOrders instead of MethodBodies. · 3337f7d5
  Jakob Stoklund Olesen authored Jun 18, 2011
```
llvm-svn: 133325
```
  3337f7d5
- Reserve D16-D31 on subtargets that don't support them. · 19d968e6
  Jakob Stoklund Olesen authored Jun 18, 2011
```
llvm-svn: 133321
```
  19d968e6
- Zap the last reference to allocation_order_begin(). · d3fec5ed
  Jakob Stoklund Olesen authored Jun 17, 2011
```
llvm-svn: 133310
```
  d3fec5ed
- SI, DI, BP, and SP don't have 8-bit sub-registers in x86 mode. · 157e6a79
  Jakob Stoklund Olesen authored Jun 17, 2011
```
llvm-svn: 133308
```
  157e6a79
Jun 17, 2011

Use the verbose asm flag instead of a new flag for decoding the LSDA. · b74b9de1
Bill Wendling authored Jun 17, 2011
```
llvm-svn: 133292
```
b74b9de1

Add an alternative rev16 pattern. We should figure out a better way to handle... · 7552a62a

Evan Cheng authored Jun 17, 2011

Add an alternative rev16 pattern. We should figure out a better way to handle these complex rev patterns. rdar://9609108

llvm-svn: 133289

7552a62a

Add an option that allows one to "decode" the LSDA. · e303114b

Bill Wendling authored Jun 17, 2011

The LSDA is a bit difficult for the uninitiated to read. Even with comments,
it's not always clear what's going on. This wraps the ASM streamer in a class
that retains the LSDA and then emits a human-readable description of what's
going on in it.

So instead of having to make sense of:

Lexception1:
        .byte   255
        .byte   155
        .byte   168
        .space  1
        .byte   3
        .byte   26
Lset0 = Ltmp7-Leh_func_begin1
      .long     Lset0
Lset1 = Ltmp812-Ltmp7
      .long     Lset1
Lset2 = Ltmp913-Leh_func_begin1
      .long     Lset2
      .byte     3
Lset3 = Ltmp812-Leh_func_begin1
      .long     Lset3
Lset4 = Leh_func_end1-Ltmp812
      .long     Lset4
      .long     0
      .byte     0
      .byte     1
      .byte     0
      .byte     2
      .byte     125
      .long     __ZTIi@GOTPCREL+4
      .long     __ZTIPKc@GOTPCREL+4

you can read this instead:

## Exception Handling Table: Lexception1
##  @LPStart Encoding: omit
##    @TType Encoding: indirect pcrel sdata4
##        @TType Base: 40 bytes
## @CallSite Encoding: udata4
## @Action Table Size: 26 bytes

## Action 1:
##   A throw between Ltmp7 and Ltmp812 jumps to Ltmp913 on an exception.
##     For type(s):  __ZTIi@GOTPCREL+4 __ZTIPKc@GOTPCREL+4
## Action 2:
##   A throw between Ltmp812 and Leh_func_end1 does not have a landing pad.

llvm-svn: 133286

e303114b
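
The first six header bytes of the table above (255, 155, 168, 0 from `.space 1`, 3, 26) can be decoded by hand to reproduce the summary lines. The following is a minimal sketch, not the code from this commit: it assumes the usual LSDA header layout (LPStart encoding, TType encoding, ULEB128 TType base, call-site encoding, ULEB128 table length) and the standard `DW_EH_PE_*` encoding values. The names `readULEB128`, `describeEncoding`, `LsdaHeader`, and `parseLsdaHeader` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode one unsigned LEB128 value starting at bytes[pos]; advances pos.
std::uint64_t readULEB128(const std::vector<std::uint8_t> &bytes,
                          std::size_t &pos) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  while (true) {
    assert(pos < bytes.size() && "truncated ULEB128");
    std::uint8_t byte = bytes[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0)
      break;
    shift += 7;
  }
  return value;
}

// Turn a DW_EH_PE_* pointer-encoding byte into readable text.
std::string describeEncoding(std::uint8_t enc) {
  if (enc == 0xff)
    return "omit";
  std::string out;
  if (enc & 0x80)
    out += "indirect ";
  switch (enc & 0x70) {
  case 0x10: out += "pcrel "; break;
  case 0x20: out += "textrel "; break;
  case 0x30: out += "datarel "; break;
  case 0x40: out += "funcrel "; break;
  case 0x50: out += "aligned "; break;
  default: break; // 0x00 means an absolute value
  }
  switch (enc & 0x0f) {
  case 0x00: out += "absptr"; break;
  case 0x01: out += "uleb128"; break;
  case 0x02: out += "udata2"; break;
  case 0x03: out += "udata4"; break;
  case 0x04: out += "udata8"; break;
  case 0x09: out += "sleb128"; break;
  case 0x0a: out += "sdata2"; break;
  case 0x0b: out += "sdata4"; break;
  case 0x0c: out += "sdata8"; break;
  default: out += "unknown"; break;
  }
  return out;
}

// Hypothetical holder for the decoded header fields.
struct LsdaHeader {
  std::string lpStartEncoding;
  std::string ttypeEncoding;
  std::uint64_t ttypeBase = 0;
  std::string callSiteEncoding;
  std::uint64_t tableSize = 0; // the length field following the call-site encoding
};

// Assumption: LPStart is omitted (0xff), as in the example above.
LsdaHeader parseLsdaHeader(const std::vector<std::uint8_t> &bytes) {
  LsdaHeader h;
  std::size_t pos = 0;
  assert(bytes.size() >= 2 && "header too short");
  std::uint8_t lpStart = bytes[pos++];
  assert(lpStart == 0xff && "this sketch only handles an omitted LPStart");
  h.lpStartEncoding = describeEncoding(lpStart);
  std::uint8_t ttype = bytes[pos++];
  h.ttypeEncoding = describeEncoding(ttype);
  if (ttype != 0xff)
    h.ttypeBase = readULEB128(bytes, pos);
  assert(pos < bytes.size() && "missing call-site encoding");
  h.callSiteEncoding = describeEncoding(bytes[pos++]);
  h.tableSize = readULEB128(bytes, pos);
  return h;
}
```

Feeding `{255, 155, 168, 0, 3, 26}` to `parseLsdaHeader` gives "omit", "indirect pcrel sdata4", a TType base of 40, "udata4", and a length of 26, matching the decoded listing. The length also agrees with the two call-site records shown: each is three 4-byte values plus one action byte, so 2 × 13 = 26 bytes.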

Fix a few places where 32bit instructions/registerset were used on PPC64. · d041962c
Roman Divacky authored Jun 17, 2011
```
llvm-svn: 133260
```
d041962c

PTX: Adjust rounding modes · 3604d9a4

Justin Holewinski authored Jun 17, 2011

* rounding modes for fp add, mul, sub now use .rn
* float -> int rounding correctly uses .rzi not .rni
* 32bit fdiv for sm13 uses div.rn (instead of div.approx)
* 32bit fdiv for sm10 now uses div (instead of div.approx)

Approx is not IEEE 754 compatible (and should be optionally set by a flag to the backend instead). The .rn rounding modifier is the PTX default anyway, but it's better to be explicit.

All of these modifiers should be available through functions such as __fmul_rz, but backend support for this still needs to be added.

Patch by Dan Bailey

llvm-svn: 133253

3604d9a4

Allocate SystemZ callee-saved registers backwards: R13-R6 · 3982029f

Jakob Stoklund Olesen authored Jun 17, 2011

The reserved R14-R15 are always saved in the prolog, and using CSRs
starting from R13 allows them to be saved in one instruction.

Thanks to Anton for explaining this.

llvm-svn: 133233

3982029f
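
One way to read the message above is that a single store-multiple instruction saves a contiguous register range ending at R15, so the cheapest prolog is the one whose range contains no unused registers. The sketch below models only that counting argument. It assumes callee-saved registers R6-R13 and always-saved R14-R15 as stated in the message; `SaveRange` and `contiguousSaveRange` are hypothetical names, not SystemZ backend code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Result of choosing one contiguous save range [first, last].
struct SaveRange {
  unsigned first;       // lowest register number in the range
  unsigned last;        // always 15 in this model
  unsigned unusedSaved; // registers in the range that were not actually used
};

// usedCSRs: distinct callee-saved register numbers (6..13) that the
// function clobbers. R14 and R15 are assumed to be saved unconditionally.
SaveRange contiguousSaveRange(const std::vector<unsigned> &usedCSRs) {
  unsigned first = 14;
  for (unsigned reg : usedCSRs) {
    assert(reg >= 6 && reg <= 13 && "not a callee-saved register in this model");
    first = std::min(first, reg);
  }
  unsigned span = 15 - first + 1;
  unsigned needed = static_cast<unsigned>(usedCSRs.size()) + 2;
  return {first, 15, span - needed};
}
```

With two callee-saved registers in use, allocating backwards ({13, 12}) yields the range R12-R15 with nothing wasted, while allocating forwards ({6, 7}) yields R6-R15 and saves six registers the function never touched.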

Update an insertion point iterator after replacing a return instruction with a · 033026ff
Cameron Zwarich authored Jun 17, 2011
```
tail call pseudoinstruction. This fixes <rdar://problem/9624333>.

llvm-svn: 133227
```
033026ff
Explicitly invoke ArrayRef constructor to keep gcc happy. · 66773c33
Jakob Stoklund Olesen authored Jun 17, 2011
```
Patch by Richard Smith!

llvm-svn: 133220
```
66773c33

Rename TRI::getAllocationOrder() to getRawAllocationOrder(). · 801f7ab3

Jakob Stoklund Olesen authored Jun 16, 2011

Also switch the return type to ArrayRef<unsigned> which works out nicely
for ARM's implementation of this function because of the clever ArrayRef
constructors.

The name change indicates that the returned allocation order may contain
reserved registers as has been the case for a while.

llvm-svn: 133216

801f7ab3
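
Because the raw order may include reserved registers, a caller has to filter them out before use. A minimal sketch of that filtering step follows, using `std::vector` in place of `ArrayRef<unsigned>` so it stands alone. The register numbers and the `filterReserved` helper are hypothetical and are not part of the TargetRegisterInfo interface.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical register numbers, for illustration only.
enum Reg : unsigned { R0 = 1, R1, R2, R3, R9, R12, LR, SP };

// Return the raw allocation order with reserved registers removed,
// keeping the relative order of the remaining registers.
std::vector<unsigned> filterReserved(const std::vector<unsigned> &rawOrder,
                                     const std::vector<unsigned> &reserved) {
  std::vector<unsigned> usable;
  for (unsigned reg : rawOrder)
    if (std::find(reserved.begin(), reserved.end(), reg) == reserved.end())
      usable.push_back(reg);
  return usable;
}
```

For a raw order of `{R0, R1, R9, SP, LR}` with `{R9, SP}` reserved, the usable order is `{R0, R1, LR}`.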

Jun 16, 2011

Change the REG_SEQUENCE SDNode to take an explicit register class ID as its... · 5fc8b77f

Owen Anderson authored Jun 16, 2011

Change the REG_SEQUENCE SDNode to take an explicit register class ID as its first operand. This operand is lowered away by the time we reach MachineInstrs, so the actual register-allocation handling of them doesn't need to change.
This is intended to support using REG_SEQUENCE SDNodes with type MVT::untyped, and is part of the long road to eliminating some of the hacks we currently use to support register pairs and other strange constraints, particularly on ARM NEON.

llvm-svn: 133178

5fc8b77f

Mark ldrexd/strexd w/ volatile memory by default · d66ab9ea
Bruno Cardoso Lopes authored Jun 16, 2011
```
llvm-svn: 133175
```
d66ab9ea
PTX: Finish new calling convention implementation · 7f191b2a
Justin Holewinski authored Jun 16, 2011
```
llvm-svn: 133172
```
7f191b2a
PTX: Rename register classes for readability and combine int and fp registers · 6b356c1f
Justin Holewinski authored Jun 16, 2011
```
llvm-svn: 133171
```
6b356c1f
Add a comment describing why transforming (shl x, 1) to (add x, x) is to be · 8eb36ef4
Dan Gohman authored Jun 16, 2011
```
considered safe enough in this context.

llvm-svn: 133159
```
8eb36ef4
PTX: Fix whitespace errors · 5ccf812b
Justin Holewinski authored Jun 16, 2011
```
llvm-svn: 133158
```
5ccf812b
Add AVX support for fpextend. · bbf2ab99
Bruno Cardoso Lopes authored Jun 16, 2011
```
Original patch by Syoyo Fujita with more comments by me.

llvm-svn: 133153
```
bbf2ab99

Revision r128665 added an optimization to make use of NEON multiplier · 2730162b

Chad Rosier authored Jun 16, 2011

accumulator forwarding.  Specifically (from SVN log entry):

Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier
accumulator forwarding:
vadd d3, d0, d1
vmul d3, d3, d2
=>
vmul d3, d0, d2
vmla d3, d1, d2

Make sure it catches cases where operand 1 is add/fadd/sub/fsub, which was
intended in the original revision.

llvm-svn: 133127

2730162b
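
The rewrite above relies on the distributive identity (A + B) * C = (A * C) + (B * C). As a small illustration, the sketch below evaluates both instruction sequences lane by lane on unsigned 32-bit values, where the identity holds exactly even under wraparound. For the floating-point forms the two sequences can round differently. `Lanes`, `addThenMul`, and `mulThenMla` are hypothetical names that model the instructions and are not NEON intrinsics.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::uint32_t, 2>;

// Models "vadd d3, d0, d1" followed by "vmul d3, d3, d2".
Lanes addThenMul(const Lanes &a, const Lanes &b, const Lanes &c) {
  Lanes r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (a[i] + b[i]) * c[i];
  return r;
}

// Models "vmul d3, d0, d2" followed by "vmla d3, d1, d2".
Lanes mulThenMla(const Lanes &a, const Lanes &b, const Lanes &c) {
  Lanes r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] * c[i];  // vmul
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] += b[i] * c[i]; // vmla accumulates into the same register
  return r;
}
```

Both functions return the same lanes for any unsigned inputs; the second form exposes a multiply followed by a multiply-accumulate, which is the pairing the message says benefits from accumulator forwarding.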

Silence warnings in non assert builds. Patch by David Blaikie · 5444a7b4
Bruno Cardoso Lopes authored Jun 16, 2011
```
llvm-svn: 133118
```
5444a7b4

Use set operations instead of plain lists to enumerate register classes. · 99f35eab

Jakob Stoklund Olesen authored Jun 15, 2011

This simplifies many of the target description files since it is common
for register classes to be related or contain sequences of numbered
registers.

I have verified that this doesn't change the files generated by TableGen
for ARM and X86. It alters the allocation order of MBlaze GPR and Mips
FGR32 registers, but I believe the change is benign.

llvm-svn: 133105

99f35eab

Jun 15, 2011

Add a new function attribute, nonlazybind, which inhibits lazy-loading · 4b7a8d68

John McCall authored Jun 15, 2011

optimizations when emitting calls to the function;  instead those calls may
use faster relocations which require the function to be immediately resolved
upon loading the dynamic object featuring the call.  This is useful when it
is known that the function will be called frequently and pervasively and
therefore there is no merit in delaying binding of the function.

Currently only implemented for x86-64, where it turns into a call through
the global offset table.

Patch by Dan Gohman, who assures me that he's going to add LangRef documentation
for this once it's committed.

llvm-svn: 133080

4b7a8d68

Remove custom allocation orders in SystemZ. · 5977109f

Jakob Stoklund Olesen authored Jun 15, 2011

Note that this actually changes code generation, and someone who
understands this target better should check the changes.

- R12Q is now allocatable. I think it was omitted from the allocation
  order by mistake since it isn't reserved. It is apparently used as a
  GOT pointer sometimes, and it should probably be reserved if that is
  the case.

- The GR64 registers are allocated in a different order now. The
  register allocator will automatically put the CSRs last. There were
  other changes to the order that may have been significant.

The test fix is because r0 and r1 swapped places in the allocation order.

llvm-svn: 133067

5977109f

Another revsh pattern. rdar://9609059 · 678b691a
Evan Cheng authored Jun 15, 2011
```
llvm-svn: 133064
```
678b691a
Make PPC64CompilationCallback compilable on non-darwin platforms. · 6874b26d
Roman Divacky authored Jun 15, 2011
```
Patch by Nathan Whitehorn!

llvm-svn: 133059
```
6874b26d

Replace the statically generated hashtables for checking register... · 86fd3c00

Owen Anderson authored Jun 15, 2011

Replace the statically generated hashtables for checking register relationships with just scanning the (typically tiny) static lists.

At the time I wrote this code (circa 2007), TargetRegisterInfo was using a std::set to perform these queries. Switching to the static hashtables was an obvious improvement, but in reality there's no reason to do anything other than scan.
With this change, total LLC time on a whole-program 403.gcc is reduced by approximately 1.5%, almost all of which comes from a 15% reduction in LiveVariables time. It also reduces the binary size of LLC by 86KB, thanks to eliminating a bunch of very large static tables.

llvm-svn: 133051

86fd3c00
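
The idea of scanning a short static list instead of probing a generated hash table can be sketched as follows. This is an illustration under stated assumptions, not the TargetRegisterInfo code: the register numbers, the zero-terminated tables, and the `subRegList` and `isSubRegister` helpers are all hypothetical.

```cpp
#include <cassert>

// Hypothetical register numbers; NoReg (0) terminates every list.
enum Reg : unsigned { NoReg = 0, AL, AH, AX, EAX, BL, BH, BX, EBX, NumRegs };

// Small static, zero-terminated sub-register lists.
static const unsigned EmptyList[] = {NoReg};
static const unsigned AXSubRegs[] = {AL, AH, NoReg};
static const unsigned EAXSubRegs[] = {AX, AL, AH, NoReg};
static const unsigned BXSubRegs[] = {BL, BH, NoReg};
static const unsigned EBXSubRegs[] = {BX, BL, BH, NoReg};

// Look up the sub-register list for a register.
const unsigned *subRegList(unsigned reg) {
  assert(reg < NumRegs && "unknown register");
  switch (reg) {
  case AX:  return AXSubRegs;
  case EAX: return EAXSubRegs;
  case BX:  return BXSubRegs;
  case EBX: return EBXSubRegs;
  default:  return EmptyList;
  }
}

// True if regB is a sub-register of regA, found by a plain linear scan.
bool isSubRegister(unsigned regA, unsigned regB) {
  for (const unsigned *sr = subRegList(regA); *sr != NoReg; ++sr)
    if (*sr == regB)
      return true;
  return false;
}
```

Each list here has at most three entries, so the scan touches a handful of adjacent words and needs no hash computation or separate table.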

A minor simplification: no functional change. · 4b12a11f
Bob Wilson authored Jun 15, 2011
```
llvm-svn: 133047
```
4b12a11f

PerformBFICombine - (bfi A, (and B, Mask1), Mask2) -> (bfi A, B, Mask2) iff · 6d02d904

Evan Cheng authored Jun 15, 2011

the bits being cleared by the AND are not demanded by the BFI.

The previous BFI dag combine rule was actually incorrect (or used to be
correct until BFI representation changed).

rdar://9609030

llvm-svn: 133034

6d02d904
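
The condition "the bits being cleared by the AND are not demanded by the BFI" can be illustrated with a simplified bit-field-insert model. The sketch assumes a `bfi(a, b, lsb, width)` that copies the low `width` bits of `b` into `a` at bit `lsb`. This is not the operand encoding used by the ARM backend's BFI node, and `lowBits`, `bfi`, and `andIsRedundant` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `width` bits set.
std::uint32_t lowBits(unsigned width) {
  return width >= 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

// Simplified bit-field insert: replace bits [lsb, lsb + width) of `a`
// with the low `width` bits of `b`.
std::uint32_t bfi(std::uint32_t a, std::uint32_t b, unsigned lsb,
                  unsigned width) {
  assert(width >= 1 && lsb + width <= 32 && "field out of range");
  std::uint32_t field = lowBits(width) << lsb;
  return (a & ~field) | ((b << lsb) & field);
}

// The AND with mask1 can be dropped when it clears none of the low
// `width` bits, because those are the only bits of `b` the insert reads.
bool andIsRedundant(std::uint32_t mask1, unsigned width) {
  return (~mask1 & lowBits(width)) == 0;
}
```

For an 8-bit field, masking `b` with `0xFF` first changes nothing, so `bfi(a, b & 0xFF, 8, 8)` equals `bfi(a, b, 8, 8)`. Masking with `0x0F` clears demanded bits and must be kept.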

Add an optimization that looks for a specific pair-wise add pattern and... · e9e6705c

Tanya Lattner authored Jun 14, 2011

Add an optimization that looks for a specific pair-wise add pattern and generates a vpaddl instruction instead of scalarizing the add.
Includes a test case.

llvm-svn: 133027

e9e6705c

Anna's test commit (#2). · cd7f70e8
Anna Zaks authored Jun 14, 2011
```
llvm-svn: 133023
```
cd7f70e8

PR10136: fix PPCTargetLowering::LowerCall_SVR4 so that a necessary CopyToReg... · 164b1d75

Eli Friedman authored Jun 14, 2011

PR10136: fix PPCTargetLowering::LowerCall_SVR4 so that a necessary CopyToReg doesn't appear to be dead.

Roman, since you're writing tests for other PPC-SVR4 vararg-related stuff, would you mind writing a test for this?

llvm-svn: 133018

164b1d75

Anna's test commit. · d7f7fcd3
Anna Zaks authored Jun 14, 2011
```
llvm-svn: 133017
```
d7f7fcd3

Jun 14, 2011
- Also recognize ARM v4t and v5e variants. · 965ed2e7
  Evan Cheng authored Jun 14, 2011
```
llvm-svn: 133002
```
  965ed2e7
- Add one more argument to the prefetch intrinsic to indicate whether it's a data · dc9ff3a4
  Bruno Cardoso Lopes authored Jun 14, 2011
```
or instruction cache access. Update the targets to match it and also teach
autoupgrade.

llvm-svn: 132976
```
  dc9ff3a4
- Fit banner in 80-col and adjust whitespace. No functionality changes. · 34a425b0
  Nick Lewycky authored Jun 14, 2011
```
llvm-svn: 132964
```
  34a425b0