Commits · e629d008fbd6f8bd75d8b22240d5a53c9342f980 · Roger Ferrer / llvm-epi-0.8

Aug 10, 2011

Cleanup. Make ScalarEvolution an explicit argument of the · e629d008
Andrew Trick authored Aug 10, 2011
```
SimplifyIndVar utility since it is required.

llvm-svn: 137202
```
e629d008
SimplifyIndVar: make foldIVUser iterative to fold a chain of operands. · 74664d5e
Andrew Trick authored Aug 10, 2011
```
llvm-svn: 137199
```
74664d5e
Update CMake build. · 0b0e47d6
Benjamin Kramer authored Aug 10, 2011
```
llvm-svn: 137198
```
0b0e47d6

Added a SimplifyIndVar utility to simplify induction variable users · 3ec331ea

Andrew Trick authored Aug 10, 2011

based on ScalarEvolution without changing the induction variable phis.

This utility is the main tool of IndVarSimplifyPass, but the pass also
restructures induction variables in strange ways that are sensitive to
pass ordering. This provides a way for other loop passes to simplify
new uses of induction variables created during transformation. The
utility may be used by any pass that preserves ScalarEvolution. Soon
LoopUnroll will use it.

The net effect in this checkin is to clean up the IndVarSimplify pass
by factoring out the SimplifyIndVar algorithm into a standalone utility.

llvm-svn: 137197

3ec331ea

Cleanup. Added LoopBlocksDFS::perform for simple clients. · 78b40c3f
Andrew Trick authored Aug 10, 2011
```
llvm-svn: 137195
```
78b40c3f
Fix a bug in vpermilps mask checking. Fix PR10560 · 278ffd7d
Bruno Cardoso Lopes authored Aug 10, 2011
```
llvm-svn: 137194
```
278ffd7d

Fix the LoopUnroller to handle nontrivial loops and partial unrolling. · b72bbe2a

Andrew Trick authored Aug 10, 2011

These are not individual bug fixes. I had to rewrite a good chunk of
the unroller to make it sane. I think it was getting lucky on trivial
completely unrolled loops with no early exits. I included some fairly
simple unit tests for partial unrolling. I didn't do much stress
testing, so it may not be perfect, but should be usable now.

llvm-svn: 137190

b72bbe2a

Push GPRnopc through a large number of instruction definitions to tighten operand decoding. · 8059f0cf
Owen Anderson authored Aug 10, 2011
```
llvm-svn: 137189
```
8059f0cf
Trim an unneeded header. · b91e4899
Jakob Stoklund Olesen authored Aug 09, 2011
```
llvm-svn: 137184
```
b91e4899

Promote VMOVS to VMOVD when possible. · 6a14dc01

Jakob Stoklund Olesen authored Aug 09, 2011

On Cortex-A8, we use the NEON v2f32 instructions for f32 arithmetic. For
better latency, we also send D-register copies down the NEON pipeline by
translating them to vorr instructions.

This patch promotes even S-register copies to D-register copies when
possible so they can also go down the NEON pipeline.  Example:

        vldr.32 s0, LCPI0_0
    loop:
        vorr    d1, d0, d0
    loop2:
        ...
        vadd.f32        d1, d1, d16

The vorr instruction looked like this after regalloc:

    %S2<def> = COPY %S0, %D1<imp-def>

Copies involving odd S-registers and copies that don't define the full
D-register are left alone.

llvm-svn: 137182

6a14dc01
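
The promotion rule described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: `DCopy` and `promoteSCopy` are hypothetical names, registers are modelled as plain indices, and the sketch relies on the VFP layout in which D-register `n` overlaps S-registers `2n` and `2n+1`.

```cpp
// Hypothetical result type: a D-register copy "vorr dDst, dSrc, dSrc".
struct DCopy {
  unsigned Dst;
  unsigned Src;
};

// Hypothetical helper: decide whether an S-register copy can be promoted.
// DstS/SrcS are S-register indices; DefinesFullD models the <imp-def> of
// the enclosing D-register on the COPY instruction.
bool promoteSCopy(unsigned DstS, unsigned SrcS, bool DefinesFullD,
                  DCopy &Out) {
  // Odd S-registers are the upper half of a D-register: leave them alone.
  if (DstS % 2 != 0 || SrcS % 2 != 0)
    return false;
  // The copy must define the whole destination D-register.
  if (!DefinesFullD)
    return false;
  Out.Dst = DstS / 2; // s2 -> d1
  Out.Src = SrcS / 2; // s0 -> d0
  return true;
}
```

With the copy from the message, `%S2<def> = COPY %S0, %D1<imp-def>`, the sketch maps `s2`/`s0` to `d1`/`d0`, which matches the `vorr d1, d0, d0` shown in the example.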

Tighten operand checking of register-shifted-register operands. · 92b942b1
Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137180
```
92b942b1
Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 · 72323966
Bruno Cardoso Lopes authored Aug 09, 2011
```
llvm-svn: 137179
```
72323966
Tighten operand checking on memory barrier instructions. · e008931b
Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137176
```
e008931b

VMCore/BasicBlock.cpp: Don't assume BasicBlock::iterator might end with a... · 4f041651

NAKAMURA Takumi authored Aug 09, 2011

VMCore/BasicBlock.cpp: Don't assume that iterating a successor's BasicBlock::iterator will end at a non-PHINode Instruction.

Frontends (e.g. clang) might pass an incomplete form of IR, which lets the scan step past the end of the iterator. In the case I hit, this caused an infinite loop on a bogus PHINode.

Thanks to Jay Foad and John McCall.

llvm-svn: 137175

4f041651
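
The kind of guard this message describes can be sketched without LLVM at all. The snippet below is a simplified model, not the actual change to BasicBlock.cpp: `Kind` and `countLeadingPhis` are hypothetical, and a block is modelled as a plain vector of instruction kinds. The point is that the scan checks for the end of the block before looking at the instruction, so a block that holds only PHI nodes (or nothing at all) cannot run the iterator off the end.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction's kind.
enum class Kind { Phi, Other };

// Count the PHI nodes at the start of a block. The `It != End` test comes
// first, so malformed blocks without a trailing non-PHI instruction are safe.
std::size_t countLeadingPhis(const std::vector<Kind> &Block) {
  std::size_t Count = 0;
  for (auto It = Block.begin(), End = Block.end();
       It != End && *It == Kind::Phi; ++It)
    ++Count;
  return Count;
}
```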

Fix whitespace. · 5b64b810
NAKAMURA Takumi authored Aug 09, 2011
```
llvm-svn: 137174
```
5b64b810
Tighten operand checking on CPS instructions. · 3d2e0e9d
Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137172
```
3d2e0e9d
Representation of 'atomic load' and 'atomic store' in IR. · 59b66883
Eli Friedman authored Aug 09, 2011
```
llvm-svn: 137170
```
59b66883
Create a new register class for the set of all GPRs except the PC. Use it to... · 042619f9
Owen Anderson authored Aug 09, 2011
```
Create a new register class for the set of all GPRs except the PC.  Use it to tighten our decoding of BFI.

llvm-svn: 137168
```
042619f9
Add v16i16 and v32i8 store patterns · fc481959
Bruno Cardoso Lopes authored Aug 09, 2011
```
llvm-svn: 137166
```
fc481959
Fix 80-column violations. · a15e3aaa
Chad Rosier authored Aug 09, 2011
```
llvm-svn: 137163
```
a15e3aaa
Use fp unpack instructions to unpack int types. Until we have AVX2, this · 6963062a
Bruno Cardoso Lopes authored Aug 09, 2011
```
is the best we can do for these patterns. This fixes PR10554.

llvm-svn: 137161
```
6963062a
Fix a couple ridiculous copy-paste errors. rdar://9914773. · 4ef2426b
Eli Friedman authored Aug 09, 2011
```
llvm-svn: 137160
```
4ef2426b
Add a C interface to PassManagerBuilder. It is missing the addExtension · 07f60915
Rafael Espindola authored Aug 09, 2011
```
functionality since in the C api a pass is created and added to a pass
manager in a single call.

llvm-svn: 137159
```
07f60915

Don't truncate MachO addresses. · a3171603

Jim Grosbach authored Aug 09, 2011

Assigned symbol addresses get truncated to 32 bits, even on 64-bit platforms.
That's obviously bogus.
For example,

 .globl _foo
 .equ _foo, 0x987654321ULL


rdar://9922863

llvm-svn: 137158

a3171603
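
To make the truncation concrete, here is a small sketch of what happens to the value from the `.equ` example when it passes through a 32-bit field. The helper names are hypothetical and do not come from the MachO writer. They only model the narrowing conversion.

```cpp
#include <cstdint>

// Hypothetical helper: models storing a symbol address in a 32-bit field.
// The conversion silently drops the upper 32 bits.
std::uint32_t truncateAddress(std::uint64_t Addr) {
  return static_cast<std::uint32_t>(Addr);
}

// Hypothetical helper: models keeping the address at full 64-bit width.
std::uint64_t keepAddress(std::uint64_t Addr) { return Addr; }
```

For `0x987654321`, the truncating path yields `0x87654321`, losing the leading `9`, while the 64-bit path preserves the value.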

ARM Disassembler: sign extend branch immediates. · 406dc175
Benjamin Kramer authored Aug 09, 2011
```
Not sure about BLXi, but this is what the old disassembler did.

llvm-svn: 137156
```
406dc175
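
As an illustration of sign-extending a branch immediate, the sketch below decodes the offset of an ARM-mode `B`/`BL` instruction. It is not the disassembler's code: `signExtend` and `decodeBranchOffset` are hypothetical helpers, and the sketch assumes the A32 encoding in which the low 24 bits hold a word offset that is shifted left by two and sign-extended from 26 bits. It says nothing about BLXi, which the message itself leaves open.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `Bits` bits of Value to a signed 32-bit integer.
std::int32_t signExtend(std::uint32_t Value, unsigned Bits) {
  assert(Bits > 0 && Bits < 32 && "sketch only handles 1..31 bits");
  const std::uint32_t SignBit = 1u << (Bits - 1);
  const std::uint32_t Mask = (1u << Bits) - 1u;
  Value &= Mask;
  // Flip the sign bit, then subtract it: fills the upper bits when it was set.
  return static_cast<std::int32_t>((Value ^ SignBit) - SignBit);
}

// Assumed A32 B/BL layout: imm24 in bits [23:0], byte offset = imm24 << 2.
std::int32_t decodeBranchOffset(std::uint32_t Insn) {
  const std::uint32_t Imm24 = Insn & 0x00FFFFFFu;
  return signExtend(Imm24 << 2, 26);
}
```

Without the sign extension, a backward branch such as `0xEAFFFFFE` would decode as a large positive offset instead of `-8`.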

Aug 09, 2011
- Silence a false-positive warning. · d151b099
  Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137154
```
  d151b099
- Don't generate the old-style disassembler in CMake builds either. · d770f6c1
  Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137153
```
  d770f6c1
- The new ARM disassembler disassembles "bx lr" as a special BX_ret instruction... · de2c3813
  Benjamin Kramer authored Aug 09, 2011
```
The new ARM disassembler disassembles "bx lr" as a special BX_ret instruction so target specific analysis isn't needed anymore.

llvm-svn: 137151
```
  de2c3813
- Don't continue generating the old-style decoder file. · 982aa050
  Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137150
```
  982aa050
- ARM fix typo in pre-indexed store lowering. · 5e80abbb
  Jim Grosbach authored Aug 09, 2011
```
rdar://9915869

llvm-svn: 137148
```
  5e80abbb
- Attempt to fix CMake build. · c7afd843
  Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137147
```
  c7afd843
- Tighten Thumb1 branch predicate decoding. · 7a2401db
  Owen Anderson authored Aug 09, 2011
```
llvm-svn: 137146
```
  7a2401db
- Replace the existing ARM disassembler with a new one based on the FixedLenDecoderEmitter. · e0152a73
  Owen Anderson authored Aug 09, 2011
```
This new disassembler can correctly decode all the testcases that the old one did, though
some "expected failure" testcases are XFAIL'd for now because it is not (yet) as strict in
operand checking as the old one was.

llvm-svn: 137144
```
  e0152a73
- Put Darwin-specific code inside an __APPLE__ ifdef. · f60d6df8
  Bob Wilson authored Aug 09, 2011
```
llvm-svn: 137137
```
  f60d6df8
- Revert r137134. It breaks some code as Eli pointed out. · d7f41b7f
  Bill Wendling authored Aug 09, 2011
```
llvm-svn: 137135
```
  d7f41b7f
- Print out the variable declaration only if it is a declaration. Otherwise, a · 84ec8f65
  Bill Wendling authored Aug 09, 2011
```
'static' variable will be emitted twice.
PR10081

llvm-svn: 137134
```
  84ec8f65
- Inflate register classes after coalescing. · 53910d6a
  Jakob Stoklund Olesen authored Aug 09, 2011
```
Coalescing can remove copy-like instructions with sub-register operands
that constrained the register class.  Examples are:

  x86: GR32_ABCD:sub_8bit_hi -> GR32
  arm: DPR_VFP2:ssub0 -> DPR

Recompute the register class of any virtual registers that are used by
fewer instructions after coalescing.

This affects code generation for the Cortex-A8 where we use NEON
instructions for f32 operations, cf. fp_convert.ll:

  vadd.f32  d16, d1, d0
  vcvt.s32.f32  d0, d16

The register allocator is now free to use d16 for the temporary, and
that comes first in the allocation order because it doesn't interfere
with any s-registers.

llvm-svn: 137133
```
  53910d6a
- Reapply a more appropriate solution than in r137114. AVX supports · bed48dc8
  Bruno Cardoso Lopes authored Aug 09, 2011
```
v4f64 = sitofp v4i32. This fixes PR10559.
Also add support for v4i32 = fptosi v4f64.

llvm-svn: 137128
```
  bed48dc8
- Revert r137114 · 24dd1d4a
  Bruno Cardoso Lopes authored Aug 09, 2011
```
llvm-svn: 137127
```
  24dd1d4a
- PTX: Add initial support for device function calls · db05c2b9
  Justin Holewinski authored Aug 09, 2011
```
- Calls are supported on SM 2.0+ for functions with no return values

llvm-svn: 137125
```
  db05c2b9