Commits · 111174be7b14e10ef45c5bd5e57b4b55521f6ff8 · Roger Ferrer / llvm-epi-0.8

Aug 17, 2012

Correct MCJIT functionality for MIPS32 architecture. · 111174be

Akira Hatanaka authored Aug 17, 2012

No new tests are added.
All tests in ExecutionEngine/MCJIT that have been failing pass after this patch
is applied (when "make check" is done on a mips board). 

Patch by Petar Jovanovic.

llvm-svn: 162135

111174be

Avoid folding ADD instructions with FI operands. · 7b1a2e8f

Jakob Stoklund Olesen authored Aug 17, 2012

PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

llvm-svn: 162130

7b1a2e8f

Add stub methods for mips assembly matcher. · 7605630c
Akira Hatanaka authored Aug 17, 2012
```
Patch by Vladimir Medic.

llvm-svn: 162124
```
7605630c

Change the `linker_private_weak_def_auto' linkage to `linkonce_odr_auto_hide' to · 34bc34ec

Bill Wendling authored Aug 17, 2012

make it more consistent with its intended semantics.

The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix
among other things.

The intended semantic is more like the `linkonce_odr' linkage type.

Change the name of the linkage type to `linkonce_odr_auto_hide', and change
the semantics accordingly so that it produces the correct output for the linker.

Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is now a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>

llvm-svn: 162114

34bc34ec

Add comment, clean up code. No functional change. · c1dee482
Jakob Stoklund Olesen authored Aug 17, 2012
```
llvm-svn: 162107
```
c1dee482
Implement NEON domain switching for scalar <-> S-register vmovs on ARM · f6618153
Tim Northover authored Aug 17, 2012
```
llvm-svn: 162094
```
f6618153
Use nested switch to select arguments to reduce calls to EmitPCMP. · 31625574
Craig Topper authored Aug 17, 2012
```
llvm-svn: 162089
```
31625574

Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce... · 602e1abe

Craig Topper authored Aug 17, 2012

Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce to only a single call to it, thus allowing it to be inlined by the compiler.

llvm-svn: 162088

602e1abe

Remove unnecessary include of ARMGenInstrInfo.inc. · f6add7e6
Craig Topper authored Aug 17, 2012
```
llvm-svn: 162086
```
f6add7e6

Add ADD and SUB to the predicable ARM instructions. · 0ea1fce6

Jakob Stoklund Olesen authored Aug 16, 2012

It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

llvm-svn: 162061

0ea1fce6

Handle ARM MOVCC optimization in PeepholeOptimizer. · c19bf028
Jakob Stoklund Olesen authored Aug 16, 2012
```
Use the target independent select analysis hooks.

llvm-svn: 162060
```
c19bf028

Aug 16, 2012
- Revert r162034, r162035 and r162037. · 2039a987
  Roman Divacky authored Aug 16, 2012
```
llvm-svn: 162039
```
  2039a987
- Define and handle additional fixup kinds. By Adhemerval Zanella. · 9d38fc8d
  Roman Divacky authored Aug 16, 2012
```
llvm-svn: 162037
```
  9d38fc8d
- Fix typo and grammar. By Adhemerval Zanella. · 1faf5b07
  Roman Divacky authored Aug 16, 2012
```
llvm-svn: 162032
```
  1faf5b07
- [arm-fast-isel] Add support for fastcc. · 26088cb3
  Jush Lu authored Aug 16, 2012
```
Without fastcc support, the caller falls through to CallingConv::C for
fastcc calls while the callee still uses fastcc. Adding fastcc support
removes this calling-convention mismatch.

llvm-svn: 162013
```
  26088cb3
- Patch to enable FMA on bdver2 target. Make XOP feature enable FMA4 as well. · af3e9834
  Anitha Boyapati authored Aug 16, 2012
```
llvm-svn: 162012
```
  af3e9834
- (no commit message) · 426feb61
  Anitha Boyapati authored Aug 16, 2012
```
llvm-svn: 162010
```
  426feb61
- Add Android ABI to Mips backend to handle functions returning vectors of four · 89d50b39
  Akira Hatanaka authored Aug 16, 2012
```
floats.

llvm-svn: 162008
```
  89d50b39
- Fold predicable instructions into MOVCC / t2MOVCC. · 6cb96120
  Jakob Stoklund Olesen authored Aug 15, 2012
```
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
```
  6cb96120
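
The fold described in 6cb96120 can be sketched on a toy SSA instruction list. This is a simplified model, not the LLVM implementation: the tuple layout, the helper name `fold_into_movcc`, and the operand order of the predicated forms (false value first, predicate last) are all assumptions made for illustration. It only looks at the true operand of the select.

```python
from collections import Counter

# Opcodes the commit message says can be predicated, and their CC variants.
FOLDABLE = {"AND": "ANDCC", "ORR": "ORRCC", "EOR": "EORCC"}


def fold_into_movcc(instrs):
    """Fold single-use AND/ORR/EOR definitions into the MOVCC that uses them.

    Each instruction is a (dest, opcode, operands) tuple in SSA form.
    MOVCC operands are (false_value, true_value, pred) in this sketch.
    """
    uses = Counter(op for _, _, ops in instrs for op in ops)
    defs = {dest: (opcode, ops) for dest, opcode, ops in instrs}
    folded = set()
    rewritten = []
    for dest, opcode, ops in instrs:
        if opcode == "MOVCC":
            rfalse, rtrue, pred = ops
            def_op, def_ops = defs.get(rtrue, (None, ()))
            # Only fold when the select is the sole user of the value.
            if def_op in FOLDABLE and uses[rtrue] == 1:
                folded.add(rtrue)
                rewritten.append((dest, FOLDABLE[def_op], (rfalse, *def_ops, pred)))
                continue
        rewritten.append((dest, opcode, ops))
    # Drop the definitions that were absorbed into a predicated instruction.
    return [ins for ins in rewritten if ins[0] not in folded]


prog = [
    ("t", "AND", ("a", "b")),
    ("r", "MOVCC", ("f", "t", "pred")),
]
print(fold_into_movcc(prog))
```

The two-instruction sequence collapses into a single predicated instruction, which is where the saved instruction and the reduced register pressure come from.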
Aug 15, 2012

Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows... · eec6bc62

Evan Cheng authored Aug 15, 2012

Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029

llvm-svn: 161962

eec6bc62

Add missing Rfalse operand to the predicated pseudo-instructions. · 2ec0c41e

Jakob Stoklund Olesen authored Aug 15, 2012

When predicating this instruction:

  Rd = ADD Rn, Rm

We need an extra operand to represent the value given to Rd when the
predicate is false:

  Rd = ADDCC Rfalse, Rn, Rm, pred

The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.

Previously, Rd and Rn were tied, but that is not required.

Compare to MOVCC:

  Rd = MOVCC Rfalse, Rtrue, pred
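
The operand roles above can be modelled in a few lines of Python. This is only an illustrative sketch: the function names `addcc` and `movcc` and the 32-bit wraparound are assumptions made for the example, not LLVM code.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit ARM registers


def addcc(rfalse, rn, rm, pred):
    """Model of `Rd = ADDCC Rfalse, Rn, Rm, pred`.

    When the predicate holds, Rd receives Rn + Rm; otherwise it keeps
    the value supplied through Rfalse.
    """
    return (rn + rm) & MASK32 if pred else rfalse


def movcc(rfalse, rtrue, pred):
    """Model of `Rd = MOVCC Rfalse, Rtrue, pred`."""
    return rtrue if pred else rfalse


# ADDCC behaves like an ADD whose result is fed through a MOVCC.
for pred in (True, False):
    assert addcc(7, 2, 3, pred) == movcc(7, (2 + 3) & MASK32, pred)
```

In this model Rn plays no part in the predicate-false case, which matches the observation that tying Rd to Rn is not required.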

llvm-svn: 161955

2ec0c41e

The names of VFP variants of half-to-float conversion instructions were · c6d945b1

Anton Korobeynikov authored Aug 14, 2012

reversed. This led to wrong codegen for the float-to-half conversion
intrinsics, which are used to support the storage-only fp16 type.
The NEON variants of the same instructions are fine.

llvm-svn: 161907

c6d945b1

This needs braces. Spotted by Bill. · 5f61a749
Eric Christopher authored Aug 14, 2012
```
llvm-svn: 161906
```
5f61a749
minor fix of X86ISD::VSEXT_MOVL dump · 06f6fe87
Michael Liao authored Aug 14, 2012
```
llvm-svn: 161902
```
06f6fe87

Aug 14, 2012

fix PR11334 · 34107b91

Michael Liao authored Aug 14, 2012

- FP_EXTEND only supports extending from vectors with a matching number of
  elements. This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32, which does not match v2f64.
- add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector widths.

llvm-svn: 161894

34107b91

Switch the fixed-length disassembler to be table-driven. · ecaef49f

Jim Grosbach authored Aug 14, 2012

Refactor the TableGen'erated fixed length disassembler to use a
table-driven state machine rather than a massive set of nested
switch() statements.
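
As a rough illustration of the idea, a table-driven decoder replaces nested switch statements with a small interpreter that walks a data table. The sketch below is hypothetical throughout: the table opcodes (`EXTRACT`, `FILTER`, `DECODE`, `FAIL`), the toy 8-bit instruction set, and the `decode` helper are invented for the example and do not reflect the table encoding TableGen actually emits.

```python
# Hypothetical table opcodes for this sketch (not LLVM's actual encoding).
EXTRACT, FILTER, DECODE, FAIL = range(4)

# Toy 8-bit ISA: bits 7..6 select the class; in class 0, bit 0 picks
# between ADD and SUB; class 1 is LDR; everything else is invalid.
TABLE = [
    (EXTRACT, 6, 2),   # 0: field = insn[7:6]
    (FILTER, 0, 6),    # 1: class 0? otherwise continue at entry 6
    (EXTRACT, 0, 1),   # 2: field = insn[0]
    (FILTER, 0, 5),    # 3: bit clear? otherwise continue at entry 5
    (DECODE, "ADD"),   # 4
    (DECODE, "SUB"),   # 5
    (FILTER, 1, 8),    # 6: class 1? otherwise continue at entry 8
    (DECODE, "LDR"),   # 7
    (FAIL,),           # 8
]


def decode(insn, table=TABLE):
    """Walk the table as a state machine until it decodes or fails."""
    pc = 0
    field = 0
    while True:
        entry = table[pc]
        op = entry[0]
        if op == EXTRACT:
            _, start, width = entry
            field = (insn >> start) & ((1 << width) - 1)
            pc += 1
        elif op == FILTER:
            _, value, miss = entry
            pc = pc + 1 if field == value else miss
        elif op == DECODE:
            return entry[1]
        else:  # FAIL
            return None


print(decode(0b00000001), decode(0b01000000), decode(0b11000000))
```

The decoding logic lives in data rather than in generated control flow, so the compiler only has to optimize one small loop regardless of how many instructions the table describes.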

As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:

Time to compile at -O2 (averaged w/ hot caches):
  Previous: 35.5s
  New:       8.9s

TEXT size:
  Previous: 447,251
  New:      297,661

Builds in 25% of the time previously required and generates code 66% of
the size.

The disassembler itself runs only slightly slower (7% when disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.
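
The ratios quoted above follow directly from the raw measurements. A quick check, using only the numbers given in the commit message (the variable names are just for this sketch):

```python
compile_prev, compile_new = 35.5, 8.9      # seconds to compile at -O2
text_prev, text_new = 447_251, 297_661     # TEXT size
run_prev, run_new = 19.6, 21.0             # seconds for 10 million instructions

compile_ratio = compile_new / compile_prev  # fraction of the old build time
size_ratio = text_new / text_prev           # fraction of the old TEXT size
slowdown = run_new / run_prev - 1           # relative increase in run time

print(f"compile time: {compile_ratio:.1%} of previous")
print(f"TEXT size:    {size_ratio:.1%} of previous")
print(f"run time:     {slowdown:.1%} slower")
```

This gives roughly a quarter of the build time, about two thirds of the size, and a slowdown of about 7%, in line with the figures in the message.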

llvm-svn: 161888

ecaef49f

Factor duplicate calls to getUNDEF in several functions. · 925a281b
Craig Topper authored Aug 14, 2012
```
llvm-svn: 161860
```
925a281b

Re-factor intrinsic lowering to combine common parts of similar intrinsics.... · d0d4b11f

Craig Topper authored Aug 14, 2012

Re-factor intrinsic lowering to combine common parts of similar intrinsics. Reduces compiled code size a little bit.

llvm-svn: 161859

d0d4b11f

Aug 13, 2012

Remove the TII::scheduleTwoAddrSource() hook. · 702bcc3b

Jakob Stoklund Olesen authored Aug 13, 2012

It never does anything when running 'make check', and it gets in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794

702bcc3b

ARM: enable struct byval for AAPCS-VFP. · d6c8270e
Manman Ren authored Aug 13, 2012
```
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 161789
```
d6c8270e

[Hexagon] Don't mark callee saved registers as clobbered by a tail call · 0bb7f23c

Arnold Schwaighofer authored Aug 13, 2012

This was causing unnecessary spills/restores of callee saved registers.

Fixes PR13572.

Patch by Pranav Bhandarkar!

llvm-svn: 161778

0bb7f23c

Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND... · 3a94c545

Nadav Rotem authored Aug 13, 2012

Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user. 

rdar://11876519

llvm-svn: 161775

3a94c545

X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from · 959acb10

Manman Ren authored Aug 13, 2012

OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.

PR13576

llvm-svn: 161769

959acb10

Add support for the %H output modifier. · 7d8b53c1
Eric Christopher authored Aug 13, 2012
```
Patch by Weiming Zhao.

llvm-svn: 161768
```
7d8b53c1
X86: when auto-detecting the subtarget features, make sure to use IsIntel to detect · e90e94f1
Manman Ren authored Aug 13, 2012
```
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.

llvm-svn: 161763
```
e90e94f1
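
The point of e90e94f1 is that family 6 alone does not identify an Intel part, so the vendor has to be checked first. A minimal sketch of that check follows. The helper `guess_cpu`, the returned names, and the three-entry model table are assumptions for illustration and are not taken from LLVM's detection code.

```python
# Illustrative family-6 model numbers; this table is an assumption for the
# sketch and is deliberately incomplete.
INTEL_FAMILY6_MODELS = {
    26: "nehalem",
    37: "westmere",
    42: "sandybridge",
}


def guess_cpu(vendor, family, model):
    """Return a microarchitecture name, or "generic" when not recognised."""
    is_intel = vendor == "GenuineIntel"
    # Family 6 is shared with AMD, so the vendor check must come first.
    if is_intel and family == 6:
        return INTEL_FAMILY6_MODELS.get(model, "generic")
    return "generic"


print(guess_cpu("GenuineIntel", 6, 42))
print(guess_cpu("AuthenticAMD", 6, 42))
```

Without the vendor test, the second call would be misclassified as an Intel part purely on its family and model numbers.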

Use correct loads for vector types during extending-load operations. · 5aaa7fde

Tim Northover authored Aug 13, 2012

Previously, we used VLD1.32 in all cases. However, both 16-bit and 64-bit
accesses are also being selected, so we need to use a load of the
appropriate width in those cases.

llvm-svn: 161748

5aaa7fde

Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and... · 4e5eb727

Craig Topper authored Aug 13, 2012

Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and putting a couple of if conditions in a better order.

llvm-svn: 161746

4e5eb727

Refactor code a bit to share commonalities. No functional change intended. · 5145a0d9
Craig Topper authored Aug 13, 2012
```
llvm-svn: 161745
```
5145a0d9
Fix an unused variable warning from r161742. · ff6e4d19
Craig Topper authored Aug 13, 2012
```
llvm-svn: 161743
```
ff6e4d19

Remove the LowerMMXCONCAT_VECTORS function. It could never execute because... · a7aaa62d

Craig Topper authored Aug 13, 2012

Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result.

llvm-svn: 161742

a7aaa62d