Commits · fc33e1d99b40936321cbfb9a8c8e4841da9351aa · Roger Ferrer / llvm-epi-0.8

May 17, 2013

X86: Make shuffle -> shift conversion more aggressive about undefs. · fc33e1d9

Benjamin Kramer authored May 17, 2013

Shuffles that only move an element into position 0 of the vector are common in
the output of the loop vectorizer and often generate suboptimal code when SSSE3
is not available. Lower them to vector shifts if possible.

We still prefer palignr over psrldq because it has higher throughput on
Sandy Bridge.

llvm-svn: 182102
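
The matching idea can be sketched in a few lines. This is only an illustrative model, not the code from the X86 backend: the names `UNDEF`, `match_lane_shift`, and `shift_toward_lane0` are hypothetical, lanes are modelled as list elements, and a whole-vector shift toward lane 0 stands in for psrldq. A mask qualifies when every defined lane `i` reads source lane `i + k`; undef lanes are free to take whatever the shift produces.

```python
UNDEF = -1  # hypothetical marker for a lane the shuffle does not care about


def match_lane_shift(mask):
    """Return k if the mask can be implemented by shifting the source
    vector k lanes toward lane 0, otherwise None."""
    n = len(mask)
    if all(m == UNDEF for m in mask):
        return None  # nothing is defined, so there is nothing to match
    for k in range(1, n):
        # Every defined lane i must read source lane i + k, and that lane must exist.
        if all(m == UNDEF or (i + k < n and m == i + k)
               for i, m in enumerate(mask)):
            return k
    return None


def shift_toward_lane0(vec, k):
    """Model of a whole-vector shift: drop k low lanes, fill the top with zeros."""
    return vec[k:] + [0] * k


mask = [2, UNDEF, UNDEF, UNDEF]   # only moves element 2 into position 0
k = match_lane_shift(mask)
print(k, shift_toward_lane0([10, 11, 12, 13], k))
```

A mask such as `[2, 0, UNDEF, UNDEF]` is rejected by this sketch because lane 1 would need source lane 3, not 0.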

fc33e1d9

FileCheckize test. · 7ccd1b86
Benjamin Kramer authored May 17, 2013
```
llvm-svn: 182101
```
7ccd1b86
LoopVectorize: Simplify code. No functionality change. · d84a6339
Benjamin Kramer authored May 17, 2013
```
llvm-svn: 182100
```
d84a6339
r182085 introduced a change that triggered an assertion on ARM. This is an immediate fix · 3285dc13
David Tweed authored May 17, 2013
```
which doesn't resolve the deeper problem.

llvm-svn: 182098
```
3285dc13

[PowerPC] Fix hi/lo encoding in old-style code emitter · 2dbe06a9

Ulrich Weigand authored May 17, 2013

This patch implements the equivalent change to r182091/r182092
in the old-style code emitter.  Instead of having two separate
16-bit immediate encoding routines depending on the instruction,
this patch introduces a single encoder that checks the machine
operand flags to decide whether the low or high half of a
symbol address is required.

Since now both encoders make no further distinction between
"symbolLo" and "symbolHi", the .td operand can now use a
single getS16ImmEncoding method.

Tested by running the old-style JIT tests on 32-bit Linux.

llvm-svn: 182097

2dbe06a9

[PowerPC] Merge/rename PPC fixup types · 6e23ac60

Ulrich Weigand authored May 17, 2013

Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly
the same everywhere, it no longer makes sense to have two fixup types.

This patch merges them both into a single type fixup_ppc_half16,
and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency.
(The half16 and half16ds names are taken from the description of
relocation types in the PowerPC ABI.)

No change in code generation expected.

llvm-svn: 182092

6e23ac60

[PowerPC] Fix processing of ha16/lo16 fixups · 994f49ed

Ulrich Weigand authored May 17, 2013

The current PowerPC MC back end distinguishes between fixup_ppc_ha16
and fixup_ppc_lo16, which are determined by the instruction the fixup
applies to, and uses this distinction to decide whether a fixup ought
to resolve to the high or the low part of a symbol address.

This isn't quite correct, however.  It is valid, if unusual, assembler
to use, e.g.
  li 1, symbol@ha
or
  lis 1, symbol@l
Whether the high or the low part of the address is used depends solely
on the @ suffix, not on the instruction.

In addition, both
  li 1, symbol
and
  lis 1, symbol
are valid, assuming the symbol address fits into 16 bits; again, both
will then refer to the actual symbol value (so li will load the value
itself, while lis will load the value shifted by 16).


To fix this, two places need to be adapted.  If the fixup cannot be
resolved at assembler time, a relocation needs to be emitted via
PPCELFObjectWriter::getRelocType.  This routine already looks at
the VK_ type to determine the relocation.  The only problem is that it
will reject any _LO modifier in a ha16 fixup and vice versa.  This
is simply incorrect; any of those modifiers ought to be accepted
for either fixup type.

If the fixup *can* be resolved at assembler time, adjustFixupValue
currently selects the high bits of the symbol value if the fixup
type is ha16.  Again, this is incorrect; see the above example
  lis 1, symbol

Now, in theory we'd have to respect a VK_ modifier here.  However,
in fact common code never even attempts to resolve symbol references
using any nontrivial VK_ modifier at assembler time; it will always
fall back to emitting a reloc and letting the linker handle it.

If this ever changes, presumably there'd have to be a target callback
to resolve VK_ modifiers.  We'd then have to handle @ha etc. there.

llvm-svn: 182091
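
To make the point concrete, here is a small model of the @l and @ha modifiers and of the li/lis/addi immediates. It is a sketch under stated assumptions, not code from the MC back end: all function names are hypothetical, registers are modelled as 32-bit values, and @ha uses the usual "high adjusted" definition that compensates for the sign extension of the low half.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit register model


def lo16(addr):          # symbol@l
    return addr & 0xFFFF


def ha16(addr):          # symbol@ha, adjusted for a sign-extended low half
    return ((addr + 0x8000) >> 16) & 0xFFFF


def sext16(imm):
    return imm - 0x10000 if imm & 0x8000 else imm


def li(imm16):           # li rD, imm: load the sign-extended immediate
    return sext16(imm16) & MASK32


def lis(imm16):          # lis rD, imm: load the immediate shifted left by 16
    return (imm16 << 16) & MASK32


def addi(reg, imm16):    # addi rD, rA, imm: add the sign-extended immediate
    return (reg + sext16(imm16)) & MASK32


addr = 0x1234ABCD
# The usual pairing: lis with @ha, then addi with @l, rebuilds the address.
print(hex(addi(lis(ha16(addr)), lo16(addr))))
# The modifier, not the instruction, selects the half: li 1, symbol@ha
print(hex(li(ha16(addr))))
# With no modifier and a small symbol value, lis loads the value shifted by 16.
print(hex(lis(0x1234)))
```

In this model nothing about `li` or `lis` decides which half of the address is used; that is determined entirely by which helper computed the immediate.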

994f49ed

Fix a typo (ouput => output) · b4dc9f01
Sylvestre Ledru authored May 17, 2013
```
llvm-svn: 182090
```
b4dc9f01
Don't cast away constness. · 2057a2b8
Benjamin Kramer authored May 17, 2013
```
llvm-svn: 182086
```
2057a2b8

Minor changes to the MCJITTest unittests to use the correct API for finalizing · 2e7efedd

David Tweed authored May 17, 2013

the JIT object (including XFAIL an ARM test that now needs fixing). Also renames
internal function for consistency.

llvm-svn: 182085

2e7efedd

R600/SI: return undef instead of null for skipped arguments · b7be72df

Christian König authored May 17, 2013

This is a candidate for the stable branch.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=64694



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182084

b7be72df

[Sparc] Prevent instructions that define or use %o7 from being in a call's delay slot. · 54bf611c
Venkatraman Govindaraju authored May 16, 2013
```
llvm-svn: 182063
```
54bf611c
Generate debug info for by-value struct args even if they are not used. · 9c93059a
Adrian Prantl authored May 16, 2013
```
radar://problem/13865940

llvm-svn: 182062
```
9c93059a

May 16, 2013

llvm-objdump: Initialize MCDisassembler once instead of for each section. · 0835ca12
Ahmed Bougacha authored May 16, 2013
```
llvm-svn: 182054
```
0835ca12

[mips] Improve instruction selection for pattern (store (fp_to_sint $src), $ptr). · 252f54f7

Akira Hatanaka authored May 16, 2013

Previously, three instructions were needed:

trunc.w.s $f0, $f2
mfc1 $4, $f0
sw $4, 0($2)

Now we need only two:

trunc.w.s $f0, $f2
swc1 $f0, 0($2)

llvm-svn: 182053

252f54f7

Remove addFrameMove. · b08d2c2d

Rafael Espindola authored May 16, 2013

Now that we have good testing, remove addFrameMove and create cfi
instructions directly.

llvm-svn: 182052

b08d2c2d

More test coverage for addFrameMove. · da5d1000
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182051
```
da5d1000
[mips] Factor out unaligned store lowering code. · d82ee940
Akira Hatanaka authored May 16, 2013
```
llvm-svn: 182050
```
d82ee940
Fix cpu on test CodeGen/PowerPC/ctrloop-fp64.ll · 778c73c5
Hal Finkel authored May 16, 2013
```
We need ppc instead of generic to override native features on ppc machines.

llvm-svn: 182049
```
778c73c5

Mips assembler: Add TwoOperandConstraint definitions · 03f0fd37

Jack Carter authored May 16, 2013

This patch removes alias definition for addiu $rs,$imm 
and instead uses the TwoOperandAliasConstraint field in 
the ArithLogicI instruction class. 

This way all instructions that inherit ArithLogicI class 
have the same macro defined. 

The usage examples are added to test files.

Patch by Vladimir Medic

llvm-svn: 182048

03f0fd37

Mips td file formatting: white space and long lines · 59817110
Jack Carter authored May 16, 2013
```
llvm-svn: 182047
```
59817110
More addFrameMove test coverage. · aed131d6
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182046
```
aed131d6

Create a new preheader in PPCCTRLoops to avoid counter register clobbers · 5f587c59

Hal Finkel authored May 16, 2013

Some IR-level instructions (such as FP <-> i64 conversions) are not chained
w.r.t. the mtctr intrinsic and yet may become function calls that clobber the
counter register. At the selection-DAG level, these might be reordered with the
mtctr intrinsic causing miscompiles. To avoid this situation, if an existing
preheader has instructions that might use the counter register, create a new
preheader for the mtctr intrinsic. This extra block will be remerged with the
old preheader at the MI level, but will prevent unwanted reordering at the
selection-DAG level.

llvm-svn: 182045

5f587c59

[mips] Test case for r182042. Add comment. · fce4dd79
Akira Hatanaka authored May 16, 2013
```
llvm-svn: 182044
```
fce4dd79

[mips] Fix instruction selection pattern for sint_to_fp node to avoid emitting an · 39d40f7b

Akira Hatanaka authored May 16, 2013

invalid instruction sequence.

Rather than emitting an int-to-FP move instruction and an int-to-FP conversion
instruction during instruction selection, we emit a pseudo instruction which gets
expanded post-RA. Without this change, register allocation can possibly insert a
floating point register move instruction between the two instructions, which is not
valid according to the ISA manual.

mtc1 $f4, $4         # int-to-fp move instruction.
mov.s $f2, $f4       # move contents of $f4 to $f2.
cvt.s.w $f0, $f2     # int-to-fp conversion.

llvm-svn: 182042

39d40f7b

More test coverage for addFrameMove. · 81250934
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182041
```
81250934

Mips assembler: Add branch macro definitions · 51785c47

Jack Carter authored May 16, 2013

This patch adds bnez and beqz instructions which represent alias definitions for bne and beq instructions as follows:
bnez $rs,$imm => bne $rs,$zero,$imm
beqz $rs,$imm => beq $rs,$zero,$imm

The corresponding test cases are added.

Patch by Vladimir Medic

llvm-svn: 182040
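
The two expansions listed above can be mimicked with a tiny text rewriter. This is only a sketch of the mapping, not how the Mips assembler implements aliases (those are declared in the .td files); `ALIASES` and `expand_branch_alias` are hypothetical names, and the input is assumed to be a single instruction with comma-separated operands.

```python
# Alias mnemonic -> real mnemonic; the real instruction gains a $zero operand.
ALIASES = {"bnez": "bne", "beqz": "beq"}


def expand_branch_alias(line):
    """Rewrite 'bnez $rs,$imm' / 'beqz $rs,$imm' into the bne/beq form;
    any other instruction is returned unchanged (whitespace-trimmed)."""
    text = line.strip()
    mnemonic, _, operands = text.partition(" ")
    if mnemonic not in ALIASES:
        return text
    rs, imm = [op.strip() for op in operands.split(",")]
    return f"{ALIASES[mnemonic]} {rs},$zero,{imm}"


print(expand_branch_alias("bnez $6,1332"))
print(expand_branch_alias("beqz $6,1332"))
```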

51785c47

DAGCombine: Also shrink eq compares where the constant is exactly as large as the smaller type. · fc88c376
Benjamin Kramer authored May 16, 2013
```
if ((x & 255) == 255)

before: movzbl  %al, %eax
        cmpl  $255, %eax

after:  cmpb  $-1, %al
llvm-svn: 182038
```
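
The equivalence behind this transformation can be checked exhaustively on a small model. The sketch below is not the DAGCombine code; `as_i8`, `wide_compare`, `narrow_compare`, and `check_equivalence` are hypothetical names. It compares the masked wide test with an 8-bit compare against -1 (the i8 reading of 255) for every value of an assumed 16-bit input.

```python
def as_i8(x):
    """Reinterpret the low 8 bits of x as a signed 8-bit integer."""
    b = x & 0xFF
    return b - 256 if b >= 128 else b


def wide_compare(x):
    # Before: zero-extend the low byte, then compare against 255.
    return (x & 255) == 255


def narrow_compare(x):
    # After: compare the low byte directly against -1 (cmpb $-1).
    return as_i8(x) == -1


def check_equivalence(bits=16):
    """True if both forms agree for every value representable in `bits` bits."""
    return all(wide_compare(x) == narrow_compare(x) for x in range(1 << bits))


print(check_equivalence())
```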
fc88c376
[mips] Fix indentation. · 21bab5ba
Akira Hatanaka authored May 16, 2013
```
llvm-svn: 182036
```
21bab5ba
[mips] Delete unused enum value. · 7b6e4f13
Akira Hatanaka authored May 16, 2013
```
llvm-svn: 182035
```
7b6e4f13

Add TargetRegisterInfo::getCoveringLanes(). · 9ae96c7a

Jakob Stoklund Olesen authored May 16, 2013

This lane mask provides information about which register lanes
completely cover super-registers. See the block comment before
getCoveringLanes().

llvm-svn: 182034

9ae96c7a

[PowerPC] Use true offset value in "memrix" machine operands · 9d980cbd

Ulrich Weigand authored May 16, 2013

This is the second part of the change to always return "true"
offset values from getPreIndexedAddressParts, tackling the
case of "memrix" type operands.

This is about instructions like LD/STD that only have a 14-bit
field to encode immediate offsets, which are implicitly extended
by two zero bits by the machine, so that in effect we can access
16-bit offsets as long as they are a multiple of 4.

The PowerPC back end currently handles such instructions by
carrying the 14-bit value (as it will get encoded into the
actual machine instructions) in the machine operand fields
for such instructions.  This means that those values are
in fact not the true offset, but rather the offset divided
by 4 (and then truncated to an unsigned 14-bit value).

Like in the case fixed in r182012, this makes common code
operations on such offset values not work as expected.
Furthermore, there doesn't really appear to be any strong
reason why we should encode machine operands this way.

This patch therefore changes the encoding of "memrix" type
machine operands to simply contain the "true" offset value
as a signed immediate value, while enforcing the rules that
it must fit in a 16-bit signed value and must also be a
multiple of 4.

This change must be made simultaneously in all places that
access machine operands of this type.  However, just about
all those changes make the code simpler; in many cases we
can now just share the same code for memri and memrix
operands.

llvm-svn: 182032
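
The relationship between the "true" offset and the 14-bit instruction field can be sketched as follows. This is an illustrative model only, not the PowerPC code emitter; `encode_ds_field` and `decode_ds_field` are hypothetical names. The operand carries the signed byte offset, and the conversion to the 14-bit field happens only at encoding time, after checking the two rules stated above.

```python
def encode_ds_field(offset):
    """Turn a true byte offset into the 14-bit field of an LD/STD-style
    instruction. The offset must fit in 16 signed bits and be a multiple of 4."""
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("offset does not fit in a signed 16-bit value")
    if offset % 4 != 0:
        raise ValueError("offset is not a multiple of 4")
    return (offset >> 2) & 0x3FFF


def decode_ds_field(field):
    """Recover the true offset: append two zero bits, then sign-extend 16 bits."""
    offset = (field & 0x3FFF) << 2
    return offset - 0x10000 if offset & 0x8000 else offset


for off in (0, 8, -8, 32764, -32768):
    field = encode_ds_field(off)
    print(off, hex(field), decode_ds_field(field))
```

Keeping the signed byte offset in the operand means ordinary arithmetic on offsets (adding, comparing, range checks) works without knowing about the divide-by-4 representation.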

9d980cbd

PPC32 cannot form counter loops around i64 FP conversions · 47db66d4

Hal Finkel authored May 16, 2013

On PPC32, i64 FP conversions are implemented using runtime calls (which clobber
the counter register). These must be excluded.

llvm-svn: 182023

47db66d4

Add a triple to the test to try to fix the windows bots. · eb03f8a7
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182022
```
eb03f8a7
More addFrameMove test coverage. · 8174c8cc
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182021
```
8174c8cc

Use new CHECK-DAG support to stabilize CodeGen/PowerPC/recipest.ll · 22f91919

Bill Schmidt authored May 16, 2013

While testing some experimental code to add vector-scalar registers to
PowerPC, I noticed that a couple of independent instructions were
flipped by the scheduler.  The new CHECK-DAG support is perfect for
avoiding this problem.

llvm-svn: 182020

22f91919

Add more addFrameMove test coverage. · 12adfd8e
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182019
```
12adfd8e
Fixing a 64-bit conversion warning in MSVC. · b4284e6c
Aaron Ballman authored May 16, 2013
```
llvm-svn: 182018
```
b4284e6c
Add more test coverage for addFrameMove. · c6b7383b
Rafael Espindola authored May 16, 2013
```
llvm-svn: 182017
```
c6b7383b

Remove dead calls to addFrameMove. · 63d2e0ad

Rafael Espindola authored May 16, 2013

Without a PROLOG_LABEL present, the cfi instructions are never printed.

llvm-svn: 182016

63d2e0ad