- May 16, 2013
-
-
Akira Hatanaka authored
invalid instruction sequence. Rather than emitting an int-to-FP move instruction and an int-to-FP conversion instruction during instruction selection, we emit a pseudo instruction which gets expanded post-RA. Without this change, register allocation can possibly insert a floating point register move instruction between the two instructions, which is not valid according to the ISA manual:

    mtc1    $f4, $4    # int-to-fp move instruction
    mov.s   $f2, $f4   # move contents of $f4 to $f2
    cvt.s.w $f0, $f2   # int-to-fp conversion

llvm-svn: 182042
-
Jack Carter authored
This patch adds bnez and beqz instructions which represent alias definitions for bne and beq instructions as follows: bnez $rs,$imm => bne $rs,$zero,$imm beqz $rs,$imm => beq $rs,$zero,$imm The corresponding test cases are added. Patch by Vladimir Medic llvm-svn: 182040
-
Akira Hatanaka authored
llvm-svn: 182036
-
Akira Hatanaka authored
llvm-svn: 182035
-
Ulrich Weigand authored
[PowerPC] Use true offset value in "memrix" machine operands This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands. This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4. The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value). Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way. This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4. This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands. llvm-svn: 182032
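As a rough standalone sketch of the rule described above - not the back end's actual code, and with hypothetical function names - the new operand convention only has to enforce two checks, with the divide-by-4 deferred to encoding time:

    #include <cstdint>

    // Standalone model of the "memrix" displacement rule: the operand now
    // carries the true offset, which must fit a signed 16-bit value and be a
    // multiple of 4 (the hardware appends two implicit zero bits).
    bool isValidMemrixOffset(int64_t Offset) {
      bool FitsSigned16 = Offset >= -32768 && Offset <= 32767;
      bool MultipleOf4  = (Offset & 3) == 0;
      return FitsSigned16 && MultipleOf4;
    }

    // Only at instruction-encoding time is the value shrunk into the 14-bit
    // field, instead of being stored pre-divided in the machine operand.
    int64_t encodedMemrixField(int64_t Offset) { return Offset / 4; }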
-
Hal Finkel authored
On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded. llvm-svn: 182023
-
Aaron Ballman authored
llvm-svn: 182018
-
Rafael Espindola authored
Without a PROLOG_LABEL present, the cfi instructions are never printed. llvm-svn: 182016
-
Ulrich Weigand authored
[PowerPC] Report true displacement value from getPreIndexedAddressParts DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to decompose a memory address into a base/offset pair. It expects the offset (if constant) to be the true displacement value in order to perform optional additional optimizations; in particular, to convert other uses of the original pointer into uses of the new base pointer after pre-increment. The PowerPC implementation of getPreIndexedAddressParts, however, simply calls SelectAddressRegImm, which returns a TargetConstant. This value is appropriate for encoding into the instruction, but it is not always usable as a true displacement value:
- Its type is always MVT::i32, even on 64-bit, where addresses ought to be i64; this causes the optimization to simply always fail on 64-bit due to this line in DAGCombiner:
    // FIXME: In some cases, we can be smarter about this.
    if (Op1.getValueType() != Offset.getValueType()) {
- Its value is truncated to an unsigned 16-bit value if negative. This causes the above optimization to generate wrong code.
This patch fixes both problems by simply returning the true displacement value (in its original type). This doesn't affect any other user of the displacement. llvm-svn: 182012
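A tiny standalone illustration of the second problem - not the actual DAG code, just modeling the truncation with plain integers - shows why a value squeezed into an unsigned 16-bit field cannot serve as the true displacement:

    #include <cassert>
    #include <cstdint>

    // Model of the old behavior: a negative displacement truncated to an
    // unsigned 16-bit immediate no longer round-trips to its original value.
    int64_t truncateToUImm16(int64_t Disp) { return Disp & 0xFFFF; }

    int main() {
      int64_t Disp = -8;                      // true displacement
      int64_t Old  = truncateToUImm16(Disp);  // 65528: fine for encoding only
      assert(Old != Disp);                    // unusable as a "true" offset
      int64_t New  = Disp;                    // the patch returns this instead
      assert(New == -8);
      return 0;
    }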
-
Richard Sandiford authored
llvm-svn: 182007
-
Patrik Hagglund authored
-Wunused-but-set-variable. Leftover from r181979. llvm-svn: 181993
-
Rafael Espindola authored
llvm-svn: 181982
-
Rafael Espindola authored
getExceptionHandlingType is not ExceptionHandling::DwarfCFI on xcore, so getFrameInstructions is never called. There is no point creating cfi instructions if they are never used. llvm-svn: 181979
-
Rafael Espindola authored
llvm-svn: 181975
-
Reed Kotler authored
This creates stubs that help Mips32 functions call Mips16 functions which have floating point parameters that are normally passed in floating point registers. llvm-svn: 181972
-
Derek Schuff authored
This reverts r181898. llvm-svn: 181944
-
Rafael Espindola authored
llvm-svn: 181941
-
Hal Finkel authored
Trying to unbreak the VS build by copying some undef code from Utils/LowerInvoke.cpp. llvm-svn: 181938
-
David Majnemer authored
Increase the number of instructions LLVM recognizes as setting the ZF flag. This allows us to remove test instructions that redundantly recalculate the flag. llvm-svn: 181937
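A minimal standalone sketch of the premise (generic opcode names, not LLVM's tables): many arithmetic and logical instructions already set ZF from their result, so a following "test %reg, %reg" that only feeds a zero/non-zero branch is redundant.

    // Model of the property the peephole relies on: which operations update
    // ZF as a side effect of producing their result.
    enum class Op { And, Or, Xor, Add, Sub, Lea, Mov };

    bool setsZFFromResult(Op O) {
      switch (O) {
      case Op::And: case Op::Or: case Op::Xor:
      case Op::Add: case Op::Sub:
        return true;   // result and ZF agree; a trailing TEST adds nothing
      case Op::Lea: case Op::Mov:
        return false;  // these never touch EFLAGS, so TEST is still needed
      }
      return false;
    }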
-
- May 15, 2013
-
-
Hal Finkel authored
The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). llvm-svn: 181927
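The shape the pass ultimately produces can be sketched with a small standalone C++ model (not the pass itself, and with the counter modeled as a plain variable): the trip count is computed once - conceptually by ScalarEvolution from the backedge-taken count - and the latch becomes a single decrement-and-branch.

    #include <cstdint>
    #include <vector>

    // Standalone model of a CTR-style loop: load the trip count into the
    // counter ("mtctr"), then decrement and branch while non-zero ("bdnz").
    void ctrStyleLoop(std::vector<int> &Data) {
      uint64_t TripCount = Data.size();   // backedge-taken count + 1
      if (TripCount == 0)
        return;                           // a zero count must never reach the latch
      uint64_t Ctr = TripCount;           // models "mtctr"
      uint64_t I = 0;
      do {
        Data[I] += 1;                     // loop body
        ++I;
      } while (--Ctr != 0);               // models "bdnz"
    }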
-
Rafael Espindola authored
We want the order to be deterministic on all platforms. NAKAMURA Takumi fixed that in r181864. This patch is just two small cleanups:
* Move the function to the cpp file. It is only passed to array_pod_sort.
* Remove the ppc implementation which is now redundant.
llvm-svn: 181910
-
NAKAMURA Takumi authored
llvm-svn: 181907
-
NAKAMURA Takumi authored
llvm-svn: 181906
-
Derek Schuff authored
This patch matches GCC behavior: the code used to allow unaligned load/store on ARM only for v6+ Darwin; it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on other targets. The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware and kernel and therefore ships conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux behaves sanely). Overall this should slightly improve performance in most cases because of reduced I$ pressure. Patch by JF Bastien llvm-svn: 181897
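As a hedged sketch of the policy (the helper name is hypothetical, not the actual subtarget hook), the decision reduces to a single predicate over OS and architecture version:

    // Standalone model: unaligned load/store is permitted from v6 on Darwin
    // (Apple-controlled hardware and kernel) and from v7 everywhere else,
    // where the architecture itself guarantees support.
    bool allowsUnalignedMem(bool IsDarwin, unsigned ArchVersion) {
      return IsDarwin ? ArchVersion >= 6 : ArchVersion >= 7;
    }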
-
Ulrich Weigand authored
[PowerPC] Remove need for adjustFixupOffset hack Now that applyFixup understands differently-sized fixups, we can define fixup_ppc_lo16/fixup_ppc_lo16_ds/fixup_ppc_ha16 to properly be 2-byte fixups, applied at an offset of 2 relative to the start of the instruction text. This has the benefit that if we actually need to generate a real relocation record, its address will come out correctly automatically, without having to fiddle with the offset in adjustFixupOffset. Tested on both 64-bit and 32-bit PowerPC, using external and integrated assembler. llvm-svn: 181894
-
Richard Sandiford authored
Thanks to Ulrich Weigand for noticing that this instruction was missing. llvm-svn: 181893
-
Ulrich Weigand authored
[PowerPC] Correctly handle fixups of other than 4 byte size The PPCAsmBackend::applyFixup routine handles the case where a fixup can be resolved within the same object file. However, this routine is currently hard-coded to assume the size of any fixup is always exactly 4 bytes. This is sort of correct for fixups on instruction text, though it only works because several of what really would be 2-byte fixups are presented as 4-byte fixups instead (requiring another hack in PPCELFObjectWriter::adjustFixupOffset to clean it up). However, this assumption breaks down completely for fixups on data, which legitimately can be of any size (1, 2, 4, or 8). This patch makes applyFixup aware of fixups of varying sizes, introducing a new helper routine getFixupKindNumBytes (along the lines of what the ARM back end does). Note that in order to handle fixups of size 8, we also need to fix the return type of adjustFixupValue to uint64_t to avoid truncation. Tested on both 64-bit and 32-bit PowerPC, using external and integrated assembler. llvm-svn: 181891
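A standalone sketch of the mechanism (simplified buffer handling, hypothetical names - not PPCAsmBackend itself): once the byte width of a fixup is known, applying it is the same loop for 1-, 2-, 4- and 8-byte fixups.

    #include <cstddef>
    #include <cstdint>

    // Model of a size-aware applyFixup: OR the value's bytes into the data
    // at the fixup offset, most-significant byte first (big-endian PowerPC).
    // NumBytes plays the role of a getFixupKindNumBytes-style helper result.
    void applyFixupBytes(uint8_t *Data, size_t DataSize, size_t Offset,
                         uint64_t Value, unsigned NumBytes) {
      if (Offset + NumBytes > DataSize)
        return;                       // the real code would assert instead
      for (unsigned i = 0; i != NumBytes; ++i)
        Data[Offset + i] |= uint8_t(Value >> ((NumBytes - 1 - i) * 8));
    }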
-
Richard Sandiford authored
Based on an analysis by Ulrich Weigand. llvm-svn: 181882
-
Arnold Schwaighofer authored
The transformation happening here is that we want to turn "mul(ext(X), ext(X))" into "vmull(X, X)", stripping off the extension. We have to make sure that X still has a valid vector type - possibly recreating an extension to a smaller type. In the case of an extload of a memory type smaller than 64 bits, we used to create an ext(load()). The problem with doing this - instead of recreating an extload - is that an illegal type is exposed. This patch fixes this by creating extloads instead of ext(load()) sequences. Fixes PR15970. radar://13871383 llvm-svn: 181842
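A standalone source-level illustration of the pattern involved (hypothetical function, not from the patch): the narrow value is loaded, widened, and multiplied by itself, which is exactly the mul(ext(X), ext(X)) shape that should become a single widening multiply once the extension is folded into the load as an extload.

    #include <cstddef>
    #include <cstdint>

    // Each element is loaded as 8 bits, widened to 16 bits, and squared;
    // the widening multiply (vmull) can consume the narrow load directly.
    void widenAndSquare(const uint8_t *X, uint16_t *Out, size_t N) {
      for (size_t I = 0; I != N; ++I) {
        uint16_t Wide = X[I];      // ext(load X)
        Out[I] = Wide * Wide;      // mul(ext(X), ext(X)) -> vmull(X, X)
      }
    }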
-
- May 14, 2013
-
-
Bill Schmidt authored
Instruction added at request of Roman Divacky. Tested via asm-parser. llvm-svn: 181821
-
Jyotsna Verma authored
where possible. llvm-svn: 181817
-
Eric Christopher authored
a somewhat randomly chosen CPU that will minimize CPU-specific differences on bots. llvm-svn: 181814
-
Eric Christopher authored
It's causing failures on the atom bot. llvm-svn: 181812
-
Eric Christopher authored
Patch by Andrea DiBiagio. llvm-svn: 181809
-
Jyotsna Verma authored
llvm-svn: 181805
-
Jyotsna Verma authored
llvm-svn: 181803
-
Bill Schmidt authored
The changes to CR spill handling missed a case for 32-bit PowerPC. The code in PPCFrameLowering::processFunctionBeforeFrameFinalized() checks whether CR spill has occurred using a flag in the function info. This flag is only set by storeRegToStackSlot and loadRegFromStackSlot. spillCalleeSavedRegisters does not call storeRegToStackSlot, but instead produces the MI directly. Thus we don't see that the CR is spilled when assigning frame offsets, and the CR spill ends up colliding with some other location (generally the FP slot). This patch sets the flag in spillCalleeSavedRegisters for PPC32 so that the CR spill is properly detected and gets its own slot in the stack frame. llvm-svn: 181800
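A standalone sketch of the fix (simplified types and hypothetical names, not the PPC code itself): the spill path that emits the callee-saved stores directly must also set the flag that frame-offset assignment later consults.

    // Model of the bookkeeping: the flag was previously set only in
    // storeRegToStackSlot, so a CR spill emitted directly in
    // spillCalleeSavedRegisters went unnoticed and collided with the FP slot.
    struct FuncInfoModel {
      bool SpillsCR = false;   // consulted when frame offsets are assigned
    };

    void spillCalleeSavedRegs(FuncInfoModel &FI, bool SpillsCRField,
                              bool Is32Bit) {
      // ... emit the actual spill code for each callee-saved register ...
      if (Is32Bit && SpillsCRField)
        FI.SpillsCR = true;    // reserve a dedicated slot for the CR spill
    }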
-
Jyotsna Verma authored
llvm-svn: 181797
-
Tom Stellard authored
Patch by: Alex Deucher
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
NOTE: This is a candidate for the 3.3 branch. llvm-svn: 181792
-
Richard Sandiford authored
llvm-svn: 181777
-