- May 17, 2013
-
-
Benjamin Kramer authored
Shuffles that only move an element into position 0 of the vector are common in the output of the loop vectorizer and often generate suboptimal code when SSSE3 is not available. Lower them to vector shifts if possible. We still prefer palignr over psrldq because it has higher throughput on sandybridge. llvm-svn: 182102
-
Benjamin Kramer authored
llvm-svn: 182101
-
Benjamin Kramer authored
llvm-svn: 182100
-
David Tweed authored
which doesn't resolve the deeper problem. llvm-svn: 182098
-
Ulrich Weigand authored
[PowerPC] Fix hi/lo encoding in old-style code emitter This patch implements the equivalent change to r182091/r182092 in the old-style code emitter. Instead of having two separate 16-bit immediate encoding routines depending on the instruction, this patch introduces a single encoder that checks the machine operand flags to decide whether the low or high half of a symbol address is required. Since now both encoders make no further distinction between "symbolLo" and "symbolHi", the .td operand can now use a single getS16ImmEncoding method. Tested by running the old-style JIT tests on 32-bit Linux. llvm-svn: 182097
-
Ulrich Weigand authored
[PowerPC] Merge/rename PPC fixup types Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly the same everywhere, it no longer makes sense to have two fixup types. This patch merges them both into a single type fixup_ppc_half16, and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency. (The half16 and half16ds names are taken from the description of relocation types in the PowerPC ABI.) No change in code generation expected. llvm-svn: 182092
-
Ulrich Weigand authored
[PowerPC] Fix processing of ha16/lo16 fixups The current PowerPC MC back end distinguishes between fixup_ppc_ha16 and fixup_ppc_lo16, which are determined by the instruction the fixup applies to, and uses this distinction to decide whether a fixup ought to resolve to the high or the low part of a symbol address. This isn't quite correct, however. It is valid -if unusual- assembler to use, e.g. li 1, symbol@ha or lis 1, symbol@l Whether the high or the low part of the address is used depends solely on the @ suffix, not on the instruction. In addition, both li 1, symbol and lis 1, symbol are valid, assuming the symbol address fits into 16 bits; again, both will then refer to the actual symbol value (so li will load the value itself, while lis will load the value shifted by 16). To fix this, two places need to be adapted. If the fixup cannot be resolved at assembler time, a relocation needs to be emitted via PPCELFObjectWriter::getRelocType. This routine already looks at the VK_ type to determine the relocation. The only problem is that will reject any _LO modifier in a ha16 fixup and vice versa. This is simply incorrect; any of those modifiers ought to be accepted for either fixup type. If the fixup *can* be resolved at assembler time, adjustFixupValue currently selects the high bits of the symbol value if the fixup type is ha16. Again, this is incorrect; see the above example lis 1, symbol Now, in theory we'd have to respect a VK_ modifier here. However, in fact common code never even attempts to resolve symbol references using any nontrivial VK_ modifier at assembler time; it will always fall back to emitting a reloc and letting the linker handle it. If this ever changes, presumably there'd have to be a target callback to resolve VK_ modifiers. We'd then have to handle @ha etc. there. llvm-svn: 182091
-
Sylvestre Ledru authored
llvm-svn: 182090
-
Benjamin Kramer authored
llvm-svn: 182086
-
David Tweed authored
the JIT object (including XFAIL an ARM test that now needs fixing). Also renames internal function for consistency. llvm-svn: 182085
-
Christian Konig authored
This is a candidate for the stable branch. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=64694 Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Michel Dänzer <michel.daenzer@amd.com> Tested-by:
Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 182084
-
Venkatraman Govindaraju authored
llvm-svn: 182063
-
Adrian Prantl authored
radar://problem/13865940 llvm-svn: 182062
-
- May 16, 2013
-
-
Ahmed Bougacha authored
llvm-svn: 182054
-
Akira Hatanaka authored
Previously, three instructions were needed: trunc.w.s $f0, $f2 mfc1 $4, $f0 sw $4, 0($2) Now we need only two: trunc.w.s $f0, $f2 swc1 $f0, 0($2) llvm-svn: 182053
-
Rafael Espindola authored
Now that we have good testing, remove addFrameMove and create cfi instructions directly. llvm-svn: 182052
-
Rafael Espindola authored
llvm-svn: 182051
-
Akira Hatanaka authored
llvm-svn: 182050
-
Hal Finkel authored
We need ppc instead of generic to override native features on ppc machines. llvm-svn: 182049
-
Jack Carter authored
This patch removes alias definition for addiu $rs,$imm and instead uses the TwoOperandAliasConstraint field in the ArithLogicI instruction class. This way all instructions that inherit ArithLogicI class have the same macro defined. The usage examples are added to test files. Patch by Vladimir Medic llvm-svn: 182048
-
Jack Carter authored
llvm-svn: 182047
-
Rafael Espindola authored
llvm-svn: 182046
-
Hal Finkel authored
Some IR-level instructions (such as FP <-> i64 conversions) are not chained w.r.t. the mtctr intrinsic and yet may become function calls that clobber the counter register. At the selection-DAG level, these might be reordered with the mtctr intrinsic causing miscompiles. To avoid this situation, if an existing preheader has instructions that might use the counter register, create a new preheader for the mtctr intrinsic. This extra block will be remerged with the old preheader at the MI level, but will prevent unwanted reordering at the selection-DAG level. llvm-svn: 182045
-
Akira Hatanaka authored
llvm-svn: 182044
-
Akira Hatanaka authored
invalid instruction sequence. Rather than emitting an int-to-FP move instruction and an int-to-FP conversion instruction during instruction selection, we emit a pseudo instruction which gets expanded post-RA. Without this change, register allocation can possibly insert a floating point register move instruction between the two instructions, which is not valid according to the ISA manual. mtc1 $f4, $4 # int-to-fp move instruction. mov.s $f2, $f4 # move contents of $f4 to $f2. cvt.s.w $f0, $f2 # int-to-fp conversion. llvm-svn: 182042
-
Rafael Espindola authored
llvm-svn: 182041
-
Jack Carter authored
This patch adds bnez and beqz instructions which represent alias definitions for bne and beq instructions as follows: bnez $rs,$imm => bne $rs,$zero,$imm beqz $rs,$imm => beq $rs,$zero,$imm The corresponding test cases are added. Patch by Vladimir Medic llvm-svn: 182040
-
Benjamin Kramer authored
if ((x & 255) == 255) before: movzbl %al, %eax cmpl $255, %eax after: cmpb $-1, %al llvm-svn: 182038
-
Akira Hatanaka authored
llvm-svn: 182036
-
Akira Hatanaka authored
llvm-svn: 182035
-
Jakob Stoklund Olesen authored
This lane mask provides information about which register lanes completely cover super-registers. See the block comment before getCoveringLanes(). llvm-svn: 182034
-
Ulrich Weigand authored
[PowerPC] Use true offset value in "memrix" machine operands This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands. This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4. The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value). Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way. This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4. This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands. llvm-svn: 182032
-
Hal Finkel authored
On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded. llvm-svn: 182023
-
Rafael Espindola authored
llvm-svn: 182022
-
Rafael Espindola authored
llvm-svn: 182021
-
Bill Schmidt authored
While testing some experimental code to add vector-scalar registers to PowerPC, I noticed that a couple of independent instructions were flipped by the scheduler. The new CHECK-DAG support is perfect for avoiding this problem. llvm-svn: 182020
-
Rafael Espindola authored
llvm-svn: 182019
-
Aaron Ballman authored
llvm-svn: 182018
-
Rafael Espindola authored
llvm-svn: 182017
-
Rafael Espindola authored
Without a PROLOG_LABEL present, the cfi instructions are never printed. llvm-svn: 182016
-