- May 19, 2013
-
-
Jakob Stoklund Olesen authored
The wired physreg doesn't work on tied operands like on MOVXCC. Add a README note to fix this later. llvm-svn: 182225
-
Jakob Stoklund Olesen authored
llvm-svn: 182224
-
rdar://problem/13924072
Bob Wilson authored
This fixes a bootstrapping problem with builds for Apple ARM targets. Clang had the wrong prototype for __clear_cache with ARM targets. Rafael fixed that in clang svn r181784 and r181810, but without those changes, we can't build this code for ARM because clang reports an error about the declaration in Memory.inc not matching the builtin declaration. Some of our buildbots need to use an older compiler that doesn't have the clang fix. Since __clear_cache is never used here when __APPLE__ is defined, I'm just conditionalizing the declaration to match that. I also moved the declaration of sys_icache_invalidate inside the conditional for __APPLE__ while I was at it. llvm-svn: 182223
-
Jakob Stoklund Olesen authored
llvm-svn: 182222
-
Jakob Stoklund Olesen authored
Also clean up the arguments to all the MOVCC instructions so the operands always are (true-val, false-val, cond-code). llvm-svn: 182221
-
Venkatraman Govindaraju authored
[Sparc] Rearrange integer registers' allocation order so that the register allocator will use I and G registers before using L and O registers. Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode. llvm-svn: 182219
-
Jakob Stoklund Olesen authored
llvm-svn: 182216
-
Tim Northover authored
AArch64 ELF uses .rela relocations so there's no need to actually make use of the bits we're setting in the destination. However, we should make sure all bits are cleared properly since multiple runs of resolveRelocations are possible and these could combine to produce invalid results if stale versions remain in the code. llvm-svn: 182214
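The stale-bits hazard mentioned above can be illustrated with a generic bit-field patch. This is a minimal sketch, not the AArch64 relocation code: `fieldMask`, `patchOrOnly`, and `patchClearFirst` are hypothetical helpers, and the field position used below is arbitrary.

```cpp
#include <cstdint>

// Hypothetical helper: mask covering bits [Shift, Shift + Width).
// Assumes 0 < Width < 32 and Shift + Width <= 32.
inline uint32_t fieldMask(unsigned Shift, unsigned Width) {
  return ((1u << Width) - 1u) << Shift;
}

// OR-only patching: bits left over from an earlier run stay set and
// combine with the new value.
inline uint32_t patchOrOnly(uint32_t Insn, uint32_t Value, unsigned Shift,
                            unsigned Width) {
  return Insn | ((Value << Shift) & fieldMask(Shift, Width));
}

// Clear-then-set patching: the field is zeroed first, so running the
// patch again with a different value gives exactly that value.
inline uint32_t patchClearFirst(uint32_t Insn, uint32_t Value, unsigned Shift,
                                unsigned Width) {
  const uint32_t Mask = fieldMask(Shift, Width);
  return (Insn & ~Mask) | ((Value << Shift) & Mask);
}
```

Patching a 4-bit field first with 5 and then with 2 leaves 7 in the field under the OR-only variant, but 2 under the clear-first variant.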
-
Tim Northover authored
lli's remote MCJIT code calls setExecutable just prior to running code. In line with Darwin behaviour this seems to be the place to invalidate any caches needed so that relocations can take effect properly. llvm-svn: 182213
-
- May 18, 2013
-
-
David Majnemer authored
This is useful if something that looks like (x & (1 << y)) ? 64 : 32 is the divisor in a modulo operation. llvm-svn: 182200
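One way to see why this helps, assuming the point is that both arms of the select are powers of two: an unsigned remainder by a power of two can be computed with a mask. The function names below are hypothetical and only illustrate the arithmetic, not the optimizer's implementation.

```cpp
#include <cstdint>

// The divisor from the message: always 64 or 32, so always a power of two.
inline uint32_t selectDivisor(uint32_t X, unsigned Y) {
  return (X & (1u << Y)) ? 64u : 32u;
}

// Straightforward form: an actual remainder operation.
inline uint32_t remNaive(uint32_t N, uint32_t X, unsigned Y) {
  return N % selectDivisor(X, Y);
}

// Equivalent form when the divisor is known to be a power of two:
// N % D == N & (D - 1) for unsigned N.
inline uint32_t remMasked(uint32_t N, uint32_t X, unsigned Y) {
  return N & (selectDivisor(X, Y) - 1u);
}
```

Both forms agree for every unsigned input because neither arm of the select can be zero or a non-power-of-two.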
-
Arnold Schwaighofer authored
We might encounter single edge PHIs - handle them with an identity select. Fixes PR15990. llvm-svn: 182199
-
Hal Finkel authored
We don't need to reject all inline asm as using the counter register (most does not). Only those that explicitly clobber the counter register need to prevent the transformation. llvm-svn: 182191
-
Tim Northover authored
llvm-svn: 182190
-
David Majnemer authored
The peephole tries to reorder MOV32r0 instructions such that they are before the instruction that modifies EFLAGS. The problem is that the peephole does not consider the case where the instruction that modifies EFLAGS also depends on the previous state of EFLAGS. Instead, walk backwards until we find an instruction that has a def for EFLAGS but does not have a use. If we find such an instruction, insert the MOV32r0 before it. If it cannot find such an instruction, skip the optimization. llvm-svn: 182184
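The backward walk described above can be sketched on a toy instruction list. `Instr`, its two flags, and `findZeroingInsertPoint` are hypothetical names for illustration; this is not the X86 peephole's actual code, and it ignores everything except whether an instruction defines or uses the flags register.

```cpp
#include <vector>

// Toy model of an instruction: only its relation to the flags register.
struct Instr {
  bool DefsFlags; // writes EFLAGS
  bool UsesFlags; // reads the previous EFLAGS value
};

// Walk backwards from index From looking for an instruction that defines
// the flags without also reading them. Returns its index (the zeroing move
// would be inserted before it), or -1 if there is none, in which case the
// optimization is skipped.
inline int findZeroingInsertPoint(const std::vector<Instr> &Block, int From) {
  for (int I = From; I >= 0; --I)
    if (Block[I].DefsFlags && !Block[I].UsesFlags)
      return I;
  return -1;
}
```

With a compare followed by two add-with-carry style instructions, starting from the last one yields the compare's index, since the instructions in between both read and write the flags.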
-
Matt Arsenault authored
The same comment is already made in the header llvm-svn: 182181
-
Matt Arsenault authored
llvm-svn: 182180
-
JF Bastien authored
This patch matches GCC behavior: the code used to only allow unaligned load/store on ARM for v6+ Darwin; it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on Linux and NaCl. The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware+kernel and therefore has conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux/NaCl behave sanely). The patch keeps the -arm-strict-align command line option, and adds -arm-no-strict-align. They behave similarly to GCC's -mstrict-align and -mnostrict-align. I originally encountered this discrepancy in FastISel tests which expect unaligned load/store generation. Overall this should slightly improve performance in most cases because of reduced I$ pressure. llvm-svn: 182175
-
Rafael Espindola authored
llvm-svn: 182169
-
Rafael Espindola authored
The errors were: non-constant-expression cannot be narrowed from type 'int64_t' (aka 'long') to 'uint32_t' (aka 'unsigned int') in initializer list and non-constant-expression cannot be narrowed from type 'long' to 'uint32_t' (aka 'unsigned int') in initializer list llvm-svn: 182168
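For context, a minimal reproduction of this class of error and one common way to silence it. This is a sketch only: `Halves` and `splitValue` are hypothetical, and the code this commit actually touched is not shown in the message.

```cpp
#include <cstdint>

// Hypothetical aggregate with 32-bit fields.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

inline Halves splitValue(int64_t V) {
  // Ill-formed in C++11, because a non-constant int64_t is narrowed to
  // uint32_t inside an initializer list:
  //   Halves H = {V, V >> 32};
  // Making the conversion explicit states the intent and is accepted.
  Halves H = {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
  return H;
}
```

The explicit casts keep the truncating behavior the code already relied on while satisfying the C++11 narrowing rules.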
-
- May 17, 2013
-
-
Matt Arsenault authored
Use EVT::changeExtendedVectorElementTypeToInteger instead of doing the same thing that it does llvm-svn: 182165
-
Matt Arsenault authored
llvm-svn: 182164
-
Vincent Lejeune authored
It solves a bug uncovered by dot4 patch where the register class of int_load_input use was ignored. llvm-svn: 182130
-
Vincent Lejeune authored
llvm-svn: 182129
-
Vincent Lejeune authored
It should increase PV substitution opportunities and lower gpr usage (pending computation paths are "flushed" sooner). llvm-svn: 182128
-
Vincent Lejeune authored
llvm-svn: 182127
-
Vincent Lejeune authored
Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the register coalescer to remove some unneeded COPYs. This patch also defines some structures/functions that can be used to handle every vector instruction (CUBE, Cayman special instructions...) in a similar fashion. llvm-svn: 182126
-
Vincent Lejeune authored
llvm-svn: 182125
-
Vincent Lejeune authored
Almost all instructions that take a 128-bit reg as input (fetch, export...) have the ability to swizzle their argument and output. Instead of printing the default swizzle for each 128-bit reg, rename T*.XYZW to T* and let instructions print potentially optimized swizzles themselves. llvm-svn: 182124
-
Vincent Lejeune authored
llvm-svn: 182123
-
Vincent Lejeune authored
llvm-svn: 182122
-
Vincent Lejeune authored
llvm-svn: 182121
-
Tom Stellard authored
Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64193 https://bugs.freedesktop.org/show_bug.cgi?id=64257 https://bugs.freedesktop.org/show_bug.cgi?id=64320 NOTE: This is a candidate for the 3.3 branch. llvm-svn: 182113
-
Tom Stellard authored
llvm-svn: 182112
-
Venkatraman Govindaraju authored
This is to generate correct framesetup code when the function has variable sized allocas. llvm-svn: 182108
-
Benjamin Kramer authored
Shuffles that only move an element into position 0 of the vector are common in the output of the loop vectorizer and often generate suboptimal code when SSSE3 is not available. Lower them to vector shifts if possible. We still prefer palignr over psrldq because it has higher throughput on sandybridge. llvm-svn: 182102
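The idea of lowering such a shuffle to a vector shift can be modelled in scalar code. This is a sketch under stated assumptions: `V4` and `shiftLanesRight` are hypothetical, the model works on whole 32-bit lanes rather than bytes, and it does not use or represent the actual X86 lowering code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar stand-in for a 4 x i32 vector register.
using V4 = std::array<uint32_t, 4>;

// Model of a whole-register right shift by Lanes elements, shifting in
// zeros at the top (the lane-granular analogue of a byte shift such as
// psrldq).
inline V4 shiftLanesRight(const V4 &V, std::size_t Lanes) {
  V4 R = {{0, 0, 0, 0}};
  for (std::size_t I = 0; I + Lanes < V.size(); ++I)
    R[I] = V[I + Lanes];
  return R;
}
```

A shuffle whose only requirement is "element 2 ends up in position 0" is satisfied by `shiftLanesRight(V, 2)`, because the remaining lanes are don't-care values.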
-
Benjamin Kramer authored
llvm-svn: 182100
-
David Tweed authored
which doesn't resolve the deeper problem. llvm-svn: 182098
-
Ulrich Weigand authored
[PowerPC] Fix hi/lo encoding in old-style code emitter
This patch implements the equivalent change to r182091/r182092 in the old-style code emitter. Instead of having two separate 16-bit immediate encoding routines depending on the instruction, this patch introduces a single encoder that checks the machine operand flags to decide whether the low or high half of a symbol address is required. Since now both encoders make no further distinction between "symbolLo" and "symbolHi", the .td operand can now use a single getS16ImmEncoding method. Tested by running the old-style JIT tests on 32-bit Linux. llvm-svn: 182097
-
Ulrich Weigand authored
[PowerPC] Merge/rename PPC fixup types
Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly the same everywhere, it no longer makes sense to have two fixup types. This patch merges them both into a single type fixup_ppc_half16, and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency. (The half16 and half16ds names are taken from the description of relocation types in the PowerPC ABI.) No change in code generation expected. llvm-svn: 182092
-
Ulrich Weigand authored
[PowerPC] Fix processing of ha16/lo16 fixups

The current PowerPC MC back end distinguishes between fixup_ppc_ha16 and fixup_ppc_lo16, which are determined by the instruction the fixup applies to, and uses this distinction to decide whether a fixup ought to resolve to the high or the low part of a symbol address.

This isn't quite correct, however. It is valid -if unusual- assembler to use, e.g.

    li 1, symbol@ha

or

    lis 1, symbol@l

Whether the high or the low part of the address is used depends solely on the @ suffix, not on the instruction.

In addition, both

    li 1, symbol

and

    lis 1, symbol

are valid, assuming the symbol address fits into 16 bits; again, both will then refer to the actual symbol value (so li will load the value itself, while lis will load the value shifted by 16).

To fix this, two places need to be adapted. If the fixup cannot be resolved at assembler time, a relocation needs to be emitted via PPCELFObjectWriter::getRelocType. This routine already looks at the VK_ type to determine the relocation. The only problem is that it will reject any _LO modifier in a ha16 fixup and vice versa. This is simply incorrect; any of those modifiers ought to be accepted for either fixup type.

If the fixup *can* be resolved at assembler time, adjustFixupValue currently selects the high bits of the symbol value if the fixup type is ha16. Again, this is incorrect; see the above example

    lis 1, symbol

Now, in theory we'd have to respect a VK_ modifier here. However, in fact common code never even attempts to resolve symbol references using any nontrivial VK_ modifier at assembler time; it will always fall back to emitting a reloc and letting the linker handle it. If this ever changes, presumably there'd have to be a target callback to resolve VK_ modifiers. We'd then have to handle @ha etc. there.

llvm-svn: 182091
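As background for the @l and @ha suffixes discussed above, the 16-bit halves of a 32-bit address can be written out as plain arithmetic. This is an illustrative sketch, not code from the PowerPC back end: `lo16`, `hi16`, `ha16`, and `recombine` are hypothetical helper names.

```cpp
#include <cstdint>

// @l: the low 16 bits of the address.
inline uint16_t lo16(uint32_t Addr) { return Addr & 0xffffu; }

// @h: the high 16 bits of the address, unadjusted.
inline uint16_t hi16(uint32_t Addr) { return (Addr >> 16) & 0xffffu; }

// @ha: the high 16 bits, adjusted by one when the low half will be
// sign-extended as a negative value by the instruction that adds it.
inline uint16_t ha16(uint32_t Addr) {
  return ((Addr + 0x8000u) >> 16) & 0xffffu;
}

// Rebuild the address the way a lis/addi style pair would: the high-adjusted
// half shifted left by 16, plus the sign-extended low half.
inline uint32_t recombine(uint16_t Ha, uint16_t Lo) {
  const int32_t SignedLo = static_cast<int16_t>(Lo);
  return (static_cast<uint32_t>(Ha) << 16) + static_cast<uint32_t>(SignedLo);
}
```

For the address 0x12348765 the low half is 0x8765, the plain high half is 0x1234, and the adjusted high half is 0x1235, which compensates for 0x8765 being treated as negative when added. Which of these values a fixup needs is a property of the suffix, which is the point the message makes.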
-