- Apr 12, 2012
-
-
Kevin Enderby authored
of a VST instruction. llvm-svn: 154544
-
Akira Hatanaka authored
- FCOPYSIGN nodes that have operands of different types were not handled. - Different code was generated depending on the endianness of the target. Additionally, code is added that emits INS and EXT instructions, if they are supported by target (they are R2 instructions). llvm-svn: 154540
-
- Apr 11, 2012
-
-
Jim Grosbach authored
While there is an encoding for it in VUZP, the result of that is undefined, so we should avoid it. Define the instruction as a pseudo for VTRN.32 instead, as the ARM ARM indicates. rdar://11222366 llvm-svn: 154511
-
Jim Grosbach authored
While there is an encoding for it in VZIP, the result of that is undefined, so we should avoid it. Define the instruction as a pseudo for VTRN.32 instead, as the ARM ARM indicates. rdar://11221911 llvm-svn: 154505
-
Nadav Rotem authored
llvm-svn: 154494
-
Duncan Sands authored
binary and assembly. Patch by Carlo Kok. Emitting was inspired by but not based on the D llvm bindings. llvm-svn: 154493
-
-
Nadav Rotem authored
Original message: Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendV uses a register for the selection while Vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154483
-
Evan Cheng authored
predicates. Also remove NEON2 since it's not really useful and it is confusing. If NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it really mean? rdar://10139676 llvm-svn: 154480
-
-
Charles Davis authored
ret instructions. llvm-svn: 154468
-
Kevin Enderby authored
for all opcodes handed by DecodeVLDInstruction() in ARMDisassembler.cpp . llvm-svn: 154459
-
Jim Grosbach authored
rdar://11222742 llvm-svn: 154457
-
Evan Cheng authored
1. The new instruction itinerary entries are not properly described. 2. The asm parser can't handle vfms and vfnms. 3. There were no assembler, disassembler test cases. 4. HasNEON2 has the wrong assembler predicate. rdar://10139676 llvm-svn: 154456
-
- Apr 10, 2012
-
-
-
Chad Rosier authored
llvm-svn: 154427
-
Chad Rosier authored
llvm-svn: 154426
-
Eric Christopher authored
llvm-svn: 154425
-
Jim Grosbach authored
We were incorrectly conflating some add variants which don't have a cc_out operand with the mirroring sub encodings, which do. Part of the awesome non-orthogonality legacy of thumb1. Similarly, handling of add/sub of an immediate was sometimes incorrectly removing the cc_out operand for add/sub register variants. rdar://11216577 llvm-svn: 154411
-
David Blaikie authored
llvm-svn: 154398
-
Nadav Rotem authored
blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154396
-
Evan Cheng authored
legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370
-
Jim Grosbach authored
Generalized logic of r154141. llvm-svn: 154362
-
- Apr 09, 2012
-
-
Chad Rosier authored
in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340
-
Chad Rosier authored
llvm-svn: 154336
-
David Blaikie authored
A couple of cases where we were accidentally creating constant conditions by something like "x == a || b" instead of "x == a || x == b". In one case a conditional & then unreachable was used - I transformed this into a direct assert instead. llvm-svn: 154324
-
Preston Gurd authored
original patch to add itineraries, to X86InstrArithmetc.td. llvm-svn: 154320
-
Nadav Rotem authored
llvm-svn: 154313
-
Nadav Rotem authored
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310
-
Chandler Carruth authored
x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is *not* using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304
-
- Apr 08, 2012
-
-
Chandler Carruth authored
optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, its just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. llvm-svn: 154294
-
Chandler Carruth authored
in TargetLowering. There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE. llvm-svn: 154292
-
Nadav Rotem authored
Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. llvm-svn: 154284
-
Craig Topper authored
Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1. llvm-svn: 154272
-
- Apr 07, 2012
-
-
Craig Topper authored
Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns. llvm-svn: 154268
-
rdar://problem/11203543Bob Wilson authored
The tLDRr instruction with the last register operand set to the zero register prints in assembly as if no register was specified, and the assembler encodes it as a tLDRi instruction with a zero immediate. With the integrated assembler, that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which is broken. Emit the instruction as tLDRi with a zero immediate. I don't know if there's a good way to write a testcase for this. Suggestions welcome. Opportunities for follow-up work: 1) The asm printer should complain if a non-optional register operand is set to the zero register, instead of silently dropping it. 2) The integrated assembler should complain in the same situation, instead of silently emitting the operand as "r0". llvm-svn: 154261
-
NAKAMURA Takumi authored
Cygwin-1.7 supports dw2. Some recent mingw distros support one, too. I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin. llvm-svn: 154247
-
Alexis Hunt authored
by default. This is a behaviour configurable in the MCAsmInfo. I've decided to turn it on by default in (possibly optimistic) hopes that most assemblers are reasonably sane. If this proves a problem, switching to default seems reasonable. I'm not sure if this is the opportune place to test, but it seemed good to make sure it was tested somewhere. llvm-svn: 154235
-
Jim Grosbach authored
llvm-svn: 154226
-
- Apr 06, 2012
-
-
Jakob Stoklund Olesen authored
llvm-svn: 154210
-