- Sep 16, 2013
-
-
Vladimir Medic authored
llvm-svn: 190780
-
Benjamin Kramer authored
llvm-svn: 190779
-
Richard Sandiford authored
The port originally had special patterns for extload, mapping them to the same instructions as sextload. It seemed neater to have patterns that match "an extension that is allowed to be signed" and "an extension that is allowed to be unsigned". This was originally meant to be a clean-up, but it does improve the handling of promoted integers a little, as shown by args-06.ll. llvm-svn: 190777
-
Craig Topper authored
llvm-svn: 190775
-
Hal Finkel authored
This is a re-commit of r190764, with an extra check to make sure that we're not performing the transformation on illegal types (a small test case has been added for this as well). Original commit message: The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190771
-
Benjamin Kramer authored
llvm-svn: 190770
-
- Sep 15, 2013
-
-
Hal Finkel authored
This is causing test-suite failures. Original commit message: The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190765
-
Hal Finkel authored
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190764
-
Reed Kotler authored
so it can be better used for general interoperability testing between mips32 and mips16. llvm-svn: 190762
-
- Sep 14, 2013
-
-
Ben Langmuir authored
Also assembly/disassembly tests, and for sha256rnds2, aliases with an explicit xmm0 dependency. llvm-svn: 190754
-
Robert Wilhelm authored
llvm-svn: 190749
-
Zoran Jovanovic authored
llvm-svn: 190746
-
Zoran Jovanovic authored
llvm-svn: 190745
-
Zoran Jovanovic authored
llvm-svn: 190744
-
- Sep 13, 2013
-
-
Hal Finkel authored
As it turns out, not a problem in practice, but it should be there. llvm-svn: 190720
-
Preston Gurd authored
Implements Instruction scheduler latencies for Silvermont, using latencies from the Intel Silvermont Optimization Guide. Auto detects SLM. Turns on post RA scheduler when generating code for SLM. llvm-svn: 190717
-
Joey Gouly authored
to be more consistent. llvm-svn: 190692
-
Joey Gouly authored
Patch by Bradley Smith! llvm-svn: 190683
-
Zoran Jovanovic authored
llvm-svn: 190676
-
Richard Sandiford authored
Just a clean-up, no behavioral change intended. llvm-svn: 190673
-
Richard Sandiford authored
E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4". llvm-svn: 190672
-
Tim Northover authored
Previously we modelled VPR128 and VPR64 as essentially identical register-classes containing V0-V31 (which had Q0-Q31 as "sub_alias" sub-registers). This model is starting to cause significant problems for code generation, particularly writing EXTRACT/INSERT_SUBREG patterns for converting between the two. The change here switches to classifying VPR64 & VPR128 as RegisterOperands, which are essentially aliases for RegisterClasses with different parsing and printing behaviour. This fits almost exactly with their real status (VPR128 == FPR128 printed strangely, VPR64 == FPR64 printed strangely). llvm-svn: 190665
-
Craig Topper authored
llvm-svn: 190659
-
Vincent Lejeune authored
llvm-svn: 190645
-
Vincent Lejeune authored
llvm-svn: 190644
-
Vincent Lejeune authored
This move makes possible to correctly handle multiples instructions from a single pattern. llvm-svn: 190643
-
Chandler Carruth authored
llvm-svn: 190640
-
Hal Finkel authored
When a structure is passed by value, and that structure contains a vector member, according to the PPC ABI, the structure will receive enhanced alignment (so that the vector within the structure will always be aligned). This should resolve PR16641. llvm-svn: 190636
-
- Sep 12, 2013
-
-
Hal Finkel authored
In fast-math mode sqrt(x) is calculated using the fast expansion of the reciprocal of the reciprocal sqrt expansion. The reciprocal and reciprocal sqrt expansions use the associated estimate instructions along with some Newton iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN, which is not correct. Now we explicitly return a result of zero if the input is zero. llvm-svn: 190624
-
Roman Divacky authored
FreeBSD kernel. llvm-svn: 190618
-
Ben Langmuir authored
Add basic assembly/disassembly support for the first Intel SHA instruction 'sha1rnds4'. Also includes feature flag, and test cases. Support for the remaining instructions will follow in a separate patch. llvm-svn: 190611
-
Hal Finkel authored
Use the new instruction deprecation feature to mark mftb (now replaced with mfspr) and dst (along with the other Altivec cache control instructions) as deprecated when targeting cores supporting at least ISA v2.03. llvm-svn: 190605
-
Joey Gouly authored
The 'Deprecated' class allows you to specify a SubtargetFeature that the instruction is deprecated on. The 'ComplexDeprecationPredicate' class allows you to define a custom predicate that is called to check for deprecation. For example: ComplexDeprecationPredicate<"MCR"> would mean you would have to define the following function: bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI, std::string &Info) Which returns 'false' for not deprecated, and 'true' for deprecated and store the warning message in 'Info'. The MCTargetAsmParser constructor was chaned to take an extra argument of the MCInstrInfo class, so out-of-tree targets will need to be changed. llvm-svn: 190598
-
Elena Demikhovsky authored
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}. llvm-svn: 190595
-
Hal Finkel authored
Aggressive anti-dependency breaking is enabled by default for all PPC cores. This provides a general speedup on the P7 and other platforms (among other factors, the instruction group formation for the non-embedded PPC cores is done during post-RA scheduling). In order to do this safely, the incompatibility between uses of the MFOCRF instruction and anti-dependency breaking are resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed FIXME, the problem was that MFOCRF's output is sensitive to the identify of the source register, and always paired with a shift to undo this effect. Because anti-dependency breaking is unaware of this hidden dependency of the shift amount on the source register of the MFOCRF instruction, changing that register must be inhibited. Two test cases were adjusted: The SjLj test was made more insensitive to register choices and scheduling; the saveCR test disabled anti-dependency breaking because part of what it is testing is proper register reuse. llvm-svn: 190587
-
Tom Stellard authored
For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist. The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take a resource descriptor might be nicer. The maximum number of input SGPRs is bumped to 17. Signed-off-by:
Marek Olšák <marek.olsak@amd.com> Reviewed-by:
Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190575
-
Tom Stellard authored
This fixes some regressions in the piglit local memory store tests introduced by recent commits which made the scheduler aware of the trans slot. It's not possible to test this using lit, because there is no way to determine from the assembly dumps whether or not an instruction is in the trans slot. Even if this were possible, the test would be highly sensitive to changes in the scheduler and might generate confusing false negatives. Reviewed-by: Vincent Lejeune<vljn at ovi.com> llvm-svn: 190574
-
Hal Finkel authored
As Andy pointed out to me a long time ago, there are no structural hazards in the later pipeline stages of the A2, and so modeling them is useless. Also, modeling the top pre-dispatch stages is deceiving because, when multiple hardware threads are active, those resources are shared among the threads. The bypass definitions were mostly wrong, and so those have been removed. The resulting itinerary is much simpler, and more accurate. llvm-svn: 190562
-
Hal Finkel authored
For embedded PPC cores (especially the A2 core), using the MI scheduler with AA is far superior to the other scheduling options. llvm-svn: 190558
-
- Sep 11, 2013
-
-
Bill Wendling authored
llvm-svn: 190551
-