- Aug 04, 2014
-
-
Saleem Abdulrasool authored
GCC 4.8.2 points out the ambiguity in evaluation of the assertion condition: lib/Target/X86/X86FloatingPoint.cpp:949:49: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses] assert(STReturns == 0 || isMask_32(STReturns) && N <= 2); llvm-svn: 214672
-
Saleem Abdulrasool authored
GCC 4.8.2 objects to the tautological condition in the assert as the unsigned value is guaranteed to be >= 0. Simplify the assertion by dropping the tautological condition. llvm-svn: 214671
-
Sanjay Patel authored
This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation. This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon. There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too. llvm-svn: 214670
-
Gerolf Hoflehner authored
sequence - AArch64 target support This patch turns off madd/msub generation in the DAGCombiner and generates them in the MachineCombiner instead. It replaces the original code sequence with the combined sequence when it is beneficial to do so. When there is no machine model support it always generates the madd/msub instruction. This is true also when the objective is to optimize for code size: when the combined sequence is shorter is always chosen and does not get evaluated. When there is a machine model the combined instruction sequence is evaluated for critical path and resource length using machine trace metrics and the original code sequence is replaced when it is determined to be faster. rdar://16319955 llvm-svn: 214669
-
- Aug 03, 2014
-
-
Justin Bogner authored
It's a bit more obvious what's going on if we use path::filename rather than decrementing an iterator here. llvm-svn: 214668
-
Jason Molenda authored
call Target::SetArchitecture instead of modifying a reference to the target's architecture so that the target logging can show that the arch has been changed. llvm-svn: 214667
-
Gerolf Hoflehner authored
sequence - target independent framework When the DAGcombiner selects instruction sequences it could increase the critical path or resource len. For example, on arm64 there are multiply-accumulate instructions (madd, msub). If e.g. the equivalent multiply-add sequence is not on the crictial path it makes sense to select it instead of the combined, single accumulate instruction (madd/msub). The reason is that the conversion from add+mul to the madd could lengthen the critical path by the latency of the multiply. But the DAGCombiner would always combine and select the madd/msub instruction. This patch uses machine trace metrics to estimate critical path length and resource length of an original instruction sequence vs a combined instruction sequence and picks the faster code based on its estimates. This patch only commits the target independent framework that evaluates and selects code sequences. The machine instruction combiner is turned off for all targets and expected to evolve over time by gradually handling DAGCombiner pattern in the target specific code. This framework lays the groundwork for fixing rdar://16319955 llvm-svn: 214666
-
Tobias Grosser authored
There is no needed for neither 1-dimensional nor higher dimensional arrays to require positive offsets in the outermost array dimension. We originally introduced this assumption with the support for delinearizing multi-dimensional arrays. llvm-svn: 214665
-
Saleem Abdulrasool authored
This makes EmitWindowsUnwindTables a virtual function and lowers the implementation of the function to the X86WinCOFFStreamer. This method is a target specific operation. This enables making the behaviour target dependent by isolating it entirely to the target specific streamer. llvm-svn: 214664
-
Saleem Abdulrasool authored
The frame information stored in this structure is driven by the requirements for Windows NT unwinding rather than Windows 64 specifically. As a result, this type can be shared across multiple architectures (ARM, AXP, MIPS, PPC, SH). Rename this class in preparation for adding support for supporting unwinding information for Windows on ARM. Take the opportunity to constify the members as everything except the ChainedParent is read-only. This required some adjustment to the label handling. llvm-svn: 214663
-
Simon Atanasyan authored
type handling. llvm-svn: 214662
-
Matt Arsenault authored
This slipped in in r214467, so something like V_MOV_B32_e32 v0, ... is now printed with 2 spaces between the instruction name and first operand. llvm-svn: 214660
-
Johannes Doerfert authored
+ Remove the class IslGenerator which duplicates the functionality of IslExprBuilder. + Use the IslExprBuilder to create code for memory access relations. + Also handle array types during access creation. + Enable scev codegen for one of the transformed memory access tests, thus access creation without canonical induction variables available. + Update one test case to the new output. llvm-svn: 214659
-
Johannes Doerfert authored
llvm-svn: 214658
-
Johannes Doerfert authored
The updated tests use a different context than the old ones did. Other than that only their path and the code generation we use changed. llvm-svn: 214657
-
NAKAMURA Takumi authored
llvm-svn: 214656
-
Manman Ren authored
When we have a covered lookup table, make sure we don't delete PHINodes that are cached in PHIs. rdar://17887153 llvm-svn: 214642
-
- Aug 02, 2014
-
-
Simon Atanasyan authored
target independent. llvm-svn: 214641
-
Joerg Sonnenberger authored
llvm-svn: 214640
-
Joerg Sonnenberger authored
llvm-svn: 214639
-
Erik Eckstein authored
llvm-svn: 214638
-
James Molloy authored
llvm-svn: 214637
-
Joerg Sonnenberger authored
when let can do the same thing. Keep the 64bit variants as codegen-only. While they have a different register class, the encoding is the same for 32bit and 64bit mode. Having both present would otherwise confuse the disassembler. llvm-svn: 214636
-
Joerg Sonnenberger authored
llvm-svn: 214635
-
James Molloy authored
[AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available. The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers! llvm-svn: 214634
-
Tobias Grosser authored
This area of code is currently not very much tested. It will hopefully be superseeded by Yabin's GSoC project. llvm-svn: 214633
-
Tobias Grosser authored
llvm-svn: 214632
-
Chandler Carruth authored
forget to update the comment here... =/ llvm-svn: 214630
-
Chandler Carruth authored
Darwin x86 asm comment prefix designed to work around GAS on that platform. That makes the comment-matching of the test much more stable. llvm-svn: 214629
-
Chandler Carruth authored
lowering with a small addition to it and adding PSHUFB combining. There is one obvious place in the new vector shuffle lowering where we should form PSHUFBs directly: when without them we will unpack a vector of i8s across two different registers and do a potentially 4-way blend as i16s only to re-pack them into i8s afterward. This is the crazy expensive fallback path for i8 shuffles and we can just directly use pshufb here as it will always be cheaper (the unpack and pack are two instructions so even a single shuffle between them hits our three instruction limit for forming PSHUFB). However, this doesn't generate very good code in many cases, and it leaves a bunch of common patterns not using PSHUFB. So this patch also adds support for extracting a shuffle mask from PSHUFB in the X86 lowering code, and uses it to handle PSHUFBs in the recursive shuffle combining. This allows us to combine through them, combine multiple ones together, and generally produce sufficiently high quality code. Extracting the PSHUFB mask is annoyingly complex because it could be either pre-legalization or post-legalization. At least this doesn't have to deal with re-materialized constants. =] I've added decode routines to handle the different patterns that show up at this level and we dispatch through them as appropriate. The two primary test cases are updated. For the v16 test case there is still a lot of room for improvement. Since I was going through it systematically I left behind a bunch of FIXME lines that I'm hoping to turn into ALL lines by the end of this. llvm-svn: 214628
-
Chandler Carruth authored
Spotted this missed refactoring by inspection when reading code, and it doesn't changethe functionality at all. llvm-svn: 214627
-
Chandler Carruth authored
llvm-svn: 214626
-
Chandler Carruth authored
of normally binary shuffle instructions like PUNPCKL and MOVLHPS. This detects cases where a single register is used for both operands making the shuffle behave in a unary way. We detect this and adjust the mask to use the unary form which allows the existing DAG combine for shuffle instructions to actually work at all. As a consequence, this uncovered a number of obvious bugs in the existing DAG combine which are fixed. It also now canonicalizes several shuffles even with the existing lowering. These typically are trying to match the shuffle to the domain of the input where before we only really modeled them with the floating point variants. All of the cases which change to an integer shuffle here have something in the integer domain, so there are no more or fewer domain crosses here AFAICT. Technically, it might be better to go from a GPR directly to the floating point domain, but detecting floating point *outputs* despite integer inputs is a lot more code and seems unlikely to be worthwhile in practice. If folks are seeing domain-crossing regressions here though, let me know and I can hack something up to fix it. Also as a consequence, a bunch of missed opportunities to form pshufb now can be formed. Notably, splats of i8s now form pshufb. Interestingly, this improves the existing splat lowering too. We go from 3 instructions to 1. Yes, we may tie up a register, but it seems very likely to be worth it, especially if splatting the 0th byte (the common case) as then we can use a zeroed register as the mask. llvm-svn: 214625
-
Chandler Carruth authored
PSHUFB forms. This will be important to update some AVX tests when I add PSHUFB combining. llvm-svn: 214624
-
Chandler Carruth authored
so using a single helper which adds operands back onto the worklist. Several places didn't rigorously do this but a couple already did. Factoring them together and doing it rigorously is important to delete things recursively early on in the combiner and get a chance to see accurate hasOneUse values. While no existing test cases change, an upcoming patch to add DAG combining logic for PSHUFB requires this to work correctly. llvm-svn: 214623
-
Owen Anderson authored
during DAGCombine in certain circumstances. Unfortunately, the circumstances required to trigger the issue seem to require a pretty specific interaction of DAGCombines, and I haven't been able to find a testcase that reproduces on X86, ARM, or AArch64. The functionality added here is replicated in essentially every other DAG combine, so it seems pretty obviously correct. llvm-svn: 214622
-
Alexander Kornienko authored
Reviewers: pcc, klimek Reviewed By: klimek Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D4765 llvm-svn: 214621
-
NAKAMURA Takumi authored
llvm-svn: 214620
-
NAKAMURA Takumi authored
llvm-svn: 214619
-
Zachary Turner authored
It was hardcoding the value "python", which will end up at best getting a different python executable (if the user has overridden the value of PYTHON_EXECUTABLE), and at worst encountering an error (if there is no copy of python on the system path). This patch changes the script to use sys.executable so that it runs the sub-script with the same executable that it was run with. llvm-svn: 214618
-