- Jul 11, 2018
-
-
Florian Hahn authored
The original version caused a memsan failure. llvm-svn: 336792
-
Sander de Smalen authored
The compact instruction shuffles active elements of vector into lowest numbered elements and sets remaining elements to zero. e.g. compact z0.s, p0, z1.s llvm-svn: 336789
-
Simon Pilgrim authored
llvm-svn: 336787
-
Simon Pilgrim authored
Will make codegen diffs much easier to grok in a future patch llvm-svn: 336786
-
Roman Lebedev authored
The @masked_and_notA_slightly_optimized and @masked_or_A will break when PR38123 will be fixed: https://rise4fun.com/Alive/Rny Clearly, they aren't optimized currently. https://rise4fun.com/Alive/ERo llvm-svn: 336784
-
Sander de Smalen authored
The LASTB and LASTA instructions extract the last active element, or element after the last active, from the source vector. The added variants are: Scalar: last(a|b) w0, p0, z0.b last(a|b) w0, p0, z0.h last(a|b) w0, p0, z0.s last(a|b) x0, p0, z0.d SIMD & FP Scalar: last(a|b) b0, p0, z0.b last(a|b) h0, p0, z0.h last(a|b) s0, p0, z0.s last(a|b) d0, p0, z0.d The CLASTB and CLASTA conditionally extract the last or element after the last active element from the source vector. The added variants are: Scalar: clast(a|b) w0, p0, w0, z0.b clast(a|b) w0, p0, w0, z0.h clast(a|b) w0, p0, w0, z0.s clast(a|b) x0, p0, x0, z0.d SIMD & FP Scalar: clast(a|b) b0, p0, b0, z0.b clast(a|b) h0, p0, h0, z0.h clast(a|b) s0, p0, s0, z0.s clast(a|b) d0, p0, d0, z0.d Vector: clast(a|b) z0.b, p0, z0.b, z1.b clast(a|b) z0.h, p0, z0.h, z1.h clast(a|b) z0.s, p0, z0.s, z1.s clast(a|b) z0.d, p0, z0.d, z1.d Please refer to the architecture specification for more details on the semantics of the added instructions. llvm-svn: 336783
-
Paul Semel authored
Differential Revision: https://reviews.llvm.org/D48281 llvm-svn: 336782
-
Roman Lebedev authored
update_test_checks will drop it anyway, creating noise.. llvm-svn: 336781
-
Roman Lebedev authored
llvm-svn: 336780
-
Simon Pilgrim authored
This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants). llvm-svn: 336779
-
Andrea Di Biagio authored
llvm-mca doesn't know that on modern AMD processors, portions of a general purpose register are not treated independently. So, a partial register write has a false dependency on the super-register. The issue with partial register writes will be addressed by a follow-up patch. llvm-svn: 336778
-
Simon Atanasyan authored
llvm-svn: 336777
-
Simon Pilgrim authored
First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic. We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit. Differential Revision: https://reviews.llvm.org/D48975 llvm-svn: 336774
-
Simon Pilgrim authored
llvm-svn: 336773
-
Simon Tatham authored
gcc 4.7 seems to disagree with gcc 5.3 about whether you need to say 'return std::move(thing)' instead of just 'return thing'. All the json::Arrays and json::Objects that I was implicitly turning into json::Values by returning them from functions now have explicit std::move wrappers, so hopefully 4.7 will be happy now. llvm-svn: 336772
-
Simon Tatham authored
The aim of this backend is to output everything TableGen knows about the record set, similarly to the default -print-records backend. But where -print-records produces output in TableGen's input syntax (convenient for humans to read), this backend produces it as structured JSON data, which is convenient for loading into standard scripting languages such as Python, in order to extract information from the data set in an automated way. The output data contains a JSON representation of the variable definitions in output 'def' records, and a few pieces of metadata such as which of those definitions are tagged with the 'field' prefix and which defs are derived from which classes. It doesn't dump out absolutely every piece of knowledge it _could_ produce, such as type information and complicated arithmetic operator nodes in abstract superclasses; the main aim is to allow consumers of this JSON dump to essentially act as new backends, and backends don't generally need to depend on that kind of data. The new backend is implemented as an EmitJSON() function similar to all of llvm-tblgen's other EmitFoo functions, except that it lives in lib/TableGen instead of utils/TableGen on the basis that I'm expecting to add it to clang-tblgen too in a future patch. To test it, I've written a Python script that loads the JSON output and tests properties of it based on comments in the .td source - more or less like FileCheck, except that the CHECK: lines have Python expressions after them instead of textual pattern matches. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arichardson, labath, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D46054 llvm-svn: 336771
-
Eric Liu authored
This fixes compile error in r336759. llvm::value::dump is not available in released build. llvm-svn: 336770
-
Craig Topper authored
Summary: These changes cover the PR#31399. Now the ffs(x) function is lowered to (x != 0) ? llvm.cttz(x) + 1 : 0 and it corresponds to the following llvm code: %cnt = tail call i32 @llvm.cttz.i32(i32 %v, i1 true) %tobool = icmp eq i32 %v, 0 %.op = add nuw nsw i32 %cnt, 1 %add = select i1 %tobool, i32 0, i32 %.op and x86 asm code: bsfl %edi, %ecx addl $1, %ecx testl %edi, %edi movl $0, %eax cmovnel %ecx, %eax In this case the 'test' instruction can't be eliminated because the 'add' instruction modifies the EFLAGS, namely, ZF flag that is set by the 'bsf' instruction when 'x' is zero. We now produce the following code: bsfl %edi, %ecx movl $-1, %eax cmovnel %ecx, %eax addl $1, %eax Patch by Ivan Kulagin Reviewers: davide, craig.topper, spatel, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48765 llvm-svn: 336768
-
Lang Hames authored
This patch broke a few buildbots. I will investigate and re-apply when I have a fix. llvm-svn: 336767
-
Craig Topper authored
These patterns looked for a MOVSS/SD followed by a scalar_to_vector. Or a scalar_to_vector followed by a load. In both cases we emitted a MOVSS/SD for the MOVSS/SD part, a REG_CLASS for the scalar_to_vector, and a MOVSS/SD for the load. But we have patterns that do each of those 3 things individually so there's no reason to build large patterns. Most of the test changes are just reorderings. The one test that had a meaningful change is pr30430.ll and it appears to be a regression. But its doing -O0 so I think it missed a lot of opportunities and was just getting lucky before. llvm-svn: 336762
-
Lang Hames authored
There is already a VSO member V in the CoreAPIsStandardTest test fixture. llvm-svn: 336761
-
Lang Hames authored
and fix a bug that these exposed. llvm-svn: 336760
-
Sam Clegg authored
See https://bugs.llvm.org/show_bug.cgi?id=35385 Differential Revision: https://reviews.llvm.org/D48471 llvm-svn: 336759
-
Lang Hames authored
llvm-svn: 336758
-
Lang Hames authored
llvm-svn: 336757
-
Stefan Pintilie authored
Implement this as it is done on GCC: __float128 a, b, c, d; a = __builtin_fmaf128_round_to_odd (b, c, d); // generates xsmaddqpo a = __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsmsubqpo a = - __builtin_fmaf128_round_to_odd (b, c, d); // generates xsnmaddqpo a = - __builtin_fmaf128_round_to_odd (b, c, -d); // generates xsnmsubpqp Differential Revision: https://reviews.llvm.org/D48218 llvm-svn: 336754
-
Chen Zheng authored
[test cases] add test cases for find more abs pattern Differential Revision: https://reviews.llvm.org/D49123 llvm-svn: 336752
-
Craig Topper authored
llvm-svn: 336751
-
Eli Friedman authored
Let's be conservative here; it matches what we actually implemented, and it should be rare in practice anyway. Differential Revision: https://reviews.llvm.org/D49042 llvm-svn: 336744
-
Eli Friedman authored
The original code attempted to do this, but the std::abs() call didn't actually do anything due to implicit type conversions. Fix the type conversions, and perform the correct check for negative immediates. This probably has very little practical impact, but it's worth fixing just to avoid confusion in the future, I think. Differential Revision: https://reviews.llvm.org/D48907 llvm-svn: 336742
-
Lang Hames authored
symbols in another VSO). Also fixes a bug where chained aliases within a single VSO would deadlock on materialization. llvm-svn: 336741
-
George Burgess IV authored
If we don't include Initialization.h, `LLVMInitializeAggressiveInstCombiner` won't see its `extern "C"` decl. This causes sadness, name mangling, and linker errors. Reported on the mailing lists by Vladimir Vissoultchev. Thanks! llvm-svn: 336736
-
Craig Topper authored
Some added 20 and some added 15. Its unclear when to use which value and whether they are required at all. This patch removes them all. If we start finding real world issues we may need to add them back with proper tests. llvm-svn: 336735
-
Richard Trieu authored
class -> struct in forward declaration. llvm-svn: 336733
-
Craig Topper authored
[X86] Teach X86InstrInfo::commuteInstructionImpl to use MOVSD/MOVSS for BLEND under optsize when the immediate allows it. Isel currently emits movss/movsd a lot of the time and an accidental double commute turns it into a blend. Ideally we'd select blend directly in isel under optspeed and not rely on the double commute to create blend. llvm-svn: 336731
-
- Jul 10, 2018
-
-
JF Bastien authored
llvm-svn: 336730
-
Craig Topper authored
These ISD nodes try to select the MOVLPS and MOVLPD instructions which are special load only instructions. They load data and merge it into the lower 64-bits of an XMM register. They are logically equivalent to our MOVSD node plus a load. There was only one place in X86ISelLowering that used MOVLPD and no places that selected MOVLPS. The one place that selected MOVLPD had to choose between it and MOVSD based on whether there was a load. But lowering is too early to tell if the load can really be folded. So in isel we have patterns that use MOVSD for MOVLPD if we can't find a load. We also had patterns that select the MOVLPD instruction for a MOVSD if we can find a load, but didn't choose the MOVLPD ISD opcode for some reason. So it seems better to just standardize on MOVSD ISD opcode and manage MOVSD vs MOVLPD instruction with isel patterns. llvm-svn: 336728
-
Scott Linder authored
llvm-svn: 336722
-
Teresa Johnson authored
Summary: I noticed that the .imports files emitted for distributed ThinLTO backends do not have consistent ordering. This is because StringMap iteration order is not guaranteed to be deterministic. Since we already have a std::map with this information, used when emitting the individual index files (ModuleToSummariesForIndex), use it for the imports files as well. This issue is likely causing some unnecessary rebuilds of the ThinLTO backends in our distributed build system as the imports files are inputs to those backends. Reviewers: pcc, steven_wu, mehdi_amini Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D48783 llvm-svn: 336721
-
Craig Topper authored
It points to an opcode that doesn't exist. llvm-svn: 336720
-