- Nov 17, 2017
-
-
David Blaikie authored
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
-
Craig Topper authored
The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal. It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64. I'll work on trying to recover some of the test cases by removing sign extends in the backend when its safe to do so with an understanding of the current legalizer capabilities. This should fix PR30690. llvm-svn: 318466
-
- Nov 16, 2017
-
-
Vedant Kumar authored
This reverts commit r318448. It looks like some of the asserts need to be weakened. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/16296 llvm-svn: 318455
-
Craig Topper authored
llvm-svn: 318450
-
Vedant Kumar authored
TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize). Both functions should be doing the exact same thing. This patch consolidates the logic into one place. Differential Revision: https://reviews.llvm.org/D40104 llvm-svn: 318448
-
Yaxun Liu authored
SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is incorrect for triple amdgcn---amdgiz and causes isel failure. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40095 llvm-svn: 318392
-
Sam Parker authored
Change the calculation for the desired ValueType for non-sign extending loads, as in those cases we don't care about the higher bits. This creates a smaller ExtVT and allows for such combinations as: (srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1]) Differential Revision: https://reviews.llvm.org/D40034 llvm-svn: 318390
-
Benjamin Kramer authored
The LatencyPriorityQueue doesn't currently check whether the SU being removed really exists in the Queue. This method fails quietly when SU is not found and removes the last element from the Queue, leading to unexpected behavior. Unfortunately, this only occurs on our custom target, with the custom scheduler. In our case, when remove() is invoked, it removes the wrong SU at the end of the Queue, which is only discovered later when VerifyScheduledDAG() is invoked and finds that some nodes were not scheduled at all. As this is only reproducible with a lot of proprietary code, I'm hopeful this assert is straightforward enough to not necessitate a test. Patch by Ondrej Glasnak! Differential Revision: https://reviews.llvm.org/D40084 llvm-svn: 318387
-
Mikael Holmen authored
Summary: Use use_nodbg_empty() rather than use_empty() in MachineRegisterInfo::EmitLiveInCopies() when determining if a livein register has any uses or not. Otherwise a single dbg.value can make us generate different code, meaning -g would affect code generation. Found when compiling code for my out-of-tree target. Unfortunately I haven't been able to reproduce the problem on X86 or any of the other in-tree targets that I tried, so no test case. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39044 llvm-svn: 318382
-
Craig Topper authored
[SelectionDAG] Use report_fatal_error instead of llvm_unreachable in some code that can be reached if targets don't configure things correctly. For example, this is currently reachable by X86 if you use a masked store intrinsic with a v1iX type. Using a fatal error seems like a better user experience if someone were to encounter this on a release build. There are several other similar places that have been converted from unreachable to fatal error previously. llvm-svn: 318379
-
Yaxun Liu authored
processDbgDeclares assumes pointer size is the same for different addr spaces. It uses pointer size for addr space 0 for all pointers, which causes assertion in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz since pointer in addr space 5 has different size than in addr space 0. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40085 llvm-svn: 318370
-
Daniel Sanders authored
Summary: This patch adds a LLVM_ENABLE_GISEL_COV which, like LLVM_ENABLE_DAGISEL_COV, causes TableGen to instrument the generated table to collect rule coverage information. However, LLVM_ENABLE_GISEL_COV goes a bit further than LLVM_ENABLE_DAGISEL_COV. The information is written to files (${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will read this information and use it to emit warnings about untested rules. This technique could also be used by SelectionDAG and can be further extended to detect hot rules and give them priority over colder rules. Usage: * Enable LLVM_ENABLE_GISEL_COV in CMake * Build the compiler and run some tests * cat gisel-coverage-[0-9]* > gisel-coverage-all * Delete lib/Target/*/*GenGlobalISel.inc* * Build the compiler Known issues: * ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual step due to a lack of a portable 'cat' command. It should be the concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files. * There's no mechanism to discard coverage information when the ruleset changes Depends on D39742 Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka Reviewed By: rovka Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D39747 llvm-svn: 318356
-
Rong Xu authored
Due to integer precision, we might have numerator greater than denominator in the branch probability scaling. Add a check to prevent this from happening. llvm-svn: 318353
-
Aditya Nandakumar authored
[GISel][NFC]: Move getOpcodeDef from the LegalizationArtifactCombiner into GlobalISel/Utils for use elsewhere llvm-svn: 318350
-
- Nov 15, 2017
-
-
Jonas Devlieghere authored
In constructAbstractSubprogramScopeDIE there can be a potential mismatch between `this` and the CU of ContextDIE when a scope is shared between two DISubprograms belonging to a different CU. In that case, `this` is the CU that was specified in the IR, but the CU of ContextDIE is that of the first subprogram that was emitted. This patch fixes the mismatch by looking up the CU of ContextDIE, and switching to use that. This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212) Patch by Philip Craig! Differential revision: https://reviews.llvm.org/D39981 llvm-svn: 318289
-
Fangrui Song authored
Differential Revision: https://reviews.llvm.org/D40005 llvm-svn: 318272
-
- Nov 14, 2017
-
-
Aditya Nandakumar authored
artifacts along with DCE Legalization Artifacts are all those insts that are there to make the type system happy. Currently, the target needs to say all combinations of extends and truncs are legal and there's no way of verifying that post legalization, we only have *truly* legal instructions. This patch changes roughly the legalization algorithm to process all illegal insts at one go, and then process all truncs/extends that were added to satisfy the type constraints separately trying to combine trivial cases until they converge. This has the added benefit that, the target legalizerinfo can only say which truncs and extends are okay and the artifact combiner would combine away other exts and truncs. Updated legalization algorithm to roughly the following pseudo code. WorkList Insts, Artifacts; collect_all_insts_and_artifacts(Insts, Artifacts); do { for (Inst in Insts) legalizeInstrStep(Inst, Insts, Artifacts); for (Artifact in Artifacts) tryCombineArtifact(Artifact, Insts, Artifacts); } while(!Insts.empty()); Also, wrote a simple wrapper equivalent to SetVector, except for erasing, it avoids moving all elements over by one and instead just nulls them out. llvm-svn: 318210
-
Rong Xu authored
This patch peels off the top case in switch statement into a branch if the probability exceeds a threshold. This will help the branch prediction and avoids the extra compares when lowering into chain of branches. Differential Revision: http://reviews.llvm.org/D39262 llvm-svn: 318202
-
Hans Wennborg authored
Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195
-
Easwaran Raman authored
Summary: Bypass of slow divs based on operand values is currently disabled for -Os. Do the same when profile summary is available and the working set size of the application is huge. This is similar to how loop peeling is guarded by hasHugeWorkingSetSize. In the div bypass case, the generated extra code (and the extra branch) tendss to outweigh the benefits of the bypass. This results in noticeable performance improvement on an internal application. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39992 llvm-svn: 318179
-
Yaxun Liu authored
TargetLowering::LowerCallTo assumes that sret value type corresponds to a pointer in default address space, which is incorrect, since sret value type should correspond to a pointer in alloca address space, which may not be the default address space. This causes assertion for amdgcn target in amdgiz environment. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39996 llvm-svn: 318167
-
Sam Clegg authored
For now at least. We clearly need some kind of comdat or linkonce_odr support for wasm but currently COMDAT is not supported. Disable COMDAT support in the same way we do the Mach-O. This also causes clang not to generated COMDATs. Differential Revision: https://reviews.llvm.org/D39873 llvm-svn: 318123
-
- Nov 13, 2017
-
-
Adrian Prantl authored
when transferring debug info describing the lower bits of an extended SDNode. rdar://problem/35504722 llvm-svn: 318086
-
Simon Dardis authored
This reverts commit r318032. The test broke some sanitizer bots. llvm-svn: 318049
-
Simon Dardis authored
CodeGenPrepare sinks address computations from one basic block to another and attempts to reuse address computations that have already been sunk. If the same address computation appears twice with the first instance as an operand of a load whose result is an operand to a simplifable select, CodeGenPrepare simplifies the select and recursively erases the now dead instructions. CodeGenPrepare then attempts to use the erased address computation for the second load. Fix this by erasing the cached address value if it has zero uses before looking for the address value in the sunken address map. This partially resolves PR35209. Thanks to Alexander Richardson for reporting the issue! Reviewers: john.brawn Differential Revision: https://reviews.llvm.org/D39841 llvm-svn: 318032
-
Matt Arsenault authored
llvm-svn: 318020
-
- Nov 11, 2017
-
-
Daniel Sanders authored
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to correct situations where SelectionDAG and GlobalISel disagree on representation. For example, it would rewrite: (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16> to: (sext:i32 (load:i16 $ptr)<<unindexedload>>) I'd have preferred to replace the fragments and have the expansion happen naturally as part of PatFrag expansion but the type inferencing system can't cope with loads of types narrower than those mentioned in register classes. This is because the SDTCisInt's on the sext constrain both the result and operand to the 'legal' integer types (where legal is defined as 'a register class can contain the type') which immediately rules the narrower types out. Several targets (those with only one legal integer type) would then go on to crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types for the result of the extend. Also, improve isObviouslySafeToFold() slightly to automatically return true for neighbouring instructions. There can't be any re-ordering problems if re-ordering isn't happenning. We'll need to improve it further to handle sign/zero-extending loads when the extend and load aren't immediate neighbours though. llvm-svn: 317971
-
Craig Topper authored
[SelectionDAG] Make getUniformBase in SelectionDAGBuilder fail if any of the middle GEP indices are non-constant. This is a fix for a bug in r317947. We were supposed to check that all the indices are are constant 0, but instead we're only make sure that indices that are constant are 0. Non-constant indices are being ignored. llvm-svn: 317950
-
- Nov 10, 2017
-
-
Craig Topper authored
[SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows. This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted. We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example. Differential Revision: https://reviews.llvm.org/D39911 llvm-svn: 317947
-
Amaury Sechet authored
[DAGcombine] Do not replace truncate node by itself when doing constant folding, this trigger needless extra rounds of combine for nothing. NFC llvm-svn: 317926
-
Jatin Bhateja authored
Summary: Fixes PR35220 Reviewers: vadimcn, alexcrichton Reviewed By: alexcrichton Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D39866 llvm-svn: 317895
-
Alexander Timofeev authored
Differential revision: https://reviews.llvm.org/D38754 llvm-svn: 317884
-
Karl-Johan Karlsson authored
Summary: The associated debug value is updated when the virtual source register of a copy is completely eliminated and replaced with a rematerialize value in the defed register of the copy. As the debug value now is associated with another register it also need to be moved, otherwise the debug value isn't valid. Reviewers: aprantl Reviewed By: aprantl Subscribers: MatzeB, llvm-commits, qcolombet Differential Revision: https://reviews.llvm.org/D38024 llvm-svn: 317880
-
Jonas Paulsson authored
* The method getRegAllocationHints() is now of bool type instead of void. If true is returned, regalloc (AllocationOrder) will *only* try to allocate the hints, as opposed to merely trying them before non-hinted registers. * TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with an increase in number of LOCRs. In this case, it is desired to force the hints even though there is a slight increase in spilling, because if a non-hinted register would be allocated, the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR (Load On Condition) SystemZ instruction must have both operands in either the low or high part of the 64 bit register. Reviewers: Quentin Colombet and Ulrich Weigand https://reviews.llvm.org/D36795 llvm-svn: 317879
-
- Nov 09, 2017
-
-
Adrian Prantl authored
rdar://problem/27139077 llvm-svn: 317825
-
Mandeep Singh Grang authored
Summary: This fixes failure in CodeGen/AArch64/global-merge-group-by-use.ll uncovered by D39245. Reviewers: ab, asl Reviewed By: ab Subscribers: aemerson, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D39635 llvm-svn: 317817
-
Andrew V. Tischenko authored
Differential Revision: https://reviews.llvm.org/D39728 llvm-svn: 317782
-
- Nov 08, 2017
-
-
Adrian Prantl authored
In Rust, a trait can be implemented for any type, and if a trait object pointer is used for the type, then a virtual table will be emitted for that trait/type combination. We would like debuggers to be able to inspect trait objects, which requires finding the concrete type associated with a given vtable. This patch changes LLVM so that any type can be passed to replaceVTableHolder. This allows the Rust compiler to emit the needed debug info -- associating a vtable with the concrete type for which it was emitted. This is a DWARF extension: DWARF only specifies the meaning of DW_AT_containing_type in one specific situation. This style of DWARF extension is routine, though, and LLVM already has one such case for DW_AT_containing_type. Patch by Tom Tromey! Differential Revision: https://reviews.llvm.org/D39503 llvm-svn: 317730
-
Dan Gohman authored
This patch implements Chandler's idea [0] for supporting languages that require support for infinite loops with side effects, such as Rust, providing part of a solution to bug 965 [1]. Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual effect, but which appears to optimization passes to have obscure side effects, such that they don't optimize away loops containing it. It also teaches several optimization passes to ignore this intrinsic, so that it doesn't significantly impact optimization in most cases. As discussed on llvm-dev [2], this patch is the first of two major parts. The second part, to change LLVM's semantics to have defined behavior on infinite loops by default, with a function attribute for opting into potential-undefined-behavior, will be implemented and posted for review in a separate patch. [0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html [1] https://bugs.llvm.org/show_bug.cgi?id=965 [2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html Differential Revision: https://reviews.llvm.org/D38336 llvm-svn: 317729
-
Reid Kleckner authored
This reverts r317579, originally committed as r317100. There is a design issue with marking CFI instructions duplicatable. Not all targets support the CFIInstrInserter pass, and targets like Darwin can't cope with duplicated prologue setup CFI instructions. The compact unwind info emission fails. When the following code is compiled for arm64 on Mac at -O3, the CFI instructions end up getting tail duplicated, which causes compact unwind info emission to fail: int a, c, d, e, f, g, h, i, j, k, l, m; void n(int o, int *b) { if (g) f = 0; for (; f < o; f++) { m = a; if (l > j * k > i) j = i = k = d; h = b[c] - e; } } We get assembly that looks like this: ; BB#1: ; %if.then Lloh3: adrp x9, _f@GOTPAGE Lloh4: ldr x9, [x9, _f@GOTPAGEOFF] mov w8, wzr Lloh5: str wzr, [x9] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.lt LBB0_3 b LBB0_7 LBB0_2: ; %entry.if.end_crit_edge Lloh6: adrp x8, _f@GOTPAGE Lloh7: ldr x8, [x8, _f@GOTPAGEOFF] Lloh8: ldr w8, [x8] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.ge LBB0_7 LBB0_3: ; %for.body.lr.ph Note the multiple .cfi_def* directives. Compact unwind info emission can't handle that. llvm-svn: 317726
-