- Jun 12, 2020
-
-
Florian Hahn authored
getOrCreateTripCount is used to generate code for the outer loop, but it requires a computable backedge-taken count. Check that in the VPlan native path. Reviewers: Ayal, gilr, rengolin, sguggill Reviewed By: sguggill Differential Revision: https://reviews.llvm.org/D81088
-
Sebastian Neubauer authored
Add G16 feature for GFX10 and support A16 and G16 in GlobalISel. Differential Revision: https://reviews.llvm.org/D76836
-
Georgii Rymar authored
`PubNames` and `PubTypes` are optional fields since D80722. They are defined as: Optional<PubSection> PubNames; Optional<PubSection> PubTypes; and initialized in the following way: IO.mapOptional("debug_pubnames", DWARF.PubNames); IO.mapOptional("debug_pubtypes", DWARF.PubTypes); The problem is that, because of an issue in `YAMLTraits.cpp`, when there are no `debug_pubnames`/`debug_pubtypes` keys in a YAML description, the fields are not initialized to `Optional::None` as the code expects, but to default `PubSection()` instances. Because of this, the `if` condition in the following code is always true: if (Obj.DWARF.PubNames) Err = DWARFYAML::emitPubSection(OS, *Obj.DWARF.PubNames, Obj.IsLittleEndian); This means `emitPubSection` is always called and writes a few values. This patch fixes the issue. I've reduced `sizeofcmds` by the size of the data previously written because of this bug. Differential revision: https://reviews.llvm.org/D81686
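The difference between an empty optional and an optional holding a default-constructed value can be sketched as follows. This is only an illustration, assuming C++17 `std::optional` as a stand-in for `llvm::Optional`; the `PubSection` struct, the `bytesEmitted` function, and the 4-byte header size are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for DWARFYAML::PubSection.
struct PubSection {
  std::vector<int> Entries;
};

// Mimics an emitter that writes a header whenever the optional is engaged.
// The 4-byte header and 1 byte per entry are assumptions for illustration.
std::size_t bytesEmitted(const std::optional<PubSection> &Section) {
  if (!Section)
    return 0; // Key absent: nothing is written.
  return 4 + Section->Entries.size();
}
```

With an empty optional nothing is emitted, while an optional that holds a default `PubSection` still produces the header bytes, which is the kind of stray output the commit message describes.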
-
Chen Zheng authored
This is a NFC patch to make convertToImmediateForm a light wrapper for converting xform and imm form instructions on PowerPC. Reviewed By: Steven.zhang Differential Revision: https://reviews.llvm.org/D80907
-
EgorBo authored
Summary: "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two) However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression: "X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj This is my first contribution to LLVM so I hope I didn't mess things up Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79369
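The equivalence behind this change can be checked with a small sketch. It covers only unsigned values and assumes C is a nonzero power of two; the function names are hypothetical and this is not the InstCombine code itself. Note that in C++ the mask form needs explicit parentheses, `(X & (C - 1)) == 0`, because `==` binds tighter than `&`.

```cpp
#include <cassert>
#include <cstdint>

// Three ways to ask "is X a multiple of C?" for a power-of-two C.
constexpr bool viaRem(std::uint32_t X, std::uint32_t C) {
  return X % C == 0;
}
constexpr bool viaMulDiv(std::uint32_t X, std::uint32_t C) {
  return X - (X / C) * C == 0;
}
constexpr bool viaMask(std::uint32_t X, std::uint32_t C) {
  return (X & (C - 1)) == 0;
}

// Compares the three forms for every X in [0, Limit).
constexpr bool formsAgree(std::uint32_t C, std::uint32_t Limit) {
  for (std::uint32_t X = 0; X < Limit; ++X)
    if (viaRem(X, C) != viaMulDiv(X, C) || viaRem(X, C) != viaMask(X, C))
      return false;
  return true;
}
```

Calling `formsAgree(8, 1024)` compares the three forms over a small range; the mask form is the cheapest because it avoids the division entirely.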
-
Jonas Devlieghere authored
Use indices into the Symbols vector instead of casting the objects in the vector and dereferencing std::vector::end(). This change is NFC modulo the Windows failure reported by llvm-clang-x86_64-expensive-checks-win. Differential revision: https://reviews.llvm.org/D81717
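As a rough illustration of the index-based approach, the sketch below computes a symbol's extent from the next symbol's address without ever dereferencing `end()`. The `Symbol` struct, the `symbolSize` function, and the `SectionEnd` parameter are hypothetical and are not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical symbol record, sorted by address in the vector.
struct Symbol {
  std::uint64_t Address;
};

// Size of the symbol at Index: distance to the next symbol, or to
// SectionEnd when Index refers to the last symbol.
std::uint64_t symbolSize(const std::vector<Symbol> &Symbols,
                         std::size_t Index, std::uint64_t SectionEnd) {
  assert(Index < Symbols.size() && "index out of range");
  std::size_t Next = Index + 1;
  // Compare indices against size() instead of touching *Symbols.end().
  std::uint64_t End =
      Next < Symbols.size() ? Symbols[Next].Address : SectionEnd;
  return End - Symbols[Index].Address;
}
```

Checked iterator implementations, such as the one used by expensive-checks builds on Windows, can reject a dereference of `end()`, whereas an index comparison against `size()` is always well defined.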
-
Kristof Beyls authored
To make sure that no barrier gets placed on the architectural execution path, each BLR x<N> instruction gets transformed to a BL __llvm_slsblr_thunk_x<N> instruction, with __llvm_slsblr_thunk_x<N> a thunk that contains __llvm_slsblr_thunk_x<N>: BR x<N> <speculation barrier> Therefore, the BLR instruction gets split into 2; one BL and one BR. This transformation results in not inserting a speculation barrier on the architectural execution path. The mitigation is off by default and can be enabled by the harden-sls-blr subtarget feature. As a linker is allowed to clobber X16 and X17 on function calls, the above code transformation would not be correct in case a linker does so when N=16 or N=17. Therefore, when the mitigation is enabled, generation of BLR x16 or BLR x17 is avoided. As BLRA* indirect calls are not produced by LLVM currently, this does not aim to implement support for those. Differential Revision: https://reviews.llvm.org/D81402
-
Yevgeny Rouban authored
Avoid division by zero in updatePredecessorProfileMetadata(). Reviewers: yamauchi Tags: #llvm Differential Revision: https://reviews.llvm.org/D81499
-
Craig Topper authored
[X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature bits from the correct 32-bit feature variable. We have three 32 bit variables containing feature bits. But our enum is a flat 96 bit space. So we need to pick which of the variables to use based on the bit value. We used to do this manually by mentioning the correct variable and subtracting an offset from the enum. But this is error prone.
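A helper of this kind might look like the sketch below. The enum values, the `queryFeature` wrapper, and the parameter names are hypothetical; only the idea of mapping a flat 96-bit index onto three 32-bit variables comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flat feature numbering: bits 0-31 live in the first
// variable, 32-63 in the second, and 64-95 in the third.
enum ExampleFeature : unsigned {
  EXAMPLE_FEATURE_LOW = 3,
  EXAMPLE_FEATURE_MID = 35,
  EXAMPLE_FEATURE_HIGH = 70,
};

bool queryFeature(std::uint32_t Features, std::uint32_t Features2,
                  std::uint32_t Features3, unsigned Bit) {
  // The lambda picks the right 32-bit word and the offset within it, so
  // callers never subtract 32 or 64 from the enum value by hand.
  auto testFeature = [&](unsigned F) {
    assert(F < 96 && "feature index outside the flat 96-bit space");
    const std::uint32_t Words[3] = {Features, Features2, Features3};
    return ((Words[F / 32] >> (F % 32)) & 1u) != 0;
  };
  return testFeature(Bit);
}
```

Centralizing the word selection in one place removes the class of mistakes where a caller names the wrong variable or forgets the offset.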
-
Vitaly Buka authored
We don't need to process parameters that are marked as byval, as we are not going to pass interesting allocas without copying. If we pass a value into a byval argument, we just handle that as a Load of the corresponding type and stop that branch of the analysis.
-
Yonghong Song authored
In the BPF instruction selection DAGToDAG transformation phase, the BPF backend had an optimization that turns a load from the read-only data section into a direct load of the values. This phase was implemented before libbpf had read-only section support and before alu32 was supported. However, it may generate an incorrect type when alu32 is enabled. The following is an example:

    -bash-4.4$ cat ~/tmp2/t.c
    struct t {
      unsigned char a;
      unsigned char b;
      unsigned char c;
    };
    extern void foo(void *);
    int test() {
      struct t v = {
        .b = 2,
      };
      foo(&v);
      return 0;
    }

The compiler will turn the local variable "v" into a read-only section. During the instruction selection phase, the compiler generates two loads from the read-only section, a 2-byte load and a 1-byte load. For example, for the 2-byte load:

    t8: i32,ch = load<(dereferenceable load 2 from `i8* getelementptr inbounds (%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1), anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64
    t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8, FrameIndex:i64<0>, undef:i64

The BPF backend changed t8 to i64 = Constant<2>, and the generated machine IR eventually became:

    t10: i64 = MOV_ri TargetConstant:i64<2>
    t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8>
    t41: i32 = OR_ri_32 t40, TargetConstant:i64<0>
    t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>, TargetConstant:i64<0>, t3

Note that t10 above is not correct. The type should be i32 and the instruction should be MOV_ri_32. The reason for the incorrect instruction selection is that BPF instruction selection generated an i64 constant instead of an i32 constant as specified in the original load instruction. This incorrect instruction sequence eventually caused the following fatal error when a COPY instruction tried to copy a 64-bit register to a 32-bit subregister:

    Impossible reg-to-reg copy
    UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42!

This patch fixes the issue by using the load result type instead of always using i64 when doing the read-only load optimization. Differential Revision: https://reviews.llvm.org/D81630
-
Cyndy Ishida authored
Summary: This completes the glue needed to support reading tbd files from nm. This includes support for slice filtering with `--arch` and a new option specifically for tbd files, `--add-inlinedinfo`, which shows the reexported libraries that are appended in the tbd file. Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson Reviewed By: JDevlieghere Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81614
-
Alina Sbirlea authored
Verify after completing all updates. Resolves PR46275.
-
Eric Christopher authored
-
Eric Christopher authored
-
Matt Arsenault authored
-
Sanjay Patel authored
-
Vitaly Buka authored
The code does not need to iterate over arguments and can get ArgNo from CallBase::getArgOperandNo.
-
Matt Arsenault authored
-
Matt Arsenault authored
This was implicitly assuming the branch instruction was the next after the pseudo. It's possible for another non-terminator instruction to be inserted between the intrinsic and the branch, so adjust the insertion point. Fixes a non-terminator after terminator verifier error (which without the verifier, manifested itself as an infinite loop in analyzeBranch much later on).
-
Kirill Naumov authored
Rename the printer class and flag; refactor; change some tests. This patch is a preparatory step for introducing a new printing pass and new functionality to the existing Annotation Writer. I plan to extend this functionality so that the tool is more useful when looking at the inlining process.
-
Fangrui Song authored
This reverts part of D81156. Accessing errs() concurrently was safe before and racy after D81156. (`errs() << 'a'` is always racy.) Accessing outs() and errs() concurrently was safe before and racy after D81156. Don't tie errs() to outs() by default, to fix the fallout. llvm-dwarfdump is single-threaded, so opting in to the tie behavior is safe.
-
Stanislav Mekhanoshin authored
BasicBlock::isLegalToHoistInto() asserts if the block does not have successors. The case is degenerate, but the assertion still needs to be avoided. https://bugs.llvm.org/show_bug.cgi?id=46280 Differential Revision: https://reviews.llvm.org/D81674
-
Craig Topper authored
The exact same #if is already inside isCpuIdSupported and causes it to return true. The definition of isCpuIdSupported isn't conditional, so we should be able to just rely on its body doing the right thing.
-
Thomas Lively authored
Summary: After their range checks were removed in 7f50c15b, br_tables started being duplicated into their predecessors by tail folding. Unfortunately, when the br_tables were in loops, this transformation introduced bad irreducible control flow, which was later expanded into even more br_tables. This commit abuses the `isNotDuplicable` property to prevent this irreducible control flow from being introduced. This change saves a few dozen bytes of code size and has a negligible effect on performance for most of the large Emscripten benchmarks, but can improve performance significantly on microbenchmarks of switches in loops. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81628
-
- Jun 11, 2020
-
-
Reid Kleckner authored
This reverts commit 101fbc01. Remove leftover debugging attribute. Update LLDB as well, which was missed before.
-
Craig Topper authored
[X86] Force VIA PadLock crypto instructions to emit a 0xF3 prefix when they encode, to match what GNU as does. The spec for these says they need 0xf3 but also mentions REP before the mnemonic. But I don't think it's fair to users to make them write REP first, and gas doesn't make them. objdump seems to disassemble with or without the prefix and just prints any 0xf3 as REP.
-
Craig Topper authored
'NP' means that the instruction is not recognized with a 66, F2 or F3 prefix. It will either #UD or decode to a different instruction. All of the cases here should fall into the #UD variety, since we should be detecting the collision with other instructions when we build the disassembler tables.
-
diggerlin authored
SUMMARY: Since AIX linkage is now handled in PPCAIXAsmPrinter::emitLinkage() (patch https://reviews.llvm.org/D75866), it no longer goes through AsmPrinter::emitLinkage(), so we clean up some AIX-related code in AsmPrinter::emitLinkage(). Reviewers: Jason liu Differential Revision: https://reviews.llvm.org/D81613
-
Petar Avramovic authored
Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16 in order to match algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16. Differential Revision: https://reviews.llvm.org/D81666
-
Mircea Trofin authored
Summary: Other derivations will all want to emit optimization remarks and, as part of that, use debug info. Additionally, drive-by const-ing. Reviewers: davidxl, dblaikie Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81507
-
Simon Pilgrim authored
Convert shift+or bool vector patterns into CONCAT_VECTORS if we know this will be lowered to KUNPCK (which requires 16+ vector elements). Fixes PR32547
-
serge-sans-paille authored
Take into account added functions, global values and attribute change. Differential Revision: https://reviews.llvm.org/D81239
-
Jay Foad authored
Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when removePredecessor calls PHINode::removeIncomingValue. Differential Revision: https://reviews.llvm.org/D80206
-
Sam Parker authored
Which triggers on valid, but not useful, IR such as an undef mask. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46276 Differential Revision: https://reviews.llvm.org/D81634
-
Jay Foad authored
Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Differential Revision: https://reviews.llvm.org/D80206
-
Jay Foad authored
Previously these functions either returned a "changed" flag or a "repeat instruction" flag, and could also modify an iterator to control which instruction would be processed next. Simplify this by always returning a "changed" flag, and handling all of the "repeat instruction" functionality by modifying the iterator. No functional change intended except in this case: // If the source and destination of the memcpy are the same, then zap it. ... where the previous code failed to process the instruction after the zapped memcpy. Differential Revision: https://reviews.llvm.org/D81540
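The convention described here, where a handler returns only a "changed" flag and steers the traversal by updating the iterator, can be sketched as follows. This is a toy model over a list of integers, with `0` standing in for a removable instruction; the function names are hypothetical and this is not the pass's actual code.

```cpp
#include <cassert>
#include <list>

// Handler: returns true if it changed the list. It always leaves It
// pointing at the next element that should be processed.
bool processOne(std::list<int> &Insts, std::list<int>::iterator &It) {
  if (*It == 0) {
    // "Zap" the element; erase() returns the following element, so the
    // instruction after the removed one is not skipped.
    It = Insts.erase(It);
    return true;
  }
  ++It;
  return false;
}

// Driver: accumulates the changed flag and never advances the iterator
// itself, because the handler is responsible for that.
bool runOnList(std::list<int> &Insts) {
  bool Changed = false;
  for (auto It = Insts.begin(); It != Insts.end();)
    Changed |= processOne(Insts, It);
  return Changed;
}
```

Because the handler owns the iterator, consecutive removable elements are all visited, which corresponds to the one behavioral difference the commit message calls out.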
-
Pavel Labath authored
Other warning messages don't have a trailing full stop.
-
Pavel Labath authored
-