- Mar 23, 2018
-
-
Tony Tye authored
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue. Differential Revision: https://reviews.llvm.org/D44696 llvm-svn: 328350
-
Tony Tye authored
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS. Differential Revision: https://reviews.llvm.org/D43736 llvm-svn: 328349
-
Zachary Turner authored
When investigating bugs in PDB generation, the first step is often to do the same link with link.exe and then compare PDBs. But comparing PDBs is hard because two completely different byte sequences can both be correct, so it hampers the investigation when you also have to spend time figuring out not just which bytes are different, but also if the difference is meaningful. This patch fixes a couple of cases related to string table emission, hash table emission, and the order in which we emit strings that makes more of our bytes the same as the bytes generated by MS PDBs. Differential Revision: https://reviews.llvm.org/D44810 llvm-svn: 328348
-
Tony Tye authored
[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG) - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use a function attribute to communicate to the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D43735 llvm-svn: 328347
-
Krzysztof Parzyszek authored
HexagonGenMux would collapse pairs of predicated transfers if it assumed that the predicated .new forms cannot be created. Turns out that generating mux is preferable in almost all cases. Introduce an option -hexagon-gen-mux-threshold that controls the minimum distance between the instruction defining the predicate and the later of the two transfers. If the distance is closer than the threshold, mux will not be generated. Set the threshold to 0 by default. llvm-svn: 328346
-
Jordan Rose authored
The nodes keep a reference back to the original document, but the document is streamed, not read all into memory at once, and the position is part of the state. If nodes are ever copied, the document position can end up being advanced more than once. This did not reveal any problems in LLVM or Clang but caught a handful over in Swift! llvm-svn: 328345
-
Krzysztof Parzyszek authored
Patch by Anand Kodnani. llvm-svn: 328344
-
Simon Pilgrim authored
llvm-svn: 328343
-
Alex Shlyapnikov authored
Summary: Porting HWASan to Linux x86-64, first of the three patches, LLVM part. The approach is similar to ARM case, trap signal is used to communicate memory tag check failure. int3 instruction is used to generate a signal, access parameters are stored in nop [eax + offset] instruction immediately following the int3 one. One notable difference is that x86-64 has to untag the pointer before use due to the lack of feature comparable to ARM's TBI (Top Byte Ignore). Reviewers: eugenis Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44699 llvm-svn: 328342
-
Ana Pazos authored
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode. In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode, adjustBBOffsetsAfter() is not calculating postOffset correctly by properly accounting for the padding that is required for the constant pool that immediately follows the jump table branch instruction. Reviewers: t.p.northover, eli.friedman Reviewed By: t.p.northover Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44709 llvm-svn: 328341
-
Andrea Di Biagio authored
This should have been part of r328335. I forgot to svn add these files. llvm-svn: 328340
-
Krzysztof Parzyszek authored
- Fix checking for vector predicate registers. - Avoid speculating llvm.lifetime.end intrinsic. Patch by Harsha Jagasia and Brendon Cahoon. llvm-svn: 328339
-
Simon Pilgrim authored
Add missing non-VEX and (V)PMOVMSKB instructions to the pattern llvm-svn: 328338
-
Ben Langmuir authored
This make -ivfsoverlay behave more like other fatal errors (e.g. missing -include file) by skipping the missing file instead of bailing out of the whole compilation. This makes it possible for libclang to still provide some functionallity as well as to correctly produce the fatal error diagnostic (previously we lost the diagnostic in libclang since there was no TU to tie it to). rdar://33385423 llvm-svn: 328337
-
Andrew Kaylor authored
Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336
-
Andrea Di Biagio authored
This is done in preparation for the fix for PR36874. The number of cycles consumed for each pipe is now a double quantity. This allows reuse of the resource pressure view to print out instruction tables. llvm-svn: 328335
-
Fangrui Song authored
llvm-svn: 328334
-
Krzysztof Parzyszek authored
When converting an instruction to the wider version, copy any subregisters if the original operand has a subregister. Patch by Brendon Cahoon. llvm-svn: 328333
-
Rafael Espindola authored
When looking for the output section and the output offset the expectation was that the caller had looked at Repl. That works fine for InputSections, but in the case of MergeInputSections the caller doesn't have the section that is actually replaced. The original testcase was failing because getOutputSection was returning null. The slightly extended testcase also checks that getOffset also checks Repl. I will send a refactoring separetelly. llvm-svn: 328332
-
Simon Pilgrim authored
llvm-svn: 328331
-
Sanjay Patel authored
llvm-svn: 328329
-
Simon Pilgrim authored
llvm-svn: 328328
-
Sanjay Patel authored
llvm-svn: 328327
-
Zaara Syeda authored
This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions, however when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved. Differential Revision: https://reviews.llvm.org/D40196 llvm-svn: 328326
-
Sanjay Patel authored
llvm-svn: 328325
-
Simon Pilgrim authored
llvm-svn: 328324
-
Sanjay Patel authored
llvm-svn: 328323
-
Sanjay Patel authored
llvm-svn: 328322
-
John Brawn authored
Loads and stores can only shift the offset register by the size of the value being loaded, but currently the DAGCombiner will reduce the width of the load if it's followed by a trunc making it impossible to later combine the shift. Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and make it prevent the width reduction if this is what would happen, though do allow it if reducing the load width will let us eliminate a later sign or zero extend. Differential Revision: https://reviews.llvm.org/D44794 llvm-svn: 328321
-
Simon Pilgrim authored
This was due to a misunderstanding over what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect num macro ops. llvm-svn: 328320
-
George Rimar authored
llvm-svn: 328319
-
Simon Pilgrim authored
[X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units Fixes throughput to match Agner/Fam16h-SoG as well. llvm-svn: 328318
-
Daniel Neilson authored
Summary: This change is part of step six in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. At this point all users of the IRBuilder APIs for creating a memcpy/memmove call given a single value for alignment have been updated. We want to discourage usage of these old APIs in favour of the newer ones that allow for separate source and destination alignments, so this patch deletes the old API. Specifically, we remove from IRBuilder: CallInst *CreateMemCpy(Value *Dst, Value *Src, uint64_t Size, unsigned Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *TBAAStructTag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr) CallInst *CreateMemCpy(Value *Dst, Value *Src, Value *Size, unsigned Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *TBAAStructTag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr) CallInst *CreateMemMove(Value *Dst, Value *Src, uint64_t Size, unsigned Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr) CallInst *CreateMemMove(Value *Dst, Value *Src, Value *Size, unsigned Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr) Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421, rL328097 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328317
-
Matthew Simpson authored
When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model. Differential Revision: https://reviews.llvm.org/D44742 llvm-svn: 328316
-
Daniel Neilson authored
Summary: This change is part of step six in the series of changes to remove the alignment argument from memcpy/memmove/memset in favour of alignment attributes. At this point all uses of MemIntrinsicInst::[get|set]Alignment() have been updated, so we now remove these methods entirely to discourage their use. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421, rL328097 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328315
-
Alexey Bataev authored
Summary: Some targets does not support labels inside debug sections, but support references in form `section+offset`. Patch adds initial support for this. Reviewers: echristo, probinson, jlebar Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D43943 llvm-svn: 328314
-
Christof Douma authored
When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into a integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313
-
Florian Hahn authored
Reverted for now, due to it causing verifier failures. llvm-svn: 328312
-
Amara Emerson authored
This was being masked because GISel is enabled by default for -O0 and the abort was disabled. Modified test to explicitly enable abort. llvm-svn: 328311
-
Matthew Simpson authored
llvm-svn: 328310
-