- Aug 09, 2018
-
-
Reid Kleckner authored
The inalloca parameter has to be the only parameter passed in memory. Changing the convention to fastcc can break that. At some point we should teach global opt how to optimize ABI attributes like inalloca and maybe byval. These attributes are mainly used to match C ABIs. They are harder for LLVM to optimize and they don't always generate the best code. Fixes PR38487 llvm-svn: 339360
-
- Jul 28, 2018
-
-
David Green authored
We now, from clang, can turn arrays of static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0}; into structs of the form @g_data = internal global <{ [8 x i16], [8 x i16] }> ... GlobalOpt will incorrectly SROA it, not realising that the access to the first element may overflow into the second. This fixes it by checking geps more thoroughly. I believe this makes the globalsra-partial.ll test case invalid as the %i value could be out of bounds. I've re-purposed it as a negative test for this case. Differential Revision: https://reviews.llvm.org/D49816 llvm-svn: 338192
-
- Jul 10, 2018
-
-
Manoj Gupta authored
Summary: Support for this option is needed for building Linux kernel. This is a very frequently requested feature by kernel developers. More details : https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: This Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this implying that null pointer dereferencing is not undefined. This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" in IR (Under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: efriedma, george.burgess.iv Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47895 llvm-svn: 336613
-
- Jun 12, 2018
-
-
Florian Hahn authored
Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This patch replaces such types with SmallPtrSet, because IMO it is slightly clearer and allows us to get rid of unnecessarily including SmallSet.h Reviewers: dblaikie, craig.topper Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D47836 llvm-svn: 334492
-
- Jun 04, 2018
-
-
David Blaikie authored
Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954
-
- May 14, 2018
-
-
Nicola Zaghen authored
The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
-
- Apr 27, 2018
-
-
Adrian Prantl authored
This patch adds support for fragment expressions TryToShrinkGlobalToBoolean() which were previously just dropped. Thanks to Reid Kleckner for providing me a reproducer! llvm-svn: 331086
-
- Mar 21, 2018
-
-
David Blaikie authored
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165
-
- Feb 28, 2018
-
-
Jonas Devlieghere authored
When the function has musttail call - its cc is fixed to be equal to the cc of the musttail callee. In such case (and in the case of the musttail callee), GlobalOpt should not change the cc to fastcc as it will break the invariant. This fixes PR36546 Patch by: Fedor Indutny (indutny) Differential revision: https://reviews.llvm.org/D43859 llvm-svn: 326376
-
- Feb 23, 2018
-
-
Eric Christopher authored
checking the alias and not the aliasee. If the alias can be interposed then we shouldn't do anything. llvm-svn: 325837
-
- Feb 22, 2018
-
-
Luke Cheeseman authored
- Fix for bug 36078. - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions. - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc. llvm-svn: 325788
-
- Feb 02, 2018
-
-
Mikael Holmen authored
Summary: When creating the debug fragments for a SRA'd variable, use the types' allocation sizes. This fixes issues where the pass would emit too small fragments, placed at the wrong offset, for padded types. An example of this is long double on x86. The type is represented using x86_fp80, which is 10 bytes, but the value is aligned to 12/16 bytes. The padding is included in the type's DW_AT_byte_size attribute; therefore, the fragments should also include that. Newer GCC releases (I tested 7.2.0) emit 12/16-byte pieces for long double. Earlier releases, e.g. GCC 5.5.0, behaved as LLVM did, i.e. by emitting a 10-byte piece, followed by an empty 2/6-byte piece for the padding. Failing to cover all `DW_AT_byte_size' bytes of a value with non-empty pieces results in the value being printed as <optimized out> by GDB. Patch by: David Stenberg Reviewers: aprantl, JDevlieghere Reviewed By: aprantl, JDevlieghere Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D42807 llvm-svn: 324066
-
- Feb 01, 2018
-
-
Amara Emerson authored
For very, very large global initializers which can be statically evaluated, the code would create vectors of temporary Constants, modifying them in place, before committing the resulting Constant aggregate to the global's initializer value. This had effectively O(n^2) complexity in the size of the global initializer and would cause memory and non-termination issues compiling some workloads. This change performs the static initializer evaluation and creation in batches, once for each global in the evaluated IR memory. The existing code is maintained as a last resort when the initializers are more complex than simple values in a large aggregate. This should theoretically by NFC, no test as the example case is massive. The existing test cases pass with this, as well as the llvm test suite. To give an example, consider the following C++ code adapted from the clang regression tests: struct S { int n = 10; int m = 2 * n; S(int a) : n(a) {} }; template<typename T> struct U { T *r = &q; T q = 42; U *p = this; }; U<S> e; The global static constructor for 'e' will need to initialize 'r' and 'p' of the outer struct, while also initializing the inner 'q' structs 'n' and 'm' members. This batch algorithm will simply use general CommitValueTo() method to handle the complex nested S struct initialization of 'q', before processing the outermost members in a single batch. Using CommitValueTo() to handle member in the outer struct is inefficient when the struct/array is very large as we end up creating and destroy constant arrays for each initialization. For the above case, we expect the following IR to be generated: %struct.U = type { %struct.S*, %struct.S, %struct.U* } %struct.S = type { i32, i32 } @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e, i64 0, i32 1), %struct.S { i32 42, i32 84 }, %struct.U* @e } The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex constant expression, while the other two elements of @e are "simple". Differential Revision: https://reviews.llvm.org/D42612 llvm-svn: 323933
-
- Jan 30, 2018
-
-
Zaara Syeda authored
candidates with coldcc attribute. This recommits r322721 reverted due to sanitizer memory leak build bot failures. Original commit message: This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 323778
-
- Jan 25, 2018
-
-
Mikael Holmen authored
Summary: When creating the debug fragments for a SRA'd struct, use the fields' offsets, taken from the struct layout, as the offsets for the resulting fragments. This fixes an issue where GlobalOpt would emit fragments with incorrect offsets for padded fields. This should solve PR36016. Patch by David Stenberg. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42489 llvm-svn: 323411
-
- Jan 17, 2018
-
-
Zaara Syeda authored
Failing build bots. Revert the commit now. llvm-svn: 322748
-
Zaara Syeda authored
candidates with coldcc attribute. This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions. Differential Revision: https://reviews.llvm.org/D38413 llvm-svn: 322721
-
- Jan 11, 2018
-
-
Rafael Espindola authored
While updating clang tests for having clang set dso_local I noticed that: - There are *a lot* of tests to update. - Many of the updates are redundant. They are redundant because a GV is "obviously dso_local". This patch starts formalizing that a bit by requiring that internal and private GVs be dso_local too. Since they all are, we don't have to print dso_local to the textual representation, making it a bit more compact and easier to read. llvm-svn: 322317
-
- Nov 07, 2017
-
-
Adrian Prantl authored
We can't safely split arithmetic into multiple fragments because we can't express carry-over between fragments. llvm-svn: 317534
-
- Oct 11, 2017
-
-
Eugene Zelenko authored
[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315383
-
- Sep 21, 2017
-
-
Strahinja Petrovic authored
This patch contains fix for reverted commit rL312318 which was causing failure due to use of unchecked dyn_cast to CIInit. Patch by: Nikola Prica. llvm-svn: 313870
-
- Sep 08, 2017
-
-
Richard Trieu authored
r312318 - Debug info for variables whose type is shrinked to bool r312325, r312424, r312489 - Test case for r312318 Revision 312318 introduced a null dereference bug. Details in https://bugs.llvm.org/show_bug.cgi?id=34490 llvm-svn: 312758
-
- Sep 01, 2017
-
-
Strahinja Petrovic authored
This patch provides such debug information for integer variables whose type is shrinked to bool by providing dwarf expression which returns either constant initial value or other value. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D35994 llvm-svn: 312318
-
- Aug 31, 2017
-
-
Adrian Prantl authored
Fixes PR34390. https://bugs.llvm.org/show_bug.cgi?id=34390 llvm-svn: 312196
-
- Aug 30, 2017
-
-
Adrian Prantl authored
NFC llvm-svn: 312165
-
- Aug 09, 2017
-
-
Davide Italiano authored
llvm-svn: 310453
-
- Aug 04, 2017
-
-
Victor Leschuk authored
llvm-svn: 310021
-
Victor Leschuk authored
llvm-svn: 310020
-
Adrian Prantl authored
This is similar to what we are doing in "regular" SROA and creates DW_OP_LLVM_fragment operations to describe the resulting variables. rdar://problem/33654891 llvm-svn: 310014
-
- Jul 13, 2017
-
-
Davide Italiano authored
This commit reapplies r307215 now that we found out and fixed the cause of the cfi test failure (in r307871). llvm-svn: 307920
-
- Jul 12, 2017
-
-
Davide Italiano authored
[GlobalOpt] Remove unreachable blocks before optimizing a function. While the change is presumably correct, it exposes a latent bug in DI which breaks on of the CFI checks. I'll analyze it further and try to understand what's going on. llvm-svn: 307729
-
Konstantin Zhuravlyov authored
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
-
- Jul 06, 2017
-
-
Davide Italiano authored
LLVM's definition of dominance allows instructions that are cyclic in unreachable blocks, e.g.: %pat = select i1 %condition, @global, i16* %pat because any instruction dominates an instruction in a block that's not reachable from entry. So, remove unreachable blocks from the function, because a) there's no point in analyzing them and b) GlobalOpt should otherwise grow some more complicated logic to break these cycles. Differential Revision: https://reviews.llvm.org/D35028 llvm-svn: 307215
-
- May 01, 2017
-
-
Sanjoy Das authored
This relands r301424. llvm-svn: 301812
-
- Apr 27, 2017
-
-
Eli Friedman authored
Just calling dropAllReferences leaves pointers to the ConstantExpr behind, so we would eventually crash with a null pointer dereference. Differential Revision: https://reviews.llvm.org/D32551 llvm-svn: 301575
-
- Apr 26, 2017
-
-
Sanjoy Das authored
Commits were: "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts" "Add a new WeakVH value handle; NFC" "Rename WeakVH to WeakTrackingVH; NFC" The changes assumed pointers are 8 byte aligned on all architectures. llvm-svn: 301429
-
Sanjoy Das authored
Summary: I plan to use WeakVH to mean "nulls itself out on deletion, but does not track RAUW" in a subsequent commit. Reviewers: dblaikie, davide Reviewed By: davide Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D32266 llvm-svn: 301424
-
- Apr 20, 2017
-
-
Reid Kleckner authored
llvm-svn: 300787
-
- Apr 11, 2017
-
-
Matt Arsenault authored
LLVM makes several assumptions about address space 0. However, alloca is presently constrained to always return this address space. There's no real way to avoid using alloca, so without this there is no way to opt out of these assumptions. The problematic assumptions include: - That the pointer size used for the stack is the same size as the code size pointer, which is also the maximum sized pointer. - That 0 is an invalid, non-dereferencable pointer value. These are problems for AMDGPU because alloca is used to implement the private address space, which uses a 32-bit index as the pointer value. Other pointers are 64-bit and behave more like LLVM's notion of generic address space. By changing the address space used for allocas, we can change our generic pointer type to be LLVM's generic pointer type which does have similar properties. llvm-svn: 299888
-
- Mar 21, 2017
-
-
Reid Kleckner authored
Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393
-