- Jun 01, 2020
-
-
Daniel Grumberg authored
Differential Revision: https://reviews.llvm.org/D80808
-
Paula Toth authored
Summary: This is split off from D79192 and exposes APIGenerator (renamed to APIIndexer) for use in generating the integration tests. Reviewers: sivachandra Reviewed By: sivachandra Subscribers: tschuett, ecnelises, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D80832
-
Reid Kleckner authored
The inlinees section contains references to the file checksum table. The file checksum table in the PDB must have the same layout as the file checksum table in the object file, so all the existing file id references should stay valid.

Previously, we would do this:

for all inlined functions:
- look up the filename from the checksum and string tables
- make that filename absolute
- look up the new file id for that filename in the new checksum table

This led to pdbMakeAbsolute and remove_dots ending up in the hot path. We should only need to absolutify the source path once, not once every time we process an inline function from that source file.

This speeds up linking chrome PGO stage 1 net_unittests.exe from 9.203s to 8.500s (-7.6%). Looking just at time to process symbol records, it goes from ~2000ms to ~1300ms, which is consistent with the overall speedup of about 700ms.

This will be less noticeable in debug builds, which have fewer inlined function records.
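The "absolutify once per source file" idea can be sketched as a small memoizing cache. This is only an illustration under stated assumptions: `AbsolutePathCache`, `makeAbsolute`, and the `/build/` prefix are hypothetical stand-ins, not the linker's actual code, and the call counter exists only to make the caching visible.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical cache keyed by the file id found in the object file's
// checksum table. The expensive normalization runs once per file id,
// no matter how many inlinee records refer to that file.
class AbsolutePathCache {
public:
  const std::string &get(uint32_t FileId, const std::string &RawName) {
    auto It = Cache.find(FileId);
    if (It == Cache.end())
      It = Cache.emplace(FileId, makeAbsolute(RawName)).first;
    return It->second;
  }

  // Number of times the expensive step actually ran.
  unsigned expensiveCalls() const { return ExpensiveCalls; }

private:
  // Stand-in for the costly "make absolute + remove dots" step.
  // The "/build/" prefix is an assumption made for this sketch.
  std::string makeAbsolute(const std::string &Raw) {
    ++ExpensiveCalls;
    return "/build/" + Raw;
  }

  std::unordered_map<uint32_t, std::string> Cache;
  unsigned ExpensiveCalls = 0;
};
```

Calling `get` repeatedly with the same file id returns the cached string, so the cost moves out of the per-inlinee loop.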
-
Florian Hahn authored
This patch implements matrix index expressions (matrix[RowIdx][ColumnIdx]).

It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx). MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First, if the base of a subscript is of matrix type, we create an incomplete MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete MatrixSubscriptExpr, we create a complete MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx).

Similar to vector elements, it is not possible to take the address of a MatrixSubscriptExpr. For CodeGen, a new MatrixElt type is added to LValue, which is very similar to VectorElt. The only difference is that we may need to cast the type of the base from an array to a vector type when accessing it.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76791
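The two-step construction can be illustrated with a miniature AST. This is a sketch only: the `Expr` struct, `makeLeaf`, `makeSubscript`, and `actOnSubscript` below are hypothetical and far simpler than Clang's real classes and Sema entry points.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature AST; these are not Clang's classes.
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  enum Kind { MatrixRef, IndexRef, MatrixSubscript };
  Kind K;
  std::string Name;                // set for MatrixRef / IndexRef leaves
  ExprPtr Base, RowIdx, ColumnIdx; // set for MatrixSubscript nodes

  // A subscript node is incomplete until its column index is known.
  bool isIncomplete() const { return K == MatrixSubscript && !ColumnIdx; }
};

ExprPtr makeLeaf(Expr::Kind K, std::string Name) {
  auto E = std::make_shared<Expr>();
  E->K = K;
  E->Name = std::move(Name);
  return E;
}

ExprPtr makeSubscript(ExprPtr Base, ExprPtr Row, ExprPtr Col) {
  auto E = std::make_shared<Expr>();
  E->K = Expr::MatrixSubscript;
  E->Base = std::move(Base);
  E->RowIdx = std::move(Row);
  E->ColumnIdx = std::move(Col);
  return E;
}

// Called once per "[idx]" applied to Base.
ExprPtr actOnSubscript(const ExprPtr &Base, const ExprPtr &Idx) {
  // Step 1: m[i] yields an incomplete node with no column index yet.
  if (Base->K == Expr::MatrixRef)
    return makeSubscript(Base, Idx, nullptr);
  // Step 2: (m[i])[j] yields a complete node rebuilt from the incomplete one.
  if (Base->isIncomplete())
    return makeSubscript(Base->Base, Base->RowIdx, Idx);
  return nullptr; // not a matrix subscript in this sketch
}
```

Parsing `m[i][j]` therefore calls `actOnSubscript` twice: the first call produces the incomplete node, and the second replaces it with the complete one.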
-
Martin Liska authored
Remove it from target-specific scope which corresponds to sanitizer_linux.cpp where it lives in the same macro scope. Differential Revision: https://reviews.llvm.org/D80864
-
Fangrui Song authored
GNU ld from binutils 2.35 onwards will likely support --export-dynamic-symbol but with different semantics. https://sourceware.org/pipermail/binutils/2020-May/111302.html

Differences:

1. -export-dynamic-symbol is not supported
2. --export-dynamic-symbol takes a glob argument
3. --export-dynamic-symbol can suppress binding the references to the definition within the shared object if -Bsymbolic or -Bsymbolic-functions is specified
4. --export-dynamic-symbol does not imply -u

I don't think the first three points can affect any user. For the fourth point, not implying -u can leave some archive members unfetched. Add -u foo to restore the previous behavior.

Exact semantics:

* -no-pie or -pie: matched non-local defined symbols will be added to the dynamic symbol table.
* -shared: matched non-local STV_DEFAULT symbols will not be bound to definitions within the shared object even if they would otherwise be due to -Bsymbolic, -Bsymbolic-functions, or --dynamic-list.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D80487
-
Sanjay Patel authored
SimplifyDemandedVectorElts() bails out on ScalableVectorType anyway, but we can exit faster with the external check. Move this to a helper function because there are likely other vector folds that we can try here.
-
Matt Arsenault authored
In this awkward case, we have to emit custom pseudo-constrained FP wrappers. InstrEmitter concludes that since a mayRaiseFPException instruction had a chain, it can't add nofpexcept. Test deferred until mayRaiseFPException is really set on everything.
-
Vedant Kumar authored
This is per Adrian's suggestion in https://reviews.llvm.org/D80684.
-
Vedant Kumar authored
Summary: Instead of iterating over all VarLoc IDs in removeEntryValue(), just iterate over the interval reserved for entry value VarLocs. This changes the iteration order, hence the test update -- otherwise this is NFC.

This appears to give an ~8.5x wall time speed-up for LiveDebugValues when compiling sqlite3.c 3.30.1 with a Release clang (on my machine):

```
          ---User Time---   --System Time--   --User+System--   ---Wall Time---   --- Name ---
Before:   2.5402 ( 18.8%)   0.0050 (  0.4%)   2.5452 ( 17.3%)   2.5452 ( 17.3%)   Live DEBUG_VALUE analysis
After:    0.2364 (  2.1%)   0.0034 (  0.3%)   0.2399 (  2.0%)   0.2398 (  2.0%)   Live DEBUG_VALUE analysis
```

The change in removeEntryValue() is the only one that appears to affect wall time, but for consistency (and to resolve a pending TODO), I made the analogous changes for iterating over SpillLocKind VarLocs.

Reviewers: nikic, aprantl, jmorse, djtodoro

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80684
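Iterating only over a reserved interval of IDs, rather than over every ID, can be sketched with an ordered set. The ID encoding, `makeId`, and `collectBucket` below are assumptions made for illustration; the real pass uses its own index type and container.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical encoding: the high 32 bits select a bucket (for example, the
// bucket reserved for entry values) and the low 32 bits index within it.
constexpr uint64_t makeId(uint32_t Bucket, uint32_t Index) {
  return (static_cast<uint64_t>(Bucket) << 32) | Index;
}

// Visit only the IDs that fall in one bucket's interval. With an ordered
// container, locating the interval is logarithmic and the walk touches
// nothing outside it.
std::vector<uint64_t> collectBucket(const std::set<uint64_t> &Ids,
                                    uint32_t Bucket) {
  std::vector<uint64_t> Out;
  auto It = Ids.lower_bound(makeId(Bucket, 0));
  auto End = Ids.upper_bound(makeId(Bucket, UINT32_MAX));
  for (; It != End; ++It)
    Out.push_back(*It);
  return Out;
}
```

Because the walk follows the container's order inside the interval, results come back sorted by ID, which is one way an iteration-order change like the one mentioned above can arise.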
-
Matt Arsenault authored
The AMDGPU non-strict fdiv lowering needs to introduce an FP mode switch in some cases, and has custom nodes to provide chain/glue for the intermediate FP operations. We need to propagate nofpexcept here, but getNode was dropping the flags. Adding nofpexcept in the AMDGPU custom lowering is left to a future patch. Also fix a second case where flags were dropped, but in this case it seems it just didn't handle this number of operands. Test will be included in future AMDGPU patch.
-
Julian Lettner authored
This applies the learnings from [1]. What I intended as a simple cleanup made me realize that the compiler-rt version checks have two separate issues:

1) In some places (e.g., mmap flag setting) what matters is the kernel version, not the OS version.
2) OS version checks are implemented by querying the kernel version. This is not necessarily correct inside the simulators if the simulator runtime isn't aligned with the host macOS.

This commit tackles 1) by adopting a separate query function for the Darwin kernel version. 2) (and cleanups) will be dealt with in follow-ups.

[1] https://reviews.llvm.org/D78942

rdar://63031937

Reviewed By: delcypher

Differential Revision: https://reviews.llvm.org/D79965
-
Hiroshi Yamauchi authored
Summary: The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize) under the partial sample PGO may not be accurate because the profile is partial and the number of hot profile counters in the ProfileSummary may not reflect the actual working set size of the program being compiled.

To improve this, the (approximated) ratio of the number of profile counters of the program being compiled to the number of profile counters in the partial sample profile is computed (this is called the partial profile ratio). The working set size of the profile is scaled by this ratio to reflect the working set size of the program being compiled, and the scaled value is used for the working set size heuristics.

The partial profile ratio is approximated based on the number of basic blocks in the program and the NumCounts field in the ProfileSummary, and is computed through the thin LTO indexing. This means that the scaled working set size is available to the thin LTO post link passes only.

Reviewers: davidxl

Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79831
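The scaling described in this summary amounts to simple arithmetic, sketched below. The function names, the zero-count guard, and the threshold parameter are hypothetical choices for this illustration and are not taken from ProfileSummaryInfo.

```cpp
#include <cstdint>

// Approximate ratio of counters in the program being compiled to counters
// in the partial sample profile. The basic-block count stands in for the
// program's counter count, as the summary describes.
double partialProfileRatio(uint64_t NumBlocksInProgram,
                           uint64_t NumCountsInProfile) {
  if (NumCountsInProfile == 0)
    return 0.0; // guard chosen for this sketch only
  return static_cast<double>(NumBlocksInProgram) /
         static_cast<double>(NumCountsInProfile);
}

// Scale the profile's hot-counter count by the ratio before comparing it
// against a working-set threshold (the threshold is a caller-supplied
// assumption here).
bool hasHugeWorkingSet(uint64_t HotCountersInProfile, double Ratio,
                       uint64_t Threshold) {
  return static_cast<double>(HotCountersInProfile) * Ratio >
         static_cast<double>(Threshold);
}
```

For example, a profile with 20000 hot counters and a ratio of 0.5 is treated as a working set of 10000 counters, which may fall below a threshold that the unscaled value would have exceeded.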
-
Matt Arsenault authored
-
Matt Arsenault authored
-
hsmahesha authored
Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80545
-
Fangrui Song authored
DF_1_PIE originated from Solaris (https://docs.oracle.com/cd/E36784_01/html/E36857/chapter6-42444.html ). GNU ld since https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=5fe2850dd96483f176858fd75c098313d5b20bc2 sets the flag on non-Solaris platforms. It can help distinguish PIE from ET_DYN. eu-classify from elfutils uses this to recognize PIE (https://sourceware.org/git/?p=elfutils.git;a=commit;h=3f489b5c7c78df6d52f8982f79c36e9a220e8951 ). glibc uses this flag to reject dlopen'ing a PIE (https://sourceware.org/bugzilla/show_bug.cgi?id=24323 ). Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D80872
-
Stanislav Mekhanoshin authored
-
Matt Arsenault authored
The alignment value also needs to be scaled by the wave size.
-
Christopher Tetreault authored
Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall Reviewed By: fpetrogalli, rjmccall Subscribers: tschuett, rkruppe, psnobl, dmgreen, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80323
-
Nick Desaulniers authored
Summary: Forked from: https://reviews.llvm.org/D80242 Use the getter for access to DebugInfo consistently. Use break in switch in CodeGenModule::EmitTopLevelDecl consistently. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: cfe-commits, srhines Tags: #clang Differential Revision: https://reviews.llvm.org/D80840
-
Eric Schweitz authored
The lowering bridge will call these lowering hooks to process the OpenMP directives that it iterates over in the PFT. This is a mock interface without an implementation in this patch. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D80815
-
Stanislav Mekhanoshin authored
There seems to be some instability with IR naming between platforms. Attempted to fix it by replacing dot-numbered names.
-
Fangrui Song authored
This flag (and the whole field DT_FLAGS_1) originated from Solaris. I intend to use it in an LLD patch D80872. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D80871
-
Sanjay Patel authored
This is based on an example in D80658
-
Stanislav Mekhanoshin authored
Differential Revision: https://reviews.llvm.org/D79218
-
Siva Chandra Reddy authored
Reviewers: abrachet Differential Revision: https://reviews.llvm.org/D80612
-
Sam Clegg authored
simd-2.C now compiles thanks to: https://github.com/WebAssembly/wasi-libc/pull/183 Differential Revision: https://reviews.llvm.org/D80930
-
Sanjay Patel authored
As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and D80584, the name 'tmp' is almost always a bad choice, but we have a legacy of regression tests with that name because it was baked into utils/update_test_checks.py. This change makes -instnamer more consistent (already using "arg" and "bb", the common LLVM shorthand). And it avoids the conflict in telling users of the FileCheck script to run "-instnamer" to create a better regression test and having that cause a warn/fail in update_test_checks.py.
-
AndreyChurbanov authored
same parent tasks. Differential Revision: https://reviews.llvm.org/D80577
-
Aaron Ballman authored
GCC 10.1 introduced support for the [[]] style spelling of attributes in C mode. Similar to how GCC supports __attribute__((foo)) as [[gnu::foo]] in C++ mode, it now supports the same spelling in C mode as well. This patch makes a change in Clang so that when you use the GCC attribute spelling, the attribute is automatically available in all three spellings by default. However, like Clang, GCC has some attributes it only recognizes in C++ mode (specifically, abi_tag and init_priority), which this patch also honors.
-
Ehud Katz authored
-
Sanjay Patel authored
This file was originally added without instnamer at: rL283716 / fe2b9b4f. But that was reverted and the test file reappeared with instnamer at: rL285688 / 62f516f5. I'm not seeing any difference locally from checking nameless values, so I'm trying to remove a layering violation and see if that can survive the build bots.
-
James Henderson authored
Reviewed by: clayborg, dblaikie, labath Differential Revision: https://reviews.llvm.org/D80799
-
Raphael Isemann authored
This reverts commit fd0ab3b3. The fix here is incorrect and the actual fault was an incorrect test Makefile. To give some more background: The original test for D80798 compiled three source files into either one executable or one executable + 2 shared libraries, each being a different test setup. If both the monolithic executable and the shared libraries were compiled in the same directory, then Make would overwrite the .o files of one test setup with the other. As a result, although -fPIC was passed correctly to the test setup with the shared libraries, the compiler invocations for the monolithic executable would later overwrite these object files (and as only the test setup with the shared libraries used -fPIC, it appeared as if the shared library object files didn't receive the -fPIC flag). Thanks to Pavel for figuring this out.
-
James Henderson authored
This will ensure that nothing can ever start parsing data from a future sequence and part-read data will be returned as 0 instead. Reviewed by: aprantl, labath Differential Revision: https://reviews.llvm.org/D80796
-
Raphael Isemann authored
Summary: ClangExpressionSourceCode has different ways to wrap the user expression based on which context the expression is executed in. For example, if we're in a C++ member function we put the expression inside a fake member function of a fake class to make the evaluation possible. Similar things are done for Objective-C instance/static methods. There is also a default wrapping where we put the expression in a normal function just to make it possible to execute it.

The way we currently define which kind of wrapping the expression needs is based on the `wrapping_language` we keep passing to the ClangExpressionSourceCode instance. We repurposed the language type enum for that variable to distinguish the cases above with the following mapping:

* language = C_plus_plus -> member function wrapping
* language = ObjC -> instance/static method wrapping (`is_static` distinguished between those two).
* language = C -> normal function wrapping
* all other cases like C_plus_plus11, Haskell etc. make our class a no-op that does mostly nothing.

That mapping is currently not documented and just confusing as the `language` is unrelated to the expression language (and in the ClangUserExpression we even pretend that it *is* the actual language, but luckily never used it for anything). Some of the code in ClangExpressionSourceCode also evidently assumes that this is the actual language of the expression, as it checks for non-existent cases such as `ObjC_plus_plus` which is not part of the mapping.

This patch makes a new enum to describe the four cases above (with instance/static Objective-C methods now being their own case). It also makes that enum just a member of ClangExpressionSourceCode instead of having to pass the same value to the class repeatedly. This also gets rid of all the switch-case-checks for 'unknown' languages such as C_plus_plus11 as this is no longer necessary.
Reviewers: labath, JDevlieghere Reviewed By: labath Subscribers: abidh Differential Revision: https://reviews.llvm.org/D80793
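A dedicated enum stored as a member, in place of a repurposed language enum passed in repeatedly, might look like the sketch below. The names `WrapKind` and `ExpressionSource` and the wrapper strings are hypothetical and heavily simplified; they do not reproduce the code LLDB actually generates.

```cpp
#include <string>
#include <utility>

// Hypothetical enum with exactly the four wrapping cases, so no "unknown
// language" branch is needed anywhere.
enum class WrapKind {
  Function,           // plain function wrapper (the default)
  CppMemberFunction,  // fake member function of a fake class
  ObjCInstanceMethod, // Objective-C instance method
  ObjCStaticMethod    // Objective-C static (class) method
};

class ExpressionSource {
public:
  // The wrap kind is fixed at construction and kept as a member instead of
  // being passed to every call.
  ExpressionSource(std::string Body, WrapKind Kind)
      : Body(std::move(Body)), Kind(Kind) {}

  // Produce illustrative wrapper text for the stored expression body.
  std::string wrap() const {
    switch (Kind) {
    case WrapKind::Function:
      return "void expr_fn() { " + Body + " }";
    case WrapKind::CppMemberFunction:
      return "struct FakeClass { void expr_fn() { " + Body + " } };";
    case WrapKind::ObjCInstanceMethod:
      return "- (void)expr_fn { " + Body + " }";
    case WrapKind::ObjCStaticMethod:
      return "+ (void)expr_fn { " + Body + " }";
    }
    return Body; // unreachable for valid enumerators
  }

private:
  std::string Body;
  WrapKind Kind;
};
```

With every enumerator handled in the switch, a compiler can warn when a new case is added but not handled, which the old language-based mapping could not offer.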
-
Sanjay Patel authored
This is effectively reverting rGbfdc2552664d to avoid test churn while we figure out a better way forward. We at least salvage the warning on name conflict from that patch though. If we change the default string again, we may want to mass update tests at the same time. Alternatively, we could live with the poor naming if we change -instnamer. This also adds a test to LLVM as suggested in the post-commit review. There's a clang test that is also affected. That seems like a layering violation, but I have not looked at fixing that yet. Differential Revision: https://reviews.llvm.org/D80584
-
James Henderson authored
The debug_line_invalid.test test case was previously using the interpreted line table dumping to identify which opcodes have been parsed. This change moves to looking for the expected opcodes explicitly. This is probably a little clearer and also allows for testing some cases that wouldn't be easily identifiable from the interpreted table. Reviewed by: MaskRay Differential Revision: https://reviews.llvm.org/D80795
-
Simon Pilgrim authored
They are implicitly included in TargetFrameLowering.h and only ever used in TargetFrameLowering override methods.
-