- Mar 30, 2024
-
-
paperchalice authored
Fix preprocessor directive.
-
paperchalice authored
This reverts commit 5538853f. #83668 breaks some test bots.
-
paperchalice authored
This pull request adds `MachineFunctionProperties` support. If a pass wants to modify machine function properties, it must derive from `MachinePassInfoMixin` and define some static methods, as in the legacy pass manager. A test pass, `RequireAllMachineFunctionPropertiesPass`, is also added here and can serve as an example.
-
Schrodinger ZHU Yifan authored
-
Maksim Panchenko authored
Under normal circumstances, we terminate basic blocks on a trap instruction. However, the Linux kernel may resume execution after hitting a trap (ud2 on x86). Thus, we introduce the "--terminal-trap" option, which specifies whether the trap instruction should terminate the control flow. The option is on by default, except in Linux kernel mode, where it is off.
-
- Mar 29, 2024
-
-
Sitnikov Sergey authored
-
Vladimir Vereschaka authored
Review the actual component parameters and update the cache file accordingly. Also fixed the C++ test builds for the compiler-rt component.
-
Aart Bik authored
Note that even though the sparse runtime support lib always uses SoA storage for COO storage (and provides correct codegen by means of views into this storage), in some rare cases we need the true physical SoA storage as a coordinate buffer. This PR provides that functionality by means of a (costly) coordinate buffer call. Since this is currently only used for testing/debugging by means of the sparse_tensor.print method, this solution is acceptable. If we ever want a performant version of this, we should truly support AoS storage of COO in addition to the SoA used right now.
-
Jonathan Peyton authored
The hidden helper team pre-allocates the gtid space [1, num_hidden_helpers] (inclusive). If regular host threads are allocated and then put back in the thread pool before the hidden helper team is initialized, the hidden helper team tries to allocate threads from the thread pool with gtids higher than [1, num_hidden_helpers]. Instead, have the hidden helper team fork OS threads so that the correct gtid range is used for hidden helper threads. Fixes: #87117
-
Rob Suderman authored
Before deleting the block, we need to drop uses of the surrounding args. If this is not done, dialect conversion failures can result in a failure to remove args (despite the block having no remaining uses).
-
Christopher Ferris authored
It appears that qemu does not actually cause mmap to fail when calling setrlimit to limit the address space size. In the two tests that use setrlimit, detect if mmap still works and skip the tests in that case. Since all Android targets should support setrlimit, compile out the mmap check code for them.
-
Diego Caballero authored
An `arith.select` may have a scalar condition and true/false vector values.
-
Shourya Goel authored
In continuation to: #87097
-
OverMighty authored
Even if we don't actually use the value of the second argument, we have to evaluate it for side-effects.
Co-authored-by: Richard Smith <richard@metafoo.co.uk>
-
Cyndy Ishida authored
`-reexport*` is the newer spelling for `-sub-library` which is already supported by the clang driver when invoking ld. Support the new spellings when passed by the user. This also helps simplify `clang-installapi` driver logic.
-
mlevesquedion authored
I believe the existing check to determine if an operand should be added is incorrect: `operand.use_empty() || operand.hasOneUse()`. This is because these checks do not take into account the fact that the op is being deleted. It hasn't been deleted yet, so `operand.use_empty()` cannot be true, and `operand.hasOneUse()` may be true if the op being deleted is the only user of the operand and it only uses it once, but it will fail if the operand is used more than once (e.g. something like `add %0, %0`). Instead, check if the op being deleted is the only _user_ of the operand. If so, add the operand to the worklist. Fixes #86765
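The difference between "one use" and "one user" can be sketched with a toy model. `ToyValue`, `ToyOp`, `addOp`, and `isOnlyUser` below are hypothetical stand-ins written for illustration; they are not the MLIR classes or the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-ins for IR concepts. These are NOT the MLIR classes.
struct ToyOp;

struct ToyValue {
  // One entry per use, so an op that uses the value twice appears twice.
  std::vector<const ToyOp *> uses;

  bool useEmpty() const { return uses.empty(); }
  bool hasOneUse() const { return uses.size() == 1; }
};

struct ToyOp {
  std::vector<ToyValue *> operands;
};

// Attach operands to `op` and record `op` as a use of each operand.
inline void addOp(ToyOp &op, std::vector<ToyValue *> operands) {
  op.operands = std::move(operands);
  for (ToyValue *v : op.operands)
    v->uses.push_back(&op);
}

// True when every use of `value` belongs to `op`, i.e. `op` is the only user,
// no matter how many times it uses the value.
inline bool isOnlyUser(const ToyValue &value, const ToyOp &op) {
  return !value.uses.empty() &&
         std::all_of(value.uses.begin(), value.uses.end(),
                     [&](const ToyOp *user) { return user == &op; });
}
```

For an op shaped like `add %0, %0`, the operand has two uses, so `hasOneUse()` is false, yet `isOnlyUser` is true because both uses come from the same op.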
-
dhruvachak authored
Similar to H2D and D2H, use synchronous mode for large data transfers beyond a certain size for D2D as well. As with H2D and D2H, this size is controlled by an env-var.
-
Shilei Tian authored
This patch tries to fold `G_ICMP` if possible.
-
Shilei Tian authored
This can remove all redundant calls in each combiner.
-
Helena Kotas authored
Add lowering of llvm.ceil intrinsics to DXIL ops. Fixes #86984
-
Jan Svoboda authored
An instance of `PreprocessorOptions` is part of `CompilerInvocation` which is supposed to be a value type. The `FailedModules` member is problematic, since it's essentially a shared state used by multiple `CompilerInstance` objects, and not really a preprocessor option. Let's move it into `CompilerInstance` instead.
-
Alex MacLean authored
Switch from `.weak` to `.common` linkage for common global variables where possible.

The `.common` linkage is described in [PTX ISA 11.6.4. Linking Directives: .common](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#linking-directives-common)

> Declares identifier to be globally visible but “common”.
>
> Common symbols are similar to globally visible symbols. However multiple object files may declare the same common symbol and they may have different types and sizes and references to a symbol get resolved against a common symbol with the largest size.
>
> Only one object file can initialize a common symbol and that must have the largest size among all other definitions of that common symbol from different object files.
>
> .common linking directive can be used only on variables with .global storage. It cannot be used on function symbols or on symbols with opaque type.

I've updated the logic and tests to only use `.common` for PTX 5.0 or greater and verified that the new tests now pass with `ptxas`.
-
Kevin P. Neal authored
The AtomicExpand pass was lowering function calls with the strictfp attribute to sequences that included function calls incorrectly lacking the attribute. This patch corrects that. The pass now also emits the correct constrained fp call instead of normal FP instructions when in a function with the strictfp attribute. Test changes verified with D146845.
-
Nikolas Klauser authored
We've been applying ``[[nodiscard]]`` more liberally recently, but we don't have any documented guidance on when it's correct to add it. This patch adds that guidance. Follow-up patches will gradually apply it to the code base.
-
Nikolas Klauser authored
This adds vectorization to the last 0-3 vectors and, if the range is large enough, the remaining elements that don't fill a vector completely.

```
------------------------------------------------------------------
Benchmark                        old  full vectors  partial vector
------------------------------------------------------------------
bm_mismatch<char>/1          1.40 ns       1.62 ns         2.09 ns
bm_mismatch<char>/2          1.88 ns       2.10 ns         2.33 ns
bm_mismatch<char>/3          2.67 ns       2.56 ns         2.72 ns
bm_mismatch<char>/4          3.01 ns       3.20 ns         3.70 ns
bm_mismatch<char>/5          3.51 ns       3.73 ns         3.64 ns
bm_mismatch<char>/6          4.71 ns       4.85 ns         4.37 ns
bm_mismatch<char>/7          5.12 ns       5.33 ns         4.37 ns
bm_mismatch<char>/8          5.79 ns       6.02 ns         4.75 ns
bm_mismatch<char>/15         9.20 ns       10.5 ns         7.23 ns
bm_mismatch<char>/16         10.2 ns       10.1 ns         7.46 ns
bm_mismatch<char>/17         10.2 ns       10.8 ns         7.57 ns
bm_mismatch<char>/31         17.6 ns       17.1 ns         10.8 ns
bm_mismatch<char>/32         17.4 ns       1.64 ns         1.64 ns
bm_mismatch<char>/33         23.3 ns       2.10 ns         2.33 ns
bm_mismatch<char>/63         31.8 ns       16.9 ns         2.33 ns
bm_mismatch<char>/64         32.6 ns       2.10 ns         2.10 ns
bm_mismatch<char>/65         33.6 ns       2.57 ns         2.80 ns
bm_mismatch<char>/127        67.3 ns       18.1 ns         3.27 ns
bm_mismatch<char>/128        2.17 ns       2.14 ns         2.57 ns
bm_mismatch<char>/129        2.36 ns       2.80 ns         3.27 ns
bm_mismatch<char>/255        67.5 ns       19.6 ns         4.68 ns
bm_mismatch<char>/256        3.76 ns       3.71 ns         3.97 ns
bm_mismatch<char>/257        3.77 ns       4.04 ns         4.43 ns
bm_mismatch<char>/511        70.8 ns       22.1 ns         7.47 ns
bm_mismatch<char>/512        7.27 ns       7.30 ns         6.95 ns
bm_mismatch<char>/513        7.11 ns       7.05 ns         6.96 ns
bm_mismatch<char>/1023       75.9 ns       27.4 ns         13.3 ns
bm_mismatch<char>/1024       13.9 ns       13.8 ns         12.4 ns
bm_mismatch<char>/1025       13.6 ns       13.6 ns         12.8 ns
bm_mismatch<char>/2047       87.3 ns       37.5 ns         25.4 ns
bm_mismatch<char>/2048       26.8 ns       27.4 ns         24.0 ns
bm_mismatch<char>/2049       26.7 ns       27.3 ns         25.5 ns
bm_mismatch<char>/4095        112 ns       64.7 ns         48.7 ns
bm_mismatch<char>/4096       53.0 ns       54.2 ns         46.8 ns
bm_mismatch<char>/4097       52.7 ns       54.2 ns         48.4 ns
bm_mismatch<char>/8191        160 ns        118 ns         98.4 ns
bm_mismatch<char>/8192        107 ns        108 ns         96.0 ns
bm_mismatch<char>/8193        106 ns        108 ns         97.2 ns
bm_mismatch<char>/16383       283 ns        234 ns          215 ns
bm_mismatch<char>/16384       227 ns        223 ns          217 ns
bm_mismatch<char>/16385       221 ns        221 ns          215 ns
bm_mismatch<char>/32767       547 ns        499 ns          488 ns
bm_mismatch<char>/32768       495 ns        492 ns          492 ns
bm_mismatch<char>/32769       491 ns        489 ns          488 ns
bm_mismatch<char>/65535      1028 ns        979 ns          971 ns
bm_mismatch<char>/65536       976 ns        970 ns          974 ns
bm_mismatch<char>/65537       970 ns        965 ns          971 ns
bm_mismatch<char>/131071     2031 ns       1948 ns         2005 ns
bm_mismatch<char>/131072     1973 ns       1955 ns         1974 ns
bm_mismatch<char>/131073     1989 ns       1932 ns         2001 ns
bm_mismatch<char>/262143     4469 ns       4244 ns         4223 ns
bm_mismatch<char>/262144     4443 ns       4183 ns         4243 ns
bm_mismatch<char>/262145     4400 ns       4232 ns         4246 ns
bm_mismatch<char>/524287    10169 ns       9733 ns         9592 ns
bm_mismatch<char>/524288    10154 ns       9664 ns         9843 ns
bm_mismatch<char>/524289    10113 ns       9641 ns        10003 ns
bm_mismatch<short>/1         1.86 ns       2.53 ns         2.32 ns
bm_mismatch<short>/2         2.57 ns       2.77 ns         2.55 ns
bm_mismatch<short>/3         3.26 ns       3.00 ns         2.79 ns
bm_mismatch<short>/4         3.95 ns       3.39 ns         3.15 ns
bm_mismatch<short>/5         4.83 ns       3.97 ns         3.72 ns
bm_mismatch<short>/6         5.43 ns       4.34 ns         4.03 ns
bm_mismatch<short>/7         6.11 ns       4.73 ns         4.44 ns
bm_mismatch<short>/8         6.84 ns       5.02 ns         4.79 ns
bm_mismatch<short>/15        11.5 ns       7.12 ns         6.50 ns
bm_mismatch<short>/16        13.9 ns       1.87 ns         2.11 ns
bm_mismatch<short>/17        14.0 ns       3.00 ns         2.47 ns
bm_mismatch<short>/31        23.1 ns       7.87 ns         2.47 ns
bm_mismatch<short>/32        23.8 ns       2.57 ns         2.81 ns
bm_mismatch<short>/33        24.5 ns       3.70 ns         2.94 ns
bm_mismatch<short>/63        44.8 ns       9.37 ns         3.46 ns
bm_mismatch<short>/64        2.32 ns       2.57 ns         2.64 ns
bm_mismatch<short>/65        2.52 ns       3.02 ns         3.51 ns
bm_mismatch<short>/127       45.6 ns       9.97 ns         5.18 ns
bm_mismatch<short>/128       3.85 ns       3.93 ns         3.94 ns
bm_mismatch<short>/129       3.82 ns       4.20 ns         4.70 ns
bm_mismatch<short>/255       50.4 ns       12.6 ns         8.07 ns
bm_mismatch<short>/256       7.23 ns       6.91 ns         6.98 ns
bm_mismatch<short>/257       7.24 ns       7.19 ns         7.55 ns
bm_mismatch<short>/511       52.3 ns       17.8 ns         14.0 ns
bm_mismatch<short>/512       13.6 ns       13.7 ns         13.6 ns
bm_mismatch<short>/513       13.9 ns       13.8 ns         18.5 ns
bm_mismatch<short>/1023      60.9 ns       30.9 ns         26.3 ns
bm_mismatch<short>/1024      26.7 ns       27.7 ns         25.7 ns
bm_mismatch<short>/1025      27.7 ns       27.6 ns         25.3 ns
bm_mismatch<short>/2047      88.4 ns       58.0 ns         51.6 ns
bm_mismatch<short>/2048      52.8 ns       55.3 ns         50.6 ns
bm_mismatch<short>/2049      55.2 ns       54.8 ns         48.7 ns
bm_mismatch<short>/4095       153 ns        113 ns          102 ns
bm_mismatch<short>/4096       105 ns        110 ns          101 ns
bm_mismatch<short>/4097       110 ns        110 ns         99.1 ns
bm_mismatch<short>/8191       277 ns        219 ns          206 ns
bm_mismatch<short>/8192       226 ns        214 ns          250 ns
bm_mismatch<short>/8193       226 ns        207 ns          208 ns
bm_mismatch<short>/16383      519 ns        492 ns          488 ns
bm_mismatch<short>/16384      494 ns        492 ns          492 ns
bm_mismatch<short>/16385      492 ns        488 ns          489 ns
bm_mismatch<short>/32767     1007 ns        968 ns          964 ns
bm_mismatch<short>/32768      977 ns        972 ns          970 ns
bm_mismatch<short>/32769      972 ns        962 ns          967 ns
bm_mismatch<short>/65535     1978 ns       1918 ns         1956 ns
bm_mismatch<short>/65536     1940 ns       1927 ns         1970 ns
bm_mismatch<short>/65537     1937 ns       1922 ns         1959 ns
bm_mismatch<short>/131071    4524 ns       4193 ns         4304 ns
bm_mismatch<short>/131072    4445 ns       4196 ns         4306 ns
bm_mismatch<short>/131073    4452 ns       4278 ns         4311 ns
bm_mismatch<short>/262143    9801 ns      10188 ns         9634 ns
bm_mismatch<short>/262144    9738 ns      10151 ns         9651 ns
bm_mismatch<short>/262145    9716 ns      10171 ns         9715 ns
bm_mismatch<short>/524287   19944 ns      20718 ns        20044 ns
bm_mismatch<short>/524288   21139 ns      20647 ns        20008 ns
bm_mismatch<short>/524289   21162 ns      19512 ns        20068 ns
bm_mismatch<int>/1           1.40 ns       1.84 ns         1.87 ns
bm_mismatch<int>/2           1.87 ns       2.08 ns         2.09 ns
bm_mismatch<int>/3           2.36 ns       2.31 ns         2.87 ns
bm_mismatch<int>/4           3.06 ns       2.72 ns         2.95 ns
bm_mismatch<int>/5           3.66 ns       3.37 ns         3.42 ns
bm_mismatch<int>/6           4.55 ns       3.65 ns         3.73 ns
bm_mismatch<int>/7           5.03 ns       3.93 ns         3.94 ns
bm_mismatch<int>/8           5.67 ns       1.86 ns         1.87 ns
bm_mismatch<int>/15          9.89 ns       4.41 ns         2.34 ns
bm_mismatch<int>/16          10.1 ns       2.33 ns         2.34 ns
bm_mismatch<int>/17          10.2 ns       3.34 ns         2.86 ns
bm_mismatch<int>/31          17.2 ns       5.54 ns         3.28 ns
bm_mismatch<int>/32          2.16 ns       2.15 ns         2.58 ns
bm_mismatch<int>/33          2.36 ns       3.01 ns         3.28 ns
bm_mismatch<int>/63          17.7 ns       6.50 ns         4.93 ns
bm_mismatch<int>/64          3.81 ns       3.58 ns         3.90 ns
bm_mismatch<int>/65          3.74 ns       4.36 ns         4.45 ns
bm_mismatch<int>/127         19.5 ns       9.56 ns         7.74 ns
bm_mismatch<int>/128         7.30 ns       6.41 ns         6.85 ns
bm_mismatch<int>/129         7.09 ns       7.04 ns         7.06 ns
bm_mismatch<int>/255         24.7 ns       14.8 ns         13.3 ns
bm_mismatch<int>/256         14.0 ns       12.1 ns         12.3 ns
bm_mismatch<int>/257         13.8 ns       12.7 ns         12.8 ns
bm_mismatch<int>/511         34.3 ns       26.3 ns         24.8 ns
bm_mismatch<int>/512         27.6 ns       23.6 ns         23.9 ns
bm_mismatch<int>/513         27.3 ns       24.4 ns         25.1 ns
bm_mismatch<int>/1023        62.5 ns       50.9 ns         48.3 ns
bm_mismatch<int>/1024        54.4 ns       46.1 ns         46.6 ns
bm_mismatch<int>/1025        54.2 ns       48.4 ns         47.5 ns
bm_mismatch<int>/2047         116 ns       97.8 ns         94.1 ns
bm_mismatch<int>/2048         108 ns       92.6 ns         92.4 ns
bm_mismatch<int>/2049         108 ns        104 ns         94.0 ns
bm_mismatch<int>/4095         233 ns        222 ns          205 ns
bm_mismatch<int>/4096         226 ns        223 ns          225 ns
bm_mismatch<int>/4097         221 ns        219 ns          210 ns
bm_mismatch<int>/8191         499 ns        485 ns          488 ns
bm_mismatch<int>/8192         496 ns        490 ns          495 ns
bm_mismatch<int>/8193         491 ns        485 ns          488 ns
bm_mismatch<int>/16383        982 ns        962 ns          964 ns
bm_mismatch<int>/16384        974 ns        971 ns          971 ns
bm_mismatch<int>/16385        971 ns        961 ns          968 ns
bm_mismatch<int>/32767       2003 ns       1959 ns         1920 ns
bm_mismatch<int>/32768       1996 ns       1947 ns         1928 ns
bm_mismatch<int>/32769       1990 ns       1945 ns         1926 ns
bm_mismatch<int>/65535       4434 ns       4275 ns         4312 ns
bm_mismatch<int>/65536       4437 ns       4267 ns         4321 ns
bm_mismatch<int>/65537       4442 ns       4261 ns         4321 ns
bm_mismatch<int>/131071      9673 ns       9648 ns         9465 ns
bm_mismatch<int>/131072      9667 ns       9671 ns         9465 ns
bm_mismatch<int>/131073      9661 ns       9653 ns         9464 ns
bm_mismatch<int>/262143     20595 ns      19605 ns        19064 ns
bm_mismatch<int>/262144     19894 ns      19572 ns        19009 ns
bm_mismatch<int>/262145     19851 ns      19656 ns        18999 ns
bm_mismatch<int>/524287     39556 ns      39364 ns        38131 ns
bm_mismatch<int>/524288     39678 ns      39573 ns        38183 ns
bm_mismatch<int>/524289     40168 ns      39301 ns        38121 ns
```
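One way to handle a trailing partial block is to re-check a full block that ends exactly at the end of the range, overlapping elements already known to be equal. The sketch below illustrates that idea in scalar code only; `blockedMismatch` and `kBlock` are hypothetical names, the block size stands in for a vector width, and this is not the libc++ implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the number of elements in one SIMD vector.
constexpr std::size_t kBlock = 16;

// Returns the index of the first position where a[i] != b[i], or n if the
// two ranges are equal over their first n elements.
template <class T>
std::size_t blockedMismatch(const T *a, const T *b, std::size_t n) {
  std::size_t i = 0;

  // Main loop: compare one full block at a time.
  for (; i + kBlock <= n; i += kBlock) {
    auto p = std::mismatch(a + i, a + i + kBlock, b + i);
    if (p.first != a + i + kBlock)
      return static_cast<std::size_t>(p.first - a);
  }
  if (i == n)
    return n;

  // Tail: fewer than kBlock elements remain. If the range holds at least one
  // full block, step back so a full block ends exactly at n. The overlapped
  // elements were already found equal, so the first mismatch in this block
  // (if any) lies in the tail.
  if (n >= kBlock)
    i = n - kBlock;
  auto p = std::mismatch(a + i, a + n, b + i);
  return static_cast<std::size_t>(p.first - a);
}
```

In a real vectorized version the overlapping block would be a single vector load and compare, which is why the tail no longer needs an element-by-element loop once the range is at least one vector long.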
-
Jan Svoboda authored
An instance of `PreprocessorOptions` is part of `CompilerInvocation` which is supposed to be a value type. The `DependencyDirectivesForFile` member is problematic, since it holds an owning reference of the scanning VFS. This makes it not a true value type, and it can keep a potentially large chunk of memory (the local cache in the scanning VFS) alive for longer than clients might expect. Let's move it into the `Preprocessor` instead.
-
Joseph Huber authored
Summary: This has changed, so update it to match the new interface.
-
Joseph Huber authored
Summary: The current implementation of RPC tied everything to device IDs and forced us to do init / shutdown to manage some global state. This turned out to be a bad idea in situations where we want to track multiple heterogeneous devices that may report the same device ID in the same process. This patch changes the interface to instead create an opaque handle to the internal device and simply allocates it via `new`. The user will then take this device and store it to interface with the attached device. This interface puts the burden of mapping device identifiers to devices onto the user, but in return heavily simplifies the implementation.
-
Cyndy Ishida authored
-
Michael Jones authored
In patch #82461 the sprintf tests were made to use UINTMAX_WIDTH which isn't defined on all systems. This patch changes it to sizeof(uintmax_t)*CHAR_BIT which is more portable.
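The portable expression can be sketched as follows. The helper name `uintmaxBits` is hypothetical and used only for illustration; the expression counts storage bits, which equals the value width on platforms where `uintmax_t` has no padding bits (an assumption that holds on mainstream targets).

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Number of bits in the object representation of uintmax_t. This avoids
// relying on the UINTMAX_WIDTH macro, which not every system defines.
constexpr int uintmaxBits() {
  return static_cast<int>(sizeof(std::uintmax_t) * CHAR_BIT);
}

// uintmax_t must be able to hold at least 64-bit values.
static_assert(uintmaxBits() >= 64, "uintmax_t is at least 64 bits wide");
```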
-
Farzon Lotfi authored
fixes #86999
-
Om Prakaash authored
Resolves #81685. This adds support for wN and wfN length modifiers in fprintf.
-
Cyndy Ishida authored
-
Christopher Ferris authored
Only attempt to initialize the ring buffer when tracking is enabled. Updated unit tests, and added a few new unit tests to verify the RingBuffer is not initialized by default. Verified that the two maps associated with the RingBuffer are not created in processes by default.
-
Sirraide authored
This is now allowed in C23; continue to diagnose it in earlier language modes as before, but now as a C23 extension rather than a GNU extension. This fixes #83658.
-
Shilei Tian authored
This patch adds similar handling of div-by-pow2 as in `SelectionDAG`.
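For context, a common way to lower signed division by a power of two is to add a sign-dependent bias and then shift right arithmetically. The sketch below shows that general trick in plain C++; `sdivByPow2` is a hypothetical helper, and the exact instruction sequence the patch emits is not shown in the commit message. It assumes `>>` on a negative signed value is an arithmetic shift, which C++20 guarantees and mainstream compilers already provide.

```cpp
#include <cassert>
#include <cstdint>

// Computes x / (1 << k) with C++ truncation-toward-zero semantics, using
// only an add and shifts. Valid for 1 <= k <= 30.
inline std::int32_t sdivByPow2(std::int32_t x, unsigned k) {
  assert(k >= 1 && k <= 30);
  // All ones when x is negative, zero otherwise.
  const std::uint32_t signMask = x < 0 ? 0xFFFFFFFFu : 0u;
  // Bias is (2^k - 1) for negative x and 0 for non-negative x. Adding it
  // turns the floor performed by the arithmetic shift into truncation.
  const std::int32_t bias = static_cast<std::int32_t>(signMask >> (32 - k));
  return (x + bias) >> k;
}
```

Without the bias, `-7 >> 1` yields -4 (rounding toward negative infinity), whereas `-7 / 2` is -3; the bias corrects exactly that case.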
-
Craig Topper authored
Rename to VPseudoBinaryNoMaskTU_Zvk. This is more consistent with the naming of the class it instantiates, and the _Zvk suffix is used elsewhere in RISCVInstrInfoZvk.td.
-
Jordan Rupprecht authored
This adds the bazel equivalent of the `llvm` binary produced by `LLVM_TOOL_LLVM_DRIVER_BUILD` in cmake. For the initial commit, this only includes `llvm-ar`, `llvm-nm`, and `llvm-size`. The rest are trivial to add in a followup commit, following the same pattern as here. By default it will include everything that supports the llvm-driver model, but it can be reduced to only build a subset, e.g. this will build only nm and size:

```
$ bazel build \
    --@llvm-project//llvm:driver-tools=llvm-nm,llvm-size \
    @llvm-project//llvm:llvm
```
-
Jordan Rupprecht authored
-
Cyndy Ishida authored
InstallAPI does not directly look at object files in the dylib for verification. To help diagnose violations where a declaration is undiscovered in headers, parse the dSYM and look up the source location for symbols. Emitting the source location with a diagnostic is enough for some IDEs (e.g. Xcode) to map it back to editable source files.
-