- Mar 31, 2022
-
-
Kirill Bobyrev authored
Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D120306
-
Florian Hahn authored
The only remaining use was to get the exit block of the loop. Instead of relying on the loop, use the successor of VectorHeaderBB (LoopMiddleBlock) directly to set VPTransformState::CFG::ExitB Depends on D121621. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121623
-
Nicholas Guy authored
Differential Revision: https://reviews.llvm.org/D122566
-
Sergei Lebedev authored
While not strictly required after PEP-420, it is better to have one, since not all tooling supports implicit namespace packages. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D122794
-
Sergei Lebedev authored
This commit fixes or disables all errors reported by `python3 -m mypy -p mlir --show-error-codes`. Note that unhashable types cannot currently be expressed in a way compatible with typeshed. See https://github.com/python/typeshed/issues/6243 for details. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D122790
-
Simon Pilgrim authored
Add ADD/SUB(XOR(X,MIN_SIGNED_VALUE),Y) tests
-
Florian Hahn authored
Suggested in D121623. The remaining uses of L can be replaced, reducing the need for the variable.
-
Marco Elver authored
Allow receiving memcpy/memset/memmove instrumentation by using __asan or __hwasan prefixed versions for AddressSanitizer and HWAddressSanitizer respectively when compiling in kernel mode, by passing params -asan-kernel-mem-intrinsic-prefix or -hwasan-kernel-mem-intrinsic-prefix. By default the kernel-specialized versions of both passes drop the prefixes for calls generated by memintrinsics. This assumes that all locations that can lower the intrinsics to libcalls can safely be instrumented. This unfortunately is not the case when implicit calls to memintrinsics are inserted by the compiler in no_sanitize functions [1]. To solve the issue, normal memcpy/memset/memmove need to be uninstrumented, and instrumented code should instead use the prefixed versions. This also aligns with ASan behaviour in user space. [1] https://lore.kernel.org/lkml/Yj2yYFloadFobRPx@lakrids/ Reviewed By: glider Differential Revision: https://reviews.llvm.org/D122724
-
Groverkss authored
This patch removes a forward declaration to PresburgerLocalSpace, a class which does not exist anymore.
-
Jean Perier authored
The runtime was crashing when an INTEGER was passed to formatted output with a bad edit descriptor, even when the user provided IOSTAT. Flang already signals an error for similar errors with other types; do the same for INTEGERs. The related input case already signals an error. Differential Revision: https://reviews.llvm.org/D122749
-
Jean Perier authored
When including debug lines as code, the `D` should be treated as white space. Previously an error was raised about bad labels because the `D` remained a `D` when the source line was considered as code. Differential Revision: https://reviews.llvm.org/D122711
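The rule described above can be sketched in a few lines of Python. This is an illustration only, not Flang's prescanner code: the helper name `debug_line_as_code` is hypothetical, and it assumes fixed-form source where the debug marker sits in column 1.

```python
def debug_line_as_code(line: str) -> str:
    """Return a fixed-form line with a column-1 'D' or 'd' replaced by a blank.

    Hypothetical helper for illustration; it is not part of Flang.
    """
    if line[:1] in ("D", "d"):
        # The marker becomes white space, so the rest of the line is read
        # as ordinary code instead of as a malformed statement label.
        return " " + line[1:]
    return line


print(repr(debug_line_as_code("D     print *, 'debug'")))
print(repr(debug_line_as_code("      print *, 'normal'")))
```

Keeping the line length unchanged matters in fixed form, because column positions carry meaning.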
-
Simon Pilgrim authored
[X86] combineCarryThroughADD - recognise X86ISD::ADD(AND(X,1),-1) pattern can be folded to X86ISD::BT As mentioned on D122482, if we've generated a masked overflow test see if we can fold it to X86ISD::BT to feed a X86ISD::ADC/SBB Differential Revision: https://reviews.llvm.org/D122572
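A small Python model of why the fold is valid, assuming 32-bit operands (the width is an assumption made for illustration, and the function names are hypothetical): adding -1 (all ones) to `X & 1` produces a carry out exactly when bit 0 of `X` is set, which is the value a bit test of bit 0 places in the carry flag.

```python
MASK32 = 0xFFFFFFFF  # 32-bit width assumed for illustration


def carry_of_masked_add(x: int) -> int:
    """Carry out of the 32-bit addition (x & 1) + 0xFFFFFFFF, i.e. + (-1)."""
    total = (x & 1) + MASK32
    return total >> 32


def bit_test(x: int, index: int) -> int:
    """Model of a bit test: the carry flag receives bit `index` of x."""
    return (x >> index) & 1


# The two carry sources agree for every sampled input.
for x in (0, 1, 2, 3, 0x7FFFFFFF, 0xFFFFFFFE, 0xFFFFFFFF):
    assert carry_of_masked_add(x) == bit_test(x, 0)
```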
-
ShihPo Hung authored
This patch adds Uses = [FRM] and mayRaiseFPException = true to the following instructions:
  VFADD, VFSUB, VFRSUB, VFMUL, VFDIV, VFRDIV
  VFWADD, VFWSUB, VFWMUL
  VFMADD, VFMACC, VFMSAC, VFMSUB
  VFNMADD, VFNMACC, VFNMSAC, VFNMSUB
  VFWMACC, VFWMSAC, VFWNMACC, VFWNMSAC
  VFSQRT, VFREC7
  VFREDOSUM, VFREDUSUM, VFWREDOSUM, VFWREDUSUM
and only adds mayRaiseFPException = true to the following instructions:
  VFRSQRT7, VFMIN, VFMAX, VFREDMIN, VFREDMAX
  VMFEQ, VMFNE, VMFLT, VMFLE, VMFGT, VMFGE
Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121087
-
David Green authored
When MaximizeVectorBandwidth is enabled, we can end up (via calls to collectUniformsAndScalars/setCostBasedWideningDecision through calculateRegisterUsage) making widening decisions before we have decided whether to fold the tail by masking. These decisions will be wrong if we later decide to fold the tail, for example when the trip count is very low: loads that should get masked are costed with standard memory operation costs instead. This still uses the EmulatedMaskMemRefHack costs at the moment (a bit unfortunately), but the old costs without this change were 1, leading to overly optimistic vectorization. This slightly changes the way that the MaximizeVectorBandwidth option works to make it easier to test, always honouring the option if it is set. Differential Revision: https://reviews.llvm.org/D120215
-
Matthias Springer authored
This fixes a compiler warning.
-
Jay Foad authored
Differential Revision: https://reviews.llvm.org/D122653
-
Matthias Springer authored
Infer a tighter MemRef type instead of always falling back to the most dynamic MemRef type. The fallback is inefficient and caused op verification errors. Differential Revision: https://reviews.llvm.org/D122649
-
Fraser Cormack authored
-
Matthias Springer authored
Differential Revision: https://reviews.llvm.org/D122647
-
Matthias Springer authored
* Complete rewrite of the verifier.
* CollapseShapeOp verifier will be updated in a subsequent commit.
* Update and expand op documentation.
* Add a new builder that infers the result type based on the source type, result shape and reassociation indices. In essence, only the result layout map is inferred.
Differential Revision: https://reviews.llvm.org/D122641
-
Argyrios Kyrtzidis authored
* Support compiling with clang-5
* Check for `LLVM_DISABLE_ASSEMBLY_FILES` and have it set by `compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh`, which wants to receive and process only bitcode files.
-
Qiu Chaofan authored
--overlay-platform-toolchain inserts a whole new toolchain path with higher priority than the system default, which could be achieved by composing smaller options. We need to figure out an alternative solution and what is missing among these basic options.
-
Petr Hosek authored
This is necessary so that Tests.cmake is always included in the generated build file and any changes made by subbuilds are detected without needing to rerun CMake. This is equivalent to an earlier version of D121647. Differential Revision: https://reviews.llvm.org/D121647
-
Frances Wingerter authored
Also implements explicit handling for the already-documented --help flag.
-
Fraser Cormack authored
-
Lian Wang authored
Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D122369
-
Tobias Hieta authored
This flag silences the build output of test-release.sh so that it works better in CI systems. It still logs the build output to the log files but does not echo it to stdout. Reviewed By: tstellar Differential Revision: https://reviews.llvm.org/D122146
-
Fangrui Song authored
Move to clang_ignored_gcc_optimization_f_Group like other ignored options. This decreases code size a bit: ~400 bytes on x86-64.
-
Serge Pavlov authored
A new function 'getConstrainedIntrinsic' is added, which for any given instruction returns the ID of the corresponding constrained intrinsic. If there is no constrained counterpart for the instruction or the instruction is already a constrained intrinsic, the function returns zero. This is a recommit of 115b3ace, reverted in 8160dd58. Differential Revision: https://reviews.llvm.org/D69562
-
wanglei authored
This patch constructs the codegen infrastructure and successfully generates the first 'add' instruction. It adds the integer calling convention for fixed arguments that are passed in general-purpose registers. New test added here: CodeGen/LoongArch/ir-instruction/add.ll The test file is placed in a subdirectory because we will use subdirectories to distinguish different categories of tests (e.g. intrinsic, inline-asm ...) Reviewed By: MaskRay, SixWeining Differential Revision: https://reviews.llvm.org/D122366
-
Chuanqi Xu authored
in filesystems It is simpler to search for a module unit with the -fprebuilt-module-path option. However, the partition separator ':' is not filesystem-friendly. According to the discussion in https://reviews.llvm.org/D118586, I think we reached consensus to use '-' as the separator instead; GCC makes the same choice. Previously I thought it would be better to add an option, but that now feels like over-engineering. Another reason is that there are already too many options for modules (mainly for Clang modules). Given that using '-' when searching works well enough, I think it is acceptable not to add an option. Reviewed By: iains Differential Revision: https://reviews.llvm.org/D120874
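A minimal Python sketch of the lookup this implies. The helper names are hypothetical, and the `.pcm` extension and the "first match wins" search order are assumptions made for illustration rather than a description of Clang's implementation.

```python
from pathlib import Path
from typing import Iterable, Optional


def prebuilt_module_filename(module_name: str) -> str:
    """Map a module name such as 'M:Part' to an on-disk name such as 'M-Part.pcm'.

    The '.pcm' extension is an assumption for this sketch.
    """
    return module_name.replace(":", "-") + ".pcm"


def find_prebuilt_module(module_name: str, search_dirs: Iterable[str]) -> Optional[Path]:
    """Return the first existing candidate in the search directories, else None."""
    filename = prebuilt_module_filename(module_name)
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


print(prebuilt_module_filename("M:Part"))
```

Replacing the separator in one place keeps the name used in source (`M:Part`) distinct from the name used on disk, so no extra command-line option is needed.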
-
Aditya Kumar authored
According to the LLVM debug info update guide: https://llvm.org/docs/HowToUpdateDebugInfo.html, "Hoisting identical instructions which appear in several successor blocks into a predecessor block. In this case there is no single merged instruction. The rule for dropping locations applies". Thanks to Yuanbo Li for reporting this. Reviewed By: dblaikie Reviewers: sebpop, tejohnson, dblaikie Differential Revision: https://reviews.llvm.org/D122730
-
jacquesguan authored
For example, we could do the following eliminations:
  fold vector.shuffle V1, V2, [0, 1, 2, 3] : <4xi32>, <2xi32> -> V1
  fold vector.shuffle V1, V2, [4, 5] : <4xi32>, <2xi32> -> V2
Differential Revision: https://reviews.llvm.org/D122706
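The condition behind these two folds can be sketched in Python. This models only the mask check, under the assumption that mask indices address the concatenation of V1 and V2; the function name `fold_shuffle` is hypothetical and is not the MLIR folder.

```python
from typing import List, Optional


def fold_shuffle(v1_len: int, v2_len: int, mask: List[int]) -> Optional[str]:
    """Return "V1" or "V2" when the shuffle just forwards one operand, else None.

    Mask indices 0..v1_len-1 select from V1; the following v2_len indices
    select from V2.
    """
    if mask == list(range(v1_len)):
        return "V1"
    if mask == list(range(v1_len, v1_len + v2_len)):
        return "V2"
    return None


print(fold_shuffle(4, 2, [0, 1, 2, 3]))
print(fold_shuffle(4, 2, [4, 5]))
print(fold_shuffle(4, 2, [0, 4]))
```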
-
Wei Xiao authored
-
V Donaldson authored
A format such as "( D C, X6. 2 )" is parsed the same as "(DC,X6.2)".
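As a rough illustration in Python (not the Flang runtime's parser), blank-insensitive parsing can be modelled by dropping blanks before interpreting the format. The helper name is hypothetical, and preserving blanks inside quoted character literals is an assumption added here so the sketch does not alter literal text.

```python
def strip_format_blanks(fmt: str) -> str:
    """Remove blanks from a format string, leaving quoted literals untouched."""
    out = []
    quote = None  # the active quote character while inside a literal
    for ch in fmt:
        if quote is not None:
            out.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            out.append(ch)
        elif ch != " ":
            out.append(ch)
    return "".join(out)


print(strip_format_blanks("( D C, X6. 2 )"))
```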
-
Shafik Yaghmour authored
NSIndexPathSyntheticFrontEnd::Impl::Clear() currently calls Clear() on both union members regardless of which one is active. I modified it to only call Clear() on the active member. Differential Revision: https://reviews.llvm.org/D122753
-
Jordan R Abrahams-Whitehead authored
This lets revert_checker.py be called with the -u option, which formats the revert and reverted SHAs into handy URLs that point to the LLVM reviews associated with those SHAs. This is useful for viewers who want a quick look at the changes made by SHAs that were potentially reverted. Differential Revision: https://reviews.llvm.org/D122772
-
Stephen Long authored
Factor in the TBAA of adjacent stores instead of just the head store when merging stores into a memset. We were seeing GVN remove a load that had a TBAA that matched the 2nd store because GVN determined it didn't match the TBAA of the memset. The memset had the TBAA of only the first store. i.e. Loading the field pi_ of shared_count after memset to create an array of shared_ptr
    template<class T>
    class shared_ptr {
      T *p;
      shared_count refcount;
    };
    class shared_count {
      sp_counted_base *pi_;
    };
Differential Revision: https://reviews.llvm.org/D122205
-
wangyihan authored
Beautify dump format, add indent for nested struct and struct members, also fix test cases in dump-struct-builtin.c. For example, given the struct:
```
struct A {
  int a;
  struct B {
    int b;
    struct C {
      struct D {
        int d;
        union E {
          int x;
          int y;
        } e;
      } d;
      int c;
    } c;
  } b;
};
```
Before:
```
struct A { int a = 0 struct B { int b = 0 struct C { struct D { int d = 0 union E { int x = 0 int y = 0 } } int c = 0 } } }
```
After:
```
struct A { int a = 0 struct B { int b = 0 struct C { struct D { int d = 0 union E { int x = 0 int y = 0 } } int c = 0 } } }
```
Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122704
-
Tue Ly authored
Improve the performance of expm1f:
  - Rearrange the selection logic for different cases to improve the overall throughput.
  - Use the same degree-4 polynomial for large inputs as `expf` (https://reviews.llvm.org/D122418), reduced from a degree-7 polynomial.
Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master):
Before this patch:
```
$ ./perf.sh expm1f
CORE-MATH reciprocal throughput : 15.362
System LIBC reciprocal throughput : 53.288
LIBC reciprocal throughput : 54.572
$ ./perf.sh expm1f --latency
CORE-MATH latency : 57.759
System LIBC latency : 147.146
LIBC latency : 118.057
```
After this patch:
```
$ ./perf.sh expm1f
CORE-MATH reciprocal throughput : 15.359
System LIBC reciprocal throughput : 53.188
LIBC reciprocal throughput : 14.600
$ ./perf.sh expm1f --latency
CORE-MATH latency : 57.774
System LIBC latency : 147.119
LIBC latency : 60.280
```
Reviewed By: michaelrj, santoshn, zimmermann6 Differential Revision: https://reviews.llvm.org/D122538
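To put the LIBC rows in perspective, the ratios can be worked out directly from the figures quoted in this commit message. The short Python sketch below only does that arithmetic; it treats both metrics as lower-is-better, and the variable names are illustrative.

```python
# LIBC figures copied from the benchmark output in the commit message.
before = {"reciprocal throughput": 54.572, "latency": 118.057}
after = {"reciprocal throughput": 14.600, "latency": 60.280}

# Both metrics are lower-is-better, so the improvement factor is before / after.
speedup = {name: before[name] / after[name] for name in before}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
```

By this measure the LIBC reciprocal throughput improves by roughly 3.7x and the latency by just under 2x.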
-