- Sep 18, 2020
-
-
Craig Topper authored
The number of ones in the PDEP mask determines how many bits of the other operand are used. If the mask is constant, we can use this to build a mask for SimplifyDemandedBits. This lets us replace the extends in the test with anyextend.
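As a small illustration of the property being exploited (hypothetical helper, not from the patch; requires BMI2, e.g. compile with -mbmi2):

```cpp
#include <cstdint>
#include <immintrin.h>  // _pdep_u32, requires BMI2

// PDEP deposits the low popcount(Mask) bits of Src at the positions set in
// Mask, so with a constant mask only that many low bits of Src are demanded.
uint32_t depositLow8(uint32_t Src) {
  constexpr uint32_t Mask = 0x000000FF;  // 8 ones: only Src[7:0] matter
  return _pdep_u32(Src, Mask);           // same result for Src and Src & 0xFF
}
```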
-
Andrew Wei authored
When the source of the zext is AssertZext or AssertSext, it is hard to know anything about the upper 32 bits, so we should insert a zext move before emitting SUBREG_TO_REG to define the lower 32 bits. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87771
-
Teresa Johnson authored
This reverts commit 2ffaa9a1. There were 2 reported bot failures that need more investigation: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/69871/steps/stage%201%20check/logs/stdio This one is in my new test. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/39187/steps/check-fuzzer/logs/stdio This one seems completely unrelated.
-
Tue Ly authored
Truncate the sum of squares, then use a shift-and-add algorithm to compute its square root. The required MPFR testing infra is updated in https://reviews.llvm.org/D87514 Differential Revision: https://reviews.llvm.org/D87516
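For readers unfamiliar with the technique, a minimal sketch of a shift-and-add (digit-by-digit) integer square root; this is an illustration of the algorithm, not the libc implementation:

```cpp
#include <cstdint>

// Computes floor(sqrt(N)) using only shifts, adds, and comparisons.
uint32_t isqrt(uint64_t N) {
  uint64_t Res = 0;
  uint64_t Bit = uint64_t(1) << 62;  // highest power of four in a uint64_t
  while (Bit > N)
    Bit >>= 2;
  while (Bit != 0) {
    if (N >= Res + Bit) {
      N -= Res + Bit;
      Res = (Res >> 1) + Bit;
    } else {
      Res >>= 1;
    }
    Bit >>= 2;
  }
  return static_cast<uint32_t>(Res);
}
```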
-
Teresa Johnson authored
Split out of D87120 (memory profiler). Added unit testing of the new printing facility. Differential Revision: https://reviews.llvm.org/D87792
-
Vitaly Buka authored
-
Vitaly Buka authored
-
Roland McGrath authored
Fuchsia's unit test library provides the main function by default. Reviewed By: cryptoad Differential Revision: https://reviews.llvm.org/D87809
-
Rahul Joshi authored
Remove a spurious semicolon and make the comparison object invokable as const. Differential Revision: https://reviews.llvm.org/D87872
-
Sean Silva authored
This op is a catch-all for creating witnesses from various random kinds of constraints. In particular, when dealing with extents directly, which are of `index` type, one can use std ops to calculate the predicates and then use cstr_require for the final conversion to a witness. Differential Revision: https://reviews.llvm.org/D87871
-
Vedant Kumar authored
Previously, there was a little ambiguity about whether IsInlined should return true for an inlined lexical block, since technically the lexical block would not represent an inlined function (it'd just be contained within one). Edit suggested by Jim Ingham.
-
Amara Emerson authored
-
Amara Emerson authored
-
Amy Kwan authored
This patch adds the instruction definitions and assembly/disassembly tests for the set boolean condition instructions. This also includes the negative, and reverse variants of the instruction. Differential Revision: https://reviews.llvm.org/D86252
-
Amy Kwan authored
This patch implements the vec_cntm function prototypes in altivec.h in order to utilize the vector count mask bits instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82726
-
Philip Reames authored
-
Rahul Joshi authored
- Change the OpClass new-method addition to find and eliminate any existing methods made redundant by the newly added method, and to detect if the newly added method would itself be redundant, returning nullptr in that case.
- To facilitate that, add the notion of resolved and unresolved parameters, where resolved parameters have every parameter type known, so that redundancy checks on methods with the same name but different parameter types can be done.
- Eliminate the existing code that avoided adding conflicting/redundant build methods and rely on this new mechanism to eliminate conflicting build methods.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47095 Differential Revision: https://reviews.llvm.org/D87059
-
Zhaoshi Zheng authored
Currently we assume x18 is used as the pointer to the shadow call stack. Users shall pass the flags "-fsanitize=shadow-call-stack -ffixed-x18". Runtime support is needed to set up x18. If SCS is desired, all parts of the program should be built with -ffixed-x18 to maintain interoperability. There's no particular reason that we must use x18 as the SCS pointer; any register may be used, as long as it does not already have a designated purpose, like RA or passing call arguments. Differential Revision: https://reviews.llvm.org/D84414
-
Philip Reames authored
This change enables the generic implicit null check transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support: an implicit null check uses a signal handler to catch a null pointer dereference and redirect control to a handler. Specifically, it replaces an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control with appropriate metadata. FAULTING_OP is used to wrap the faulting instruction. It is modelled as a conditional branch to reflect the fact that it can transfer control in the CFG. FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.) When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction. As can be seen in the test changes, the AArch64 backend currently does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causes BranchFolding to crash. I haven't yet figured out whether it's a latent bug in BranchFolding or something I'm doing wrong.) Differential Revision: https://reviews.llvm.org/D87851
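For readers who have not seen the technique, here is a heavily simplified source-level sketch of the idea: the explicit null test disappears and the fault itself transfers control. This only illustrates the signal-handler mechanism (using POSIX APIs and a deliberately faulting load, which is undefined behavior at the C++ level); the real transformation happens on machine IR via FAULTING_OP, not on source code.

```cpp
#include <csetjmp>
#include <csignal>
#include <cstdio>

static sigjmp_buf RecoveryPoint;

// Stand-in for the runtime's implicit-null-check handler: instead of
// crashing, redirect control to the cold "null" path.
static void SegvHandler(int) { siglongjmp(RecoveryPoint, 1); }

int loadOrDefault(const volatile int *P, int Default) {
  // savesigs=1 restores the signal mask on longjmp, so a later fault can be
  // caught again.
  if (sigsetjmp(RecoveryPoint, /*savesigs=*/1))
    return Default;  // reached only if the load below faulted
  return *P;         // the "faulting op": no explicit null check is emitted
}

int main() {
  struct sigaction SA = {};
  SA.sa_handler = SegvHandler;
  sigemptyset(&SA.sa_mask);
  sigaction(SIGSEGV, &SA, nullptr);
  int X = 42;
  std::printf("%d %d\n", loadOrDefault(&X, -1), loadOrDefault(nullptr, -1));
}
```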
-
Arthur Eubanks authored
I believe the intention of this test, added in https://reviews.llvm.org/D71687, was to test LoopFullUnrollPass with clang's -fno-unroll-loops, not its interaction with optnone. Loop unrolling passes don't run under optnone/-O0. Also added back the -disable-loop-unrolling option unintentionally removed in https://reviews.llvm.org/D85578. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D86485
-
Quentin Colombet authored
Before this patch, the last chance recoloring and deferred spilling techniques were solely controlled by command line options. This patch adds target hooks for these two techniques so that it is easier for backend writers to override the default behavior. The default behavior of the hooks preserves the default values of the related command line options. NFC
-
Zhaoshi Zheng authored
-
- Sep 17, 2020
-
-
Derek Schuff authored
Initial support for dwarf fission sections (-gsplit-dwarf) on wasm. The most interesting change is support for writing 2 files (.o and .dwo) in the wasm object writer. My approach moves object-writing logic into its own function and calls it twice, swapping out the endian::Writer (W) in between calls. It also splits the import-preparation step into its own function (and skips it when writing a dwo). Differential Revision: https://reviews.llvm.org/D85685
-
Florian Hahn authored
I think we need to be even more conservative when traversing memory phis, to make sure we catch any loop carried dependences. This approach updates fillInCurrentPair to use unknown sizes for locations when we walk over a phi, unless the location is guaranteed to be loop-invariant for any possible loop. Using an unknown size for locations should ensure we catch all memory accesses to locations after the given memory location, which includes loop-carried dependences. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87778
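As a reminder of what can go wrong, a tiny example (illustrative, not from the patch) of the kind of loop-carried dependence that walking over a memory phi with a too-precise location could miss:

```cpp
// The store to A[I] in one iteration writes the location read as A[I - 1] in
// the next iteration: the accesses never alias within a single iteration,
// but there is a loop-carried dependence across the backedge (memory phi).
void accumulate(int *A, int N) {
  for (int I = 1; I < N; ++I)
    A[I] = A[I - 1] + 1;
}
```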
-
Arthur Eubanks authored
-
Alexander Shaposhnikov authored
-
Nikita Popov authored
Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic. To be conservative, the one-use check on the comparison is retained; this may be relaxed if all goes well. It's pretty likely that this will uncover places that are missing handling for the abs() intrinsic. Please report any observed performance regressions. Differential Revision: https://reviews.llvm.org/D87188
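For illustration, this is the kind of source pattern affected (the fold happens in the optimizer on the IR compare/select pattern, not on C++ source):

```cpp
// select(icmp slt X, 0, sub 0, X, X) is the SPF_ABS pattern; after this
// change it is canonicalized to a call to the llvm.abs intrinsic.
int absPattern(int X) {
  return X < 0 ? -X : X;
}
```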
-
Whitney Tsang authored
Allow unroll and jam of loops forced by the user. LoopUnrollAndJamPass is still disabled by default in the NPM pipeline, and can be controlled by -enable-npm-unroll-and-jam. Reviewed By: Meinersbur, dmgreen Differential Revision: https://reviews.llvm.org/D87786
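For context, a user can force unroll-and-jam on a loop nest with clang's loop pragma; with this change the new-pass-manager pipeline honors such requests when the pass is enabled. The loop body below is only an illustration:

```cpp
// The pragma attaches unroll-and-jam loop metadata to the outer loop, which
// LoopUnrollAndJamPass acts on when forced by the user.
void addRows(float *A, const float *B, int N, int M) {
#pragma clang loop unroll_and_jam(enable)
  for (int I = 0; I < N; ++I)
    for (int J = 0; J < M; ++J)
      A[I * M + J] += B[J];
}
```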
-
Nikita Popov authored
We already use the fact that assume(X) implies X==true; do the same for assume(!X) implying X==false. This fixes PR47496.
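A minimal illustration using clang's __builtin_assume (the helper functions are hypothetical; the fold is performed by the optimizer):

```cpp
int cheap() { return 0; }
int expensive() { return 1; }

int pick(bool X) {
  __builtin_assume(!X);              // assume(!X) now implies X == false
  return X ? expensive() : cheap();  // expected to fold to `return cheap();`
}
```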
-
Nikita Popov authored
The other assume tests seem to be dealing with equalities in particular. Test implication for the condition itself, especially the negated case from PR47496.
-
Florian Hahn authored
This adds test cases for PR40961 and PR47247. They illustrate cases in which the max backedge-taken count can be improved by information from the loop guards.
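A sketch of the shape of code involved (illustrative, not the actual test cases): the dominating guard, rather than the range of N alone, is what bounds the backedge-taken count.

```cpp
// With the guard, the maximum backedge-taken count is at most 7 even though
// N's type allows much larger values.
void guarded(int *A, unsigned N) {
  if (N < 8)
    for (unsigned I = 0; I < N; ++I)
      A[I] = 0;
}
```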
-
Victor Huang authored
This is a follow up patch for https://reviews.llvm.org/D63676 to enable the feature when using pgo. Differential Revision: https://reviews.llvm.org/D85240
-
Vitaly Buka authored
X86 can use xmm registers for pointer operations, e.g. for std::swap. I don't know yet if this is possible on other platforms. NT_X86_XSTATE includes all registers from NT_FPREGSET, so the latter is used only if the former is not available. I am not sure how reasonable it is to expect that, but LLDB has such a fallback in NativeRegisterContextLinux_x86_64::ReadFPR. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D87754
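A trivial example of the kind of code meant (illustrative): the two pointer values can be shuffled through an xmm register on x86-64, so a scan that only looks at general-purpose registers could miss them.

```cpp
#include <utility>

// On x86-64 the compiler may implement this swap with SSE (xmm) moves,
// leaving pointer values live in vector registers.
void swapPointers(void *&A, void *&B) { std::swap(A, B); }
```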
-
Jon Roelofs authored
-
LLVM GN Syncbot authored
-
Andrew Litteken authored
This introduces the IRInstructionMapper, and the associated wrapper for instructions, IRInstructionData, that maps IR-level Instructions to unsigned integers. Mapping is done mainly by using the "isSameOperationAs" comparison between two instructions. If it returns true, the opcode, result type, and operand types of the instruction are used to hash the instruction to an unsigned integer. The mapper accepts instruction ranges, and adds each resulting integer to a list and each wrapped instruction to a separate list. At present, branches and phi nodes are not mapped, and exception handling is illegal. Debug instructions are not considered. The different mapping schemes are tested in unittests/Analysis/IRSimilarityIdentifierTest.cpp Recommit of: b04c1a9d Differential Revision: https://reviews.llvm.org/D86968
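A very rough sketch of the mapping idea described above (simplified types and names, not the LLVM API): instructions that perform the same operation share a key built from opcode and types, and each distinct key is assigned the next unsigned integer.

```cpp
#include <map>
#include <tuple>
#include <vector>

// Simplified stand-in for an instruction's identity: opcode plus result and
// operand type identifiers.
struct InstKey {
  unsigned Opcode;
  unsigned ResultTypeId;
  std::vector<unsigned> OperandTypeIds;
  bool operator<(const InstKey &O) const {
    return std::tie(Opcode, ResultTypeId, OperandTypeIds) <
           std::tie(O.Opcode, O.ResultTypeId, O.OperandTypeIds);
  }
};

// Assigns the same unsigned integer to keys that compare equal, and a fresh
// integer to each new key, mirroring the "map instructions to unsigneds" idea.
class SimpleInstructionMapper {
  std::map<InstKey, unsigned> LegalNumbers;

public:
  unsigned mapToUnsigned(const InstKey &K) {
    auto It = LegalNumbers.try_emplace(K, LegalNumbers.size() + 1).first;
    return It->second;
  }
};
```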
-
Cameron McInally authored
Map fixed length VSELECT to its Scalable equivalent. Differential Revision: https://reviews.llvm.org/D85364
-
Reid Kleckner authored
Extending the lifetime of these type index mappings does increase memory usage (+2% in my case), but it decouples type merging from symbol merging. This is a prerequisite for two changes that I have in mind:
- parallel type merging: speeds up slow type merging
- deferred symbol merging: avoids heap allocating (relocating) all symbols
This eliminates CVIndexMap and moves its data into TpiSource. The maps are also split into a SmallVector and ArrayRef component, so that the ipiMap can alias the tpiMap for /Z7 object files, and so that both maps can simply alias the PDB type server maps for /Zi files. Splitting TypeServerSource establishes that all input types to be merged can be identified with two 32-bit indices:
- the index of the TpiSource object
- the type index of the record
This is useful because this information can be stored in a single 64-bit atomic word to enable concurrent hashtable insertion. One last change is that now all object files with debugChunks get a TpiSource, even if they have no type info. This avoids some null checks and special cases. Differential Revision: https://reviews.llvm.org/D87736
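A sketch of the "(source index, type index) fits in one 64-bit word" observation (names are illustrative, not LLD's):

```cpp
#include <cstdint>

struct GlobalTypeId {
  uint32_t SourceIdx;  // index of the TpiSource object
  uint32_t TypeIdx;    // type index of the record within that source
};

// Because the pair packs into a single 64-bit word, it can be published with
// one atomic store during concurrent hash table insertion.
inline uint64_t pack(GlobalTypeId Id) {
  return (uint64_t(Id.SourceIdx) << 32) | Id.TypeIdx;
}

inline GlobalTypeId unpack(uint64_t Word) {
  return {static_cast<uint32_t>(Word >> 32), static_cast<uint32_t>(Word)};
}
```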
-
Amara Emerson authored
To avoid unnecessarily promoting the source vector beyond our native vector size of 128b, I've added some cascading rules to widen based on the number of elements.
-
Amara Emerson authored
-