Commits · 3333244d77c44e8bb5af57027646596f7714ff62 · Lorenzo Albano / LLVM bpEVL

Jan 26, 2021

Support: Remove duplicated code in {File,clang::ModulesDependency}Collector, NFC · 080952a9

Duncan P. N. Exon Smith authored Jan 22, 2021

Refactor the duplicated canonicalize-path logic in `FileCollector` and
`ModulesDependencyCollector` into a new utility called
`PathCanonicalizer` that's shared. This popped up when tracking down a
bug common to both in https://reviews.llvm.org/D95202.

As drive-bys, update a few names and comments to better reflect the
effect of the code, delay removal of `..`s to avoid an unnecessary extra
string copy, and leave behind a couple of FIXMEs for future
consideration.

Differential Revision: https://reviews.llvm.org/D95279

080952a9

Jan 25, 2021

[LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943) · 835104a1

Nikita Popov authored Jan 23, 2021

When LSR converts a branch on the pre-inc IV into a branch on the
post-inc IV, the nowrap flags on the addition may no longer be valid.
Previously, a poison result of the addition might have been ignored,
in which case the program was well defined. After branching on the
post-inc IV, we might be branching on poison, which is undefined behavior.

Fix this by discarding nowrap flags which are not present on the SCEV
expression. Nowrap flags on the SCEV expression are proven by SCEV
to always hold, independently of how the expression will be used.
This is essentially the same fix we applied to IndVars LFTR, which
also performs this kind of pre-inc to post-inc conversion.
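
The flag-dropping rule above can be modelled as a plain bitmask intersection. This is only a sketch: the enum and the function name `flagsSafeForPostInc` are hypothetical and are not the names used in LSR or SCEV.

```cpp
#include <cstdint>

// Hypothetical flag bits, loosely modelled on the nuw/nsw flags of an add.
enum NoWrapFlags : std::uint8_t {
  FlagNone = 0,
  FlagNUW = 1 << 0, // no unsigned wrap
  FlagNSW = 1 << 1, // no signed wrap
};

// Keep a flag on the post-inc add only if it is both present on the
// instruction and proven for the SCEV expression; everything else is dropped.
constexpr std::uint8_t flagsSafeForPostInc(std::uint8_t InstFlags,
                                           std::uint8_t ScevFlags) {
  return static_cast<std::uint8_t>(InstFlags & ScevFlags);
}
```

With `nuw nsw` on the instruction and only `nsw` proven by SCEV, the result keeps `nsw` alone; with nothing proven, both flags are dropped.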

I believe a similar problem can also exist for getelementptr inbounds,
but I was not able to come up with a problematic test case. The
inbounds case would have to be addressed differently anyway
(as SCEV does not track this property).

Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.

Differential Revision: https://reviews.llvm.org/D95286

835104a1

[RISCV] Add RVV insertelt/extractelt scalable-vector patterns · 15141cd1

Fraser Cormack authored Jan 11, 2021



Original patch by @rogfer01.

This patch adds support for insertelt and extractelt operations on
scalable vectors.

Special care must be taken on RV32 when dealing with i64 vectors as
there are no straightforward ways to insert a 64-bit element without a
register of that size. To that end, both are custom-lowered to different
sequences.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94615

15141cd1

Recommit "[AArch64][GlobalISel] Implement widenScalar for signed overflow" · aa8f3677

Cassie Jones authored Jan 25, 2021

Implement widening for G_SADDO and G_SSUBO.
Add legalize-add/sub tests for narrow overflowing add/sub on AArch64.
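
As an illustration of what widening a narrow signed overflow operation involves, here is a sketch for an 8-bit add computed in 32 bits. It shows one common way to recover the overflow bit and is not taken from the legalizer; the function name `saddoViaWidening` is hypothetical.

```cpp
#include <cstdint>
#include <utility>

// Returns {8-bit result, overflow bit} for a signed 8-bit add that is
// carried out in a wider 32-bit type.
std::pair<std::int8_t, bool> saddoViaWidening(std::int8_t A, std::int8_t B) {
  // Sign-extend both inputs; the 32-bit add itself cannot overflow.
  std::int32_t Wide = std::int32_t{A} + std::int32_t{B};
  // Truncate back to 8 bits (wraps modulo 2^8 on mainstream compilers,
  // and by definition since C++20).
  auto Narrow = static_cast<std::int8_t>(Wide);
  // If sign-extending the truncated value does not give back the wide sum,
  // the narrow operation overflowed.
  bool Overflow = std::int32_t{Narrow} != Wide;
  return {Narrow, Overflow};
}
```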

Differential Revision: https://reviews.llvm.org/D95034

aa8f3677

Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV" · 925ae8c7
Richard Smith authored Jan 25, 2021
```
This reverts commit 53176c16, which
introduced a layering violation. LLVM's IR library can't include
headers from Analysis.
```
925ae8c7

[YAML I/O] Fix bug in emission of empty sequence · f50b8ee7

Jonas Devlieghere authored Jan 25, 2021

Don't emit an output dash for an empty sequence. Take emitting a vector
of strings for example:

  std::vector<std::string> Strings = {"foo", "bar"};
  LLVM_YAML_IS_SEQUENCE_VECTOR(std::string)
  yout << Strings;

This emits the following YAML document.

  ---
  - foo
  - bar
  ...

When the vector is empty, this generates the following result:

  ---
  - []
  ...

Although this is valid YAML, it does not match what we meant to emit.
The result is a one-element sequence consisting of an empty list.
Indeed, if we were to try to read this again we get an error:

  YAML:2:4: error: not a mapping
  - []

The problem is the output dash before the empty list. The correct output
would be:

  ---
  []
  ...

This patch fixes that by not emitting the output dash for an empty
sequence.
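
The intended behavior can be sketched with a small stand-alone emitter. This is not the `llvm::yaml` API; `emitSequenceDocument` is a hypothetical helper that only mirrors the outputs shown above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the emitter: writes one YAML document that
// contains a single top-level sequence of strings.
std::string emitSequenceDocument(const std::vector<std::string> &Items) {
  std::string Out = "---\n";
  if (Items.empty()) {
    // Empty sequence: flow-style "[]" with no leading dash.
    Out += "[]\n";
  } else {
    // Non-empty sequence: one "- item" line per element.
    for (const std::string &Item : Items)
      Out += "- " + Item + "\n";
  }
  Out += "...\n";
  return Out;
}
```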

Differential revision: https://reviews.llvm.org/D95280

f50b8ee7

Revert "[lit] Use os.cpu_count() to cleanup TODO" · db1a7089

Julian Lettner authored Jan 25, 2021

A bot owner contacted me.  I will re-land after confirming that this
doesn't break anyone (since it's low priority).

This reverts commit 9946b169.

db1a7089

Revert "[IndirectFunctions] Skip propagating attributes to address taken functions" · 2cdb34ef

Konstantin Zhuravlyov authored Jan 25, 2021

This reverts commit dd8ae426.

This commit causes an infinite loop when compiling rocThrust and hipCUB.

Differential Revision: https://reviews.llvm.org/D95389

2cdb34ef

[gn build] Port e123cd67 · 12b34ffc
LLVM GN Syncbot authored Jan 25, 2021

12b34ffc

[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV · 53176c16

Akira Hatanaka authored Jan 25, 2021

or claimRV calls in the IR

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end annotates calls with attribute "clang.arc.rv"="retain"
  or "clang.arc.rv"="claim", which indicates the call is implicitly
  followed by a marker instruction and a retainRV/claimRV call that
  consumes the call result. This is currently done only when the target
  is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the
  annotated calls in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the annotated
  calls. It doesn't remove the attribute on the call since the backend
  needs it to emit the marker instruction. The retainRV/claimRV calls
  are emitted late in the pipeline to prevent optimization passes from
  transforming the IR in a way that makes it harder for the ARC
  middle-end passes to figure out the def-use relationship between the
  call and the retainRV/claimRV calls (which is the cause of PR31925).

- The function inliner removes the autoreleaseRV call in the callee that
  returns the result if nothing in the callee prevents it from being
  paired up with the calls annotated with "clang.arc.rv"="retain/claim"
  in the caller. If the call is annotated with "claim", a release call
  is inserted since autoreleaseRV+claimRV is equivalent to a release. If
  it cannot find an autoreleaseRV call, it tries to transfer the
  attributes to a function call in the callee. This is important since
  ARC optimizer can remove the autoreleaseRV call returning the callee
  result, which makes it impossible to pair it up with the retainRV or
  claimRV call in the caller. If that fails, it simply emits a retain
  call in the IR if the call is annotated with "retain" and does nothing
  if it's annotated with "claim".

- This patch teaches dead argument elimination pass not to change the
  return type of a function if any of the calls to the function are
  annotated with attribute "clang.arc.rv". This is necessary since the
  pass can incorrectly determine nothing in the IR uses the function
  return, which can happen since the front-end no longer explicitly
  emits retainRV/claimRV calls in the IR, and change its return type to
  'void'.

Future work:

- Use the attribute on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the attributes.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808

53176c16

[lit] Use os.cpu_count() to cleanup TODO · 9946b169

Julian Lettner authored Jan 14, 2021

We can now use Python3.  Let's use `os.cpu_count()` to clean up this
helper.

Differential Revision: https://reviews.llvm.org/D94734

9946b169

[VPlan] Replace uses with new value in VPInstructionsToVPRecipe (NFC). · 76afbf60

Florian Hahn authored Jan 25, 2021

Now that VPRecipeBase inherits from VPDef, we can always use the new
VPValue for replacement, if the recipe defines one. Given the recipes
that are supported at the moment, all new recipes must have either 0 or
1 defined values.

76afbf60

[GVN] do not repeat PRE on failure to split critical edge · d3681289

Nick Desaulniers authored Jan 25, 2021

Fixes an infinite loop encountered in GVN.

GVN will delay PRE if it encounters critical edges, attempt to split
them later via calls to SplitCriticalEdge(), then restart.

The caller of GVN::splitCriticalEdges() assumed a return value of true
meant that critical edges were split, that the IR had changed, and that
PRE should be re-attempted, upon which we loop infinitely.

This was exposed after D88438, by compiling the Linux kernel for s390,
but the test case is reproducible on x86.
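
A toy model of the retry loop may make the failure mode clearer. The types and functions below are hypothetical and greatly simplified; the assumption is that the split helper reports success only when at least one edge was actually split, so an unsplittable edge cannot trigger another round.

```cpp
#include <vector>

// Hypothetical model of a critical edge that PRE wants split.
struct CriticalEdge {
  bool Splittable; // false models an edge that cannot be split
  bool Split;      // set once the edge has been split
};

// Returns true only if at least one queued edge was actually split.
bool splitQueuedEdges(std::vector<CriticalEdge *> &Queue) {
  bool Changed = false;
  for (CriticalEdge *E : Queue) {
    if (E->Splittable) {
      E->Split = true;
      Changed = true;
    }
  }
  Queue.clear();
  return Changed;
}

// Runs "PRE" until splitting stops making progress; returns the round count.
int countPRERounds(std::vector<CriticalEdge> Edges) {
  int Rounds = 0;
  std::vector<CriticalEdge *> Queue;
  do {
    ++Rounds;
    // Each round re-queues every critical edge that is still unsplit.
    for (CriticalEdge &E : Edges)
      if (!E.Split)
        Queue.push_back(&E);
  } while (splitQueuedEdges(Queue));
  return Rounds;
}
```

If `splitQueuedEdges` instead returned true whenever the queue was non-empty, a single unsplittable edge would be re-queued forever.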

Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D94996

d3681289

[RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw. · 239cfbcc

Craig Topper authored Jan 25, 2021

This makes our i8/i16 codegen more similar to the i32 codegen.

I've also added computeKnownBits support for DIVUW/REMUW so
that we can remove zero extending ANDs from the output. Without
this we end up turning DIVUW/REMUW back into DIVU/REMU via some
isel patterns.
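
The reason a zero-extending AND is redundant after an unsigned divide of zero-extended i8 inputs can be checked exhaustively. This is a sketch of the arithmetic fact only, not of the computeKnownBits code; the function names are hypothetical and division by zero is excluded.

```cpp
#include <cstdint>

// Unsigned i8 division performed in 32 bits, as a stand-in for a 32-bit
// unsigned divide on zero-extended inputs. B must be non-zero.
std::uint32_t udiv8Wide(std::uint8_t A, std::uint8_t B) {
  return std::uint32_t{A} / std::uint32_t{B};
}

// Exhaustively checks that the wide quotient never has bits set above
// bit 7, i.e. that a trailing `& 0xFF` would be redundant.
bool wideQuotientFitsInI8() {
  for (unsigned A = 0; A <= 0xFF; ++A)
    for (unsigned B = 1; B <= 0xFF; ++B)
      if (udiv8Wide(static_cast<std::uint8_t>(A),
                    static_cast<std::uint8_t>(B)) > 0xFF)
        return false;
  return true;
}
```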

Reviewed By: frasercrmck, luismarques

Differential Revision: https://reviews.llvm.org/D95322

239cfbcc

[Win64] Ensure all stack frames are 8 byte aligned · 988a5334

Reid Kleckner authored Jan 25, 2021

The unwind info format requires that all adjustments are 8 byte aligned,
and the bottom three bits are masked out. Most Win64 calling conventions
have 32 bytes of shadow stack space for spilling parameters, and I
believe that constructing these fixed stack objects had the side effect
of ensuring an alignment of 8. However, the Intel regcall convention
does not have this shadow space, so when using that convention, it was
possible to make a 4 byte stack frame, which was impossible to describe
with unwind info.
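
The alignment requirement amounts to rounding the frame size up to the next multiple of 8. A minimal sketch, assuming a power-of-two alignment; `alignUp` is a hypothetical helper and not the code used in the patch.

```cpp
#include <cstdint>

// Rounds Size up to the next multiple of Align.
// Align must be a non-zero power of two.
constexpr std::uint64_t alignUp(std::uint64_t Size, std::uint64_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}
```

Under this rule a 4 byte frame becomes an 8 byte frame, whose low three bits are zero as the unwind info encoding expects.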

Fixes pr48867

988a5334

[SampleFDO] Report error when reading a bad/incompatible profile instead of · c9cd9a00

Wei Mi authored Jan 22, 2021

turning off SampleFDO silently.

Currently the sample loader pass turns off SampleFDO optimization silently when
it sees an error while reading the profile. This behavior defeats the tests
that could have caught those bad/incompatible profile problems. This patch
changes the behavior to report an error.

Differential Revision: https://reviews.llvm.org/D95269

c9cd9a00

[PowerPC] Add missing negate for VPERMXOR on little endian subtargets · 1150bfa6

Nemanja Ivanovic authored Jan 25, 2021

This intrinsic is supposed to have the permute control vector complemented on
little endian systems (as the ABI specifies and GCC implements). With the
current code gen, the result vector is byte-reversed.

Differential revision: https://reviews.llvm.org/D95004

1150bfa6

[ARM] Use half directly for args/return types in test. NFC · 9390b85a

David Green authored Jan 25, 2021

Until fairly recently the calling convention for IR half was not handled
correctly in the ARM backend, meaning we needed to pass pointers that
were loaded/stored. Now that that is fixed we can switch to using the
type directly instead.

9390b85a

[RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64. · 4eb4f896

Craig Topper authored Jan 25, 2021

As far as I know, 32-bit arguments and returns on RV64 are always
sign extended to i64. So I think we should be taking this into
account around libcalls.
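
For reference, this is what sign-extending a 32-bit value into a 64-bit register looks like at the bit level. The helper name `signExtend32` is hypothetical; the sketch uses only unsigned arithmetic so that it is well defined in every C++ standard.

```cpp
#include <cstdint>

// Sign-extends the low 32 bits of V into a 64-bit value: bit 31 is copied
// into bits 32..63.
constexpr std::uint64_t signExtend32(std::uint32_t V) {
  return (std::uint64_t{V} ^ 0x80000000ull) - 0x80000000ull;
}
```

A zero-extended `0xFFFFFFFF` would stay `0x00000000FFFFFFFF`, whereas the sign-extended form is `0xFFFFFFFFFFFFFFFF`.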

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95285

4eb4f896

Revert "Fix unused variable in CoroFrame.cpp when building Release with GCC 10" · 17c3538a
Xun Li authored Jan 25, 2021
```
This reverts commit ff5e8964.
```
17c3538a

[AMDGPU][MC] Improved errors handling for SDWA operands · 558b3bbb
Dmitry Preobrazhensky authored Jan 25, 2021
```
Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D95212
```
558b3bbb

Revert "[JITLink] Enable exception handling for ELF." · f8078259
Nico Weber authored Jan 25, 2021
```
This reverts commit 6884fbc2.
Breaks tests on Windows: http://45.33.8.238/win/31981/step_11.txt
```
f8078259

[Verifier] disable llvm.experimental.noalias.scope.decl dominance check. · 3b5d36ec

Jeroen Dobbelaere authored Jan 25, 2021

This was enabled in https://reviews.llvm.org/D95335 but it breaks the stage2 fuchsia build
(See http://lab.llvm.org:8011/#/builders/98/builds/4105/steps/9/logs/stdio)

3b5d36ec

[X86][AVX] Generalize vperm2f128/vperm2i128 patterns to support all legal 256-bit vector types · 13f2aee7
Simon Pilgrim authored Jan 25, 2021
```
Remove bitcasts to/from v4x64 types through vperm2f128/vperm2i128 ops to help improve shuffle combining and demanded vector elts folding.
```
13f2aee7

[Verifier] enable and limit llvm.experimental.noalias.scope.decl dominance checking · 6e530a3d

Jeroen Dobbelaere authored Jan 25, 2021

Checking the llvm.experimental.noalias.scope.decl dominance can be worst-case O(N^2).
Limit the dominance check to N=32.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D95335

6e530a3d

[Doc][NFC] Fix Kaleidoscope links, typos and add blog posts for MCJIT · 3546b372
xgupta authored Jan 25, 2021

3546b372

[VPlan] Handle scalarized values in VPTransformState. · 3201274d

Florian Hahn authored Jan 25, 2021

This patch adds plumbing to handle scalarized values directly in
VPTransformState.

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D92282

3201274d

[NFC] Fix title comment typo and provide description for LLJIT example. · 7163aa99
xgupta authored Jan 25, 2021

7163aa99

[X86][AVX] combineX86ShuffleChainWithExtract - widen to at least original root size. NFCI. · 821a51a9

Simon Pilgrim authored Jan 25, 2021

We're relying on the source inputs for shuffle combining having already been widened to the root size (otherwise the offset logic falls over). We're going to be supporting different sized shuffle inputs soon, so we need to explicitly make the minimum widened width the original root size.

821a51a9

Revert "[SystemZ][z/OS] Fix No such file or directory expression error" · 978444d5
Abhina Sreeskantharajan authored Jan 25, 2021
```
This reverts commit 06f8a496.
```
978444d5

Revert "[SystemZ][z/OS] Fix No such file or directory expression error... · 84851a27

Abhina Sreeskantharajan authored Jan 25, 2021

Revert "[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued"

This reverts commit 520b5ecf.

84851a27

[InstCombine] narrow min/max intrinsics with extended inputs · 09a136bc

Sanjay Patel authored Jan 24, 2021

We can sink extends after min/max if they match and would
not change the sign-interpreted compare. The only combo
that doesn't work is zext+smin/smax because the zexts
could change a negative number into positive:
https://alive2.llvm.org/ce/z/D6sz6J

Sext+umax/umin works:

  define i32 @src(i8 %x, i8 %y) {
  %0:
    %sx = sext i8 %x to i32
    %sy = sext i8 %y to i32
    %m = umax i32 %sx, %sy
    ret i32 %m
  }
  =>
  define i32 @tgt(i8 %x, i8 %y) {
  %0:
    %m = umax i8 %x, %y
    %r = sext i8 %m to i32
    ret i32 %r
  }
  Transformation seems to be correct!
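
The same two claims can be cross-checked by brute force over all i8 pairs. This sketch is independent of InstCombine and Alive2; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Sign-extends an 8-bit pattern to 32 bits using only unsigned arithmetic.
constexpr std::uint32_t sext8(std::uint8_t V) {
  return (std::uint32_t{V} ^ 0x80u) - 0x80u;
}

// Is umax(sext x, sext y) == sext(umax(x, y)) for every pair of i8 patterns?
bool sextUmaxNarrows() {
  for (unsigned X = 0; X <= 0xFF; ++X)
    for (unsigned Y = 0; Y <= 0xFF; ++Y) {
      auto NX = static_cast<std::uint8_t>(X), NY = static_cast<std::uint8_t>(Y);
      if (std::max(sext8(NX), sext8(NY)) != sext8(std::max(NX, NY)))
        return false;
    }
  return true;
}

// Is smax(zext x, zext y) == zext(smax(x, y)) for every pair of i8 values?
bool zextSmaxNarrows() {
  for (int X = -128; X <= 127; ++X)
    for (int Y = -128; Y <= 127; ++Y) {
      // ZX/ZY are the zero-extended values; X/Y are the signed i8 values.
      int ZX = static_cast<std::uint8_t>(X), ZY = static_cast<std::uint8_t>(Y);
      if (std::max(ZX, ZY) != static_cast<std::uint8_t>(std::max(X, Y)))
        return false;
    }
  return true;
}
```

`sextUmaxNarrows()` holds for all inputs, while `zextSmaxNarrows()` fails as soon as a negative value meets a non-negative one (for example -128 and 0).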

09a136bc

[InstCombine] add tests for min/max intrinsics with extended values; NFC · 07b60d00
Sanjay Patel authored Jan 24, 2021

07b60d00

[SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost. · 171d1248

Sander de Smalen authored Jan 20, 2021

This change also changes getReductionCost to return InstructionCost,
and it simplifies two expressions by removing a redundant 'isValid' check.

171d1248

[X86][AVX] LowerTRUNCATE - avoid bitcasts around extract_subvectors. · 1b780cf3

Simon Pilgrim authored Jan 25, 2021

We allow extract_subvector lowering of all legal types, so pre-bitcast the source type to try and reduce bitcast pollution.

1b780cf3

[X86][AVX] combineX86ShuffleChain - avoid bitcasts around insert_subvector() shuffle patterns. · f461e35c

Simon Pilgrim authored Jan 25, 2021

We allow insert_subvector lowering of all legal types, so don't always cast to the vXi64/vXf64 shuffle types - this is only necessary for X86ISD::SHUF128/X86ISD::VPERM2X128 patterns later.

f461e35c

[TableGen] RuleMatcher::defineComplexSubOperand avoid std::string copy. NFCI. · 9641bd0f

Simon Pilgrim authored Jan 23, 2021

Use const reference to avoid std::string copy - according to the style guide we shouldn't be using auto anyway.

Fixes MSVC analyzer warning.

9641bd0f

[InstructionCost] Prevent InstructionCost being created with CostState. · d196f9e2

Sander de Smalen authored Jan 21, 2021

For a function that returns InstructionCost, it is very tempting to write:

  return InstructionCost::Invalid;

But that actually returns InstructionCost(1 /* int value of Invalid */)
which has a totally different meaning. By marking this constructor as
`delete`, this can no longer happen.
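
A reduced model of the pitfall and the fix, using a hypothetical `Cost` class rather than the real `InstructionCost`. The member names are assumptions made for the sketch.

```cpp
#include <type_traits>

// Hypothetical reduced model of a cost type with a validity state.
class Cost {
public:
  enum CostState { Valid, Invalid };

  Cost(int Value) : Value(Value), State(Valid) {}

  // Without this deleted overload, `return Cost::Invalid;` would convert
  // the enumerator to int and build a *valid* cost of 1.
  Cost(CostState) = delete;

  // Explicit way to obtain an invalid cost.
  static Cost getInvalid() {
    Cost C(0);
    C.State = Invalid;
    return C;
  }

  bool isValid() const { return State == Valid; }
  int getValue() const { return Value; }

private:
  int Value;
  CostState State;
};

// Construction from the enum is rejected at compile time; int still works.
static_assert(!std::is_constructible<Cost, Cost::CostState>::value,
              "Cost must not be constructible from CostState");
static_assert(std::is_constructible<Cost, int>::value,
              "Cost must stay constructible from int");
```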

d196f9e2

[SelectionDAG] Support scalable-vector splats in more cases · fde24661

Fraser Cormack authored Jan 08, 2021

This patch adds support for scalable-vector splats in DAGCombiner's
`isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions,
which enable the SelectionDAG div/rem-by-constant optimizations for
scalable vector types.

It also fixes up one case where the UDIV optimization was generating a
SETCC without first consulting the target for its preferred SETCC result
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94501

fde24661

[llvm-dwp] Automatically set the target triple · da489946

Philip Pfaffe authored Jan 25, 2021

The llvm-dwp tool hard-codes the target triple to x86. Instead, deduce the
target triple from the object files being read.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D93749

da489946