Commits · ad60ff70eb56a7d198e613152f9974d5d4baabd4 · Lorenzo Albano / LLVM bpEVL

May 13, 2020

[PowerPC] Exploit VSX neg, abs and nabs for f32 · 8ffe8891

Qiu Chaofan authored May 12, 2020

xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand.

This patch adds the missing patterns since we prefer VSX instructions
when available.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D75344

8ffe8891

[CostModel] Modify BasicTTI getCastInstrCost · 6bbad728

Sam Parker authored May 13, 2020

Fix the assumption that all bitcasts of the same type sizes are free.
We now only assume that bitcasts between ints and ptrs of the same
size are free. This allows TTImpl to just call the concrete
implementation of getCastInstrCost.

Differential Revision: https://reviews.llvm.org/D78918

6bbad728

[PowerPC] Respect SDNodeFlags in lowering SELECT_CC · e9753822

Qiu Chaofan authored May 13, 2020

Legalizer should respect both command-line options or SDNode-level
fast-math flags.

Also, this patch propagates other flags during custom simplifying.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D79074

e9753822

[PowerPC] Use add instead of addReg in ppc-early-ret pass · 782a4dd1

Kang Zhang authored May 13, 2020

Summary:
The ppc-early-ret pass use the addReg() to add operand to the new
instruction, it can't reserve the flag of old operand. This has caused
machine verfications failed.
This patch use add() to instead of addReg().

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D77997

782a4dd1

[cmake] Update creation of object library dependencies for LINK_LIBS PUBLIC · 085234be

Stephen Neuendorffer authored May 12, 2020

We need to avoid declaring dependencies on strings which are valid
LINK_LIBS and not valid targets.  Previously, we used if(TARGET) to
check this condition.  However, if(TARGET) checks whether a target has
been created (in the cmake subdirectory traversal order) and not
whether it *will* be created.  This results in annoying directory
ordering problems.

This patch changes the check to more explicitly eliminate problematic
libraries (namely -lpthread) using a REGEX.

Differential Revision: https://reviews.llvm.org/D79837

085234be

[LoopReroll] Fix rerolling loop with use outside the loop · 272bc25b

KAWASHIMA Takahiro authored May 07, 2020

Fixes PR41696

The loop-reroll pass generates an invalid IR (or its assertion
fails in debug build) if values of the base instruction and
other root instructions (terms used in the loop-reroll pass)
are used outside the loop block. See IRs written in PR41696
as examples.

The current implementation of the loop-reroll pass can reroll
only loops that don't have values that are used outside the
loop, except reduced values (the last values of reduction chains).
This is described in the comment of the `LoopReroll::reroll`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600

This is checked in the `LoopReroll::DAGRootTracker::validate`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393

However, the base instruction and other root instructions skip
this check in the validation loop.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229

Moving the check in front of the skip is the logically simplest
fix. However, inserting the check in an earlier stage is better
in terms of compilation time of unrerollable loops. This fix
inserts the check for the base instruction into the function
to validate possible base/root instructions. Check for other
root instructions is unnecessary because they don't match any
base instructions if they have uses outside the loop.

Differential Revision: https://reviews.llvm.org/D79549

272bc25b

[Attributor][FIX] Stabilize the state of AAReturnedValues each update · af48351c

Johannes Doerfert authored May 12, 2020

For AAReturnedValues we treated new and existing information differently
in the updateImpl. Only the latter was properly analyzed and
categorized. The former was thought to be analyzed in the subsequent
update. Since the Attributor does not support "self-updates" we need to
make sure the state is "stable" after each updateImpl invocation. That
is, if the surrounding information does not change, the state is valid.
Now we make sure all return values have been handled and properly
categorized each iteration. We might not update again if we have not
requested a non-fix attribute so we cannot "wait" for the next update to
analyze a new return value.

Bug reported by @sdmitriev.

af48351c

[ValueTracking] Fix crash in isGuaranteedNotToBeUndefOrPoison when V is in an unreachable block · d3eb51f0

Juneyoung Lee authored May 13, 2020

Summary:
This fixes PR45885 by fixing isGuaranteedNotToBeUndefOrPoison so it does not look into dominating
branch conditions of V when V is an instruction in an unreachable block.

Reviewers: spatel, nikic, lebedev.ri

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79790

d3eb51f0

Add nomerge function attribute to supress tail merge optimization in simplifyCFG · cb22ab74

Zequan Wu authored May 12, 2020

We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
usecase where having this attribute would be beneficial to avoid
alternative work-arounds.

Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.

`nomerge` is different from `noline`. `noinline` prevents function from
inlining at callsites, but `nomerge` prevents multiple identical calls
from being merged into one.

This patch adds `nomerge` to disable the optimization in IR level. A
followup patch will be needed to let backend understands `nomerge` and
avoid tail merge at backend.

Reviewed By: asbirlea, rnk

Differential Revision: https://reviews.llvm.org/D78659

cb22ab74

[AMDGPU] Make v4i64/v4f64/v8i64/v8f64 legal · 71ed66d9

Stanislav Mekhanoshin authored May 12, 2020

We can produce such vectors in the Promote Alloca pass,
but we are unable to use movrel to operate it and lower
via scratch. Making it legal makes SI_INDIRECT patterns
work.

There is more work to do in subsequent changes:

1. We initialize m0 twice to access each dword. It shall
be possible to only do it once and increment base register
number instead.
2. We also need v16i64/v16f64 but these first need to be
added to tablegen.

Differential Revision: https://reviews.llvm.org/D79808

71ed66d9

[YAMLVFSWriter] Fix for delimiters · 759465ee
Jan Korous authored May 12, 2020
```
Differential Revision: https://reviews.llvm.org/D79809
```
759465ee

[x86][CGP] enable target hook to sink funnel shift intrinsic's splatted shift amount · f490ca76

Sanjay Patel authored May 12, 2020

SDAG suffers when it can't see that a funnel operand is a splat value
(due to single-basic-block visibility), so invert the normal loop
hoisting rules to move a splat op closer to its use.

This would be part 1 of an enhancement similar to D63233.

This is needed to re-fix PR37426:
https://bugs.llvm.org/show_bug.cgi?id=37426
...because we got better at canonicalizing IR to funnel shift intrinsics.

The existing CGP code for shift opcodes is likely overstepping what it was
intended to do, so that will be fixed in a follow-up.

Differential Revision: https://reviews.llvm.org/D79718

f490ca76

[GIsel] Update a comment and make it more precise. · a9e85626
Davide Italiano authored May 12, 2020
```
This only covers ANYEXT/ZEXT. SEXT is covered in another test
I just checked in.
```
a9e85626
[GlobalISel] Assign the correct location when combining G_SEXT. · 99d60a1d
Davide Italiano authored May 12, 2020
```
<rdar://problem/62991635>
```
99d60a1d
Fix buildbots #2 after aa1eb515 . · 1c44430e
Alexey Lapshin authored May 13, 2020

1c44430e

PowerPC: Treat llvm.fma.f* intrinsic as using CTR with SPE · 0138cc01

Justin Hibbits authored Apr 18, 2020

Summary:
The SPE doesn't have a 'fma' instruction, so the intrinsic becomes a
libcall.  It really should become an expansion to two instructions, but
for some reason the compiler doesn't think that's as optimal as a
branch.  Since this lowering is done after CTR is allocated for loops,
tell the optimizer that CTR may be used in this case.  This prevents a
"Invalid PPC CTR loop!" assertion in the case that a fma() function call
is used in a C/C++ file, and clang converts it into an intrinsic.

Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D78668

0138cc01

Fix buildbots after aa1eb515 . · 293c6d38
Alexey Lapshin authored May 13, 2020

293c6d38

[SampleFDO] Rename llvm-profdata flag -partial-profile to -gen-partial-profile. · 56926ae0

Wei Mi authored May 12, 2020

The internal flag -partial-profile in llvm conflicts with the flag with
the same name in llvm-profdata. The conflict happens in builds with
LLVM_LINK_LLVM_DYLIB enabled. In this case the tools are linked with libLLVM
and we end up with two definitions for the same cl::opt.

The patch renames llvm-profdata flag -partial-profile to -gen-partial-profile.

56926ae0

May 12, 2020

[VirtualFileSystem] Add unit test that showcases another YAMLVFSWriter bug · 58bc507b

Jonas Devlieghere authored May 12, 2020

This scenario generates another broken YAML mapping as illustrated below.

  {
    'type': 'directory',
    'name': "c",
    'contents': [
      ,
      {
        'type': 'directory',
        'name': "d",
        'contents': [
          ,
          {
            'type': 'directory',
            'name': "e",
            'contents': [
              {
                'type': 'file',
                'name': "f",
                'external-contents': "//root/a/c/d/e/f"
              }                    {
                'type': 'file',
                'name': "g",
                'external-contents': "//root/a/c/d/e/g"
              }
            ]
          }
        ]
      }
    ]
  },

58bc507b

[VirtualFileSystem] Add unit test that showcases YAMLVFSWriter bug · 59ba19c5

Jonas Devlieghere authored May 12, 2020

This scenario generates a broken YAML mapping as illustrated below.

 {
   'type': 'directory',
   'name': "c",
   'contents': [
     {
       'type': 'file',
       'name': "d",
       'external-contents': "//root/a/c/d"
     }            {
       'type': 'file',
       'name': "e",
       'external-contents': "//root/a/c/e"
     }            {
       'type': 'file',
       'name': "f",
       'external-contents': "//root/a/c/f"
     }
   ]
 },

59ba19c5

[X86][ISelLowering] refactor Varargs handling in X86ISelLowering.cpp · aa1eb515

Alexey Lapshin authored Feb 12, 2020

Summary:
This patch refactors handling of VarArgs in
X86TargetLowering::LowerFormalArguments.
That refactoring was requested while reviewing
D69372. Code related to varargs handling is removed
from X86TargetLowering::LowerFormalArguments and
is divided into smaller routines.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D74794

aa1eb515

[TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for... · 66055230

Fangrui Song authored May 07, 2020

[TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for -fno-unique-section-names

GNU ld's internal linker script uses (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=add44f8d5c5c05e08b11e033127a744d61c26aee)

  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(SORT(.text.sorted.*))
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf.em.  */
    *(.gnu.warning)
  }

Because `*(.text.exit .text.exit.*)` is ordered before `*(.text .text.*)`, in a -ffunction-sections build, the C library function `exit` will be placed before other functions.
gold's `-z keep-text-section-prefix` has the same problem.

In lld, `-z keep-text-section-prefix` recognizes `.text.{exit,hot,startup,unlikely,unknown}.*`, but not `.text.{exit,hot,startup,unlikely,unknown}`, to avoid the strange placement problem.

In -fno-function-sections or -fno-unique-section-names mode, a function whose `function_section_prefix` is set to `.exit"`
will go to the output section `.text` instead of `.text.exit` when linked by lld.
To address the problem, append a dot to become `.text.exit.`

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D79600

66055230

[Attributor] Fixup block addresses after rewriting function signature · 32f5ee83

Sergey Dmitriev authored May 12, 2020

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79801

32f5ee83

Avoid binding pointers to "auto&" (by dereferencing the pointer that's non-null anyway) · aa99da5a
David Blaikie authored May 12, 2020
```
Based on @djtodoro's 2552dc53
```
aa99da5a

[PowerPC] Fold redundant load immediates of zero and delete if possible · cd83333f

Kamau Bridgeman authored May 12, 2020

This patch folds redundant load immediates into a zero for instructions
which recognise this as the value zero and not the register. If the load
immediate is no longer in use it is then deleted.

This is already done in earlier passes but the ppc-mi-peephole allows for
a more general implementation.

Differential Revision: https://reviews.llvm.org/D69168

cd83333f

[FileCollector][NFC] Add comments · 9202df35
Jan Korous authored May 08, 2020
```
Differential Revision: https://reviews.llvm.org/D78961
```
9202df35

[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. · e5f602d8

Juneyoung Lee authored Apr 21, 2020

Summary:
This patch makes propagatesPoison be more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
It becomes more accurate with this patch, so think this is okay.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy

Reviewed By: spatel, nikic

Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78615

e5f602d8

[X86] Remove the v16i8->v16i16 path for MULHS with AVX2. · 01636c1e

Craig Topper authored May 12, 2020

We have a couple main strategies for legalizing MULH.

-If the vXi16 type is legal, extend to do the full i16 multiply
and then shift and truncate the results.
-Use unpcks to split each 128 bit lane into high and low halves.a

For signed we have an extra case to split a v32i8 to v16i8 and then
use the extending to v16i16 strategy.

This patch proposes to use the unpck strategy instead. Which is
what we already do for unsigned.

This seems to be 1 instruction shorter when the RHS is constant
like the idiv case. It's 1 instruction longer for the smulo case.
But we're trading cross lane shuffles for inlane shuffles and a
shift.

Differential Revision: https://reviews.llvm.org/D79652

01636c1e

[arm] Add big-endian version of pcrel fixups for adr instructions · fc373522

Dimitry Andric authored May 12, 2020

Summary:
In 2e24219d, a number of ARM pcrel fixups were resolved at assembly
time, to solve PR44929. This only covered little-endian ARM however, so
add similar fixups for big-endian ARM. Also extend the test case to
cover big-endian ARM.

Reviewers: hans, psmith, MaskRay

Reviewed By: psmith, MaskRay

Subscribers: kristof.beyls, hiraditya, danielkiss, emaste, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79774

fc373522

[AMDGPU] Add AGPRs to getRegClassForSizeOnBank · 9f0b7361
Austin Kerbow authored May 11, 2020
```
Differential Revision: https://reviews.llvm.org/D79761
```
9f0b7361
[CodeGen] Use Align in MachineConstantPool. · 8c72b027
Craig Topper authored May 12, 2020

8c72b027
[VectorCombine] add test to check for iterative improvements; NFC · 93bd6963
Sanjay Patel authored May 12, 2020

93bd6963

[WebAssembly] Implement pseudo-min/max SIMD instructions · 3d49d1cf

Thomas Lively authored May 12, 2020

Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79742

3d49d1cf

[gcov] Default coverage version to '408*' and delete CC1 option -coverage-exit-block-before-body · b56b1e67

Fangrui Song authored May 11, 2020

gcov 4.8 (r189778) moved the exit block from the last to the second.
The .gcda format is compatible with 4.7 but

* decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the
  exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings,
  and print wrong `" returned %s` for branch statistics (-b).
* decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues.

Also, rename "return block" to "exit block" because the latter is the
appropriate term.

b56b1e67

[PassBuilder] Moved ProfileSummaryAnalysis in buildInlinerPipeline. · 5c10c6e0

Whitney Tsang authored May 12, 2020

Summary:
As commented in the code, ProfileSummaryAnalysis is required for inliner
pass to query, so this patch moved
RequireAnalysisPass<ProfileSummaryAnalysis> in the recently created
buildInlinerPipeline.
Reviewer: mtrofin, davidxl, tejohnson, dblaikie, jdoerfert, sstefan1
Reviewed By: mtrofin, davidxl, jdoerfert
Subscribers: hiraditya, steven_wu, dexonsmith, wuzish, llvm-commits,
jsji
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D79696

5c10c6e0

[GlobalISel][IRTranslator] Fix <1 x Ty> handling in ConstantExprs · 989be65b

Jay Foad authored Apr 17, 2020

Summary:
ConstantExprs involving operations on <1 x Ty> could translate into MIR
that failed to verify with:
*** Bad machine code: Reading virtual register without a def ***

The problem was that translate(const Constant &C, Register Reg) had
recursive calls that passed the same Reg in for the translation of a
subexpression, but without updating VMap for the subexpression first as
translate(const Constant &C, Register Reg) expects.

Fix this by using the same translateCopy helper function that we use for
translating Instructions. In some cases this causes extra G_COPY
MIR instructions to be generated.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45576

Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78378

989be65b

[GlobalISel][IRTranslator] New helper function translateCopy. NFC. · bd80a8bb

Jay Foad authored Apr 17, 2020

Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78377

bd80a8bb

[docs] Corrected inaccuracies in Common Problems section. · 5c707fd9

Michael Kruse authored May 12, 2020

Changed the language in LLVM_USE_LINKER to more strongly recommend LLD
and to specify that the GNU gold linker is only useful if LLD is
unavailable in binary form and it is the first build of LLVM. Added that
LLD will help when used on ELF-based platforms.

Corrected information in CMAKE_BUILD_TYPE regarding the Release build
type and enabling assertions.

Added option LLVM_ENABLE_ASSERTIONS and mentioned enabling this option
with a Release build as an alternative to using a Debug build.

Specified that the LLVM_OPTIMIZED_TABLEGEN
option is only for Debug builds, that the LLVM_USE_SPLIT_DWARF option
is only available on ELF host platforms, and that setting
CLANG_ENABLE_STATIC_ANALYZER to OFF only slightly improves build time.

These changes address comments made in D75425.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D77346

5c707fd9

Add comment for SelectionDAGBuilder::SL field. · e9536795
James Y Knight authored May 12, 2020

e9536795

[AMDGPU] Order pos exports before param exports · 58f1417e

Carl Ritson authored May 12, 2020

Summary:
Modify export clustering DAG mutation to move position exports
before other exports types.

Reviewers: foad, arsenm, rampitec, nhaehnle

Reviewed By: foad

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79670

58f1417e