Commits · d61b1f8534c6ee0f9dad528cda641d1429920ed4 · Lorenzo Albano / LLVM bpEVL

Jun 12, 2020

[VPlan] Reject loops without computable backedge taken counts · 3a846d4d

Florian Hahn authored Jun 12, 2020

getOrCreateTripCount is used to generate code for the outer loop, but it
requires a computable backedge-taken count. Check for that in the VPlan
native path.

Reviewers: Ayal, gilr, rengolin, sguggill

Reviewed By: sguggill

Differential Revision: https://reviews.llvm.org/D81088

[AMDGPU] Add G16 support to image instructions · 29a6ad94

Sebastian Neubauer authored Mar 25, 2020

Add G16 feature for GFX10 and support A16 and G16 in GlobalISel.

Differential Revision: https://reviews.llvm.org/D76836

[yaml2obj][MachO] - Fix PubName/PubType handling. · d95f8e7a

Georgii Rymar authored Jun 11, 2020

`PubName` and `PubType` are optional fields since D80722.

They are defined as:
  Optional<PubSection> PubNames;
  Optional<PubSection> PubTypes;

And initialized in the following way:
  IO.mapOptional("debug_pubnames", DWARF.PubNames);
  IO.mapOptional("debug_pubtypes", DWARF.PubTypes);

But the problem is that, because of an issue in `YAMLTraits.cpp`,
when there are no `debug_pubnames`/`debug_pubtypes` keys in a YAML description,
they are not initialized to `Optional::None` as the code expects, but they
are initialized to default `PubSection()` instances.

Because of this, the `if` condition in the following code is always true:

if (Obj.DWARF.PubNames)
  Err = DWARFYAML::emitPubSection(OS, *Obj.DWARF.PubNames,
                                  Obj.IsLittleEndian);

This means `emitPubSection` is always called and writes a few values.
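
The failure mode described above can be reproduced outside LLVM. This is a minimal sketch, assuming C++17 `std::optional` as a stand-in for `llvm::Optional`; `PubSection`, `DwarfData`, `countEmittedSections` and the two `parse...` helpers are hypothetical simplifications, not the yaml2obj types:

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for DWARFYAML::PubSection.
struct PubSection {
  unsigned Length = 0;
};

// Hypothetical stand-in for the structure holding the optional sections.
struct DwarfData {
  std::optional<PubSection> PubNames;
  std::optional<PubSection> PubTypes;
};

// Counts how many sections an emitter guarded by `if (D.PubNames)` would write.
int countEmittedSections(const DwarfData &D) {
  int Emitted = 0;
  if (D.PubNames)
    ++Emitted;
  if (D.PubTypes)
    ++Emitted;
  return Emitted;
}

// Models the expected behavior: a missing key leaves the field empty.
DwarfData parseWithoutPubKeys() { return DwarfData{}; }

// Models the buggy behavior: a missing key still yields a default instance,
// so the optional is engaged and the guard above is always true.
DwarfData parseWithoutPubKeysBuggy() {
  DwarfData D;
  D.PubNames = PubSection();
  D.PubTypes = PubSection();
  return D;
}
```

An engaged optional holding a default-constructed value converts to `true`, which is why the guard cannot tell the two cases apart.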

This patch fixes the issue. I've reduced `sizeofcmds` by the size of the data
previously written because of this bug.

Differential revision: https://reviews.llvm.org/D81686

[PowerPC] refactor convertToImmediateForm - NFC · 9b6e86a1

Chen Zheng authored Jun 12, 2020

This is an NFC patch to make convertToImmediateForm a light wrapper
for converting xform and imm form instructions on PowerPC.

Reviewed By: Steven.zhang

Differential Revision: https://reviews.llvm.org/D80907

[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0" · 012909dc

EgorBo authored Jun 12, 2020

Summary:
"X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two)
However, "X % Y" can also be represented as "X - (X / Y) * Y", so if I rewrite the initial expression as
"X - (X / C) * C == 0", it is not currently optimized to "X & C-1 == 0"; see godbolt: https://godbolt.org/z/KzuXUj

This is my first contribution to LLVM so I hope I didn't mess things up
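
The equivalence in the summary above can be checked directly. This is a minimal sketch for unsigned 32-bit values; the function names are hypothetical and this is not the InstCombine implementation:

```cpp
#include <cassert>
#include <cstdint>

// True if C is a non-zero power of two.
constexpr bool isPowerOf2(uint32_t C) { return C != 0 && (C & (C - 1)) == 0; }

// The source-level pattern: the remainder spelled as X - (X / C) * C.
constexpr bool remIsZeroViaDivMul(uint32_t X, uint32_t C) {
  return X - (X / C) * C == 0;
}

// The form the optimizer targets: test the low bits with a mask.
constexpr bool remIsZeroViaMask(uint32_t X, uint32_t C) {
  return (X & (C - 1)) == 0;
}

// Compares both forms for every X in [0, Limit) and a power-of-two C.
bool formsAgree(uint32_t C, uint32_t Limit) {
  assert(isPowerOf2(C) && "equivalence only holds for power-of-two C");
  for (uint32_t X = 0; X < Limit; ++X)
    if (remIsZeroViaDivMul(X, C) != remIsZeroViaMask(X, C))
      return false;
  return true;
}
```

Note that the mask form needs explicit parentheses in C++, because `==` binds more tightly than `&`. For a non-power-of-two such as C = 6 the two forms disagree (X = 6 has a zero remainder, but `6 & 5` is non-zero), which is why the fold is restricted to powers of two.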

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79369

[llvm/Object] Reimplement basic_symbol_iterator in TapiFile · 425c6f07

Jonas Devlieghere authored Jun 12, 2020

Use indices into the Symbols vector instead of casting the objects in
the vector and dereferencing std::vector::end().
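
A minimal sketch of the index-based approach, assuming a simplified `Symbol` record and a hypothetical `SymbolList` class (neither is the actual TapiFile code). The end position is represented by `size()` and is never dereferenced:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol record; the real TapiFile symbol type differs.
struct Symbol {
  std::string Name;
};

// Hypothetical container that hands out index-based cursors.
class SymbolList {
public:
  explicit SymbolList(std::vector<Symbol> S) : Symbols(std::move(S)) {}

  // A cursor is just an index. The end cursor equals size() and is only
  // ever compared against, never used to access an element.
  std::size_t begin() const { return 0; }
  std::size_t end() const { return Symbols.size(); }

  // Advances a cursor by one position.
  void moveNext(std::size_t &Cursor) const { ++Cursor; }

  // Looks up the name for a valid (non-end) cursor.
  const std::string &getName(std::size_t Cursor) const {
    assert(Cursor < Symbols.size() && "cursor out of range");
    return Symbols[Cursor].Name;
  }

private:
  std::vector<Symbol> Symbols;
};

// Collects all names by walking cursors from begin() to end().
std::vector<std::string> collectNames(const SymbolList &L) {
  std::vector<std::string> Names;
  for (std::size_t C = L.begin(); C != L.end(); L.moveNext(C))
    Names.push_back(L.getName(C));
  return Names;
}
```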

This change is NFC modulo the Windows failure reported by
llvm-clang-x86_64-expensive-checks-win.

Differential revision: https://reviews.llvm.org/D81717

[AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions. · c35ed40f

Kristof Beyls authored Jun 11, 2020

To make sure that no barrier gets placed on the architectural execution
path, each
  BLR x<N>
instruction gets transformed to a
  BL __llvm_slsblr_thunk_x<N>
instruction, with __llvm_slsblr_thunk_x<N> a thunk that contains
__llvm_slsblr_thunk_x<N>:
  BR x<N>
  <speculation barrier>

Therefore, the BLR instruction gets split into 2; one BL and one BR.
This transformation results in not inserting a speculation barrier on
the architectural execution path.

The mitigation is off by default and can be enabled by the
harden-sls-blr subtarget feature.

As a linker is allowed to clobber X16 and X17 on function calls, the
above code transformation would not be correct in case a linker does so
when N=16 or N=17. Therefore, when the mitigation is enabled, generation
of BLR x16 or BLR x17 is avoided.
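
The rewrite described above can be modeled as plain text manipulation. This is only an illustrative sketch with hypothetical helper names; the real pass operates on machine instructions, not strings:

```cpp
#include <cassert>
#include <string>

// Name of the thunk for register xN, following the naming in the text above.
std::string thunkName(unsigned N) {
  return "__llvm_slsblr_thunk_x" + std::to_string(N);
}

// Rewrites "BLR xN" into a direct call to the matching thunk.
std::string hardenBLR(unsigned N) {
  // x16 and x17 may be clobbered by the linker, so they are excluded.
  assert(N != 16 && N != 17 && "BLR x16/x17 must not be generated");
  return "BL " + thunkName(N);
}

// Text of the thunk: the indirect branch followed by the barrier placeholder.
std::string thunkBody(unsigned N) {
  return thunkName(N) + ":\n  BR x" + std::to_string(N) +
         "\n  <speculation barrier>\n";
}
```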

As BLRA* indirect calls are not produced by LLVM currently, this does
not aim to implement support for those.

Differential Revision:  https://reviews.llvm.org/D81402

[JumpThreading] Handle zero !prof branch_weights · 707836ed

Yevgeny Rouban authored Jun 12, 2020

Avoid division by zero in updatePredecessorProfileMetadata().
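
A minimal sketch of this kind of guard, assuming two 32-bit branch weights; `trueEdgeProbability` is a hypothetical helper, not the code in updatePredecessorProfileMetadata():

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the probability of the true edge as a fraction in [0, 1], or
// std::nullopt when both weights are zero and no ratio can be formed.
std::optional<double> trueEdgeProbability(uint32_t TrueWeight,
                                          uint32_t FalseWeight) {
  // Widen before adding so the sum cannot wrap around.
  uint64_t Total = uint64_t(TrueWeight) + uint64_t(FalseWeight);
  if (Total == 0)
    return std::nullopt; // Guard: there is no meaningful ratio to compute.
  return double(TrueWeight) / double(Total);
}
```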

Reviewers: yamauchi
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81499

[X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature... · 0ce9bf6e

Craig Topper authored Jun 11, 2020

[X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature bits from the correct 32-bit feature variable.

We have three 32 bit variables containing feature bits. But our
enum is a flat 96 bit space. So we need to pick which of the
variables to use based on the bit value. We used to do this
manually by mentioning the correct variable and subtracting an
offset from the enum. But this is error prone.
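
A sketch of what such a helper lambda might look like. The enum values and the `hasFeature` wrapper are hypothetical, and the actual lambda in getIntelProcessorTypeAndSubtype may be written differently:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flat feature indices spanning a 96-bit space.
enum Feature : unsigned {
  FEATURE_A = 3,  // lives in the first 32-bit word
  FEATURE_B = 40, // lives in the second 32-bit word
  FEATURE_C = 95  // lives in the third 32-bit word
};

// Tests a flat feature index against three 32-bit feature words.
bool hasFeature(unsigned F, uint32_t Features, uint32_t Features2,
                uint32_t Features3) {
  // The lambda picks the word from the index and derives the bit position,
  // so callers never name a word or subtract an offset by hand.
  auto testFeature = [&](unsigned Bit) {
    assert(Bit < 96 && "feature index out of range");
    uint32_t Word = Bit < 32 ? Features : Bit < 64 ? Features2 : Features3;
    return (Word & (UINT32_C(1) << (Bit % 32))) != 0;
  };
  return testFeature(F);
}
```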

[StackSafety] Fix byval handling · 99930732

Vitaly Buka authored Jun 11, 2020

We don't need to process parameters which are marked as
byval, as we are not going to pass the allocas of interest
without copying.

If we pass a value into a byval argument, we just handle that
as a Load of the corresponding type and stop that branch of analysis.

[BPF] fix incorrect type in BPFISelDAGToDAG readonly load optimization · 4db18781

Yonghong Song authored Jun 10, 2020

In the BPF Instruction Selection DAGToDAG transformation phase,
the BPF backend had an optimization to turn a load from a readonly data
section into a direct load of the values. This phase was implemented
before libbpf had readonly section support and before alu32
was supported.

This phase, however, may generate an incorrect type when alu32 is
enabled. The following is an example:
  -bash-4.4$ cat ~/tmp2/t.c
  struct t {
    unsigned char a;
    unsigned char b;
    unsigned char c;
  };
  extern void foo(void *);
  int test() {
    struct t v = {
      .b = 2,
    };
    foo(&v);
    return 0;
  }

The compiler will turn the local variable "v" into a readonly section.
During the instruction selection phase, the compiler generates two
loads from the readonly section, one 2-byte load and one 1-byte load, e.g., for the 2-byte load,
  t8: i32,ch = load<(dereferenceable load 2 from `i8* getelementptr inbounds
       (%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1),
       anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64
  t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8,
    FrameIndex:i64<0>, undef:i64

BPF backend changed t8 to i64 = Constant<2> and eventually the generated machine IR:
  t10: i64 = MOV_ri TargetConstant:i64<2>
  t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8>
  t41: i32 = OR_ri_32 t40, TargetConstant:i64<0>
  t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>,
      TargetConstant:i64<0>, t3

Note that t10 in the above is not correct. The type should be i32 and the instruction
should be MOV_ri_32. The reason for the incorrect insn selection is that BPF insn selection
generated an i64 constant instead of an i32 constant as specified in the original
load instruction. This incorrect insn sequence eventually caused the following
fatal error when a COPY insn tried to copy a 64-bit register to a 32-bit subregister:
  Impossible reg-to-reg copy
  UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42!

This patch fixed the issue by using the load result type instead of always i64
when doing readonly load optimization.

Differential Revision: https://reviews.llvm.org/D81630

[llvm][llvm-nm] add TextAPI/MachO support · 28fefcc8

Cyndy Ishida authored Jun 11, 2020

Summary:
This completes the glue needed to support reading tbd files from nm.
This includes specifying which slice to filter on with `--arch` and a new
option specifically for tbd files, `--add-inlinedinfo`, which shows
the reexported libraries that are appended in the tbd file.

Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson

Reviewed By: JDevlieghere

Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81614

Verify MemorySSA after all updates. · 519b019a
Alina Sbirlea authored Jun 11, 2020
```
Verify after completing all updates.
Resolves PR46275.
```
Tidy up unsigned -> Register fixups. · 3ff8f619
Eric Christopher authored Jun 11, 2020

Add a diagnostic string to an assert. · cb21b168
Eric Christopher authored Jun 11, 2020

AMDGPU/GlobalISel: Fix select of private <2 x s16> load · 7d913bec
Matt Arsenault authored Jun 11, 2020

[VectorCombine] remove unused parameters; NFC · 039ff29e
Sanjay Patel authored Jun 11, 2020

[StackSafety,NFC] Fix use of CallBase API · a10fc165
Vitaly Buka authored Jun 11, 2020
```
Code does not need to iterate over arguments and can get ArgNo from
CallBase::getArgOperandNo.
```
AMDGPU/GlobalISel: Fix select of <8 x s64> scalar load · 27f8bd94
Matt Arsenault authored Jun 11, 2020

AMDGPU/GlobalISel: Set insert point when emitting control flow pseudos · 2247072b

Matt Arsenault authored Jun 11, 2020

This was implicitly assuming the branch instruction was the next after
the pseudo. It's possible for another non-terminator instruction to be
inserted between the intrinsic and the branch, so adjust the insertion
point. Fixes a non-terminator after terminator verifier error (which
without the verifier, manifested itself as an infinite loop in
analyzeBranch much later on).

[InlineCost] Preparational patch for creation of Printer pass. · 1022b5eb

Kirill Naumov authored Jun 11, 2020

- Renaming the printer class, flag
- Refactoring
- Changing some tests

This patch is a preparational stage for introducing a new printing pass and new
functionality to the existing Annotation Writer. I plan to extend
this functionality for this tool to be more useful when looking at the inline
process.

[Support] Don't tie errs() to outs() by default · 03089752

Fangrui Song authored Jun 11, 2020

This reverts part of D81156.

Accessing errs() concurrently was safe before and racy after D81156.
(`errs() << 'a'` is always racy)

Accessing outs() and errs() concurrently was safe before and racy after D81156.

Don't tie errs() to outs() by default to fix the fallout.
llvm-dwarfdump is single-threaded and opting in to the tie behavior is safe.

Fixed assertion in SROA if block has no successors · a98d618f

Stanislav Mekhanoshin authored Jun 11, 2020

BasicBlock::isLegalToHoistInto() asserts if the block does not
have successors. The case is degenerate, but the assertion still
needs to be avoided.

https://bugs.llvm.org/show_bug.cgi?id=46280

Differential Revision: https://reviews.llvm.org/D81674

[X86] Remove unnecessary #if around call to isCpuIdSupported in getHostCPUName. · c5251681

Craig Topper authored Jun 11, 2020

The exact same #if is already inside isCpuIdSupported and causes
it to return true. The definition of isCpuIdSupported isn't
conditional, so we should be able to just rely on its body doing
the right thing.

[WebAssembly] Make BR_TABLE non-duplicable · c5d01234

Thomas Lively authored Jun 11, 2020

Summary:
After their range checks were removed in 7f50c15b, br_tables
started being duplicated into their predecessors by tail
folding. Unfortunately, when the br_tables were in loops this
transformation introduced bad irreducible control flow which was later
expanded into even more br_tables. This commit abuses the
`isNotDuplicable` property to prevent this irreducible control flow
from being introduced. This change saves a few dozen bytes of code
size and has a negligible effect on performance for most of the large
Emscripten benchmarks, but can improve performance significantly on
microbenchmarks of switches in loops.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81628

Jun 11, 2020

Re-land "Migrate the rest of COFFObjectFile to Error" · 1c03389c

Reid Kleckner authored Jun 11, 2020

This reverts commit 101fbc01.

Remove leftover debugging attribute.

Update LLDB as well, which was missed before.

[X86] Force VIA PadLock crypto instructions to emit a 0xF3 prefix when they... · 8fa3e8fa

Craig Topper authored Jun 11, 2020

[X86] Force VIA PadLock crypto instructions to emit a 0xF3 prefix when they encode to match what GNU as does.

The spec for these says they need 0xf3 but also mentions REP
before the mnemonic. But I don't think it's fair to users to make
them write REP first. And gas doesn't make them. objdump seems to
disassemble with or without the prefix and just prints any 0xf3
as REP.

[X86] Replace TB with PS on instructions that are documented in the SDM with 'NP' · 269d8437

Craig Topper authored Jun 11, 2020

'NP' means that the instruction is not recognized with a 66, F2 or F3
prefix. It will either #UD or decode to a different instruction.

All of the cases here should fall into the #UD variety since
we should be detecting the collision with other instructions when
we build the disassembler tables.

[NFC] clean up the AsmPrinter::emitLinkage for AIX part · c6be3ea5

diggerlin authored Jun 11, 2020

SUMMARY:

Since the patch https://reviews.llvm.org/D75866, AIX linkage is handled in PPCAIXAsmPrinter::emitLinkage() and no longer goes through AsmPrinter::emitLinkage(), so we clean up some AIX-related code in AsmPrinter::emitLinkage().

Reviewers: Jason liu

Differential Revision: https://reviews.llvm.org/D81613

AMDGPU/GlobalISel: Fix lower for f64->f16 G_FPTRUNC · bd3d951b

Petar Avramovic authored Jun 11, 2020

Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16
in order to match algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16.

Differential Revision: https://reviews.llvm.org/D81666

[llvm][NFC] Factor some common data in InlineAdvice · e82eff7a

Mircea Trofin authored Jun 09, 2020

Summary:
Other derivations will all want to emit optimization remarks and, as
part of that, use debug info.

Additionally, drive-by const-ing.

Reviewers: davidxl, dblaikie

Subscribers: aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81507

[X86] Fold vXi1 OR(KSHIFTL(X,NumElts/2),Y) -> KUNPCK · 7706c7af

Simon Pilgrim authored Jun 11, 2020

Convert shift+or bool vector patterns into CONCAT_VECTORS if we know this will be lowered to KUNPCK (which requires 16+ vector elements).
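
The bit-level identity behind this fold can be sketched with a 16-element mask held in a 16-bit integer. This assumes (it is not stated in the message above) that the upper half of Y is known to be zero; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Models a 16-element vXi1 mask as a 16-bit integer.
// OR(KSHIFTL(X, 8), Y): shift X into the upper half, then OR with Y.
uint16_t shiftThenOr(uint16_t X, uint16_t Y) {
  return uint16_t((X << 8) | Y);
}

// Concatenation: the low half comes from Lo, the high half from Hi's low half.
uint16_t concatHalves(uint16_t Lo, uint16_t Hi) {
  return uint16_t(((Hi & 0xFF) << 8) | (Lo & 0xFF));
}

// The two forms match whenever the upper half of Y is zero.
bool foldIsValid(uint16_t X, uint16_t Y) {
  assert((Y & 0xFF00) == 0 && "sketch assumes Y's upper half is zero");
  return shiftThenOr(X, Y) == concatHalves(Y, X);
}
```

Bits of X above the lower half are shifted out on the left-hand side and masked off on the right-hand side, so they do not affect the comparison.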

Fixes PR32547

Fix return status of DataFlowSanitizer pass · bff09876

serge-sans-paille authored Jun 04, 2020

Take into account added functions, global values and attribute change.

Differential Revision: https://reviews.llvm.org/D81239

[IR] Clean up dead instructions after simplifying a conditional branch · 69bdfb07

Jay Foad authored May 18, 2020

Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.

Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when
removePredecessor calls PHINode::removeIncomingValue.

Differential Revision: https://reviews.llvm.org/D80206

[IR] Remove assert from ShuffleVectorInst · 3d5f7c85

Sam Parker authored Jun 11, 2020

Which triggers on valid, but not useful, IR such as an undef mask.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46276

Differential Revision: https://reviews.llvm.org/D81634

Revert "[IR] Clean up dead instructions after simplifying a conditional branch" · f45c65aa
Jay Foad authored Jun 11, 2020
```
This reverts commit 4494e453.

It caused problems for sanitizer buildbots.
```

[IR] Clean up dead instructions after simplifying a conditional branch · 4494e453

Jay Foad authored May 18, 2020

Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.

Differential Revision: https://reviews.llvm.org/D80206

[MemCpyOptimizer] Simplify API of processStore and processMem* functions · f79e6a88

Jay Foad authored Jun 10, 2020

Previously these functions either returned a "changed" flag or a "repeat
instruction" flag, and could also modify an iterator to control which
instruction would be processed next.

Simplify this by always returning a "changed" flag, and handling all of
the "repeat instruction" functionality by modifying the iterator.

No functional change intended except in this case:
// If the source and destination of the memcpy are the same, then zap it.
... where the previous code failed to process the instruction after the
zapped memcpy.
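
The "changed flag plus iterator" protocol can be sketched on a toy instruction list. The `Copy` type and both functions are hypothetical, and the split of responsibilities between caller and callee is an assumption rather than the MemCpyOptimizer code:

```cpp
#include <cassert>
#include <list>

// Hypothetical instruction: a copy from Src to Dst.
struct Copy {
  int Src;
  int Dst;
};

using InstList = std::list<Copy>;

// Processes the instruction at It and returns true if the list changed.
// The callee always leaves It on the instruction to process next.
bool processCopy(InstList &Insts, InstList::iterator &It) {
  if (It->Src == It->Dst) {
    It = Insts.erase(It); // Zap the self-copy; It now names its successor.
    return true;
  }
  ++It; // Nothing to do here; step to the next instruction.
  return false;
}

// Drives the loop; only the callee decides which instruction comes next,
// so the instruction after a zapped copy is not skipped.
bool runOnList(InstList &Insts) {
  bool Changed = false;
  for (auto It = Insts.begin(); It != Insts.end();)
    Changed |= processCopy(Insts, It);
  return Changed;
}
```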

Differential Revision: https://reviews.llvm.org/D81540

[llvm/DWARFDebugLine] Remove spurious full stop from warning messages · 9ed452f3
Pavel Labath authored Jun 11, 2020
```
Other warning messages don't have a trailing full stop.
```
[llvm/DWARFDebugLine] Fix a typo in one warning message · fccaa89e
Pavel Labath authored Jun 11, 2020
