Commits · caa4de72a23b26107f6ddb45d27a252959c114f5 · Roger Ferrer / llvm-epi

Sep 07, 2018

[InstCombine][x86] add tests for possible blendv transform (PR38814); NFC · caa4de72
Sanjay Patel authored Sep 07, 2018
```
llvm-svn: 341715
```
caa4de72

[AST] Generalize argument specific aliasing · cb8b3278

Philip Reames authored Sep 07, 2018

AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.

The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between instructions and alias sets. Calls happened to be an easy target.

The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builtins by LICM.

Note: The actual removal of the memset/memtransfer specific handling will happen in a follow-on NFC patch. It was originally part of this one, but was split out for ease of review and rebase.

Differential Revision: https://reviews.llvm.org/D50730

llvm-svn: 341713

cb8b3278

[codeview] Add .cv_string directive for testing purposes · 06d02d03

Reid Kleckner authored Sep 07, 2018

The main use case for this directive is to allow assembly writers to
write their own FPO data strings without going through the .cv_fpo*
directive family.

I'm experimenting with different RPN programs to fix PR38857, and I
figured I should go ahead and make this directive permanent.

llvm-svn: 341712

06d02d03

[X86] Add codegen tests for narrow PADDUS/PSUBUS patterns for PR38691. · fa535c02
Craig Topper authored Sep 07, 2018
```
llvm-svn: 341711
```
fa535c02

[MemorySSA] Update MemoryPhi wiring for block splitting to consider if identical edges were merged. · f98c2c5e

Alina Sbirlea authored Sep 07, 2018

Summary:
Block splitting is done with either identical edges being merged, or not.
Only critical edges can be split without merging identical edges based on an option.
Teach the memoryssa updater to take this into account: for the same edge between two blocks only move one entry from the Phi in Old to the new Phi in New.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D51563

llvm-svn: 341709

f98c2c5e

[InstCombine] narrow vector select with padded condition and extracted result (PR38691) · c1416b60

Sanjay Patel authored Sep 07, 2018

shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.
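
The rewrite above can be checked on small inputs with a plain-Python model. This is only a sketch: lists stand in for vectors, `None` stands in for undef, and the helper names (`shuffle`, `select`, `wide_form`, `narrow_form`) are made up for illustration and are not InstCombine APIs.

```python
def shuffle(vec, mask):
    """Single-input shuffle: lane i takes vec[mask[i]]; a mask index of -1 means undef."""
    return [vec[i] if i >= 0 else None for i in mask]


def select(cond, x, y):
    """Element-wise select; an undef condition lane gives an undef result lane."""
    return [None if c is None else (a if c else b) for c, a, b in zip(cond, x, y)]


def wide_form(narrow_cond, x, y, wide_mask, narrow_mask):
    """shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask"""
    padded_cond = shuffle(narrow_cond, wide_mask)
    return shuffle(select(padded_cond, x, y), narrow_mask)


def narrow_form(narrow_cond, x, y, narrow_mask):
    """sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)"""
    return select(narrow_cond, shuffle(x, narrow_mask), shuffle(y, narrow_mask))


cond = [True, False]           # narrow (2-lane) condition
x = [10, 11, 12, 13]           # wide (4-lane) operands
y = [20, 21, 22, 23]
wide_mask = [0, 1, -1, -1]     # pad the condition out to 4 lanes with undef
narrow_mask = [0, 1]           # extract the low 2 lanes again

same = wide_form(cond, x, y, wide_mask, narrow_mask) == narrow_form(cond, x, y, narrow_mask)
```

Both forms agree here because the narrow mask only reads lanes whose condition came from the original narrow condition.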

Differential Revision: https://reviews.llvm.org/D51496

llvm-svn: 341708

c1416b60

[WebAssembly] Change SIMD lane indices to vec_i8imm_op · 653278f8

Thomas Lively authored Sep 07, 2018

Summary: To explicitly opt out of LEB encoding for these immediates.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51766

llvm-svn: 341707

653278f8

[AArch64] Support reserving x1-7 registers. · 287a3be3

Nick Desaulniers authored Sep 07, 2018

Summary:
Reserving registers x1-7 is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. This change adds support for reserving registers x1 through x7.

Reviewers: javed.absar, phosek, srhines, nickdesaulniers, efriedma

Reviewed By: nickdesaulniers, efriedma

Subscribers: niravd, jfb, manojgupta, nickdesaulniers, jyknight, efriedma, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D48580

llvm-svn: 341706

287a3be3

[X86] Don't create ZERO_EXTEND_INREG/SIGN_EXTEND_INREG for v1iX vectors. · 5cbce81c

Craig Topper authored Sep 07, 2018

The generic type legalizer will scalarize vXi1 instructions getting rid of the vector entirely. Creating wider vector instructions is just going to prevent that.

llvm-svn: 341705

5cbce81c

[X86] Don't create X86ISD::AVG nodes from v1iX vectors. · 39f48fdc

Craig Topper authored Sep 07, 2018

The type legalizer will try to scalarize this and fail.

It looks like there are some other v1iX oddities out there too since we still generated some vector instructions.

llvm-svn: 341704

39f48fdc

[PGO] Fix some style issue of ControlHeightReduction · b3b61de0

Fangrui Song authored Sep 07, 2018

Reviewers: yamauchi

Reviewed By: yamauchi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51811

llvm-svn: 341702

b3b61de0

[CMake] Fix LLVM_ENABLE_LTO option on Windows · 00b407c9
Alexandre Ganea authored Sep 07, 2018
```
Differential Revision: https://reviews.llvm.org/D51804

llvm-svn: 341701
```
00b407c9

[X86] Modify the rdtscp intrinsic to return values instead of taking a pointer argument · 4863313b

Craig Topper authored Sep 07, 2018

Similar to what was recently done for addcarry/subborrow and has been done for rdrand/rdseed for a while. It's better to use two results and an explicit store in IR when the store isn't part of the semantics of the instruction. This allows store->load forwarding to happen in the middle end, or the store to be removed if it's never loaded.

Differential Revision: https://reviews.llvm.org/D51803

llvm-svn: 341698

4863313b

[codeview] Improve readobj FPO dumper and pdbutil register names · ee0e8bab
Reid Kleckner authored Sep 07, 2018
```
The improved dumping helps me investigate PR38857.

llvm-svn: 341695
```
ee0e8bab

[PGO][CHR] Build/warning fix · 06650941
Hiroshi Yamauchi authored Sep 07, 2018
```
llvm-svn: 341692
```
06650941

[RISCV] Fix crash in decoding instruction with unknown floating point rounding mode · b2ed11a0

Ana Pazos authored Sep 07, 2018

Summary:
Instead of crashing in printFRMArg, decode and warn about invalid instruction.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.
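
A minimal sketch of the "decode and warn instead of crashing" idea, in Python rather than the actual printFRMArg code. The table follows the RISC-V `frm` field encodings (5 and 6 are reserved); the function name `decode_frm` and the warning text are hypothetical.

```python
# RISC-V rounding-mode field encodings; 0b101 and 0b110 are reserved.
FRM_NAMES = {0: "rne", 1: "rtz", 2: "rdn", 3: "rup", 4: "rmm", 7: "dyn"}


def decode_frm(encoding):
    """Return (name, warning). Reserved encodings yield a warning, never an exception."""
    field = encoding & 0b111
    name = FRM_NAMES.get(field)
    if name is None:
        return None, f"invalid rounding mode encoding {field}"
    return name, None
```

A caller can then print the instruction as invalid when the first element is `None`, which keeps fuzzer-generated inputs from taking the tool down.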

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51705

llvm-svn: 341691

b2ed11a0

[Error] Reintroduce type validation in createFileError() · 96032447

Alexandre Ganea authored Sep 07, 2018

This prevents ErrorSuccess from being used as an argument to createFileError().

Differential Revision: https://reviews.llvm.org/D51490

llvm-svn: 341689

96032447

[llvm-dwp] Clean up tests X86/*.test · 91c95a35
Fangrui Song authored Sep 07, 2018
```
llvm-svn: 341688
```
91c95a35

[RISCV] Fix AddressSanitizer heap-buffer-overflow in disassembling · b97d1894

Ana Pazos authored Sep 07, 2018

Summary:
RISCVDisassembler should check number of bytes available before reading them.
Crash noticed when enabling -DLLVM_USE_SANITIZER=Address.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language.
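
The check described above can be sketched as follows. This is a simplified model, not the RISCVDisassembler code: it only distinguishes 16-bit compressed and 32-bit encodings (low two bits `0b11` mean a 32-bit instruction) and ignores longer formats; `read_instruction` is a hypothetical name.

```python
def read_instruction(data, offset=0):
    """Return (raw_value, size_in_bytes), or (None, 0) if too few bytes remain."""
    remaining = len(data) - offset
    if remaining < 2:
        return None, 0                      # not even a compressed instruction fits
    low = int.from_bytes(data[offset:offset + 2], "little")
    if low & 0b11 != 0b11:
        return low, 2                       # 16-bit compressed encoding
    if remaining < 4:
        return None, 0                      # claims to be 32-bit but the buffer is short
    return int.from_bytes(data[offset:offset + 4], "little"), 4
```

The point is that each length test happens before the corresponding read, so a truncated buffer is reported as a decode failure instead of being read past its end.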

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51708

llvm-svn: 341686

b97d1894

NFC: remove magic bool in LoopIdiomRecognize · 7e2dd2d2
JF Bastien authored Sep 07, 2018
```
Use an enum class instead.

llvm-svn: 341684
```
7e2dd2d2

[PGO][CHR] Small cleanup. · 5fb509b7

Hiroshi Yamauchi authored Sep 07, 2018

Summary:
Do away with demangling. It wasn't really necessary.
Declared some local functions to be static.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51740

llvm-svn: 341681

5fb509b7

[Bindings][Go] Fixed go.test failure due to C-API argument type mismatch. · d8bdf16c

Kristina Brooks authored Sep 07, 2018

go.test was failing previously with error,
Command Output (stderr):
dibuilder.go:301: cannot use C.uint(t.Encoding) (type C.uint) as type
C.LLVMDWARFTypeEncoding in argument to func literal
This patch fixes the argument type.

Patch by Chirag (Chirag Patel)

Differential Revision: https://reviews.llvm.org/D51721

llvm-svn: 341680

d8bdf16c

utils/abtest: Refactor and add bisection method · e2dc6929

Matthias Braun authored Sep 07, 2018

- Refactor/rewrite most of the code. Also make sure it passes
  pycodestyle/pyflakes now
- Add a new mode that performs bisection on the search space. This
  should be faster in the common case where there is only a small number
  of files or functions actually leading to failure.
  The previous sequential behavior can still be accessed via `--seq`.
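
To illustrate why bisection helps when only a few items cause the failure, here is a small sketch. It is not the abtest implementation: `find_culprits` and the `fails` callback are hypothetical, and the sketch assumes each culprit reproduces the failure on its own (a failure that needs items from both halves would be missed).

```python
def find_culprits(items, fails):
    """Return the items that trigger the failure.

    `fails(subset)` is a hypothetical callback returning True when building
    with `subset` taken from the "bad" side reproduces the failure.
    """
    if not items or not fails(items):
        return []                 # nothing in this part of the search space fails
    if len(items) == 1:
        return list(items)        # narrowed down to a single culprit
    mid = len(items) // 2
    # Recurse into both halves; halves without a culprit cost only one test each.
    return find_culprits(items[:mid], fails) + find_culprits(items[mid:], fails)
```

With one culprit among 16 items this needs 9 test runs, compared with up to 16 for a sequential scan, and the gap widens as the search space grows.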

llvm-svn: 341679

e2dc6929

[X86] Change the addcarry and subborrow intrinsics to return 2 results and... · 72964ae9

Craig Topper authored Sep 07, 2018

[X86] Change the addcarry and subborrow intrinsics to return 2 results and remove the pointer argument.

We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address.
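
As a rough model of the new shape (two results, no pointer argument), the sketch below returns the carry-out and the sum together and leaves any store to the caller. The names `addcarry_u32` and `add_u64_limbs` are illustrative only and are not the intrinsic names.

```python
MASK32 = 0xFFFFFFFF


def addcarry_u32(carry_in, a, b):
    """Return (carry_out, sum) for a 32-bit add with carry; nothing is written through a pointer."""
    total = (a & MASK32) + (b & MASK32) + (1 if carry_in else 0)
    return total >> 32, total & MASK32


def add_u64_limbs(a_lo, a_hi, b_lo, b_hi):
    """Chain two 32-bit adds into a 64-bit add; the caller decides where results are kept."""
    carry, lo = addcarry_u32(0, a_lo, b_lo)
    carry, hi = addcarry_u32(carry, a_hi, b_hi)
    return carry, hi, lo
```

Because the sum is an ordinary value, a later use can consume it directly, and a store of it can be dropped when nothing reads it back.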

Differential Revision: https://reviews.llvm.org/D51769

llvm-svn: 341677

72964ae9

[X86] Use regular expressions to make test immune to register allocation changes. · 51e11788
Craig Topper authored Sep 07, 2018
```
llvm-svn: 341676
```
51e11788

[X86] Teach X86DAGToDAGISel::foldLoadStoreIntoMemOperand to handle loads in... · 313d09af

Craig Topper authored Sep 07, 2018

[X86] Teach X86DAGToDAGISel::foldLoadStoreIntoMemOperand to handle loads in operand 1 of commutable operations.

Previously we only handled loads in operand 0, but nothing guarantees the load will be operand 0 for commutable operations.

Differential Revision: https://reviews.llvm.org/D51768

llvm-svn: 341675

313d09af

[InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible · 040c2b0a

Craig Topper authored Sep 07, 2018

If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.

I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.
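
The identity behind the fold can be checked exhaustively on a small range. The sketch below uses Python's signed integers, where `~v` equals `-v - 1` and therefore reverses ordering; the function names are made up for the check.

```python
from itertools import product


def min_with_not(x, y):
    """Original form: min(~X, Y)."""
    return min(~x, y)


def min_with_not_folded(x, y):
    """Folded form: ~(max(X, ~Y))."""
    return ~max(x, ~y)


def max_with_not(x, y):
    """Original form: max(~X, Y)."""
    return max(~x, y)


def max_with_not_folded(x, y):
    """Folded form: ~(min(X, ~Y))."""
    return ~min(x, ~y)


# Compare both pairs over every (x, y) in a small signed range.
all_equal = all(
    min_with_not(x, y) == min_with_not_folded(x, y)
    and max_with_not(x, y) == max_with_not_folded(x, y)
    for x, y in product(range(-16, 16), repeat=2)
)
```

The folded form is only a win when `~Y` is cheap, which is what "freely invertible" refers to in the title.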

Differential Revision: https://reviews.llvm.org/D51398

llvm-svn: 341674

040c2b0a

[LV] Fix code gen for conditionally executed loads and stores · 110df11a

Anna Thomas authored Sep 07, 2018

Fix a latent bug in loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed out in review,
this bug definitely affects uniform loads and may also affect conditional
stores that should have turned into scatters.

The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.

Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.

This patch does the following:
1. Uniform conditional loads on architectures with gather support will
   have correct code generated. In particular, the cost model
   (setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
   we use the isScalarWithPredication(I, VF) form which is added in the
   patch.

3. Fix the vectorization cost model for scalarization
   (getMemInstScalarizationCost): implement and use isPredicatedInst to
   identify *all* predicated instructions, not just scalar+predicated. So,
   now the cost for scalarization will be increased for maskedloads/stores
   and gather/scatter operations. In short, we should be choosing the
   gather/scatter in place of scalarization on archs where it is
   profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.
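
The underlying problem (a conditional load executed unconditionally) can be illustrated with a toy model. This is not the vectorizer's code: lists stand in for vector lanes, an `IndexError` stands in for a faulting access, and both function names are hypothetical.

```python
def vectorized_buggy(mask, table, index, passthru=0):
    """Loads the uniform address unconditionally, then blends with the mask."""
    value = table[index]          # runs even when every lane is masked off
    return [value if m else passthru for m in mask]


def vectorized_predicated(mask, table, index, passthru=0):
    """Touches memory only for lanes whose mask bit is set."""
    return [table[index] if m else passthru for m in mask]
```

When at least one lane is active the two agree; when no lane is active, only the predicated form avoids the access that the scalar loop would never have made.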

Reviewers: Ayal, hsaito, mkuper

Differential Revision: https://reviews.llvm.org/D51313

llvm-svn: 341673

110df11a

Hot cold splitting pass · 801394a3

Aditya Kumar authored Sep 07, 2018

Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.

Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658

llvm-svn: 341669

801394a3

[InstCombine] Do not fold scalar ops over select with vector condition. · e32ff4b2

Florian Hahn authored Sep 07, 2018

If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.

Reviewers: spatel, john.brawn, mssimpso, craig.topper

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51781

llvm-svn: 341666

e32ff4b2

[DebugInfo] Handle stack slot offsets for spilled sub-registers in LDV · 45acc961

David Stenberg authored Sep 07, 2018

Summary:
Extend LDV so that stack slot offsets for spilled sub-registers
are added to the emitted debug locations. This is accomplished
by querying InstrInfo::getStackSlotRange().

With this change, LDV will add a DW_OP_plus_uconst operation to
the expression if a sub-register is spilled. Later on, PEI will
add an offset operation for the stack slot, meaning that we will
get expressions of the forms:

 * {DW_OP_constu #fp-offset, DW_OP_minus,
    DW_OP_plus_uconst #subreg-offset}

 * {DW_OP_plus_const #fp-offset,
    DW_OP_minus, DW_OP_plus_uconst #subreg-offset}

The two offset operations should ideally be merged.

Reviewers: rnk, aprantl, stoklund

Reviewed By: aprantl

Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51612

llvm-svn: 341659

45acc961

Add support for getRegisterByName. · 9ad0f027

Sid Manning authored Sep 07, 2018

Support required to build the Hexagon Linux kernel.

Differential Revision: https://reviews.llvm.org/D51363

llvm-svn: 341658

9ad0f027

[X86][SSE] Add additional fadd/fsub(x, bitcast_fneg(y)) tests with different integer bitwidths · 04d07484
Simon Pilgrim authored Sep 07, 2018
```
llvm-svn: 341657
```
04d07484

[DAGCombiner] foldBitcastedFPLogic - Add basic vector support · 96d6b9c2

Simon Pilgrim authored Sep 07, 2018

Add support for bitcasts from float type to an integer type of the same element bitwidth.

There may be cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.
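
For context, the kind of pattern involved is integer logic on the bits of a float, such as the bitcast_fneg in the related test commits: XOR-ing the sign bit of the integer view negates the value. The sketch below shows this for one 32-bit element, assuming IEEE-754 single precision; `fneg_via_xor` is an illustrative name.

```python
import struct

SIGN_BIT_F32 = 0x80000000


def fneg_via_xor(x):
    """Negate a float32 value by flipping the sign bit of its integer bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]      # float -> i32 bitcast
    flipped = bits ^ SIGN_BIT_F32                            # integer xor with the sign mask
    return struct.unpack("<f", struct.pack("<I", flipped))[0]  # i32 -> float bitcast
```

A vector version applies the same mask to each element, which is why the element widths of the float and integer types need to match.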

llvm-svn: 341652

96d6b9c2

[NewGVN] Mark function as changed if we erase instructions. · b30f7aee

Florian Hahn authored Sep 07, 2018

Currently eliminateInstructions only returns true if any instruction got
replaced. In the test case for this patch, we eliminate the trivially
dead calls, for which eliminateInstructions does not do a replacement, so
the function is not marked as changed, which is why the inliner crashes
while traversing the call graph.

Alternatively we could also change eliminateInstructions to return true
in case we mark instructions for deletion, but that's slightly more code
and doing it at the place where the replacement happens seems safer.
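
The invariant at stake is that a pass must report a change whenever it erases something, not only when it replaces something. A toy sketch of that bookkeeping, with strings standing in for instructions and a hypothetical `eliminate` helper (this is not NewGVN's structure):

```python
def eliminate(instructions, replacements, dead):
    """Return (new_instruction_list, changed)."""
    changed = False
    kept = []
    for inst in instructions:
        if inst in dead:
            changed = True            # erasing an instruction is a change too
            continue
        if inst in replacements:
            inst = replacements[inst]
            changed = True
        kept.append(inst)
    return kept, changed
```

If the erase path forgot to set `changed`, a caller that caches analyses (such as a call graph) would keep stale entries for the deleted calls.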

Fixes PR37517.

Reviewers: davide, mcrosier, efriedma, bjope

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D51169

llvm-svn: 341651

b30f7aee

[X86][SSE] Add fadd/fsub(x, bitcast_fneg(y)) tests · a2aef22a
Simon Pilgrim authored Sep 07, 2018
```
Show missing vector support

llvm-svn: 341650
```
a2aef22a

[dsymutil] Prevent non-determinism due to threading. · 475ce5a2

Jonas Devlieghere authored Sep 07, 2018

Before this patch, analyzeContext called getCanonicalDIEOffset(), for
which the result depends on the timings of the setCanonicalDIEOffset()
calls in the cloneLambda. This can lead to slightly different output
between runs due to threading.

To prevent this from happening, we now record the output debug info size
after importing the modules (before any concurrent processing takes
place). This value, named the ModulesEndOffset is used to compare the
canonical DIE offset against. If the value is greater than this offset,
the canonical DIE offset has been updated during cloning, and should
therefore not be considered for pruning.
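
The comparison described above boils down to a single check against a value that is fixed before any threads start. A minimal sketch, with a hypothetical function name and plain integers for offsets:

```python
def usable_for_pruning(canonical_die_offset, modules_end_offset):
    """True if the canonical DIE offset was assigned before concurrent cloning began.

    Offsets greater than modules_end_offset were written during cloning, so
    whether they are visible depends on thread timing and they are ignored.
    """
    return canonical_die_offset <= modules_end_offset
```

Since `modules_end_offset` is recorded once, after the modules are imported, the result no longer depends on how far the cloning thread has progressed.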

Differential revision: https://reviews.llvm.org/D51443

llvm-svn: 341649

475ce5a2

[MSan] don't access MsanCtorFunction when using KMSAN · 6301574c
Alexander Potapenko authored Sep 07, 2018
```
MSan has found a use of uninitialized memory in MSan, fix it.

llvm-svn: 341646
```
6301574c

ARM: fix Thumb2 CodeGen for ldrex with folded frame-index. · bb7d7b3d

Tim Northover authored Sep 07, 2018

Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a
FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a
proper addressing-mode and tells the rewriter about it so that encodable
offsets are exploited and others are rejected.

Should fix PR38828.

llvm-svn: 341642

bb7d7b3d

[MSan] Add KMSAN instrumentation to MSan pass · 8fe99a0e

Alexander Potapenko authored Sep 07, 2018

Introduce the -msan-kernel flag, which enables the kernel instrumentation.

The main differences between KMSAN and MSan instrumentations are:

- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
  Shadow and origin values for a particular X-byte memory location are
  read and written via pointers returned by
  __msan_metadata_ptr_for_load_X(u8 *addr) and
  __msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
  to a function returning that struct is inserted into every instrumented
  function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
  entry and unpoisoned with __msan_unpoison_alloca() before leaving the
  function;
- the pass doesn't declare any global variables or add global constructors
  to the translation unit.

llvm-svn: 341637

8fe99a0e