Commits · 70783e5f355739ff7863d1bd18f5c95145b03885 · Roger Ferrer / llvm-epi

Jun 08, 2018

Test commit: remove a blank line · 70783e5f
Ryan Prichard authored Jun 08, 2018
```
Test commit access

llvm-svn: 334324
```
70783e5f

[ARM] Allow CMPZ transforms even if the input has multiple uses. · 864df223

Eli Friedman authored Jun 08, 2018

It looks like this got left in by accident in r289794; I can't think of
any reason this check would be necessary.  (Maybe it was meant to be a
check that the AND has one use? But we check that a few lines earlier.)

Differential Revision: https://reviews.llvm.org/D47921

llvm-svn: 334322

864df223

[SmallSet] Add some simple unit tests. · 79510be7

Florian Hahn authored Jun 08, 2018

Reviewers: craig.topper, dblaikie

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47940

llvm-svn: 334321

79510be7

[SCEV] Look through zero-extends in howFarToZero · b10ea392

Krzysztof Parzyszek authored Jun 08, 2018

An expression like
  (zext i2 {(trunc i32 (1 + %B) to i2),+,1}<%while.body> to i32)
will become zero exactly when the nested value becomes zero in its type.
Strip injective operations from the input value in howFarToZero to make
the value simpler.

Differential Revision: https://reviews.llvm.org/D47951

llvm-svn: 334318

b10ea392

[InstCombine] Skip dbg.value(s) when looking at stack{save,restore}. · 189c2cf1
Davide Italiano authored Jun 08, 2018
```
Fixes PR37713.

llvm-svn: 334317
```
189c2cf1
[InstCombine] add llvm.assume + debuginfo test (PR37726); NFC · afcf39e1
Sanjay Patel authored Jun 08, 2018
```
llvm-svn: 334314
```
afcf39e1

[asan] Instrument comdat globals on COFF targets · 0bab2220

Reid Kleckner authored Jun 08, 2018

Summary:
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.

This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.

Reviewers: eugenis, vitalybuka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47841

llvm-svn: 334313

0bab2220

[DAGCombiner] clean up comments; NFC · 498564e6
Sanjay Patel authored Jun 08, 2018
```
llvm-svn: 334312
```
498564e6

[X86][SSE] Support v8i16/v16i16 rotations · 5c32989c

Simon Pilgrim authored Jun 08, 2018

Extension to D46954 (PR37426), this patch adds support for v8i16/v16i16 rotations in a similar manner - the conversion of the shift/rotate amount to a multiplication factor and the use of PMULLW to shift left and PMULHUW (ISD::MULHU) to shift the wrapped bits back around to be ORd together.
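
A minimal scalar sketch of the multiply trick described above, for a single 16-bit lane. The helper names (`mulLo16`, `mulHiU16`, `rotl16ViaMul`, `rotl16Ref`, `rotlMatchesReference`) are hypothetical and only model the per-lane arithmetic of PMULLW and PMULHUW; this is not the lowering code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Low 16 bits of a 16x16 multiply: the per-lane result of PMULLW.
inline uint16_t mulLo16(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(static_cast<uint32_t>(a) * b);
}

// High 16 bits of an unsigned 16x16 multiply: the per-lane result of PMULHUW.
inline uint16_t mulHiU16(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// Rotate left by converting the amount into the multiplication factor 2^amt.
// The low half of the product is x << amt; the high half holds the bits that
// wrapped out of the top, i.e. x >> (16 - amt). OR them together.
inline uint16_t rotl16ViaMul(uint16_t x, unsigned amt) {
  assert(amt < 16 && "rotate amount must already be reduced modulo 16");
  const uint16_t scale = static_cast<uint16_t>(1u << amt);
  return static_cast<uint16_t>(mulLo16(x, scale) | mulHiU16(x, scale));
}

// Plain shift-based rotate used as a reference.
inline uint16_t rotl16Ref(uint16_t x, unsigned amt) {
  assert(amt < 16 && "rotate amount must already be reduced modulo 16");
  if (amt == 0)
    return x;
  const uint32_t wide = x;
  return static_cast<uint16_t>((wide << amt) | (wide >> (16 - amt)));
}

// Exhaustively compares both versions over every value and rotate amount.
inline bool rotlMatchesReference() {
  for (uint32_t x = 0; x <= 0xFFFF; ++x)
    for (unsigned amt = 0; amt < 16; ++amt)
      if (rotl16ViaMul(static_cast<uint16_t>(x), amt) !=
          rotl16Ref(static_cast<uint16_t>(x), amt))
        return false;
  return true;
}
```

For a rotate amount of zero the factor is 1, so the high half of the product is zero and the OR returns the input unchanged.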

Differential Revision: https://reviews.llvm.org/D47822

llvm-svn: 334309

5c32989c

[x86] add tests for node-level FMF; NFC · 70314bd6
Sanjay Patel authored Jun 08, 2018
```
These cases should be optimized using the change from D47911.

llvm-svn: 334308
```
70314bd6
[x86] regenerate test checks; NFC · 9995a00a
Sanjay Patel authored Jun 08, 2018
```
llvm-svn: 334307
```
9995a00a

Utilize new SDNode flag functionality to expand current support for fsub · bf90d1f2

Michael Berg authored Jun 08, 2018

Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fsub.

Reviewers: spatel, hfinkel, wristow, arsenm

Reviewed By: spatel

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D47910

llvm-svn: 334306

bf90d1f2

[VPlan] Move recipe construction to VPRecipeBuilder. · 45e5d5b4

Florian Hahn authored Jun 08, 2018

This patch moves the recipe-creation functions out of
LoopVectorizationPlanner, which should do the high-level
orchestration of the transformations.

Reviewers: dcaballe, rengolin, hsaito, Ayal

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D47595

llvm-svn: 334305

45e5d5b4

[X86][BtVer2] Add support for all SUB/XOR 32/64 scalar instructions that... · 89deac66

Simon Pilgrim authored Jun 08, 2018

[X86][BtVer2] Add support for all SUB/XOR 32/64 scalar instructions that should match the dependency-breaking 'zero-idiom'

As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), these instructions are dependency breaking and fast-path zero the destination register (and appropriate EFLAGS bits).

llvm-svn: 334303

89deac66

[X86] Fix schedule-x86_64.s tests to use different registers in reg-reg cases · 59e915c6

Simon Pilgrim authored Jun 08, 2018

Same fix as rL334110: I noticed while working on zero-idiom + dependency-breaking support (PR36671) that most of our binary instruction schedule tests were reusing the same src registers, which would cause the tests to fail once we enable scalar zero-idiom support on btver2.

llvm-svn: 334302

59e915c6

[AMDGPU] Inline asm - added i16, half and i128 types support · c9a098b3

Daniil Fukalov authored Jun 08, 2018

AMDGPU inline assembler supports i16, half and i128 typed variables in constraints, but they were reported as errors.
Needed to fix https://github.com/RadeonOpenCompute/ROCm/issues/341,
e.g. to be able to load with global_load_dwordx4 to a 128bit integer variable

Differential Revision: https://reviews.llvm.org/D44920

llvm-svn: 334301

c9a098b3

reapply r334209 with fixes for harfbuzz in Chromium · 37433dc2

Daniil Fukalov authored Jun 08, 2018

r334209 description:
[LSR] Check yet more intrinsic pointer operands

the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

llvm-svn: 334300

37433dc2

[NFC][InstSimplify] SimplifyAddInst(): coding style: variable names. · f87321a2
Roman Lebedev authored Jun 08, 2018
```
llvm-svn: 334299
```
f87321a2

[InstSimplify] add nuw %x, -1 -> -1 fold. · b060ce45

Roman Lebedev authored Jun 08, 2018

Summary:
`%ret = add nuw i8 %x, C`
From [[ https://llvm.org/docs/LangRef.html#add-instruction | langref ]]:
    nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”,
    respectively. If the nuw and/or nsw keywords are present,
    the result value of the add is a poison value if unsigned
    and/or signed overflow, respectively, occurs.

So if `C` is `-1`, `%x` can only be `0`, and the result is always `-1`.
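
The claim can be checked by brute force on `i8`. The sketch below is only a model: `addNuwI8` and `foldHoldsForAllInputs` are hypothetical names, and a `false` return stands in for a poison result.

```cpp
#include <cassert>
#include <cstdint>

// Models `add nuw i8 %x, %c`. Returns false (standing in for poison) when the
// unsigned addition overflows; otherwise stores the sum in `out`.
inline bool addNuwI8(uint8_t x, uint8_t c, uint8_t &out) {
  const unsigned wide = static_cast<unsigned>(x) + c;
  if (wide > 0xFF)
    return false;
  out = static_cast<uint8_t>(wide);
  return true;
}

// Checks that every non-poison result of `add nuw i8 %x, -1` is -1 (0xFF).
inline bool foldHoldsForAllInputs() {
  const uint8_t minusOne = 0xFF;
  for (unsigned x = 0; x <= 0xFF; ++x) {
    uint8_t result = 0;
    if (!addNuwI8(static_cast<uint8_t>(x), minusOne, result))
      continue; // poison: the fold may pick any value here, including -1
    assert(x == 0 && "only x == 0 avoids unsigned overflow");
    if (result != minusOne)
      return false;
  }
  return true;
}
```

Every nonzero `%x` overflows, so replacing the instruction with the constant `-1` is consistent with all defined results.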

I'm not sure we want to use `KnownBits`/`LVI` here, because there is
exactly one possible value (all bits set, `-1`), so some other pass
should take care of replacing the known-all-ones with constant `-1`.

The `test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll` change *is* confusing.
What is happening is, before this: (omitting `nuw` for simplicity)
1. First, InstCombine D47428/rL334127 folds `shl i32 1, %NBits` to `shl nuw i32 -1, %NBits`
2. Then, InstSimplify D47883/rL334222 folds `shl nuw i32 -1, %NBits` to `-1`,
3. `-1` is inverted to `0`.
But now:
1. *This* InstSimplify fold `%ret = add nuw i32 %setbit, -1` -> `-1` happens first,
   before InstCombine D47428/rL334127 fold could happen.
Thus we now end up with the opposite constant,
and it is all good: https://rise4fun.com/Alive/OA9

https://rise4fun.com/Alive/sldC
Was mentioned in D47428 review.
Follow-up for D47883.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47908

llvm-svn: 334298

b060ce45

[X86][BtVer2] Remove SBB tests that were accidentally added in rL334296 · efb4806b
Simon Pilgrim authored Jun 08, 2018
```
These aren't true zero-idiom instructions (just dependency breaking).

llvm-svn: 334297
```
efb4806b

[X86][BtVer2] Add tests for scalar SUB/XOR instructions that should match the... · 53766a98

Simon Pilgrim authored Jun 08, 2018

[X86][BtVer2] Add tests for scalar SUB/XOR instructions that should match the dependency-breaking 'zero-idiom'

As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions).

llvm-svn: 334296

53766a98

commandLineFitsWithinSystemLimits Overestimates System Limits · f0849132

Alexander Kornienko authored Jun 08, 2018

Summary:
The function `llvm::sys::commandLineFitsWithinSystemLimits` appears to be overestimating the system limits. This issue was discovered while attempting to enable response files in the Swift compiler. When the compiler submits its frontend jobs, those jobs are subjected to the system limits on command line length. `commandLineFitsWithinSystemLimits` is used to determine if the job's arguments need to be wrapped in a response file. There are some cases where the argument size for the job passes `commandLineFitsWithinSystemLimits`, but actually exceeds the real system limit, and the job fails.

`clang` also uses this function to decide whether or not to wrap its job arguments in response files. See: https://github.com/llvm-mirror/clang/blob/master/lib/Driver/Driver.cpp#L1341. Clang will also fail for response files whose size falls within a certain range. I wrote a script that should find a failure point for `clang++`. All that is needed to run it is Python 2.7, and a simple "hello world" program for `test.cc`. It should run on Linux and on macOS. The script is available here: https://gist.github.com/dabelknap/71bd083cd06b91c5b3cef6a7f4d3d427. When it hits a failure point, you should see a `clang: error: unable to execute command: posix_spawn failed: Argument list too long`.

The proposed solution is to mirror the behavior of `xargs` in `commandLineFitsWithinSystemLimits`. `xargs` defaults to 128k for the command line length size (See: https://fossies.org/dox/findutils-4.6.0/buildcmd_8c_source.html#l00551). It adjusts this depending on the value of `ARG_MAX`.
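
A rough sketch of what an `xargs`-style check could look like. `fitsXargsStyleLimit` is a hypothetical helper, not the LLVM function: it takes the system limit as a parameter instead of querying it, and it assumes the simplest adjustment (cap the 128k default at `argMax`). The exact adjustment made by `xargs` and by the patch is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if the program name plus arguments fit
// within a conservative, xargs-like command line budget.
// `argMax` is the system limit (for example the value of ARG_MAX), supplied
// by the caller so the function stays platform independent.
inline bool fitsXargsStyleLimit(const std::string &program,
                                const std::vector<std::string> &args,
                                long argMax) {
  assert(argMax > 0 && "caller must supply a positive system limit");

  // Start from the 128k default and never exceed the system limit.
  // Assumption: a plain minimum is used as the adjustment.
  const long xargsDefault = 128 * 1024;
  const long effectiveMax = argMax < xargsDefault ? argMax : xargsDefault;

  // Each string costs its length plus one byte for the terminating NUL.
  std::size_t total = program.size() + 1;
  for (const std::string &arg : args)
    total += arg.size() + 1;

  return total <= static_cast<std::size_t>(effectiveMax);
}
```

A caller that gets `false` back would wrap the arguments in a response file rather than pass them directly.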

Reviewers: alexfh

Reviewed By: alexfh

Subscribers: llvm-commits

Tags: #clang

Patch by Austin Belknap!

Differential Revision: https://reviews.llvm.org/D47795

llvm-svn: 334295

f0849132

Clean up some code in Program. · 66ef5d3c

Zachary Turner authored Jun 08, 2018

NFC here, this just raises some platform specific ifdef hackery
out of a class and creates proper platform-independent typedefs
for the relevant things.  This allows these typedefs to be
reused in other places without having to reinvent this preprocessor
logic.

llvm-svn: 334294

66ef5d3c

Add a file open flag that disables O_CLOEXEC. · 6edfecb8

Zachary Turner authored Jun 08, 2018

O_CLOEXEC is the right default, but occasionally you don't
want this.  This is especially true for tools like debuggers
where you might need to spawn the child process with specific
files already open, but it's occasionally useful in other
scenarios as well, like when you want to do some IPC between
parent and child.
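
For illustration, a small POSIX sketch of the underlying behavior. `openReadOnly` and `isCloseOnExec` are hypothetical helpers built directly on `open` and `fcntl`; they are not the LLVM file API and do not use the flag this commit adds.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Opens `path` read-only. Close-on-exec is the default; passing
// inheritable = true leaves O_CLOEXEC off so a spawned child keeps the
// descriptor open.
inline int openReadOnly(const char *path, bool inheritable) {
  int flags = O_RDONLY;
  if (!inheritable)
    flags |= O_CLOEXEC;
  return ::open(path, flags);
}

// Reports whether the descriptor will be closed across exec.
inline bool isCloseOnExec(int fd) {
  const int fdFlags = ::fcntl(fd, F_GETFD);
  assert(fdFlags != -1 && "fcntl(F_GETFD) failed");
  return (fdFlags & FD_CLOEXEC) != 0;
}
```

A debugger-like tool would open with `inheritable = true` only for the specific files the child is meant to see, and keep the default everywhere else.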

llvm-svn: 334293

6edfecb8

[X86][BtVer2] Limit zero idiom tests to a single iteration. · aafcf9e4

Simon Pilgrim authored Jun 08, 2018

Reduces output size; we only want to check that the instructions are fast-path'd (just Dispatch+Retire) anyhow.

llvm-svn: 334292

aafcf9e4

Fix Wdocumentation warning for unknown param. NFCI. · c246d8dd
Simon Pilgrim authored Jun 08, 2018
```
llvm-svn: 334291
```
c246d8dd

[X86][SSE] Add SSE2/AVX2 vector rotate tests · eab9d204

Simon Pilgrim authored Jun 08, 2018

Now that we're custom lowering vector rotates for SSE in general we should be testing the combines with them as well.

llvm-svn: 334290

eab9d204

[X86][SSE] Simplify combineVectorTruncationWithPACKUS to reduce code duplication · a6afa310

Simon Pilgrim authored Jun 08, 2018

Simplify combineVectorTruncationWithPACKUS to mask the upper bits followed by calling truncateVectorWithPACK instead of duplicating with similar code.

This results in the codegen using (V)PACKUSDW on SSE41+ targets for vXi64/vXi32 inputs where before it always used PACKUSWB (along with a lot more bitcasting).

I've raised PR37749 as until we avoid unnecessary concats back to 256-bit for bitwise ops, we can't avoid splitting the input value into 128-bit subvectors for masking.

llvm-svn: 334289

a6afa310

[x86] restore test comment; NFC · ab4ca060

Sanjay Patel authored Jun 08, 2018

The description got deleted along with the FIXME note in
rL334268.

llvm-svn: 334288

ab4ca060

[BPI] Apply invoke heuristic before loop branch heuristic · 4d063e7b

Artur Pilipenko authored Jun 08, 2018

Currently the loop branch heuristic is applied before the invoke heuristic which makes us overestimate the probability of the unwind destination of invokes inside loops. This in turn makes us grossly underestimate the frequencies of loops with invokes.

Reviewed By: skatkov, vsk

Differential Revision: https://reviews.llvm.org/D47371

llvm-svn: 334285

4d063e7b

[VPlan] Move recipe based VPlan generation to separate function. · b3c6f07d

Florian Hahn authored Jun 08, 2018

This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.

Reviewers: Ayal, rengolin, dcaballe, hsaito, mkuper, mzolotukhin

Reviewed By: dcaballe

Subscribers: bollu, tschuett, rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D47477

llvm-svn: 334284

b3c6f07d

[ADT] Add `StringRef::rsplit(StringRef Separator)`. · 945c481a

Henry Wong authored Jun 08, 2018

Summary: Add `StringRef::rsplit(StringRef Separator)`, which splits at the last occurrence of the separator so that the tail substring is easy to get. A typical use is to get `data` from `std::basic_string::data`.
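
A sketch of the intended semantics using `std::string` rather than `StringRef`. `rsplitSketch` is a hypothetical stand-in; it assumes that, like the existing single-character `rsplit`, a missing separator yields the whole string and an empty tail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Splits `s` around the LAST occurrence of `sep`.
// Returns (text before the separator, text after it). If `sep` does not
// occur, returns (s, "") -- an assumption modeled on the char overload.
inline std::pair<std::string, std::string>
rsplitSketch(const std::string &s, const std::string &sep) {
  assert(!sep.empty() && "this sketch does not define an empty separator");
  const std::size_t pos = s.rfind(sep);
  if (pos == std::string::npos)
    return {s, std::string()};
  return {s.substr(0, pos), s.substr(pos + sep.size())};
}
```

With the input `std::basic_string::data` and the separator `::`, the second element of the pair is `data`.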

Reviewers: mehdi_amini, zturner, beanz, xbolva00, vsk

Reviewed By: zturner, xbolva00, vsk

Subscribers: vsk, xbolva00, llvm-commits, MTC

Differential Revision: https://reviews.llvm.org/D47406

llvm-svn: 334283

945c481a

[mips] Correct the predicates for a number of codegen only instructions · 1d6254f7

Simon Dardis authored Jun 08, 2018

Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D47638

llvm-svn: 334280

1d6254f7

[RISCV] Implement MC layer support for the fence.tso instruction · ed53ca73

Alex Bradbury authored Jun 08, 2018

The instruction makes use of a previously ignored field in the fence
instruction. It is introduced in the version 2.3 draft of the RISC-V
specification after much work by the Memory Model Task Group.

As clarified here <https://github.com/riscv/riscv-isa-manual/issues/186>,
the fence.tso assembler mnemonic does not have operands.
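
As a rough illustration of the encoding, assuming the standard FENCE layout from the RISC-V specification (a 4-bit `fm` field in bits 31:28, `pred` in 27:24, `succ` in 23:20, and the MISC-MEM opcode `0x0F` with `rs1`, `rd` and `funct3` all zero). `encodeFence` is a hypothetical helper, not code from the MC layer.

```cpp
#include <cassert>
#include <cstdint>

// Builds a FENCE instruction word from its three 4-bit fields.
// Assumed layout: fm[31:28] pred[27:24] succ[23:20], rs1 = rd = funct3 = 0,
// opcode = 0x0F (MISC-MEM).
inline uint32_t encodeFence(uint32_t fm, uint32_t pred, uint32_t succ) {
  assert(fm < 16 && pred < 16 && succ < 16 && "each field is 4 bits wide");
  return (fm << 28) | (pred << 24) | (succ << 20) | 0x0Fu;
}

// fence.tso uses fm = 0b1000 with pred = succ = RW (0b0011), which is why
// the mnemonic takes no operands.
inline uint32_t encodeFenceTso() { return encodeFence(0x8, 0x3, 0x3); }
```

Under these assumptions `fence.tso` encodes as `0x8330000F`, and a plain `fence iorw, iorw` with the top field left at zero encodes as `0x0FF0000F`.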

llvm-svn: 334278

ed53ca73

[X86][SSE] Consistently prefer lowering to PACKUS over PACKSS · ad45efc4

Simon Pilgrim authored Jun 08, 2018

We have some combines/lowerings that attempt to use PACKSS-then-PACKUS and others that use PACKUS-then-PACKSS.

PACKUS is much easier to combine with if we know the upper bits are zero as ComputeKnownBits can easily see through BITCASTs etc. especially now that rL333995 and rL334007 have landed. It also effectively works at byte level which further simplifies shuffle combines.

The only (minor) annoyances are that ComputeKnownBits can sometimes take longer as it doesn't fail as quickly as ComputeNumSignBits (but I'm not seeing any actual regressions in tests) and PACKUSDW only became available after SSE41 so we have more codegen diffs between targets.

llvm-svn: 334276

ad45efc4

[TableGen] Make DAGInstruction own Pattern to avoid leaking it. · 84e6ef00

Florian Hahn authored Jun 08, 2018

Reviewers: dsanders, craig.topper, stoklund, nhaehnle

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D47525

llvm-svn: 334275

84e6ef00

[LV] Fix PR36983. For a given recurrence, fix all phis in exit block · 9ba0aa2d

Roman Shirokiy authored Jun 08, 2018

There could be more than one PHI in the exit block using the same loop recurrence.
Don't assume there is only one; fix each user.

Differential Revision: https://reviews.llvm.org/D47788

llvm-svn: 334271

9ba0aa2d

AMDGPU: Error on LDS global address in functions · 6fc37598

Matt Arsenault authored Jun 08, 2018

These won't work as expected now, so error on them to avoid
wasting time debugging this in the future.

llvm-svn: 334269

6fc37598

[DAGCombine] Fix for PR37667 · 16f963ba

Sam Parker authored Jun 08, 2018

While trying to propagate AND masks back to loads, we currently allow
one non-load node to be included as a leaf in the chain. This fix now
limits that node to producing only a single data value.

Differential Revision: https://reviews.llvm.org/D47878

llvm-svn: 334268

16f963ba

[NFC] fix formatting · 863fb7a4
Hiroshi Inoue authored Jun 08, 2018
```
llvm-svn: 334263
```
863fb7a4