Commits · 1d68112c4b9ec7502ed9776555fd499cb2483347 · Lorenzo Albano / LLVM bpEVL

Jan 25, 2018

[InstCombine] narrow masked zexted binops (PR35792) · 1d68112c

Sanjay Patel authored Jan 25, 2018

This is guarded by shouldChangeType(), so the tests show that
we don't do the fold if the narrower type is not legal. Note
that there is a proposal (D42424) that would change the results
for the specific cases shown in these tests. That difference is
also discussed in PR35792:
https://bugs.llvm.org/show_bug.cgi?id=35792

Alive proofs for the cases handled here as well as the bitwise 
logic binops that we should already do better on:
https://rise4fun.com/Alive/c97
https://rise4fun.com/Alive/Lc5E
https://rise4fun.com/Alive/kdf
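
As a rough illustration of why this kind of narrowing is sound, the sketch below brute-forces one plausible shape of the fold, `and (binop (zext X), C), (zext X)` becoming `zext (and (binop X, trunc C), X)`, for an 8-bit value widened to 32 bits. The pattern shape, the helper names, and the constants tried are assumptions made for illustration; the Alive links above are the actual proofs.

```python
import operator

NARROW_BITS, WIDE_BITS = 8, 32
NARROW_MASK = (1 << NARROW_BITS) - 1
WIDE_MASK = (1 << WIDE_BITS) - 1

# Binops modelled here; 'sub' is assumed to have the constant on the left (C - X).
BINOPS = {
    "add": operator.add,
    "mul": operator.mul,
    "sub": lambda x, c: c - x,
    "shl": operator.lshift,
    "lshr": operator.rshift,
}


def wide_form(name, x, c):
    """and (binop (zext x), c), (zext x), evaluated in the wide type."""
    zx = x  # zext of an unsigned 8-bit value keeps the same numeric value
    return (BINOPS[name](zx, c) & WIDE_MASK) & zx


def narrow_form(name, x, c):
    """zext (and (binop x, trunc c), x), evaluated in the narrow type."""
    return (BINOPS[name](x, c & NARROW_MASK) & NARROW_MASK) & x


def fold_is_sound(name, constants):
    """Check both forms agree for every 8-bit x and each given constant."""
    return all(
        wide_form(name, x, c) == narrow_form(name, x, c)
        for x in range(1 << NARROW_BITS)
        for c in constants
    )
```

The mask with `zext X` discards every bit above the narrow width, which is why only the low bits of the binop result matter. Shift amounts are kept below the narrow width when trying this out, since a wider shift has no meaningful narrow counterpart.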

llvm-svn: 323437

1d68112c

[InstCombine] add tests for PR35792; NFC · 0f95dd23
Sanjay Patel authored Jan 25, 2018
```
llvm-svn: 323436
```
0f95dd23
Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." · a0b2c78e
Alexey Bataev authored Jan 25, 2018
```
This reverts commit r323430 to fix buildbots.

llvm-svn: 323432
```
a0b2c78e

[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. · ad51fe36

Alexey Bataev authored Jan 25, 2018

Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry, and the cost of
this gather is counted as the cost of InsertElementInstrs for each gathered
value. But we can consider these elements as a ShuffleInstr with
the SK_PermuteSingle shuffle kind.

Reviewers: spatel, RKSimon, mkuper, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38697

llvm-svn: 323430

ad51fe36

[X86][SSE] Add tests for vector truncation with signed saturation · fb01d066
Simon Pilgrim authored Jan 25, 2018
```
AVX512 isn't using X86ISD::VTRUNCS and SSE/AVX isn't using PACKSS/PACKUS

llvm-svn: 323428
```
fb01d066

Update build_llvm_package.bat · f85c4960

Hans Wennborg authored Jan 25, 2018

I moved to a new machine and had to adjust a few things:

- Use %USERNAME% instead of %USER% (not sure why %USER% didn't work anymore)
- Update paths for using Python 3.6 instead of 3.5
- Skip building OpenMP which seems broken on Windows
- Work around new vsdevcmd.bat changing paths:
  https://developercommunity.visualstudio.com/content/problem/26780/vsdevcmdbat-changes-the-current-working-directory.html
- Build stage-0 compiler with MinSizeRel to work around VS 2017 bug:
  https://developercommunity.visualstudio.com/content/problem/139043/miscompile-in-trivial-c-program-with-155-preview-2.html

llvm-svn: 323427

f85c4960

[X86][SSE] Add tests for vector truncation with unsigned saturation · e59bf81e
Simon Pilgrim authored Jan 25, 2018
```
AVX512 tends to do a good job, but there are some missed opportunities with SSE/AVX

llvm-svn: 323422
```
e59bf81e

X86 Tests: Add AVX+XOP config to SDIV combine tests · 0fb9638e

Zvi Rackover authored Jan 25, 2018

As pointed out in D42479, XOP also needs to be covered as it supports
vector shifts with variable shift amount.

llvm-svn: 323418

0fb9638e

Another try to commit 323321 (aggressive instruction combine). · f1f57a31
Amjad Aboud authored Jan 25, 2018
```
llvm-svn: 323416
```
f1f57a31

[LTO] - Get rid of friend 'computeDeadSymbols'. NFC. · 0027ddfd

George Rimar authored Jan 25, 2018

computeDeadSymbols accessed isLive(), which was not public
before. It does not make much sense to keep isLive() private
because flags are available via the flags() public member anyway.

llvm-svn: 323415

0027ddfd

[Dwarf] Add dsymutil Atom extensions. NFC · 2c14b155

Jonas Devlieghere authored Jan 25, 2018

This patch extends the atom types used by the Apple accelerator tables
with two dsymutil extensions:

 - DW_ATOM_type_type_flags
 - DW_ATOM_qual_name_hash

llvm-svn: 323414

2c14b155

[GlobalOpt] Emit fragments using field offsets from struct layout · 886edf8f

Mikael Holmen authored Jan 25, 2018

Summary:
When creating the debug fragments for a SRA'd struct, use the fields'
offsets, taken from the struct layout, as the offsets for the resulting
fragments. This fixes an issue where GlobalOpt would emit fragments with
incorrect offsets for padded fields.
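
A minimal sketch of the difference, assuming a hypothetical struct `{ i8, i32, i16 }` with natural alignment: summing field sizes gives wrong offsets as soon as padding is involved, while a layout-based computation rounds each field up to its alignment. The function names and the example struct are illustrative only and are not taken from the patch.

```python
def packed_offsets(fields):
    """Byte offsets obtained by summing field sizes, ignoring padding."""
    offsets, offset = [], 0
    for size, _align in fields:
        offsets.append(offset)
        offset += size
    return offsets


def layout_offsets(fields):
    """Byte offsets with each field rounded up to its alignment."""
    offsets, offset = [], 0
    for size, align in fields:
        offset = (offset + align - 1) // align * align  # insert padding
        offsets.append(offset)
        offset += size
    return offsets


def fragment_bit_offsets(fields):
    """Bit offsets to use for the per-field debug fragments."""
    return [8 * off for off in layout_offsets(fields)]


# Hypothetical struct { i8, i32, i16 } as (size, alignment) pairs in bytes.
FIELDS = [(1, 1), (4, 4), (2, 2)]
```

For `FIELDS`, the size-summing version places the `i32` at byte 1, whereas the layout-based version places it at byte 4, after three bytes of padding.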

This should solve PR36016.

Patch by David Stenberg.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42489

llvm-svn: 323411

886edf8f

[FuzzMutate] Inst deleter doesn't work with PhiNodes · 7a43f26d
Igor Laevsky authored Jan 25, 2018
```
Differential Revision: https://reviews.llvm.org/D42412

llvm-svn: 323409
```
7a43f26d
[IRMover] Add comment and fix test case · 41e45955
Eugene Leviant authored Jan 25, 2018
```
llvm-svn: 323407
```
41e45955

[X86] Expand IMUL/MUL instregexs in Intel scheduler models. Add load latency... · b369cdba

Craig Topper authored Jan 25, 2018

[X86] Expand IMUL/MUL instregexs in Intel scheduler models. Add load latency to some of them in SkylakeClient model.

The regular expressions and the imul names caused some instructions to be matched by multiple regexes, creating unpredictable results.

This changes them all to use explicit instrs instead.

While doing this I also found that some instructions in Skylake were missing load latency so I fixed that too.
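
The overlap can be sketched with Python's `re.match`, which, like a prefix match, only requires the pattern to match at the start of the name. The instruction names below are illustrative and the helper functions are hypothetical; this is not how TableGen evaluates `instregex`, only a model of the prefix behavior.

```python
import re

# Illustrative instruction names; treat the exact list as an assumption.
INSTRS = ["IMUL32rr", "IMUL32rm", "IMUL32rri", "IMUL32rri8", "IMUL32rmi"]


def matched_by(pattern, names=INSTRS):
    """Names selected when the pattern only has to match a prefix."""
    return [n for n in names if re.match(pattern, n)]


def matched_exactly(explicit, names=INSTRS):
    """Names selected when instructions are listed explicitly."""
    return [n for n in names if n in explicit]


def claimed_by_several(patterns, names=INSTRS):
    """Names that more than one prefix pattern would claim."""
    return [n for n in names
            if sum(bool(re.match(p, n)) for p in patterns) > 1]
```

With the patterns `IMUL32rr` and `IMUL32rri`, the name `IMUL32rri` is claimed by both, which is the kind of ambiguity an explicit list avoids.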

llvm-svn: 323406

b369cdba

[X86] Expand IMUL/MUL instregexs in Znver1 scheduler to show what's actually implemented. · 795b17f4

Craig Topper authored Jan 25, 2018

The IMUL instruction names mixed with the prefix matching of the instregex lead to some strange matches. The worst is that several memory instructions are using the register-form latency.

I don't know what the right answer is, so I've left TODOs and will try to work with the AMD folks to get this cleaned up.

llvm-svn: 323405

795b17f4

[cmake] Set cmake policy CMP0068 to suppress warnings on OSX · 32ff6599

Don Hinton authored Jan 25, 2018

Set cmake policy CMP0068=NEW, if available, and set
"CMAKE_BUILD_WITH_INSTALL_NAME_DIR=On" globally to
maintain current behavior.

This is needed to suppress warnings on OSX starting with cmake version
3.9.6.

Differential Revision: https://reviews.llvm.org/D42463

llvm-svn: 323404

32ff6599

[X86] Name the MMX phaddd instruction with 3 Ds instead of just 2. NFC · 066e7376
Craig Topper authored Jan 25, 2018
```
llvm-svn: 323403
```
066e7376

[X86] Remove 64/128/256 from MMX/SSE/AVX instruction names for overall consistency. NFC · dbddac09

Craig Topper authored Jan 25, 2018

MMX instructions all start with MMX_ so the 64 isn't needed for disambiguation.
SSE/AVX1 instructions are assumed 128-bit so we don't need to say 128.
AVX2 instructions should use a Y to indicate 256-bits.

llvm-svn: 323402

dbddac09

[X86] Remove unnecessary '_alt' and '_Int' from scheduler model regular expressions. · 81c87092

Craig Topper authored Jan 25, 2018

These were treated as optional suffixes, but the regular expressions are already prefix matches so this is unnecessary. It breaks the binary search optimization in tablegen due to the top level question mark.
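
A small sketch of why the optional suffix adds nothing under prefix matching, modelled here with Python's `re.match`. The instruction names are illustrative assumptions, and the sketch does not model TableGen's binary search optimization.

```python
import re

# Illustrative names only; the real instruction list is much larger.
NAMES = ["ADDSSrr", "ADDSSrr_Int", "ADDSSrm", "ADDSSrm_Int"]


def prefix_matches(pattern):
    """Names whose beginning matches the pattern (trailing text is allowed)."""
    return [n for n in NAMES if re.match(pattern, n)]
```

Both `prefix_matches("ADDSSrr(_Int)?")` and `prefix_matches("ADDSSrr")` select the same names, because the prefix match already accepts anything after `ADDSSrr`.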

llvm-svn: 323401

81c87092

Add support for pattern matching MachineInsts. · 2036f446

Aditya Nandakumar authored Jan 25, 2018

https://reviews.llvm.org/D42439

Add InstCombine-like matchers for MachineInstructions. There are only
GlobalISel matchers for now.

llvm-svn: 323400

2036f446

[ORC] Refactor the various lookupFlags methods to return the flags map via the · c8a74a04

Lang Hames authored Jan 25, 2018

first argument.

This makes lookupFlags more consistent with lookup (which takes the query as the
first argument) and composes better in practice, since lookups are usually
linearly chained: Each lookupFlags can populate the result map based on the
symbols not found in the previous lookup. (If the maps were returned rather than
passed by reference there would have to be a merge step at the end).
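
The chaining argument can be sketched as follows. All names here are hypothetical stand-ins rather than the ORC API: each lookup function fills the caller's flags map in place and returns the symbols it did not find, so the next lookup only sees what is left and no merge step is needed.

```python
def make_lookup_flags(table):
    """Build a hypothetical lookupFlags-style function backed by a dict."""
    def lookup_flags(flags, symbols):
        not_found = set()
        for name in symbols:
            if name in table:
                flags[name] = table[name]  # populate the caller's map in place
            else:
                not_found.add(name)
        return not_found
    return lookup_flags


def chained_lookup_flags(lookups, symbols):
    """Run lookups in order, passing each one only the still-missing symbols."""
    flags = {}
    remaining = set(symbols)
    for lookup_flags in lookups:
        remaining = lookup_flags(flags, remaining)
        if not remaining:
            break
    return flags, remaining
```

If each lookup returned its own map instead, `chained_lookup_flags` would have to merge the per-lookup maps at the end.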

llvm-svn: 323398

c8a74a04

[GISel]: Fix modules build by including <cassert> · 7cff1908
Aditya Nandakumar authored Jan 25, 2018
```
llvm-svn: 323394
```
7cff1908

[ORC] Try to silence compiler error at · 357b88dc

Lang Hames authored Jan 25, 2018

http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/17264

NFC.

llvm-svn: 323393

357b88dc

[GISel]: Implement GlobalISel combiner API. · 81c81b64

Aditya Nandakumar authored Jan 25, 2018

https://reviews.llvm.org/D41373

The various components are

GICombinerHelper contains transformations that are common to all
targets. Targets can pick and choose which transformations (at
function/opcode granularity) each pass uses via configuring a
GICombinerInfo.

GICombiner contains some common code and it does the traversal,
driving of combines, worklist management and iterating until
convergence.

GICombinerInfo is an interface with a virtual method called combine.
The combiner info will allow targets to pick and choose (or
implement their own specific combines). CombineInfos can make
use of available combines in GICombinerHelper to configure the
transformations for a particular pass. Currently this approach allows
cherry picking transformations from helpers (at function/opcode
granularity) and also allows early returning on specific
transformations. Targets also get to prioritize whether target specific
combines run before/after the opt-in generic combines. Ideally we would
like this part to be configured by both C++ and Tablegen. The
CombinerInfo also has a field which indicates how to deal with
IllegalOps (i.e., should we allow creating them, or legalize them?).

A CombinerPass would configure a CombinerInfo, create the GICombiner
with the Info, and call
GICombiner::combineMachineInstrs(MachineFunction&).
This organization is very similar to the GISelLegalizer.
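
The division of labor described above can be modelled with a toy sketch. The class names mirror the roles in the description, but the instruction encoding (tuples), the two folds, and every method name are assumptions for illustration and not the LLVM API.

```python
class CombinerHelper:
    """Toy stand-in for the target-independent transformations."""

    def fold_add_zero(self, inst):
        if inst[0] == "add" and inst[2] == 0:
            return ("copy", inst[1])
        return None

    def fold_mul_one(self, inst):
        if inst[0] == "mul" and inst[2] == 1:
            return ("copy", inst[1])
        return None


class CombinerInfo:
    """Interface: a target decides which combines to try for an instruction."""

    def combine(self, helper, inst):
        raise NotImplementedError


class AddOnlyInfo(CombinerInfo):
    """Hypothetical target that opts in to a single helper combine."""

    def combine(self, helper, inst):
        return helper.fold_add_zero(inst)


class Combiner:
    """Drives the combines over a function until nothing changes."""

    def __init__(self, info):
        self.info = info
        self.helper = CombinerHelper()

    def combine_function(self, insts):
        changed_any = False
        changed = True
        while changed:  # iterate until convergence
            changed = False
            for i, inst in enumerate(insts):
                new = self.info.combine(self.helper, inst)
                if new is not None:
                    insts[i] = new
                    changed = changed_any = True
        return changed_any
```

A pass in this model would build an info object, hand it to `Combiner`, and call `combine_function` on the instruction list. `AddOnlyInfo` leaves `mul x, 1` untouched because it never asks the helper for that fold.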

llvm-svn: 323392

81c81b64

[GlobalISel][TableGen] Fix the statistics for emitted patterns · 4f3fa798

Volkan Keles authored Jan 25, 2018

Collected statistics for the number of patterns emitted can be
incorrect because rules can be grouped if OptimizeMatchTable
is enabled. Increase the counter in RuleMatcher::emit(...)
to avoid that.

llvm-svn: 323391

4f3fa798

[ORC] Add helpers for building orc::SymbolResolvers from legacy findSymbol-style · d78ba0d4

Lang Hames authored Jan 24, 2018

functions/methods that return JITSymbols.

lookupFlagsWithLegacyFn takes a SymbolNameSet and a legacy lookup function and
returns a LookupFlagsResult. It uses the legacy lookup function to search for
each symbol. If found, getFlags is called on the symbol and the flags added to
the SymbolFlags map. If not found, the symbol is added to the SymbolsNotFound
set.

lookupWithLegacyFn takes an AsynchronousSymbolQuery, a SymbolNameSet and a
legacy lookup function. Each symbol in the SymbolNameSet is searched for via the
legacy lookup function. If it is found, its getAddress function is called
(triggering materialization if it has not happened already) and the resulting
mapping stored in the query. If it is not found the symbol is added to the
unresolved symbols set which is returned at the end of the function. If an
error occurs during legacy lookup or materialization it is passed to the
query via setFailed and the function returns immediately.
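
The control flow of the two helpers can be sketched as below. `ToyQuery`, `ToySymbol`, and the snake_case function names are hypothetical stand-ins; the value returned after a failure (an empty set) and the use of `LookupError` to model a lookup or materialization error are also assumptions.

```python
class ToySymbol:
    """Hypothetical stand-in for a symbol returned by a legacy lookup."""

    def __init__(self, address, flags):
        self._address, self._flags = address, flags

    def get_flags(self):
        return self._flags

    def get_address(self):
        return self._address  # a real symbol may trigger materialization here


class ToyQuery:
    """Hypothetical stand-in for an asynchronous symbol query."""

    def __init__(self):
        self.resolved, self.error = {}, None

    def resolve(self, name, address):
        self.resolved[name] = address

    def set_failed(self, error):
        self.error = error


def lookup_flags_with_legacy_fn(symbols, legacy_lookup):
    flags, not_found = {}, set()
    for name in symbols:
        sym = legacy_lookup(name)
        if sym is not None:
            flags[name] = sym.get_flags()
        else:
            not_found.add(name)
    return flags, not_found


def lookup_with_legacy_fn(query, symbols, legacy_lookup):
    unresolved = set()
    for name in symbols:
        try:
            sym = legacy_lookup(name)
            if sym is None:
                unresolved.add(name)
            else:
                query.resolve(name, sym.get_address())
        except LookupError as err:
            query.set_failed(err)  # report the error and stop immediately
            return set()           # assumption: nothing is reported as unresolved
    return unresolved
```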

llvm-svn: 323388

d78ba0d4

Jan 24, 2018

[GlobalISel] Add a requires: asserts to a test. · 5ee03988
Amara Emerson authored Jan 24, 2018
```
llvm-svn: 323384
```
5ee03988

[TableGen] Add a way of getting the number of generic opcodes without... · 4890a71f

Benjamin Kramer authored Jan 24, 2018

[TableGen] Add a way of getting the number of generic opcodes without including modular CodeGen headers.

This is a bit of a hack, but removes a cycle that broke modular builds
of LLVM. Of course the cycle is still there in form of a dependency
on the .def file.

llvm-svn: 323383

4890a71f

[InstCombine] fix datalayout in test file · 60c13c77

Sanjay Patel authored Jan 24, 2018

The only part of the datalayout that should matter for these tests
is the part that specifies the legal int widths ('n*'). But there
was a bug - that part of the string was not correctly separated with
the expected '-' character, so we were testing as if there were no
legal int widths at all. Removed the leading cruft so we have some 
legal ints to test with.
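
To see how a missing separator hides the legal widths, here is a minimal sketch of a parser that splits a datalayout string on '-' and reads the 'n' specification. The function name and both example strings are hypothetical; in particular, `BROKEN` only imitates the kind of mistake described and is not the string from the test file.

```python
def legal_int_widths(datalayout):
    """Collect native integer widths from the 'n<size>:<size>...' spec."""
    widths = []
    for spec in datalayout.split("-"):
        # Skip 'ni...' specs, which are not native integer widths.
        if spec.startswith("n") and not spec.startswith("ni"):
            widths.extend(int(w) for w in spec[1:].split(":"))
    return widths


GOOD = "e-p:64:64:64-n8:16:32:64"
BROKEN = "e-p:64:64:64n8:16:32:64"  # hypothetical: missing '-' before 'n'
```

With `BROKEN`, no component starts with 'n', so the parser reports no legal integer widths at all.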

I noticed this while testing a potential change to the way we 
transform shifts and sexts in D42424.

llvm-svn: 323377

60c13c77

[ORC] Add a LambdaSymbolResolver convenience class and docs for SymbolResolver. · 7f20eacf

Lang Hames authored Jan 24, 2018

This patch adds a LambdaSymbolResolver convenience utility that can create an
orc::SymbolResolver from a pair of function objects that supply the behavior for
the lookupFlags and lookup methods.
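
The idea of building a resolver from two function objects can be sketched in a few lines. The class below is a hypothetical Python stand-in, not the C++ utility itself, and the method signatures are assumptions.

```python
class LambdaResolverSketch:
    """Resolver whose behavior is supplied by two callables."""

    def __init__(self, lookup_flags_fn, lookup_fn):
        self._lookup_flags_fn = lookup_flags_fn
        self._lookup_fn = lookup_fn

    def lookup_flags(self, symbols):
        return self._lookup_flags_fn(symbols)

    def lookup(self, query, symbols):
        return self._lookup_fn(query, symbols)


def make_everything_exported_resolver():
    """Example: every symbol is 'exported', and nothing can be resolved."""
    return LambdaResolverSketch(
        lambda symbols: {name: "exported" for name in symbols},
        lambda query, symbols: set(symbols),  # return all symbols as unresolved
    )
```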

This class plays the same role for orc::SymbolResolver as the legacy
LambdaResolver class plays for LegacyJITSymbolResolver, and will replace the
latter class once all ORC APIs are migrated to orc::SymbolResolver.

This patch also adds some documentation for the orc::SymbolResolver class as
this was left out of the original commit.

llvm-svn: 323375

7f20eacf

[Hexagon] Replace EmitFunctionEntryCode with a DAG preprocessing code · 14f3ef1f

Krzysztof Parzyszek authored Jan 24, 2018

The code in EmitFunctionEntryCode needs to know the maximum stack
alignment, but it runs very early in the selection process (before
lowering). The final stack alignment may change during lowering, so
the code needs to be moved to where the alignment is known.

llvm-svn: 323374

14f3ef1f

[globalisel] Fix long lines from r323342 · 538921dc

Daniel Sanders authored Jan 24, 2018

They would be fixed in a later patch but they shouldn't have been introduced.

llvm-svn: 323372

538921dc

[AArch64][GlobalISel] Fall back during AArch64 isel if we have a volatile load. · 4f84f886

Amara Emerson authored Jan 24, 2018

The tablegen imported patterns for sext(load(a)) don't check for single uses
of the load or delete the original after matching. As a result two loads are
left in the generated code. This particular issue will be fixed by adding
support for a G_SEXTLOAD opcode in future.

There are however other potential issues around this that wouldn't be fixed by
a G_SEXTLOAD, so until we have a proper solution we don't try to handle volatile
loads at all in the AArch64 selector.

Fixes/works around PR36018.

llvm-svn: 323371

4f84f886

[GlobalISel] Don't fall back to FastISel. · f386e2b0

Amara Emerson authored Jan 24, 2018

Apparently checking the pass structure isn't enough to ensure that we don't fall
back to FastISel, as it's set up as part of the SelectionDAGISel.

llvm-svn: 323369

f386e2b0

[X86][SSE] Aggressively use PMADDWD for v4i32 multiplies with 17 or more leading zeros · 9f551ad6

Simon Pilgrim authored Jan 24, 2018

As discussed in D41484, PMADDWD for 'zero extended' vXi32 is nearly always a better option than PMULLD:
On SNB the resulting code isn't any faster, but it isn't any slower either, so we may as well keep it.
On KNL it only has half the throughput, so I've disabled it there - ideally there'd be a better way than this.
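
The "17 or more leading zeros" condition can be checked with a scalar model of one 32-bit lane. PMADDWD multiplies signed 16-bit halves and adds the two products; when both inputs are below 2^15, the high halves are zero and the low halves are non-negative as signed 16-bit values, so the result equals the ordinary 32-bit product. The helper names are hypothetical and the model covers a single lane only.

```python
MASK32 = 0xFFFFFFFF


def to_i16(value):
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pmaddwd_lane(a, b):
    """One 32-bit lane of PMADDWD: lo*lo + hi*hi on signed 16-bit halves."""
    lo = to_i16(a) * to_i16(b)
    hi = to_i16(a >> 16) * to_i16(b >> 16)
    return (lo + hi) & MASK32


def mul32(a, b):
    """Plain 32-bit multiply (the low 32 bits of the product)."""
    return (a * b) & MASK32


def has_17_leading_zeros(value):
    """True when a 32-bit value has at least 17 leading zero bits."""
    return 0 <= value < (1 << 15)
```

With only 16 leading zeros the equivalence breaks: `0x8000` is read as -32768 by the signed 16-bit multiply, so `pmaddwd_lane(0x8000, 1)` differs from `mul32(0x8000, 1)`.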

Differential Revision: https://reviews.llvm.org/D42258

llvm-svn: 323367

9f551ad6

Simplify. NFC. · 349fe0aa
Rafael Espindola authored Jan 24, 2018
```
Thanks to Teresa Johnson for the suggestion.

llvm-svn: 323365
```
349fe0aa
[X86][SSE] Add slow-pmulld attribute (silvermont-style) test · 21f17d40
Simon Pilgrim authored Jan 24, 2018
```
Requested by @zvi on D42258

llvm-svn: 323364
```
21f17d40
Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." · 0affccc8
Alexey Bataev authored Jan 24, 2018
```
This reverts commit r323348 because of the broken buildbots.

llvm-svn: 323359
```
0affccc8
Revert "[ThinLTO] Add call edges' relative block frequency to per-module summary." · bf38deef
Easwaran Raman authored Jan 24, 2018
```
Causes buildbot regressions.

llvm-svn: 323358
```
bf38deef