Commits · 069af04a4a3855bd0b7d7f1bdad8a0d8bb10db85 · Roger Ferrer / llvm-epi

Feb 19, 2019

Testing commit access · 069af04a
Andrew Scheidecker authored Feb 19, 2019
```
llvm-svn: 354378
```
069af04a

[X86] Don't consider functions ABI compatible for ArgumentPromotion pass if... · 6d0190f2

Craig Topper authored Feb 19, 2019

[X86] Don't consider functions ABI compatible for ArgumentPromotion pass if they view 512-bit vectors differently.

The use of the -mprefer-vector-width=256 command line option mixed with functions
using vector intrinsics can create situations where one function thinks 512 vectors
are legal, but another function does not.

If a 512 bit vector is passed between them via a pointer, it's possible ArgumentPromotion
might try to pass by value instead. This will result in type legalization for the two
functions handling the 512 bit vector differently leading to runtime failures.

Had the 512 bit vector been passed by value from clang codegen, both functions would
have been tagged with a min-legal-vector-width=512 function attribute. That would
make them be legalized the same way.

I observed this issue in 32-bit mode where a union containing a 512 bit vector was
being passed by a function that used intrinsics to one that did not. The caller
ended up passing in zmm0 and the callee tried to read it from ymm0 and ymm1.

The fix implemented here is just to consider it a mismatch if two functions
would handle 512 bit differently without looking at the types that are being
considered. This is the easiest and safest fix, but it can be improved in the future.
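
The conservative check described above can be sketched roughly as follows. This is a simplified model, not the code from the patch: `FunctionInfo`, its fields, and both helper functions are hypothetical names, and the rule in `uses512BitVectors` is an assumption about when a function treats 512-bit vectors as legal.

```cpp
#include <cassert>

// Hypothetical, simplified per-function state. These names are illustrative
// and are not LLVM's real data structures.
struct FunctionInfo {
  bool HasAVX512;               // target features include AVX-512
  unsigned PreferVectorWidth;   // e.g. 256 with -mprefer-vector-width=256
  unsigned MinLegalVectorWidth; // e.g. 512 when 512-bit intrinsics are used
};

// Assumed rule: 512-bit vectors are legal when AVX-512 is available and either
// the preferred width or the required minimum legal width reaches 512 bits.
bool uses512BitVectors(const FunctionInfo &F) {
  return F.HasAVX512 &&
         (F.PreferVectorWidth >= 512 || F.MinLegalVectorWidth >= 512);
}

// Conservative check in the spirit of the fix: any disagreement about 512-bit
// vectors is a mismatch, without inspecting the argument types involved.
bool areABICompatibleForPromotion(const FunctionInfo &Caller,
                                  const FunctionInfo &Callee) {
  return uses512BitVectors(Caller) == uses512BitVectors(Callee);
}
```

Under this model, a caller that uses 512-bit intrinsics and a callee that does not are reported as incompatible even when both are built with -mprefer-vector-width=256, so the pointer argument would not be promoted to a by-value vector.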

Differential Revision: https://reviews.llvm.org/D58390

llvm-svn: 354376

6d0190f2

Revert "Revert "[llvm-objdump] Allow short options without arguments to be grouped"" · 0f436771

Matthew Voss authored Feb 19, 2019

  - Tests that use multiple short switches now test them grouped and ungrouped.

  - Ensure the output of ungrouped and grouped variants is identical

Differential Revision: https://reviews.llvm.org/D57904

llvm-svn: 354375

0f436771

Fix builds for older macOS deployment targets after r354365 · daf777b2

Daniel Sanders authored Feb 19, 2019

Surprisingly, check_symbol_exists is not sufficient. The macOS linker checks the
called functions against a compatibility list for the given deployment target
and check_symbol_exists doesn't trigger this check as it never calls the
function.

This fixes the GreenDragon bots where the deployment target is 10.9

llvm-svn: 354374

daf777b2

Annotate timeline in Instruments with passes and other timed regions. · e1414d17

Daniel Sanders authored Feb 19, 2019

Summary:
Instruments is a useful tool for finding performance issues in LLVM but it can
be difficult to identify regions of interest on the timeline that we can use
to filter the profiler or allocations instrument. Xcode 10 and the latest
macOS/iOS/etc. added support for the os_signpost() API which allows us to
annotate the timeline with information that's meaningful to LLVM.

This patch causes timer start and end events to emit signposts. When used with
-time-passes, this causes the passes to be annotated on the Instruments timeline.
In addition to visually showing the duration of passes on the timeline, it also
allows us to filter the profile and allocations instrument down to an individual
pass allowing us to find the issues within that pass without being drowned out
by the noise from other parts of the compiler.
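
As a rough illustration of the idea of turning timer start and end events into timeline annotations, here is a minimal RAII sketch. It does not call os_signpost() and is not the code from this patch: `RegionHooks` and `ScopedRegion` are hypothetical names, and the hooks stand in for whatever annotation backend is available.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical annotation backend: invoked when a named region begins and ends.
struct RegionHooks {
  std::function<void(const std::string &)> Begin;
  std::function<void(const std::string &)> End;
};

// RAII region: emits a begin event on construction and an end event on
// destruction, so the annotated interval matches the object's lifetime.
class ScopedRegion {
public:
  ScopedRegion(std::string RegionName, const RegionHooks &H)
      : Name(std::move(RegionName)), Hooks(H) {
    if (Hooks.Begin)
      Hooks.Begin(Name);
  }
  ~ScopedRegion() {
    if (Hooks.End)
      Hooks.End(Name);
  }
  ScopedRegion(const ScopedRegion &) = delete;
  ScopedRegion &operator=(const ScopedRegion &) = delete;

private:
  std::string Name;
  const RegionHooks &Hooks; // must outlive the region
};
```

In a real integration the two hooks would forward to the platform's signpost calls; here they are left as callbacks so the sketch stays portable.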

Using this in conjunction with the Time Profiler (in high frequency mode) and
the Allocations instrument is how I found the SparseBitVector that should have
been a BitVector and the DenseMap that could be replaced by a sorted vector a
couple months ago. I added NamedRegionTimers to TableGen and used the resulting
annotations to identify the slow portions of the Register Info Emitter. Some of
these were placed according to educated guesses while others were placed
according to hot functions from a previous profile. From there I filtered the
profile to a slow portion and the aforementioned issues stood out in the
profile.

To use this feature enable LLVM_SUPPORT_XCODE_SIGNPOSTS in CMake and run the
compiler under Instruments with -time-passes like so:
  instruments -t 'Time Profiler' bin/llc -time-passes -o - input.ll
Then open the resulting trace in Instruments.

There was a talk at WWDC 2018 that explained the feature which can be found at
https://developer.apple.com/videos/play/wwdc2018/405/ if you'd like to know
more about it.

Reviewers: bogner

Reviewed By: bogner

Subscribers: jdoerfert, mgorny, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D52954

llvm-svn: 354365

e1414d17

[libObject][NFC] Use sys::path::convert_to_slash. · d138b126

Jordan Rupprecht authored Feb 19, 2019

Summary: As suggested in rL353995

Reviewers: compnerd

Reviewed By: compnerd

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58298

llvm-svn: 354364

d138b126

[X86][SSE] Generalize X86ISD::BLENDI support to more value types · 0b3b9424

Simon Pilgrim authored Feb 19, 2019

D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains require.

With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors.

This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.

I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold; there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).
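
To see why the blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold is sound when lane widths match, here is a small scalar model. It is only a sketch on four 32-bit lanes: `blend`, `bitcastLanes`, and `foldHolds` are hypothetical helpers, not LLVM code, and the mask convention (bit I set selects lane I from the second operand) is an assumption of this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Lane-wise blend: bit I of Mask selects lane I from B, otherwise from A.
template <typename T>
std::array<T, 4> blend(const std::array<T, 4> &A, const std::array<T, 4> &B,
                       unsigned Mask) {
  std::array<T, 4> R{};
  for (unsigned I = 0; I < 4; ++I)
    R[I] = ((Mask >> I) & 1u) ? B[I] : A[I];
  return R;
}

// Reinterpret four equally sized lanes without changing any bits.
template <typename To, typename From>
std::array<To, 4> bitcastLanes(const std::array<From, 4> &V) {
  static_assert(sizeof(To) == sizeof(From), "lane widths must match");
  std::array<To, 4> R{};
  std::memcpy(R.data(), V.data(), sizeof(R));
  return R;
}

// True if blend(bitcast(x), bitcast(y)) and bitcast(blend(x, y)) produce
// identical bits for the given mask.
bool foldHolds(const std::array<std::uint32_t, 4> &X,
               const std::array<std::uint32_t, 4> &Y, unsigned Mask) {
  auto Lhs = blend(bitcastLanes<float>(X), bitcastLanes<float>(Y), Mask);
  auto Rhs = bitcastLanes<float>(blend(X, Y, Mask));
  return std::memcmp(Lhs.data(), Rhs.data(), sizeof(Lhs)) == 0;
}
```

Because a blend only selects whole lanes and never computes on their contents, the choice of element type does not affect the result as long as the lanes line up. The widening/scaling cases mentioned above are exactly the ones where lanes do not line up one-to-one.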

The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.

Reapplied after reversion at rL353699 - AVX2 isel fix was applied at rL354358, additional test at rL354360/rL354361

Differential Revision: https://reviews.llvm.org/D57888

llvm-svn: 354363

0b3b9424

[NFC] Remove unused headers in Optional.h · 3d13ed97
Serge Guelton authored Feb 19, 2019
```
llvm-svn: 354362
```
3d13ed97
Fix stupid assembly comment typo · d58cc6f9
Simon Pilgrim authored Feb 19, 2019
```
llvm-svn: 354361
```
d58cc6f9
[X86][SSE] Add pblendw commuted load test case · e31838f8
Simon Pilgrim authored Feb 19, 2019
```
Reduced test case for the regression caused in D57888/rL353610

llvm-svn: 354360
```
e31838f8

[SDAG] Use shift amount type in MULO promotion; NFC · 04e45e93

Nikita Popov authored Feb 19, 2019

Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths that do not fit into the shift amount type do not assert.

Split out from D57997.

llvm-svn: 354359

04e45e93

[X86][AVX2] Hide VPBLENDD instructions behind AVX2 predicate · dce9c2a8

Simon Pilgrim authored Feb 19, 2019

This was the cause of the regression in D57888 - the commuted load pattern wasn't hidden by the predicate so once we enabled v4i32 blends on SSE41+ targets then isel was incorrectly matched against AVX2+ instructions.

llvm-svn: 354358

dce9c2a8

[X86] Bugfix for nullptr check by klocwork · 51a2e889

Craig Topper authored Feb 19, 2019

klocwork critical issues in CG files:

Patch by Xiang Zhang (xiangzhangllvm)

Differential Revision: https://reviews.llvm.org/D58363

llvm-svn: 354357

51a2e889

X86AsmParser AVX-512: Return error instead of hitting assert · d8acfe69

Craig Topper authored Feb 19, 2019

When parsing a sequence of tokens beginning with {, the parser hits an assert and crashes if the following token is not an identifier. Instead, return a more verbose error, as is done elsewhere in the function.

Patch by Brandon Jones (BrandonTJones)

Differential Revision: https://reviews.llvm.org/D57375

llvm-svn: 354356

d8acfe69

[X86] Filter out tuning feature flags and a few ISA feature flags when... · 236e1ce1

Craig Topper authored Feb 19, 2019

[X86] Filter out tuning feature flags and a few ISA feature flags when checking for function inline compatibility.

Tuning flags don't have any effect on the available instructions so aren't a good reason to prevent inlining.

There are also some ISA flags that don't have any intrinsics or ABI requirements that we can exclude. I've put only the most basic ones like cmpxchg16b and lahfsahf. These are interesting because they aren't present in all 64-bit CPUs, but we have codegen workarounds when they aren't present.

Loosening these checks can help with scenarios where a caller has a more specific CPU than a callee. The default tuning flags on our generic 'x86-64' CPU can currently make it inline incompatible with other CPUs. I've also added an example test for 'nocona' and 'prescott' where 'nocona' is just a 64-bit capable version of 'prescott' but in 32-bit mode they should be completely compatible.

I've based the implementation here on the similar code in AMDGPU.
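
The filtering idea can be sketched with a plain bitset. This is an illustration under stated assumptions, not the code from the patch: the `Feature` enumerators, `inlineIgnoreMask`, and `areInlineCompatible` are hypothetical names, and the subset rule (the callee's remaining features must all be present in the caller) is the assumed compatibility criterion.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical feature indices, for illustration only.
enum Feature : std::size_t {
  FeatureSSE2,
  FeatureAVX2,
  FeatureCmpxchg16b, // ISA flag assumed to have no intrinsic or ABI impact
  FeatureLahfSahf,   // ISA flag assumed to have no intrinsic or ABI impact
  TuningFlagExample, // stand-in for any tuning-only flag
  NumFeatures
};
using FeatureBits = std::bitset<NumFeatures>;

// Flags that should not influence the inlining decision.
FeatureBits inlineIgnoreMask() {
  FeatureBits Ignore;
  Ignore.set(FeatureCmpxchg16b).set(FeatureLahfSahf).set(TuningFlagExample);
  return Ignore;
}

// After masking out ignored flags, the callee's features must be a subset of
// the caller's features.
bool areInlineCompatible(FeatureBits Caller, FeatureBits Callee) {
  const FeatureBits Ignore = inlineIgnoreMask();
  const FeatureBits RealCaller = Caller & ~Ignore;
  const FeatureBits RealCallee = Callee & ~Ignore;
  return (RealCaller & RealCallee) == RealCallee;
}
```

With this shape, a callee that differs from the caller only in ignored flags stays inlinable, while a callee that needs a real ISA extension the caller lacks is still rejected.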

Differential Revision: https://reviews.llvm.org/D58371

llvm-svn: 354355

236e1ce1

GlobalISel: Implement moreElementsVector for select · b4c95b33
Matt Arsenault authored Feb 19, 2019
```
llvm-svn: 354354
```
b4c95b33
index.rst: Remove bb-chapuni from list of IRC bots · b48b3ae5
Hans Wennborg authored Feb 19, 2019
```
llvm-svn: 354353
```
b48b3ae5
index.rst: Remove Dragonegg link · 05bff5a0
Hans Wennborg authored Feb 19, 2019
```
llvm-svn: 354352
```
05bff5a0
GlobalISel: Implement moreElementsVector for G_EXTRACT source · 4d88427a
Matt Arsenault authored Feb 19, 2019
```
llvm-svn: 354348
```
4d88427a

[X86][AVX] Update VBROADCAST folds to always use v2i64 X86vzload · 9d575db8

Simon Pilgrim authored Feb 19, 2019

The VBROADCAST combines and SimplifyDemandedVectorElts improvements mean that we now more consistently use shorter (128-bit) X86vzload input operands.

Follow up to D58053

llvm-svn: 354346

9d575db8

GlobalISel: Implement moreElementsVector for bit ops · 26b7e859
Matt Arsenault authored Feb 19, 2019
```
llvm-svn: 354345
```
26b7e859

[yaml2obj][obj2yaml] Remove section type range markers from allowed mappings and support hex values · d82914c8

James Henderson authored Feb 19, 2019

yaml2obj/obj2yaml previously supported SHT_LOOS, SHT_HIOS, and
SHT_LOPROC for section types. These are simply values that delineate a
range and don't really make sense as valid values. For example if a
section has type value 0x70000000, obj2yaml shouldn't print this value
as SHT_LOPROC. Additionally, this was missing the three other range
markers (SHT_HIPROC, SHT_LOUSER and SHT_HIUSER).

This change removes these three range markers. It also adds support for
specifying the type as an integer, to allow section types that LLVM
doesn't know about.
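
A rough sketch of the "known name or integer" behaviour described above, assuming C++17. `parseSectionType` is a hypothetical helper, not the yaml2obj implementation, and it lists only a handful of section type names.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>

// Parse a section type given either as a known name or as an integer
// (decimal, or hex with a 0x prefix). Returns std::nullopt on failure.
std::optional<std::uint32_t> parseSectionType(const std::string &Text) {
  // A few well-known ELF section type names and their values.
  if (Text == "SHT_NULL")
    return 0u;
  if (Text == "SHT_PROGBITS")
    return 1u;
  if (Text == "SHT_SYMTAB")
    return 2u;
  if (Text == "SHT_STRTAB")
    return 3u;
  if (Text == "SHT_NOBITS")
    return 8u;

  // Range markers such as SHT_LOPROC are deliberately not accepted as names;
  // anything else must start with a digit to be treated as an integer.
  if (Text.empty() || !std::isdigit(static_cast<unsigned char>(Text[0])))
    return std::nullopt;

  errno = 0;
  char *End = nullptr;
  const unsigned long long Value = std::strtoull(Text.c_str(), &End, 0);
  if (errno != 0 || *End != '\0' || Value > 0xffffffffULL)
    return std::nullopt;
  return static_cast<std::uint32_t>(Value);
}
```

In this sketch a section whose type is 0x70000000 is written and read back as the number itself rather than under a range-marker name.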

Reviewed by: grimar

Differential Revision: https://reviews.llvm.org/D58383

llvm-svn: 354344

d82914c8

Cast from SDValue directly instead of superfluous getNode(). NFCI. · d6add749
Simon Pilgrim authored Feb 19, 2019
```
llvm-svn: 354343
```
d6add749
GlobalISel: Verify g_insert · 26760145
Matt Arsenault authored Feb 19, 2019
```
llvm-svn: 354342
```
26760145

[X86][AVX] EltsFromConsecutiveLoads - Add BROADCAST lowering support · 952abcef

Simon Pilgrim authored Feb 19, 2019

This patch adds scalar/subvector BROADCAST handling to EltsFromConsecutiveLoads.

It mainly shows codegen changes to 32-bit code which failed to handle i64 loads, although 64-bit code is also using this new path to more efficiently combine to a broadcast load.

Differential Revision: https://reviews.llvm.org/D58053

llvm-svn: 354340

952abcef

[yaml2obj][obj2yaml] - Support SHT_GNU_versym (.gnu.version) section. · 646af08e

George Rimar authored Feb 19, 2019

This patch adds support for parsing and dumping the .gnu.version section.
Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symversion.html#SYMVERTBL

Differential revision: https://reviews.llvm.org/D58280

llvm-svn: 354338

646af08e

Recommit r354328, r354329 "[obj2yaml][yaml2obj] - Add support of... · 0621b795

George Rimar authored Feb 19, 2019

Recommit r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section."

Fix:
Replace
assert(!IO.getContext() && "The IO context is initialized already");
with
assert(IO.getContext() && "The IO context is not initialized");
(this was introduced in r354329, where I tried to quick-fix the darwin BB
and apparently copy-pasted the assert from the wrong place).

Original commit message:

The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html

Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.

We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.

Everything else changed in this patch seems to be straightforward.

Differential revision: https://reviews.llvm.org/D58119

llvm-svn: 354335

0621b795

[RISCV][NFC] Move some std::string to StringRef · 6aae2161
Alex Bradbury authored Feb 19, 2019
```
llvm-svn: 354333
```
6aae2161

Revert r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping... · aa735de6

George Rimar authored Feb 19, 2019

Revert r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section."

Something went wrong. Bots are unhappy:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/44113/steps/test/logs/stdio

llvm-svn: 354332

aa735de6

Fix BB after r354328. · 12b283df

George Rimar authored Feb 19, 2019

Bot:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/30188/steps/build_Lld/logs/stdio

Error:
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:1013:15: error: unused variable 'Object' [-Werror,-Wunused-variable]
  const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());
              ^
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:1023:15: error: unused variable 'Object' [-Werror,-Wunused-variable]
  const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());

Fix:
change 
  const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());
  assert(Object && "The IO context is not initialized");
to
  assert(!IO.getContext() && "The IO context is initialized already");

llvm-svn: 354329

12b283df

[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section. · c09f2cd0

George Rimar authored Feb 19, 2019

The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html

Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.

We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.

Everything else changed in this patch seems to be straightforward.

Differential revision: https://reviews.llvm.org/D58119

llvm-svn: 354328

c09f2cd0

[RISCV] Re-organise calling convention tests · 1f0c1215

Alex Bradbury authored Feb 19, 2019

Re-organise calling convention tests to prepare for ilp32f and ilp32d hard
float ABI tests. It's also clear that we need to introduce similar tests for
lp64.

llvm-svn: 354323

1f0c1215

Fix BB after r354319 "[yaml2obj] - Do not skip zeroes blocks if there are... · d560744e

George Rimar authored Feb 19, 2019

Fix BB after r354319 "[yaml2obj] - Do not skip zeroes blocks if there are relocations against them."

Fix: move the test to the x86 folder.
This seems to be needed because the llvm-objdump invocation used in the test has the -D (disasm) flag.

BB: http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/23016

/local/buildbot/slaves/hexagon-build-02/clang-hexagon-elf/stage1/bin/llvm-objdump: 
error: '/local/buildbot/slaves/hexagon-build-02/clang-hexagon-elf/stage1/test/tools/llvm-objdump/Output/disasm-zeroes-relocations.test.tmp': 
can't find target: : error: unable to get target for 'x86_64--', see --version and --triple.
.

llvm-svn: 354322

d560744e

[yaml2obj] - Do not skip zeroes blocks if there are relocations against them. · c5f29200

George Rimar authored Feb 19, 2019

This is for -D -reloc combination.

With this patch, we do not skip the zero bytes that have a relocation against
them when -reloc is used. If -reloc is not used, then the behavior will be the same.

Differential revision: https://reviews.llvm.org/D58174

llvm-svn: 354319

c5f29200

[yaml2obj] - Do not ignore explicit addresses for .dynsym and .dynstr · 6947acd9

George Rimar authored Feb 19, 2019

This fixes https://bugs.llvm.org/show_bug.cgi?id=40339

Previously if the addresses were set in YAML they were ignored for
.dynsym and .dynstr sections. The patch fixes that.

Differential revision: https://reviews.llvm.org/D58168

llvm-svn: 354318

6947acd9

Fix obsolete comment. NFC · 55d41a78

Diana Picus authored Feb 19, 2019

Both files mentioned in the comment now include TargetOpcodes.def. Just
mention that directly.

llvm-svn: 354316

55d41a78

[NFC] API for signaling that the current loop is being deleted · ebd95ea8

Max Kazantsev authored Feb 19, 2019

We are planning to be able to delete the current loop in LoopSimplifyCFG
in the future. Add API to notify the loop pass manager that it happened.

llvm-svn: 354314

ebd95ea8

[NFC] Store loop header in a local to keep it available after the loop is deleted · 30095d97
Max Kazantsev authored Feb 19, 2019
```
llvm-svn: 354313
```
30095d97
[ARM GlobalISel] Support G_PHI for Thumb2 · 19dbc624
Diana Picus authored Feb 19, 2019
```
Same as arm mode.

llvm-svn: 354310
```
19dbc624

[Dominators] Fix and optimize edge insertion of depth-based search · b7fbfa68

Fangrui Song authored Feb 19, 2019

Summary:
After (x,y) is inserted, depth-based search finds all affected v that satisfy:

depth(nca(x,y))+1 < depth(v) && there exists a path P from y to v where every w on P satisfies depth(v) <= depth(w)

This reduces to a widest path problem (maximizing the depth of the
minimum vertex in the path) which can be solved by a modified version of
Dijkstra with a bucket queue (named depth-based search in the paper).

The algorithm visits vertices in decreasing order of bucket number.
However, the current code misused priority_queue to extract them in
increasing order. I cannot think of a failing scenario but it surely may
process vertices more than once due to the local usage of Processed.

This patch fixes this bug and simplifies/optimizes the code a bit. Also
add more comments.
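
The ordering pitfall with priority_queue can be illustrated in isolation. This sketch is not the dominator-tree code: `DepthNode` and the two functions are hypothetical, and they only show which comparator yields which extraction order.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical record: (depth in the tree, node id).
using DepthNode = std::pair<unsigned, int>;

// std::priority_queue with the default std::less comparator is a max-heap, so
// top() is always the deepest remaining node (decreasing depth order).
std::vector<int> visitDeepestFirst(const std::vector<DepthNode> &Nodes) {
  std::priority_queue<DepthNode> Bucket(Nodes.begin(), Nodes.end());
  std::vector<int> Order;
  while (!Bucket.empty()) {
    Order.push_back(Bucket.top().second);
    Bucket.pop();
  }
  return Order;
}

// Supplying std::greater flips the heap: top() becomes the shallowest node,
// which gives increasing depth order instead.
std::vector<int> visitShallowestFirst(const std::vector<DepthNode> &Nodes) {
  std::priority_queue<DepthNode, std::vector<DepthNode>,
                      std::greater<DepthNode>>
      Bucket(Nodes.begin(), Nodes.end());
  std::vector<int> Order;
  while (!Bucket.empty()) {
    Order.push_back(Bucket.top().second);
    Bucket.pop();
  }
  return Order;
}
```

A search that is meant to process buckets from deepest to shallowest needs the first form; picking the wrong comparator silently reverses the visiting order without causing any compile-time or run-time error.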

Reviewers: kuhar

Reviewed By: kuhar

Subscribers: kristina, jdoerfert, llvm-commits, NutshellySima, brzycki

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58349

llvm-svn: 354306

b7fbfa68