Commits · cfdfba996b081092814d9b0856fcb8b2e12f73e7 · Roger Ferrer / llvm-epi

Mar 18, 2019

[AMDGPU] Asm/disasm clamp modifier on vop3 int arithmetic · cfdfba99

Tim Renouf authored Mar 18, 2019

Allow the clamp modifier on vop3 int arithmetic instructions in assembly
and disassembly.

This involved adding a clamp operand to the affected instructions in MIR
and MC, and thus having to fix up several places in codegen and MIR
tests.

Differential Revision: https://reviews.llvm.org/D59267

Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e
llvm-svn: 356399

cfdfba99

[AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers · 2e94f6e5

Tim Renouf authored Mar 18, 2019

This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.

This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.

To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.

Differential Revision: https://reviews.llvm.org/D59191

Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea
llvm-svn: 356398

2e94f6e5

[Sema] Add some compile time _FORTIFY_SOURCE diagnostics · b6e16ea0

Erik Pilkington authored Mar 18, 2019

These diagnose overflowing calls to a subset of fortifiable functions. Some
functions, like sprintf or strcpy, aren't supported right now, but we should
probably support these in the future. We previously supported this kind of
functionality with -Wbuiltin-memcpy-chk-size, but that diagnostic doesn't work
with _FORTIFY implementations that use wrapper functions. Also unlike that
diagnostic, we emit these warnings regardless of whether _FORTIFY_SOURCE is
actually enabled, which is nice for programs that don't enable the runtime
checks.
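
As an illustration, the kind of call these diagnostics target might look like the commented-out `memcpy` below. This is a sketch only: it assumes `memcpy` is among the covered functions, and `bounded_copy` and `demo` are hypothetical names that are not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size bytes and returns the
   number of bytes actually copied. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src, size_t n) {
    assert(dst != NULL && src != NULL);
    size_t len = n < dst_size ? n : dst_size;
    memcpy(dst, src, len);
    return len;
}

size_t demo(void) {
    char buf[8];
    const char src[16] = "0123456789abcde";
    /* memcpy(buf, src, sizeof src);
       would write 16 bytes into an 8-byte buffer; both sizes are known
       at compile time, so the overflow can be reported without running
       the program. */
    return bounded_copy(buf, sizeof buf, src, sizeof src);
}
```

Here `demo()` copies only the 8 bytes that fit in `buf`.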

Why not just use diagnose_if, like Bionic does? We can get better diagnostics in
the compiler (i.e. mention the sizes), and we have the potential to diagnose
sprintf and strcpy which is impossible with diagnose_if (at least, in languages
that don't support C++14 constexpr). This approach also saves standard libraries
from having to add diagnose_if.

rdar://48006655

Differential revision: https://reviews.llvm.org/D58797

llvm-svn: 356397

b6e16ea0

Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy() · 8627178d

Amara Emerson authored Mar 18, 2019

After review comments, it was preferred to not teach MachineIRBuilder about
non-generic instructions beyond using buildInstr().

For AArch64 I've changed the buildCopy() calls to buildInstr() + a
separate addReg() call.

This also relaxes the MachineIRBuilder's COPY checking more because it may
not always have a SrcOp given to it.

llvm-svn: 356396

8627178d

[DebugInfo][PDB] Don't write empty debug streams · 4aeea4cc

Alexandre Ganea authored Mar 18, 2019

Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count).

With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention.
Also fix the * Linker * contrib section which wasn't correctly emitted previously.

Differential Revision: https://reviews.llvm.org/D59502

llvm-svn: 356395

4aeea4cc

[MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive checks bot · 8723a565

Tim Renouf authored Mar 18, 2019

This fixes a couple of unflushed raw_string_ostream bugs in recent
commits that only show up on a bot building on windows with expensive
checks.

Differential Revision: https://reviews.llvm.org/D59396

Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf
llvm-svn: 356394

8723a565

[X86] Rename imm8_su/imm16_su/imm32_su to... · f07062a7

Craig Topper authored Mar 18, 2019

[X86] Rename imm8_su/imm16_su/imm32_su to relocImm8_su/relocImm16_su/relocImm32_su to accurately reflect what they are.

llvm-svn: 356393

f07062a7

[SCEV] Guard movement of insertion point for loop-invariants · ad7d0ded

Warren Ristow authored Mar 18, 2019

This reinstates r347934, along with a tweak to address a problem with
PHI node ordering that that commit created (or exposed). (That commit
was reverted at r348426, due to the PHI node issue.)

Original commit message:

r320789 suppressed moving the insertion point of SCEV expressions with
div/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.
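
A minimal C sketch of the hazard described above. The function name is hypothetical, and this is source-level code rather than the IR that the SCEV expander actually works on:

```c
#include <assert.h>

/* Adds n / d once per iteration, but only when d is nonzero. */
int guarded_sum(int n, int d, int iterations) {
    assert(iterations >= 0);
    int sum = 0;
    for (int i = 0; i < iterations; ++i) {
        if (d != 0) {
            /* n / d is loop-invariant, yet evaluating it in the loop
               header (ahead of this guard) would divide by zero
               whenever d == 0. */
            sum += n / d;
        }
    }
    return sum;
}
```

With `d == 0` the division is never executed, which is exactly the property that hoisting it past the guard would break.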

This fixes PR30806.

Differential Revision: https://reviews.llvm.org/D57428

llvm-svn: 356392

ad7d0ded

[AArch64] Small fix for getIntImmCost · 270249de

Adhemerval Zanella authored Mar 18, 2019

It uses the generic AArch64_IMM::expandMOVImm to get the correct
number of instructions used in immediate materialization.

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58461

llvm-svn: 356391

270249de

[AArch64] Optimize floating point materialization · a3cefa5d

Adhemerval Zanella authored Mar 18, 2019

This patch follows some ideas from r352866 to optimize the floating
point materialization even further. It changes isFPImmLegal to
consider up to 2 mov instructions, or up to 5 if the subtarget has
fused literals.

The rationale is that the cost is the same for mov+fmov vs. adrp+ldr, but
the mov+fmov sequence is always better because of the reduced d-cache
pressure. The timings are still the same if you consider movw+movk+fmov
vs. adrp+ldr will be fused (although one instruction longer).

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58460

llvm-svn: 356390

a3cefa5d

[TargetLowering] Add code size information on isFPImmLegal. NFC · 664c1ef5

Adhemerval Zanella authored Mar 18, 2019

This allows better code size for aarch64 floating point materialization
in a future patch.

Reviewers: evandro

Differential Revision: https://reviews.llvm.org/D58690

llvm-svn: 356389

664c1ef5

[OPENMP] Set scheduling for doacross loops as schedule, 1. · f6a53d63
Alexey Bataev authored Mar 18, 2019
```
The default scheduling for doacross loops is changed from static to
static, 1.

llvm-svn: 356388
```
f6a53d63

[AArch64] Refactor floating point materialization. NFC · 8a595b1d

Adhemerval Zanella authored Mar 18, 2019

It splits the logic of actual instruction emission away from the logic
that figures out the appropriate sequence in AArch64ExpandPseudo::expandMOVImm.
The new function AArch64_IMM::expandMOVImm, which returns the list of
instructions to materialize the immediate constant, is implemented in a
separate unit because it will be used in a subsequent patch to optimize
floating point materialization.

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58915

llvm-svn: 356387

8a595b1d

[libc++][NFC] Promote CMake comment to an actual option description · 0c962cb5
Louis Dionne authored Mar 18, 2019
```
llvm-svn: 356386
```
0c962cb5
[AMDGPU] Add the missing clang change of the experimental buffer fat pointer · 3c2aadbe
Michael Liao authored Mar 18, 2019
```
llvm-svn: 356385
```
3c2aadbe

[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of... · c2b35ebc

Craig Topper authored Mar 18, 2019

[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more

Similar to previous change done for VPCOM and VPCMP

Differential Revision: https://reviews.llvm.org/D59468

llvm-svn: 356384

c2b35ebc

[InstCombine] add/adjust test for NaN checks; NFC · 08b5e68e
Sanjay Patel authored Mar 18, 2019
```
llvm-svn: 356383
```
08b5e68e

[DAG] Cleanup unused node in SimplifySelectCC. · 55c921f4

Nirav Dave authored Mar 18, 2019

Delete the temporarily constructed node used for analysis after its use,
holding onto the original input nodes. Ideally this would be rewritten
without making nodes, but this appears relatively complex.

Reviewers: spatel, RKSimon, craig.topper

Subscribers: jdoerfert, hiraditya, deadalnix, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57921

llvm-svn: 356382

55c921f4

[MVT] Fix typos in comment. NFC. · c131e0e2
Michael Liao authored Mar 18, 2019
```
llvm-svn: 356381
```
c131e0e2

lld-link: Run conflict-mangled.test on all systems · 2b1dca79

Nico Weber authored Mar 18, 2019

It seems to pass fine on my Mac, and running it only on Windows made
me miss it in r355959 and required r355959.

When the test was added in r288992 we still used Win-only
UnDecorateSymbolName() for demangling. Now we use LLVM's
microsoftDemangle() which is cross-platform.

Differential Revision: https://reviews.llvm.org/D59497

llvm-svn: 356380

2b1dca79

Skip TestVSCode_setFunctionBreakpoints on linux · 0e5012ea
Pavel Labath authored Mar 18, 2019
```
Test hangs under heavy load.

llvm-svn: 356379
```
0e5012ea
Fix some "variable 'foo' set but not used" warnings · 370e5dba
Pavel Labath authored Mar 18, 2019
```
gcc-8 diagnoses these.

llvm-svn: 356378
```
370e5dba

Fix libstdc++ data formatters for python3 · 22457e66

Pavel Labath authored Mar 18, 2019

Use floor-division for consistency across python versions. This fixes a
couple of libstdc++ data formatter tests.

llvm-svn: 356377

22457e66

[libc++] Add a test for PR40977 · 2bde5303

Louis Dionne authored Mar 18, 2019

Even though the header makes the exact same check since https://llvm.org/D59063,
the headers could conceivably change in the future and introduce a bug.

llvm-svn: 356376

2bde5303

[ELF] Emit weak-undef symbols in .dynsym of a PIE binary only if linked against shared libs. · 1915e2be

Siva Chandra authored Mar 18, 2019

Reviewers: espindola

Subscribers: emaste, arichardson, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59275

llvm-svn: 356374

1915e2be

[AMDGPU] Add an experimental buffer fat pointer address space. · 523dab07

Neil Henning authored Mar 18, 2019

Add an experimental buffer fat pointer address space that is currently
unhandled in the backend. This commit reserves address space 7 as a
non-integral pointer representing the 160-bit fat pointer (128-bit buffer
descriptor + 32-bit offset) that is heavily used in graphics workloads
using the AMDGPU backend.
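
To make the arithmetic concrete, a 128-bit descriptor plus a 32-bit offset could be pictured as the struct below. The layout and the names are purely hypothetical; the commit only reserves the address space and does not define an in-memory representation.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
struct buffer_fat_pointer {
    uint32_t descriptor[4]; /* 128-bit buffer descriptor */
    uint32_t offset;        /* 32-bit offset */
};

/* 4 * 32 + 32 = 160 bits, checked at compile time. */
static_assert(sizeof(struct buffer_fat_pointer) * CHAR_BIT == 160,
              "fat pointer should be 160 bits wide");

/* Total width of the sketch struct in bits. */
unsigned fat_pointer_bits(void) {
    return (unsigned)(sizeof(struct buffer_fat_pointer) * CHAR_BIT);
}
```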

Differential Revision: https://reviews.llvm.org/D58957

llvm-svn: 356373

523dab07

[InstCombine] allow general vector constants for funnel shift to shift transforms · 60633935

Sanjay Patel authored Mar 18, 2019

Follow-up to:
rL356338
rL356369

We can calculate an arbitrary vector constant minus the bitwidth, so there's
no need to limit this transform to scalars and splats.
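
A sketch of the per-lane arithmetic, assuming the fold in question has the form fshl(0, X, C) -> lshr X, (bitwidth - C). The helper names are hypothetical and a 4-lane array stands in for a vector:

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Reference funnel shift left on one lane: the high 32 bits of
   (hi:lo) << s. Restricted to 0 < s < 32 so every C shift is defined. */
static uint32_t fshl32(uint32_t hi, uint32_t lo, unsigned s) {
    assert(s > 0 && s < 32);
    return (hi << s) | (lo >> (32 - s));
}

/* Returns 1 if, in every lane, fshl(0, x, c) equals x >> (32 - c). */
int fold_holds(const uint32_t x[LANES], const unsigned c[LANES]) {
    for (int i = 0; i < LANES; ++i) {
        uint32_t funnel = fshl32(0, x[i], c[i]);
        uint32_t shifted = x[i] >> (32 - c[i]); /* per-lane bitwidth - c */
        if (funnel != shifted)
            return 0;
    }
    return 1;
}
```

Each lane uses its own shift amount, so nothing in the computation requires the constant to be a splat.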

llvm-svn: 356372

60633935

[llvm-objcopy] - Calculate the string table section sizes correctly. · faf308b1

George Rimar authored Mar 18, 2019

This fixes https://bugs.llvm.org/show_bug.cgi?id=40980.

Previously if string optimization occurred as a result of
StringTableBuilder's finalize() method, the size wasn't updated.

This hopefully also makes the interaction between sections during finalization
processes a bit more clear.

Differential revision: https://reviews.llvm.org/D59488

llvm-svn: 356371

faf308b1

Fix TestCommandScriptImmediateOutput for python3 · 58e9ef13
Pavel Labath authored Mar 18, 2019
```
s/iteritems/items

llvm-svn: 356370
```
58e9ef13

[InstCombine] extend rotate-left-by-constant canonicalization to funnel shift · 84de8a30

Sanjay Patel authored Mar 18, 2019

Follow-up to:
rL356338

Rotates are a special case of funnel shift where the 2 input operands
are the same value, but that does not need to be a restriction for the
canonicalization when the shift amount is a constant.
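
The identity behind this can be sketched in C for 32-bit values: a funnel shift right by a constant C gives the same result as a funnel shift left by 32 - C, whether or not the two inputs are equal. The function names are hypothetical, and the shift amount is limited to 0 < s < 32 to keep the C shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Funnel shift left: the high 32 bits of (hi:lo) << s. */
uint32_t fshl32(uint32_t hi, uint32_t lo, unsigned s) {
    assert(s > 0 && s < 32);
    return (hi << s) | (lo >> (32 - s));
}

/* Funnel shift right: the low 32 bits of (hi:lo) >> s. */
uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned s) {
    assert(s > 0 && s < 32);
    return (hi << (32 - s)) | (lo >> s);
}
```

For example, `fshr32(a, b, 8)` and `fshl32(a, b, 24)` compute the same value, and `fshl32(x, x, 8)` is a rotate left of `x` by 8.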

llvm-svn: 356369

84de8a30

[SystemZ] Remove icmp undef from reduced tests · f9ab4f5f

Simon Pilgrim authored Mar 18, 2019

Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC)

Approved by @uweigand (Ulrich Weigand)

llvm-svn: 356368

f9ab4f5f

[InstCombine] add funnel shift tests with arbitrary constants; NFC · d7f15393
Sanjay Patel authored Mar 18, 2019
```
llvm-svn: 356367
```
d7f15393

[pp-trace] Delete -ignore and add a new option -callbacks · 560a45a3

Fangrui Song authored Mar 18, 2019

Summary:
-ignore specifies a list of PP callbacks to ignore. It cannot express a
whitelist, which may be more useful than a blacklist.
Add a new option -callbacks to replace it.

-ignore= (default) => -callbacks='*' (default)
-ignore=FileChanged,FileSkipped => -callbacks='*,-FileChanged,-FileSkipped'

-callbacks='Macro*' : print only MacroDefined,MacroExpands,MacroUndefined,...
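
One way such a list could be evaluated, sketched with POSIX `fnmatch`. This assumes that patterns are applied left to right and that the last matching pattern decides; `callback_enabled` is a hypothetical name and this is not the pp-trace implementation.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Returns 1 if `name` is enabled by the comma-separated glob list.
   A leading '-' marks a negative pattern; the last match wins. */
int callback_enabled(const char *globs, const char *name) {
    int enabled = 0;
    const char *p = globs;
    while (*p) {
        size_t len = strcspn(p, ",");
        char pattern[64];
        assert(len < sizeof pattern);
        memcpy(pattern, p, len);
        pattern[len] = '\0';
        int positive = pattern[0] != '-';
        /* Skip the '-' prefix before matching a negative pattern. */
        if (fnmatch(positive ? pattern : pattern + 1, name, 0) == 0)
            enabled = positive;
        p += len;
        if (*p == ',')
            ++p;
    }
    return enabled;
}
```

With the list `*,-FileChanged,-FileSkipped`, `MacroDefined` stays enabled while `FileChanged` is filtered out.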

Reviewers: juliehockett, aaron.ballman, alexfh, ioeric

Reviewed By: aaron.ballman

Subscribers: nemanjai, kbarton, jsji, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59296

llvm-svn: 356366

560a45a3

[llvm-exegesis] Separate tool options into three categories. · 23629385

Roman Lebedev authored Mar 18, 2019

Results in much nicer -help output:
```
$ ./bin/llvm-exegesis -help
USAGE: llvm-exegesis [options]

OPTIONS:

Color Options:

  -color                                         - Use colors in output (default=autodetect)

General options:

  -enable-cse-in-irtranslator                    - Should enable CSE in irtranslator
  -enable-cse-in-legalizer                       - Should enable CSE in Legalizer

Generic Options:

  -help                                          - Display available options (-help-hidden for more)
  -help-list                                     - Display list of available options (-help-list-hidden for more)
  -version                                       - Display the version of this program

llvm-exegesis analysis options:

  -analysis-clustering-epsilon=<number>          - dbscan epsilon for benchmark point clustering
  -analysis-clusters-output-file=<string>        -
  -analysis-display-unstable-clusters            - if there is more than one benchmark for an opcode, said benchmarks may end up not being clustered into the same cluster if the measured performance characteristics are different. by default all such opcodes are filtered out. this flag will instead show only such unstable opcodes
  -analysis-inconsistencies-output-file=<string> -
  -analysis-inconsistency-epsilon=<number>       - epsilon for detection of when the cluster is different from the LLVM schedule profile values
  -analysis-numpoints=<uint>                     - minimum number of points in an analysis cluster

llvm-exegesis benchmark options:

  -ignore-invalid-sched-class                    - ignore instructions that do not define a sched class
  -mode=<value>                                  - the mode to run
    =latency                                     -   Instruction Latency
    =inverse_throughput                          -   Instruction Inverse Throughput
    =uops                                        -   Uop Decomposition
    =analysis                                    -   Analysis
  -num-repetitions=<uint>                        - number of time to repeat the asm snippet
  -opcode-index=<int>                            - opcode to measure, by index
  -opcode-name=<string>                          - comma-separated list of opcodes to measure, by name
  -snippets-file=<string>                        - code snippets to measure

llvm-exegesis options:

  -benchmarks-file=<string>                      - File to read (analysis mode) or write (latency/uops/inverse_throughput modes) benchmark results. “-” uses stdin/stdout.
  -mcpu=<string>                                 - cpu name to use for pfm counters, leave empty to autodetect
```

llvm-svn: 356364

23629385

[DebugInfo] Ignore bitcasts when lowering stack arg dbg.values · 8a2e4af7

David Stenberg authored Mar 18, 2019

Summary:
Look past bitcasts when looking for parameter debug values that are
described by frame-index loads in `EmitFuncArgumentDbgValue()`.

In the attached test case we would be left with an undef `DBG_VALUE`
for the parameter without this patch.

A similar fix was done for parameters passed in registers in D13005.

This fixes PR40777.

Reviewers: aprantl, vsk, jmorse

Reviewed By: aprantl

Subscribers: bjope, javed.absar, jdoerfert, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D58831

llvm-svn: 356363

8a2e4af7

Fix "type qualifiers ignored on cast result type" warnings · f92ddfed
Pavel Labath authored Mar 18, 2019
```
These warnings start to get emitted with gcc-8.

llvm-svn: 356362
```
f92ddfed

Reinitialize UnwindTable when the SymbolFile changes · dec96392

Pavel Labath authored Mar 18, 2019

Summary:
This is a preparatory step to enable adding of unwind plans by symbol
file plugins.

Although at the surface it seems that currently symbol files have
nothing to do with unwinding, this isn't entirely correct even now. The
mere act of adding a symbol file can have the effect of making more
sections (typically .debug_frame) available to the unwinding machinery,
so that it can have more unwind strategies to choose from.

Up until now, we've had a bug, which went largely unnoticed, where
unwind info in manually added symbol files (target symbols add) was
being ignored during unwinding. Reinitializing the UnwindTable fixes
that bug too.

Reviewers: clayborg, jasonmolenda, alexshap

Subscribers: jdoerfert, lldb-commits

Differential Revision: https://reviews.llvm.org/D58347

llvm-svn: 356361

dec96392

[AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse · 8cfd91dc

Christof Douma authored Mar 18, 2019

Fixes https://bugs.llvm.org/show_bug.cgi?id=35094

The Dead register definition pass should leave alone the atomicrmw
instructions on AArch64 (LSE extension). The reason is the following
statement in the Arm ARM:

"The ST<OP> instructions, and LD<OP> instructions where the destination
register is WZR or XZR, are not regarded as doing a read for the purpose
of a DMB LD barrier."

A good example was given in the gcc thread by Will Deacon (linked in the
bugzilla ticket 35094):

    P0 (atomic_int* y,atomic_int* x) {
      atomic_store_explicit(x,1,memory_order_relaxed);
      atomic_thread_fence(memory_order_release);
      atomic_store_explicit(y,1,memory_order_relaxed);
    }

    P1 (atomic_int* y,atomic_int* x) {
      atomic_fetch_add_explicit(y,1,memory_order_relaxed);  // STADD
      atomic_thread_fence(memory_order_acquire);
      int r0 = atomic_load_explicit(x,memory_order_relaxed);
    }

    P2 (atomic_int* y) {
      int r1 = atomic_load_explicit(y,memory_order_relaxed);
    }

    My understanding is that it is forbidden for r0 == 0 and r1 == 2 after
    this test has executed. However, if the relaxed add in P1 compiles to
    STADD and the subsequent acquire fence is compiled as DMB LD, then we
    don't have any ordering guarantees in P1 and the forbidden result could
    be observed.

Change-Id: I419f9f9df947716932038e1100c18d10a96408d0
llvm-svn: 356360

8cfd91dc

[X86] Hopefully fix a tautological compare warning in printVecCompareInstr. · ba898da1
Craig Topper authored Mar 18, 2019
```
llvm-svn: 356359
```
ba898da1
[RISCV] Add ImmArg to intrinsics · 60444ad1
Alex Bradbury authored Mar 18, 2019
```
llvm-svn: 356358
```
60444ad1