- Jun 12, 2015
-
-
John Brawn authored
ARMTargetParser::getFPUFeatures should disable fp16 whenever it disables vfp4, as otherwise something like -mcpu=cortex-a7 -mfpu=none leaves us with fp16 enabled (though the only effect that will have is a wrong build attribute). Differential Revision: http://reviews.llvm.org/D10397 llvm-svn: 239599
-
Peter Collingbourne authored
It is valid for globals to be unnamed, but aliases must have a name. To avoid creating invalid IR, we need to assign names to any aliases we create that point to unnamed objects that have been moved into combined globals. llvm-svn: 239590
-
Alexey Samsonov authored
Summary: A side effect of this change is that IRBuilder now automatically creates debug info locations for new instructions, matching the debug location of the insertion point. This is fine for the functions in question (GetStoreValueForLoad and GetMemInstValueForLoad), as they are used in two situations:
* GVN::processLoad, which tries to eliminate a load. In this case new instructions have the same debug location as the load they eventually replace.
* MaterializeAdjustedValue, which adds new instructions to the end of the basic blocks, which could later be used to replace the load definition. In this case we don't yet know how the load will eventually be replaced (either by assembling the precomputed values via PHI, or by using them directly), so just using the basic-block strategy seems reasonable.
There is also a special case in the code that *would* adjust the location of the last instruction replacing the load definition to the location of the load.
Test Plan: regression test suite
Reviewers: echristo, dberlin, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10405
llvm-svn: 239585
-
Reid Kleckner authored
We were putting them in the filter field, which is correct for 64-bit but wrong for 32-bit. Also switch the order of scope table entry emission so outermost entries are emitted first, and fix an obvious state assignment bug. llvm-svn: 239574
-
Reid Kleckner authored
This intrinsic is like framerecover plus a load. It recovers the EH registration stack allocation from the parent frame and loads the exception information field out of it, giving back a pointer to an EXCEPTION_POINTERS struct. It's designed for clang to use in SEH filter expressions instead of accessing the EXCEPTION_POINTERS parameter that is available on x64. This required a minor change to MC to allow defining a label variable to another absolute framerecover label variable. llvm-svn: 239567
-
- Jun 11, 2015
-
-
Peter Collingbourne authored
We cannot prepend __imp_ in the IR mangler because a function reference may be emitted unmangled in a constant initializer. The linker is expected to resolve such references to thunks. This is covered by the new test case. Strictly speaking we ought to emit two undefined symbols, one with __imp_ and one without, as we cannot know which symbol the final object file will refer to. However, this would require rather intrusive changes to IRObjectFile, and lld works fine without it for now. This reimplements r239437, which was reverted in r239502. Differential Revision: http://reviews.llvm.org/D10400 llvm-svn: 239560
-
Alexey Samsonov authored
This improves debug locations in passes that do a lot of basic block transformations. Important case is LoopUnroll pass, the test for correct debug locations accompanies this change. Test Plan: regression test suite Reviewers: dblaikie, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10367 llvm-svn: 239551
-
Rafael Espindola authored
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions." Revert "Fixing MSVC 2013 build error." The test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X. llvm-svn: 239544
-
Reid Kleckner authored
This reverts commit r239539. It was causing SDAG assertions while building freetype. llvm-svn: 239543
-
Matt Arsenault authored
This only updates one of the uses. The other is used in cases that may never touch memory, so I'm not sure why this is even calling it. Perhaps there should be a new, similar hook for such cases or pass -1 for unknown address space. llvm-svn: 239540
-
Matt Arsenault authored
Now actually stores the non-zero constant instead of 0. I somehow forgot to include this part of r238108. The test change was just an independent instruction order swap, so just add another check line to satisfy CHECK-NEXT. llvm-svn: 239539
-
Tom Stellard authored
Flat instructions don't exist on SI, but there is a bug in the backend that allows them to be selected. llvm-svn: 239533
-
Toma Tabacu authored
Apparently, Arcanist didn't include some of my local changes in my previous commit attempt. llvm-svn: 239523
-
Zoran Jovanovic authored
http://reviews.llvm.org/D10091 llvm-svn: 239522
-
Zoran Jovanovic authored
http://reviews.llvm.org/D10312 llvm-svn: 239520
-
Zoran Jovanovic authored
llvm-svn: 239519
-
Hao Liu authored
Add a pass, AArch64InterleavedAccess, to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsics. As the Loop Vectorizer disables optimization of interleaved accesses by default, this optimization is also disabled by default; enable it with "-aarch64-interleaved-access-opt=true".

E.g. transform an interleaved load (Factor = 2):

    %wide.vec = load <8 x i32>, <8 x i32>* %ptr
    %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
    %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements

into:

    %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
    %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
    %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. transform an interleaved store (Factor = 2):

    %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
    store <8 x i32> %i.vec, <8 x i32>* %ptr

into:

    %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
    %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
    call void aarch64.neon.st2(%v0, %v1, %ptr)

llvm-svn: 239514
-
Simon Pilgrim authored
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing i8 SHL vectorized implementation of moving the shift bits up to the sign bit position and separating the 4, 2 & 1 bit shifts, with several improvements:
1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - Pre-SSE41 targets were masking + comparing with an 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer, which can be compared against zero and then fed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of an i16 so that the i16 psraw instruction can be correctly used for sign extension - we have to do more work than for SHL/SRL, but perf tests indicate that this is still beneficial.
The i16 implementation is similar to, but simpler than, the i8 one - we have to do 8, 4, 2 & 1 bit shifts, but less shift masking is involved. SSE41 use of (v)pblendvb requires that the i16 shift amount be splatted to both bytes, however.
Tested on SSE2, SSE41 and AVX machines.
Differential Revision: http://reviews.llvm.org/D9474
llvm-svn: 239509
-
Nemanja Ivanovic authored
This patch corresponds to review: http://reviews.llvm.org/D10096 This is the back end portion of the patch related to D10095. The patch adds the instructions and back end intrinsics for: vbpermq vgbbd llvm-svn: 239505
-
Reid Kleckner authored
This reverts commit r239437. This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol directly instead of using it to do an indirect function call. llvm-svn: 239502
-
- Jun 10, 2015
-
-
Peter Collingbourne authored
If the first argument to a function is a 'this' argument and the second has the sret attribute, the ArgumentPromotion pass may promote the 'this' argument to more than one argument, violating the IR constraint that 'sret' may only be applied to the first or second argument. Although this IR constraint is arguably unnecessary, it highlighted the fact that ArgPromotion does not need to preserve this attribute. Dropping the attribute reduces register pressure in the backend by avoiding the register copy required by sret. Because sret implies noalias, we also replace the former with the latter. Differential Revision: http://reviews.llvm.org/D10353 llvm-svn: 239488
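For illustration only (not from the patch), a minimal IR sketch of the situation described, with hypothetical type and function names:

    %class.Foo = type { i32 }
    %struct.S = type { i32, i32 }

    ; sret is only valid on the first or second argument. If ArgPromotion splits
    ; %this into several promoted arguments, %out would no longer be argument one
    ; or two, so the pass now drops sret (replacing it with noalias) instead.
    define void @method(%class.Foo* %this, %struct.S* sret %out) {
    entry:
      ret void
    }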
-
Sanjay Patel authored
This is a reimplementation of D9780 at the machine instruction level rather than the DAG. Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a starting point; see the TODO comments) to increase ILP when it's safe to do so. The code is closely based on the existing MachineCombiner optimization that is implemented for AArch64. This patch should not cause the kind of spilling tragedy that led to the reversion of r236031. Differential Revision: http://reviews.llvm.org/D10321 llvm-svn: 239486
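For illustration only (not from the patch), a minimal IR sketch of the kind of serial fadd chain whose machine code this reassociation rebalances; the function name is hypothetical, and fast-math flags are assumed to be what makes the transform safe:

    define float @reassoc(float %a, float %b, float %c, float %d) {
    entry:
      ; A serial dependence chain: ((a + b) + c) + d. Reassociating it into
      ; (a + b) + (c + d) lets the two inner adds execute in parallel.
      %t0 = fadd fast float %a, %b
      %t1 = fadd fast float %t0, %c
      %t2 = fadd fast float %t1, %d
      ret float %t2
    }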
-
Reid Kleckner authored
Our usage of 1 was a holdover from __C_specific_handler. llvm-svn: 239482
-
Alexey Samsonov authored
Determining proper debug locations for instructions created in PHITransAddr is tricky. We use a simple approach here and simply copy debug locations from the instructions computing the load address to the "corresponding" instructions re-creating the address computation in predecessor basic blocks. This may not always be correct, given all the rearrangement and simplification going on, and debug locations may jump around a lot, as the basic blocks we copy locations between may be very far from each other. Still, this works well in most simple cases (e.g. when the chain of address-computing instructions is short, or our mapping turns out to be 1-to-1), and we want to have *some* reasonable debug locations associated with newly inserted instructions. See the http://reviews.llvm.org/D10351 review thread for more details.
Test Plan: regression test suite
Reviewers: spatel, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10351
llvm-svn: 239479
-
Colin LeMahieu authored
[Hexagon] Adding decoders for signed operands and ensuring all signed operand types disassemble correctly. llvm-svn: 239477
-
Igor Laevsky authored
Differential Revision: http://reviews.llvm.org/D10215 llvm-svn: 239473
-
Igor Laevsky authored
During statepoint lowering we can sometimes avoid spilling a value if we know that it was already spilled for a previous statepoint. We were doing this by checking whether the incoming statepoint value was lowered into a load from a stack slot. This worked only within the boundaries of one basic block. Instead of looking at the lowered node, we can look directly at the llvm-ir value, and if it was a gc.relocate (or some simple modification of it), look up the stack slot for its derived pointer and reuse that stack slot. This allows us to look across basic block boundaries. Differential Revision: http://reviews.llvm.org/D10251 llvm-svn: 239472
-
Elena Demikhovsky authored
cmp eq should give a kxnor instruction; cmp neq should give kxor. https://llvm.org/bugs/show_bug.cgi?id=23631 llvm-svn: 239460
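For illustration only (not from the commit), a minimal IR sketch of the kind of mask-register compare involved, assuming i1-vector compares lower to AVX-512 mask-register logic; the function name is hypothetical:

    define <16 x i1> @mask_eq(<16 x i1> %a, <16 x i1> %b) {
      ; Element-wise equality of two mask values is XNOR, so this should select
      ; kxnorw; the corresponding "icmp ne" should select kxorw.
      %cmp = icmp eq <16 x i1> %a, %b
      ret <16 x i1> %cmp
    }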
-
Reid Kleckner authored
We have to do this manually, the runtime only sets up ebp. Fixes a crash when returning after catching an exception. llvm-svn: 239451
-
Reid Kleckner authored
Use a "safeseh" string attribute to do this. You would think we chould just accumulate the set of personalities like we do on dwarf, but this fails to account for the LSDA-loading thunks we use for __CxxFrameHandler3. Each of those needs to make it into .sxdata as well. The string attribute seemed like the most straightforward approach. llvm-svn: 239448
-
NAKAMURA Takumi authored
Add explicit -mtriple=arm-unknown to llvm/test/CodeGen/ARM/disable-tail-calls.ll, to satisfy *-win32. llvm-svn: 239442
-
Alexey Samsonov authored
Test Plan: regression test suite
Reviewers: eugenis, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10343
llvm-svn: 239438
-
Peter Collingbourne authored
This ensures that LTO clients see the correct external symbol name. Differential Revision: http://reviews.llvm.org/D10318 llvm-svn: 239437
-
- Jun 09, 2015
-
-
Jingyue Wu authored
Summary: We used to assume V->RAUW only modifies the operand list of V's user. However, if V and V's user are Constants, RAUW may replace and invalidate V's user entirely. This patch fixes the above issue by letting the caller replace the operand instead of calling RAUW on Constants.
Test Plan: @nested_const_expr and @rauw in access-non-generic.ll
Reviewers: broune, jholewinski
Reviewed By: broune, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10345
llvm-svn: 239435
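For illustration only (not from the patch), a minimal IR sketch of a nested constant expression of the kind involved, assuming an NVPTX-style shared-to-generic addrspacecast; the global names are hypothetical:

    @g = addrspace(3) global float 0.000000e+00
    ; The outer getelementptr ConstantExpr is the "user" of the inner addrspacecast
    ; ConstantExpr. Because constants are uniqued, calling RAUW on the inner
    ; constant can replace and destroy the outer one rather than mutate it in place.
    @p = global float* getelementptr (float, float* addrspacecast (float addrspace(3)* @g to float*), i64 0)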
-
Peter Collingbourne authored
llvm-lib is intended to be a lib.exe compatible utility that also understands bitcode. The implementation lives in a library so that lld can use it to implement /lib. Differential Revision: http://reviews.llvm.org/D10297 llvm-svn: 239434
-
Reid Kleckner authored
This gets all the handler info through to the asm printer and we can look at the .xdata tables now. I've convinced one small catch-all test case to work, but other than that, it would be a stretch to say this is functional. The state numbering algorithm avoids doing any scope reconstruction as we do for C++ to simplify the implementation. llvm-svn: 239433
-
Chad Rosier authored
Store instructions do not modify register values, and therefore it's safe to form a store pair even if the source register has been read in between the two store instructions. Previously, the read of w1 (see below) prevented the formation of a stp.

    str w0, [x2]
    ldr w8, [x2, #8]
    add w0, w8, w1
    str w1, [x2, #4]
    ret

We now generate the following code.

    stp w0, w1, [x2]
    ldr w8, [x2, #8]
    add w0, w8, w1
    ret

All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass. Performance results for SPEC2K were within noise.
llvm-svn: 239432
-
Akira Hatanaka authored
Remove the DisableTailCalls option from TargetOptions and the code in TargetMachine::resetTargetOptions that was resetting it. Remove the uses of DisableTailCalls in subclasses of TargetLowering and use the value of the function attribute "disable-tail-calls" instead. Also, unconditionally add pass TailCallElim to the pipeline and check the function attribute at the start of runOnFunction to disable the pass on a per-function basis. This is part of the work to remove TargetMachine::resetTargetOptions, and since DisableTailCalls was the last non-fast-math option that was being reset in that function, we should be able to remove the function entirely after the work to propagate IR-level fast-math flags to DAG nodes is completed. Out-of-tree users should remove the uses of DisableTailCalls and make changes to attach attribute "disable-tail-calls"="true" or "false" to the functions in the IR.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10099
llvm-svn: 239427
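For illustration only (not part of the commit), a minimal IR sketch of the attribute the message describes; the function names are hypothetical:

    ; Calls made inside @caller will not be tail-call optimized.
    define void @caller() #0 {
    entry:
      call void @callee()
      ret void
    }

    declare void @callee()

    attributes #0 = { "disable-tail-calls"="true" }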
-
Arnold Schwaighofer authored
We don't know whether the weak function's definition is the definitive definition. rdar://21303727 llvm-svn: 239422
-
David Blaikie authored
This reverts commit r239380 due to apparent GDB regressions: http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/22562 llvm-svn: 239420
-