Commits · 7ecb0714be826145fa69fdf42956203f0d8bd39a · Roger Ferrer / llvm-epi

May 05, 2015

Added Andrey Churbanov as the owner of the OpenMP runtime library code · 7ecb0714
Andrey Churbanov authored May 05, 2015
```
llvm-svn: 236540
```

[Inliner] Discard empty COMDAT groups · ac256cfe

David Majnemer authored May 05, 2015

COMDAT groups that have been rendered unused by inlining are
discardable if we can prove that we've made the group empty.

This fixes PR22285.

llvm-svn: 236539


Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC · 7605e37a

Pete Cooper authored May 05, 2015

Note, this is a reapplication of r236514 with a fix to not assert on non-register operands, but instead only handle them until the subsequent commit. Original commit message follows.

The code was basically the same here already. Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs.

Will be used in the next commit to also handle regmasks.

llvm-svn: 236538


Build ASan runtime library with -z global on Android. · 91960900
Evgeniy Stepanov authored May 05, 2015
```
llvm-svn: 236537
```

Thumb2SizeReduction: Check the correct set of registers for LDMIA. · 85a0e23b

Peter Collingbourne authored May 05, 2015

The register set for LDMIA begins at offset 3, not 4. We were previously
missing the short encoding of this instruction in the case where the base
register was the first register in the register set.

Also clean up some dead code:

- The isARMLowRegister check is redundant with what VerifyLowRegs does;
  replace with an assert.
- Remove handling of LDMDB instruction, which has no short encoding (and
  does not appear in ReduceTable).

Differential Revision: http://reviews.llvm.org/D9485

llvm-svn: 236535


Integrate libiomp CMake into LLVM CMake build system. · 648467ed

Andrey Churbanov authored May 05, 2015

This patch integrates the libiomp CMake build system into the LLVM CMake build
system so that users can check out libiomp into the projects directory of llvm
and build llvm, clang, and libiomp all together. These changes specifically
introduce a new install target which will put libraries and headers into the
correct locations in either a standalone build or a build as part of llvm.
The copy_recipe() method has been removed in favor of the POST_BUILD method
to move headers into the exports subdirectory. And lastly, the MicroTests.cmake
file was refactored which led to simpler target dependencies and a new target,
make libiomp-micro-tests, which performs the 5 small tests (test-relo,
test-touch, etc.) when called.

llvm-svn: 236534


[analyzer] This eliminates regression caused by r236423. · 22f6189f
Anton Yartsev authored May 05, 2015
```
Wrap an argument with quotes only if it has spaces.

llvm-svn: 236533
```
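The quoting rule in this message can be sketched in a few lines. This is only an illustration, assuming a plain string argument and double quotes; the helper name `quoteIfHasSpaces` is hypothetical and is not taken from the analyzer scripts.

```cpp
#include <string>

// Hypothetical helper (not from the analyzer sources): wrap an argument in
// double quotes only when it contains at least one space.
std::string quoteIfHasSpaces(const std::string &arg) {
  if (arg.find(' ') == std::string::npos)
    return arg;             // no spaces: pass the argument through unchanged
  return "\"" + arg + "\""; // spaces present: quote it so it stays one token
}
```

Arguments without spaces are left untouched, so options such as `-O2` are never wrapped in quotes.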

[SystemZ] Add support for z13 low-level vector builtins · 5722c0f1

Ulrich Weigand authored May 05, 2015

This adds low-level builtins to allow access to all of the z13 vector
instructions. Note that instructions whose semantics can be described
by standard C (including clang extensions) do not get any builtins.

For each instruction whose semantics *cannot* (fully) be described, we
define a builtin named __builtin_s390_<insn> that directly maps to this
instruction. These are intended to be compatible with GCC.

For instructions that also set the condition code, the builtin will take
an extra argument of type "int *" at the end. The integer pointed to by
this argument will be set to the post-instruction CC value.
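
The calling convention described above (result returned, CC written through a trailing "int *") can be modelled portably. The sketch below is a hypothetical stand-in, not one of the real __builtin_s390_<insn> functions, and the CC encoding it uses (0 = all elements equal, 1 = some equal, 3 = none equal) is an assumption made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V16 = std::array<std::int8_t, 16>;

// Hypothetical portable model of a CC-setting vector builtin.
// Returns a per-element mask (-1 where equal, 0 otherwise) and stores a
// condition code through the extra trailing int* argument.
// Assumed encoding for illustration: 0 = all equal, 1 = some, 3 = none.
V16 compareEqualWithCC(const V16 &a, const V16 &b, int *cc) {
  V16 mask{};
  std::size_t equalCount = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const bool eq = a[i] == b[i];
    mask[i] = static_cast<std::int8_t>(eq ? -1 : 0);
    if (eq)
      ++equalCount;
  }
  *cc = equalCount == a.size() ? 0 : (equalCount > 0 ? 1 : 3);
  return mask;
}
```

A caller declares an int, passes its address as the last argument, and inspects it after the call.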

For many instructions, the low-level builtin is mapped to the corresponding
LLVM IR intrinsic. However, a number of instructions can be represented
in standard LLVM IR without requiring use of a target intrinsic.

Some instructions require immediate integer operands within a certain
range. Those are verified at the Sema level.

Based on a patch by Richard Sandiford.

llvm-svn: 236532


[SystemZ] Add support for z13 and its vector facility · 66ff51b4

Ulrich Weigand authored May 05, 2015

This patch adds support for the z13 architecture type.  For compatibility
with GCC, a pair of options -mvx / -mno-vx can be used to selectively
enable/disable use of the vector facility.

When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
  (except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.

The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.

However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.

These alignment rules are not only implemented at the C language level,
but also at the LLVM IR level.  This is done by selecting a different
DataLayout string depending on whether the vector ABI is in effect or not.

Based on a patch by Richard Sandiford.

llvm-svn: 236531


[DAGCombiner] Account for getVectorIdxTy() when narrowing vector load · 9958c489

Ulrich Weigand authored May 05, 2015

This patch makes ReplaceExtractVectorEltOfLoadWithNarrowedLoad convert
the element number from getVectorIdxTy() to PtrTy before doing pointer
arithmetic on it.  This is needed on z, where element numbers are i32
but pointers are i64.

Original patch by Richard Sandiford.

llvm-svn: 236530


[DAGCombiner] Fix ReplaceExtractVectorEltOfLoadWithNarrowedLoad for BE · af2c618e

Ulrich Weigand authored May 05, 2015

For little-endian, the function would convert (extract_vector_elt (load X), Y)
to X + Y*sizeof(elt).  For big-endian it would instead use
X + sizeof(vec) - Y*sizeof(elt).  The big-endian case wasn't right since
vector index order always follows memory/array order, even for big-endian.
(Note that the current handling has to be wrong for Y==0 since it would
access beyond the end of the vector.)
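
The two address formulas can be compared with a small worked example. This is a sketch of the arithmetic only, using byte offsets relative to X; the function names are hypothetical and do not appear in DAGCombiner.

```cpp
#include <cstddef>

// Corrected rule for both endiannesses: element Y lives at Y * sizeof(elt).
constexpr std::size_t elementOffset(std::size_t index, std::size_t eltSize) {
  return index * eltSize;
}

// The old big-endian formula from the message: sizeof(vec) - Y * sizeof(elt).
constexpr std::size_t oldBigEndianOffset(std::size_t vecSize, std::size_t index,
                                         std::size_t eltSize) {
  return vecSize - index * eltSize;
}

// For a 16-byte vector of 4-byte elements, element 0 starts at offset 0.
static_assert(elementOffset(0, 4) == 0, "element 0 starts the vector");
// The old formula puts element 0 at offset 16, one past the last valid byte.
static_assert(oldBigEndianOffset(16, 0, 4) == 16, "Y == 0 lands past the end");
```

The second static_assert is the Y==0 case called out in the parenthetical above.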

Original patch by Richard Sandiford.

llvm-svn: 236529


[LegalizeVectorTypes] Allow single loads and stores for more short vectors · 2693c0a4

Ulrich Weigand authored May 05, 2015

When lowering a load or store for TypeWidenVector, the type legalizer
would use a single load or store if the associated integer type was legal.
E.g. it would load a v4i8 as an i32 if i32 was legal.

This patch extends that behavior to promoted integers as well as legal ones.
If the integer type for the full vector width is TypePromoteInteger,
the element type is going to be TypePromoteInteger too, and it's still
better to use a single promoting load or truncating store rather than N
individual promoting loads or truncating stores.  E.g. if you have a v2i8
on a target where i16 is promoted to i32, it's better to load the v2i8 as
an i16 rather than load both i8s individually.
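
A source-level analogy of the v2i8 case, for illustration only: one 2-byte access yields the same elements as two 1-byte accesses. The type and function names are hypothetical, and this models the idea rather than the type legalizer itself.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a v2i8 value: two i8 elements in memory order.
struct V2I8 {
  std::uint8_t e0;
  std::uint8_t e1;
};
static_assert(sizeof(V2I8) == 2, "two packed i8 elements");

// N individual loads: one memory access per element.
V2I8 loadElementwise(const std::uint8_t *p) { return V2I8{p[0], p[1]}; }

// A single 16-bit load covering the whole vector. memcpy is used so the
// access is well defined regardless of alignment or aliasing.
V2I8 loadAsI16(const std::uint8_t *p) {
  std::uint16_t bits;
  std::memcpy(&bits, p, sizeof bits); // one 2-byte load
  V2I8 v;
  std::memcpy(&v, &bits, sizeof v);   // same bytes, viewed as elements
  return v;
}
```

Both functions return identical elements on any host, because the bytes are copied in memory order both ways.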

Original patch by Richard Sandiford.

llvm-svn: 236528


[SystemZ] Add vector intrinsics · c1708b26

Ulrich Weigand authored May 05, 2015

This adds intrinsics to allow access to all of the z13 vector instructions.
Note that instructions whose semantics can be described by standard LLVM IR
do not get any intrinsics.

For each instruction whose semantics *cannot* (fully) be described, we
define an LLVM IR target-specific intrinsic that directly maps to this
instruction.

For instructions that also set the condition code, the LLVM IR intrinsic
returns the post-instruction CC value as a second result.  Instruction
selection will attempt to detect code that compares that CC value against
constants and use the condition code directly instead.

Based on a patch by Richard Sandiford.

llvm-svn: 236527


[SystemZ] Mark v1i128 and v1f128 as unsupported · 5211f9ff

Ulrich Weigand authored May 05, 2015

The ABI specifies that <1 x i128> and <1 x fp128> are supposed to be
passed in vector registers.  We do not yet support those types, and
some infrastructure is missing before we can do so.

In order to prevent accidentally generating code violating the ABI,
this patch adds checks to detect those types and error out if user
code attempts to use them.

llvm-svn: 236526


[SystemZ] Handle sub-128 vectors · cd2a1b53

Ulrich Weigand authored May 05, 2015

The ABI allows sub-128 vectors to be passed and returned in registers,
with the vector occupying the upper part of a register.  We therefore
want to legalize those types by widening the vector rather than promoting
the elements.

The patch includes some simple tests for sub-128 vectors and also tests
that we can recognize various pack sequences, some of which use sub-128
vectors as temporary results.  One of these forms is based on the pack
sequences generated by llvmpipe when no intrinsics are used.

Signed unpacks are recognized as BUILD_VECTORs whose elements are
individually sign-extended.  Unsigned unpacks can have the equivalent
form with zero extension, but they also occur as shuffles in which some
elements are zero.

Based on a patch by Richard Sandiford.

llvm-svn: 236525


[SystemZ] Add CodeGen support for scalar f64 ops in vector registers · 49506d78

Ulrich Weigand authored May 05, 2015

The z13 vector facility includes some instructions that operate only on the
high f64 in a v2f64, effectively extending the FP register set from 16
to 32 registers.  It's still better to use the old instructions if the
operands happen to fit though, since the older instructions have a shorter
encoding.

Based on a patch by Richard Sandiford.

llvm-svn: 236524


[SystemZ] Add CodeGen support for v4f32 · 80b3af7a

Ulrich Weigand authored May 05, 2015

The architecture doesn't really have any native v4f32 operations except
v4f32->v2f64 and v2f64->v4f32 conversions, with only half of the v4f32
elements being used.  Even so, using vector registers for <4 x float>
and scalarising individual operations is much better than generating
completely scalar code, since there's much less register pressure.
It's also more efficient to do v4f32 comparisons by extending to 2
v2f64s, comparing those, then packing the result.

This particularly helps with llvmpipe.

Based on a patch by Richard Sandiford.

llvm-svn: 236523


[SystemZ] Add CodeGen support for v2f64 · cd808237

Ulrich Weigand authored May 05, 2015

This adds ABI and CodeGen support for the v2f64 type, which is natively
supported by z13 instructions.

Based on a patch by Richard Sandiford.

llvm-svn: 236522


[SystemZ] Add CodeGen support for integer vector types · ce4c1095

Ulrich Weigand authored May 05, 2015

This is the first of a series of patches to add CodeGen support exploiting
the instructions of the z13 vector facility.  This patch adds support
for the native integer vector types (v16i8, v8i16, v4i32, v2i64).

When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
  (except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.

The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.

However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.

These alignment rules are not only implemented at the C language level
(implemented in clang), but also at the LLVM IR level.  This is done
by selecting a different DataLayout string depending on whether the
vector ABI is in effect or not.

Based on a patch by Richard Sandiford.

llvm-svn: 236521


[SystemZ] Add z13 vector facility and MC support · a8b04e1c

Ulrich Weigand authored May 05, 2015

This patch adds support for the z13 processor type and its vector facility,
and adds MC support for all new instructions provided by that facility.

Apart from defining the new instructions, the main changes are:

- Adding VR128, VR64 and VR32 register classes.
- Making FP64 a subclass of VR64 and FP32 a subclass of VR32.
- Adding a D(V,B) addressing mode for scatter/gather operations.
- Adding 1-, 2-, and 3-bit immediate operands for some 4-bit fields.
  Until now all immediate operands have been the same width as the
  underlying field (hence the assert->return change in decode[SU]ImmOperand).

In addition, sys::getHostCPUName is extended to detect running natively
on a z13 machine.

Based on a patch by Richard Sandiford.

llvm-svn: 236520


Allow TransformTypos to ignore corrections to a specified VarDecl. · b8499f09

Kaelyn Takata authored May 05, 2015

This is needed to prevent a TypoExpr from being corrected to a variable
when the TypoExpr is a subexpression of that variable's initializer.

Also exclude more keywords from the correction candidate pool when the
subsequent token is .* or ->* since keywords like "new" or "return"
aren't valid on the left side of those operators.

Fixes PR23140.

llvm-svn: 236519


Revert "Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC" · 336d90b6

Pete Cooper authored May 05, 2015

This reverts commit 963cdbccf6e5578822836fd9b2ebece0ba9a60b7 (i.e. r236514).

This is to get the bots green while I investigate.

llvm-svn: 236518


Revert "Fix IfConverter to handle regmask machine operands." · 05b84d41

Pete Cooper authored May 05, 2015

This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (i.e. r236515).

This is to get the bots green while I investigate the failures.

llvm-svn: 236517


Fix process launch from Windows host to Android target. · ce36c4ce

Chaoren Lin authored May 05, 2015

Summary:
- A denormalized path on a Windows host causes a bad `A` packet.
- Executables copied from a Windows host don't have executable bits.

Reviewers: tberghammer, zturner, ovyalov

Reviewed By: ovyalov

Subscribers: tberghammer, lldb-commits

Differential Revision: http://reviews.llvm.org/D9492

llvm-svn: 236516


Fix IfConverter to handle regmask machine operands. · 6ebc2077

Pete Cooper authored May 05, 2015

A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

llvm-svn: 236515


Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC · bbd1c727

Pete Cooper authored May 05, 2015

Will be used in the next commit to also handle regmasks.

llvm-svn: 236514


Fix typo in assert message. NFC. · 32a0bee2
Diego Novillo authored May 05, 2015
```
llvm-svn: 236513
```
Fix the clang -Werror build, use of uninitialized variable. · b10516e4
David Blaikie authored May 05, 2015
```
llvm-svn: 236512
```

Update BasicAliasAnalysis to understand that nothing aliases with undef values. · 3459d6ea

Daniel Berlin authored May 05, 2015

It got this in some cases (if one of them was an identified object), but not in all cases.

This caused stores to undef to block load-forwarding in some cases, etc.

Added test to Transforms/GVN to verify optimization occurs as expected.

llvm-svn: 236511


[opaque pointer type] Track explicit GEP pointee type through in-memory IR · 73cf872a
David Blaikie authored May 05, 2015
```
llvm-svn: 236510
```
Fix Android build. · 26438d26
Chaoren Lin authored May 05, 2015
```
llvm-svn: 236509
```

Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" · 0738a9c0

Reid Kleckner authored May 05, 2015

This reverts commit r236360.

This change exposed a bug in WinEHPrepare by opting win32 code into EH
preparation. We already knew that WinEHPrepare has bugs, and it is the
status quo for x64, so I don't think that's a reason to hold off on this
change. I disabled exceptions in the sanitizer tests in r236505 and an
earlier revision.

llvm-svn: 236508


[ShrinkWrap] Add (a simplified version) of shrink-wrapping. · 61b305ed

Quentin Colombet authored May 05, 2015

This patch introduces a new pass that computes safe points at which to insert
the prologue and epilogue of the function.
The goal is to find safe points that are cheaper than the entry and exit
blocks.

As an example, and to avoid introducing regressions, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exit blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that performs a call in only one branch
of an if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (with the cfi directives removed for
readability):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that performs the shrink-wrapping
analysis (see the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore points into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need a
hack to properly avoid frequently executed points. Instead, it relies on
dominance and loop properties.
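
The dominance side of this can be sketched on a toy CFG. The code below is an illustration under stated assumptions, not the algorithm in ShrinkWrap.cpp: the CFG representation and function names are hypothetical, loop properties are ignored, and the save point is taken to be the nearest common dominator of the blocks that need the frame.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy CFG: succs[b] lists the successors of block b; block 0 is
// the entry block.
using CFG = std::vector<std::vector<int>>;

// Iterative dominator sets: dom[b][d] is true when block d dominates block b.
std::vector<std::vector<bool>> computeDominators(const CFG &succs) {
  const std::size_t n = succs.size();
  std::vector<std::vector<int>> preds(n);
  for (std::size_t b = 0; b < n; ++b)
    for (int s : succs[b])
      preds[s].push_back(static_cast<int>(b));

  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[0].assign(n, false);
  dom[0][0] = true; // the entry block is dominated only by itself
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      // Intersect the dominator sets of all predecessors, then add b itself.
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// Candidate save point: the deepest block that dominates every block in
// `blocks` (the common dominator with the largest dominator set of its own).
int nearestCommonDominator(const CFG &succs, const std::vector<int> &blocks) {
  const auto dom = computeDominators(succs);
  int best = 0;
  std::size_t bestDepth = 0;
  for (std::size_t d = 0; d < succs.size(); ++d) {
    bool common = true;
    for (int b : blocks)
      common = common && dom[b][d];
    if (!common)
      continue;
    std::size_t depth = 0;
    for (bool isDom : dom[d])
      if (isDom)
        ++depth;
    if (depth > bestDepth) {
      best = static_cast<int>(d);
      bestDepth = depth;
    }
  }
  return best;
}
```

For the motivating example (entry = 0, %true = 1, %false = 2), the CFG is `{{1, 2}, {2}, {}}`. If only block 1 needs the frame, the candidate save point is block 1; if blocks 1 and 2 both need it, the candidate falls back to the entry block.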

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overridden on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are
not necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save points and restore points; the impacted
components would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore points instead of one
  pointer.
- PEI: Should loop over the save points and restore points.
Anyhow, at least for this first iteration, I do not believe it is interesting
to support the complex cases. We should revisit that when we have motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>

llvm-svn: 236507


[Orc] Reapply r236465 with fixes for the MSVC bots. · cd68eba3
Lang Hames authored May 05, 2015
```
llvm-svn: 236506
```
Disable exceptions with Clang on Windows in lib/sanitizer-common/tests · 60bdf6e4
Reid Kleckner authored May 05, 2015
```
While I'm here, fix a copy-paste bug so we get debug info for these
tests.

llvm-svn: 236505
```

[bugpoint] Increase default memory limit to 400MB to fix bugpoint tests. · 85202063

Daniel Sanders authored May 05, 2015

I tracked down the bug to an unchecked malloc in SmallVectorBase::grow_pod().
This malloc is returning NULL on my machine when running under bugpoint but not
when -enable-valgrind is given.

llvm-svn: 236504


This patch adds ABI support for v1i128 data type. · d4eb73c0

Kit Barton authored May 05, 2015

It adds v1i128 to the appropriate register classes and checks parameter passing
and return values.

This is related to http://reviews.llvm.org/D9081, which will add instructions
that exploit the v1i128 datatype.

Phabricator review: http://reviews.llvm.org/D9475

llvm-svn: 236503


[NativeProcessLinux] Get rid of the thread state coordinator thread · 45f5cb31

Pavel Labath authored May 05, 2015

Summary:
This change removes the thread state coordinator thread by making all the operations it was
performing synchronous. In order to prevent deadlock, NativeProcessLinux must now always call
m_monitor->DoOperation with the m_threads_mutex released. This is needed because HandleWait
callbacks lock the mutex (which means the monitor thread will block waiting on whoever holds the
lock). If the other thread now requests a monitor operation, it will wait for the monitor thread
to process it, creating a deadlock.

To preserve this invariant I have introduced two new Monitor commands: "begin operation block"
and "end operation block". The begin command blocks the monitor from processing waitpid
events until the corresponding end command, thereby assuring the monitor does not attempt to
acquire the mutex.
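
The effect of the two commands can be modelled with a single-threaded toy. This is a hypothetical sketch, not lldb code: the class and method names are invented, and the assumption that events arriving inside a block are queued and handled at the end command is made only for illustration.

```cpp
#include <deque>
#include <vector>

// Hypothetical toy model of "begin operation block" / "end operation block".
class ToyMonitor {
public:
  // After this call, waitpid-style events are held back instead of handled.
  void beginOperationBlock() { inBlock_ = true; }

  // Leaving the block handles everything that arrived in the meantime.
  void endOperationBlock() {
    inBlock_ = false;
    drain();
  }

  // An event for thread `tid` arrives from the (simulated) waitpid call.
  void onWaitEvent(int tid) {
    pending_.push_back(tid);
    if (!inBlock_)
      drain();
  }

  const std::vector<int> &handled() const { return handled_; }

private:
  void drain() {
    while (!pending_.empty()) {
      // In the real code this is where a callback would take the mutex.
      handled_.push_back(pending_.front());
      pending_.pop_front();
    }
  }

  bool inBlock_ = false;
  std::deque<int> pending_;
  std::vector<int> handled_;
};
```

While a block is open no callback runs, so a caller holding the mutex inside the block cannot be waited on by the monitor.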

Test Plan: Run the test suite locally, verify no tests fail.

Reviewers: vharron, chaoren

Subscribers: lldb-commits

Differential Revision: http://reviews.llvm.org/D9227

llvm-svn: 236501


Revert "Enable TestConvenienceVariables on Linux" · 941b743f

Pavel Labath authored May 05, 2015

This reverts commit 193ac6993b64a502db6dc7f2d69dafc47c318407.

The buildbot says the test still fails. :)

llvm-svn: 236500


Enable TestConvenienceVariables on Linux · 0d31df47

Pavel Labath authored May 05, 2015

The test has passed in the last 300 runs for me, so I am enabling it to see what the buildbot says.

llvm-svn: 236498
