Commits · 4dbdc9021d7f655a8a5bd6c97806a9069778252a · Roger Ferrer / llvm-epi-0.8

Oct 31, 2013

Debug Info: remove duplication of DIEs when a DIE can be shared across CUs. · 4dbdc902

Manman Ren authored Oct 31, 2013

We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the
corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs,
which is why we keep the map in DwarfDebug instead of CompileUnit.

We assume that a DIE that has not yet been added to an owner belongs to the
current CU. Since DIEs for the type system are added to their owners
immediately after creation, and other DIEs belong to the current CU, the
assumption should hold.
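
The sharing scheme described above can be sketched roughly as follows. This is a toy model, not LLVM's code: `TypeNode`, `Die`, `Unit`, `DebugWriter`, and `needsCrossUnitRef` are hypothetical stand-ins for the MDNode, DIE, CompileUnit, and DwarfDebug roles named in the commit message.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy stand-ins for illustration only; these are not LLVM's classes.
struct TypeNode { int id; };   // plays the role of a type MDNode
struct Unit;                   // plays the role of a CompileUnit

struct Die {
  const TypeNode *type = nullptr;
  Unit *owner = nullptr;       // stays null until the DIE is added to an owner
};

struct Unit { std::vector<Die *> children; };

// Plays the role of DwarfDebug: it outlives every unit, so it holds the map.
class DebugWriter {
  std::unordered_map<const TypeNode *, std::unique_ptr<Die>> typeToDie;

public:
  // Returns the single DIE for `node`, creating it under `current` on first use.
  Die *getOrCreateTypeDie(const TypeNode *node, Unit &current) {
    assert(node && "type node must not be null");
    auto it = typeToDie.find(node);
    if (it != typeToDie.end())
      return it->second.get();
    auto die = std::make_unique<Die>();
    die->type = node;
    die->owner = &current;     // type DIEs get an owner right after creation
    current.children.push_back(die.get());
    Die *raw = die.get();
    typeToDie.emplace(node, std::move(die));
    return raw;
  }
};

// A reference needs the cross-unit form (ref_addr in the commit message) only
// when the DIE lives in another unit. A DIE with no owner yet is assumed to
// belong to the current unit.
bool needsCrossUnitRef(const Die &die, const Unit &current) {
  const Unit *owner = die.owner ? die.owner : &current;
  return owner != &current;
}
```

Requesting the same `TypeNode` from two different units returns the same `Die`; only the second unit needs the cross-unit reference form.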

A test case is added to show that we only create a single DIE for a type
MDNode and we use ref_addr to refer to the type DIE.

We also add a test case to show ref_addr relocations for non-darwin
platforms.

llvm-svn: 193779

4dbdc902

DWARFAbbreviationDeclaration: remove dead code, refactor parsing code and make... · d5cc93c3

Alexey Samsonov authored Oct 31, 2013

DWARFAbbreviationDeclaration: remove dead code, refactor parsing code and make it more robust. No functionality change.

llvm-svn: 193770

d5cc93c3

Lower stackmap intrinsics directly to their target opcode in the DAG builder. · 74f4c749
Andrew Trick authored Oct 31, 2013
```
llvm-svn: 193769
```
74f4c749
Enable variable arguments support for intrinsics. · a2efd99b
Andrew Trick authored Oct 31, 2013
```
llvm-svn: 193766
```
a2efd99b
whitespace · d4d1d9c0
Andrew Trick authored Oct 31, 2013
```
llvm-svn: 193765
```
d4d1d9c0
Remove another unused flag. · 4b102d0e
Rafael Espindola authored Oct 31, 2013
```
llvm-svn: 193756
```
4b102d0e
Remove unused flag. · 74e1d0a0
Rafael Espindola authored Oct 31, 2013
```
llvm-svn: 193752
```
74e1d0a0
Rules adjustments in order to build on DragonFly BSD. · aca9739a
Rafael Espindola authored Oct 31, 2013
```
Patch by Robin Hahling.

llvm-svn: 193750
```
aca9739a
Remove the --shrink-wrap option. · dbec9d9b
Rafael Espindola authored Oct 31, 2013
```
It had no tests, was unused and was "experimental at best".

llvm-svn: 193749
```
dbec9d9b
Add AVX512 unmasked integer broadcast intrinsics and support. · 394d557f
Cameron McInally authored Oct 31, 2013
```
llvm-svn: 193748
```
394d557f
AVX-512: Implemented CMOV for 512-bit vectors · 49665690
Elena Demikhovsky authored Oct 31, 2013
```
llvm-svn: 193747
```
49665690

[SystemZ] Automatically detect zEC12 and z196 hosts · f834ea19

Richard Sandiford authored Oct 31, 2013

As on other hosts, the CPU identification instruction is privileged,
so we need to look through /proc/cpuinfo.  I copied the PowerPC way of
handling "generic".

Several tests were implicitly assuming z10 and so failed on z196.

llvm-svn: 193742

f834ea19

[AArch64] Make the use of FP instructions optional, but enabled by default. · f80f95fc

Amara Emerson authored Oct 31, 2013

This adds a new subtarget feature called FPARMv8 (implied by NEON), and
predicates the support of the FP instructions and registers on this feature.

llvm-svn: 193739

f80f95fc

Fix a use after free on invalid input. · 26b43cac
Rafael Espindola authored Oct 31, 2013
```
llvm-svn: 193737
```
26b43cac
Fix most memory leaks in tablegen. · 8fb73c87
Rafael Espindola authored Oct 31, 2013
```
Found by the valgrind bot.

llvm-svn: 193736
```
8fb73c87
Merge CallGraph and BasicCallGraph. · 6554e5a9
Rafael Espindola authored Oct 31, 2013
```
llvm-svn: 193734
```
6554e5a9

Legalize: Improve legalization of long vector extends. · 72366786

Jim Grosbach authored Oct 31, 2013

When an extend more than doubles the size of the elements (e.g., a zext
from v16i8 to v16i32), the normal legalization method of splitting the
vectors will run into problems as by the time the destination vector is
legal, the source vector is illegal. The end result is the operation
often becoming scalarized, with the typical horrible performance. For
example, on x86_64, the simple input of:
define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
  %tmp = zext <16 x i8> %a to <16 x i32>
  store <16 x i32> %tmp, <16 x i32>*%p
  ret void
}

Generates:
  .section  __TEXT,__text,regular,pure_instructions
  .section  __TEXT,__const
  .align  5
LCPI0_0:
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _bar
  .align  4, 0x90
_bar:
  vpunpckhbw  %xmm0, %xmm0, %xmm1
  vpunpckhwd  %xmm0, %xmm1, %xmm2
  vpmovzxwd %xmm1, %xmm1
  vinsertf128 $1, %xmm2, %ymm1, %ymm1
  vmovaps LCPI0_0(%rip), %ymm2
  vandps  %ymm2, %ymm1, %ymm1
  vpmovzxbw %xmm0, %xmm3
  vpunpckhwd  %xmm0, %xmm3, %xmm3
  vpmovzxbd %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vandps  %ymm2, %ymm0, %ymm0
  vmovaps %ymm0, (%rdi)
  vmovaps %ymm1, 32(%rdi)
  vzeroupper
  ret

So instead we can check if there are legal types that enable us to split
more cleverly when the input vector is already legal such that we don't
turn it into an illegal type. If the extend is such that it's more than
doubling the size of the input we check if
  - the number of vector elements is even,
  - the source type is legal,
  - the type of a split source is illegal,
  - the type of an extended (by doubling element size) source is legal, and
  - the type of that extended source when split is legal.
If the conditions are met, instead of just splitting both the
destination and the source types, we create an extend that only goes up
one "step" (doubling the element width), and then continue legalizing the
rest of the operation normally. The result is that this operates as a
new, more efficient, termination condition for the loop of "split the
operation until the destination type is legal."
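
The five conditions above can be written as a single predicate. The sketch below is illustrative only and does not reproduce the legalizer's code: `VecTy`, `shouldExtendOneStep`, and the `toyIsLegal` legality rule (any 128-bit or 256-bit vector is legal) are assumptions made for the example.

```cpp
#include <functional>

// Hypothetical description of a vector type: element count and element width.
struct VecTy {
  unsigned numElts;
  unsigned eltBits;
};

using LegalFn = std::function<bool(VecTy)>;

// Returns true when the extend should go up only one "step" (doubling the
// element width) instead of splitting both source and destination.
bool shouldExtendOneStep(VecTy src, VecTy dst, const LegalFn &isLegal) {
  // Only applies when the extend more than doubles the element size.
  if (dst.eltBits <= 2 * src.eltBits)
    return false;
  // The number of vector elements must be even.
  if (src.numElts % 2 != 0)
    return false;

  VecTy splitSrc{src.numElts / 2, src.eltBits};
  VecTy stepped{src.numElts, src.eltBits * 2};
  VecTy splitStepped{src.numElts / 2, src.eltBits * 2};

  return isLegal(src)            // the source type is legal
         && !isLegal(splitSrc)   // a split source would be illegal
         && isLegal(stepped)     // the one-step extended source is legal
         && isLegal(splitStepped); // and so is that type when split
}

// Toy legality rule, assumed for this example: 128- and 256-bit vectors only.
bool toyIsLegal(VecTy t) {
  unsigned bits = t.numElts * t.eltBits;
  return bits == 128 || bits == 256;
}
```

Under this toy rule, the v16i8 to v16i32 extend from the example satisfies all conditions (v16i8 is legal, v8i8 is not, v16i16 and v8i16 are), so it would first be extended to v16i16.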

With this change, the above example now compiles to:
_bar:
  vpxor %xmm1, %xmm1, %xmm1
  vpunpcklbw  %xmm1, %xmm0, %xmm2
  vpunpckhwd  %xmm1, %xmm2, %xmm3
  vpunpcklwd  %xmm1, %xmm2, %xmm2
  vinsertf128 $1, %xmm3, %ymm2, %ymm2
  vpunpckhbw  %xmm1, %xmm0, %xmm0
  vpunpckhwd  %xmm1, %xmm0, %xmm3
  vpunpcklwd  %xmm1, %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vmovaps %ymm0, 32(%rdi)
  vmovaps %ymm2, (%rdi)
  vzeroupper
  ret

This generalizes a custom lowering that was added a while back to the
ARM backend. That lowering is no longer necessary, and is removed. The
testcases for it, however, provide excellent ARM tests for this change
and so remain.

rdar://14735100

llvm-svn: 193727

72366786

Fix a few typos · 909d0c06
Matt Arsenault authored Oct 30, 2013
```
llvm-svn: 193723
```
909d0c06
Fix CodeGen for unaligned loads with address spaces · 2ba54c3d
Matt Arsenault authored Oct 30, 2013
```
llvm-svn: 193721
```
2ba54c3d

Oct 30, 2013

Teach scalarrepl about address spaces · 38b8ecf3
Matt Arsenault authored Oct 30, 2013
```
llvm-svn: 193720
```
38b8ecf3

Add calls to doInitialization() and doFinalization() in verifyFunction() · 55fdcff4

Rafael Espindola authored Oct 30, 2013

The function verifyFunction() in lib/IR/Verifier.cpp misses some
calls. It creates a temporary FunctionPassManager that will run a
single Verifier pass. Unfortunately, FunctionPassManager is not a
PassManager and does not call doInitialization() and doFinalization()
by itself. Verifier does important tasks in doInitialization(), such as
collecting type information used to check DebugInfo metadata, and
doFinalization() does some additional checks. Therefore these checks
were missed and debug info couldn't be verified at all; verification
just crashed if the function had any.
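
A minimal sketch of the lifecycle problem, using a toy pass rather than LLVM's pass manager API: `ToyVerifier`, `runOnly`, and `runWithLifecycle` are hypothetical names, and the "type information" is reduced to a single flag.

```cpp
#include <string>
#include <vector>

// Toy pass whose run step depends on state prepared during initialization.
struct ToyVerifier {
  bool typesCollected = false;
  std::vector<std::string> log;

  void doInitialization() {
    typesCollected = true;  // stands in for collecting type information
    log.push_back("init");
  }
  bool runOnFunction() {
    log.push_back(typesCollected ? "run" : "run-without-types");
    return typesCollected;  // debug info can only be checked with types
  }
  void doFinalization() { log.push_back("final"); }
};

// Models the old behavior: only the run step is driven.
bool runOnly(ToyVerifier &v) { return v.runOnFunction(); }

// Models the fix: the caller drives all three steps itself.
bool runWithLifecycle(ToyVerifier &v) {
  v.doInitialization();
  bool ok = v.runOnFunction();
  v.doFinalization();
  return ok;
}
```

In this model `runOnly` reports failure because the initialization state is missing, while `runWithLifecycle` succeeds and records all three steps.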

verifyFunction() is currently used in LLVM only when the -debug option is
enabled, and in unittests/IR/VerifierTest.cpp.

VerifierTest had to be changed to create the function in a module from
which the type debug info can be collected.

Patch by Michael Kruse.

llvm-svn: 193719

55fdcff4

Produce .weak_def_can_be_hidden for some linkonce_odr values · 6f1b2852

Rafael Espindola authored Oct 30, 2013

With this patch, LLVM produces .weak_def_can_be_hidden for linkonce_odr
values that are also unnamed_addr or don't have their address taken.
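
The rule stated above amounts to a small predicate. The sketch below is a toy model under assumed names (`Linkage`, `GlobalInfo`, `canBeHidden`); it is not the code from the patch.

```cpp
// Hypothetical, reduced view of a global value for illustration.
enum class Linkage { External, LinkOnceODR, WeakODR, Internal };

struct GlobalInfo {
  Linkage linkage;
  bool unnamedAddr;   // marked unnamed_addr
  bool addressTaken;  // the address is used somewhere
};

// True when the symbol qualifies for .weak_def_can_be_hidden under the rule
// in the commit message: linkonce_odr, and either unnamed_addr or never
// address-taken.
bool canBeHidden(const GlobalInfo &g) {
  return g.linkage == Linkage::LinkOnceODR &&
         (g.unnamedAddr || !g.addressTaken);
}
```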

There is not a lot of documentation about .weak_def_can_be_hidden, but
from the old discussion about linkonce_odr_auto_hide and the name of
the directive this looks correct: these symbols can be hidden.

Testing this with the ld64 in Xcode 5 linking clang reduces the number of
exported symbols from 21053 to 19049.

llvm-svn: 193718

6f1b2852

DebugInfo: Push header handling down into CompileUnit · 6b288cfa

David Blaikie authored Oct 30, 2013

This is a preliminary step to handling type units by abstracting over
all (type or compile) units.

llvm-svn: 193714

6b288cfa

Fix GVN creating bitcast between address spaces · 614ea99d
Matt Arsenault authored Oct 30, 2013
```
llvm-svn: 193710
```
614ea99d
This commit adds some (but not all) of the x86-64 relocations that are not · 04d88fba
Tom Roeder authored Oct 30, 2013
```
currently supported in the ELF object writer, along with a simple test case.

llvm-svn: 193709
```
04d88fba

Add {starts,ends}with_lower methods to StringRef. · 00e24e48

Rui Ueyama authored Oct 30, 2013

startswith_lower is occasionally useful and I think worth adding.
endswith_lower is added for completeness.
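
For readers unfamiliar with the idea, case-insensitive prefix and suffix checks can be sketched on `std::string` as below. This is not the StringRef implementation; `startsWithLower`, `endsWithLower`, and the helpers are hypothetical free functions that handle ASCII case only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Lowercases one character; the cast avoids undefined behavior for
// negative char values passed to std::tolower.
static char toLowerAscii(char c) {
  return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Compares `len` characters of `s` starting at `pos` against `pat`,
// ignoring case. The caller guarantees the range is in bounds.
static bool equalsLowerAt(const std::string &s, std::size_t pos,
                          const std::string &pat) {
  for (std::size_t i = 0; i < pat.size(); ++i)
    if (toLowerAscii(s[pos + i]) != toLowerAscii(pat[i]))
      return false;
  return true;
}

bool startsWithLower(const std::string &s, const std::string &prefix) {
  return s.size() >= prefix.size() && equalsLowerAt(s, 0, prefix);
}

bool endsWithLower(const std::string &s, const std::string &suffix) {
  return s.size() >= suffix.size() &&
         equalsLowerAt(s, s.size() - suffix.size(), suffix);
}
```

A typical use would be matching a file extension regardless of case, for example `endsWithLower(path, ".txt")`.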

Differential Revision: http://llvm-reviews.chandlerc.com/D2041

llvm-svn: 193706

00e24e48

[ARM] NEON instructions were erroneously decoded from certain invalid encodings · c1be9c16
Artyom Skrobov authored Oct 30, 2013
```
llvm-svn: 193705
```
c1be9c16
R600: Custom lower f32 = uint_to_fp i64 · c947d8ca
Tom Stellard authored Oct 30, 2013
```
llvm-svn: 193701
```
c947d8ca
DwarfDebug: Change Abbreviations member from pointer to reference · 2d4e1122
David Blaikie authored Oct 30, 2013
```
llvm-svn: 193699
```
2d4e1122

Add #include of raw_ostream.h to MipsSEISelLowering.cpp · 3e9b1c10

Hans Wennborg authored Oct 30, 2013

Fixing this Windows build error:

..\lib\Target\Mips\MipsSEISelLowering.cpp(997) : error C2027: use of undefined type 'llvm::raw_ostream'

llvm-svn: 193696

3e9b1c10

[mips][msa] Correct definition of bins[lr] and CHECK-DAG-ize related tests · d5f554f0
Daniel Sanders authored Oct 30, 2013
```
llvm-svn: 193695
```
d5f554f0
make ConstantRange::signExtend() optimal · 1112eca0
Nuno Lopes authored Oct 30, 2013
```
the case [x, INT_MIN) was not handled optimally

llvm-svn: 193694
```
1112eca0

[mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal... · ab94b537

Daniel Sanders authored Oct 30, 2013

[mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics)

Also corrected the definition of the intrinsics for these instructions (the
result register is also the first operand), and added intrinsics for bsel and
bseli to clang (they already existed in the backend).

These four operations are mostly equivalent to bsel and bseli (the difference
is which operand is tied to the result). As a result, some of the tests changed
as described below.

bitwise.ll:
- bsel.v test adapted so that the mask is unknown at compile-time. This stops
  it emitting bmnzi.b instead of the intended bsel.v.
- The bseli.b test now tests the right thing. Namely the case when one of the
  values is a uimm8, rather than when the condition is a uimm8 (which is
  covered by bmnzi.b)

compare.ll:
- bsel.v tests now (correctly) emit bmnz.v instead of bsel.v because this
  is the same operation (see MSA.txt).

i8.ll
- CHECK-DAG-ized test.
- bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
  because this is the same operation (see MSA.txt).
- bseli.b still emits bseli.b though because the immediate makes it
  distinguishable from bmnzi.b.

vec.ll:
- CHECK-DAG-ized test.
- bmz.v tests now (correctly) emit bmnz.v with swapped operands (see
  MSA.txt).
- bsel.v tests now (correctly) emit bmnz.v with swapped operands (see
  MSA.txt).

llvm-svn: 193693

ab94b537

[AArch64] Add support for NEON scalar floating-point compare instructions. · be020d03
Chad Rosier authored Oct 30, 2013
```
llvm-svn: 193691
```
be020d03

[mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) · d74b130c

Daniel Sanders authored Oct 30, 2013

This required correcting the definition of the bins[lr]i intrinsics because
the result is also the first operand.

It also required removing the (arbitrary) check for 32-bit immediates in
MipsSEDAGToDAGISel::selectVSplat().

Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
because the constant is legalized into a ConstantPool. Similar things can
happen with binsri.d with more than 10 bits set in the mask. The resulting
code when this happens is correct but not optimal.

llvm-svn: 193687

d74b130c

[mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT · 53fe6c4d

Daniel Sanders authored Oct 30, 2013

(or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b).
where $mask is a constant splat. This allows bitwise operations to make use
of bsel.
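
The equivalence behind this combine can be checked on a single 32-bit lane. The sketch below is a scalar model only: `andOrForm` and `bitSelect` are hypothetical helpers, and the real combine operates on vector DAG nodes with a constant splat mask.

```cpp
#include <cstdint>

// The AND/OR pattern from the commit message:
// (or (and a, mask), (and b, inverse_mask)).
std::uint32_t andOrForm(std::uint32_t a, std::uint32_t b, std::uint32_t mask) {
  return (a & mask) | (b & ~mask);
}

// Bit-by-bit select: take the bit from `a` where the mask bit is set,
// otherwise take it from `b`. This models (vselect mask, a, b) on one lane.
std::uint32_t bitSelect(std::uint32_t mask, std::uint32_t a, std::uint32_t b) {
  std::uint32_t result = 0;
  for (int i = 0; i < 32; ++i) {
    std::uint32_t bit = std::uint32_t{1} << i;
    result |= (mask & bit) ? (a & bit) : (b & bit);
  }
  return result;
}
```

Both functions return the same value for every input, which is why the AND/OR form can be replaced by a select.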

It's also a stepping stone towards matching bins[lr], and bins[lr]i from
normal IR.

Two sets of similar tests have been added in this commit. The bsel_* functions
test the case where binsri cannot be used. The binsr_*_i functions will
start to use the binsri instruction in the next commit.

llvm-svn: 193682

53fe6c4d

[mips] MipsSETargetLowering now reports DAGCombiner changes when using -debug-only=mips-isel · 62aeab83
Daniel Sanders authored Oct 30, 2013
```
No test since -debug output is intended for developers and not end-users.

llvm-svn: 193681
```
62aeab83

[mips][msa] Added support for matching splat.[bhw] from normal IR (i.e. not intrinsics) · e7ef0c81

Daniel Sanders authored Oct 30, 2013

splat.d is implemented but this subtest is currently disabled. This is because
it is difficult to match the appropriate IR on MIPS32. There is a patch under
review that should help with this so I hope to enable the subtest soon.

llvm-svn: 193680

e7ef0c81

Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." · 3bd686d4
Juergen Ributzka authored Oct 30, 2013
```
Now Hexagon and SystemZ are not happy with it :-(

llvm-svn: 193677
```
3bd686d4

SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. · 6ad05d6b

Juergen Ributzka authored Oct 30, 2013

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask
usually has the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 193676

6ad05d6b