Commits · ab94b537d75278f93918b202417c23ad561a101b · Roger Ferrer / llvm-epi-0.8

Oct 30, 2013

[mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal... · ab94b537

Daniel Sanders authored Oct 30, 2013

[mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics)

Also corrected the definition of the intrinsics for these instructions (the
result register is also the first operand), and added intrinsics for bsel and
bseli to clang (they already existed in the backend).

These four operations are mostly equivalent to bsel and bseli (the difference
is which operand is tied to the result). As a result, some of the tests changed
as described below.

bitwise.ll:
- bsel.v test adapted so that the mask is unknown at compile-time. This stops
  it emitting bmnzi.b instead of the intended bsel.v.
- The bseli.b test now tests the right thing. Namely the case when one of the
  values is a uimm8, rather than when the condition is a uimm8 (which is
  covered by bmnzi.b).

compare.ll:
- bsel.v tests now (correctly) emit bmnz.v instead of bsel.v because this
  is the same operation (see MSA.txt).

i8.ll:
- CHECK-DAG-ized test.
- bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
  because this is the same operation (see MSA.txt).
- bseli.b still emits bseli.b though because the immediate makes it
  distinguishable from bmnzi.b.

vec.ll:
- CHECK-DAG-ized test.
- bmz.v tests now (correctly) emit bmnz.v with swapped operands (see
  MSA.txt).
- bsel.v tests now (correctly) emit bmnz.v with swapped operands (see
  MSA.txt).

llvm-svn: 193693

ab94b537
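
The test notes in the commit above say that bmz and bsel now come out as bmnz with rearranged operands. A minimal C++ sketch on single bytes shows why that can hold. The per-byte semantics given to bmnz, bmz, and bsel here are my reading of the MSA documentation, not something the commit states, and the `bitSelect` helper and function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select on one byte: bits of A where Mask is 1, bits of B elsewhere.
constexpr uint8_t bitSelect(uint8_t Mask, uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((A & Mask) | (B & ~Mask));
}

// Assumed semantics. The first parameter (Wd) is the register that is both
// an input and the result.
constexpr uint8_t bmnz(uint8_t Wd, uint8_t Ws, uint8_t Wt) {
  return bitSelect(Wt, Ws, Wd); // take Ws where Wt is non-zero
}
constexpr uint8_t bmz(uint8_t Wd, uint8_t Ws, uint8_t Wt) {
  return bitSelect(Wt, Wd, Ws); // take Ws where Wt is zero
}
constexpr uint8_t bsel(uint8_t Wd, uint8_t Ws, uint8_t Wt) {
  return bitSelect(Wd, Wt, Ws); // Wd itself is the mask
}

// Exhaustively checks that bmz and bsel reduce to bmnz once the operands are
// rearranged, for every combination of three byte values.
bool equivalentForAllBytes() {
  assert(bitSelect(0xF0, 0xAB, 0xCD) == 0xAD); // sanity check of the helper
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned M = 0; M < 256; ++M) {
        if (bmz(X, Y, M) != bmnz(Y, X, M))
          return false;
        if (bsel(M, X, Y) != bmnz(X, Y, M))
          return false;
      }
  return true;
}
```

Under these assumed semantics the three instructions compute the same select and differ only in which input shares a register with the result, which is consistent with the "swapped operands" wording in the test notes.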

[AArch64] Add support for NEON scalar floating-point compare instructions. · be020d03
Chad Rosier authored Oct 30, 2013
```
llvm-svn: 193691
```
be020d03

Refactor the AVX512 intrinsics. Cluster the intrinsics into the appropriate... · d184466d

Cameron McInally authored Oct 30, 2013

Refactor the AVX512 intrinsics. Cluster the intrinsics into the appropriate vector extension class within the .td file.

llvm-svn: 193690

d184466d

Rehash but don't grow when full of tombstones. · 811c96fa

Howard Hinnant authored Oct 30, 2013

This problem was found and fixed by José Fonseca in March 2011 for
SmallPtrSet, committed r128566.  But as far as I can tell, all other
llvm hash tables retain the same problem:  the bucket count can grow
without bound while size() remains near constant by repeated
insert/erase cycles that tend to fill the container with tombstones. 
Here is a demo that has been reduced to a trivial case:

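#include "llvm/ADT/DenseSet.h"

int
main()
{
   llvm::DenseSet<unsigned> d;
   // Each iteration leaves the set empty but leaves a tombstone behind.
   for (unsigned i = 0; i < 0xFFFFFFF; ++i)
   {
       d.insert(i);
       d.erase(i);
   }
}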

While the container size() never grows above 1, the bucket count grows
like this:

nb = 64
nb = 128
nb = 256
nb = 512
nb = 1024
nb = 2048
nb = 4096
nb = 8192
nb = 16384
nb = 32768
nb = 65536
nb = 131072
nb = 262144
nb = 524288
nb = 1048576
nb = 2097152
nb = 4194304
nb = 8388608
nb = 16777216
nb = 33554432
nb = 67108864
nb = 134217728
nb = 268435456

The above program currently consumes a few GB of RAM.  This patch brings
the memory consumption down by several orders of magnitude, and keeps
the bucket count at 64 for the above test.

llvm-svn: 193689

811c96fa
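
The policy named in the title of the commit above, rehashing at the same size when the table is clogged with tombstones, can be sketched with a toy open-addressing set. This is not DenseMap or DenseSet code: `TinySet`, its linear probing, and the 3/4 and 1/8 thresholds are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy open-addressing set of unsigned keys (hypothetical, for illustration).
class TinySet {
  enum class Slot : unsigned char { Empty, Full, Tombstone };
  std::vector<unsigned> Keys;
  std::vector<Slot> State;
  std::size_t NumEntries = 0, NumTombstones = 0;

  // Linear probe. Returns the key's slot if present; otherwise the slot to
  // insert into (the first tombstone seen, else the terminating empty slot).
  std::size_t probe(unsigned Key, bool &Found) const {
    const std::size_t N = State.size(), Mask = N - 1;
    std::size_t FirstTombstone = N;
    for (std::size_t I = Key & Mask;; I = (I + 1) & Mask) {
      if (State[I] == Slot::Empty) {
        Found = false;
        return FirstTombstone != N ? FirstTombstone : I;
      }
      if (State[I] == Slot::Full && Keys[I] == Key) {
        Found = true;
        return I;
      }
      if (State[I] == Slot::Tombstone && FirstTombstone == N)
        FirstTombstone = I;
    }
  }

  // Rebuilds the table with NewSize buckets, dropping all tombstones.
  void rehash(std::size_t NewSize) {
    assert(NewSize != 0 && (NewSize & (NewSize - 1)) == 0 && "power of two");
    std::vector<unsigned> OldKeys = std::move(Keys);
    std::vector<Slot> OldState = std::move(State);
    Keys.assign(NewSize, 0);
    State.assign(NewSize, Slot::Empty);
    NumTombstones = 0;
    for (std::size_t I = 0; I < OldState.size(); ++I) {
      if (OldState[I] != Slot::Full)
        continue;
      bool Found;
      std::size_t J = probe(OldKeys[I], Found);
      Keys[J] = OldKeys[I];
      State[J] = Slot::Full;
    }
  }

public:
  TinySet() : Keys(64, 0), State(64, Slot::Empty) {}
  std::size_t size() const { return NumEntries; }
  std::size_t bucketCount() const { return State.size(); }

  bool insert(unsigned Key) {
    bool Found;
    std::size_t I = probe(Key, Found);
    if (Found)
      return false;
    const std::size_t N = State.size(), NewEntries = NumEntries + 1;
    if (NewEntries * 4 >= N * 3) {
      rehash(N * 2); // genuinely full of live keys: grow
      I = probe(Key, Found);
    } else if (N - (NewEntries + NumTombstones) <= N / 8) {
      rehash(N); // few empty slots left, mostly tombstones: same size
      I = probe(Key, Found);
    }
    if (State[I] == Slot::Tombstone)
      --NumTombstones;
    Keys[I] = Key;
    State[I] = Slot::Full;
    ++NumEntries;
    return true;
  }

  bool erase(unsigned Key) {
    bool Found;
    std::size_t I = probe(Key, Found);
    if (!Found)
      return false;
    State[I] = Slot::Tombstone;
    --NumEntries;
    ++NumTombstones;
    return true;
  }
};
```

Running the insert/erase loop from the commit message against this sketch keeps `bucketCount()` at 64, because the same-size rehash clears the tombstones before they can trigger growth. Inserting many distinct keys still doubles the table.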

[mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) · d74b130c

Daniel Sanders authored Oct 30, 2013

This required correcting the definition of the bins[lr]i intrinsics because
the result is also the first operand.

It also required removing the (arbitrary) check for 32-bit immediates in
MipsSEDAGToDAGISel::selectVSplat().

Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
because the constant is legalized into a ConstantPool. Similar things can
happen with binsri.d with more than 10 bits set in the mask. The resulting
code when this happens is correct but not optimal.

llvm-svn: 193687

d74b130c
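
For readers unfamiliar with the instructions in the commit above, here is a hedged C++ sketch of what binsli.d and binsri.d do to a single 64-bit element. The semantics (copy the top or bottom m+1 bits of the source into the destination) are my reading of the MSA documentation rather than something the commit spells out, and all function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low (M + 1) bits set, for 0 <= M <= 63.
uint64_t lowBitsMask(unsigned M) {
  assert(M < 64);
  return M == 63 ? ~UINT64_C(0) : (UINT64_C(1) << (M + 1)) - 1;
}

// Mask with the high (M + 1) bits set, for 0 <= M <= 63.
uint64_t highBitsMask(unsigned M) {
  assert(M < 64);
  return ~UINT64_C(0) << (63 - M);
}

// Assumed binsri.d on one element: insert the low M+1 bits of Ws into Wd,
// keeping the remaining high bits of Wd.
uint64_t binsri_d(uint64_t Wd, uint64_t Ws, unsigned M) {
  const uint64_t Mask = lowBitsMask(M);
  return (Ws & Mask) | (Wd & ~Mask);
}

// Assumed binsli.d on one element: insert the high M+1 bits of Ws into Wd,
// keeping the remaining low bits of Wd.
uint64_t binsli_d(uint64_t Wd, uint64_t Ws, unsigned M) {
  const uint64_t Mask = highBitsMask(M);
  return (Ws & Mask) | (Wd & ~Mask);
}
```

With this encoding, "2 bits set in the mask" for binsli.d corresponds to `highBitsMask(1)`, that is, the constant 0xC000000000000000.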

[mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT · 53fe6c4d

Daniel Sanders authored Oct 30, 2013

(or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b).
where $mask is a constant splat. This allows bitwise operations to make use
of bsel.

It's also a stepping stone towards matching bins[lr], and bins[lr]i from
normal IR.

Two sets of similar tests have been added in this commit. The bsel_* functions
test the case where binsri cannot be used. The binsr_*_i functions will
start to use the binsri instruction in the next commit.

llvm-svn: 193682

53fe6c4d
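
The AND/OR pattern from the commit above, and the distinction between masks that could suit a binsri-style instruction and masks that need a general bit select, can be sketched on 32-bit scalars. This is only an illustration: the classification rule (a run of contiguous low or high bits) is my assumption about when a bins[lr]i-like form applies, and `MaskKind`, `classifyMask`, `lowBitCount`, and `orOfAnds` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// The matched pattern with $inverse_mask == ~$mask: a per-bit select that
// takes A where Mask is 1 and B where Mask is 0.
uint32_t orOfAnds(uint32_t A, uint32_t Mask, uint32_t B) {
  return (A & Mask) | (B & ~Mask);
}

enum class MaskKind { LowBits, HighBits, Other };

// Classifies a constant mask: a run of ones starting at bit 0, a run of ones
// ending at bit 31, or anything else (which would need a general select).
MaskKind classifyMask(uint32_t Mask) {
  if (Mask == 0)
    return MaskKind::Other;
  if ((Mask & (Mask + 1)) == 0)
    return MaskKind::LowBits;
  const uint32_t Inv = ~Mask;
  if ((Inv & (Inv + 1)) == 0)
    return MaskKind::HighBits;
  return MaskKind::Other;
}

// Number of set bits in a low-bits mask.
unsigned lowBitCount(uint32_t Mask) {
  assert(classifyMask(Mask) == MaskKind::LowBits);
  unsigned N = 0;
  for (; Mask != 0; Mask >>= 1)
    ++N;
  return N;
}
```

For example, `classifyMask(0x0000000F)` yields `MaskKind::LowBits` with four bits set, while `classifyMask(0x3C)` yields `MaskKind::Other`, the kind of mask for which only the general select form would apply.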

[mips] MipsSETargetLowering now reports DAGCombiner changes when using -debug-only=mips-isel · 62aeab83
Daniel Sanders authored Oct 30, 2013
```
No test since -debug output is intended for developers and not end-users.

llvm-svn: 193681
```
62aeab83

[mips][msa] Added support for matching splat.[bhw] from normal IR (i.e. not intrinsics) · e7ef0c81

Daniel Sanders authored Oct 30, 2013

splat.d is implemented but this subtest is currently disabled. This is because
it is difficult to match the appropriate IR on MIPS32. There is a patch under
review that should help with this so I hope to enable the subtest soon.

llvm-svn: 193680

e7ef0c81

Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." · 3bd686d4
Juergen Ributzka authored Oct 30, 2013
```
Now Hexagon and SystemZ are not happy with it :-(

llvm-svn: 193677
```
3bd686d4

SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. · 6ad05d6b

Juergen Ributzka authored Oct 30, 2013

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask
usually has the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 193676

6ad05d6b

Reformat Makefile. No other changes. · d3b4344a
Bill Wendling authored Oct 30, 2013
```
llvm-svn: 193675
```
d3b4344a
[mips] Compute stack alignment on the fly. · 3048b024
Akira Hatanaka authored Oct 30, 2013
```
llvm-svn: 193673
```
3048b024

Reformat code with clang-format. · 7245f1d8

Josh Magee authored Oct 30, 2013

Differential Revision: http://llvm-reviews.chandlerc.com/D2057

llvm-svn: 193672

7245f1d8

StackProtector.h: Fix trailing comments for doxygen. [-Wdocumentation] · c6823c76
NAKAMURA Takumi authored Oct 30, 2013
```
  s!//<!///<!

llvm-svn: 193669
```
c6823c76
Trailing whitespace in a comment line. · 8970f538
NAKAMURA Takumi authored Oct 30, 2013
```
llvm-svn: 193668
```
8970f538

Debug Info: code clean up. · 251a1bd2

Manman Ren authored Oct 29, 2013

Use EmitLabelOffsetDifference on Darwin platforms, where non-Darwin
platforms use EmitLabelPlusOffset.

Also fix a bug in EmitLabelOffsetDifference where the size is hard-coded
to 4 even though Size is passed in as an argument.

llvm-svn: 193660

251a1bd2

Oct 29, 2013

Debug Info: support for DW_FORM_ref_addr. · ce20d460

Manman Ren authored Oct 29, 2013

To support ref_addr, we calculate the section offset of a DIE (i.e. offset
of a DIE from beginning of the debug info section). The Offset field in DIE
is currently CU-relative. To calculate the section offset, we add a
DebugInfoOffset field in CompileUnit to store the offset of a CU from beginning
of the debug info section. We set the value in DwarfUnits::computeSizeAndOffset
for each CompileUnit.

A helper function DIE::getCompileUnit is added to return the CU DIE that
the input DIE belongs to. We also add a map CUDieMap in DwarfDebug to help
find the CU for a given CU DIE.

For a cross-referenced DIE, we first find the CU DIE it belongs to with
getCompileUnit, then we use CUDieMap to get the corresponding CU for the CU DIE.
Adding the section offset of the CU to the CU-relative offset of a DIE gives
us the section offset of the DIE.

We correctly emit ref_addr with relocation using EmitLabelPlusOffset when
doesDwarfUseRelocationsAcrossSections is true.

This commit handles the emission of DW_FORM_ref_addr when we have an attribute
with FORM_ref_addr. A follow-on patch will start using ref_addr when adding a
DIEEntry. This commit will be tested and verified in the follow-on patch.

Reviewed off-list by Eric, Thanks.

llvm-svn: 193658

ce20d460
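
The offset arithmetic described in the commit above can be sketched in a few lines of C++. The types below are drastically simplified stand-ins, not the real DIE, CompileUnit, or DwarfDebug classes: the `Parent` and `IsCompileUnit` fields, the free functions, and the use of `std::unordered_map` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Simplified stand-in for a DIE: Offset is relative to the start of its CU.
struct DIE {
  uint32_t Offset;
  const DIE *Parent;
  bool IsCompileUnit;
};

// Simplified stand-in for a compile unit: DebugInfoOffset is the offset of
// the CU from the beginning of the debug info section.
struct CompileUnit {
  uint32_t DebugInfoOffset;
};

using CUDieMapTy = std::unordered_map<const DIE *, const CompileUnit *>;

// Walks up the parent chain until the CU DIE is reached.
const DIE *getCompileUnitDIE(const DIE *D) {
  while (D && !D->IsCompileUnit)
    D = D->Parent;
  return D;
}

// Section offset of a DIE = section offset of its CU + CU-relative offset.
uint32_t sectionOffset(const DIE &D, const CUDieMapTy &CUDieMap) {
  const DIE *CUDie = getCompileUnitDIE(&D);
  assert(CUDie && "DIE is not attached to a compile unit");
  auto It = CUDieMap.find(CUDie);
  assert(It != CUDieMap.end() && "CU DIE missing from the map");
  return It->second->DebugInfoOffset + D.Offset;
}
```

A DIE at CU-relative offset 0x2a inside a CU that starts at 0x100 in the section would get section offset 0x12a, which is the value a cross-CU reference would need.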

Debug Info: instead of calling addToContextOwner which constructs the context · f4c339e0

Manman Ren authored Oct 29, 2013

after the DIE creation, we construct the context first.

Ensure that we create the context before we create a type so that we can add
the newly created type to the parent. Remove last use of addToContextOwner
now that it's not needed.

We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs
should be added to their parents right after the creation.

Reviewed off-list by Eric, Thanks.

llvm-svn: 193657

f4c339e0

Struct byval cleanup: add helper functions to reduce code duplication. · b504f494

Manman Ren authored Oct 29, 2013

Helper functions are added:
emitPostLd: emit a post-increment load operation with given size.
emitPostSt: emit a post-increment store operation with given size.

No functionality change.

llvm-svn: 193656

b504f494

[stackprotector] Update the StackProtector pass to perform datalayout analysis. · 3f1c0e35

Josh Magee authored Oct 29, 2013

This modifies the pass to classify every SSP-triggering AllocaInst according to
an SSPLayoutKind (LargeArray, SmallArray, AddrOf).  This analysis is collected
by the pass and made available for use, but no other pass uses it yet.

The next patch will make use of this analysis in PEI and StackSlot
passes.  The end goal is to support ssp-strong stack layout rules.

WIP.

Differential Revision: http://llvm-reviews.chandlerc.com/D1789

llvm-svn: 193653

3f1c0e35

Update comment · 87596662
Matt Arsenault authored Oct 29, 2013
```
llvm-svn: 193651
```
87596662

Workaround MSVC 32-bit miscompile of getCondCodeAction. · a1ca46d0

Matt Arsenault authored Oct 29, 2013

Use 32-bit types for the array instead of 64. This should
generally be better anyway.

In optimized + assert builds, I saw a failure when a
cond code / type combination that is never set was loading
a non-zero value and hitting the != Promote assert.

It turns out when loading the 64-bit value to do the shift,
the assembly loads the 2 32-bit halves from non-consecutive
addresses. The address of the second half of the loaded uint64_t
doesn't include the offset of the array in the struct. Instead
of being offset + 4, it's just + 4.

I'm not entirely sure why this wasn't observed before.
setCondCodeAction isn't heavily used by the in-tree targets,
and not with the higher valued vector SimpleValueTypes. Only
PPC is using one of the > 32 valued types, and that is probably
never used by anyone on a 32-bit MSVC compiled host.

I ran into this when upgrading LLVM versions, so I guess the
value loaded from the nonsense address happened to work out
before.

No test since I'm not really sure if / how it can be reproduced
with the current in tree targets, and it's not supposed to change
anything.

llvm-svn: 193650

a1ca46d0
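
To make the "32-bit types for the array" idea in the commit above concrete, here is a sketch of a table that packs one small action per (condition code, value type) pair into 32-bit words. It is not the actual TargetLowering code: the class name, the table dimensions, the 2-bit encoding, and the enum values are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical action values that fit in two bits.
enum LegalizeAction : uint8_t { Legal = 0, Promote = 1, Expand = 2, Custom = 3 };

class CondCodeTable {
  static constexpr unsigned NumCondCodes = 24; // hypothetical size
  static constexpr unsigned NumTypes = 64;     // hypothetical size
  static constexpr unsigned PerWord = 16;      // 16 two-bit entries per word
  uint32_t Words[NumCondCodes][(NumTypes + PerWord - 1) / PerWord] = {};

public:
  // Stores the action for (CC, VT) without disturbing neighbouring entries.
  void set(unsigned CC, unsigned VT, LegalizeAction A) {
    assert(CC < NumCondCodes && VT < NumTypes);
    const unsigned Shift = 2 * (VT % PerWord);
    uint32_t &W = Words[CC][VT / PerWord];
    W = (W & ~(UINT32_C(3) << Shift)) | (static_cast<uint32_t>(A) << Shift);
  }

  // Loads one 32-bit word and extracts the two-bit action for (CC, VT).
  LegalizeAction get(unsigned CC, unsigned VT) const {
    assert(CC < NumCondCodes && VT < NumTypes);
    const unsigned Shift = 2 * (VT % PerWord);
    return static_cast<LegalizeAction>((Words[CC][VT / PerWord] >> Shift) & 3);
  }
};
```

Every lookup here touches a single 32-bit word, so a 32-bit host never has to assemble a 64-bit value from two separate loads, which is the step the commit message identifies as miscompiled.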

Removing a switch statement that contains only a default label. This resolves... · 9ab670fb

Aaron Ballman authored Oct 29, 2013

Removing a switch statement that contains only a default label.  This resolves an MSVC warning.  No functional change intended.

llvm-svn: 193649

9ab670fb

[mips] Align the stack to 16-bytes for mfp64. · 6b2d8419
Akira Hatanaka authored Oct 29, 2013
```
llvm-svn: 193641
```
6b2d8419
Remove declared but not implemented function. · 88034af2
Rafael Espindola authored Oct 29, 2013
```
llvm-svn: 193637
```
88034af2
Fix common typos in the docs. · 3b32b2ff
Benjamin Kramer authored Oct 29, 2013
```
llvm-svn: 193632
```
3b32b2ff
Move getSymbol to TargetLoweringObjectFile. · e133ed88
Rafael Espindola authored Oct 29, 2013
```
This allows constructing a Mangler with just a TargetMachine.

llvm-svn: 193630
```
e133ed88

Debug Info: clean up testing case. · 75cc7658

Manman Ren authored Oct 29, 2013

Add a tag before the name attribute for readability. Use CHECK-NEXT
instead of CHECK-NOT followed by a CHECK. Add new lines to separate checking
of different DIEs.

llvm-svn: 193629

75cc7658

Add a helper getSymbol to AsmPrinter. · 79858aa3
Rafael Espindola authored Oct 29, 2013
```
llvm-svn: 193627
```
79858aa3
add test cases for frameaddr and returnaddr for aarch64 · acf48d75
Weiming Zhao authored Oct 29, 2013
```
llvm-svn: 193626
```
acf48d75
[AArch64] Implement FrameAddr and ReturnAddr · ffade617
Weiming Zhao authored Oct 29, 2013
```
Fixes PR17690

llvm-svn: 193625
```
ffade617
[ARM] Make sure HasCRC is initialized to false in Subtarget. · f9a67fce
Amara Emerson authored Oct 29, 2013
```
llvm-svn: 193624
```
f9a67fce
Support for microMIPS jump instructions · 507e084a
Zoran Jovanovic authored Oct 29, 2013
```
llvm-svn: 193623
```
507e084a

R600/SI: Add compute support for CI v2 · 6e1ee476

Tom Stellard authored Oct 29, 2013



v2:
  - Fix LDS size calculation

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 193621

6e1ee476

R600: Expand vector FSQRT ops · e118b8be
Tom Stellard authored Oct 29, 2013
```
llvm-svn: 193620
```
e118b8be
DWARF parser: properly handle DW_FORM_ref_sig8 and fix Windows build. · cbd806ae
Alexey Samsonov authored Oct 29, 2013
```
Based on D2050 by Timur Iskhodzhanov.

llvm-svn: 193619
```
cbd806ae
The asm printer has a mangler. Use it. · 7d78b2ae
Rafael Espindola authored Oct 29, 2013
```
llvm-svn: 193618
```
7d78b2ae
The AsmPrinter has a Mangler. Use it. · 69c1d631
Rafael Espindola authored Oct 29, 2013
```
llvm-svn: 193617
```
69c1d631
The asm printer has a mangler. Don't keep a second pointer to it. · 38c2e65e
Rafael Espindola authored Oct 29, 2013
```
llvm-svn: 193616
```
38c2e65e

Support names like llvm-ar-3.4 and llvm-ranlib-3.4. · e804b1a4

Rafael Espindola authored Oct 29, 2013

They are used in some packages. For example:
http://packages.ubuntu.com/saucy/i386/llvm-3.4/filelist

This fixes pr17721.

llvm-svn: 193612

e804b1a4
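
One way a single binary could pick its personality from a versioned name such as llvm-ar-3.4 is a substring check on the program name. The sketch below is a guess at the general approach, not the code from the commit above: `Tool` and `toolFromArgv0` are hypothetical names, and the real tool may parse the name differently.

```cpp
#include <cassert>
#include <string>

enum class Tool { Ar, Ranlib, Unknown };

// Decides which tool to act as from the name the program was invoked with.
// Any directory prefix is dropped first so that path components are ignored;
// a version suffix such as "-3.4" does not affect the substring search.
Tool toolFromArgv0(const std::string &Argv0) {
  assert(!Argv0.empty());
  const std::string::size_type Slash = Argv0.find_last_of("/\\");
  const std::string Name =
      Slash == std::string::npos ? Argv0 : Argv0.substr(Slash + 1);
  if (Name.find("ranlib") != std::string::npos)
    return Tool::Ranlib;
  if (Name.find("ar") != std::string::npos)
    return Tool::Ar;
  return Tool::Unknown;
}
```

A plain substring test is deliberately crude: it accepts llvm-ar-3.4 and llvm-ranlib-3.4, but it would also accept unrelated names that happen to contain "ar", so a production version would likely want a stricter match.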