Commits · 9d117ab7ef43ca94ff4b8fda523e2b8db3aede81 · Roger Ferrer / llvm-epi-0.8

Sep 20, 2013

Add braces to suppress Clang's dangling-else warning. · 9d117ab7
David Blaikie authored Sep 20, 2013
```
These violations were introduced in r191049

llvm-svn: 191059
```
9d117ab7
Added support for generating DWARF .debug_aranges sections automatically. · 21101b32
Richard Mitton authored Sep 19, 2013
```
llvm-svn: 191052
```
21101b32

Rename ConvergingScheduler to GenericScheduler. · 665d3ec3

Andrew Trick authored Sep 19, 2013

This was an experimental scheduler a year ago. It's now used by
several subtargets, both in-order and out-of-order, and it
is about to be enabled by default for x86 and armv7. It will be the
new GenericScheduler for subtargets that don't provide their own
SchedulingStrategy.

llvm-svn: 191051

665d3ec3

DebugInfo: llvm-dwarfdump support for gnu_pubnames section · 404d3047
David Blaikie authored Sep 19, 2013
```
llvm-svn: 191050
```
404d3047

PR16726: extend rol/ror matching · d09bb461

Kai Nacke authored Sep 19, 2013

C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner so that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in these cases the upper
bits are either undefined or known. A test case is included.

This fixes PR16726.
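
As an illustration of the promotion the message describes, here is a minimal sketch of a 16-bit rotate written in C++. The function name `rotl16` is an assumption made for this example and does not come from the commit; whether the combiner matches exactly this source form is not claimed here.

```cpp
#include <cstdint>

// Rotate a 16-bit value left by y bits, for 0 < y < 16.
// Integer promotion widens x to int before the shifts, so both shifts
// and the OR happen in a wider type and the result is truncated back.
// This is the kind of extend/shift/or shape the commit message refers to.
std::uint16_t rotl16(std::uint16_t x, unsigned y) {
    return static_cast<std::uint16_t>((x << y) | (x >> (16u - y)));
}
```

For example, `rotl16(0x8001, 1)` yields `0x0003`: the top bit wraps around to bit 0.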

llvm-svn: 191049

d09bb461

Revert PR16726: extend rol/ror matching · 2d967b27
Kai Nacke authored Sep 19, 2013
```
There is a buildbot failure. Need to investigate this.

llvm-svn: 191048
```
2d967b27

PR16726: extend rol/ror matching · 4eaf6444

Kai Nacke authored Sep 19, 2013

C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner so that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in these cases the upper
bits are either undefined or known. A test case is included.

This fixes PR16726.

llvm-svn: 191045

4eaf6444

DebugInfo: Improve IR annotation comments for GNU pubthings. · d0a869d0
David Blaikie authored Sep 19, 2013
```
llvm-svn: 191043
```
d0a869d0

Sep 19, 2013

Unshift the GDB index/GNU pubnames constants modified in r191025 · 8dec4076

David Blaikie authored Sep 19, 2013

Based on code review feedback from Eric Christopher, unshifting these
constants as they can appear in the gdb_index itself, shifted a further
24 bits. This means that keeping them preshifted is a bit inflexible, so
let's not do that.

Given the motivation, wrap up some nicer enums, more type safety, and
some utility functions.
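
A minimal sketch of the idea, assuming hypothetical names and values: keep the constants unshifted in a typed enum and let a small helper apply whatever shift a given encoding needs. `EntryKind`, its enumerators, and `encodeKind` are illustrative only and are not the identifiers introduced by this commit.

```cpp
#include <cstdint>

// Hypothetical unshifted kind values (illustrative, not LLVM's definitions).
enum class EntryKind : std::uint8_t { None = 0, Type = 1, Variable = 2, Function = 3 };

// Apply the shift at the point of use, so the same constant can be placed
// at different bit offsets depending on the container format.
constexpr std::uint32_t encodeKind(EntryKind kind, unsigned shift) {
    return static_cast<std::uint32_t>(kind) << shift;
}
```

With unshifted constants, `encodeKind(EntryKind::Function, 4)` and `encodeKind(EntryKind::Function, 28)` reuse one definition for two layouts, which is the flexibility the message argues for.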

llvm-svn: 191035

8dec4076

DebugInfo: Simplify gnu_pubnames index computation. · b20db58a

David Blaikie authored Sep 19, 2013

Names open to bikeshedding. Could switch back to the constants being
unshifted, but this way seems a bit easier to work with.

llvm-svn: 191025

b20db58a

Remove unnecessary conditional operators performing bool->bool conversion. · 70a33202
David Blaikie authored Sep 19, 2013
```
llvm-svn: 191020
```
70a33202
Fix a typo and simplify a boolean expression. · 0f5ad28a
David Blaikie authored Sep 19, 2013
```
llvm-svn: 191018
```
0f5ad28a

DAGCombiner: Don't fold vector muls with constants that look like a splat of a... · d443e4a0

Benjamin Kramer authored Sep 19, 2013

DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width.

PR17283.

llvm-svn: 191000

d443e4a0

Debug info: Get rid of the VLA indirection hack in FastISel. · 262bcf45

Adrian Prantl authored Sep 18, 2013

Use the DIVariable::isIndirect() flag set by the frontend instead of
guessing whether to set the machine location's indirection bit.
Paired commit with CFE.

llvm-svn: 190961

262bcf45

Sep 17, 2013

Costmodel: Add support for horizontal vector reductions · cae8735a

Arnold Schwaighofer authored Sep 17, 2013

Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately
followed by an extractelement. For example, for an add-reduction of <4 x float>
we could generate the following sequence:

 (v0, v1, v2, v3)
   \   \  /  /
     \  \  /
       +  +

 (v0+v2, v1+v3, undef, undef)
    \      /
 ((v0+v2) + (v1+v3), undef, undef, undef)

 %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                           <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
 %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
 %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                          <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
 %r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
that will allow clients to ask for the cost of such a reduction (as backends
might generate more efficient code than the cost of the individual instructions
summed up). This interface is exercised by the CostModel analysis pass which
looks for reduction patterns like the one above - starting at extractelements -
and if it sees a matching sequence will call the cost model interface.

We will also support a second form of pairwise reduction that is well supported
on common architectures (haddps, vpadd, faddp).

 (v0, v1, v2, v3)
  \   /    \  /
 (v0+v1, v2+v3, undef, undef)
    \     /
 ((v0+v1)+(v2+v3), undef, undef, undef)

  %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
  %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  %r = extractelement <4 x float> %bin.rdx.1, i32 0
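
The two reduction shapes above can be sketched on a plain four-element array. This is only a scalar model of the lane arithmetic, assuming hypothetical names (`Vec4`, `reduceSplitting`, `reducePairwise`); it does not use or represent the cost model interface.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;

// First form: add the upper half onto the lower half, then add the two
// remaining lanes. Mirrors the <2, 3, undef, undef> and <1, undef, ...> shuffles.
float reduceSplitting(const Vec4 &v) {
    float lane0 = v[0] + v[2];
    float lane1 = v[1] + v[3];
    return lane0 + lane1; // the value extracted from element 0
}

// Second (pairwise) form: add adjacent lanes, then add the two partial sums.
// Mirrors the <0, 2, ...> and <1, 3, ...> shuffle pairs.
float reducePairwise(const Vec4 &v) {
    float lane0 = v[0] + v[1];
    float lane1 = v[2] + v[3];
    return lane0 + lane1; // the value extracted from element 0
}
```

Both forms use two add steps for four lanes; they differ in which lanes are combined first, which is why a backend may price them differently.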

llvm-svn: 190876

cae8735a

Added documentation to getMemsetStores. · 8ec39992
Serge Pavlov authored Sep 17, 2013
```
llvm-svn: 190866
```
8ec39992

[SelectionDAG] Teach the vector scalarizer about TRUNCATE. · d30a9585

Quentin Colombet authored Sep 17, 2013

When a truncate node defines a legal vector type but uses an illegal
vector type, the legalization process was splitting the vector until
<1 x vector> type, but then it was failing to scalarize the node because
it did not know how to handle TRUNCATE.

<rdar://problem/14989896>

llvm-svn: 190830

d30a9585

Debug info: Fix PR16736 and rdar://problem/14990587 . · db3e26d1

Adrian Prantl authored Sep 16, 2013

A DBG_VALUE is register-indirect iff the first operand is a register
_and_ the second operand is an immediate.
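
The rule can be written as a small predicate. This is a sketch over a hypothetical `Operand` type invented for the example, not the MachineInstr API.

```cpp
// Hypothetical operand model, for illustration only.
struct Operand {
    enum Kind { Register, Immediate, Other };
    Kind kind;
};

// Register-indirect iff the first operand is a register AND the second
// operand is an immediate; any other combination is not indirect.
bool isRegisterIndirect(const Operand &first, const Operand &second) {
    return first.kind == Operand::Register && second.kind == Operand::Immediate;
}
```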

llvm-svn: 190821

db3e26d1

Use reference instead of copy. · ec2ffa92
Jakub Staszak authored Sep 16, 2013
```
llvm-svn: 190813
```
ec2ffa92

Sep 16, 2013
- Implement function prefix data as an IR feature. · 3fa50f9b
  Peter Collingbourne authored Sep 16, 2013
```
Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html

Differential Revision: http://llvm-reviews.chandlerc.com/D1191

llvm-svn: 190773
```
  3fa50f9b
- Replace some unnecessary vector copies with references. · 7d605268
  Benjamin Kramer authored Sep 15, 2013
```
llvm-svn: 190770
```
  7d605268
Sep 15, 2013

Prevent assert in CombinerGlobalAA with null values · 31658834

Hal Finkel authored Sep 15, 2013

DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we
can't use AA in this case (if we try, then the casting code in AA will assert).

llvm-svn: 190763

31658834

Sep 13, 2013

[Peephole] Rewrite copies to avoid cross register banks copies. · cf71c632

Quentin Colombet authored Sep 13, 2013

By definition copies across register banks are not coalescable. Still, it may be
possible to get rid of such a copy when the value is available in another
register of the same register file.
Consider the following example, where uppercase and lowercase letters denote
different register files:
b = copy A <-- cross-bank copy
...
C = copy b <-- cross-bank copy

This could have been optimized this way:
b = copy A  <-- cross-bank copy
...
C = copy A <-- same-bank copy

Note: b and C's definitions may be in different basic blocks.

This patch adds a peephole optimization that looks through a chain of copies
leading to a cross-bank copy and reuses a source that is on the same register
file if available.

This solution could also be used to get rid of some copies (e.g., A could have
been used instead of C). However, we do not do so because:
- It may over-constrain the coloring of the source register for coalescing.
- The register allocator may not be able to find a nice split point for the
  longer live-range, leading to more spills.
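
The chain walk described above can be sketched on a toy register table. Everything here (`Bank`, `RegInfo`, `kNoReg`, `findSameBankSource`) is a hypothetical model made up for illustration; it is not the pass's actual data structure or code.

```cpp
#include <cstddef>
#include <unordered_map>

enum class Bank { Upper, Lower };

constexpr int kNoReg = -1; // hypothetical "no register" sentinel

struct RegInfo {
    Bank bank;
    int copyOf; // register this one was copied from, or kNoReg
};

// For a cross-bank copy `dst = copy src`, follow the copies that define
// `src` and return the first register living in dst's bank, or kNoReg.
// The step bound guards against malformed cyclic input.
int findSameBankSource(const std::unordered_map<int, RegInfo> &regs,
                       Bank dstBank, int src) {
    int cur = src;
    for (std::size_t steps = 0; cur != kNoReg && steps < regs.size(); ++steps) {
        auto it = regs.find(cur);
        if (it == regs.end())
            return kNoReg;
        if (it->second.bank == dstBank)
            return cur; // reuse this source: the copy becomes same-bank
        cur = it->second.copyOf;
    }
    return kNoReg;
}
```

In the example from the message, with A in the upper bank and b copied from A, looking up a source for `C = copy b` returns A.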

<rdar://problem/14742333>

llvm-svn: 190713

cf71c632

Add initial support for handling gnu style pubnames accepted by some · dd1a0120

Eric Christopher authored Sep 13, 2013

versions of gold. This support is designed to allow gold to produce
gdb_index sections similar to the accelerator tables and consumable
by gdb.

llvm-svn: 190649

dd1a0120

Reformat and hoist section grabbing to top level. · 8b3737fb
Eric Christopher authored Sep 13, 2013
```
llvm-svn: 190648
```
8b3737fb

Sep 12, 2013

Add an instruction deprecation feature to TableGen. · 0e76fa7d

Joey Gouly authored Sep 12, 2013

The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.

The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
  ComplexDeprecationPredicate<"MCR">

would mean you would have to define the following function:
  bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                             std::string &Info)

It returns 'false' if the instruction is not deprecated, and 'true' if it is
deprecated, in which case it stores the warning message in 'Info'.
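
A sketch of the shape such a predicate could take. To stay self-contained it uses stand-in types `FakeInst` and `FakeSubtarget` (hypothetical, not the MC classes named above), and the deprecation condition itself is arbitrary and chosen only for illustration.

```cpp
#include <string>

// Hypothetical stand-ins for the instruction and subtarget types.
struct FakeInst { unsigned opcode; unsigned coprocessor; };
struct FakeSubtarget { bool hasNewerArch; };

// Returns true and fills Info when the instruction is considered deprecated;
// returns false and leaves Info untouched otherwise.
bool getMCRDeprecationInfo(FakeInst &MI, FakeSubtarget &STI, std::string &Info) {
    // Made-up rule: deprecated only on the newer architecture for one coprocessor.
    if (STI.hasNewerArch && MI.coprocessor == 15) {
        Info = "deprecated on this subtarget";
        return true;
    }
    return false;
}
```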

The MCTargetAsmParser constructor was changed to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.

llvm-svn: 190598

0e76fa7d

Fix crash in AggressiveAntiDepBreaker with empty CriticalPathSet · 6f1ff8e1

Hal Finkel authored Sep 12, 2013

If no register classes are added to CriticalPathRCs, then the CriticalPathSet
bitmask will be empty. In that case, ExcludeRegs must remain NULL or else this
line will cause a segfault:

  } else if ((ExcludeRegs != NULL) && ExcludeRegs->test(AntiDepReg)) {

I have no in-tree test case.
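
A minimal sketch of the guard, using `std::vector<bool>` as a stand-in for the bitmask type. The helper names `chooseExcludeRegs` and `isExcluded` are assumptions for this example and do not appear in the commit.

```cpp
#include <vector>

// Hand out a pointer to the set only when it has at least one bit set;
// an empty set yields a null pointer.
const std::vector<bool> *chooseExcludeRegs(const std::vector<bool> &CriticalPathSet) {
    for (bool bit : CriticalPathSet)
        if (bit)
            return &CriticalPathSet;
    return nullptr;
}

// The null check short-circuits before the set is ever indexed, so an
// empty critical-path set can never be queried out of range.
bool isExcluded(const std::vector<bool> *ExcludeRegs, unsigned Reg) {
    return ExcludeRegs != nullptr && Reg < ExcludeRegs->size() && (*ExcludeRegs)[Reg];
}
```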

llvm-svn: 190584

6f1ff8e1

Remove pointless assertion after r190376 · bc08ddba
Matt Arsenault authored Sep 12, 2013
```
llvm-svn: 190565
```
bc08ddba

Sep 11, 2013
- Debug info: add more comments. · 5b2f4b05
  Manman Ren authored Sep 11, 2013
```
llvm-svn: 190544
```
  5b2f4b05
- Add getUnrollingPreferences to TTI · 8f2e7005
  Hal Finkel authored Sep 11, 2013
```
Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 190542
```
  8f2e7005
- Revert "Give internal classes hidden visibility." · 079b96e6
  Benjamin Kramer authored Sep 11, 2013
```
It works with clang, but GCC has different rules so we can't make all of those
hidden. This reverts commit r190534.

llvm-svn: 190536
```
  079b96e6
- Give internal classes hidden visibility. · 6a44af36
  Benjamin Kramer authored Sep 11, 2013
```
Worth 100k on a linux/x86_64 Release+Asserts clang.

llvm-svn: 190534
```
  6a44af36
- Simplify the checking of function attributes by using the simple methods. · 62a2d14a
  Bill Wendling authored Sep 11, 2013
```
llvm-svn: 190499
```
  62a2d14a
- Rename variables for consistency. · 8f06d556
  Eli Friedman authored Sep 11, 2013
```
No functional change.

llvm-svn: 190466
```
  8f06d556
- Fix unused variables. · 78bffa57
  Eli Friedman authored Sep 10, 2013
```
llvm-svn: 190448
```
  78bffa57
Sep 10, 2013

Hoist section call out of loop. · 13b99d2a
Eric Christopher authored Sep 10, 2013
```
llvm-svn: 190440
```
13b99d2a

Debug Info: create scope children DIEs when the scope DIE is not null. · 2312ed35

Manman Ren authored Sep 10, 2013

We try to create the scope children DIEs after we create the scope DIE. But
to avoid emitting an empty lexical block DIE, we first check whether a scope
DIE is going to be null, then create the scope children if it is not null.
From the number of children, we decide whether to actually create the scope DIE.

This patch also removes an early exit which checks for a special condition.
It also removes deletion of unused children DIEs that are generated
because we used to generate children DIEs before the scope DIE.

Deletion of unused children DIEs may cause problems because we sometimes keep
created DIEs in a member variable of a CU.

llvm-svn: 190421

2312ed35

Debug Info: define a DIRef template. · 34b3dcc3

Manman Ren authored Sep 10, 2013

Specialize the constructors for DIRef<DIScope> and DIRef<DIType> to make sure
the Value is indeed a scope ref and a type ref.

Use DIScopeRef for DIScope::getContext and DIType::getContext and use DITypeRef
for getContainingType and getClassType.

DIScope::generateRef now returns a DIScopeRef instead of a "Value *" for
readability and type safety.

llvm-svn: 190418

34b3dcc3

Don't use getSetCCResultType for creating a vselect · d232222f

Matt Arsenault authored Sep 10, 2013

The vselect mask isn't a setcc.

This breaks in the case when the result of getSetCCResultType
is larger than the vector operands.

e.g. %tmp = select i1 %cmp, <2 x i8> %a, <2 x i8> %b
when getSetCCResultType returns <2 x i32>, the assertion
that (MaskTy.getSizeInBits() == Op1.getValueType().getSizeInBits())
is hit.

No test since I don't think I can hit this with any of the current
targets. The R600/SI implementation would break, since it returns a
vector of i1 for this, but it doesn't reach ExpandSELECT for other
reasons.

llvm-svn: 190376

d232222f

Enable -misched-cyclicpath by default. · 6c88b350
Andrew Trick authored Sep 09, 2013
```
llvm-svn: 190367
```
6c88b350