- Apr 10, 2013
-
-
Michel Danzer authored
21 more little piglits with radeonsi.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 179186
-
Reed Kotler authored
Mips32 code as Mips16 unless it can't be compiled as Mips16. For now this happens as long as floating point instructions are not needed. It would probably also make sense to compile as Mips32 if atomic operations are needed; there may be other cases too. A module pass prescans the IR and adds the mips16 or nomips16 attribute to functions depending on each function's needs. Mips16 mode can yield a 40% code compression by utilizing 16-bit encodings of many instructions.

The hope is for this to replace the traditional gcc way of dealing with Mips16 code that uses floating point, which essentially amounts to soft float but with a library implemented using Mips32 floating point. That gcc method also requires creating stubs so that Mips32 code can interact with Mips16 functions that have floating point needs. My conjecture is that in reality the traditional gcc method would never win over this new method. I will be implementing the traditional gcc method also; some of it is already done, but I needed the stubs to finish the work, and those required this mips16/32 mixed-mode capability. I have more ideas to make this new method much better, and I think the old method will just live in llvm for anyone that needs the backward compatibility, though I don't know for what reason that would be needed. llvm-svn: 179185
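As a rough illustration of the per-function choice the module pass models (a minimal sketch, assuming the GCC/Clang source-level attribute spellings; the function names here are hypothetical, and the pass itself sets IR attributes rather than source attributes):

    /* Integer-only code: eligible for the compact Mips16 encoding. */
    __attribute__((mips16)) int add16(int a, int b) { return a + b; }

    /* Uses floating point, so it must remain Mips32 (nomips16). */
    __attribute__((nomips16)) double scale(double x) { return x * 2.0; }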
-
Peter Collingbourne authored
symbol with multiple .type declarations. Differential Revision: http://llvm-reviews.chandlerc.com/D607 llvm-svn: 179184
-
Rafael Espindola authored
llvm-svn: 179179
-
Vincent Lejeune authored
llvm-svn: 179174
-
Tim Northover authored
These instructions aren't universally available, but depend on a specific extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new feature is appropriate. This also enables the feature by default on A-class cores which usually have these extensions, to avoid breaking existing code and act as a sensible default. llvm-svn: 179171
-
Joey Gouly authored
rather than checking if the source and destination have the same number of arguments and copying the attributes over directly. llvm-svn: 179169
-
Christian Konig authored
Depending on the number of bits set in the writemask.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179166
-
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165
-
Christian Konig authored
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179164
-
Hal Finkel authored
Implement suggestions made by Bill Schmidt in post-commit review. Thanks! llvm-svn: 179162
-
Tobias Grosser authored
Contributed by: Star Tan <tanmx_star@yeah.net> llvm-svn: 179157
-
Hal Finkel authored
This adds in-principle support for if-converting the bctr[l] instructions. These instructions are used for indirect branching. It seems, however, that the current if converter will never actually predicate these. To do so, it would need the ability to hoist a few setup instructions out of the conditionally-executed block. For example, code like this:

    void foo(int a, int (*bar)()) {
      if (a != 0)
        bar();
    }

becomes:

    ...
    beq 0, .LBB0_2
    std 2, 40(1)
    mr 12, 4
    ld 3, 0(4)
    ld 11, 16(4)
    ld 2, 8(4)
    mtctr 3
    bctrl
    ld 2, 40(1)
    .LBB0_2:
    ...

and it would be safe to do all of this unconditionally with a predicated beqctrl instruction. llvm-svn: 179156
-
Rafael Espindola authored
For now they are still only used as little endian. llvm-svn: 179147
-
Evan Cheng authored
xmm0 / xmm1. rdar://13599493 llvm-svn: 179141
-
Andrew Trick authored
The target hooks are getting out of hand. What does it mean to run before or after regalloc anyway? Allowing either Pass* or AnalysisID pass identification should make it much easier for targets to use the substitutePass and insertPass APIs, and create less need for badly named target hooks. llvm-svn: 179140
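A rough sketch of how a target might use these APIs with AnalysisID-based pass identification (illustrative only: the class and pass-ID names below are hypothetical, and the exact signatures of substitutePass/insertPass may differ):

    // Hypothetical target pass config using AnalysisID-based substitution.
    class MyTargetPassConfig : public TargetPassConfig {
    public:
      MyTargetPassConfig(MyTargetMachine *TM, PassManagerBase &PM)
          : TargetPassConfig(TM, PM) {
        // Swap a standard pass for a target-specific replacement by ID.
        substitutePass(&PostRASchedulerID, &MyPostRASchedulerID);
        // Insert an extra pass relative to an existing pass's AnalysisID.
        insertPass(&MachineSchedulerID, &MyLateFixupsID);
      }
    };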
-
Jack Carter authored
Modifier 'D' is to use the second word of a double integer. We had previously implemented the pure register variant of the modifier, and this patch implements the memory reference:

    #include "stdio.h"
    int b[8] = {0,1,2,3,4,5,6,7};
    void main() {
      int i;
      // The first word. Notice, no 'D'
      {asm ( "lw %0,%1;" : "=r" (i) : "m" (*(b+4)) );}
      printf("%d\n",i);
      // The second word
      {asm ( "lw %0,%D1;" : "=r" (i) : "m" (*(b+4)) );}
      printf("%d\n",i);
    }

llvm-svn: 179135
-
Hal Finkel authored
This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr, which is a conditional return and loop-counter decrement all in one). At the moment, if-conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if-conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. llvm-svn: 179134
-
Bob Wilson authored
llvm-svn: 179132
-
- Apr 09, 2013
-
-
Chad Rosier authored
llvm-svn: 179129
-
Chad Rosier authored
llvm-svn: 179125
-
Rafael Espindola authored
llvm-svn: 179124
-
Chad Rosier authored
llvm-svn: 179120
-
Chandler Carruth authored
columns is essentially impossible to edit. llvm-svn: 179119
-
Reed Kotler authored
and mips16 on a per function basis. Because this patch is somewhat involved, I have provided an overview of its key pieces. The patch is written so as to not change the behavior of the non-mixed mode. We have tested this a lot, but switching subtargets is something new, so we don't want any chance of regression in the mainline compiler until we have more confidence in this.

Mips32/64 are very different from Mips16, as in the case of ARM vs. Thumb1. For that reason there are derived versions of the register info, frame info, instruction info, and instruction selection classes. We now register three separate passes for instruction selection: one which is used to switch subtargets (MipsModuleISelDAGToDAG.cpp), and then one for each of the current subtargets (Mips16ISelDAGToDAG.cpp and MipsSEISelDAGToDAG.cpp).

When the ModuleISel pass runs, it determines whether there is a need to switch subtargets and, if so, the owning pointers in MipsTargetMachine are changed accordingly. When 16Isel or SEIsel is run, they return immediately without doing any work if the current subtarget mode does not apply to them. In addition, MipsAsmPrinter needs to be reset on a per-function basis.

The pass BasicTargetTransformInfo is substituted with a null pass, since that pass is immutable and really needs to be a function pass for it to be used with changing subtargets. This will be fixed in a follow-on patch. llvm-svn: 179118
-
Nadav Rotem authored
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users:

1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).
2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.
3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.

This patch also includes a simple (100 LOC) bottom-up SLP vectorizer that uses the infrastructure, and can vectorize this code:

    void SAXPY(int *x, int *y, int a, int i) {
      x[i]   = a * x[i]   + y[i];
      x[i+1] = a * x[i+1] + y[i+1];
      x[i+2] = a * x[i+2] + y[i+2];
      x[i+3] = a * x[i+3] + y[i+3];
    }

llvm-svn: 179117
-
Chad Rosier authored
parse an identifier. Otherwise, parseExpression may parse multiple tokens, which makes it impossible to properly compute an immediate displacement. An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in the below example:

    __asm mov eax, [Symbol + ImmDisp]

The existing test cases exercise this patch. rdar://13611297 llvm-svn: 179115
-
Eric Christopher authored
therefore not at all) of the pc or statement list. We also don't need to emit the compilation dir, so save space and time and don't bother. Fix up the testcase accordingly and verify that we don't emit the attributes or the items that they use. llvm-svn: 179114
-
Hal Finkel authored
Some general cleanup, and only scan the end of a BB for branches (once we're done with the terminators and debug values, there should not be any other branches). These address post-commit review suggestions by Bill Schmidt. No functionality change intended. llvm-svn: 179112
-
Nadav Rotem authored
llvm-svn: 179111
-
Chad Rosier authored
rather than deriving the StringRef from the Start and End SMLocs. Using the Start and End SMLocs works fine for operands such as [Symbol], but not for operands such as [Symbol + ImmDisp]. All existing test cases that reference a variable exercise this patch. rdar://13602265 llvm-svn: 179109
-
Benjamin Kramer authored
This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl:

    _Store:
        ldr.w  r9, [sp]
        vmov   d17, r3, r9
        vmov   d16, r1, r2
        vst1.8 {d16, d17}, [r0]
        bx     lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647 llvm-svn: 179106
-
Hal Finkel authored
On PowerPC, non-vector loads and stores have r+i forms; however, in functions with large stack frames these were not being used to access slots far from the stack pointer because such slots were out of range for the signed 16-bit immediate offset field. This increases register pressure because we need a separate register for each offset (when the r+r form is used). By enabling virtual base registers, we can deal with large stack frames without unduly increasing register pressure. llvm-svn: 179105
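A minimal example of the situation described above (hypothetical source; the exact offsets at which the r+i form stops reaching depend on the frame layout):

    // A frame this large pushes some slots beyond the +/-32 KB reach of
    // the signed 16-bit offset in the PPC r+i load/store forms, so the
    // backend would otherwise need a separate register per far offset.
    void big_frame(void) {
      char buf[1 << 17];       // 128 KB of locals
      buf[0] = 1;              // near slot: the r+i form reaches it
      buf[(1 << 17) - 1] = 2;  // far slot: wants a (virtual) base register
    }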
-
Rafael Espindola authored
For now it is templated only on being 64 or 32 bits. I will add little/big endian next. llvm-svn: 179097
-
Alexey Samsonov authored
DWARF parser: Fix DWARF-2/3 incompatibility: size of DW_FORM_ref_addr is the same as DW_FORM_addr in DWARF2, and is 4/8 bytes on 32/64-bit DWARF starting from DWARF3. Adding a test for this is a huge pain - generating and uploading pre-built binary with DWARF3 debug info is way too ugly, and writing fine-grained unittests for DebugInfo is impossible, as it doesn't expose any headers in include/llvm. That said, I'm going to choose the second approach and submit the patch exposing DebugInfo headers for review soon enough. llvm-svn: 179095
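The size rule being fixed can be summarized in a small helper (a sketch only, not the parser's actual code; the name and signature are hypothetical):

    // DW_FORM_ref_addr is address-sized in DWARF 2; from DWARF 3 on it is
    // offset-sized: 4 bytes in 32-bit DWARF, 8 bytes in 64-bit DWARF.
    unsigned getRefAddrSize(unsigned AddrSize, unsigned DwarfVersion,
                            bool Is64BitDwarf) {
      if (DwarfVersion == 2)
        return AddrSize;
      return Is64BitDwarf ? 8 : 4;
    }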
-
Jakob Stoklund Olesen authored
llvm-svn: 179086
-
Nadav Rotem authored
llvm-svn: 179084
-
Jakob Stoklund Olesen authored
The save area is twice as big and there is no struct return slot. The stack pointer is always 16-byte aligned (after adding the bias). Also eliminate the stack adjustment instructions around calls when the function has a reserved stack frame. llvm-svn: 179083
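For context on "the bias" (background on the SPARC v9 ABI, not taken from this commit: %sp and %fp are biased by 2047 bytes), the alignment claim can be sketched as:

    // On SPARC v9 the real frame address is %sp + BIAS; it is that biased
    // address, not the raw %sp value, that is kept 16-byte aligned.
    const long kStackBias = 2047; // v9 ABI constant (background knowledge)
    bool isFrameAligned(long sp) { return ((sp + kStackBias) & 15) == 0; }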
-
Rafael Espindola authored
llvm-svn: 179076
-
Rafael Espindola authored
Use it when we don't need to know if we have a 32 or 64 bit SymbolTableEntry. llvm-svn: 179074
-