Commits · 3a7beafb328d7362aecea7df38e0247e3c2a5aa9 · Roger Ferrer / llvm-epi-0.8

Apr 15, 2013

R600/SI: Emit configuration value in the .AMDGPU.config ELF section · 3a7beafb
Tom Stellard authored Apr 15, 2013
```
llvm-svn: 179545
```
3a7beafb
R600: Emit ELF formatted code rather than raw ISA. · 9991659f
Tom Stellard authored Apr 15, 2013
```
llvm-svn: 179544
```
9991659f
Avoid outputting temporary test file into source tree. · 943e9293
Tim Northover authored Apr 15, 2013
```
llvm-svn: 179532
```
943e9293

Fix PPC64 CR spill location for callee-saved registers · 6736988a

Hal Finkel authored Apr 15, 2013

This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition
registers, the spill location is specified relative to the stack pointer (SP +
8). However, this is not relative to the SP after the new stack frame is
established, but instead relative to the caller's stack pointer (it is stored
into the linkage area of the parent's stack frame).

So, like with the link register, we don't directly spill the CRs with other
callee-saved registers, but just mark them to be spilled during prologue
generation.

In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32).

llvm-svn: 179500

6736988a

Apr 14, 2013
- Use i32 for all SPARC shift amounts, even in 64-bit mode. · eed1072f
  Jakob Stoklund Olesen authored Apr 14, 2013
```
Test case by llvm-stress.

llvm-svn: 179477
```
  eed1072f
- Add support for the abs64 SPARC v9 code model. · c3c28f85
  Jakob Stoklund Olesen authored Apr 14, 2013
```
For when 16 TB just isn't enough.

llvm-svn: 179474
```
  c3c28f85
- Add support for the SPARC v9 abs44 code model. · c8fc76b0
  Jakob Stoklund Olesen authored Apr 14, 2013
```
This is the default model for non-PIC 64-bit code. It supports
text+data+bss linked anywhere in the low 16 TB of the address space.

llvm-svn: 179473
```
  c8fc76b0
- Also put target flags on SPARC constant pool references. · e0fc832b
  Jakob Stoklund Olesen authored Apr 14, 2013
```
Constant pool entries are accessed exactly the same way as global
variables.

llvm-svn: 179471
```
  e0fc832b
- Fix patterns for 64-bit pointers. · dc1ed578
  Jakob Stoklund Olesen authored Apr 14, 2013
```
This fixes the pic32 code model for SPARC v9.

llvm-svn: 179469
```
  dc1ed578
Apr 13, 2013

Define SPARC code models. · 15b3e900

Jakob Stoklund Olesen authored Apr 13, 2013

Currently, only abs32 and pic32 are implemented. Add a test case for
abs32 with 64-bit code. 64-bit PIC code is currently broken.

llvm-svn: 179463

15b3e900

Spill and restore PPC CR registers using the FP when we have one · d85a04b3

Hal Finkel authored Apr 13, 2013

For functions that need to spill CRs, and have dynamic stack allocations, the
value of the SP during the restore is not what it was during the save, and so
we need to use the FP in these cases (as for all of the other spills and
restores, but the CR restore has a special code path because its reserved slot,
like the link register, is specified directly relative to the adjusted SP).

llvm-svn: 179457

d85a04b3

Further generalize this scheduler test. · 3d957c0e
Andrew Trick authored Apr 13, 2013
```
The order of copies depends on queue order, which is not very stable.

llvm-svn: 179456
```
3d957c0e
Fix a dyslexic regex. · e6f9fc0c
Andrew Trick authored Apr 13, 2013
```
llvm-svn: 179455
```
e6f9fc0c
Add a missing REQUIRES: asserts · 88a1285b
Andrew Trick authored Apr 13, 2013
```
llvm-svn: 179453
```
88a1285b

MI-Sched: schedule physreg copies. · e833e1cd

Andrew Trick authored Apr 13, 2013

The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.

llvm-svn: 179449

e833e1cd

[mips] Reapply r179420 and r179421. · 2f08822f
Akira Hatanaka authored Apr 13, 2013
```
llvm-svn: 179434
```
2f08822f
Revert r179420 and r179421. · 8ed2892c
Akira Hatanaka authored Apr 12, 2013
```
llvm-svn: 179422
```
8ed2892c
[mips] Instruction selection patterns for carry-setting and using add · 931ad87f
Akira Hatanaka authored Apr 12, 2013
```
instructions.

llvm-svn: 179421
```
931ad87f
[mips] v4i8 and v2i16 add, sub and mul instruction selection patterns. · 8f41dd92
Akira Hatanaka authored Apr 12, 2013
```
llvm-svn: 179420
```
8f41dd92

Apr 12, 2013

Replace coff-/elf-dump with llvm-readobj · ba848e3b
Nico Rieck authored Apr 12, 2013
```
llvm-svn: 179361
```
ba848e3b
Fix the test on linux by setting the triple and the align format · 25a23bc0
Nadav Rotem authored Apr 12, 2013
```
llvm-svn: 179354
```
25a23bc0

Add a flag to align all basic blocks in the function. · c3b0f50a

Nadav Rotem authored Apr 12, 2013

When debugging performance regressions we often ask ourselves if the regression
that we see is due to poor isel/sched/ra or due to some micro-architectural
problem. When comparing two code sequences, one good way to rule out front-end
bottlenecks (and other issues) is to force code alignment. This pass adds
a flag that forces the alignment of all of the basic blocks in the program.

llvm-svn: 179353

c3b0f50a

Apr 11, 2013

Use FileCheck instead of grep. · 6bda0db2
Preston Gurd authored Apr 11, 2013
```
llvm-svn: 179322
```
6bda0db2
Mips specific inline asm memory operand modifier test case · a16fa808
Jack Carter authored Apr 11, 2013
```
These changes are based on commit responses for r179135.

llvm-svn: 179315
```
a16fa808
Add a CHECK-NOT for a more faithful translation of the original grep | count 2. · 0840082c
Eli Bendersky authored Apr 11, 2013
```
Thanks to Reid Kleckner for catching this.

llvm-svn: 179289
```
0840082c
Add missing colons to check lines. · b50682e1
Benjamin Kramer authored Apr 11, 2013
```
llvm-svn: 179277
```
b50682e1
FileCheckize a bunch of tests. · 3960c1cd
Benjamin Kramer authored Apr 11, 2013
```
llvm-svn: 179276
```
3960c1cd

Optimize vector select from all 0s or all 1s · 55658d42

Michael Liao authored Apr 11, 2013

As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane,
vector select could be simplified to AND/OR or removed if one or both values
being selected is all 0s or all 1s.

llvm-svn: 179267

55658d42

Enhance bool simplification in X86 to handle more cases · f7bf8705

Michael Liao authored Apr 11, 2013

This patch is revised based on a patch from Victor Umansky
<victor.umansky@intel.com>. More cases are handled in X86's bool
simplification, i.e.
- SETCC_CARRY
- value is truncated to i1 with AND

As a by-product, PR5443 is also fixed.

llvm-svn: 179265

f7bf8705

Rewrite some of the test/CodeGen/X86 tests to use FileCheck instead of grep · 1dceb3c9
Eli Bendersky authored Apr 10, 2013
```
llvm-svn: 179241
```
1dceb3c9

Manually remove successors in if conversion when CopyAndPredicateBlock is used · 95081bff

Hal Finkel authored Apr 10, 2013

In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is
used because the to-be-predicated block has other predecessors, we need to
explicitly remove the old copied block from the successors list. Normally if
conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges
to clean up the successors list, but if the predicated block contained an
un-analyzable branch (such as a now-predicated return), then this will fail.

These extra successors were a problem on PPC because they caused
later passes (such as PPCEarlyReturn) to leave dead return-only basic blocks in
the code.

llvm-svn: 179227

95081bff

Mips specific inline asm memory operand modifier test case · b6bcdfd2
Jack Carter authored Apr 10, 2013
```
These changes are based on commit responses for r179135.

llvm-svn: 179225
```
b6bcdfd2

Apr 10, 2013

R600/SI: Add pattern for AMDGPUurecip · 8caa904b

Michel Danzer authored Apr 10, 2013



21 more little piglits with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 179186

8caa904b

This is for an experimental option -mips-os16. The idea is to compile all · fe94cc3e

Reed Kotler authored Apr 10, 2013

Mips32 code as Mips16 unless it can't be compiled as Mips16. For now this
happens as long as floating point instructions are not needed.
It would probably also make sense to compile as Mips32 if atomic operations
are needed. There may be other cases too.

A module pass prescans the IR and adds the mips16 or nomips16 attribute
to functions depending on each function's needs.

Mips16 mode can result in 40% code compression by utilizing the 16-bit
encoding of many instructions.

The hope is for this to replace the traditional gcc way of dealing with
Mips16 code using floating point which involves essentially using soft float
but with a library implemented using mips32 floating point. This gcc 
method also requires creating stubs so that Mips32 code can interact with
these Mips 16 functions that have floating point needs. My conjecture is
that in reality this traditional gcc method would never win over this
new method.

I will be implementing the traditional gcc method also. Some of it is already
done but I needed to do the stubs to finish the work and those required
this mips16/32 mixed mode capability.

I have more ideas to make this new method much better, and I think the old
method will just live in llvm for anyone that needs the backward compatibility,
but I don't know for what reason that would be needed.

llvm-svn: 179185

fe94cc3e

R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr · 04d9aa48
Vincent Lejeune authored Apr 10, 2013
```
llvm-svn: 179174
```
04d9aa48

R600/SI: dynamically figure out the reg class of MIMG · 8b1ed28e

Christian Konig authored Apr 10, 2013



Depending on the number of bits set in the writemask.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179166

8b1ed28e

R600/SI: adjust writemask to only the used components · 8e06e2a8

Christian Konig authored Apr 10, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165

8e06e2a8

R600/SI: remove image sample writemask · 4ace6632

Christian Konig authored Apr 10, 2013



Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179164

4ace6632

__sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in · ac0469c5
Evan Cheng authored Apr 10, 2013
```
xmm0 / xmm1.

rdar://13599493

llvm-svn: 179141
```
ac0469c5

Mips specific inline asm operand modifier 'D' · b04e357d

Jack Carter authored Apr 09, 2013

Modifier 'D' is to use the second word of a double integer.

We had previously implemented the pure register variant of
the modifier and this patch implements the memory reference.



#include <stdio.h>

int b[8] = {0,1,2,3,4,5,6,7};

/* MIPS-only example: the inline asm uses the MIPS lw instruction. */
int main(void)
{
    int i;

    // The first word. Notice, no 'D'
    {asm (
    "lw    %0,%1;"
    : "=r" (i)
    : "m" (*(b+4))
    );}

    printf("%d\n",i);

    // The second word
    {asm (
    "lw    %0,%D1;"
    : "=r" (i)
    : "m" (*(b+4))
    );}

    printf("%d\n",i);
    return 0;
}

llvm-svn: 179135

b04e357d