Commits · 8cfaffaaded057aa2ff7be9443797f2c9f47224a · Roger Ferrer / llvm-epi-0.8

Apr 04, 2013

Add SPARC v9 support for select on 64-bit compares. · 8cfaffaa

Jakob Stoklund Olesen authored Apr 04, 2013

This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

llvm-svn: 178737

8cfaffaa

Apr 03, 2013

X86 cost model: Vector shifts are expensive in most cases · e9b50164

Arnold Schwaighofer authored Apr 03, 2013

The default logic does not correctly identify costs of shifts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703

e9b50164

R600: Fix last ALU of a clause being emitted in a separate clause · c3d3f9b6
Vincent Lejeune authored Apr 03, 2013
```
llvm-svn: 178675
```
c3d3f9b6

Cleanup PPC reciprocal-estimate functionality · b0c810ff

Hal Finkel authored Apr 03, 2013

Incorporating review feedback from Bill Schmidt on r178617. No functionality
change intended.

llvm-svn: 178672

b0c810ff

R600: Factorize maximum alu per clause in a single location · 80031d9f
Vincent Lejeune authored Apr 03, 2013
```
llvm-svn: 178667
```
80031d9f
R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer · b6d6c0d4
Vincent Lejeune authored Apr 03, 2013
```
llvm-svn: 178665
```
b6d6c0d4

R600: Consider KILLGT as an ALU instruction · 9931298b

Vincent Lejeune authored Apr 03, 2013

Mesa does not override llvm behavior wrt KILLGT anymore so llvm
has to handle KILLGT on its own.

llvm-svn: 178664

9931298b

PPC: Enable FRES and FRSQRTE on the default PPC64 description · 7ac4592e

Hal Finkel authored Apr 03, 2013

I discussed this with Bill Schmidt on IRC, and it was decided that this is a
safe and reasonable default.

llvm-svn: 178659

7ac4592e

PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern · 0c6d2193
Hal Finkel authored Apr 03, 2013
```
llvm-svn: 178658
```
0c6d2193
Remove some obsolete PowerPC/README entries · 2ed21a8c
Hal Finkel authored Apr 03, 2013
```
llvm-svn: 178657
```
2ed21a8c

More direct types in PowerPC AltiVec intrinsics. · 084ff8e8

Ulrich Weigand authored Apr 03, 2013

This patch follows up on work done by Bill Schmidt in r178277,
and replaces most of the remaining uses of VRRC in ISEL DAG patterns.

The resulting .inc files are identical except for comments, so
no change in code generation is expected.

llvm-svn: 178656

084ff8e8

Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. · 92e26646

Bill Schmidt authored Apr 03, 2013

For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639

92e26646

AArch64: implement ETMv4 trace system registers. · 5816ca11
Tim Northover authored Apr 03, 2013
```
llvm-svn: 178637
```
5816ca11
Fix SRet for thiscall in i686-pc-win32 · f4e0665e
Timur Iskhodzhanov authored Apr 03, 2013
```
llvm-svn: 178634
```
f4e0665e

AArch64: switch patterns to be type-based rather than RegClass-based · 5b097a73

Tim Northover authored Apr 03, 2013

It's a bit of churn in the blame log, but I think there are real benefits to
the newer system so I'm making the change in one go.

llvm-svn: 178633

5b097a73

Add 64-bit compare + branch for SPARC v9. · d9bbdfd3

Jakob Stoklund Olesen authored Apr 03, 2013

The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

llvm-svn: 178621

d9bbdfd3
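
As a rough illustration of the icc/xcc distinction described in the commit above, the following C++ sketch models one compare that yields two flag sets. The `Flags`, `CompareResult`, `compare`, `equal32` and `equal64` names are hypothetical and exist only for this example; only the Z and N bits are modeled, and this is not how the backend represents flags.

```cpp
#include <cstdint>

// Condition codes produced by one compare, modeled loosely on SPARC v9.
// Only the Z (zero) and N (negative) bits are modeled in this sketch.
struct Flags {
  bool z;
  bool n;
};

struct CompareResult {
  Flags icc;  // derived from bits 31..0 of the difference
  Flags xcc;  // derived from all 64 bits of the difference
};

// Hypothetical helper: one subtraction, two sets of flags.
inline CompareResult compare(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t diff = a - b;  // unsigned arithmetic wraps, no UB
  const std::uint32_t low = static_cast<std::uint32_t>(diff);
  CompareResult r;
  r.icc = Flags{low == 0, (low >> 31) != 0};
  r.xcc = Flags{diff == 0, (diff >> 63) != 0};
  return r;
}

// A 64-bit "branch if equal" (or a select) has to consult xcc.
inline bool equal64(std::uint64_t a, std::uint64_t b) {
  return compare(a, b).xcc.z;
}

// Consulting icc instead only looks at the low 32 bits.
inline bool equal32(std::uint64_t a, std::uint64_t b) {
  return compare(a, b).icc.z;
}
```

Comparing `0x100000000` with `0` shows why a 64-bit branch or select cannot reuse the icc-based instructions: the low words match, so `equal32` reports equality while `equal64` does not.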

Remove some unsupported-feature comments from PPC.td · b00fc876
Hal Finkel authored Apr 03, 2013
```
These refer to the reciprocal estimate support recently committed.

llvm-svn: 178618
```
b00fc876

Use PPC reciprocal estimates with Newton iteration in fast-math mode · 2e103310

Hal Finkel authored Apr 03, 2013

When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.

I've also added a couple of fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

llvm-svn: 178617

2e103310
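
The estimate-plus-Newton scheme from the commit above can be sketched in plain C++. This is a minimal sketch under stated assumptions, not the code LLVM emits: `estimate_recip` and `estimate_rsqrt` are hypothetical stand-ins for the hardware estimate instructions, modeled here as the exact value with a relative error of about 2^-8, and the step count of 3 is chosen for this model only.

```cpp
#include <cmath>

// Stand-ins for the hardware estimate instructions (assumption: modeled as
// the exact value perturbed by a relative error of about 2^-8).
inline double estimate_recip(double a) {
  return (1.0 / a) * (1.0 + 1.0 / 256.0);
}
inline double estimate_rsqrt(double a) {
  return (1.0 / std::sqrt(a)) * (1.0 - 1.0 / 256.0);
}

// One Newton step for f(x) = 1/x - a:  x' = x * (2 - a*x).
inline double refine_recip(double a, double x) { return x * (2.0 - a * x); }

// One Newton step for f(y) = 1/y^2 - a:  y' = y * (1.5 - 0.5*a*y*y).
inline double refine_rsqrt(double a, double y) {
  return y * (1.5 - 0.5 * a * y * y);
}

// n / d computed as n * (refined estimate of 1/d).
inline double fast_div(double n, double d, int steps = 3) {
  double x = estimate_recip(d);
  for (int i = 0; i < steps; ++i) x = refine_recip(d, x);
  return n * x;
}

// sqrt(a) computed as a * (refined estimate of 1/sqrt(a)).
inline double fast_sqrt(double a, int steps = 3) {
  if (a == 0.0) return 0.0;  // a * rsqrt(a) would be 0 * inf = NaN
  double y = estimate_rsqrt(a);
  for (int i = 0; i < steps; ++i) y = refine_rsqrt(a, y);
  return a * y;
}
```

Each reciprocal step roughly squares the relative error, so a coarse estimate reaches double precision in a few multiply-add operations. The `2 - a*x` term has the shape of a negated multiply-subtract, which is presumably why the missing fnmsub patterns mattered for code quality.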

Formatting. · e2fbc67e
Eric Christopher authored Apr 02, 2013
```
llvm-svn: 178589
```
e2fbc67e

[mips] Small update to the implementation of eh.return for Mips. · 023c678a

Akira Hatanaka authored Apr 02, 2013

This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where the handler to which eh.return jumps
points to the start of a function.

Patch by Sasa Stankovic.

llvm-svn: 178588

023c678a

[mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp. · 2ffc5734

Akira Hatanaka authored Apr 02, 2013

This patch fixes the following two tests which have been failing on
llvm-mips-linux builder since r178403:

LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll
LLVM :: Analysis/Profiling/load-branch-weights-loops.ll

llvm-svn: 178584

2ffc5734

Apr 02, 2013

[ms-inline asm] Add support for parsing variables with namespace alias qualifiers. · 8a24466f

Chad Rosier authored Apr 02, 2013

This patch only adds support for parsing these identifiers in the
X86AsmParser.  The front-end interface isn't capable of looking up
these identifiers at this point in time.  The end result is the
compiler now errors during object file emission, rather than at
parse time.  Test case coming shortly.
Part of rdar://13499009 and PR13340

llvm-svn: 178566

8a24466f

Fix PR15630: Replace faulty stdcx. with stwcx. · 3581cd4b

Bill Schmidt authored Apr 02, 2013

When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.

llvm-svn: 178559

3581cd4b
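
To see why the store-conditional must be word-sized regardless of the target's pointer width, here is a hedged C++ sketch of a partword atomic operation performed on its containing 32-bit word. It uses `std::atomic` compare-exchange in place of the lwarx/stwcx. pair, and `atomic_add_byte` is a hypothetical helper written for this example, not code from the patch.

```cpp
#include <atomic>
#include <cstdint>

// Atomically add `delta` to one byte of a 32-bit word and return the old
// byte. `byte_index` is 0..3, counted from the least significant byte.
inline std::uint8_t atomic_add_byte(std::atomic<std::uint32_t>& word,
                                    unsigned byte_index, std::uint8_t delta) {
  const unsigned shift = (byte_index & 3u) * 8u;
  const std::uint32_t mask = 0xFFu << shift;
  std::uint32_t old_word = word.load(std::memory_order_relaxed);
  for (;;) {
    const std::uint32_t old_byte = (old_word & mask) >> shift;
    const std::uint32_t new_byte = (old_byte + delta) & 0xFFu;
    const std::uint32_t new_word = (old_word & ~mask) | (new_byte << shift);
    // Word-sized conditional update. On failure old_word is refreshed with
    // the current value and the loop retries, like a failed stwcx.
    if (word.compare_exchange_weak(old_word, new_word)) {
      return static_cast<std::uint8_t>(old_byte);
    }
  }
}
```

The load and the conditional store both operate on the same 32-bit container, so their widths have to match; nothing in the loop depends on whether pointers are 32 or 64 bits wide.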

[fast-isel] Use the correct API to disable FastLowerArguments for Win64. · 7925d280
Chad Rosier authored Apr 02, 2013
```
llvm-svn: 178549
```
7925d280
[NVPTX] Fix a few style issues in NVVMReflect · a922c7e9
Justin Holewinski authored Apr 02, 2013
```
llvm-svn: 178536
```
a922c7e9
Add 64-bit load and store instructions. · 8eabc3ff
Jakob Stoklund Olesen authored Apr 02, 2013
```
There are only a few new instructions, the rest is handled with patterns.

llvm-svn: 178528
```
8eabc3ff

Basic 64-bit ALU operations. · 917e07f0

Jakob Stoklund Olesen authored Apr 02, 2013

SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

llvm-svn: 178527

917e07f0

Materialize 64-bit immediates. · bddb20ee

Jakob Stoklund Olesen authored Apr 02, 2013

The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

llvm-svn: 178526

bddb20ee
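
A six-instruction sequence of the kind mentioned above can be emulated in C++. The sketch assumes the conventional SPARC split of a 64-bit value into 22/10/22/10-bit fields (the assembler's %hh, %hm, %hi and %lo operators); `ImmParts`, `split`, `materialize` and `estimate_instructions` are hypothetical names for this example, and the cheaper cases listed are illustrative rather than a description of what the backend's patterns currently cover.

```cpp
#include <cstdint>

// The four fields of a 64-bit immediate, in the conventional SPARC split.
struct ImmParts {
  std::uint32_t hh;  // bits 63..42 (22 bits)
  std::uint32_t hm;  // bits 41..32 (10 bits)
  std::uint32_t lm;  // bits 31..10 (22 bits)
  std::uint32_t lo;  // bits  9..0  (10 bits)
};

inline ImmParts split(std::uint64_t v) {
  return ImmParts{static_cast<std::uint32_t>(v >> 42),
                  static_cast<std::uint32_t>((v >> 32) & 0x3FF),
                  static_cast<std::uint32_t>((v >> 10) & 0x3FFFFF),
                  static_cast<std::uint32_t>(v & 0x3FF)};
}

// Emulates the six-step sequence; each statement stands for one instruction.
inline std::uint64_t materialize(std::uint64_t v) {
  const ImmParts p = split(v);
  std::uint64_t t = static_cast<std::uint64_t>(p.hh) << 10;  // sethi %hh(v), t
  t |= p.hm;                                                 // or t, %hm(v), t
  t <<= 32;                                                  // sllx t, 32, t
  std::uint64_t r = static_cast<std::uint64_t>(p.lm) << 10;  // sethi %hi(v), r
  r |= p.lo;                                                 // or r, %lo(v), r
  return t | r;                                              // or t, r, r
}

// Rough instruction-count estimate for a few cheaper cases (illustrative).
inline int estimate_instructions(std::uint64_t v) {
  const std::int64_t s = static_cast<std::int64_t>(v);
  if (s >= -4096 && s <= 4095) return 1;           // fits a signed 13-bit field
  if ((v >> 32) == 0) return (v & 0x3FF) ? 2 : 1;  // sethi, plus or if needed
  return 6;                                        // last-resort sequence
}
```

Values that fit a signed 13-bit field or an unsigned 32-bit value need far fewer than six instructions, which is the kind of opportunity the commit message alludes to.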

Add 64-bit shift instructions. · c1d1a481

Jakob Stoklund Olesen authored Apr 02, 2013

SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

llvm-svn: 178525

c1d1a481
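
The remark about 32-bit right shifts doubling as extensions can be illustrated with a small C++ model. It assumes the SPARC v9 behavior in which the 32-bit shifts operate on the low word and then fill the upper 32 bits with zeros (srl) or copies of bit 31 (sra); `srl32` and `sra32` are hypothetical helper names, and the casts assume a two's-complement host.

```cpp
#include <cstdint>

// Model of a 32-bit logical right shift on a 64-bit register:
// the low word is shifted and the upper 32 bits become zero.
inline std::uint64_t srl32(std::uint64_t r, unsigned amt) {
  const std::uint32_t low = static_cast<std::uint32_t>(r);
  return static_cast<std::uint64_t>(low >> (amt & 31u));
}

// Model of a 32-bit arithmetic right shift on a 64-bit register:
// the low word is shifted and the result is sign-extended to 64 bits.
// Assumes two's complement and an arithmetic >> on signed values.
inline std::uint64_t sra32(std::uint64_t r, unsigned amt) {
  const std::int32_t low =
      static_cast<std::int32_t>(static_cast<std::uint32_t>(r));
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(low >> (amt & 31u)));
}
```

With a shift amount of zero, `srl32` acts as a zero extension and `sra32` as a sign extension of the low 32 bits.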

Add predicates for distinguishing 32-bit and 64-bit modes. · 739d722e

Jakob Stoklund Olesen authored Apr 02, 2013

The 'sparc' architecture produces 32-bit code while 'sparcv9' produces
64-bit code.

It is also possible to run 32-bit code using SPARC v9 instructions with:

  llc -march=sparc -mattr=+v9

llvm-svn: 178524

739d722e

Add support for 64-bit calling convention. · 0b21f35a

Jakob Stoklund Olesen authored Apr 02, 2013

This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

llvm-svn: 178523

0b21f35a

Add an I64Regs register class for 64-bit registers. · 5ad3b353

Jakob Stoklund Olesen authored Apr 02, 2013

We are going to use the same registers for 32-bit and 64-bit values, but
in two different register classes. The I64Regs register class has a
larger spill size and alignment.

The addition of an i64 register class confuses TableGen's type
inference, so it is necessary to clarify the type of some immediates and
the G0 register.

In 64-bit mode, pointers are i64 and should use the I64Regs register
class. Implement getPointerRegClass() to dynamically provide the pointer
register class depending on the subtarget. Use ptr_rc and iPTR for
memory operands.

Finally, add the i64 type to the IntRegs register class. This register
class is not used to hold i64 values; I64Regs is for that. The type is
required to appease TableGen's type checking in output patterns like this:

  def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>;

SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and
TableGen doesn't know to check the type of register sub-classes.

llvm-svn: 178522

5ad3b353

Fix typo in PPCISelLowering · 93d75ea0

Hal Finkel authored Apr 02, 2013

Thanks to Bill Schmidt for finding this in review of r178480.

llvm-svn: 178521

93d75ea0

The divide unit is not pipelined, but it is still buffered. · e1d88cfb

Andrew Trick authored Apr 02, 2013

Buffered means a later divide may be executed out-of-order while a
prior divide is sitting (buffered) in a reservation station.

You can tell it's not pipelined, because operations that use it
reserve it for more than one cycle:

  def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> {
    let Latency = 25;
    let ResourceCycles = [1, 10];
  }

We don't currently distinguish between an unpipelined operation and one
that is split into multiple micro-ops requiring the same unit, except
that the latter may have NumMicroOps > 1 if it also consumes
issue/dispatch resources.

llvm-svn: 178519

e1d88cfb
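
A back-of-the-envelope model shows what reserving the divider for 10 cycles means for throughput. This is a deliberately simplified sketch, not the scheduler's actual algorithm: it assumes all operations are independent and ready at cycle 0, ignores the HWPort0 issue slot and the reservation-station buffering, and `last_completion_cycle` is a hypothetical helper.

```cpp
// Cycle at which the last of `n` independent, ready operations completes
// when each one occupies the unit for `resource_cycles` cycles and produces
// its result `latency` cycles after it starts.
inline int last_completion_cycle(int n, int latency, int resource_cycles) {
  if (n <= 0) return 0;
  // Operation k (0-based) cannot start before cycle k * resource_cycles.
  return (n - 1) * resource_cycles + latency;
}
```

Under these assumptions, four divides with Latency = 25 and the divider held for 10 cycles finish at cycle 55, whereas a fully pipelined unit (held for 1 cycle) would finish them at cycle 28.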

Target/R600: Fix CMake build to add missing files. · fd98f7f2
NAKAMURA Takumi authored Apr 01, 2013
```
llvm-svn: 178508
```
fd98f7f2

Apr 01, 2013
- R600: Add support for native control flow · bfaa63a6
  Vincent Lejeune authored Apr 01, 2013
```
llvm-svn: 178505
```
  bfaa63a6
- R600/SI: Share code recording ShaderTypeAttribute between generations · ace6f735
  Vincent Lejeune authored Apr 01, 2013
```
llvm-svn: 178504
```
  ace6f735
- R600: Emit CF_ALU and use true kcache register. · f43bc57b
  Vincent Lejeune authored Apr 01, 2013
```
llvm-svn: 178503
```
  f43bc57b
- Fix a bad assert in PPCTargetLowering · 3f88d089
  Hal Finkel authored Apr 01, 2013
```
llvm-svn: 178489
```
  3f88d089
- Add more PPC floating-point conversion instructions · f6d45f23
  Hal Finkel authored Apr 01, 2013
```
The P7 and A2 have additional floating-point conversion instructions which
allow a direct two-instruction sequence (plus load/store) to convert from all
combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores,
only some combinations were directly available).

llvm-svn: 178480
```
  f6d45f23