Commits · dfe09b1b5bce82effc956c6b9b27f10bba3da1bb · Roger Ferrer / llvm-epi-0.8

Feb 07, 2014

[Sparc] Use SparcMCExpr::VariantKind itself as MachineOperand's target flags. · dfe09b1b
Venkatraman Govindaraju authored Feb 07, 2014
```
llvm-svn: 200960
```

X86: Resolve a long standing FIXME and properly isel pextr[bw]. · e9008de6

Jim Grosbach authored Feb 07, 2014

Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957

Feb 06, 2014

Revert r200095 and r200152. It turns out when compiling with -arch armv7... · 91f205bf

Evan Cheng authored Feb 06, 2014

Revert r200095 and r200152. It turns out when compiling with -arch armv7 -mcpu=cortex-m3, the triple would still set iOS as the OS so the hack is still needed. rdar://15984891

llvm-svn: 200937

R600/SI: Add a MUBUF store pattern for Reg+Imm offsets · e2367945
Tom Stellard authored Feb 06, 2014
```
llvm-svn: 200935
```
R600/SI: Add a MUBUF store pattern for Imm offsets · 2937cbc0
Tom Stellard authored Feb 06, 2014
```
llvm-svn: 200934
```
R600/SI: Add a MUBUF load pattern for Reg+Imm offsets · 11624bc5
Tom Stellard authored Feb 06, 2014
```
llvm-svn: 200933
```

R600/SI: Use immediate offsets for SMRD instructions whenever possible · 044e418f

Tom Stellard authored Feb 06, 2014

There was a problem with the old pattern, so we were copying some
larger immediates into registers when we could have been encoding
them in the instruction.

llvm-svn: 200932

Remove const_cast for STI when parsing inline asm · ea2bcb9e

David Peixotto authored Feb 06, 2014

In a previous commit (r199818) we added a const_cast to an existing
subtarget info instead of creating a new one so that we could reuse
it when creating the TargetAsmParser for parsing inline assembly.
This cast was necessary because we needed to reuse the existing STI
to avoid generating incorrect code when the inline asm contained
mode-switching directives (e.g. .code 16).

The root cause of the failure was that there was an implicit sharing
of the STI between the parser and the MCCodeEmitter. To fix a
different but related issue, we now explicitly pass the STI to the
MCCodeEmitter (see commits r200345-r200351).

The const_cast is no longer necessary and we can now create a fresh
STI for the inline asm parser to use.

Differential Revision: http://llvm-reviews.chandlerc.com/D2709

llvm-svn: 200929

X86: add costs for 64-bit vector ext/trunc & rebalance · f0e21616

Tim Northover authored Feb 06, 2014

The most important part of this is probably adding any cost at all for
operations like zext <8 x i8> to <8 x i32>. Before they were being
recorded as extremely costly (24, I believe) which made LLVM fall back
on a 4-wide vectorisation of a loop.

It also rebalances the values for sext, zext and trunc. Lacking any
other sane metric that might work across CPU microarchitectures I went
for instructions. This seems to be in reasonable accord with the rest
of the table (sitofp, ...) though no doubt at least one value is
sub-optimal for some bizarre reason.

Finally, separate AVX and AVX2 values are provided where appropriate.
The CodeGen is quite different in many cases.

rdar://problem/15981990

llvm-svn: 200928

X86: deduplicate V[SZ]EXT_MOVL and V[SZ]EXT nodes · 546b57b0

Tim Northover authored Feb 06, 2014

I believe VZEXT_MOVL means "zero all vector elements except the first" (and
should have identical input & output types) whereas VZEXT means "zero extend
each element of a vector (discarding higher elements if necessary)".

For example:
    (v4i32 (vzext (v16i8 ...)))

should zero extend the low 4 bytes of the incoming vector to 32-bits,
discarding higher bytes.

However, somewhere in the past, these two concepts had become confused, even
leading to a nonsensical VSEXT_MOVL.

This re-merges the nodes where appropriate (all VSEXT_MOVL -> VSEXT, VZEXT_MOVL
-> VZEXT when it's an actual extension).
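
The distinction drawn above can be made concrete with a scalar model. This is only a sketch of the semantics as the message describes them, not backend code; the function names `vzextV16i8ToV4i32` and `vzextMovlV4i32` are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// VZEXT as described above: zero extend the low 4 bytes of a 16-byte vector
// to 32 bits each, discarding the 12 higher bytes.
std::array<std::uint32_t, 4>
vzextV16i8ToV4i32(const std::array<std::uint8_t, 16> &In) {
  std::array<std::uint32_t, 4> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = In[I];
  return Out;
}

// VZEXT_MOVL as described above: identical input and output types; element 0
// is kept and every other element is zeroed.
std::array<std::uint32_t, 4>
vzextMovlV4i32(const std::array<std::uint32_t, 4> &In) {
  std::array<std::uint32_t, 4> Out{};
  Out[0] = In[0];
  return Out;
}
```

In this model the first function changes the element width, while the second never does, which is why a sign-extending "MOVL" variant has no sensible meaning.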

rdar://problem/15981990

llvm-svn: 200918

Update the X86 assembler for .intel_syntax to accept the << and >> bitwise operators. · d6b10713
Kevin Enderby authored Feb 06, 2014
```
rdar://15975725

llvm-svn: 200896
```

don't set HasReliableSymbolDifference for ELF. · 6a383f9a

Rafael Espindola authored Feb 06, 2014

It is only used in MachObjectWriter.cpp. Another leftover from early days
of ELF in MC.

llvm-svn: 200895

doesSectionRequireSymbols is meaningless on ELF, remove. · 12f04984

Rafael Espindola authored Feb 06, 2014

This is a nop. doesSectionRequireSymbols is only used from
isSymbolLinkerVisible, and the only use of isSymbolLinkerVisible from ELF was in

if (!Asm.isSymbolLinkerVisible(Symbol) && !Symbol.isUndefined())
  return false;

if (Symbol.isTemporary())
  return false;

If the symbol is a temporary this code returns false and it is irrelevant if
we take the first if or not. If the symbol is not a temporary,
Asm.isSymbolLinkerVisible returns true without ever calling
doesSectionRequireSymbols.
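
The argument above can be checked exhaustively with a small model. This is a sketch, not the MC code: `Sym`, `linkerVisible`, `keptWithHook` and `keptWithoutHook` are hypothetical names, and the model assumes that a temporary symbol's visibility is simply whatever the section hook reports.

```cpp
// Simplified, hypothetical model of the ELF check quoted above.
struct Sym {
  bool Temporary;
  bool Undefined;
};

// Stand-in for Asm.isSymbolLinkerVisible: non-temporary symbols are always
// visible; only temporaries consult the section hook (modelled as a flag).
bool linkerVisible(const Sym &S, bool SectionRequiresSymbols) {
  if (!S.Temporary)
    return true;
  return SectionRequiresSymbols;
}

// The original pair of checks. Returns true if the symbol passes both.
bool keptWithHook(const Sym &S, bool SectionRequiresSymbols) {
  if (!linkerVisible(S, SectionRequiresSymbols) && !S.Undefined)
    return false;
  if (S.Temporary)
    return false;
  return true;
}

// The same logic with the first check removed.
bool keptWithoutHook(const Sym &S) { return !S.Temporary; }

// Compares both versions over all 8 combinations of the three flags.
bool hookIsIrrelevant() {
  for (int Bits = 0; Bits < 8; ++Bits) {
    Sym S{(Bits & 1) != 0, (Bits & 2) != 0};
    bool Hook = (Bits & 4) != 0;
    if (keptWithHook(S, Hook) != keptWithoutHook(S))
      return false;
  }
  return true;
}
```

Under these assumptions `hookIsIrrelevant()` returns true, matching the claim that the hook's answer never changes the outcome.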

This was a horrible leftover from when support for ELF was first added.

llvm-svn: 200894

Just returning false is the default. · 4998280f
Rafael Espindola authored Feb 06, 2014
```
llvm-svn: 200890
```
Add address space argument to allowsUnalignedMemoryAccess. · 25793a3f
Matt Arsenault authored Feb 05, 2014
```
On R600, some address spaces have more strict alignment
requirements than others.

llvm-svn: 200887
```

Feb 05, 2014

Remove support for not using .loc directives. · b4eec1da
Rafael Espindola authored Feb 05, 2014
```
Clang itself was not using this. The only way to access it was via llc.

llvm-svn: 200862
```

[mips] Add NaCl target and forbid indexed loads and stores for it · 9725016a

Petar Jovanovic authored Feb 05, 2014

This patch adds NaCl target for Mips. It also forbids indexed loads and
stores if the target is NaCl.

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2690

llvm-svn: 200855

AVX-512: optimized icmp -> sext -> icmp pattern · 0b79be8a
Elena Demikhovsky authored Feb 05, 2014
```
llvm-svn: 200849
```

ARM: Resolve thumb_bl fixup in same MCFragment. · d5c48aa3

Logan Chien authored Feb 05, 2014

In Thumb1 mode, bl instruction might be selected for branches between
basic blocks in the function if the offset is greater than 2KB.
However, this might cause SEGV because the destination symbol
is not marked as thumb function and the execution mode will be reset
to ARM mode.

Since we are sure that these symbols are in the same data fragment, we
can simply resolve these local symbols, and don't emit any relocation
information for this bl instruction.

llvm-svn: 200842

AVX-512: fixed a bug in EVEX encoding (the bug appeared after r200624) · a38114c4
Elena Demikhovsky authored Feb 05, 2014
```
llvm-svn: 200837
```

R600/SI: Add pattern for zero-extending i1 to i32 · 5d26fdfc

Michel Danzer authored Feb 05, 2014

Fixes opencl-example if_* tests with radeonsi.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200830

ARM: Enable use of relocation type tlsldo in debug info for tls data. · 382c1405
Kai Nacke authored Feb 05, 2014
```
This fixes PR18554.

Reviewers: Renato Golin, Keith Walker
llvm-svn: 200826
```

Move matching for x86 BMI BLSI/BLSMSK/BLSR instructions to isel patterns... · 7ee16384

Craig Topper authored Feb 05, 2014

Move matching for x86 BMI BLSI/BLSMSK/BLSR instructions to isel patterns instead of DAG combine. This weakens the ability to fold loads with them because we aren't able to match patterns that load the same thing twice. But maybe we should fix that if we care. The peephole optimizer will be able to fold some loads in its absence.

llvm-svn: 200824

AVX-512: Added intrinsic for cvtph2ps. · a30e4376

Elena Demikhovsky authored Feb 05, 2014

Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).

llvm-svn: 200823

Feb 04, 2014

SimplifyLibCalls: Push TLI through the exp2->ldexp transform. · 34f460ed
Benjamin Kramer authored Feb 04, 2014
```
For the odd case of platforms with exp2 available but not ldexp.

llvm-svn: 200795
```
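
For context, the transform named in the title rests on the identity exp2(n) = ldexp(1.0, n) for an integer n. The sketch below only illustrates that identity in scalar C++; `viaExp2` and `viaLdexp` are hypothetical helper names and say nothing about how the pass itself is written.

```cpp
#include <cmath>

// exp2 applied to an integer that has been converted to double...
double viaExp2(int N) { return std::exp2(static_cast<double>(N)); }

// ...computes the same value as scaling 1.0 by 2^N with ldexp.
double viaLdexp(int N) { return std::ldexp(1.0, N); }
```

The rewrite is only useful when the target actually provides ldexp, which is the case the commit message calls out.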

[X86] Only 213 FMA3 variants should be marked commutable. · 3303a339

Lang Hames authored Feb 04, 2014

Commuting the 231 and 132 variants would swap addends and
multiplicands/multipliers, which isn't valid.
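
A scalar model shows why only the 213 form tolerates the swap. This is a sketch under two assumptions: "commuting" means exchanging operands 1 and 2, and the digits in the mnemonic give the operand roles as modelled below. The real instructions are fused (single rounding), which this plain arithmetic ignores.

```cpp
// Operand roles of the three FMA3 forms, following the digit order in the
// mnemonic (operand 1 is also the destination).
double fma132(double Op1, double Op2, double Op3) { return Op1 * Op3 + Op2; }
double fma213(double Op1, double Op2, double Op3) { return Op2 * Op1 + Op3; }
double fma231(double Op1, double Op2, double Op3) { return Op2 * Op3 + Op1; }
```

With operands (2, 3, 5), `fma213` yields 11 whether or not the first two are exchanged, because they are the two multiplication inputs. `fma231` yields 17 before the swap and 13 after, and `fma132` yields 13 before and 17 after, since in both cases the swap trades an addend for a multiplication input.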

I'm still trying to reduce a decent test case for this.

llvm-svn: 200792

cleanup: scc_iterator consumers should use isAtEnd · 8e661efc

Duncan P. N. Exon Smith authored Feb 04, 2014

No functional change.  Updated loops from:

    for (I = scc_begin(), E = scc_end(); I != E; ++I)

to:

    for (I = scc_begin(); !I.isAtEnd(); ++I)

for teh win.

llvm-svn: 200789

[mips] Implement %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions · a5da588b

Petar Jovanovic authored Feb 04, 2014

Patch implements %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions for MIPS
by creating target expression class MipsMCExpr.
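
As a rough illustration of what these operators compute, the sketch below applies the usual MIPS %hi/%lo arithmetic to a symbol difference. It is not taken from the patch; the helper names and the addresses in the usage note are hypothetical.

```cpp
#include <cstdint>

// %lo(x): the low 16 bits of x.
std::uint32_t lo16(std::uint32_t X) { return X & 0xFFFFu; }

// %hi(x): the high 16 bits, adjusted by 0x8000 because the low half is
// sign-extended when it is added back.
std::uint32_t hi16(std::uint32_t X) { return ((X + 0x8000u) >> 16) & 0xFFFFu; }

// Rebuilds the value the way a lui/addiu pair would: (hi << 16) + sext(lo).
std::uint32_t rebuild(std::uint32_t Hi, std::uint32_t Lo) {
  std::uint32_t SignExtLo = (Lo & 0x8000u) ? (Lo | 0xFFFF0000u) : Lo;
  return (Hi << 16) + SignExtLo;
}

// %hi(sym1 - sym2) and %lo(sym1 - sym2) apply the operators to this value.
std::uint32_t symDiff(std::uint32_t Sym1, std::uint32_t Sym2) {
  return Sym1 - Sym2;
}
```

For made-up addresses 0x00409ABC and 0x00400010 the difference is 0x9AAC, so `lo16` gives 0x9AAC and `hi16` gives 1 rather than 0: the extra 1 cancels the sign extension of the low half when the two parts are recombined.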

Patch by Sasa Stankovic.

Differential Revision: http://llvm-reviews.chandlerc.com/D2592

llvm-svn: 200783

Every target uses .align. Simplify. · 7cbbd28c
Rafael Espindola authored Feb 04, 2014
```
llvm-svn: 200782
```
Use the default values. · 7b514969
Rafael Espindola authored Feb 04, 2014
```
llvm-svn: 200781
```

Fix PR18345: ldr= pseudo instruction produces incorrect code when used in inline assembly · b9b7362c

David Peixotto authored Feb 04, 2014

This patch fixes the ldr-pseudo implementation to work when used in
inline assembly.  The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.

Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.
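
The ownership change described above can be pictured with a toy model. Everything here is hypothetical (`Streamer`, `Parser`, `parseLdrPseudo`); it only sketches the lifetime argument and does not mirror the real ARMAsmParser or ARMTargetStreamer interfaces.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the target streamer: one instance lives for the
// whole code generation run and owns the constant pool.
struct Streamer {
  std::vector<std::uint32_t> ConstantPool;
};

// Hypothetical stand-in for the asm parser: a fresh one is built for each
// blob of inline assembly, so it only borrows the streamer.
struct Parser {
  explicit Parser(Streamer &S) : Out(S) {}
  // Models `ldr rX, =Value`: records the constant in the streamer's pool.
  void parseLdrPseudo(std::uint32_t Value) {
    Out.ConstantPool.push_back(Value);
  }
  Streamer &Out;
};

// Parses two separate blobs, each with its own short-lived parser.
std::size_t poolSizeAfterTwoBlobs() {
  Streamer S;
  {
    Parser P(S);
    P.parseLdrPseudo(0x12345678u);
  }
  {
    Parser P(S);
    P.parseLdrPseudo(0xCAFEF00Du);
  }
  return S.ConstantPool.size();
}
```

Both constants survive their parsers because the pool belongs to the longer-lived object. Had the pool been a member of `Parser`, each entry would have been destroyed with the parser that created it.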

An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.

Differential Revision: http://llvm-reviews.chandlerc.com/D2638

llvm-svn: 200777

R600/SI: Expand i1 BR_CC · aeb45643

Tom Stellard authored Feb 04, 2014

This fixes crashes in the OpenCV test suite and also the scrypt
kernel in bfgminer.

I was unable to come up with a reduced test case for this.

https://bugs.freedesktop.org/show_bug.cgi?id=72785

llvm-svn: 200776

R600/SI: Don't assume copies will be coalesced in SIFixSGPRCopies · b8725d84

Tom Stellard authored Feb 04, 2014

There is no lit test for this, because it would be too big and
complicated, but it does fix a crash in the Arithm/Absdiff.* OpenCV test.

llvm-svn: 200775

R600/SI: Custom lower i64 ISD::SELECT · 0ec134f3
Tom Stellard authored Feb 04, 2014
```
llvm-svn: 200774
```

R600: Enable vector fpow. · bfebd1fc

Tom Stellard authored Feb 04, 2014

The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."

Patch by: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 200773

OS X: the correct function is __sincospif_stret, not __sincospi_stretf · 103e648d
Tim Northover authored Feb 04, 2014
```
rdar://problem/13729466

llvm-svn: 200771
```

ARM & AArch64: merge NEON absolute compare intrinsics · fdbdb4b6

Tim Northover authored Feb 04, 2014

There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768

ARM: fix fast-isel assertion failure · e42fb076

Tim Northover authored Feb 04, 2014

Missing braces on if meant we inserted both ARM and Thumb load for a litpool
entry. This didn't end well.

rdar://problem/15959157

llvm-svn: 200752

R600/SI: Fix fneg for 0.0 · 624b02aa

Michel Danzer authored Feb 04, 2014

V_ADD_F32 with source modifier does not produce -0.0 for this. Just
manipulate the sign bit directly instead.
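
The zero case can be reproduced with ordinary IEEE 754 arithmetic. The sketch below assumes the add-based negation behaves like `0.0 + (-x)` under round-to-nearest, which is an assumption about the old pattern rather than something stated in the message; `fnegViaAdd` and `fnegViaSignBit` are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Negation expressed as an addition: 0.0 + (-x). For x == +0.0 the sum of
// +0.0 and -0.0 is +0.0 under round-to-nearest, so the sign is lost.
float fnegViaAdd(float X) { return 0.0f + (-X); }

// Negation by flipping the sign bit directly; this works for every input,
// including both zeros.
float fnegViaSignBit(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);
  Bits ^= 0x80000000u;
  std::memcpy(&X, &Bits, sizeof X);
  return X;
}
```

For nonzero inputs the two functions agree. They differ only at zero, where the sign-bit version returns -0.0 and the add-based one returns +0.0.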

Also add a pattern for (fneg (fabs ...)).

Fixes a bunch of bit encoding piglit tests with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200743

Revert: ARM: Enable use of relocation type tlsldo in debug info for tls data. · ab7ee461
Kai Nacke authored Feb 04, 2014
```
There seems to be a new problem with the debug info in the test case.
I'll have to investigate this.

llvm-svn: 200737
```