Commits · dfc277f35012995d5b8052685c1965b6ca001d6e · Roger Ferrer / llvm-epi-0.8

Oct 16, 2013

Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar, · dfc277f3

Yunzhong Gao authored Oct 16, 2013

bulldozer and piledriver. Support for the instruction itself seems to have
already been added in r178040.

Differential Revision: http://llvm-reviews.chandlerc.com/D1933

llvm-svn: 192828

dfc277f3

R600: Fix a crash in the AMDILCFGStructurizer · b34186ae

Tom Stellard authored Oct 16, 2013

We were calling llvm_unreachable() when failing to optimize the
branch into an if case. However, it is still possible for us
to structurize the CFG by duplicating blocks even if this optimization
fails.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 192813

b34186ae

R600: Remove some dead code from the AMDILCFGStructurizer · 69f86d19
Tom Stellard authored Oct 16, 2013
```
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 192812
```
69f86d19
Fix comment. · f2b25455
Chad Rosier authored Oct 16, 2013
```
llvm-svn: 192805
```
f2b25455

Assert on duplicate registration. Don't depend on function pointer equality. · 40a3d018

Rafael Espindola authored Oct 16, 2013

Before this patch we would assert when building llvm as multiple shared
libraries (cmake's BUILD_SHARED_LIBS). The problem was the line

if (T.AsmStreamerCtorFn == Target::createDefaultAsmStreamer)

which returns false because of -fvisibility-inlines-hidden. It is easy
to fix just this one case, but I decided to try to also make the
registration more strict. It looks like the old logic for ignoring
followup registration was just a temporary hack that outlived its
usefulness.

This patch converts the ifs to asserts, fixes the few cases that were
registering twice and makes sure all the asserts compare with null.
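
The stricter registration described above can be sketched as follows. This is a minimal illustration, not the actual TargetRegistry code: `AsmStreamer`, `AsmStreamerCtorTy`, `registerAsmStreamer`, `createAsmStreamer` and `createCustomAsmStreamer` are hypothetical stand-ins, and falling back to a default at lookup time is an assumption about one way to avoid comparing function pointers.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real registry types (illustration only).
struct AsmStreamer {};
using AsmStreamerCtorTy = AsmStreamer *(*)();

struct Target {
  // Null means "nothing registered yet"; no default function is stored here,
  // so registration never needs to compare against a function's address.
  AsmStreamerCtorTy AsmStreamerCtorFn = nullptr;
};

inline AsmStreamer *createDefaultAsmStreamer() {
  static AsmStreamer Default;
  return &Default;
}

inline AsmStreamer *createCustomAsmStreamer() {
  static AsmStreamer Custom;
  return &Custom;
}

// Strict registration: a second registration is treated as a bug and
// asserts, instead of being silently ignored.
inline void registerAsmStreamer(Target &T, AsmStreamerCtorTy Fn) {
  assert(!T.AsmStreamerCtorFn && "AsmStreamer constructor already registered");
  T.AsmStreamerCtorFn = Fn;
}

// The default is chosen when the streamer is created, by a null check only.
inline AsmStreamer *createAsmStreamer(const Target &T) {
  if (!T.AsmStreamerCtorFn)
    return createDefaultAsmStreamer();
  return T.AsmStreamerCtorFn();
}
```

Because the only comparison left is against null, the check behaves the same whether or not each shared library ends up with its own copy of an inline function.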

Thanks to Joerg for reporting the problem and reviewing the patch.

llvm-svn: 192803

40a3d018

[AArch64] Add support for NEON scalar signed saturating accumulate of unsigned · 178b1cef
Chad Rosier authored Oct 16, 2013
```
value and unsigned saturating accumulate of signed value instructions.

llvm-svn: 192800
```
178b1cef

[SystemZ] Handle extensions in RxSBG optimizations · 3e382972

Richard Sandiford authored Oct 16, 2013

The input to an RxSBG operation can be narrower as long as the upper bits
are don't-care bits. This fixes a FIXME added in r192783.

llvm-svn: 192790

3e382972

[SystemZ] Improve handling of SETCC · f722a8e3

Richard Sandiford authored Oct 16, 2013

We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI".  In most cases it's better to use an IPM-based
sequence instead.

llvm-svn: 192784

f722a8e3

Add a MCAsmInfoELF class and factor some code into it. · 43c4e24f
Rafael Espindola authored Oct 16, 2013
```
We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before.

llvm-svn: 192760
```
43c4e24f

Move .ident handling to MCStreamer. · 5645bade

Rafael Espindola authored Oct 16, 2013

No functionality change, but exposes the API so that codegen can use it too.

Patch by Katya Romanova.

llvm-svn: 192757

5645bade

Fix typo · 22658065
Matt Arsenault authored Oct 15, 2013
```
llvm-svn: 192752
```
22658065
Fix missing C++ mode thing in header · df90c02e
Matt Arsenault authored Oct 15, 2013
```
llvm-svn: 192751
```
df90c02e

Enable MI Sched for x86. · e97d8d6d

Andrew Trick authored Oct 15, 2013

This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed, saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

llvm-svn: 192750

e97d8d6d

R600/SI: Remove some leftover MI dump call · 5d6c2c31
Vincent Lejeune authored Oct 15, 2013
```
llvm-svn: 192743
```
5d6c2c31

Oct 15, 2013

[AArch64] Add support for NEON scalar signed saturating absolute value and · 9d517086
Chad Rosier authored Oct 15, 2013
```
scalar signed saturating negate instructions.

llvm-svn: 192733
```
9d517086
Struct byval: fix a copy-paste error for thumb2. · fd956dba
Manman Ren authored Oct 15, 2013
```
PR17309

llvm-svn: 192730
```
fd956dba

Fix PR17546 · ad71659d

Michael Liao authored Oct 15, 2013

- The type of the index used in extract_vector_elt or insert_vector_elt
  is supposed to be TLI.getVectorIdxTy(), which is a pointer type on most
  targets. It is better to truncate it (or zero-extend it, in case that
  changes later) to the mask element type to guarantee that they match,
  instead of asserting that they do.

llvm-svn: 192722

ad71659d

Fix PR16807 · 8ba06821

Michael Liao authored Oct 15, 2013

- Lower signed division by constant powers-of-2 to target-independent
  DAG operators instead of target-dependent ones to support them better
  on targets where vector types are legal but shift operators on those
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.
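
For reference, a signed division by a power of two can be written with only generic shift and add operations. The scalar sketch below models one 16-bit lane and is an illustration under stated assumptions, not the code from the patch: `sra16` and `sdivPow2` are hypothetical names, and the SRA/SRL/ADD sequence shown is the textbook expansion, which may differ in detail from what the DAG lowering emits.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on a 16-bit lane, written with unsigned
// operations so the result does not depend on how the compiler treats
// right shifts of negative values. Models a generic SRA node.
// (Conversions back to int16_t assume the usual two's-complement wrap.)
inline int16_t sra16(int16_t v, unsigned n) {
  uint16_t shifted = static_cast<uint16_t>(static_cast<uint16_t>(v) >> n);
  if (v < 0)
    shifted = static_cast<uint16_t>(shifted | ~(0xFFFFu >> n));
  return static_cast<int16_t>(shifted);
}

// x / 2^k with rounding toward zero, for 1 <= k <= 14.
inline int16_t sdivPow2(int16_t x, unsigned k) {
  assert(k >= 1 && k <= 14 && "divisor must be a positive power of two");
  // SRA: all ones if x is negative, zero otherwise.
  int16_t sign = sra16(x, 15);
  // SRL: 2^k - 1 if x is negative, zero otherwise.
  uint16_t bias =
      static_cast<uint16_t>(static_cast<uint16_t>(sign) >> (16 - k));
  // ADD: bias negative inputs so the final shift rounds toward zero.
  int16_t sum = static_cast<int16_t>(
      static_cast<uint16_t>(static_cast<uint16_t>(x) + bias));
  // SRA: the actual division.
  return sra16(sum, k);
}
```

Without the bias step, an arithmetic shift alone rounds toward negative infinity, so for example -7 / 2 would come out as -4 instead of -3.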

llvm-svn: 192721

8ba06821

[mips][msa] Added support for build_vector for v4f32 and v2f64. · 1dfddc73
Daniel Sanders authored Oct 15, 2013
```
llvm-svn: 192699
```
1dfddc73
Revert "Add AllTargetsBindings sublibrary" as it breaks cmake build on... · 0c3b6539
Anders Waldenborg authored Oct 15, 2013
```
Revert "Add AllTargetsBindings sublibrary" as it breaks cmake build on (at least) windows and darwin.

llvm-svn: 192697
```
0c3b6539

Add AllTargetsBindings sublibrary instead of having static inlines in the llvm-c headers. · 1d9cb434

Anders Waldenborg authored Oct 15, 2013

This new library will be linked in when using the "all-targets"
component and contains the LLVMInitializeAll* functions.

This means that those functions will exist as real symbols in
the shared library, and can therefore be called from
bindings that are using ffi on the shared library.

llvm-svn: 192690

1d9cb434

[SystemZ] Use A(G)SI when spilling the target of a constant addition · 6af6ff1e
Richard Sandiford authored Oct 15, 2013
```
llvm-svn: 192681
```
6af6ff1e
Fix MSP430 calling convention to match MSPGCC · e9a1d4c2
Job Noorman authored Oct 15, 2013
```
llvm-svn: 192678
```
e9a1d4c2

Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from... · ef9e993e

Craig Topper authored Oct 15, 2013

Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.
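
The equivalence behind the upgrade can be modelled in software. The sketch below is an assumption-laden illustration, not the AutoUpgrade code: `crc32_32_8` and `crc32_64_8` are plain C++ functions named after the intrinsics, and the byte step is a bitwise CRC32C update (reflected polynomial 0x82F63B78) standing in for the hardware instruction.

```cpp
#include <cstdint>

// Software model of one CRC32C byte step, standing in for the
// 32-bit form of the intrinsic (hypothetical helper, not the LLVM API).
inline uint32_t crc32_32_8(uint32_t crc, uint8_t data) {
  crc ^= data;
  for (int i = 0; i < 8; ++i)
    crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  return crc;
}

// The 64-bit form expressed through the 32-bit one: truncate the
// accumulator, run the 32-bit step, then zero-extend the result.
inline uint64_t crc32_64_8(uint64_t crc, uint8_t data) {
  return static_cast<uint64_t>(crc32_32_8(static_cast<uint32_t>(crc), data));
}
```

In this model the upper 32 bits of the 64-bit accumulator never influence the result, which is why the trunc plus zext rewrite loses nothing.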

llvm-svn: 192672

ef9e993e

[mips] Define a pseudo instruction which writes to both the lower and higher · 06aff571
Akira Hatanaka authored Oct 15, 2013
```
parts of the accumulators and gets expanded post-RA.

llvm-svn: 192667
```
06aff571
[mips] Use predicates to guard instructions using accumulator registers instead · ec67c902
Akira Hatanaka authored Oct 15, 2013
```
of relying on AddedComplexity.

llvm-svn: 192665
```
ec67c902
[mips] Rename isel nodes. · d98c99fd
Akira Hatanaka authored Oct 15, 2013
```
llvm-svn: 192663
```
d98c99fd
[mips] Transfer kill flag to the newly created operand. · 86c3c794
Akira Hatanaka authored Oct 15, 2013
```
llvm-svn: 192662
```
86c3c794
[mips] Set HI/LO registers' HWEncoding field. · 8368b3b3
Akira Hatanaka authored Oct 15, 2013
```
llvm-svn: 192661
```
8368b3b3
[mips] Delete unnecessary code. · 8f31b2fd
Akira Hatanaka authored Oct 15, 2013
```
llvm-svn: 192660
```
8f31b2fd

[X86][FastISel] During X86 fastisel, the address of indirect call was resolved · 778dba1d

Quentin Colombet authored Oct 14, 2013

through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live across basic blocks
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636

778dba1d

Fix the ExecutionDepsFix pass to handle AVX instructions. · b6d56be6

Andrew Trick authored Oct 14, 2013

This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs; it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.

Fixing this requires conservative liveness analysis. This is awkward
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.

I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction, the out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.

llvm-svn: 192635

b6d56be6

whitespace · 8460a3bf
Andrew Trick authored Oct 14, 2013
```
llvm-svn: 192633
```
8460a3bf

Oct 14, 2013

Revert part of a fix from 2010, changes since then: · 74002574

Eric Christopher authored Oct 14, 2013

a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a FIXME to the test case noting that we should improve
the generated code; it should look something like what is documented
in the TLS ABI document.

llvm-svn: 192631

74002574

Reformat this routine slightly. · 755711e5
Eric Christopher authored Oct 14, 2013
```
llvm-svn: 192630
```
755711e5
Remove some extraneous whitespace. · 584d71c6
Eric Christopher authored Oct 14, 2013
```
llvm-svn: 192629
```
584d71c6
[AArch64] Add support for NEON scalar integer compare instructions. · d1f40d76
Chad Rosier authored Oct 14, 2013
```
llvm-svn: 192596
```
d1f40d76
Add Cortex-A57 support · 53169762
Bernard Ogden authored Oct 14, 2013
```
llvm-svn: 192591
```
53169762

Add subtarget feature support for Cortex-A53 · 4400cde8

Bernard Ogden authored Oct 14, 2013

Some previous implicit defaults have changed, for example FP and NEON
are now on by default.

llvm-svn: 192590

4400cde8

[mips][msa] Direct Object Emission support for BIT instructions. · 2102188c

Matheus Almeida authored Oct 14, 2013

List of instructions:
bclri.{b,h,w,d}
binsli.{b,h,w,d}
binsri.{b,h,w,d}
bnegi.{b,h,w,d}
bseti.{b,h,w,d}
sat_s.{b,h,w,d}
sat_u.{b,h,w,d}
slli.{b,h,w,d}
srai.{b,h,w,d}
srari.{b,h,w,d}
srli.{b,h,w,d}
srlri.{b,h,w,d}

llvm-svn: 192589

2102188c