- Jul 25, 2014
-
Mark Heffernan authored
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with llvm.loop.unroll.disable metadata to prevent any subsequent unrolling passes from unrolling more than the hint indicates. This patch fixes an issue where loop unrolling could be disabled for other loops as well which share the same llvm.loop metadata. llvm-svn: 213900
-
Joerg Sonnenberger authored
llvm-svn: 213899
-
Chandler Carruth authored
which have successfully round-tripped through the combine phase, and use this to ensure all operands to DAG nodes are visited by the combiner, even if they are only added during the combine phase. This is critical for the combiner to reach nodes that are *introduced* during combining. Previously these would sometimes be visited and sometimes not, depending on whether they happened to end up on the worklist. Now we always run them through the combiner.

This fixes quite a few bad codegen test cases lurking in the suite while also being more principled. Among these, the TLS code generation is particularly exciting for programs that have this in the critical path, like TSan-instrumented binaries (although I think they engineer to use a different TLS that is faster anyway).

I've tried to check for compile-time regressions here by running llc over a merged (but not LTO-ed) clang bitcode file and observed at most a 3% slowdown in llc. Given that this is essentially a worst case (none of opt or clang is running at this phase), I think this is tolerable. The actual LTO case should be even less costly, and the cost in normal compilation should be negligible.

With this combining logic, it is possible to re-legalize as we combine, which is necessary to implement PSHUFB formation on x86 as a post-legalize DAG combine (my ultimate goal).

Differential Revision: http://reviews.llvm.org/D4638 llvm-svn: 213898
-
Chandler Carruth authored
vector operation legalization with support for custom target lowering and fallback to expand when it fails, and use this to implement sext and anyext load lowering for x86 in a more principled way.

Previously, the x86 backend relied on a target DAG combine to "combine away" sextload and extload nodes prior to legalization, or would expand them during legalization with terrible code. This is particularly problematic because the DAG combine relies on running over non-canonical DAG nodes at just the right time to match several common and important patterns. It used a combine rather than lowering because we didn't have good lowering support, and to expose some tricks being employed to more combine phases. With this change it becomes a proper lowering operation, the backend marks that it can lower these nodes, and I've added support for handling the canonical forms that don't have direct legal representations, such as sextload of a v4i8 -> v4i64 on AVX1.

With this change, our test cases for this behavior continue to pass even after the DAG combiner begins running more systematically over every node. There is some noise caused by this in the test suite where we actually use vector extends instead of subregister extraction. This doesn't really seem like the right thing to do, but is unlikely to be a critical regression. We do regress in one case where, by lowering to the target-specific patterns early, we were able to combine away extraneous legal math nodes. However, this regression is completely addressed by switching to a widening-based legalization, which is what I'm working toward anyway, so I've just switched the test to that mode.

Differential Revision: http://reviews.llvm.org/D4654 llvm-svn: 213897
-
Saleem Abdulrasool authored
The Microsoft ABI and MSVCRT are considered the canonical C runtime and ABI. The long double routines are not part of this environment. However, cygwin and MinGW both provide supplementary implementations. Change the condition to reflect this reality. llvm-svn: 213896
-
- Jul 24, 2014
-
Manman Ren authored
llvm-svn: 213895
-
Hans Wennborg authored
Even if there's a file called c:\a, we want /? to be preserved as an option, not expanded to a filename. llvm-svn: 213894
-
Rafael Espindola authored
We don't support loading COFF files yet. llvm-svn: 213893
-
Lang Hames authored
This patch minimizes the number of nops that must be emitted on X86 to satisfy stackmap shadow constraints.

To minimize the number of nops inserted, the X86AsmPrinter now records the size of the most recent stackmap's shadow in the StackMapShadowTracker class, and tracks the number of instruction bytes emitted since that stackmap instruction was encountered. Padding is emitted (if it is required at all) immediately before the next stackmap/patchpoint instruction, or at the end of the basic block.

This optimization should reduce code size and improve performance for people using the llvm stackmap intrinsic on X86. <rdar://problem/14959522> llvm-svn: 213892
-
Reid Kleckner authored
Frontends are responsible for putting inalloca on parameters that would be passed in memory and not registers. llvm-svn: 213891
-
Joerg Sonnenberger authored
llvm-svn: 213890
-
Eric Fiselier authored
Summary: The polymorphic allocator implementation would greatly benefit by defining virtual functions in the dynlib instead of inline. In order to do that some types are going to have to be available outside of c++1y. This is the first step.

Reviewers: mclow.lists, EricWF

Reviewed By: EricWF

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4554

llvm-svn: 213889
-
Eric Fiselier authored
llvm-svn: 213888
-
Eric Fiselier authored
llvm-svn: 213887
-
Reid Kleckner authored
llvm-svn: 213886
-
Mark Heffernan authored
llvm-svn: 213885
-
Manman Ren authored
llvm-svn: 213884
-
Saleem Abdulrasool authored
This target is identical to Windows MSVC (and follows the Microsoft ABI for C). Correct the library call setup for this target. The same set of library calls is missing in this environment. llvm-svn: 213883
-
Matt Arsenault authored
llvm-svn: 213882
-
Rafael Espindola authored
llvm-svn: 213881
-
Manman Ren authored
llvm-svn: 213880
-
Rafael Espindola authored
llvm-svn: 213879
-
Saleem Abdulrasool authored
GCC 4.8 detected a signed compare [-Wsign-compare]. Add a cast for the destination index. Add an assert to catch a potential overflow however unlikely it may be. llvm-svn: 213878
-
Matt Arsenault authored
These will be used in future patches and shouldn't change anything yet. llvm-svn: 213877
-
Nico Weber authored
llvm-svn: 213876
-
Jim Ingham authored
We were turning off all these tests on OSX and FreeBSD because of a known (and fairly unimportant) bug. Keep a test for that bug, but let the useful parts of the test run anyway. llvm-svn: 213875
-
Nico Weber authored
llvm-svn: 213874
-
Nico Weber authored
llvm-svn: 213873
-
Joerg Sonnenberger authored
llvm-svn: 213872
-
Rafael Espindola authored
Every user has been switched to using EngineBuilder. llvm-svn: 213871
-
Johannes Doerfert authored
Use the fact that if we visit a for node first in pre-order and next in post-order, we know we did not visit any children in between, and have thus found an innermost loop. Also adds a test case for an innermost loop with a conditional inside. llvm-svn: 213870
-
Rafael Espindola authored
llvm-svn: 213869
-
Simon Atanasyan authored
target independent. llvm-svn: 213868
-
Tim Northover authored
Quite a bit of cruft had accumulated as we realised the various different cases it had to handle and squeezed them in where possible. This refactoring mostly flattens the logic and special-cases. The result is slightly longer, but I think clearer. Should be no functionality change. llvm-svn: 213867
-
Duncan P. N. Exon Smith authored
llvm-svn: 213866
-
Aaron Ballman authored
Improving the "integer constant too large" diagnostics based on post-commit feedback from Richard Smith. Amends r213657. llvm-svn: 213865
-
Hal Finkel authored
This commit adds scoped noalias metadata. The primary motivations for this feature are:

1. To preserve noalias function attribute information when inlining
2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities is added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also with a noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias.

Note that if the first element of the scope metadata is a string, then it can be combined across functions and translation units. The string can be replaced by a self-reference to create globally unique scope identifiers. [Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function:

foo(noalias a, noalias b) { *a = *b; }

that gets inlined into:

bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); }

Now, just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functions have the metadata imply that a1 does not alias with b2.

llvm-svn: 213864
-
Aaron Ballman authored
Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended. llvm-svn: 213863
-
Aaron Ballman authored
Setting the documentation heading for #pragma unroll, which should not be with the heading for #pragma clang loop. llvm-svn: 213862
-
Ed Maste authored
The uint16_t cast truncated the magic value to 0x00000304, making the first byte 0 (eByteOrderInvalid) on big endian hosts. Reported by Justin Hibbits. llvm-svn: 213861
-