Commits · 18583d71e83525cf58787ab5b5e96e999a8f9e1c · Roger Ferrer / llvm-epi-0.8

Feb 25, 2014

Keep the link register for uwtable. · 18583d71

Logan Chien authored Feb 25, 2014

A function with the uwtable attribute might be visited by the
stack unwinder, so the link register should be considered
clobbered after the execution of the branch and link
instruction (i.e. the definition of the machine instruction
can't be ignored) even when the callee function is marked
noreturn.

llvm-svn: 202165

18583d71

[XCore] Prefer to word align functions. · 8b7466e8

Richard Osborne authored Feb 25, 2014

The behaviour of the XCore's instruction buffer means that the performance
of the same code sequence can differ depending on whether it starts at a 4
byte aligned address or not. Since we don't model the instruction buffer
in the backend we have no way of knowing for sure if it is beneficial to
word align a specific function. However, in the absence of precise
modelling, it is better on balance to word align functions because:

* It makes a fetch-nop while executing the prologue slightly less likely.
* If we don't word align functions then a small perturbation in one
  function can have a dramatic knock-on effect. If the size of the function
  changes it might change the alignment and therefore the performance of
  all the functions that happen to follow it in the binary. This butterfly
  effect makes it harder to reason about and measure the performance of
  code.
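
The knock-on effect described in the second bullet can be illustrated with a small layout simulation. This is only a sketch: the function sizes are made up, the 2-byte baseline alignment is an assumption, and `layoutFunctions` and `countWordAligned` are hypothetical helpers, not part of the XCore backend.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round Addr up to the next multiple of Align (Align must be a power of two).
inline uint64_t alignTo(uint64_t Addr, uint64_t Align) {
  return (Addr + Align - 1) & ~(Align - 1);
}

// Hypothetical helper: place functions back to back, aligning each start
// address to Align bytes, and return the resulting start addresses.
inline std::vector<uint64_t> layoutFunctions(const std::vector<uint64_t> &Sizes,
                                             uint64_t Align) {
  std::vector<uint64_t> Starts;
  uint64_t Addr = 0;
  for (uint64_t Size : Sizes) {
    Addr = alignTo(Addr, Align);
    Starts.push_back(Addr);
    Addr += Size;
  }
  return Starts;
}

// Count how many functions start on a 4-byte (word) boundary.
inline unsigned countWordAligned(const std::vector<uint64_t> &Starts) {
  unsigned N = 0;
  for (uint64_t S : Starts)
    if (S % 4 == 0)
      ++N;
  return N;
}
```

With an assumed 2-byte alignment, growing the first function from 10 to 12 bytes changes whether every later function is word aligned. With 4-byte alignment, every function starts on a word boundary regardless of the sizes before it.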

llvm-svn: 202163

8b7466e8

Factor out calls to AA.getDataLayout(). · 6d6e87be
Rafael Espindola authored Feb 25, 2014
```
llvm-svn: 202157
```
6d6e87be
Make a few more DataLayout variables const. · 43b5a51e
Rafael Espindola authored Feb 25, 2014
```
llvm-svn: 202155
```
43b5a51e

[SROA] Use the original load name with the SROA-prefixed IRB rather than · 25adb7b0

Chandler Carruth authored Feb 25, 2014

just "load". This helps avoid pointless de-duping with order-sensitive
numbers as we already have unique names from the original load. It also
makes the resulting IR quite a bit easier to read.

llvm-svn: 202140

25adb7b0

[SROA] Thread the ability to add a pointer-specific name prefix through · cb93cd2d

Chandler Carruth authored Feb 25, 2014

the pointer adjustment code. This is the primary code path that creates
totally new instructions in SROA and being able to lump them based on
the pointer value's name for which they were created causes
*significantly* fewer name collisions and general noise in the debug
output. This is particularly significant because it is making it much
harder to track down instability in the output of SROA, as name
de-duplication is a totally harmless form of instability that gets in
the way of seeing real problems.

The new fancy naming scheme tries to dig out the root "pre-SROA" name
for pointer values and associate that all the way through the pointer
formation instructions. Digging out the root is important to prevent the
multiple iterative rounds of SROA from just layering too much cruft on
top of cruft here. We already track the layers of SROA's iteration in the
alloca name prefix. We don't need to duplicate it here.

Should have no functionality change, and shouldn't have any really
measurable impact on NDEBUG builds, as most of the complex logic is
debug-only.

llvm-svn: 202139

cb93cd2d

[SROA] Rather than copying the logic for building a name prefix into the · 51175533
Chandler Carruth authored Feb 25, 2014
```
PHI-pointer builder, just copy the builder and clobber the obvious
fields.

llvm-svn: 202136
```
51175533

[SROA] Simplify some of the logic to dig out the old pointer value by · 8183a50f

Chandler Carruth authored Feb 25, 2014

using OldPtr more heavily. Lots of this code was written before the
rewriter had an OldPtr member set up ahead of time. There are already
asserts in place that should ensure this doesn't change any
functionality.

llvm-svn: 202135

8183a50f

[SROA] Adjust to new clang-format style. · 7625c54e
Chandler Carruth authored Feb 25, 2014
```
llvm-svn: 202134
```
7625c54e
Reuse constants for COFF string table entry offsets · 01143f9a
Nico Rieck authored Feb 25, 2014
```
llvm-svn: 202130
```
01143f9a

[SROA] Fix a *glaring* bug in r202091: you have to actually *write* · a8c4cc68

Chandler Carruth authored Feb 25, 2014

the break statement, not just think it to yourself....

No idea how this worked at all, much less survived most bots, my
bootstrap, and some bot bootstraps!

The Polly one didn't survive, and this was filed as PR18959. I don't
have a reduced test case and honestly I'm not seeing the need. What we
probably need here are better asserts / debug-build behavior in
SmallPtrSet so that this madness doesn't make it so far.

llvm-svn: 202129

a8c4cc68

Disable old JIT unittests for AArch64 · dd8c8018
Renato Golin authored Feb 25, 2014
```
llvm-svn: 202127
```
dd8c8018
Ignore old JIT tests in AArch64 - CMake style · 69736692
Renato Golin authored Feb 25, 2014
```
llvm-svn: 202126
```
69736692
Add aarch64 to config.guess · 882e947d
Renato Golin authored Feb 25, 2014
```
llvm-svn: 202125
```
882e947d
Silence GCC warning · 26af6f7f
Alexey Samsonov authored Feb 25, 2014
```
llvm-svn: 202119
```
26af6f7f
Fix typos · 70b36995
Alp Toker authored Feb 25, 2014
```
llvm-svn: 202107
```
70b36995

[SROA] Add a debugging tool which shuffles the slices sequence prior to · 83cee772

Chandler Carruth authored Feb 25, 2014

sorting it. This helps uncover latent reliance on the original ordering,
which isn't guaranteed to be preserved by std::sort (but often is),
and which is based on the use-def chain orderings, which also aren't
(technically) guaranteed.

Only available in C++11 debug builds, and behind a flag to prevent noise
at the moment, but this is generally useful so figured I'd put it in the
tree rather than keeping it out-of-tree.
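
A minimal sketch of the shuffle-then-sort idea, assuming a simplified stand-in for a slice. The `Slice` struct, `beginLess`, and `shuffleThenSort` are hypothetical names for illustration and do not reproduce the actual SROA code or its flag.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical stand-in for an SROA slice: ordered by Begin only, with Id
// recording the position in the original (use-def chain) order.
struct Slice {
  unsigned Begin;
  unsigned Id;
};

// Comparator that looks at Begin only, so slices with equal Begin are
// equivalent and std::sort may leave them in any relative order.
inline bool beginLess(const Slice &A, const Slice &B) {
  return A.Begin < B.Begin;
}

// Debugging aid: shuffle with a seeded generator, then sort. Any code that
// silently depended on the original order of equivalent slices is now much
// more likely to misbehave visibly.
inline std::vector<Slice> shuffleThenSort(std::vector<Slice> Slices,
                                          unsigned Seed) {
  std::mt19937 Gen(Seed);
  std::shuffle(Slices.begin(), Slices.end(), Gen);
  std::sort(Slices.begin(), Slices.end(), beginLess);
  return Slices;
}

// Returns true if the sequence is ordered by Begin.
inline bool isSortedByBegin(const std::vector<Slice> &Slices) {
  return std::is_sorted(Slices.begin(), Slices.end(), beginLess);
}
```

The result is always sorted by `Begin`, but the relative order of the `Id` values among equal `Begin` entries is unspecified, which is exactly the property the shuffle is meant to expose.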

llvm-svn: 202106

83cee772

[SROA] Use a more direct way of determining whether we are processing · bb2a9324

Chandler Carruth authored Feb 25, 2014

the destination operand or source operand of a memmove.

It so happens that it was impossible for SROA to try to rewrite
self-memmove where the operands are *identical*, because either such
a thing is volatile (and we don't rewrite) or it is non-volatile, and we
don't even register it as a use of the alloca.

However, making the 'IsDest' test *rely* on this subtle fact is... Very
confusing for the reader. We should use the direct and readily available
test of the Use* which gives us concrete information about which operand
is being rewritten.

No functionality changed, I hope! ;]

llvm-svn: 202103

bb2a9324

Add some convenience accessors for the underlying Use of an operand. · fbe7dc33

Chandler Carruth authored Feb 25, 2014

These complement many of the existing accessors and make it
significantly easier to write code which needs to poke at the underlying
Use without hard coding the operand number at which it resides for
a particular instruction. No functionality changed of course.

llvm-svn: 202102

fbe7dc33

Indent this continued line. · 1ce017e8
Nick Lewycky authored Feb 25, 2014
```
llvm-svn: 202096
```
1ce017e8

[SROA] Fix another instability in SROA with respect to the slice · 3bf18ed5

Chandler Carruth authored Feb 25, 2014

ordering.

The fundamental problem that we're hitting here is that the use-def
chain ordering is *itself* not a stable thing to be relying on in the
rewriting for SROA. Further, we use a non-stable sort over the slices to
arrange them based on the section of the alloca they're operating on.
With a debugging STL implementation (or different implementations in
stage2 and stage3) this can cause stage2 != stage3.

The specific aspect of this problem fixed in this commit deals with the
rewriting and load-speculation around PHIs and Selects. This, like many
other aspects of the use-rewriting in SROA, is really part of the
"strong SSA-formation" that is done by SROA where it works very hard to
canonicalize loads and stores in *just* the right way to satisfy the
needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the
same alloca, we test that loads downstream of the select are
speculatable around it twice. If only one of the operands to the select
needs to be rewritten, then if we get lucky we rewrite that one first
and the select is immediately speculatable. This can cause the order of
operand visitation, and thus the order of slices to be rewritten, to
change an alloca from promotable to non-promotable and vice versa.
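
As a rough source-level analogy for "speculating loads around a select" (the real transformation works on LLVM IR, not C++), the two hypothetical functions below compute the same value. The first loads through a pointer chosen at run time, while the second loads both values first and then selects between them.

```cpp
#include <cassert>

// Before: the load happens *after* the select, through a pointer chosen at
// run time. Both A and B have their addresses taken.
inline int loadAfterSelect(bool Cond, int A, int B) {
  const int *P = Cond ? &A : &B;
  return *P;
}

// After speculation: both values are loaded up front and the select picks
// between the loaded values, so neither address is needed any more.
inline int selectOfLoads(bool Cond, int A, int B) {
  int LoadA = A;
  int LoadB = B;
  return Cond ? LoadA : LoadB;
}
```

In the second form no variable needs an address, which is the kind of shape that lets a local be promoted to a register.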

The fix is to defer all of the speculation until *after* the rewrite
phase is done. Once we've rewritten everything, we can accurately test
for whether speculation will work (once, instead of twice!) and the
order ceases to matter.

This also happens to simplify the other subtlety of speculation -- we
need to *not* speculate anything unless the result of speculating will
make the alloca fully promotable by mem2reg. I had a previous attempt at
simplifying this, but it was still pretty horrible.

There is actually already a *really* nice test case for this in
basictest.ll, but on multiple STL implementations and inputs, we just
got "lucky". Fortunately, the test case is very small and we can
essentially build it in exactly the opposite way to get reasonable
coverage in both directions even from normal STL implementations.

llvm-svn: 202092

3bf18ed5

llvm-dwarfdump: Support for debug_line.dwo section for file names for type units under fission. · 1d4736e0
David Blaikie authored Feb 24, 2014
```
llvm-svn: 202091
```
1d4736e0
Make some DataLayout pointers const. · aeff8a9c
Rafael Espindola authored Feb 24, 2014
```
No functionality change. Just reduces the noise of an upcoming patch.

llvm-svn: 202087
```
aeff8a9c

Feb 24, 2014

Permit CMAKE_INSTALL_RPATH to be set on command line · 301bafed

Bernard Ogden authored Feb 24, 2014

Commit 201921 overrides setting of CMAKE_INSTALL_RPATH via the
command line. Last time this happened we applied another patch
to only set CMAKE_INSTALL_RPATH if already defined (r197825).
This patch does the same thing again, but only for the UNIX
case - we leave APPLE alone as presumably the original committer
is happy with the non-overriding behaviour.

llvm-svn: 202085

301bafed

trivial test commit · fbd12d35
Albrecht Kadlec authored Feb 24, 2014
```
llvm-svn: 202084
```
fbd12d35

llvm-objdump: Do not attempt to disassemble symbols outside of section · 2b614e11

Simon Atanasyan authored Feb 24, 2014

boundaries.

It is possible to create an ELF executable in which a symbol from, say, the
.text section 'points' to an address outside the section boundaries. It
makes no sense to disassemble anything outside the section.

Without this fix llvm-objdump prints a finite or infinite (depending on
the executable file's architecture) number of 'invalid instruction
encoding' warnings.
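
The kind of boundary check described above might look like the following sketch. `SectionRange` and `symbolInSection` are hypothetical names for illustration, not the llvm-objdump implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a section: start address and size in bytes.
struct SectionRange {
  uint64_t Address;
  uint64_t Size;
};

// Returns true if SymAddr falls inside [Address, Address + Size).
// Written as a subtraction to avoid overflow in Address + Size.
inline bool symbolInSection(uint64_t SymAddr, const SectionRange &Sec) {
  return SymAddr >= Sec.Address && SymAddr - Sec.Address < Sec.Size;
}
```

A disassembler loop would skip any symbol for which this check returns false instead of trying to decode bytes that are not part of the section.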

llvm-svn: 202083

2b614e11

Disable an MCJIT test on older Darwins until we have a better interface. · 6a162d85

Andrew Trick authored Feb 24, 2014

See
<rdar://16149106> [MCJIT] provide a platform-independent way to communicate callee-save frame info.
<rdar://16149279> [MCJIT] get the host OS version from a runtime check, not a configure-time check.

llvm-svn: 202082

6a162d85

Fix unused variable · a81aee82
Matt Arsenault authored Feb 24, 2014
```
llvm-svn: 202080
```
a81aee82

R600/SI - Add new CI arithmetic instructions. · 41e2f2ba

Matt Arsenault authored Feb 24, 2014

Does not yet include the larger part required
to match v_mad_i64_i32 / v_mad_u64_u32.

llvm-svn: 202077

41e2f2ba

R600: Make check clearer. · d0ce2bd8

Matt Arsenault authored Feb 24, 2014

The check is clearer as southern islands or later,
rather than checking for later than northern islands.

llvm-svn: 202076

d0ce2bd8

Fix DOT4 missing from getTargetOpcodeName · 21a3faaf
Matt Arsenault authored Feb 24, 2014
```
llvm-svn: 202075
```
21a3faaf
Add missing const · b598f7b8
Matt Arsenault authored Feb 24, 2014
```
llvm-svn: 202074
```
b598f7b8
Trivial code simplification · 58a76396
Matt Arsenault authored Feb 24, 2014
```
llvm-svn: 202073
```
58a76396
SLPVectorizer: Try vectorizing 'splat' stores · 9611d23d
Arnold Schwaighofer authored Feb 24, 2014
```
Vectorize sequential stores of a broadcasted value.
5% on eon.

radar://16124699

llvm-svn: 202067
```
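
For illustration, a "splat" store group in source form is a run of consecutive stores of the same scalar, as in the hypothetical function below. A vectorizer could, in principle, replace the four scalar stores with one 4-wide store of the broadcast value; whether it does depends on the target and cost model.

```cpp
#include <cassert>

// Four consecutive stores of the same scalar value: a "splat" store group.
inline void storeSplat4(float *Dst, float V) {
  Dst[0] = V;
  Dst[1] = V;
  Dst[2] = V;
  Dst[3] = V;
}
```
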
9611d23d

[X86][SchedModel] Add missing scheduling model for SSE related instructions. · ca498518

Quentin Colombet authored Feb 24, 2014

The patch defines new or refines existing generic scheduling classes to match
the behavior of the SSE instructions.
It also maps those scheduling classes on the related SSE instructions.

<rdar://problem/15607571>

llvm-svn: 202065

ca498518

Add a dwarf number to the Y register. · e89f3109
Roman Divacky authored Feb 24, 2014
```
llvm-svn: 202057
```
e89f3109

Replace the F_Binary flag with a F_Text one. · 90c7f1cc

Rafael Espindola authored Feb 24, 2014

After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on Windows.
Forgetting to set F_Text produces one that cannot be read in Notepad, which
is a better failure mode :-)

llvm-svn: 202052

90c7f1cc

LTO: Add the loop vectorizer to the LTO pipeline. · 6ccda923

Arnold Schwaighofer authored Feb 24, 2014

During the LTO phase LICM will move loop-invariant global variables out of loops
(informed by GlobalModRef). This makes more loops countable, presenting
opportunities for the loop vectorizer.
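
A source-level sketch of why hoisting a global makes a loop countable. The global `Limit` and both functions are hypothetical; the second function shows by hand what the loop looks like once the load of the bound has been moved out of it.

```cpp
#include <cassert>

// Hypothetical global loop bound. Without whole-program knowledge, the
// compiler may have to assume that stores through Out could modify it.
int Limit = 0;

// Before hoisting: Limit is re-read on every iteration, so the trip count
// is not obviously fixed on loop entry.
void fillReloading(int *Out) {
  for (int I = 0; I < Limit; ++I)
    Out[I] = I * 2;
}

// After hoisting: the bound is read once before the loop, so the trip count
// is known on entry and the loop is countable.
void fillHoisted(int *Out) {
  const int N = Limit;
  for (int I = 0; I < N; ++I)
    Out[I] = I * 2;
}
```

The two functions agree whenever `Out` does not alias `Limit`, which is the fact that mod/ref information can establish at link time.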

Adding the loop vectorizer improves some TSVC benchmarks and the twolf/ref
dataset (5%) on x86-64.

radar://15970632

llvm-svn: 202051

6ccda923

Fix windows unittest I missed in the raw_fd_ostream constructor change. · 83708e65
Rafael Espindola authored Feb 24, 2014
```
llvm-svn: 202050
```
83708e65
For lcov tests, don't Xfail mips little endian (mipsel-... and mips64el-...) · 59ebf32d
Reed Kotler authored Feb 24, 2014
```
targets. Just big endian (mips-... and mips64-...)

llvm-svn: 202049
```
59ebf32d