Commits · f5b769e4f21725df838b993b74d6421f693edacf · Roger Ferrer / llvm-epi-0.8

Dec 05, 2013

Document that dllexported symbols are preserved by optimization passes. · f5b769e4
Yunzhong Gao authored Dec 05, 2013
```
llvm-svn: 196523
```
f5b769e4

Fix non-deterministic behavior. · cdbde3aa

Rafael Espindola authored Dec 05, 2013

We use CSEBlocks to initialize a worklist:

SmallVector<BasicBlock *, 8> CSEWorkList(CSEBlocks.begin(), CSEBlocks.end());

so it must have a deterministic order.

llvm-svn: 196520

cdbde3aa
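
The ordering requirement above can be illustrated with a small sketch. It is not the code from this commit: `Block`, `OrderedBlockSet` and `makeWorkList` are hypothetical names, and the commit message does not say which container was chosen. The sketch only shows one way to get a set whose iteration order is the insertion order, so that a worklist copied from it is identical on every run.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a basic block; only its identity matters here.
struct Block { int id; };

// Minimal insertion-ordered set: the hash set answers "seen before?",
// the vector remembers the order in which elements were first inserted.
class OrderedBlockSet {
public:
  // Returns true if the block was newly inserted.
  bool insert(Block *b) {
    assert(b && "null block");
    if (!seen_.insert(b).second)
      return false;
    order_.push_back(b);
    return true;
  }
  std::vector<Block *>::const_iterator begin() const { return order_.begin(); }
  std::vector<Block *>::const_iterator end() const { return order_.end(); }

private:
  std::unordered_set<Block *> seen_;
  std::vector<Block *> order_;
};

// Copies the set into a worklist, mirroring the CSEWorkList initialization.
std::vector<Block *> makeWorkList(const OrderedBlockSet &blocks) {
  return std::vector<Block *>(blocks.begin(), blocks.end());
}
```

Iterating a pointer-keyed hash set directly would instead depend on the addresses the allocator happened to return, which can change from run to run.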

Rename DwarfUnits to DwarfFile to help avoid some naming confusion. · f8194853
Eric Christopher authored Dec 05, 2013
```
llvm-svn: 196519
```
f8194853
AttributeList: tweak the conditional order to avoid two strcmps · a31d1dd1
Alp Toker authored Dec 05, 2013
```
llvm-svn: 196518
```
a31d1dd1

MI-Sched: Model "reserved" processor resources. · 5a22df49

Andrew Trick authored Dec 05, 2013

This allows a target to use MI-Sched as an in-order scheduler that
will model strict resource conflicts without defining a processor
itinerary. Instead, the target can now use the new per-operand machine
model and define in-order resources with BufferSize=0. For example,
this would allow restricting the type of operations that can be formed
into a dispatch group. (Normally NumMicroOps is sufficient to enforce
dispatch groups).

If the intent is to model latency in an in-order pipeline, as opposed to
resource conflicts, then a resource with BufferSize=1 should be
defined instead.

This feature is only casually tested as there are no in-tree targets
using it yet. However, Hal will be experimenting with POWER7.

llvm-svn: 196517

5a22df49
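
As a rough illustration of a strict resource conflict in an in-order pipeline, the toy model below treats every resource as reserved: an operation cannot issue until the resource it needs is free. This is an assumption-laden sketch, not MI-Sched code; `Op`, `issueCycles` and the one-issue-per-cycle rule are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical micro-op: which resource it occupies and for how many cycles.
struct Op { int resource; int cycles; };

// Issues ops strictly in order, at most one per cycle. Each resource is
// treated as reserved: an op stalls until the resource it needs is free.
std::vector<int> issueCycles(const std::vector<Op> &ops, int numResources) {
  std::vector<int> freeAt(numResources, 0);  // first cycle each resource is free
  std::vector<int> issued;
  int cycle = 0;
  for (const Op &op : ops) {
    assert(op.resource >= 0 && op.resource < numResources);
    cycle = std::max(cycle, freeAt[op.resource]);  // stall on a conflict
    issued.push_back(cycle);
    freeAt[op.resource] = cycle + op.cycles;       // reserve the resource
    ++cycle;  // the next op issues no earlier than the following cycle
  }
  return issued;
}
```

With ops `{0,3}, {1,1}, {0,1}` and two resources, the third op has to wait for resource 0 to be released by the first one.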

MI-Sched: handle latency of in-order operations with the new machine model. · 880e573d

Andrew Trick authored Dec 05, 2013

The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.

MI-Sched for armv7 was evaluated on Swift (and was left disabled only because
of a performance bug related to predication). However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds MI-Sched functionality to reach performance goals on
A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.

I evaluated performance with the following options to estimate the impact once MI-Sched is the default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)

Speedups:
| Benchmarks/BenchmarkGame/recursive         |  52.39% |
| Benchmarks/VersaBench/beamformer           |  20.80% |
| Benchmarks/Misc/pi                         |  19.97% |
| Benchmarks/Misc/mandel-2                   |  19.95% |
| SPEC/CFP2000/188.ammp                      |  18.72% |
| Benchmarks/McCat/08-main/main              |  18.58% |
| Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
| Benchmarks/Olden/power                     |  17.11% |
| Benchmarks/Misc-C++/mandel-text            |  16.47% |
| Benchmarks/Misc/oourafft                   |  15.94% |
| Benchmarks/Misc/flops-7                    |  14.99% |
| Benchmarks/FreeBench/distray               |  14.26% |
| SPEC/CFP2006/470.lbm                       |  14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
| Benchmarks/SmallPT/smallpt                 |  10.36% |
| Benchmarks/Misc-C++/Large/ray              |   8.97% |
| Benchmarks/Misc/fp-convert                 |   8.75% |
| Benchmarks/Olden/perimeter                 |   7.10% |
| Benchmarks/Bullet/bullet                   |   7.03% |
| Benchmarks/Misc/mandel                     |   6.75% |
| Benchmarks/Olden/voronoi                   |   6.26% |
| Benchmarks/Misc/flops-8                    |   5.77% |
| Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
| Benchmarks/MiBench/security-rijndael       |   5.15% |
| Benchmarks/Misc/flops-6                    |   5.10% |
| Benchmarks/Olden/tsp                       |   4.46% |
| Benchmarks/MiBench/consumer-lame           |   4.28% |
| Benchmarks/Misc/flops-5                    |   4.27% |
| Benchmarks/mafft/pairlocalalign            |   4.19% |
| Benchmarks/Misc/himenobmtxpa               |   4.07% |
| Benchmarks/Misc/lowercase                  |   4.06% |
| SPEC/CFP2006/433.milc                      |   3.99% |
| Benchmarks/tramp3d-v4                      |   3.79% |
| Benchmarks/FreeBench/pifft                 |   3.66% |
| Benchmarks/Ptrdist/ks                      |   3.21% |
| Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
| SPEC/CINT2000/175.vpr                      |   3.12% |
| Benchmarks/nbench                          |   2.98% |
| SPEC/CFP2000/183.equake                    |   2.91% |
| Benchmarks/Misc/perlin                     |   2.85% |
| Benchmarks/Misc/flops-1                    |   2.82% |
| Benchmarks/Misc-C++-EH/spirit              |   2.80% |
| Benchmarks/Misc/flops-2                    |   2.77% |
| Benchmarks/NPB-serial/is                   |   2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
| Benchmarks/BenchmarkGame/n-body            |   2.28% |
| Benchmarks/SciMark2-C/scimark2             |   2.27% |
| Benchmarks/Olden/bh                        |   2.03% |
| skidmarks10/skidmarks                      |   1.81% |
| Benchmarks/Misc/flops                      |   1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu                | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
| Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
| Benchmarks/Shootout/hash                   |  -2.35% |
| Benchmarks/Prolangs-C++/ocean              |  -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
| Polybench/linear-algebra/kernels/3mm       |  -1.95% |
| Benchmarks/McCat/09-vor/vor                |  -1.68% |

llvm-svn: 196516

880e573d

Machine model comments. Explain a ProcessorUnit's BufferSize. · 093bdd17
Andrew Trick authored Dec 05, 2013
```
llvm-svn: 196515
```
093bdd17
Fix the A9 machine model. VTRN writes two registers. · ff199a4b
Andrew Trick authored Dec 05, 2013
```
llvm-svn: 196514
```
ff199a4b
comment typo and reformat · bb1247b9
Andrew Trick authored Dec 05, 2013
```
llvm-svn: 196513
```
bb1247b9

clang-format vsix cmake build: use ${LLVM_TOOLS_BINARY_DIR}/${CMAKE_CFG_INTDIR} · 6a3816a6

Hans Wennborg authored Dec 05, 2013

as the location for grabbing clang-format.exe, and also output the .vsix here.

This allows us to find clang-format.exe when building from an MSVC solution.

llvm-svn: 196512

6a3816a6

Check the initial line number without going through PresumedLoc · 52937abc

Alp Toker authored Dec 05, 2013

No practical difference in this case and would return 1 either way, but this is
more self-explanatory.

llvm-svn: 196511

52937abc

Fix a tranche of comment, test and doc typos · f6a24ce4
Alp Toker authored Dec 05, 2013
```
llvm-svn: 196510
```
f6a24ce4
Add a default constructor to get deterministic behavior. · 4cc2b873
Rafael Espindola authored Dec 05, 2013
```
Should fix the msan and valgrind bots.

llvm-svn: 196509
```
4cc2b873

SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use · 7ee53cac

Arnold Schwaighofer authored Dec 05, 2013

We were creating external uses for scalar values in MustGather entries that also
had a ScalarToTreeEntry (that is, they are also present in a vectorized tuple).
This meant we would keep a value 'alive' both as a scalar and as a vectorized
value, causing havoc. This is not necessary, because when we create a MustGather
vector we explicitly create external-use entries for the insertelement
instructions of the MustGather vector elements.

Fixes PR18129.

radar://15582184

llvm-svn: 196508

7ee53cac

[tsan] fix PR18146: sometimes a variable written into vptr could have an... · 2460c3fc

Kostya Serebryany authored Dec 05, 2013

[tsan] fix PR18146: sometimes a variable written into vptr could have an integer type (after other optimizations)

llvm-svn: 196507

2460c3fc

PR16532: work around old GCC bug in interception_type_test.cc · 5ca3de6e
Alexey Samsonov authored Dec 05, 2013
```
llvm-svn: 196506
```
5ca3de6e
Use !! to convert to a boolean value. · d2014f19
Rui Ueyama authored Dec 05, 2013
```
llvm-svn: 196505
```
d2014f19

[PECOFF] Handle .lib files as if they are grouped by --{start,end}-group. · 16c025e2

Rui Ueyama authored Dec 05, 2013

Currently we do not de-duplicate library files specified by /defaultlib option.
As a result, the same files are added multiple times to the input graph. In
particular, some popular files, such as kernel32.lib or oldnames.lib, are added
more than 10 times during linking of LLD. That makes the linker slower, as it
needs to parse the same file again and again.

This patch solves the issue by de-duplicating. The same file will be added only
once to the input graph. This patch improved the LLD linking time from 10.5
seconds to 7.7 seconds on my 4-core Core i7 MacBook Pro.

llvm-svn: 196504

16c025e2
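
The de-duplication described above can be sketched as follows. This is not LLD's implementation: `dedupLibraries` is a hypothetical helper, and it compares paths as exact strings, whereas a real linker may need to normalize paths or ignore case.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the input paths with repeats dropped, keeping first-seen order,
// so each library would be added to the input graph (and parsed) only once.
std::vector<std::string> dedupLibraries(const std::vector<std::string> &libs) {
  std::set<std::string> seen;
  std::vector<std::string> unique;
  for (const std::string &lib : libs)
    if (seen.insert(lib).second)  // true only the first time lib is seen
      unique.push_back(lib);
  assert(unique.size() == seen.size());
  return unique;
}
```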

[NVPTX] Fix off-by-one error when creating the VT list for an SDNode · 4459717b
Justin Holewinski authored Dec 05, 2013
```
llvm-svn: 196503
```
4459717b
Run TSan/MSan lit tests only on 64-bit platforms · 2d42b1d6
Alexey Samsonov authored Dec 05, 2013
```
llvm-svn: 196501
```
2d42b1d6
Add forgotten header guards · 996e099d
Alexey Samsonov authored Dec 05, 2013
```
llvm-svn: 196500
```
996e099d

Revert: "Patch from Todd Fiala that install the lldb.py module in the prefix... · 13b0fba4

Sylvestre Ledru authored Dec 05, 2013

Revert: "Patch from Todd Fiala that install the lldb.py module in the prefix directory and also makes install fail if the prefix directory can't be accessed"

Does not respect the prefix

llvm-svn: 196499

13b0fba4

[mips] Small code generation improvement for conditional operator (select) · a6beac1a

Matheus Almeida authored Dec 05, 2013

in case the operands are constants and their difference is |1|.
In those cases it should be possible to rematerialize the result using
MIPS's slt and similar instructions.

The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was
needed; otherwise the optimization implemented in this patch would have been
triggered (the difference between the operands was 1), and that would have
changed the semantics of the tests.

llvm-svn: 196498

a6beac1a
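
The arithmetic behind this optimization can be shown in plain C++. The sketch below is an illustration, not the MIPS backend code; `selectAdjacent` is a hypothetical name. When the two constants differ by exactly 1, a select on `a < b` equals the false value plus or minus the 0/1 result that an slt-style comparison produces.

```cpp
#include <cassert>
#include <cstdint>

// Branch-free form of (a < b) ? trueVal : falseVal, valid only when the
// two constants differ by exactly 1.
inline int32_t selectAdjacent(int32_t a, int32_t b,
                              int32_t trueVal, int32_t falseVal) {
  // Widen before subtracting so extreme inputs cannot overflow.
  int64_t diff = static_cast<int64_t>(trueVal) - falseVal;
  assert((diff == 1 || diff == -1) && "constants must differ by 1");
  int32_t lt = (a < b) ? 1 : 0;  // stands in for the 0/1 result of slt
  return static_cast<int32_t>(falseVal + lt * diff);
}
```

For example, `(a < b) ? 5 : 4` becomes `4 + (a < b)`, and `(a < b) ? 3 : 4` becomes `4 - (a < b)`.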

[sanitizer] Introduce VReport and VPrintf macros and use them in sanitizer code. · 9be70fbd
Sergey Matveev authored Dec 05, 2013
```
Instead of "if (common_flags()->verbosity) Report(...)" we now have macros.

llvm-svn: 196497
```
9be70fbd
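
A verbosity-gated reporting macro of the kind described above might look like the sketch below. Everything here is a stand-in: `g_verbosity`, `g_reports`, this `Report` and the `VREPORT_SKETCH` name are hypothetical, and the `level` parameter is an assumption that the commit message does not confirm.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-ins for the runtime's verbosity flag and Report().
static int g_verbosity = 0;
static int g_reports = 0;  // counts calls so the effect is observable

static void Report(const char *format, ...) {
  assert(format && "null format string");
  ++g_reports;
  va_list args;
  va_start(args, format);
  std::vfprintf(stderr, format, args);
  va_end(args);
}

// Calls Report only when verbosity is at least `level`. The do/while(0)
// wrapper makes the macro behave as a single statement inside if/else.
#define VREPORT_SKETCH(level, ...)                    \
  do {                                                \
    if (g_verbosity >= (level)) Report(__VA_ARGS__);  \
  } while (0)
```

A call site then shrinks from an explicit `if` around `Report(...)` to a single `VREPORT_SKETCH(1, "mapped %d bytes\n", n);`.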
[mips] Add some comments related to the optimization performed in performSELECTCombine. · a611c0f4
Matheus Almeida authored Dec 05, 2013
```
The structure of the code was slightly modified so that the next patch is easier to read/review.

No functional changes.

llvm-svn: 196496
```
a611c0f4
[tsan] fix the old tsan Makefile to build the asm files with includes · 3b2f702d
Kostya Serebryany authored Dec 05, 2013
```
llvm-svn: 196495
```
3b2f702d

[mips][msa] Fix issue with immediate fields of LD/ST instructions · 6b59c449

Matheus Almeida authored Dec 05, 2013

not being correctly encoded/decoded.
In more detail, immediate fields of LD/ST instructions should be
divided/multiplied by the size of the data format before encoding and
after decoding, respectively.

llvm-svn: 196494

6b59c449
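
The scaling rule can be sketched as a pair of helpers. These are hypothetical functions for illustration, not the MIPS encoder or decoder: the byte offset is divided by the data-format size before encoding and multiplied by it after decoding.

```cpp
#include <cassert>
#include <cstdint>

// Encoding: the byte offset is stored in units of the element size.
inline int32_t encodeImm(int32_t byteOffset, int32_t elemSize) {
  assert(elemSize > 0 && byteOffset % elemSize == 0 &&
         "offset must be a multiple of the element size");
  return byteOffset / elemSize;
}

// Decoding: scale the stored field back up to a byte offset.
inline int32_t decodeImm(int32_t field, int32_t elemSize) {
  assert(elemSize > 0);
  return field * elemSize;
}
```

For a 4-byte data format, a byte offset of 16 is stored as 4 and decodes back to 16.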

ARM: fix yet another stack-folding bug · e4def5e2

Tim Northover authored Dec 05, 2013

We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.

Should fix PR18136.

llvm-svn: 196493

e4def5e2

Revert r196490 and fix include paths in makefile-based build · 58e44a34
Alexey Samsonov authored Dec 05, 2013
```
llvm-svn: 196492
```
58e44a34
[asan] revert files which I accidentally committed in r196490 · d4af5c24
Kostya Serebryany authored Dec 05, 2013
```
llvm-svn: 196491
```
d4af5c24

[tsan] fix the include path that is broken in configure/make build but works... · 9ffa232f

Kostya Serebryany authored Dec 05, 2013

[tsan] fix the include path that is broken in configure/make build but works in cmake build (PR18144). This is a quick fix; the configure/make build will need to be fixed properly.

llvm-svn: 196490

9ffa232f

[sanitizer] fix the ppc32 build (patch by Jakub Jelinek) · f2c93b29
Kostya Serebryany authored Dec 05, 2013
```
llvm-svn: 196489
```
f2c93b29
PR17983: Fix crasher bug in C++1y mode when performing a non-global array · f03bd308
Richard Smith authored Dec 05, 2013
```
delete on a class which has no array cookie and has no class-specific operator
new.

llvm-svn: 196488
```
f03bd308
[libclang] Record ranges skipped by the preprocessor and expose them with libclang. · 9ef5775a
Argyrios Kyrtzidis authored Dec 05, 2013
```
Patch by Erik Verbruggen!

llvm-svn: 196487
```
9ef5775a
[c-index-test] Enhance perform_test_reparse_source() to allow remapping a file · 011e6a5f
Argyrios Kyrtzidis authored Dec 05, 2013
```
at a particular reparsing iteration.

Passing '-remap-file-1=from:to' will remap the files in the second iteration.

llvm-svn: 196486
```
011e6a5f
[c-index-test] For the '-remap-file=' option use ':' instead of ';' for separator. · a60d8ae0
Argyrios Kyrtzidis authored Dec 05, 2013
```
lldb does not like a semicolon as part of an option.

llvm-svn: 196485
```
a60d8ae0

clang-format-diff.py: pass through errors to stderr, not stdout · fcf30326

Alp Toker authored Dec 05, 2013

Also use write() for unified diff output to avoid further processing by the
print function (e.g. trailing newline).

llvm-svn: 196484

fcf30326

Fix the lldb build failure caused by a recent clang change. Patch by Todd Fiala · 34efb6ae
Sylvestre Ledru authored Dec 05, 2013
```
llvm-svn: 196483
```
34efb6ae
Update C++ status from 'SVN' to 'Clang 3.4' in preparation for release. Leave · 28b19398
Richard Smith authored Dec 05, 2013
```
boxes yellow until we release, though.

llvm-svn: 196482
```
28b19398

Implement DR482: namespace members can be redeclared with a qualified name · a230224b

Richard Smith authored Dec 05, 2013

within their namespace, and such a redeclaration isn't required to be a
definition any more.

Update DR status page to say Clang 3.4 instead of SVN and add new Clang 3.5
category (but keep Clang 3.4 yellow for now).

llvm-svn: 196481

a230224b