Commits · 4b358188c608c26df1f8e6c7406fea4cb43d9f34 · Roger Ferrer / llvm-epi-0.8

Aug 29, 2013

Revert: r189565 - Add getUnrollingPreferences to TTI · 8e83820a

Hal Finkel authored Aug 29, 2013

Revert unintentional commit (of an unreviewed change).

Original commit message:

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189566

8e83820a

Add getUnrollingPreferences to TTI · 63e6c0e9

Hal Finkel authored Aug 29, 2013

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189565

63e6c0e9
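The hook described in this commit can be pictured as a virtual method that a target overrides to adjust the unroller's defaults. The following is a minimal sketch only: `UnrollPrefs`, its fields, the two target classes, `computePrefs`, and all numeric values are hypothetical and are not taken from the commit; only the method name `getUnrollingPreferences` comes from the message.

```cpp
// Hypothetical knobs a target might tune (names and values are assumptions).
struct UnrollPrefs {
  unsigned Threshold = 150;   // assumed generic size budget for an unrolled loop
  bool AllowPartial = false;  // assumed generic default: no partial unrolling
};

// Generic target information: leaves the pass defaults untouched.
struct GenericTargetInfo {
  virtual ~GenericTargetInfo() = default;
  virtual void getUnrollingPreferences(UnrollPrefs &) const {}
};

// Hypothetical in-order, deep-pipeline core that asks for more aggressive
// unrolling than the generic defaults.
struct InOrderCoreTargetInfo : GenericTargetInfo {
  void getUnrollingPreferences(UnrollPrefs &Prefs) const override {
    Prefs.Threshold = 400;
    Prefs.AllowPartial = true;
  }
};

// The unrolling pass starts from its own defaults, then lets the target
// adjust them.
inline UnrollPrefs computePrefs(const GenericTargetInfo &TTI) {
  UnrollPrefs Prefs;
  TTI.getUnrollingPreferences(Prefs);
  return Prefs;
}
```

With this shape, the generic transformation never needs to know which target it is running for; it only consumes the adjusted preferences.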

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC... · 4c459bcd

Nadav Rotem authored Aug 28, 2013

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:
1. They are a kind of canonicalization.
2. The performance measurements show that it is better to keep them in.

There should be no functional change if you are not enabling the LateVectorization mode.

llvm-svn: 189539

4c459bcd

Fix typo. · 38874731

Matt Arsenault authored Aug 28, 2013

llvm-svn: 189524

38874731

Aug 28, 2013

Disable unrolling in the loop vectorizer when disabled in the pass manager · 6d09904c

Hal Finkel authored Aug 28, 2013

When unrolling is disabled in the pass manager, the loop vectorizer should also
not unroll loops. This will allow the -fno-unroll-loops option in Clang to
behave as expected (even for vectorizable loops). The loop vectorizer's
-force-vector-unroll option will (continue to) override the pass-manager
setting (including -force-vector-unroll=0 to force use of the internal
auto-selection logic).

In order to test this, I added a flag to opt (-disable-loop-unrolling) to force
disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also,
this fixes a small bug in opt where the loop vectorizer was enabled only after
the pass manager populated the queue of passes (the global_alias.ll test needed
a slight update to the RUN line as a result of this fix).

llvm-svn: 189499

6d09904c
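The precedence rules in this commit message can be summarized as a small decision function. This is a sketch under assumptions: the function name `chooseUnrollFactor` and its parameters are hypothetical, and the real pass may encode the decision differently. The only facts taken from the message are that an explicit `-force-vector-unroll` value overrides the pass-manager setting and that a forced value of 0 selects the internal auto-selection logic.

```cpp
// Returns the unroll factor to use. A result of 0 means "let the
// vectorizer's internal heuristic choose"; a result of 1 means no unrolling.
inline unsigned chooseUnrollFactor(bool ForceFlagGiven, unsigned ForcedValue,
                                   bool PassManagerDisablesUnrolling) {
  if (ForceFlagGiven)
    return ForcedValue;  // explicit flag wins; a forced 0 selects auto mode
  if (PassManagerDisablesUnrolling)
    return 1;            // honor the pass manager: do not unroll
  return 0;              // nothing forced or disabled: automatic selection
}
```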

80 cols · 9b7e2b55

Alexey Samsonov authored Aug 28, 2013

llvm-svn: 189473

9b7e2b55

DataFlowSanitizer: Implement trampolines for function pointers passed to custom functions. · 28a10aff

Peter Collingbourne authored Aug 27, 2013

Differential Revision: http://llvm-reviews.chandlerc.com/D1503

llvm-svn: 189408

28a10aff

Aug 27, 2013

Refactor 'vectorizeLoop' no functionality change. · 6b41f7cc

Nadav Rotem authored Aug 27, 2013

This patch merges LoopVectorize of InnerLoopVectorizer and InnerLoopUnroller by adding checks for VF=1. This helps in erasing the Unroller code that is almost identical to the InnerLoopVectorizer code.

llvm-svn: 189391

6b41f7cc

Fixed typo. · eab9a7fa

Michael Gottesman authored Aug 27, 2013

Noticed by Stephen Checkoway <s@pahtak.org>.

llvm-svn: 189312

eab9a7fa

Fix inserting instructions before last in bundle. · ed9f76d3

Matt Arsenault authored Aug 26, 2013

The builder inserts from before the insert point,
not after, so this would insert before the last
instruction in the bundle instead of after it.

I'm not sure if this can actually be a problem
with any of the current insertions.

llvm-svn: 189285

ed9f76d3
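The "inserts before the insert point" behavior in this commit has the same shape as insertion into a standard container, where the new element lands before the given iterator. The sketch below uses `std::list` as a stand-in for an instruction bundle; the `Bundle` alias and both helper names are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

using Bundle = std::list<std::string>;

// Inserting at the iterator of the last element places the new element
// BEFORE that element, which is the mistake the commit describes.
inline void insertBeforeLast(Bundle &B, const std::string &New) {
  assert(!B.empty() && "need a last element to insert before");
  B.insert(std::prev(B.end()), New);
}

// To land after the last element, insert at the position one past it.
inline void insertAfterLast(Bundle &B, const std::string &New) {
  B.insert(B.end(), New);
}
```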

LoopVectorize: Implement partial loop unrolling when vectorization is not profitable. · bdc9ff44

Nadav Rotem authored Aug 26, 2013

This patch enables unrolling of loops when vectorization is legal but not profitable.
We add a new class InnerLoopUnroller, that extends InnerLoopVectorizer and replaces some of the vector-specific logic with scalars.

This patch does not introduce any runtime regressions and improves the following workloads:

SingleSource/Benchmarks/Shootout/matrix -22.64%
SingleSource/Benchmarks/Shootout-C++/matrix -13.06%
External/SPEC/CINT2006/464_h264ref/464_h264ref -3.99%
SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -1.95%

llvm-svn: 189281

bdc9ff44
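As an illustration of unrolling a loop with scalar operations only, here is a hand-written reduction unrolled by a factor of two with independent accumulators. It is a sketch of the general idea, not the code the pass generates; the function name `sumUnrolledBy2` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sum the elements two at a time with two independent accumulators, then
// handle a possible leftover element in a scalar remainder loop.
inline long sumUnrolledBy2(const std::vector<int> &A) {
  long S0 = 0, S1 = 0;
  std::size_t I = 0, N = A.size();
  for (; I + 1 < N; I += 2) {  // unrolled body: two scalar adds per iteration
    S0 += A[I];
    S1 += A[I + 1];
  }
  for (; I < N; ++I)           // remainder loop for an odd element count
    S0 += A[I];
  return S0 + S1;
}
```

The two accumulators break the dependency chain between consecutive additions, which is one reason unrolling can help even when vector instructions are not used.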

Aug 26, 2013

test commit. Remove blank line · 7107d415

Yi Jiang authored Aug 26, 2013

llvm-svn: 189265

7107d415

Fix unused variable in release build · bcd8c577

Matt Arsenault authored Aug 26, 2013

llvm-svn: 189264

bcd8c577

Constify functions · 8f21c838

Matt Arsenault authored Aug 26, 2013

llvm-svn: 189234

8f21c838

Vectorize starting from insertelements building a vector · 39274be6

Matt Arsenault authored Aug 26, 2013

llvm-svn: 189233

39274be6

Aug 24, 2013

Check if in set on insertion instead of separately · 8405888a

Matt Arsenault authored Aug 24, 2013

llvm-svn: 189179

8405888a

Add a function object to compare the first or second component of a std::pair. · b12cf019

Benjamin Kramer authored Aug 24, 2013

Replace instances of this scattered around the code base.

llvm-svn: 189169

b12cf019

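A function object of the kind this commit describes might look like the sketch below. The names `LessFirst`, `LessSecond`, and `sortedBySecond` are hypothetical stand-ins chosen for illustration; the commit message does not state the names actually added.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Orders two pairs by their first component only.
struct LessFirst {
  template <typename T>
  bool operator()(const T &LHS, const T &RHS) const {
    return LHS.first < RHS.first;
  }
};

// Orders two pairs by their second component only.
struct LessSecond {
  template <typename T>
  bool operator()(const T &LHS, const T &RHS) const {
    return LHS.second < RHS.second;
  }
};

// Usage: sort a copy of the pairs by their second component.
inline std::vector<std::pair<int, int>>
sortedBySecond(std::vector<std::pair<int, int>> V) {
  std::sort(V.begin(), V.end(), LessSecond());
  return V;
}
```

Having one shared comparator avoids re-declaring a small ad hoc struct at every call site that sorts or searches pairs.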
Aug 23, 2013

DataFlowSanitizer: correctly combine labels in the case where they are equal. · a96296f3

Peter Collingbourne authored Aug 23, 2013

llvm-svn: 189133

a96296f3

[msan] Fix handling of va_arg overflow area on x86_64. · d42863cc

Evgeniy Stepanov authored Aug 23, 2013

The code was erroneously reading overflow area shadow from the TLS slot,
bypassing the local copy. Reading shadow directly from TLS is wrong, because
it can be overwritten by a nested vararg call, if that happens before va_start.

llvm-svn: 189104

d42863cc
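The hazard in this commit is a shared slot being overwritten between the moment a value is passed in and the moment it is read. The sketch below simulates that with a plain thread-local integer; `OverflowShadowTLS` and the three functions are hypothetical stand-ins and do not reproduce MemorySanitizer's actual instrumentation.

```cpp
// Stand-in for the TLS slot that carries shadow for the overflow area.
static thread_local int OverflowShadowTLS = 0;

// A nested vararg call reuses the same slot and overwrites its contents.
inline void nestedVarargCall() { OverflowShadowTLS = 99; }

// Buggy shape: the slot is read late, after a nested call has clobbered it.
inline int readShadowDirectly() {
  nestedVarargCall();
  return OverflowShadowTLS;
}

// Fixed shape: take a local copy on entry and read only from that copy.
inline int readShadowFromLocalCopy() {
  int Local = OverflowShadowTLS;
  nestedVarargCall();
  return Local;
}
```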

Turn MipsOptimizeMathLibCalls into a target-independent scalar transform · 37cd6cfb

Richard Sandiford authored Aug 23, 2013

...so that it can be used for SystemZ too. Most of the code is the same.
The only real change is to use TargetTransformInfo to test when a sqrt
instruction is available.

The pass is opt-in because at the moment it only handles sqrt.

llvm-svn: 189097

37cd6cfb

80 cols · 6dae24df

Alexey Samsonov authored Aug 23, 2013

llvm-svn: 189091

6dae24df

Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale... · 823aaffd

Michael Gottesman authored Aug 23, 2013

Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale to the point of not working and more resilient to debug info changes.

The current version of StripDeadDebugInfo became stale and no longer actually
worked since it was expecting an older version of debug info.

This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as
much as possible to make it more resilient to such changes. Additionally, in the
only place where that was avoided (the code where we replace the old sets with
the new), I call verify on the DIContextUnit, implying that if the format changes
and my live set changes no longer make sense, an assert will be hit. To
ensure that this occurs, I have included a test case.

The actual stripping of the dead debug info follows the same strategy as was
used before in this class: find the live set and replace the old set in the
given compile unit (which may contain dead global variables/functions) with the
new live one.

llvm-svn: 189078

823aaffd

Aug 22, 2013

DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions,... · 34f0c313

Peter Collingbourne authored Aug 22, 2013

DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions, and vice versa, with wrappers.

Differential Revision: http://llvm-reviews.chandlerc.com/D1442

llvm-svn: 189054

34f0c313

DataFlowSanitizer: Factor the wrapper builder out to buildWrapperFunction. · 761a4fc4

Peter Collingbourne authored Aug 22, 2013

Differential Revision: http://llvm-reviews.chandlerc.com/D1441

llvm-svn: 189053

761a4fc4

DataFlowSanitizer: Prefix the name of each instrumented function with "dfs$". · 59b1262d

Peter Collingbourne authored Aug 22, 2013

DFSan changes the ABI of each function in the module. This makes it possible
for a function with the native ABI to be called with the instrumented ABI,
or vice versa, thus possibly invoking undefined behavior. A simple way
of statically detecting instances of this problem is to prepend the prefix
"dfs$" to the name of each instrumented-ABI function.

This will not catch every such problem; in particular function pointers passed
across the instrumented-native barrier cannot be used on the other side.
These problems could potentially be caught dynamically.

Differential Revision: http://llvm-reviews.chandlerc.com/D1373

llvm-svn: 189052

59b1262d
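The renaming scheme in this commit can be sketched with two small string helpers. The helper names are hypothetical; only the `dfs$` prefix comes from the message. One plausible reading of "statically detecting" is that a native-ABI caller referring to `memcpy` no longer matches a definition named `dfs$memcpy`, so the mismatch surfaces as an unresolved symbol rather than as undefined behavior at run time.

```cpp
#include <string>

// Name given to the instrumented-ABI version of a function.
inline std::string instrumentedName(const std::string &Name) {
  return "dfs$" + Name;
}

// True if the symbol name carries the instrumented-ABI prefix.
inline bool isInstrumentedName(const std::string &Name) {
  return Name.compare(0, 4, "dfs$") == 0;
}
```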

Teach the SLP vectorizer the correct way to check for consecutive access · 1c34afcb

Chandler Carruth authored Aug 22, 2013

using GEPs. Previously, it used a number of different heuristics for
analyzing the GEPs. Several of these were conservatively correct, but
failed to fall back to SCEV even when SCEV might have given a reasonable
answer. One was simply incorrect in how it was formulated.

There was good code already to recursively evaluate the constant offsets
in GEPs, look through pointer casts, etc. In a previous commit, I gathered this
into a form that code like the SLP vectorizer can use, which allows all of
this code to become quite simple.

There is some performance (compile time) concern here at first glance as
we're directly attempting to walk both pointers' constant GEP chains.
However, a couple of thoughts:

1) In the very common case where there is a dynamic pointer and a second
   pointer at a constant offset (usually a stride) from it, this code
   will actually not do any unnecessary work.

2) InstCombine and other passes work very hard to collapse constant
   GEPs, so it will be rare that we iterate here for a long time.

That said, if there remain performance problems here, there are some
obvious things that can improve the situation immensely. Doing
a vectorizer-pass-wide memoizer for each individual layer of pointer
values, their base values, and the constant offset is likely to be able
to completely remove redundant work and strictly limit the scaling of
the work to scrape these GEPs. Since this optimization was not done on
the prior version (which would still benefit from it), I've not done it
here. But if folks have benchmarks that slow down, it should be
straightforward for them to add.

I've added a test case, but I'm not really confident of the amount of
testing done for different access patterns, strides, and pointer
manipulation.

llvm-svn: 189007

1c34afcb
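The check described in this commit (strip the constant offsets from both pointers, then compare bases and offsets) can be sketched on a toy pointer representation. Everything here is hypothetical: `PtrExpr`, `stripConstantOffsets`, and `isConsecutive` are illustrative names, offsets are in bytes, and the SCEV fallback mentioned in the message is not modeled.

```cpp
#include <cstdint>

// Toy pointer expression: a base object (Inner == nullptr), or another
// pointer expression plus a constant byte offset.
struct PtrExpr {
  const PtrExpr *Inner;
  std::int64_t Offset;
};

// Walk the chain of constant offsets, accumulate them in Total, and return
// the underlying base.
inline const PtrExpr *stripConstantOffsets(const PtrExpr *P,
                                           std::int64_t &Total) {
  Total = 0;
  while (P->Inner) {
    Total += P->Offset;
    P = P->Inner;
  }
  return P;
}

// B directly follows A if both share a base and B sits exactly one element
// past A.
inline bool isConsecutive(const PtrExpr *A, const PtrExpr *B,
                          std::int64_t ElemSize) {
  std::int64_t OffA, OffB;
  const PtrExpr *BaseA = stripConstantOffsets(A, OffA);
  const PtrExpr *BaseB = stripConstantOffsets(B, OffB);
  return BaseA == BaseB && OffB - OffA == ElemSize;
}
```

When one pointer is the dynamic base itself, its chain is empty and the walk does no work for it, which matches the first point in the message about the common case.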

Teach LoopVectorize about address space sizes · f599d974

Matt Arsenault authored Aug 22, 2013

llvm-svn: 188980

f599d974

Fixed typo. · 0dc00645

Michael Gottesman authored Aug 21, 2013

llvm-svn: 188957

0dc00645

Removed trailing whitespace. · 0900993c

Michael Gottesman authored Aug 21, 2013

llvm-svn: 188956

0900993c

No functionality change. · 05efa232

Yunzhong Gao authored Aug 21, 2013

Replace "(255 & value)" with "(0xFF & value)" to improve clarity.

llvm-svn: 188941

05efa232

Aug 21, 2013

Teach InstCombine about address spaces · 745101d6

Matt Arsenault authored Aug 21, 2013

llvm-svn: 188926

745101d6

Use attribute helper function · 745832dc

Matt Arsenault authored Aug 21, 2013

llvm-svn: 188916

745832dc

Fix typo · 3c71dabd

Matt Arsenault authored Aug 21, 2013

llvm-svn: 188915

3c71dabd

Move registering the execution of a basic block to the beginning rather than the end. · 707f601f

Bill Wendling authored Aug 20, 2013

There are situations which can affect the correctness (or at least expectation)
of the gcov output. For instance, if a call to __gcov_flush() occurs within a
block before the execution count is registered and then the program aborts in
some way, then that block will not be marked as executed. This is not normally
what the user expects.

If we move the code that's registering when a block is executed to the
beginning, we can catch these types of situations.

PR16893

llvm-svn: 188849

707f601f
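The effect of moving the counter update can be simulated with a tiny model in which a flush copies the live counter to "disk" and the program may die right afterwards. The `Counters` struct and both functions are hypothetical; `flush` merely stands in for `__gcov_flush()`.

```cpp
struct Counters {
  unsigned Live = 0;     // in-memory execution count for one block
  unsigned Flushed = 0;  // what has been written out so far
};

// Stand-in for __gcov_flush(): persist the current in-memory count.
inline void flush(Counters &C) { C.Flushed = C.Live; }

// Runs one block that calls flush() in its middle and returns the count
// that ends up persisted.
inline unsigned recordedCount(bool CountAtBeginning, bool AbortAfterFlush) {
  Counters C;
  if (CountAtBeginning)
    ++C.Live;            // new placement: register execution first
  flush(C);              // flush issued inside the block
  if (AbortAfterFlush)
    return C.Flushed;    // program dies here; nothing else is written
  if (!CountAtBeginning)
    ++C.Live;            // old placement: register execution last
  flush(C);              // normal exit path writes the counters again
  return C.Flushed;
}
```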

Aug 20, 2013

SLPVectorizer: Fix invalid iterator errors · e1f3ab69

Arnold Schwaighofer authored Aug 20, 2013

Update iterator when the SLP vectorizer changes the instructions in the basic
block by restarting the traversal of the basic block.

Patch by Yi Jiang!

Fixes PR 16899.

llvm-svn: 188832

e1f3ab69
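The "restart the traversal when the block changes" pattern can be shown on a plain `std::vector`. This is a toy sketch: the folding transformation and the names `tryFoldAt` and `foldAll` are hypothetical and unrelated to what the SLP vectorizer actually rewrites.

```cpp
#include <cstddef>
#include <vector>

// Toy transformation: fold two adjacent equal values into their sum.
// Returns true if the block was modified.
inline bool tryFoldAt(std::vector<int> &Block, std::size_t I) {
  if (I + 1 < Block.size() && Block[I] == Block[I + 1]) {
    Block[I] += Block[I + 1];
    Block.erase(Block.begin() + I + 1);  // invalidates iterators past I
    return true;
  }
  return false;
}

// Traverse the block; after any modification, restart from the beginning
// instead of continuing with a possibly stale iterator.
inline void foldAll(std::vector<int> &Block) {
  auto It = Block.begin();
  while (It != Block.end()) {
    if (tryFoldAt(Block, It - Block.begin()))
      It = Block.begin();
    else
      ++It;
  }
}
```

Restarting is conservative and may revisit elements, but each successful fold shrinks the block, so the loop terminates.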

Add a llvm.copysign intrinsic · 0c5c01aa

Hal Finkel authored Aug 19, 2013

This adds a llvm.copysign intrinsic; we already have Libfunc recognition for
copysign (which is turned into the FCOPYSIGN SDAG node). In order to
autovectorize calls to copysign in the loop vectorizer, we need a corresponding
intrinsic as well.

In addition to the expected changes to the language reference, the loop
vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into
an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a
few lists in LegalizeVector{Ops,Types} so that vector copysigns can be
expanded.

In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN
be Expand for vector types. This seems correct for all in-tree targets, and I
think is the right thing to do because, previously, there was no way to generate
vector-valued FCOPYSIGN nodes (and most targets don't specify an action for
vector-typed FCOPYSIGN).

llvm-svn: 188728

0c5c01aa
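For reference, this is the kind of source loop the commit has in mind: an element-wise call to `copysign`, which returns the magnitude of its first argument with the sign of its second. The helper name `copysignAll` is hypothetical, and the sketch makes no claim about whether a given compiler vectorizes it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise copysign: magnitude from Mag[i], sign from Sign[i].
inline std::vector<double> copysignAll(const std::vector<double> &Mag,
                                       const std::vector<double> &Sign) {
  assert(Mag.size() == Sign.size() && "inputs must have equal length");
  std::vector<double> Out(Mag.size());
  for (std::size_t I = 0; I < Mag.size(); ++I)
    Out[I] = std::copysign(Mag[I], Sign[I]);
  return Out;
}
```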

Use pop_back_val() instead of both back() and pop_back(). · b4eb6ade

Jakub Staszak authored Aug 19, 2013

llvm-svn: 188723

b4eb6ade
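
The idiom behind this cleanup is to read and remove the last element in one call. The sketch below imitates it for `std::vector`, which has no such member; `popBackVal` and `sumWorklist` are hypothetical helpers written for illustration, not the LLVM container method itself.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Remove the last element and return it, replacing the two-step
// "X = V.back(); V.pop_back();" sequence.
template <typename T> T popBackVal(std::vector<T> &V) {
  assert(!V.empty() && "cannot pop from an empty vector");
  T Val = std::move(V.back());
  V.pop_back();
  return Val;
}

// Typical worklist loop using the combined operation.
inline int sumWorklist(std::vector<int> Worklist) {
  int Sum = 0;
  while (!Worklist.empty())
    Sum += popBackVal(Worklist);
  return Sum;
}
```
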

Teach InstCombine visitGetElementPtr about address spaces · d79f7d9e

Matt Arsenault authored Aug 19, 2013

llvm-svn: 188721

d79f7d9e

Cleanup visitGetElementPtr to make address space change easier · 98f34e3a

Matt Arsenault authored Aug 19, 2013

llvm-svn: 188720

98f34e3a

commonPointerCast cleanups to make address space change easier · 94a028aa

Matt Arsenault authored Aug 19, 2013

llvm-svn: 188719

94a028aa