Commits · e7d67f2e01c4a49b1f7821946b4b103a51bcc895 · Roger Ferrer / llvm-epi-0.8

Sep 03, 2013

Teach InstCombineLoadCast about address spaces. · 3dfe54e9

Matt Arsenault authored Sep 03, 2013

This is another one that doesn't matter much,
but uses the right GEP index types in the first
place.

llvm-svn: 189854

3dfe54e9

Use type form of getIntPtrType in alloca visitor. · e38e4cdc

Matt Arsenault authored Sep 03, 2013

This doesn't actually matter, since alloca is always
0 address space, but this is more consistent.

llvm-svn: 189853

e38e4cdc

In this patch we are trying to do two things: · aeb5b46a

Yi Jiang authored Sep 03, 2013

1) If a vectorization candidate list is wider than the vector register, we break it down to fit the vector register.
2) We do not vectorize widths that are not a power of two.

The performance results show that it helps some SPEC benchmarks: mesa improved by 6.97% and ammp by 1.54%.

llvm-svn: 189830

aeb5b46a
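
The two rules in aeb5b46a can be modelled in a few lines. This is only a sketch under stated assumptions, not the patch's code: `splitCandidates`, its parameters, and the choice to leave a non-power-of-two tail scalar are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// True when n is a power of two (n > 0).
static bool isPowerOfTwo(std::size_t n) { return n != 0 && (n & (n - 1)) == 0; }

// Hypothetical helper: split `numElts` candidate scalars of `eltBits` bits
// each into (start, length) chunks that fit a `regBits`-wide vector register.
// Chunks whose length is below 2 or not a power of two are left scalar.
std::vector<std::pair<std::size_t, std::size_t>>
splitCandidates(std::size_t numElts, unsigned eltBits, unsigned regBits) {
  std::vector<std::pair<std::size_t, std::size_t>> chunks;
  if (eltBits == 0 || regBits < 2 * eltBits)
    return chunks; // fewer than two lanes fit: nothing to vectorize
  const std::size_t maxLanes = regBits / eltBits;
  for (std::size_t start = 0; start < numElts; start += maxLanes) {
    const std::size_t len = std::min(maxLanes, numElts - start);
    if (len >= 2 && isPowerOfTwo(len))
      chunks.emplace_back(start, len);
  }
  return chunks;
}
```

With 32-bit elements and a 128-bit register, eight candidates become two chunks of four, while seven candidates yield one chunk of four and a three-element tail that is skipped.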

[msan] Fix handling of select with struct arguments. · e95d37c8
Evgeniy Stepanov authored Sep 03, 2013
```
llvm-svn: 189796
```
e95d37c8

[msan] Fix select instrumentation. · 566f5914

Evgeniy Stepanov authored Sep 03, 2013

Select condition shadow was being ignored resulting in false negatives.
This change OR-s sign-extended condition shadow into the result shadow.

llvm-svn: 189785

566f5914
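
Read as bit operations, the fix in 566f5914 says that the result shadow is the shadow of the selected operand OR-ed with the sign-extended condition shadow. A minimal model over 32-bit values, assuming a shadow bit of 1 means "uninitialized"; `selectShadow` and its parameters are hypothetical names, not MemorySanitizer's own code:

```cpp
#include <cassert>
#include <cstdint>

// Shadow of `cond ? a : b`, given the shadows of all three inputs.
// condShadow is the 1-bit shadow of the condition (true = uninitialized).
std::uint32_t selectShadow(bool cond, bool condShadow,
                           std::uint32_t shadowA, std::uint32_t shadowB) {
  // Shadow of whichever operand the select actually picks.
  const std::uint32_t picked = cond ? shadowA : shadowB;
  // Sign-extend the 1-bit condition shadow to the result width.
  const std::uint32_t condMask = condShadow ? ~std::uint32_t{0} : 0;
  // An uninitialized condition poisons every bit of the result.
  return picked | condMask;
}
```

Dropping the `condMask` term reproduces the false negative described above: a select on an uninitialized condition between two initialized values would report a fully initialized result.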

Aug 31, 2013

SimplifyLibCalls: When emitting an overloaded fp function check that it's available. · 2702caad

Benjamin Kramer authored Aug 31, 2013

The existing code missed some edge cases, e.g. when we're going to emit sqrtf but
only the availability of sqrt was checked. This happens on odd platforms like
Windows.

llvm-svn: 189724

2702caad
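
The point of 2702caad is to check the availability of the variant that will actually be emitted, not of the base function. A toy model of that check, assuming availability is just a set of names; `FPKind`, `fpVariantName`, and `pickLibCall` are hypothetical and do not mirror the SimplifyLibCalls API:

```cpp
#include <cassert>
#include <set>
#include <string>

enum class FPKind { Float, Double, LongDouble };

// Name of the libm variant for the given type, e.g. sqrt -> sqrtf for float.
std::string fpVariantName(const std::string &base, FPKind kind) {
  switch (kind) {
  case FPKind::Float:
    return base + "f";
  case FPKind::LongDouble:
    return base + "l";
  case FPKind::Double:
    break;
  }
  return base;
}

// Returns the function name to emit, or an empty string when the variant
// that would be emitted is missing from the (modelled) target library.
std::string pickLibCall(const std::set<std::string> &available,
                        const std::string &base, FPKind kind) {
  const std::string name = fpVariantName(base, kind);
  return available.count(name) ? name : std::string();
}
```

On a target that provides only `sqrt`, a request for the float variant returns an empty string instead of emitting a call to a `sqrtf` that does not exist.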

Aug 30, 2013
- Compulsive reformatting. · 2865be79
  Bill Wendling authored Aug 30, 2013
```
llvm-svn: 189697
```
  2865be79
- InstCombine: Check for zero shift amounts before subtracting one causing integer overflow. · 010f1083
  Benjamin Kramer authored Aug 30, 2013
```
PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits
(those are always undef because we can't represent integer types that large).

llvm-svn: 189672
```
  010f1083
- Random cleanup: No need to use a std::vector here, since createInternalizePass uses an ArrayRef. · 4c0d9ade
  Bill Wendling authored Aug 30, 2013
```
llvm-svn: 189632
```
  4c0d9ade

Aug 29, 2013

Revert: r189565 - Add getUnrollingPreferences to TTI · 8e83820a

Hal Finkel authored Aug 29, 2013

Revert unintentional commit (of an unreviewed change).

Original commit message:

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189566

8e83820a

Add getUnrollingPreferences to TTI · 63e6c0e9

Hal Finkel authored Aug 29, 2013

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189565

63e6c0e9
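
The hook described in 63e6c0e9 (and reverted by 8e83820a in this log) follows a common pattern: the generic pass fills in defaults, then lets the target adjust them. A self-contained sketch of that pattern; the field names, the placeholder default of 150, and the classes `TargetInfoBase` and `DeepPipelineTarget` are assumptions for illustration, not LLVM's definitions:

```cpp
#include <cassert>

// Illustrative stand-in for the knobs a target might tune.
struct UnrollingPreferences {
  unsigned Threshold = 150; // placeholder default, not taken from the source
  bool Partial = false;     // allow partial unrolling
  bool Runtime = false;     // allow runtime unrolling
};

struct TargetInfoBase {
  virtual ~TargetInfoBase() = default;
  // Default implementation: keep the generic defaults untouched.
  virtual void getUnrollingPreferences(UnrollingPreferences &) const {}
};

// Hypothetical in-order, deep-pipeline target that wants more unrolling.
struct DeepPipelineTarget : TargetInfoBase {
  void getUnrollingPreferences(UnrollingPreferences &UP) const override {
    UP.Partial = true;
    UP.Runtime = true;
  }
};

// What the generic unroller would do: start from defaults, ask the target.
UnrollingPreferences gatherPreferences(const TargetInfoBase &TTI) {
  UnrollingPreferences UP;
  TTI.getUnrollingPreferences(UP);
  return UP;
}
```

Targets that do not override the hook see no change in behavior, which keeps the customization opt-in.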

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC... · 4c459bcd

Nadav Rotem authored Aug 28, 2013

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:
1. They are a kind of canonicalization.
2. The performance measurements show that it is better to keep them in.

There should be no functional change if you are not enabling the LateVectorization mode.

llvm-svn: 189539

4c459bcd

Fix typo. · 38874731
Matt Arsenault authored Aug 28, 2013
```
llvm-svn: 189524
```
38874731

Aug 28, 2013

Disable unrolling in the loop vectorizer when disabled in the pass manager · 6d09904c

Hal Finkel authored Aug 28, 2013

When unrolling is disabled in the pass manager, the loop vectorizer should also
not unroll loops. This will allow the -fno-unroll-loops option in Clang to
behave as expected (even for vectorizable loops). The loop vectorizer's
-force-vector-unroll option will (continue to) override the pass-manager
setting (including -force-vector-unroll=0 to force use of the internal
auto-selection logic).

In order to test this, I added a flag to opt (-disable-loop-unrolling) to force
disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also,
this fixes a small bug in opt where the loop vectorizer was enabled only after
the pass manager populated the queue of passes (the global_alias.ll test needed
a slight update to the RUN line as a result of this fix).

llvm-svn: 189499

6d09904c
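
The precedence rules in 6d09904c can be written as a small decision function. This is a sketch of the described behavior only: `UnrollOption`, `chooseUnrollFactor`, and the use of a factor of 1 to mean "do not unroll" are assumptions, not the vectorizer's code.

```cpp
#include <cassert>

// Models a command-line option such as -force-vector-unroll: whether it
// was given at all, and with which value.
struct UnrollOption {
  bool specified;
  unsigned value;
};

// pmDisablesUnrolling: the pass manager asked for no unrolling.
// autoSelected: the factor the internal auto-selection logic would pick.
unsigned chooseUnrollFactor(bool pmDisablesUnrolling, UnrollOption forced,
                            unsigned autoSelected) {
  if (forced.specified) {
    // An explicit option overrides the pass manager; 0 requests auto-selection.
    return forced.value == 0 ? autoSelected : forced.value;
  }
  // Without an explicit option, the pass-manager setting wins.
  return pmDisablesUnrolling ? 1 : autoSelected;
}
```

The case that changed is the second branch: previously the pass-manager setting was not consulted, so vectorizable loops could still be unrolled under -fno-unroll-loops.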

80 cols · 9b7e2b55
Alexey Samsonov authored Aug 28, 2013
```
llvm-svn: 189473
```
9b7e2b55

DataFlowSanitizer: Implement trampolines for function pointers passed to custom functions. · 28a10aff
Peter Collingbourne authored Aug 27, 2013
```
Differential Revision: http://llvm-reviews.chandlerc.com/D1503

llvm-svn: 189408
```
28a10aff

Aug 27, 2013

Refactor 'vectorizeLoop' no functionality change. · 6b41f7cc

Nadav Rotem authored Aug 27, 2013

This patch merges LoopVectorize of InnerLoopVectorizer and InnerLoopUnroller by adding checks for VF=1. This helps in erasing the Unroller code that is almost identical to the InnerLoopVectorizer code.

llvm-svn: 189391

6b41f7cc

Fixed typo. · eab9a7fa
Michael Gottesman authored Aug 27, 2013
```
Noticed by Stephen Checkoway <s@pahtak.org>.

llvm-svn: 189312
```
eab9a7fa

Fix inserting instructions before last in bundle. · ed9f76d3

Matt Arsenault authored Aug 26, 2013

The builder inserts from before the insert point,
not after, so this would insert before the last
instruction in the bundle instead of after it.

I'm not sure if this can actually be a problem
with any of the current insertions.

llvm-svn: 189285

ed9f76d3
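
The off-by-one in ed9f76d3 has a direct analogue in the standard library: `std::list::insert` also places the new element before the given position. A toy model using strings as instructions; `insertAtLast` and `insertAfterLast` are hypothetical helpers, not the vectorizer's code:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

using InstList = std::list<std::string>;

// Uses the last bundled instruction itself as the insert point, so the new
// instruction lands *before* it (the behavior the commit fixes).
InstList insertAtLast(InstList block, std::size_t lastIdx,
                      const std::string &inst) {
  assert(lastIdx < block.size());
  auto last = std::next(block.begin(), static_cast<std::ptrdiff_t>(lastIdx));
  block.insert(last, inst);
  return block;
}

// Advances one past the last bundled instruction first, so the new
// instruction lands *after* it.
InstList insertAfterLast(InstList block, std::size_t lastIdx,
                         const std::string &inst) {
  assert(lastIdx < block.size());
  auto last = std::next(block.begin(), static_cast<std::ptrdiff_t>(lastIdx));
  block.insert(std::next(last), inst);
  return block;
}
```

With a block `a, b, c` whose bundle ends at `b`, the first helper produces `a, x, b, c` and the second produces `a, b, x, c`.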

LoopVectorize: Implement partial loop unrolling when vectorization is not profitable. · bdc9ff44

Nadav Rotem authored Aug 26, 2013

This patch enables unrolling of loops when vectorization is legal but not profitable.
We add a new class, InnerLoopUnroller, that extends InnerLoopVectorizer and replaces some of the vector-specific logic with scalars.

This patch does not introduce any runtime regressions and improves the following workloads:

SingleSource/Benchmarks/Shootout/matrix -22.64%
SingleSource/Benchmarks/Shootout-C++/matrix -13.06%
External/SPEC/CINT2006/464_h264ref/464_h264ref -3.99%
SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -1.95%

llvm-svn: 189281

bdc9ff44
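
To make "vectorization factor 1 with unrolling" in bdc9ff44 concrete, here is a hand-written version of what such a transformation could do to an integer sum: scalar operations only, but two independent accumulators per iteration. This is an illustration under that assumption, not output of the pass; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Straightforward scalar reduction: one add per iteration, one long
// dependency chain through `acc`.
std::uint64_t sumScalar(const std::vector<std::uint32_t> &v) {
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    acc += v[i];
  return acc;
}

// Same loop unrolled by 2 with scalar operations: two independent
// accumulators, followed by a remainder loop for an odd element count.
std::uint64_t sumUnrolledBy2(const std::vector<std::uint32_t> &v) {
  std::uint64_t acc0 = 0, acc1 = 0;
  const std::size_t n = v.size();
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    acc0 += v[i];
    acc1 += v[i + 1];
  }
  for (; i < n; ++i)
    acc0 += v[i];
  return acc0 + acc1;
}
```

Integer addition is used so that reordering the additions cannot change the result.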

Aug 26, 2013
- test commit. Remove blank line · 7107d415
  Yi Jiang authored Aug 26, 2013
```
llvm-svn: 189265
```
  7107d415
- Fix unused variable in release build · bcd8c577
  Matt Arsenault authored Aug 26, 2013
```
llvm-svn: 189264
```
  bcd8c577
- Constify functions · 8f21c838
  Matt Arsenault authored Aug 26, 2013
```
llvm-svn: 189234
```
  8f21c838
- Vectorize starting from insertelements building a vector · 39274be6
  Matt Arsenault authored Aug 26, 2013
```
llvm-svn: 189233
```
  39274be6

Aug 24, 2013
- Check if in set on insertion instead of separately · 8405888a
  Matt Arsenault authored Aug 24, 2013
```
llvm-svn: 189179
```
  8405888a
- Add a function object to compare the first or second component of a std::pair. · b12cf019
  Benjamin Kramer authored Aug 24, 2013
```
Replace instances of this scattered around the code base.

llvm-svn: 189169
```
  b12cf019
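
A function object of the kind b12cf019 describes can be sketched as follows. The names `LessFirst` and `LessSecond` and the two sorting helpers are hypothetical and chosen for this example; the commit's own spelling is not shown in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Orders pairs by their first component only.
struct LessFirst {
  template <typename T> bool operator()(const T &lhs, const T &rhs) const {
    return lhs.first < rhs.first;
  }
};

// Orders pairs by their second component only.
struct LessSecond {
  template <typename T> bool operator()(const T &lhs, const T &rhs) const {
    return lhs.second < rhs.second;
  }
};

using Entry = std::pair<int, std::string>;

std::vector<Entry> sortByFirst(std::vector<Entry> v) {
  std::sort(v.begin(), v.end(), LessFirst());
  return v;
}

std::vector<Entry> sortBySecond(std::vector<Entry> v) {
  std::sort(v.begin(), v.end(), LessSecond());
  return v;
}
```

Having one shared comparator avoids re-declaring a small local struct at every call site that sorts or searches by a single component.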
Aug 23, 2013

DataFlowSanitizer: correctly combine labels in the case where they are equal. · a96296f3
Peter Collingbourne authored Aug 23, 2013
```
llvm-svn: 189133
```
a96296f3

[msan] Fix handling of va_arg overflow area on x86_64. · d42863cc

Evgeniy Stepanov authored Aug 23, 2013

The code was erroneously reading overflow area shadow from the TLS slot,
bypassing the local copy. Reading shadow directly from TLS is wrong, because
it can be overwritten by a nested vararg call, if that happens before va_start.

llvm-svn: 189104

d42863cc

Turn MipsOptimizeMathLibCalls into a target-independent scalar transform · 37cd6cfb

Richard Sandiford authored Aug 23, 2013

...so that it can be used for z too.  Most of the code is the same.
The only real change is to use TargetTransformInfo to test when a sqrt
instruction is available.

The pass is opt-in because at the moment it only handles sqrt.

llvm-svn: 189097

37cd6cfb

80 cols · 6dae24df
Alexey Samsonov authored Aug 23, 2013
```
llvm-svn: 189091
```
6dae24df

Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale... · 823aaffd

Michael Gottesman authored Aug 23, 2013

Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale to the point of not working and more resilient to debug info changes.

The current version of StripDeadDebugInfo became stale and no longer actually
worked since it was expecting an older version of debug info.

This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as
much as possible to make it more resilient to such changes. Additionally, the
only place where that was avoided (the code where we replace the old sets with
the new), I call verify on the DIContextUnit implying that if the format changes
and my live set changes no longer make sense an assert will be hit. In order to
ensure that that occurs I have included a test case.

The actual stripping of the dead debug info follows the same strategy as was
used before in this class: find the live set and replace the old set in the
given compile unit (which may contain dead global variables/functions) with the
new live one.

llvm-svn: 189078

823aaffd

Aug 22, 2013

DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions,... · 34f0c313

Peter Collingbourne authored Aug 22, 2013

DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions, and vice versa, with wrappers.

Differential Revision: http://llvm-reviews.chandlerc.com/D1442

llvm-svn: 189054

34f0c313

DataFlowSanitizer: Factor the wrapper builder out to buildWrapperFunction. · 761a4fc4
Peter Collingbourne authored Aug 22, 2013
```
Differential Revision: http://llvm-reviews.chandlerc.com/D1441

llvm-svn: 189053
```
761a4fc4

DataFlowSanitizer: Prefix the name of each instrumented function with "dfs$". · 59b1262d

Peter Collingbourne authored Aug 22, 2013

DFSan changes the ABI of each function in the module. This makes it possible
for a function with the native ABI to be called with the instrumented ABI,
or vice versa, thus possibly invoking undefined behavior. A simple way
of statically detecting instances of this problem is to prepend the prefix
"dfs$" to the name of each instrumented-ABI function.

This will not catch every such problem; in particular function pointers passed
across the instrumented-native barrier cannot be used on the other side.
These problems could potentially be caught dynamically.

Differential Revision: http://llvm-reviews.chandlerc.com/D1373

llvm-svn: 189052

59b1262d
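
The renaming scheme in 59b1262d is simple enough to model directly. A minimal sketch, assuming plain string names; `instrumentedName` and `hasInstrumentedABIName` are hypothetical helpers rather than DataFlowSanitizer functions:

```cpp
#include <cassert>
#include <string>

// Prefix applied to every function that uses the instrumented ABI.
const char kDFSanPrefix[] = "dfs$";

// Name under which an instrumented-ABI function is emitted.
std::string instrumentedName(const std::string &name) {
  return std::string(kDFSanPrefix) + name;
}

// True when a symbol name marks an instrumented-ABI function.
bool hasInstrumentedABIName(const std::string &name) {
  return name.compare(0, 4, kDFSanPrefix) == 0;
}
```

Because native code refers to `foo` while the instrumented definition is named `dfs$foo`, a mismatched direct call no longer resolves to the wrong-ABI function by name, which is presumably what makes the problem detectable statically.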

Teach the SLP vectorizer the correct way to check for consecutive access · 1c34afcb

Chandler Carruth authored Aug 22, 2013

using GEPs. Previously, it used a number of different heuristics for
analyzing the GEPs. Several of these were conservatively correct, but
failed to fall back to SCEV even when SCEV might have given a reasonable
answer. One was simply incorrect in how it was formulated.

There was good code already to recursively evaluate the constant offsets
in GEPs, look through pointer casts, etc. In a previous commit I gathered
this into a form that code like the SLP code can use, which allows all of
this code to become quite simple.

There is some performance (compile time) concern here at first glance as
we're directly attempting to walk both pointers' constant GEP chains.
However, a couple of thoughts:

1) In the very common cases where there is a dynamic pointer, and a second
   pointer at a constant offset (usually a stride) from it, this code
   will actually not do any unnecessary work.

2) InstCombine and other passes work very hard to collapse constant
   GEPs, so it will be rare that we iterate here for a long time.

That said, if there remain performance problems here, there are some
obvious things that can improve the situation immensely. Doing
a vectorizer-pass-wide memoizer for each individual layer of pointer
values, their base values, and the constant offset is likely to be able
to completely remove redundant work and strictly limit the scaling of
the work to scrape these GEPs. Since this optimization was not done on
the prior version (which would still benefit from it), I've not done it
here. But if folks have benchmarks that slow down it should be
straightforward for them to add.

I've added a test case, but I'm not really confident of the amount of
testing done for different access patterns, strides, and pointer
manipulation.

llvm-svn: 189007

1c34afcb
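
The approach in 1c34afcb, stripping constant offsets from both pointers and comparing what is left, can be modelled with a toy pointer type. This is a sketch under stated assumptions: `Ptr`, `stripConstantOffsets`, `isConsecutiveAccess`, and `demoConsecutive` are hypothetical, and the SCEV fallback mentioned in the message is omitted.

```cpp
#include <cassert>
#include <cstdint>

// Toy pointer value: a root (base == nullptr) or a constant-offset GEP
// applied to another pointer.
struct Ptr {
  const Ptr *base;
  std::int64_t offset; // in bytes; unused for roots
};

// Walks the chain of constant-offset GEPs, accumulating the total offset,
// and returns the root pointer the chain starts from.
const Ptr *stripConstantOffsets(const Ptr *p, std::int64_t &total) {
  total = 0;
  while (p->base != nullptr) {
    total += p->offset;
    p = p->base;
  }
  return p;
}

// b directly follows a when both share a root and b's accumulated offset
// exceeds a's by exactly one element.
bool isConsecutiveAccess(const Ptr &a, const Ptr &b, std::int64_t eltSize) {
  std::int64_t offA = 0, offB = 0;
  const Ptr *rootA = stripConstantOffsets(&a, offA);
  const Ptr *rootB = stripConstantOffsets(&b, offB);
  return rootA == rootB && offB - offA == eltSize;
}

// Small usage example with 4-byte elements.
bool demoConsecutive() {
  Ptr root{nullptr, 0};
  Ptr a{&root, 8};
  Ptr a2{&a, 4};    // root + 12, reached through two GEPs
  Ptr b{&root, 16}; // root + 16
  return isConsecutiveAccess(a2, b, 4) && !isConsecutiveAccess(a, b, 4) &&
         !isConsecutiveAccess(b, a2, 4);
}
```

When one pointer is the dynamic root and the other sits one constant offset away from it, each walk ends after at most one step, which matches point 1) above.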

Teach LoopVectorize about address space sizes · f599d974
Matt Arsenault authored Aug 22, 2013
```
llvm-svn: 188980
```
f599d974

Fixed typo. · 0dc00645
Michael Gottesman authored Aug 21, 2013
```
llvm-svn: 188957
```
0dc00645

Removed trailing whitespace. · 0900993c
Michael Gottesman authored Aug 21, 2013
```
llvm-svn: 188956
```
0900993c

No functionality change. · 05efa232

Yunzhong Gao authored Aug 21, 2013

Replace "(255 & value)" with "(0xFF & value)" to improve clarity.

llvm-svn: 188941

05efa232

Aug 21, 2013
- Teach InstCombine about address spaces · 745101d6
  Matt Arsenault authored Aug 21, 2013
```
llvm-svn: 188926
```
  745101d6