Commits · c617be559a1e625aa5ce554f93dc89915648df31 · Roger Ferrer / llvm-epi

Aug 17, 2015

Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNM · c617be55
James Molloy authored Aug 17, 2015
```
This is no longer needed - SDAGBuilder will do this for us.

llvm-svn: 245197
```
c617be55

Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder. · ef183397

James Molloy authored Aug 17, 2015

These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted.

For example on AArch32 (V8), we have scalar fminnm but not fmin.

Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with.

llvm-svn: 245196

ef183397

Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. · 3af28945

Karthik Bhat authored Aug 17, 2015

PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was
deleting the next basic block iterator. Fixed the same by resetting the basic block iterator
post call to DeleteDeadInstruction.

llvm-svn: 245195

3af28945

Revert "[InstCombinePHI] Partial simplification of identity operations." · 8ed559ad
David Majnemer authored Aug 17, 2015
```
This reverts commit r244887, it caused PR24470.

llvm-svn: 245194
```
8ed559ad

[PM] Port ScalarEvolution to the new pass manager. · 2f1fd165

Chandler Carruth authored Aug 17, 2015

This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193

2f1fd165

[ADT] Teach FoldingSet to be movable. · b596ba23

Chandler Carruth authored Aug 16, 2015

This is a very minimal move support - it leaves the moved-from object in
a zombie state that is only valid for destruction and move assignment.
This seems fine to me, and leaving it in the default constructed state
would require adding more state to the object and potentially allocating
memory (!!!) and so seems like a Bad Idea.

llvm-svn: 245192

b596ba23

Aug 16, 2015

[TableGen] Use range-based for loop. · 802d3d39
Craig Topper authored Aug 16, 2015
```
llvm-svn: 245191
```
802d3d39
[TableGen] Move the ConversionRow vector into the ConversionTable instead of copying. · c4de7ee7
Craig Topper authored Aug 16, 2015
```
llvm-svn: 245190
```
c4de7ee7
[SimplifyLibCalls] Drop default template args. No functional change. · bb70d751
Benjamin Kramer authored Aug 16, 2015
```
llvm-svn: 245189
```
bb70d751
[IR] Simplify code. No functionality change. · dc1d1cbd
Benjamin Kramer authored Aug 16, 2015
```
llvm-svn: 245188
```
dc1d1cbd

transform fmin/fmax calls when possible (PR24314) · 57fd1dc5

Sanjay Patel authored Aug 16, 2015

If we can ignore NaNs, fmin/fmax libcalls can become compare and select
(this is what we turn std::min / std::max into).

This IR should then be optimized in the backend to whatever is best for
any given target. Eg, x86 can use minss/maxss instructions.

This should solve PR24314:
https://llvm.org/bugs/show_bug.cgi?id=24314

Differential Revision: http://reviews.llvm.org/D11866

llvm-svn: 245187

57fd1dc5

Add 2nd test case for sdiv/srem instructions in a SCEV · 3278b7cd
Tobias Grosser authored Aug 16, 2015
```
llvm-svn: 245186
```
3278b7cd
Try to fix the lldb build on Visual C++. · fe16dea6
Yaron Keren authored Aug 16, 2015
```
llvm-svn: 245185
```
fe16dea6
Enable passing test on Windows + MSYS. · e0e6e5e1
Yaron Keren authored Aug 16, 2015
```
llvm-svn: 245184
```
e0e6e5e1
[LSR][NFC] Don’t duplicate entity name at the beginning of the comment. · 94c4aecf
Sanjoy Das authored Aug 16, 2015
```
llvm-svn: 245183
```
94c4aecf
[LSR][NFC] Use camelCase for method names in Formula and RegUseTracker. · 302bfd04
Sanjoy Das authored Aug 16, 2015
```
llvm-svn: 245182
```
302bfd04
use SDValue bool operator; NFCI · 3ab4a73b
Sanjay Patel authored Aug 16, 2015
```
llvm-svn: 245181
```
3ab4a73b
[FIX] Add XFAIL to crashing test case · eca5282d
Johannes Doerfert authored Aug 16, 2015
```
llvm-svn: 245180
```
eca5282d

Build the ScopStmt domain in-place. · 45545ff7

Johannes Doerfert authored Aug 16, 2015

  This will build the statement domains in-place, hence using the
  ScopStmt::Domain member instead of some intermediate isl_set.

llvm-svn: 245179

45545ff7

Add a crashing test case for the scalar code generation · c594dc9e

Johannes Doerfert authored Aug 16, 2015

  This test case crashes the scalar code generation as we are not
  consistent with the usage of the assumed context. To be precise, we
  use the assumed context for the dependence analysis but not to
  restrict the domains of the statements.

  A step by step explanation of the problem is given in the test case.

llvm-svn: 245176

c594dc9e

Add -polly-context option to provide additional context information · 8a9c2353

Tobias Grosser authored Aug 16, 2015

This option allows the user to provide additional information about parameter
values as an isl_set. To specify that N has the value 1024, we can provide
the context -polly-context='[N] -> {: N = 1024}'.

llvm-svn: 245175

8a9c2353

Remove trivially true condition · ddb83d0f
Johannes Doerfert authored Aug 16, 2015
```
llvm-svn: 245174
```
ddb83d0f
Add missing include guard. · 178c4652
Yaron Keren authored Aug 16, 2015
```
llvm-svn: 245173
```
178c4652

Revert "Add support for cross block dse. This patch enables dead stroe... · e04443ba

David Majnemer authored Aug 16, 2015

Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks."

This reverts commit r245025, it caused PR24469.

llvm-svn: 245172

e04443ba

[InstCombine] Replace an and+icmp with a trunc+icmp · dfa3b095

David Majnemer authored Aug 16, 2015

Bitwise arithmetic can obscure a simple sign-test.  If replacing the
mask with a truncate is preferable if the type is legal because it
permits us to rephrase the comparison more explicitly.

llvm-svn: 245171

dfa3b095

Revert r244127: [PM] Remove a failed attempt to port the CallGraph · 5efd530c

Chandler Carruth authored Aug 16, 2015

analysis ...

It turns out that we *do* need the old CallGraph ported to the new pass
manager. There are times where this model of a call graph is really
superior to the one provided by the LazyCallGraph. For example,
GlobalsModRef very specifically needs the model provided by CallGraph.

While here, I've tried to make the move semantics actually work. =]

llvm-svn: 245170

5efd530c

[X86] Widen the 'AND' mask if doing so shrinks the encoding size · 1a59e49f

David Majnemer authored Aug 16, 2015

We can set additional bits in a mask given that we know the other
operand of an AND already has some bits set to zero.  This can be more
efficient if doing so allows us to use an instruction which implicitly
sign extends the immediate.

This fixes PR24085.

Differential Revision: http://reviews.llvm.org/D11289

llvm-svn: 245169

1a59e49f

MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting. · 5196275e
NAKAMURA Takumi authored Aug 16, 2015
```
Don't assume second would be ordered in the module.

llvm-svn: 245168
```
5196275e

Aug 15, 2015

Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890 · dfb655fe

Yaron Keren authored Aug 15, 2015

ByteSize and BitSize should not be size_t but unsigned, considering

1) They are at most 2^16 and 2^19, respectively.
2) BitSize is an argument to Type::getIntNTy which takes unsigned.

Also, use the correct utostr instead itostr and cache the string result.

Thanks to James Touton for reporting this!

llvm-svn: 245167

dfb655fe

[x86] enable machine combiner reassociations for scalar single-precision minimums · 40d4eb40
Sanjay Patel authored Aug 15, 2015
```
llvm-svn: 245166
```
40d4eb40
Updated broadcast stack folding test to avoid use of broadcast intrinsics. · d65ace84
Simon Pilgrim authored Aug 15, 2015
```
llvm-svn: 245165
```
d65ace84
fix typos; NFC · 3b7e3677
Sanjay Patel authored Aug 15, 2015
```
llvm-svn: 245164
```
3b7e3677
add test case to show current codegen · 9f6c7ddd
Sanjay Patel authored Aug 15, 2015
```
llvm-svn: 245163
```
9f6c7ddd
[Sema] Be consistent about diagnostic wording: always use "cannot". · 32cbff78
Davide Italiano authored Aug 15, 2015
```
Discussed with Richard Smith.

llvm-svn: 245162
```
32cbff78

Silence VS2015 warning. · 8b2a031c

Yaron Keren authored Aug 15, 2015

Patch by James Touton!

http://reviews.llvm.org/D11890

llvm-svn: 245161

8b2a031c

[DAGCombiner] Attempt to mask vectors before zero extension instead of after. · 0750c846

Simon Pilgrim authored Aug 15, 2015

For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.

(zext (truncate x)) -> (zext (and(x, m))

Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.

This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.

Differential Revision: http://reviews.llvm.org/D11764

llvm-svn: 245160

0750c846

AST Generation Paper published in TOPLAS · 234a4827

Tobias Grosser authored Aug 15, 2015

The July issue of TOPLAS contains a 50 page discussion of the AST generation
techniques used in Polly. This discussion gives not only an in-depth
description of how we (re)generate an imperative AST from our polyhedral based
mathematical program description, but also gives interesting insights about:

- Schedule trees: A tree-based mathematical program description that enables us
to perform loop transformations on an abstract level, while issues like the
generation of the correct loop structure and loop bounds will be taken care of
by our AST generator.

- Polyhedral unrolling: We discuss techniques that allow the unrolling of
non-trivial loops in the context of parameteric loop bounds, complex tile
shapes and conditionally executed statements. Such unrolling support enables
the generation of predicated code e.g. in the context of GPGPU computing.

- Isolation for full/partial tile separation: We discuss native support for
handling full/partial tile separation and -- in general -- native support for
isolation of boundary cases to enable smooth code generation for core
computations.

- AST generation with modulo constraints: We discuss how modulo mappings are
lowered to efficient C/LLVM code.

- User-defined constraint sets for run-time checks We discuss how arbitrary
sets of constraints can be used to automatically create run-time checks that
ensure a set of constrainst actually hold. This feature is very useful to
verify at run-time various assumptions that have been taken program
optimization.

Polyhedral AST generation is more than scanning polyhedra
Tobias Grosser, Sven Verdoolaege, Albert Cohen
ACM Transations on Programming Languages and Systems (TOPLAS), 37(4), July 2015

llvm-svn: 245157

234a4827

Update link to Polly paper · 4c455425

Tobias Grosser authored Aug 15, 2015

By going through my personal website, people can go directly to the paper.

llvm-svn: 245156

4c455425

[PM/AA] Delete the LibCallAliasAnalysis and all the associated · e8824e30

Chandler Carruth authored Aug 15, 2015

infrastructure.

This AA was never used in tree. It's infrastructure also completely
overlaps that of TargetLibraryInfo which is used heavily by BasicAA to
achieve similar goals to those stated for this analysis.

As has come up in several discussions, the use case here is still really
important, but this code isn't helping move toward that use case. Any
progress on better supporting rich AA information for runtime library
environments would likely be better off starting from scratch or
starting from TargetLibraryInfo than from this base.

Differential Revision: http://reviews.llvm.org/D12028

llvm-svn: 245155

e8824e30

Tiny cleanup: move some Triple variables up to the top of the · 2db38f33
James Y Knight authored Aug 15, 2015
```
function, and remove a duplicate var.

llvm-svn: 245154
```
2db38f33