Commits · 1b3961c0195acc5bca165743a101b2f4e5717edb · Roger Ferrer / llvm-epi

Feb 01, 2015

Fix some bashisms. More information on https://wiki.ubuntu.com/DashAsBinSh .... · 1b3961c0

Sylvestre Ledru authored Feb 01, 2015

Fix some bashisms. More information on https://wiki.ubuntu.com/DashAsBinSh. Reported initially on https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772302 & https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=772301

llvm-svn: 227744

1b3961c0

[multiversion] Kill FunctionTargetTransformInfo, TTI itself is now · 21fc195c
Chandler Carruth authored Feb 01, 2015
```
per-function and supports the exact desired interface.

llvm-svn: 227743
```
21fc195c
[multiversion] Remove the function parameter from the unrolling · ab5cb36c
Chandler Carruth authored Feb 01, 2015
```
preferences interface on TTI now that all of TTI is per-function.

llvm-svn: 227741
```
ab5cb36c

[multiversion] Switch the TTI queries from TargetMachine to Subtarget · c956ab66

Chandler Carruth authored Feb 01, 2015

now that we have a correct and cached subtarget specific to the
function.

Also, finish providing a cached per-function subtarget in the core
LLVMTargetMachine -- that layer hadn't switched over yet.

The only use of the TargetMachine was to re-lookup a subtarget for
a particular function to work around the fact that TTI was immutable.
Now that it is per-function and we have a cached subtarget, use it.

This still leaves a few interfaces with real warts on them where we were
passing Function objects through the TTI interface. I'll remove these
and clean their usage up in subsequent commits now that this isn't
necessary.

llvm-svn: 227738

c956ab66

[multiversion] Remove the cached TargetMachine pointer from the · c340ca83

Chandler Carruth authored Feb 01, 2015

intermediate TTI implementation template and instead query up to the
derived class for both the TargetMachine and the TargetLowering.

Most of the derived types had a TLI cached already and there is no need
to store a less precisely typed target machine pointer.

This will in turn make it much cleaner to look up the TLI via
a per-function subtarget instead of the generic subtarget, and it will
pave the way toward pulling the subtarget used for unroll preferences
into the same form once we are *always* using the function to look up
the correct subtarget.

llvm-svn: 227737

c340ca83

[multiversion] Remove another place we were "handling" nullptr even · 0ef5e391
Chandler Carruth authored Feb 01, 2015
```
though it was never a reasonable input.

llvm-svn: 227736
```
0ef5e391

[multiversion] Switch all of the targets over to use the · 8b04c0d2

Chandler Carruth authored Feb 01, 2015

TargetIRAnalysis access path directly rather than implementing getTTI.

This even removes getTTI from the interface. It's more efficient for
each target to just register a precise callback that creates their
specific TTI.

As part of this, all of the targets which are building their subtargets
individually per-function now build their TTI instance with the function
and thus look up the correct subtarget and cache it. NVPTX, R600, and
XCore currently don't leverage this functionality, but it's trivial for
them to add it now.

llvm-svn: 227735

8b04c0d2

[multiversion] Remove a false freedom to leave the TargetMachine pointer · ee642690

Chandler Carruth authored Feb 01, 2015

null.

For some reason some of the original TTI code supported a null target
machine. This seems to have been legacy, and I made matters worse when
refactoring this code by spreading that pattern further through the
various targets.

The TargetMachine can't actually be null, and it doesn't make sense to
support that use case. I've now consistently removed it and removed all
of the code trying to cope with that situation. This is probably good,
as several targets *didn't* cope with it being null despite the null
default argument in their constructors. =]

llvm-svn: 227734

ee642690
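
Illustrative sketch only, not code from this commit: the cleanup described above amounts to dropping a nullable default argument and the defensive checks that came with it. `TargetMachine` and `CostModel` here are toy stand-ins with hypothetical members.

```cpp
#include <cassert>

// Toy stand-in; the real TargetMachine is far larger.
struct TargetMachine {
  unsigned PointerSizeInBits;
};

// Before (sketch): a nullable pointer with a default, plus a defensive check.
//   explicit CostModel(const TargetMachine *TM = nullptr);
//   unsigned pointerSize() const { return TM ? TM->PointerSizeInBits : 0; }

// After: the pointer is required, and the precondition is stated once.
class CostModel {
  const TargetMachine *TM;

public:
  explicit CostModel(const TargetMachine *TM) : TM(TM) {
    assert(TM && "a target machine is always required");
  }

  // No null handling needed past the constructor.
  unsigned pointerSize() const { return TM->PointerSizeInBits; }
};
```

With the default argument gone, a caller that forgets to pass a target machine fails to compile instead of hitting a null dereference later.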

EarlyCSE: Replace custom hash mixing with Hashing.h · 6ab86b1b
Benjamin Kramer authored Feb 01, 2015
```
Brings it in line with the other hashes in EarlyCSE.

llvm-svn: 227733
```
6ab86b1b
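
Illustrative sketch only, not the EarlyCSE code: the idea is to replace shifting and xor-ing written out at the use site with one shared combining helper. In LLVM that helper is `llvm::hash_combine` from `llvm/ADT/Hashing.h`. The `hashCombine` below is a standard-library stand-in using a Boost-style mixing recipe, and `ExprKey` is a hypothetical key type.

```cpp
#include <cstddef>
#include <functional>

// Stand-in for a shared combining helper. The mixing recipe is the widely
// used Boost-style one; it is NOT LLVM's algorithm.
inline std::size_t hashCombine(std::size_t Seed) { return Seed; }

template <typename T, typename... Ts>
std::size_t hashCombine(std::size_t Seed, const T &V, const Ts &...Rest) {
  Seed ^= std::hash<T>{}(V) + 0x9e3779b9u + (Seed << 6) + (Seed >> 2);
  return hashCombine(Seed, Rest...);
}

// Hypothetical key: an opcode plus two operand ids.
struct ExprKey {
  unsigned Opcode;
  unsigned LHS;
  unsigned RHS;
};

// Before: ad hoc mixing written out by hand.
inline std::size_t customHash(const ExprKey &K) {
  std::size_t H = K.Opcode;
  H = (H << 4) ^ (H >> 28) ^ K.LHS;
  H = (H << 4) ^ (H >> 28) ^ K.RHS;
  return H;
}

// After: delegate the mixing to the shared helper.
inline std::size_t sharedHash(const ExprKey &K) {
  return hashCombine(0, K.Opcode, K.LHS, K.RHS);
}
```

Keeping the mixing in one place means every hash in the pass changes together if the combining function is ever improved.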

[multiversion] Implement the old pass manager's TTI wrapper pass in · 5ec2b1d1

Chandler Carruth authored Feb 01, 2015

terms of the new pass manager's TargetIRAnalysis.

Yep, this is one of the nicer bits of the new pass manager's design.
Passes can in many cases operate in a vacuum and so we can just nest
things when convenient. This is particularly convenient here as I can
now consolidate all of the TargetMachine logic on this analysis.

The most important change here is that this pushes the function we need
TTI for all the way into the TargetMachine, and re-creates the TTI
object for each function rather than re-using it for each function.
We're now prepared to teach the targets to produce function-specific TTI
objects with specific subtargets cached, etc.

One piece of feedback I'd love here is whether it's worth renaming any of
this stuff. None of the names really seem that awesome to me at this
point, but TargetTransformInfoWrapperPass is particularly ... odd.
TargetIRAnalysisWrapper might make more sense. I would want to do that
rename separately anyways, but let me know what you think.

llvm-svn: 227731

5ec2b1d1

[multiversion] Thread a function argument through all the callers of the · fdb9c573

Chandler Carruth authored Feb 01, 2015

getTTI method used to get an actual TTI object.

No functionality changed. This just threads the argument and ensures
code like the inliner can correctly look up the callee's TTI rather than
using a fixed one.

The next change will use this to implement per-function subtarget usage
by TTI. The changes after that should eliminate the need for FTTI as that
will have become the default.

llvm-svn: 227730

fdb9c573

[X86] Convert esp-relative movs of function arguments to pushes, step 2 · bd57186c

Michael Kuperstein authored Feb 01, 2015

This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a
reserved call frame), and perform rudimentary call folding. It still doesn't
have a heuristic, so it is enabled only for optsize/minsize, with stack
alignment <= 8, where it ought to be a fairly clear win.

Differential Revision: http://reviews.llvm.org/D6789

llvm-svn: 227728

bd57186c

[PM] Clean up a stale comment that came from a different pass when · af3c256f
Chandler Carruth authored Feb 01, 2015
```
I created this header.

llvm-svn: 227727
```
af3c256f

[PM] Port SimplifyCFG to the new pass manager. · fdffd87d

Chandler Carruth authored Feb 01, 2015

This should be sufficient to replace the initial (minor) function pass
pipeline in Clang with the new pass manager. I'll probably add an (off
by default) flag to do that just to ensure we can get extra testing.

llvm-svn: 227726

fdffd87d

[PM] Port EarlyCSE to the new pass manager. · e8c686aa

Chandler Carruth authored Feb 01, 2015

I've added RUN lines both to the basic test for EarlyCSE and the
target-specific test, as this serves as a nice test that the TTI layer
in the new pass manager is in fact working well.

llvm-svn: 227725

e8c686aa

[PM] Teach the module-to-function adaptor to not run function passes · 9f8d9b61

Chandler Carruth authored Feb 01, 2015

over declarations.

This is both quite unproductive and causes things to crash, for example
domtree would just assert.

I've added a declaration and a domtree run to the basic high-level tests
for the new pass manager.

llvm-svn: 227724

9f8d9b61
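
Illustrative sketch only, with toy types rather than LLVM's: an adaptor that runs a function pass over a module can skip anything without a body, so analyses that need a body (the message mentions domtree) never see a declaration. `Function`, `Module` and `runOnDefinitions` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for IR types; these are not LLVM's classes.
struct Function {
  std::string Name;
  bool HasBody;
  bool isDeclaration() const { return !HasBody; }
};

struct Module {
  std::vector<Function> Functions;
};

// Runs Pass over every function that has a body and returns how many
// functions were visited. Declarations are skipped.
template <typename PassT>
unsigned runOnDefinitions(Module &M, PassT Pass) {
  unsigned Visited = 0;
  for (Function &F : M.Functions) {
    if (F.isDeclaration())
      continue; // No body: nothing to analyze or transform.
    Pass(F);
    ++Visited;
  }
  return Visited;
}
```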

[PM] Switch to a range-based for loop. NFC · 9973aeed
Chandler Carruth authored Feb 01, 2015
```
llvm-svn: 227723
```
9973aeed
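
Illustrative sketch only, not the loop touched by this commit: a generic before/after of this kind of no-functional-change rewrite. Function names are hypothetical.

```cpp
#include <vector>

// Before: explicit iterator loop.
inline int sumWithIterators(const std::vector<int> &Values) {
  int Sum = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I)
    Sum += *I;
  return Sum;
}

// After: range-based for loop with the same behavior.
inline int sumWithRangeFor(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```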

[PM] Port TTI to the new pass manager, introducing a TargetIRAnalysis to · e038552c

Chandler Carruth authored Feb 01, 2015

produce it.

This adds a function to the TargetMachine that produces this analysis
via a callback for each function. This in turn paves the way to produce
a *different* TTI per-function with the correct subtarget cached.

I've also done the necessary wiring in the opt tool to thread the target
machine down and make it available to the pass registry so that we can
construct this analysis from a target machine when available.

llvm-svn: 227721

e038552c
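
Illustrative sketch only, with toy types: an analysis that holds a callback and invokes it once per function, so each function can get a different result. `IRAnalysis`, `TransformInfo`, `makeTargetIRAnalysis` and the `TargetCPU` field are hypothetical stand-ins, not the interfaces added by this commit.

```cpp
#include <functional>
#include <string>
#include <utility>

// Toy stand-ins; not LLVM's Function or TargetTransformInfo.
struct Function {
  std::string Name;
  std::string TargetCPU;
};

struct TransformInfo {
  unsigned VectorWidth;
};

// An analysis that defers to a callback for each function it is run on.
class IRAnalysis {
  std::function<TransformInfo(const Function &)> Callback;

public:
  explicit IRAnalysis(std::function<TransformInfo(const Function &)> CB)
      : Callback(std::move(CB)) {}

  TransformInfo run(const Function &F) const { return Callback(F); }
};

// Stand-in for the target machine hook that builds the analysis. The callback
// inspects the function, so the result can differ from one function to the next.
inline IRAnalysis makeTargetIRAnalysis() {
  return IRAnalysis([](const Function &F) {
    return TransformInfo{F.TargetCPU == "wide" ? 8u : 4u};
  });
}
```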

[X86] Add a few target specific nodes to 'getTargetNodeName' · 2844ca73
Craig Topper authored Feb 01, 2015
```
llvm-svn: 227720
```
2844ca73
AVX2: Added 2 more tests for gather intrinsics. · 534a99d8
Elena Demikhovsky authored Feb 01, 2015
```
llvm-svn: 227718
```
534a99d8
Removed assert that doesn't typecheck and breaks debug MSVC build. · a691f3e9
Michael Kuperstein authored Feb 01, 2015
```
llvm-svn: 227717
```
a691f3e9

[PM] Refactor the analysis registration and pass pipeline parsing to · 2dc92e90

Chandler Carruth authored Feb 01, 2015

live in a class.

While this isn't really significant right now, I need to expose some
state to the pass construction expressions, and making them get
evaluated within a class context is a nice way to collect members that
they may need to access.

llvm-svn: 227715

2dc92e90

[SeparateConstOffsetFromGEP] skip optnone functions · 6c26bb63
Jingyue Wu authored Feb 01, 2015
```
llvm-svn: 227705
```
6c26bb63
[SeparateConstOffsetFromGEP] set PreservesCFG flag · 6e091c8e
Jingyue Wu authored Feb 01, 2015
```
SeparateConstOffsetFromGEP does not change the shape of the control flow graph.

llvm-svn: 227704
```
6e091c8e

[NVPTX] Emit .pragma "nounroll" for loops marked with nounroll · 0220df0d

Jingyue Wu authored Feb 01, 2015

Summary:
CUDA driver can unroll loops when jit-compiling PTX. To prevent the CUDA
driver from unrolling a loop marked with llvm.loop.unroll.disable, we need
to emit .pragma "nounroll" at the header of that loop.

This patch also extracts getting unroll metadata from loop ID metadata
into a shared helper function.

Test Plan: test/CodeGen/NVPTX/nounroll.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7041

llvm-svn: 227703

0220df0d

Fix PR22393. When recursively replacing an aggregate with a smaller · 152ac396

Adrian Prantl authored Feb 01, 2015

aggregate or scalar, the debug info needs to refer to the absolute offset
(relative to the entire variable) instead of storing the offset inside
the smaller aggregate.

llvm-svn: 227702

152ac396
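
Illustrative sketch only, not the code from this fix: when a piece of a variable is itself split again, the offset recorded for the inner piece is the parent's absolute offset plus the offset inside the parent. `Piece` and `absolutePiece` are hypothetical names, and offsets are in bits.

```cpp
#include <cassert>
#include <cstdint>

// A piece of a variable: where it starts (relative to the whole variable)
// and how large it is, both in bits.
struct Piece {
  std::uint64_t OffsetInBits;
  std::uint64_t SizeInBits;
};

// Given the parent's absolute piece and a child's offset inside that parent,
// return the child's piece with an absolute offset.
inline Piece absolutePiece(const Piece &Parent, std::uint64_t OffsetInParent,
                           std::uint64_t SizeInBits) {
  assert(OffsetInParent + SizeInBits <= Parent.SizeInBits &&
         "child must fit inside its parent");
  return Piece{Parent.OffsetInBits + OffsetInParent, SizeInBits};
}
```

For example, a 32-bit field at offset 32 inside a 64-bit aggregate that itself starts at bit 64 of the variable is described as starting at bit 96, not bit 32.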

Add missing tags. · 02d6f22c
Adrian Prantl authored Feb 01, 2015
```
llvm-svn: 227701
```
02d6f22c
[CMake] LLVMLTO requires Intrinsics.gen since r227685 introduced... · 7ae226df
NAKAMURA Takumi authored Feb 01, 2015
```
[CMake] LLVMLTO requires Intrinsics.gen since r227685 introduced llvm/Analysis/TargetTransformInfo.h.

llvm-svn: 227700
```
7ae226df
[CMake] LLVMTarget requires Intrinsics.gen since r227669 introduced... · d75e0203
NAKAMURA Takumi authored Feb 01, 2015
```
[CMake] LLVMTarget requires Intrinsics.gen since r227669 introduced llvm/Analysis/TargetTransformInfo.h.

llvm-svn: 227699
```
d75e0203
[PM] Remove a bunch of stale TTI creation method declarations. I nuked · d8b3e9a4
Chandler Carruth authored Feb 01, 2015
```
their definitions, but forgot to clean up all the declarations which are
in different files.

llvm-svn: 227698
```
d8b3e9a4
Fix typo · 25f61a6f
Matt Arsenault authored Jan 31, 2015
```
llvm-svn: 227697
```
25f61a6f

Jan 31, 2015

R600/SI: Only select cvt_flr/cvt_rpi with no NaNs. · 08ad328a
Matt Arsenault authored Jan 31, 2015
```
These have different behavior from cvt_i32_f32 on NaN.

llvm-svn: 227693
```
08ad328a

X86: silence a GCC warning · 3475dc35

Saleem Abdulrasool authored Jan 31, 2015

GCC 4.9 gives the following warning:
  warning: enumeral and non-enumeral type in conditional expression
Cast the enumeral value to an integer within the ternary operation.  NFC.

llvm-svn: 227692

3475dc35
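
Illustrative sketch only, not the X86 code in question: the warning quoted above appears when one arm of a conditional expression is an enumerator and the other is a plain integer. Casting the enumerator makes both arms the same type. `Reg` and `pick` are hypothetical names.

```cpp
enum Reg { RegA = 1, RegB = 2 };

// Before (sketch): mixing an enumerator with an unsigned value in ?: is the
// pattern the warning complains about.
//   return UseA ? RegA : Fallback;

// After: cast the enumerator so both arms are unsigned. Behavior is unchanged.
inline unsigned pick(bool UseA, unsigned Fallback) {
  return UseA ? static_cast<unsigned>(RegA) : Fallback;
}
```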

Remove unused variable. · 6253035c

Diego Novillo authored Jan 31, 2015

Summary:
This variable is only used inside an assert. This breaks builds with
asserts disabled.

OK for trunk?

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7314

llvm-svn: 227691

6253035c
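
Illustrative sketch only, not the variable removed here: a local that is read only inside `assert()` becomes unused once `NDEBUG` is defined, which typically produces an unused-variable warning and fails builds that treat warnings as errors. Folding the expression into the assert avoids that. `firstItem` is a hypothetical function.

```cpp
#include <cassert>
#include <vector>

// Before (sketch): Count is only read inside assert(). With NDEBUG defined
// the assert expands to nothing and Count is left unused.
//   std::size_t Count = Items.size();
//   assert(Count > 0 && "expected at least one item");

// After: the check lives entirely inside the assert.
inline int firstItem(const std::vector<int> &Items) {
  assert(!Items.empty() && "expected at least one item");
  return Items.front();
}
```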

Removed a spurious semicolon; NFC · a3bcd37c
Aaron Ballman authored Jan 31, 2015
```
llvm-svn: 227690
```
a3bcd37c

Removed SSE lane blend findCommutedOpIndices overrides. NFCI. · 43fbaada

Simon Pilgrim authored Jan 31, 2015

The default op indices from TargetInstrInfo::findCommutedOpIndices are being commuted so we don't need to do this.

llvm-svn: 227689

43fbaada

[X86][SSE] Shuffle mask decode support for zero extend, scalar float/double... · 9c76b474

Simon Pilgrim authored Jan 31, 2015

[X86][SSE] Shuffle mask decode support for zero extend, scalar float/double moves and integer load instructions

This patch adds shuffle mask decodes for integer zero extends (pmovzx** and movq xmm,xmm) and scalar float/double loads/moves (movss/movsd).

Also adds shuffle mask decodes for integer loads (movd/movq).

Differential Revision: http://reviews.llvm.org/D7228

llvm-svn: 227688

9c76b474

[PM] Switch the TargetMachine interface from accepting a pass manager · 93dcdc47

Chandler Carruth authored Jan 31, 2015

base which it adds a single analysis pass to, to instead return the type
erased TargetTransformInfo object constructed for that TargetMachine.

This removes all of the pass variants for TTI. There is now a single TTI
*pass* in the Analysis layer. All of the Analysis <-> Target
communication is through the TTI's type erased interface itself. While
the diff is large here, it is nothing more than code motion to make
types available in a header file for use in a different source file
within each target.

I've tried to keep all the doxygen comments and file boilerplate in line
with this move, but let me know if I missed anything.

With this in place, the next step to making TTI work with the new pass
manager is to introduce a really simple new-style analysis that produces
a TTI object via a callback into this routine on the target machine.
Once we have that, we'll have the building blocks necessary to accept
a function argument as well.

llvm-svn: 227685

93dcdc47
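
Illustrative sketch only, not LLVM's implementation: a small type-erased wrapper in the spirit of the description above, where unrelated target implementations are hidden behind one concrete type that a hook can return by value. `CostInfo`, `Concept`, `Model`, the two target structs and `makeCostInfo` are hypothetical names.

```cpp
#include <memory>
#include <utility>

class CostInfo {
  // Internal interface; callers never see it.
  struct Concept {
    virtual ~Concept() = default;
    virtual unsigned getAddCost() const = 0;
  };

  // Adapts any implementation type to the internal interface.
  template <typename ImplT> struct Model final : Concept {
    ImplT Impl;
    explicit Model(ImplT I) : Impl(std::move(I)) {}
    unsigned getAddCost() const override { return Impl.getAddCost(); }
  };

  std::unique_ptr<Concept> Self;

public:
  // Any type with a const getAddCost() member can be wrapped.
  template <typename ImplT>
  explicit CostInfo(ImplT Impl) : Self(new Model<ImplT>(std::move(Impl))) {}

  unsigned getAddCost() const { return Self->getAddCost(); }
};

// Two hypothetical target implementations with no common base class.
struct CheapTarget {
  unsigned getAddCost() const { return 1; }
};
struct SlowTarget {
  unsigned getAddCost() const { return 4; }
};

// Stand-in for a target machine hook that returns the erased object.
inline CostInfo makeCostInfo(bool Slow) {
  return Slow ? CostInfo(SlowTarget{}) : CostInfo(CheapTarget{});
}
```

Callers depend only on `CostInfo`, so the concrete target types can stay in target-specific source files.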

[asan][mips] Fix MIPS64 Asan mapping · 9559a5c0
Kumar Sukhani authored Jan 31, 2015
```
llvm-svn: 227684
```
9559a5c0

Replace another std::set in the core of CodeGenRegister, this time with sorted arrays. · be2edf30

Owen Anderson authored Jan 31, 2015

The hot path through this region of code does lots of batch inserts into sets. By storing them as sorted arrays, we can defer the sorting to the end of the batch, which is dramatically more efficient. This reduces tblgen runtime by 25% on my worst-case target.

llvm-svn: 227682

be2edf30
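
Illustrative sketch only, not the CodeGenRegister code: a set stored as a vector that appends during batch inserts and sorts and de-duplicates once at the end of the batch, instead of keeping order on every insert as `std::set` does. `SortedArraySet` is a hypothetical class.

```cpp
#include <algorithm>
#include <vector>

// Set-like container backed by a vector. Batch inserts append unsorted;
// finalize() sorts and removes duplicates once.
class SortedArraySet {
  std::vector<int> Elems;

public:
  void insertBatch(const std::vector<int> &Batch) {
    Elems.insert(Elems.end(), Batch.begin(), Batch.end());
  }

  void finalize() {
    std::sort(Elems.begin(), Elems.end());
    Elems.erase(std::unique(Elems.begin(), Elems.end()), Elems.end());
  }

  // Only valid after finalize(): binary search relies on sorted order.
  bool contains(int V) const {
    return std::binary_search(Elems.begin(), Elems.end(), V);
  }

  const std::vector<int> &elements() const { return Elems; }
};
```

The trade-off is that lookups are only meaningful after `finalize()`, so this fits code that inserts in batches and queries afterwards.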