Commits · b21afc6db4c25808a07afa44dab1c16a222dc795 · Roger Ferrer / llvm-epi-0.8

Jan 07, 2013

Remove unnecessary # tokens at the beginning and end of defm names. · bd62d64c
Craig Topper authored Jan 07, 2013
```
llvm-svn: 171694
```
Fix the enumerator names for ShuffleKind to match the coding standards, · 2109f47d
Chandler Carruth authored Jan 07, 2013
```
and make its comments doxygen comments.

llvm-svn: 171688
```
Make the popcnt support enums and methods have more clear names and · 50a36cd1
Chandler Carruth authored Jan 07, 2013
```
follow the coding conventions regarding enumerating a set of "kinds" of
things.

llvm-svn: 171687
```
Move TargetTransformInfo to live under the Analysis library. This no · d3e73556
Chandler Carruth authored Jan 07, 2013
```
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
```

Switch TargetTransformInfo from an immutable analysis pass that requires · 664e354d

Chandler Carruth authored Jan 07, 2013

a TargetMachine to construct (and thus isn't always available), to an
analysis group that supports layered implementations much like
AliasAnalysis does. This is a pretty massive change, with a few parts
that I was unable to easily separate (sorry), so I'll walk through it.

The first step of this conversion was to make TargetTransformInfo an
analysis group, and to sink the nonce implementations in
ScalarTargetTransformInfo and VectorTargetTransformInfo into
a NoTargetTransformInfo pass. This allows other passes to add a hard
requirement on TTI, and assume they will always get at least one
implementation.

The TargetTransformInfo analysis group leverages the delegation chaining
trick that AliasAnalysis uses, where the base class for the analysis
group delegates to the previous analysis *pass*, allowing all but the
NoFoo analysis passes to only implement the parts of the interfaces they
support. It also introduces a new trick where each pass in the group
retains a pointer to the top-most pass that has been initialized. This
allows passes to implement one API in terms of another API and benefit
when some other pass above them in the stack has more precise results
for the second API.
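
The delegation chaining and the top-most pointer described above can be sketched in miniature. This is an illustrative toy, not the LLVM code: `CostInfo`, `pushOnto`, `NoCostInfo`, `BasicCostInfo`, `TargetCostInfo`, and the cost numbers are all hypothetical names and values chosen for the example.

```cpp
// Hypothetical miniature of an analysis group; none of these names are LLVM's.
class CostInfo {
protected:
  CostInfo *Prev = nullptr; // next implementation down the stack
  CostInfo *Top = this;     // top-most implementation initialized so far

public:
  virtual ~CostInfo() = default;

  // Stack this implementation on top of Below and tell every layer
  // underneath which implementation is now the top-most one.
  void pushOnto(CostInfo *Below) {
    Prev = Below;
    for (CostInfo *P = this; P; P = P->Prev)
      P->Top = this;
  }

  // Default behaviour: delegate to the previous implementation, so a layer
  // only overrides the queries it actually knows something about.
  virtual unsigned getAddCost() const { return Prev->getAddCost(); }
  virtual unsigned getMulCost() const { return Prev->getMulCost(); }
};

// Bottom of the stack: must answer every query, since it has no Prev.
struct NoCostInfo : CostInfo {
  unsigned getAddCost() const override { return 1; }
  unsigned getMulCost() const override { return 1; }
};

// Implements one API in terms of another, asking the top-most layer so that
// a more precise answer from a layer above is picked up automatically.
struct BasicCostInfo : CostInfo {
  unsigned getMulCost() const override { return 4 * Top->getAddCost(); }
};

// A target layer that only refines the add cost (made-up value).
struct TargetCostInfo : CostInfo {
  unsigned getAddCost() const override { return 2; }
};
```

With only `BasicCostInfo` stacked on `NoCostInfo`, the multiply cost is derived from the fallback add cost. Once `TargetCostInfo` is pushed on top, the same `BasicCostInfo` code sees the target's add cost through `Top` without being changed.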

The second step of this conversion is to create a pass that implements
the TargetTransformInfo analysis using the target-independent
abstractions in the code generator. This replaces the
ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
lib/Target with a single pass in lib/CodeGen called
BasicTargetTransformInfo. This class actually provides most of the TTI
functionality, basing it upon the TargetLowering abstraction and other
information in the target independent code generator.

The third step of the conversion adds support to all TargetMachines to
register custom analysis passes. This allows building those passes with
access to TargetLowering or other target-specific classes, and it also
allows each target to customize the set of analysis passes desired in
the pass manager. The baseline LLVMTargetMachine implements this
interface to add the BasicTTI pass to the pass manager, and all of the
tools that want to support target-aware TTI passes call this routine on
whatever target machine they end up with to add the appropriate passes.

The fourth step of the conversion created target-specific TTI analysis
passes for the X86 and ARM backends. These passes contain the custom
logic that was previously in their extensions of the
ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
I separated them into their own file, as now all of the interface bits
are private and they just expose a function to create the pass itself.
Then I extended these target machines to set up a custom set of analysis
passes, first adding BasicTTI as a fallback, and then adding their
customized TTI implementations.

The fourth step required logic that was shared between the target
independent layer and the specific targets to move to a different
interface, as they no longer derive from each other. As a consequence,
helper functions were added to TargetLowering representing the common
logic needed both in the target implementation and the codegen
implementation of the TTI pass. While technically this is the only
change that could have been committed separately, it would have been
a nightmare to extract.

The final step of the conversion was just to delete all the old
boilerplate. This got rid of the ScalarTargetTransformInfo and
VectorTargetTransformInfo classes, all of the support in all of the
targets for producing instances of them, and all of the support in the
tools for manually constructing a pass based around them.

Now that TTI is a relatively normal analysis group, two things become
straightforward. First, we can sink it into lib/Analysis which is a more
natural layer for it to live. Second, clients of this interface can
depend on it *always* being available which will simplify their code and
behavior. These (and other) simplifications will follow in subsequent
commits; this one is clearly big enough.

Finally, I'm very aware that much of the comments and documentation
needs to be updated. As soon as I had this working, and plausibly well
commented, I wanted to get it committed and in front of the build bots.
I'll be doing a few passes over documentation later if it sticks.

Commits to update DragonEgg and Clang will be made presently.

llvm-svn: 171681


Jan 06, 2013

Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si,... · 4f1c7256

Craig Topper authored Jan 06, 2013

Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior.

cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.
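
The suffix rules above can be restated as a small sketch. It is not the X86 assembler's code: `OperandSize`, `cvtsi2SourceSize`, `cvt2siDestSize`, and `printCvtsi2` are hypothetical helpers, and the sketch does not check that an explicit suffix agrees with the register that was actually written.

```cpp
#include <string>

enum class OperandSize { Invalid, Bits32, Bits64 };

// cvtsi2ss / cvtsi2sd: 'l', 'q', or no suffix; no suffix behaves like 'l'.
// A Suffix of '\0' stands for "no suffix was written".
OperandSize cvtsi2SourceSize(char Suffix) {
  switch (Suffix) {
  case '\0':
  case 'l':
    return OperandSize::Bits32;
  case 'q':
    return OperandSize::Bits64;
  default:
    return OperandSize::Invalid;
  }
}

// cvt*2si / cvtt*2si: 'l', 'q', or no suffix; with no suffix the
// destination register size chooses the encoding.
OperandSize cvt2siDestSize(char Suffix, OperandSize DestRegSize) {
  switch (Suffix) {
  case '\0':
    return DestRegSize;
  case 'l':
    return OperandSize::Bits32;
  case 'q':
    return OperandSize::Bits64;
  default:
    return OperandSize::Invalid;
  }
}

// Printing: cvtsi2* always carries a suffix. The cvt*2si forms are printed
// without one, so they need no helper here.
std::string printCvtsi2(const std::string &Base, OperandSize Size) {
  return Base + (Size == OperandSize::Bits64 ? 'q' : 'l');
}
```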

Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.

llvm-svn: 171668


Fix for PR14739. It's not safe to fold a load into a call across a store.... · 3fb03e23

Evan Cheng authored Jan 06, 2013

Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch.

llvm-svn: 171665


Jan 05, 2013

Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions... · 92a70b1e

Craig Topper authored Jan 05, 2013

Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.

llvm-svn: 171608


Revert revision 171524. Original message: · 478b6a47

Nadav Rotem authored Jan 05, 2013

URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603


Move 'break' to the right place to prevent fallthru. There is no test-case · 43fafaf4
Jakub Staszak authored Jan 04, 2013
```
because conditions in the next case prevented it from doing anything nasty.

llvm-svn: 171549
```

Jan 04, 2013

The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a

Preston Gurd authored Jan 04, 2013

returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.
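
As a rough illustration of the padding decision described above, and not the pass itself: the sketch assumes each instruction on the path from function entry to the return has a known cycle count, and that one NOP makes up one missing cycle. Both are simplifying assumptions, and `nopsToInsert` is a hypothetical helper.

```cpp
#include <vector>

// CyclesBeforeRet: assumed per-instruction cycle counts on the path from
// function entry to the return. Returns how many NOPs would bring the path
// up to Threshold cycles, under the one-NOP-per-cycle assumption.
unsigned nopsToInsert(const std::vector<unsigned> &CyclesBeforeRet,
                      unsigned Threshold = 4) {
  unsigned Elapsed = 0;
  for (unsigned Cycles : CyclesBeforeRet) {
    Elapsed += Cycles;
    if (Elapsed >= Threshold)
      return 0; // enough cycles already elapse; no padding needed
  }
  return Threshold - Elapsed;
}
```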

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524


LoopVectorizer: · e1d5c4b8

Nadav Rotem authored Jan 04, 2013

1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.
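
One plausible way to turn the three items above into an unroll-factor choice is sketched here. The formula and the names `selectUnrollFactor`, `LoopInvariantRegs`, `MaxLiveRegs`, and `MaxUnroll` are assumptions made for illustration; the commit message does not spell out the heuristic the vectorizer uses.

```cpp
#include <algorithm>

// NumTargetRegs:     register count reported by the target (item 3).
// LoopInvariantRegs: registers tied up by values live across the whole loop.
// MaxLiveRegs:       estimated peak register pressure of one iteration (item 1).
// Returns an unroll factor between 1 and MaxUnroll (item 2).
unsigned selectUnrollFactor(unsigned NumTargetRegs, unsigned LoopInvariantRegs,
                            unsigned MaxLiveRegs, unsigned MaxUnroll = 4) {
  // Registers left over once loop-invariant values are accounted for.
  unsigned Available =
      NumTargetRegs > LoopInvariantRegs ? NumTargetRegs - LoopInvariantRegs : 0;
  // Guard against a zero pressure estimate.
  unsigned PerIteration = std::max(1u, MaxLiveRegs);
  // How many copies of the loop body fit into the available registers.
  unsigned Factor = Available / PerIteration;
  return std::max(1u, std::min(Factor, MaxUnroll));
}
```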

llvm-svn: 171469


· c616a540

Nadav Rotem authored Jan 04, 2013

Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message:

Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.

llvm-svn: 171468


Jan 03, 2013

Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. · 5f2f06d2
Elena Demikhovsky authored Jan 03, 2013
```
Added a test.

llvm-svn: 171467
```

Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when... · 820aac1c

Michael Gottesman authored Jan 03, 2013

Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks."

This reverts commit r171461 since it breaks the following tests:

Clang :: Analysis/outofbound-notwork.c
Clang :: Analysis/string-fail.c
Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp
Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp
Clang :: CXX/temp/temp.param/p14.cpp
Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp
Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c
Clang :: CodeGen/blocks-2.c
Clang :: CodeGen/libcalls-d.c
Clang :: CodeGen/libcalls-ld.c
Clang :: CodeGenCXX/conversion-function.cpp
Clang :: CodeGenCXX/debug-info-limit-type.cpp
Clang :: CodeGenCXX/inheriting-constructor.cpp
Clang :: FixIt/fixit-errors.c
Clang :: FixIt/fixit-pmem.cpp
Clang :: Modules/namespaces.cpp
Clang :: PCH/changed-files.c
Clang :: PCH/pr4489.c
Clang :: PCH/source-manager-stack.c
Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp
Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp
Clang :: SemaTemplate/instantiate-function-1.mm

llvm-svn: 171466


Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when... · 7c27cc9f

Craig Topper authored Jan 03, 2013

Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.

llvm-svn: 171461


Add a subtype parameter to VTTI::getShuffleCost · 95de3f30

Hal Finkel authored Jan 03, 2013

In order to cost subvector insertion and extraction, we need to know
the type of the subvector being extracted.

No functionality change.

llvm-svn: 171453


Jan 02, 2013

Adds missing aliases for fcom and fcomp instructions without arguments. · 726e0ea6
Kevin Enderby authored Jan 02, 2013
```
Patch by Michael M Kuperstein!

llvm-svn: 171414
```
AVX: Fix a bug in WidenMaskArithmetic. · 761937a7
Nadav Rotem authored Jan 02, 2013
```
llvm-svn: 171398
```

Move all of the header files which are involved in modelling the LLVM IR · 9fb823bb

Chandler Carruth authored Jan 02, 2013

into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366


Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP. · 9791afb1
Craig Topper authored Jan 02, 2013
```
llvm-svn: 171356
```
Merge SSE and AVX instruction definitions for PSHUFD/PSHUFHW/PSHUFLW. · 4bc5c4e1
Craig Topper authored Jan 02, 2013
```
llvm-svn: 171355
```
Revert 171351. It broke MC/X86/x86-32-avx.s. · db1a84c8
Rafael Espindola authored Jan 02, 2013
```
llvm-svn: 171352
```

Jan 01, 2013
- Merge SSE and AVX instruction definitions for scalar forms of SQRT, RSQRT, and RCP. · 86d0cdb8
  Craig Topper authored Jan 01, 2013
```
llvm-svn: 171351
```
- Remove unused argument from a multiclass. · 12ed9cd6
  Craig Topper authored Jan 01, 2013
```
llvm-svn: 171340
```
- Merge intrinsic instruction definitions for SSE and AVX versions of RCPPS and RSQRTPS. · 2edafc05
  Craig Topper authored Jan 01, 2013
```
llvm-svn: 171339
```
- Remove 2 unused multiclasses. · d04dbec6
  Craig Topper authored Jan 01, 2013
```
llvm-svn: 171338
```
- Merge AVX/SSE instruction definitions for SQRTPS/PD, RSQRTPS, RCPPS. No functional change intended. · 7cc4f322
  Craig Topper authored Jan 01, 2013
```
llvm-svn: 171337
```
- Use packed instead of scalar itineraries for SSE1/2 SQRTPS/PD, RCPPS, and... · c2521cd3
  Craig Topper authored Dec 31, 2012
```
Use packed instead of scalar itineraries for SSE1/2 SQRTPS/PD, RCPPS, and RSQRTPS. VEX-encoded forms already use packed.

llvm-svn: 171336
```
Dec 30, 2012
- Use the predicate methods off of AttributeSet instead of Attribute. · 749a43d8
  Bill Wendling authored Dec 30, 2012
```
llvm-svn: 171257
```
- Remove the Function::getRetAttributes method in favor of using the AttributeSet accessor method. · 74dba875
  Bill Wendling authored Dec 30, 2012
```
llvm-svn: 171256
```
- Remove the Function::getFnAttributes method in favor of using the AttributeSet · 698e84fc
  Bill Wendling authored Dec 30, 2012
```
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
```
Dec 29, 2012
- Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to... · fe82eb6b
  Craig Topper authored Dec 29, 2012
```
Remove intrinsic specific instructions for (V)SQRTPS/PD. Instead lower to target-independent ISD nodes and use the existing patterns for those.

llvm-svn: 171237
```
- Merge similar functionality using a nested switch. · f4a9c6e2
  Craig Topper authored Dec 29, 2012
```
llvm-svn: 171229
```
- Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min... · 6b27251a
  Craig Topper authored Dec 29, 2012
```
Remove intrinsic specific instructions for SSE/SSE2/AVX floating point max/min instructions. Lower them to target specific nodes and use those patterns instead. This also allows them to be commuted if UnsafeFPMath is enabled.

llvm-svn: 171227
```
- Simplify code, no functionality change. · 215f9414
  Jakub Staszak authored Dec 29, 2012
```
llvm-svn: 171226
```
Dec 28, 2012
- CostModel: initial checkin for code that estimates the cost of special shuffles. · 9785f519
  Nadav Rotem authored Dec 28, 2012
```
llvm-svn: 171180
```
- wrap 80-col lines. · c982a2dc
  Nadav Rotem authored Dec 28, 2012
```
llvm-svn: 171179
```
- AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these... · 3da9ac72
  Nadav Rotem authored Dec 28, 2012
```
AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend.

llvm-svn: 171178
```
- Reverse the 'if' condition and reduce the indentation. · 68441914
  Nadav Rotem authored Dec 27, 2012
```
llvm-svn: 171172
```