Commits · d60589ad39a73f8de2f9f0a4e5b78ef7b1a8f8f9 · Roger Ferrer / llvm-epi-0.8

Jan 11, 2013
- Update patch for the pad short functions pass for Intel Atom (only). · 99c69904
  Preston Gurd authored Jan 11, 2013
```
Adds a check for -Oz, changes the code to not re-visit BBs,
and skips over DBG_VALUE instrs.

Patch by Andy Zhang.

llvm-svn: 172258
```
  99c69904
- X86AsmParser.cpp: Fix up r172148, to add initializer in another CreateMem(). · 7f254276
  NAKAMURA Takumi authored Jan 11, 2013
```
llvm-svn: 172157
```
  7f254276
- Remove heavy and unused #includes from X86TargetObjectFile.cpp. · ab3d878f
  Jakub Staszak authored Jan 10, 2013
```
llvm-svn: 172151
```
  ab3d878f
- [ms-inline asm] Make sure we set a default value for AddressOf. Follow on to · 8c2a9c74
  Chad Rosier authored Jan 10, 2013
```
r172121.

llvm-svn: 172148
```
  8c2a9c74
Jan 10, 2013
- [ms-inline asm] Add support for calling functions from inline assembly. · a4bc9437
  Chad Rosier authored Jan 10, 2013
```
Part of rdar://12991541

llvm-svn: 172121
```
  a4bc9437
Jan 09, 2013

Fix description of ARMOperand · 5459754d
Joel Jones authored Jan 09, 2013
```
llvm-svn: 172011
```
5459754d

ARM Cost model: Use the size of vector registers and widest vectorizable... · b1791a75

Nadav Rotem authored Jan 09, 2013

ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor.

llvm-svn: 172010

b1791a75

PowerPC: EH adjustments · 1ae2248e

Adhemerval Zanella authored Jan 09, 2013

 
This patch adjusts r171506 to make all DWARF encoding pc-relative
for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT
(since the eh_frame will not generate PIC-relative relocation) and also
adds the emission of stubs created by the TTypeEncoding.

llvm-svn: 171979

1ae2248e

Efficient lowering of vector sdiv when the divisor is a splatted power of two constant. · 977e0be4

Nadav Rotem authored Jan 09, 2013

PR 14848. The lowered sequence is based on the existing sequence the target-independent
DAG Combiner creates for the scalar case.

Patch by Zvi Rackover.

llvm-svn: 171953

977e0be4
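
The scalar sequence this message refers to can be sketched roughly as below, for a 32-bit signed division by 2^k that rounds toward zero. This is an illustrative sketch, not the code from the patch: the name `sdivPow2` is hypothetical, and the sketch assumes that `>>` on a negative `int32_t` is an arithmetic shift and that unsigned-to-signed conversion wraps (both guaranteed from C++20, and the usual behavior before that). A vector lowering would presumably apply the same steps to every lane.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: signed division of x by 2^k (1 <= k <= 31),
// rounding toward zero, without a divide instruction.
int32_t sdivPow2(int32_t x, unsigned k) {
  assert(k >= 1 && k <= 31);
  // All ones if x is negative, zero otherwise.
  // Assumes >> on a negative int32_t is an arithmetic shift.
  int32_t sign = x >> 31;
  // Bias is 2^k - 1 for negative x and 0 for non-negative x.
  uint32_t bias = static_cast<uint32_t>(sign) >> (32 - k);
  // Add the bias in unsigned arithmetic to avoid signed overflow.
  int32_t biased = static_cast<int32_t>(static_cast<uint32_t>(x) + bias);
  // The arithmetic shift now rounds toward zero instead of toward -infinity.
  return biased >> k;
}
```

Without the bias, `-7 >> 1` would give -4; with it, `sdivPow2(-7, 1)` gives -3, matching `-7 / 2` in C++.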

Last in the series of removing unnecessary '0' arguments for · bf7bc496
Eric Christopher authored Jan 09, 2013
```
address space. Reordered the EmitULEB128IntValue arguments to
make this easier.

llvm-svn: 171949
```
bf7bc496

MIsched: add an ILP window property to machine model. · 9f0b95f2

Andrew Trick authored Jan 09, 2013

This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.

I converted some in-order scheduling tests to A2. Hal is working on
more test cases.

llvm-svn: 171946

9f0b95f2

These functions have default arguments of 0 for the last arg. Use · e3ab3d0e
Eric Christopher authored Jan 09, 2013
```
them.

llvm-svn: 171933
```
e3ab3d0e
Cost Model: Move the 'max unroll factor' variable to the TTI and add initial... · b696c36f
Nadav Rotem authored Jan 09, 2013
```
Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM.

llvm-svn: 171928
```
b696c36f

Jan 08, 2013

This patch produces the correct addend value for · c3dd91c4
Jack Carter authored Jan 08, 2013
```
an R_MIPS_GPREL16 relocation.


Contributor: Jack Carter
llvm-svn: 171882
```
c3dd91c4

This patch produces the correct pointer size · 9e28cd3f

Jack Carter authored Jan 08, 2013

value in the 64 bit .eh_frame section.

It doesn't however allow exception handling to work
yet since it depends on the correct relocation model
being set in the ELF header flags.


Contributor: Jack Carter
llvm-svn: 171881

9e28cd3f

Pad Short Functions for Intel Atom · a01daace

Preston Gurd authored Jan 08, 2013

The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879

a01daace
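
The padding decision described in this message can be sketched as a small function. This is only an illustration under stated assumptions, not the X86PadShortFunction implementation: the names `Threshold` and `nopsToInsert` are hypothetical, and the sketch assumes each NOP accounts for exactly one cycle, which the real pass may not.

```cpp
#include <cassert>

// Hypothetical threshold taken from the message: pad when fewer than
// four cycles elapse between function entry and return.
constexpr unsigned Threshold = 4;

// Returns how many NOPs to place before a return instruction.
// Assumes one NOP per missing cycle; the real pass may count differently.
unsigned nopsToInsert(unsigned CyclesBeforeRet, unsigned OptLevel,
                      bool OptForSize) {
  // Per the review notes: only at >= O1, and never when optimizing for size.
  if (OptLevel < 1 || OptForSize)
    return 0;
  // Long enough paths need no padding.
  if (CyclesBeforeRet >= Threshold)
    return 0;
  return Threshold - CyclesBeforeRet;
}
```

Under these assumptions, a return reached after one cycle at -O2 would get three NOPs, while the same return under -Os would get none.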

Renamed MCInstFragment to MCRelaxableFragment and added some comments. · 4d9ada03
Eli Bendersky authored Jan 08, 2013
```
No change in functionality.

llvm-svn: 171822
```
4d9ada03

Jan 07, 2013

ARM: Copy-paste error. · 9dbf3ee9
Jim Grosbach authored Jan 07, 2013
```
llvm-svn: 171790
```
9dbf3ee9
ARM: Fix a few copy-paste errors. · 553eb756
Jim Grosbach authored Jan 07, 2013
```
s/X86/ARM/

llvm-svn: 171789
```
553eb756

This patch addresses bug 14678 by fixing two problems in medium code model · 9b1e3e25

Bill Schmidt authored Jan 07, 2013

code generation.  Variables addressed through a GlobalAlias were not being
handled, and variables with available_externally linkage were treated
incorrectly.  The patch contains two new tests to verify the correct code
generation for these cases.

llvm-svn: 171778

9b1e3e25

Change SMRange to be half-open (exclusive end) instead of closed (inclusive) · e8f1eaea

Jordan Rose authored Jan 07, 2013

This is necessary not only for representing empty ranges, but for handling
multibyte characters in the input. (If the end pointer in a range refers to
a multibyte character, should it point to the beginning or the end of the
character in a char array?) Some of the code in the asm parsers was already
assuming this anyway.

llvm-svn: 171765

e8f1eaea
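
A minimal sketch of why an exclusive end helps, using a hypothetical `HalfOpenRange` type rather than the real `llvm::SMRange`: an empty range is simply `Start == End`, and a range ending in a multibyte character has an unambiguous end pointer one past the character's last byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a source range; not the real llvm::SMRange.
struct HalfOpenRange {
  const char *Start; // first byte in the range
  const char *End;   // one past the last byte in the range

  bool empty() const { return Start == End; }
  std::size_t size() const { return static_cast<std::size_t>(End - Start); }
  std::string str() const { return std::string(Start, End); }
};

// Builds the range covering Len bytes starting at Offset in Buf.
HalfOpenRange makeRange(const char *Buf, std::size_t Offset, std::size_t Len) {
  return HalfOpenRange{Buf + Offset, Buf + Offset + Len};
}
```

For the buffer `"a\xC3\xA9z"` (an 'a', a two-byte UTF-8 'é', and a 'z'), `makeRange(Buf, 1, 2)` covers exactly the two bytes of 'é', and `makeRange(Buf, 1, 0)` is an empty range at the same position. A closed range could represent neither case without a special convention.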

R600/SIISelLowering.cpp: Suppress a warning. [-Wunused-variable] · 458a8277
NAKAMURA Takumi authored Jan 07, 2013
```
llvm-svn: 171728
```
458a8277

Add LICENSE.TXT covering contributions made by ARM. · 2883da3b

Tim Northover authored Jan 07, 2013

Absent a Contributor's License Agreement (CLA) with an LLVM legal entity and as
reviewed and agreed with Chris Lattner, add a patent license covering future
contributions from ARM until there is a CLA. This is to make explicit ARM's
grant of patent rights to recipients of LLVM containing ARM-contributed
material.

llvm-svn: 171721

2883da3b

Remove more unnecessary # operators with nothing to paste preceding them. · ae65212a
Craig Topper authored Jan 07, 2013
```
llvm-svn: 171702
```
ae65212a

Remove # from the beginning and end of def names. The # is a paste operator... · a8c5ec09

Craig Topper authored Jan 07, 2013

Remove # from the beginning and end of def names. The # is a paste operator and should only be used with something to paste on either side.

llvm-svn: 171697

a8c5ec09

Remove # from the beginning and end of def names. · 25cdf92b
Craig Topper authored Jan 07, 2013
```
llvm-svn: 171696
```
25cdf92b
Remove unnecessary # tokens at the beginning and end of defm names. · bd62d64c
Craig Topper authored Jan 07, 2013
```
llvm-svn: 171694
```
bd62d64c
Fix the enumerator names for ShuffleKind to match the coding standards, · 2109f47d
Chandler Carruth authored Jan 07, 2013
```
and make its comments doxygen comments.

llvm-svn: 171688
```
2109f47d
Make the popcnt support enums and methods have more clear names and · 50a36cd1
Chandler Carruth authored Jan 07, 2013
```
follow the coding conventions regarding enumerating a set of "kinds" of
things.

llvm-svn: 171687
```
50a36cd1
Move TargetTransformInfo to live under the Analysis library. This no · d3e73556
Chandler Carruth authored Jan 07, 2013
```
longer would violate any dependency layering and it is in fact an
analysis. =]

llvm-svn: 171686
```
d3e73556

Switch TargetTransformInfo from an immutable analysis pass that requires · 664e354d

Chandler Carruth authored Jan 07, 2013

a TargetMachine to construct (and thus isn't always available), to an
analysis group that supports layered implementations much like
AliasAnalysis does. This is a pretty massive change, with a few parts
that I was unable to easily separate (sorry), so I'll walk through it.

The first step of this conversion was to make TargetTransformInfo an
analysis group, and to sink the nonce implementations in
ScalarTargetTransformInfo and VectorTargetTransformInfo into
a NoTargetTransformInfo pass. This allows other passes to add a hard
requirement on TTI, and assume they will always get at least one
implementation.

The TargetTransformInfo analysis group leverages the delegation chaining
trick that AliasAnalysis uses, where the base class for the analysis
group delegates to the previous analysis *pass*, allowing all but the
NoFoo analysis passes to only implement the parts of the interfaces they
support. It also introduces a new trick where each pass in the group
retains a pointer to the top-most pass that has been initialized. This
allows passes to implement one API in terms of another API and benefit
when some other pass above them in the stack has more precise results
for the second API.

The second step of this conversion is to create a pass that implements
the TargetTransformInfo analysis using the target-independent
abstractions in the code generator. This replaces the
ScalarTargetTransformImpl and VectorTargetTransformImpl classes in
lib/Target with a single pass in lib/CodeGen called
BasicTargetTransformInfo. This class actually provides most of the TTI
functionality, basing it upon the TargetLowering abstraction and other
information in the target independent code generator.

The third step of the conversion adds support to all TargetMachines to
register custom analysis passes. This allows building those passes with
access to TargetLowering or other target-specific classes, and it also
allows each target to customize the set of analysis passes desired in
the pass manager. The baseline LLVMTargetMachine implements this
interface to add the BasicTTI pass to the pass manager, and all of the
tools that want to support target-aware TTI passes call this routine on
whatever target machine they end up with to add the appropriate passes.

The fourth step of the conversion created target-specific TTI analysis
passes for the X86 and ARM backends. These passes contain the custom
logic that was previously in their extensions of the
ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces.
I separated them into their own file, as now all of the interface bits
are private and they just expose a function to create the pass itself.
Then I extended these target machines to set up a custom set of analysis
passes, first adding BasicTTI as a fallback, and then adding their
customized TTI implementations.

The fourth step required logic that was shared between the target
independent layer and the specific targets to move to a different
interface, as they no longer derive from each other. As a consequence,
helper functions were added to TargetLowering representing the common
logic needed both in the target implementation and the codegen
implementation of the TTI pass. While technically this is the only
change that could have been committed separately, it would have been
a nightmare to extract.

The final step of the conversion was just to delete all the old
boilerplate. This got rid of the ScalarTargetTransformInfo and
VectorTargetTransformInfo classes, all of the support in all of the
targets for producing instances of them, and all of the support in the
tools for manually constructing a pass based around them.

Now that TTI is a relatively normal analysis group, two things become
straightforward. First, we can sink it into lib/Analysis which is a more
natural layer for it to live. Second, clients of this interface can
depend on it *always* being available which will simplify their code and
behavior. These (and other) simplifications will follow in subsequent
commits; this one is clearly big enough.

Finally, I'm very aware that much of the comments and documentation
needs to be updated. As soon as I had this working, and plausibly well
commented, I wanted to get it committed and in front of the build bots.
I'll be doing a few passes over documentation later if it sticks.

Commits to update DragonEgg and Clang will be made presently.

llvm-svn: 171681

664e354d
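
The delegation chaining and the "top-most pass" pointer described in this message can be sketched in miniature. All names here (`CostInfo`, `NoCostInfo`, `TargetCostInfo`, `chainAbove`, `chainedMulCost`) and the cost values are hypothetical; this is not the TargetTransformInfo API, only an illustration of the pattern under those assumptions.

```cpp
#include <cassert>

// Hypothetical miniature of a chained analysis group.
class CostInfo {
  CostInfo *Prev = nullptr; // previous (less specific) pass in the chain
  CostInfo *Top = this;     // top-most pass initialized so far

public:
  virtual ~CostInfo() = default;

  // Stacks this pass above Below and tells the whole chain about the new top.
  void chainAbove(CostInfo &Below) {
    Prev = &Below;
    for (CostInfo *P = this; P; P = P->Prev)
      P->Top = this;
  }

  // Default behavior: delegate to the previous pass in the chain.
  virtual unsigned addCost() const { return Prev->addCost(); }
  virtual unsigned mulCost() const { return Prev->mulCost(); }

protected:
  const CostInfo &top() const { return *Top; }
};

// Bottom of the chain: must implement every query itself.
struct NoCostInfo : CostInfo {
  unsigned addCost() const override { return 1; }
  // Implemented in terms of addCost() on the top-most pass, so a more
  // precise addCost() higher in the stack improves this result too.
  unsigned mulCost() const override { return 4 * top().addCost(); }
};

// A target pass that only knows about addCost().
struct TargetCostInfo : CostInfo {
  unsigned addCost() const override { return 2; }
  // mulCost() is not overridden, so it delegates to the pass below.
};

// Builds a two-pass chain and queries mulCost() through the target pass.
unsigned chainedMulCost() {
  NoCostInfo Base;
  TargetCostInfo Target;
  Target.chainAbove(Base);
  return Target.mulCost();
}
```

In this sketch `chainedMulCost()` returns 8 rather than 4: the target pass delegates `mulCost()` downward, and the base pass calls back up to the target's more precise `addCost()`.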

Jan 06, 2013

Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si,... · 4f1c7256

Craig Topper authored Jan 06, 2013

Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior.

cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.

Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.

llvm-svn: 171668

4f1c7256

Fix for PR14739. It's not safe to fold a load into a call across a store.... · 3fb03e23

Evan Cheng authored Jan 06, 2013

Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch.

llvm-svn: 171665

3fb03e23

Jan 05, 2013

Convert the TargetTransformInfo from an immutable pass with dynamic · 539edf4e

Chandler Carruth authored Jan 05, 2013

interfaces which could be extracted from it, and must be provided on
construction, to a chained analysis group.

The end goal here is that TTI works much like AA -- there is a baseline
"no-op" and target independent pass which is in the group, and each
target can expose a target-specific pass in the group. These passes will
naturally chain allowing each target-specific pass to delegate to the
generic pass as needed.

In particular, this will allow a much simpler interface for passes that
would like to use TTI -- they can have a hard dependency on TTI and it
will just be satisfied by the stub implementation when that is all that
is available.

This patch is a WIP however. In particular, the "stub" pass is actually
the one and only pass, and everything there is implemented by delegating
to the target-provided interfaces. As a consequence the tools still have
to explicitly construct the pass. Switching targets to provide custom
passes and sinking the stub behavior into the NoTTI pass is the next
step.

llvm-svn: 171621

539edf4e

Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions... · 92a70b1e

Craig Topper authored Jan 05, 2013

Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.

llvm-svn: 171608

92a70b1e

Revert revision 171524. Original message: · 478b6a47

Nadav Rotem authored Jan 05, 2013

URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603

478b6a47

Refactor the ScalarTargetTransformInfo API for querying about the · 4a7c3110

Chandler Carruth authored Jan 05, 2013

legality of an address mode to not use a struct of four values and
instead to accept them as parameters. I'd love to have named parameters
here as most callers only care about one or two of these, but the
defaults aren't terribly scary to write out.

That said, there is no real impact of this as the passes aren't yet
using STTI for this and are still relying upon TargetLowering.

llvm-svn: 171595

4a7c3110

[mips] Fix data layout string. Add 64 to the list of native integer widths · d35a2630
Akira Hatanaka authored Jan 05, 2013
```
and add stack alignment information.

llvm-svn: 171587
```
d35a2630
Move 'break' to the right place to prevent fallthru. There is no test-case · 43fafaf4
Jakub Staszak authored Jan 04, 2013
```
because conditions in the next case prevented it from doing anything nasty.

llvm-svn: 171549
```
43fafaf4

Jan 04, 2013

The current Intel Atom microarchitecture has a feature whereby when a function · e36b685a

Preston Gurd authored Jan 04, 2013

returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524

e36b685a