- Oct 16, 2012
-
Stepan Dyatkovskiy authored
The stack is formed improperly for long structures passed as byval arguments in EABI mode. Consulting the AAPCS, we find the following statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have a structure of doubles (9 double fields) and 3 unused core registers (r1, r2, r3), the caller should use registers r2 and r3 only. Currently the set r1, r2, r3 is used, which is invalid. The callee's VA routine should likewise use only r2 and r3; that side already works, handled by rounding up the SP address with ADD+BFC operations.

Fix: the main fix is in ARMTargetLowering::HandleByVal: when AAPCS mode and 8-byte alignment are detected, the odd register is skipped.

P.S.: I also improved the LDRB_POST_IMM regression test, since the ldrb instruction will no longer be generated by that test after this patch. llvm-svn: 166018
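To illustrate the rule, a hypothetical example (not code from the patch):

```cpp
// Hedged sketch: per AAPCS 4.3.1 this struct is 8-byte aligned (its
// most-aligned members are doubles), so per 5.5 stage C.3 the NCRN must
// be rounded up to an even core register when it is passed byval.
struct Big { double d[9]; };

void callee(int a, Big b); // hypothetical signature

void caller(Big b) {
  // r0 holds 'a'. The byval copy of 'b' must start at an even register,
  // so r1 is skipped and r2/r3 carry the first 8 bytes; the rest of the
  // structure goes on the stack.
  callee(0, b);
}
```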
-
NAKAMURA Takumi authored
Original message: The attached is the fix to radar://11663049. The optimization can be outlined by the following rules: (select (x != c), e, c) -> (select (x != c), e, x), (select (x == c), c, e) -> (select (x == c), x, e), where <c> is an integer constant. The reason for this change is that, on x86, a conditional move from a constant needs two instructions, whereas a conditional move from a register needs only one. While LowerSELECT() seems to be the most convenient place for this optimization, it turns out to be a bad one: by replacing the constant <c> with a symbolic value, it obscures some instruction-combining opportunities that would otherwise be easy to spot. For that reason, the change is postponed to the last instruction-combining phase. The change passes "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource". Original message since r165661: My previous change had a bug: I negated the condition code of a CMOV, then went ahead creating the new CMOV using the *ORIGINAL* condition code. llvm-svn: 166017
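A worked example of the first rule in C++ (the constant 42 is illustrative):

```cpp
// cmov-from-constant: materializing 42 costs an extra instruction on x86.
int before(int x, int e) { return (x != 42) ? e : 42; }

// In the false branch the comparison guarantees x == 42, so the constant
// can be replaced by x itself, enabling a cmov-from-register.
int after(int x, int e) { return (x != 42) ? e : x; }
```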
-
Bill Wendling authored
llvm-svn: 166016
-
Craig Topper authored
llvm-svn: 166014
-
Bill Wendling authored
llvm-svn: 166013
-
Bill Wendling authored
llvm-svn: 166012
-
Bill Wendling authored
llvm-svn: 166011
-
Bill Wendling authored
Use the Attributes::get method which takes an AttrVal value directly to simplify the code a bit. No functionality change. llvm-svn: 166009
-
Bill Wendling authored
llvm-svn: 166008
-
Bill Wendling authored
llvm-svn: 166007
-
Craig Topper authored
llvm-svn: 166004
-
Andrew Trick authored
This is a medium term workaround until we have a more robust solution in the form of a register liveness utility for postRA passes. llvm-svn: 166001
-
Jakob Stoklund Olesen authored
llvm-svn: 165999
-
Jakob Stoklund Olesen authored
Clients can use the equivalent functions in MRI. llvm-svn: 165990
-
Michael Liao authored
- Besides its use in SjLj exception handling, __builtin_setjmp/__builtin_longjmp is also used as a lightweight replacement for setjmp/longjmp to implement continuations, user-level threading, and so on. The support added in this patch ONLY addresses this usage and is NOT intended to support SjLj exception handling, as zero-cost DWARF exception handling is used by default on X86. llvm-svn: 165989
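A minimal sketch of the lightweight usage the patch targets (standard GCC/Clang builtins; note the five-word buffer and the required jump value of 1):

```cpp
#include <cstdio>

static void *jump_buf[5]; // __builtin_setjmp requires a five-word buffer

// Kept out of line: __builtin_longjmp may not be called from the same
// function that called __builtin_setjmp.
__attribute__((noinline)) static void resume() {
  __builtin_longjmp(jump_buf, 1); // the value argument must be 1
}

int main() {
  if (__builtin_setjmp(jump_buf) == 0) {
    std::puts("first pass");
    resume(); // transfers control back to the setjmp site
  } else {
    std::puts("resumed");
  }
  return 0;
}
```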
-
Jakob Stoklund Olesen authored
All callers can simply use the corresponding MRI functions. llvm-svn: 165985
-
- Oct 15, 2012
-
Jakob Stoklund Olesen authored
Using the cached bit vector in MRI avoids constantly allocating and recomputing the reserved register bit vector. llvm-svn: 165983
-
Jakob Stoklund Olesen authored
Also provide an MRI::getReservedRegs() function to access the frozen register set, and isReserved() and isAllocatable() methods to test individual registers. The various implementations of TRI::getReservedRegs() are quite complicated, and many passes need to look at the reserved register set. This patch makes it possible for these passes to use the cached copy in MRI, avoiding a lot of malloc traffic and repeated calculations. llvm-svn: 165982
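A minimal sketch of the pattern this enables in a pass, assuming this era's headers and names (signatures approximate):

```cpp
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/Target/TargetRegisterInfo.h"

using namespace llvm;

// Count the registers a pass may legitimately touch, using the frozen
// set cached in MRI instead of recomputing TRI::getReservedRegs().
static unsigned countUsableRegs(const MachineFunction &MF,
                                const TargetRegisterInfo &TRI) {
  const MachineRegisterInfo &MRI = MF.getRegInfo();
  unsigned N = 0;
  for (unsigned Reg = 1, E = TRI.getNumRegs(); Reg != E; ++Reg)
    if (!MRI.isReserved(Reg) && MRI.isAllocatable(Reg))
      ++N; // neither reserved nor outside every allocatable class
  return N;
}
```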
-
Jim Grosbach authored
rdar://12502028 llvm-svn: 165981
-
Bill Wendling authored
Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change. llvm-svn: 165960
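A rough sketch of the renamed builder in use; the include paths, enumerator, and method names of this era's API are assumptions here:

```cpp
#include "llvm/Attributes.h"  // era include path
#include "llvm/LLVMContext.h" // era include path

using namespace llvm;

// Assumed era API: AttrBuilder accumulates attribute kinds, and
// Attributes::get materializes the uniqued Attributes object.
static Attributes makeNoUnwind(LLVMContext &Ctx) {
  AttrBuilder B;
  B.addAttribute(Attributes::NoUnwind);
  return Attributes::get(Ctx, B);
}
```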
-
Chad Rosier authored
llvm-svn: 165955
-
Rafael Espindola authored
follow in one sec. llvm-svn: 165951
-
Andrew Trick authored
llvm-svn: 165950
-
Chad Rosier authored
llvm-svn: 165947
-
Chad Rosier authored
inline assembly. For the time being, these will be called directly by clang. However, in the near future I expect these to be sunk back into the MC layer and more basic APIs (e.g., getClobbers(), getConstraints(), etc.) will be called by clang. llvm-svn: 165946
-
Chad Rosier authored
llvm-svn: 165945
-
Micah Villmow authored
Resubmit the changes to LLVM core to update the functions to support different pointer sizes on a per-address-space basis. llvm-svn: 165941
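An illustrative sketch of what the data-layout string can now express (the sizes and include path reflect this era and are assumptions):

```cpp
#include "llvm/DataLayout.h" // era path; later llvm/IR/DataLayout.h

using namespace llvm;

// "p:32:32:32" describes address space 0; "p1:64:64:64" gives address
// space 1 its own 64-bit pointers.
static void example() {
  DataLayout DL("e-p:32:32:32-p1:64:64:64");
  unsigned DefaultBits = DL.getPointerSizeInBits(0); // 32
  unsigned AS1Bits     = DL.getPointerSizeInBits(1); // 64
  (void)DefaultBits;
  (void)AS1Bits;
}
```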
-
Adhemerval Zanella authored
This patch replaces EmitRawText with an EmitTCEntry class (specialized for each Streamer) in PowerPC64 TOC entry creation. llvm-svn: 165940
-
Kostya Serebryany authored
[asan] Make AddressSanitizer a FunctionPass instead of a ModulePass. This will simplify chaining other FunctionPasses with asan. Also some minor cleanup. llvm-svn: 165936
-
Chandler Carruth authored
includes extracting ints for copying elsewhere and inserting ints when copying into the alloca. This should fix the CanSROA assertion coming out of Clang's regression test suite. llvm-svn: 165931
-
Chandler Carruth authored
and generally clean up the memset handling. It had rotted a bit as the other rewriting logic got polished more. llvm-svn: 165930
-
Silviu Baranga authored
Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node whose vector input size differed from the output size. This was because the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929
-
Chandler Carruth authored
cases where we have partial integer loads and stores to an otherwise promotable alloca, to widen[1] those loads and stores to cover the entire alloca and bitcast them into the appropriate type such that promotion can proceed.

These partial loads and stores stem from an annoying confluence of ARM's calling convention and ABI lowering and the FCA pre-splitting which takes place in SROA. Clang lowers a { double, double } in-register function argument as a [4 x i32] function argument to ensure it is placed into integer 32-bit registers (a really unnerving implicit contract between Clang and the ARM backend, I would add). This results in an FCA load of [4 x i32]* from the { double, double } alloca, and SROA decomposes this into a sequence of i32 loads and stores. Inlining proceeds, code gets folded, but at the end of the day, we still have i32 stores to the low and high halves of a double alloca. Widening these to i64 operations, and bitcasting them to double prior to loading or storing, allows promotion to proceed for these allocas.

I looked quite a bit at changing the IR which Clang produces for this case to be more friendly, but small changes seem unlikely to help. I think the best representation we could use currently would be to pass 4 i32 arguments, thereby avoiding any FCAs, but that would still require this fix. It seems like it might eventually be nice to somehow encode the ABI register selection choices outside of the parameter type system, so that the parameter could be a { double, double } while CC register annotations indicate that it should be passed via 4 integer registers.

This patch does not address the second problem in PR14059, which is the reverse: when a struct alloca is loaded as a *larger* single integer. This patch also does not address some of the code quality issues with the FCA-splitting; those don't actually impede any optimizations, but they're on my list to clean up.

[1]: Pedantic footnote: for those concerned about memory model issues here, this is safe. For the alloca to be promotable, it cannot escape or have any use of its address that could allow these loads or stores to be racing. Thus, widening is always safe. llvm-svn: 165928
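A minimal C++ reproducer of the pattern described above (hypothetical; the exact IR depends on Clang's ARM lowering):

```cpp
// On ARM (AAPCS), Clang may lower the two-double aggregate parameter as
// [4 x i32] to force it into the core registers r0-r3. The resulting FCA
// load from the { double, double } alloca is what SROA decomposes into
// the i32 operations that this patch widens and bitcasts.
struct DD { double lo, hi; };

double sum(DD d) {
  return d.lo + d.hi;
}
```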
-
Chandler Carruth authored
into static helper functions. They're really quite generic and are going to be needed elsewhere shortly. llvm-svn: 165927
-
Bill Wendling authored
Add an enum for the return and function indexes into the AttrListPtr object. This gets rid of some magic numbers. llvm-svn: 165924
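The indexes in question, as a sketch (the enumerator values follow long-standing LLVM convention, but treat them as assumptions here):

```cpp
// Named indexes into an attribute list, replacing magic numbers: index 0
// addresses the return value, ~0U addresses the function itself, and the
// arguments occupy the indexes in between.
enum AttrIndex {
  ReturnIndex   = 0U,
  FunctionIndex = ~0U
};
```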
-
Bill Wendling authored
llvm-svn: 165923
-
Bill Wendling authored
llvm-svn: 165920
-
Bill Wendling authored
Convert the internal representation of the Attributes class into a pointer to an opaque object that's uniqued by and stored in the LLVMContext object. The Attributes class then becomes a thin wrapper around this opaque object. Eventually, the internal representation will be expanded to include attributes that represent code generation options, etc. llvm-svn: 165917
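A generic sketch of the uniqued-impl pattern being described (simplified; not the actual class layout):

```cpp
#include "llvm/ADT/FoldingSet.h"
#include <stdint.h>

// The LLVMContext owns a FoldingSet of impl objects, so equal attribute
// sets share one AttributesImpl; Attributes is then a cheap pointer
// wrapper and equality is pointer equality.
class AttributesImpl : public llvm::FoldingSetNode {
  uint64_t Bits; // current representation, with room to grow later
public:
  explicit AttributesImpl(uint64_t B) : Bits(B) {}
  void Profile(llvm::FoldingSetNodeID &ID) const { ID.AddInteger(Bits); }
};

class Attributes {
  AttributesImpl *Attrs; // uniqued in (and owned by) the LLVMContext
public:
  explicit Attributes(AttributesImpl *A = 0) : Attrs(A) {}
  bool operator==(Attributes RHS) const { return Attrs == RHS.Attrs; }
};
```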
-
Meador Inge authored
This patch migrates the strcmp and strncmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165915
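Examples of the folds these simplifications perform (standard strcmp/strncmp folds; the cases shown are illustrative):

```cpp
#include <cstring>

int folds(const char *s, const char *t) {
  int a = std::strcmp(s, s);     // identical operands -> 0
  int b = std::strncmp(s, t, 0); // zero length -> 0
  int c = std::strcmp("", s);    // -> negated first character of s
  return a + b + c;
}
```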
-
- Oct 14, 2012
-
Benjamin Kramer authored
llvm-svn: 165904
-