Commits · 5dc203e8f48ece36bdf789504cb748f0364a4e97 · Roger Ferrer / llvm-epi-0.8

Oct 19, 2012
- Reapply the TargetTransformInfo changes, minus the changes to LSR and LowerInvoke. · 5dc203e8
  Nadav Rotem authored Oct 18, 2012
```
llvm-svn: 166248
```
  5dc203e8
Oct 18, 2012

Fix a bug where a 32-bit address with the high bit set does not get symbolicated · b23926d3
Kevin Enderby authored Oct 18, 2012
```
because the value is incorrectly being sign-extended when passed to
SymbolLookUp().

llvm-svn: 166234
```
b23926d3

This patch fixes failures in the SingleSource/Regression/C/uint64_to_float · d34b5bd6

Ulrich Weigand authored Oct 18, 2012

test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The cause is double
rounding: on PowerPC we have to convert to an intermediate
double-precision value first, which then gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result. The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).
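
The double-rounding problem and the "prepare the integer first" idea can be sketched in portable C++. This is only an illustration under stated assumptions, not the PowerPC code sequence the patch emits: the helper names (`prepare_for_float`, `naive_to_float`, `careful_to_float`) are hypothetical, and the sticky-bit trick shown is one common way to make the first conversion exact.

```cpp
#include <cstdint>

// Hypothetical helper: adjust a signed 64-bit value so that converting it to
// double is exact, without changing how it finally rounds to float.
std::int64_t prepare_for_float(std::int64_t value) {
  const std::int64_t limit = std::int64_t{1} << 53;
  // Values in [-2^53, 2^53) already convert to double exactly.
  if (value >= -limit && value < limit)
    return value;
  std::uint64_t bits = static_cast<std::uint64_t>(value);
  // OR in bit 11 as a "sticky" bit when any of the low 11 bits is set,
  // then clear the low 11 bits so at most 53 significant bits remain.
  bits |= (bits & 2047) + 2047;
  bits &= ~std::uint64_t{2047};
  return static_cast<std::int64_t>(bits);
}

// Two rounding steps: int64 -> double -> float.
float naive_to_float(std::int64_t value) {
  return static_cast<float>(static_cast<double>(value));
}

// Exact first step, so only the final double -> float step rounds.
float careful_to_float(std::int64_t value) {
  return static_cast<float>(static_cast<double>(prepare_for_float(value)));
}
```

For 2^53 + 2^29 + 1, the naive path first rounds to 2^53 + 2^29 (a tie in double), which is then exactly halfway between two floats and rounds down to 2^53. The prepared value stays above the halfway point, so it rounds up to 2^53 + 2^30.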

llvm-svn: 166178

d34b5bd6

Temporarily revert the TargetTransform changes. · d6d9ccca

Bob Wilson authored Oct 18, 2012

The TargetTransform changes are breaking LTO bootstraps of clang.  I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.

This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741

llvm-svn: 166168

d6d9ccca

Add conditional branch instructions and their patterns. · 6743924a
Reed Kotler authored Oct 17, 2012
```
llvm-svn: 166134
```
6743924a

Oct 17, 2012

Merge MRI::isPhysRegOrOverlapUsed() into isPhysRegUsed(). · 07364426

Jakob Stoklund Olesen authored Oct 17, 2012

All callers of these functions really want the isPhysRegOrOverlapUsed()
functionality which also checks aliases. For historical reasons, targets
without register aliases were calling isPhysRegUsed() instead.

Change isPhysRegUsed() to also check aliases, and switch all
isPhysRegOrOverlapUsed() callers to isPhysRegUsed().

llvm-svn: 166117

07364426

Check for empty YMM use-def lists in X86VZeroUpper. · a10c0980

Jakob Stoklund Olesen authored Oct 17, 2012

The previous MRI.isPhysRegUsed(YMM0) would also return true when the
function contains a call to a function that may clobber YMM0. That's
most of them.

Checking the use-def chains allows us to skip functions that don't
explicitly mention YMM registers.

llvm-svn: 166110

a10c0980

Fix fallout from RegInfo => FrameLowering refactoring on MSP430. · 0a69176c
Anton Korobeynikov authored Oct 17, 2012
```
Patch by Job Noorman!

llvm-svn: 166108
```
0a69176c

Check SSSE3 instead of SSE4.1 · cef9541d

Michael Liao authored Oct 17, 2012

- All required shuffle insns, especially PSHUFB, are added in SSSE3.

llvm-svn: 166086

cef9541d

Fix setjmp on models with a non-Small code model or a non-Static relocation model · 6f720613

Michael Liao authored Oct 17, 2012

- MBB address is only valid as an immediate value in Small & Static
  code/relocation models. On other models, LEA is needed to load the IP address
  of the restore MBB.
- A minor fix of MBB in MC lowering is added as well so that the target
  relocation flag is propagated into MC.

llvm-svn: 166084

6f720613

Oct 16, 2012

Support v8f32 to v8i8/v8i16 conversion through custom lowering · 02ca3454

Michael Liao authored Oct 16, 2012

- Add custom FP_TO_SINT on v8i16 (and v8i8 which is legalized as v8i16 due to
  vector element-wise widening) to reduce DAG combiner and its overhead added
  in X86 backend.

llvm-svn: 166036

02ca3454

This patch addresses PR13949. · 48081cad

Bill Schmidt authored Oct 16, 2012

For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them. A previous patch
addressed this for aggregates passed in registers. However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side. The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and corrects a constant left over from 32-bit code. There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed. Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only. The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes. Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area: Previously they were stored left-justified, and now are
properly stored right-justified. This requires changing the expected
output of existing test case structsinregs.ll.
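
As a rough illustration of what "right-adjusted" means for the slot layout, the byte offset of a small aggregate inside its doubleword can be computed as follows. The function and constant names are hypothetical and are not taken from PPCTargetLowering; the sketch assumes a big-endian slot and an aggregate of at least one byte.

```cpp
#include <cstddef>

// Size of one register or parameter-save-area slot on the 64-bit ABI.
constexpr std::size_t kDoublewordSize = 8;

// Hypothetical helper: offset of an aggregate within its 8-byte slot when
// it occupies the low-order (rightmost) bytes. Aggregates of 8 bytes or
// more start at the beginning of the slot.
constexpr std::size_t right_adjusted_offset(std::size_t aggregate_size) {
  return aggregate_size < kDoublewordSize ? kDoublewordSize - aggregate_size
                                          : 0;
}
```

A 3-byte structure would therefore start at byte 5 of its slot, where a left-adjusted layout would have placed it at byte 0.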

llvm-svn: 166022

48081cad

Issue: · e59a920b

Stepan Dyatkovskiy authored Oct 16, 2012

The stack is formed incorrectly for long structures passed as byval arguments in
EABI mode.

If we consult the AAPCS reference, we find the following statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have a structure of doubles (9 double fields) and 3 unused core
registers (r1, r2, r3), the caller should use the r2 and r3 registers only.
Currently the r1, r2, r3 set is used, which is invalid.

The callee VA routine should also use the r2 and r3 regs only. All is OK here:
the callee gets this behaviour by rounding up the SP address with ADD+BFC operations.

Fix:
The main fix is in ARMTargetLowering::HandleByVal. If we detect AAPCS mode and
8-byte alignment, we then waste the odd register.
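
The register-skipping rule quoted from the AAPCS can be modelled in a few lines. This is a toy model, not the code in ARMTargetLowering::HandleByVal: the function name is hypothetical, and core registers r0-r3 are simply represented by the numbers 0-3.

```cpp
// Hypothetical model of AAPCS rule C.3: given the Next Core Register Number
// and the argument's alignment in bytes, return the register number at
// which the argument may start.
constexpr unsigned next_core_register(unsigned ncrn, unsigned alignment_bytes) {
  // An argument needing 8-byte alignment must start at an even register,
  // so an odd NCRN is rounded up and that register is left unused.
  return (alignment_bytes == 8 && (ncrn % 2) != 0) ? ncrn + 1 : ncrn;
}
```

With r1 as the first free register and a structure of doubles, the model returns 2, matching the r2/r3 allocation described above.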

P.S.:
I also improved the LDRB_POST_IMM regression test, since the ldrb instruction
would no longer be generated by the current regression test after this patch.

llvm-svn: 166018

e59a920b

Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>. · 1705a999

NAKAMURA Takumi authored Oct 16, 2012

Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by the following rules:

   (select (x != c), e, c) -> (select (x != c), e, x),
   (select (x == c), c, e) -> (select (x == c), x, e)
where <c> is an integer constant.

 The reason for this change is that, on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register needs only one instruction.
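
At the source level, the first rewrite rule amounts to the equivalence sketched here. The function names and the constant 42 are placeholders chosen for illustration; the actual transformation operates on SelectionDAG nodes, not on C++ source.

```cpp
// Stands in for the integer constant <c> in the rules.
constexpr int kConstant = 42;

// Before: the "false" arm materializes the constant.
int select_with_constant(int x, int e) {
  return (x != kConstant) ? e : kConstant;
}

// After: when the condition is false, x is known to equal the constant,
// so the value already held in x can be reused.
int select_with_register(int x, int e) {
  return (x != kConstant) ? e : x;
}
```

Both functions return the same value for every input, which is what makes the substitution legal.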

  While LowerSELECT() seems to be the most convenient place for this optimization, it turns out to be a bad one. By replacing the constant <c> with a symbolic value, it obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I had to postpone the change to the last instruction-combining phase.

  The change passes the tests of "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource".

Original message since r165661:

My previous change had a bug: I negated the condition code of a CMOV, but then went ahead and created a new CMOV using the *ORIGINAL* condition code.

llvm-svn: 166017

1705a999

Move X86MCInstLower class definition into implementation file. It's not needed outside. · 2a3f7758
Craig Topper authored Oct 16, 2012
```
llvm-svn: 166014
```
2a3f7758
Pass in the context to the Attributes::get method. · 4f69e148
Bill Wendling authored Oct 16, 2012
```
llvm-svn: 166007
```
4f69e148

Add __builtin_setjmp/_longjmp support in X86 backend · 97bf363a

Michael Liao authored Oct 15, 2012

- Besides being used in SjLj exception handling, __builtin_setjmp/__builtin_longjmp
  is also used as a light-weight replacement for setjmp/longjmp, which are used
  to implement continuations, user-level threading, etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling, as zero-cost DWARF exception handling is used by default
  in X86.
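
A minimal sketch of the light-weight usage described above, assuming a GCC or Clang toolchain that provides these builtins. The function and buffer names are hypothetical; the five-word buffer and the fixed second argument of 1 follow the builtins' documented conventions.

```cpp
// Buffer for __builtin_setjmp; GCC and Clang expect five words.
static void *jump_buffer[5];

// Kept out of line so that the jump originates in a different function
// from the one that called __builtin_setjmp.
__attribute__((noinline)) static void escape() {
  __builtin_longjmp(jump_buffer, 1);  // the second argument must be 1
}

// Returns 1 when control came back through the non-local jump.
int run_with_escape() {
  if (__builtin_setjmp(jump_buffer) == 0) {
    escape();
    return 0;  // not reached: escape() jumps back to the setjmp point
  }
  return 1;
}
```

Unlike the library setjmp/longjmp, these builtins save only a small amount of state, which is why they suit continuation-style control transfers.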

llvm-svn: 165989

97bf363a

Oct 15, 2012

ARM: v1i64 and v2i64 VBSL intrinsic support. · 54c7432e
Jim Grosbach authored Oct 15, 2012
```
rdar://12502028

llvm-svn: 165981
```
54c7432e

Move the Attributes::Builder outside of the Attributes class and into its own... · 50d27849

Bill Wendling authored Oct 15, 2012

Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change.

llvm-svn: 165960

50d27849

[ms-inline asm] If we parsed a statement and the opcode is valid, then it's an instruction. · f3bc5996
Chad Rosier authored Oct 15, 2012
```
llvm-svn: 165955
```
f3bc5996
[ms-inline asm] Update the end loc for ParseIntelMemOperand. · 499d4a14
Chad Rosier authored Oct 15, 2012
```
llvm-svn: 165947
```
499d4a14
[ms-inline asm] Use incoming argument rather than hard coding to false. · ca0ada1a
Chad Rosier authored Oct 15, 2012
```
llvm-svn: 165945
```
ca0ada1a

Resubmit the changes to llvm core to update the functions to support different... · 4bb926d9

Micah Villmow authored Oct 15, 2012

Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.

llvm-svn: 165941

4bb926d9

PowerPC: add EmitTCEntry class for TOC creation · ef206f19

Adhemerval Zanella authored Oct 15, 2012

This patch replaces the EmitRawText with an EmitTCEntry class (specialized for
each Streamer) in PowerPC64 TOC entry creation.

llvm-svn: 165940

ef206f19

Fixed PR13938: the ARM backend was crashing because it couldn't select a... · b1409700

Silviu Baranga authored Oct 15, 2012

Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was because the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.

llvm-svn: 165929

b1409700

Attributes Rewrite · d079a446

Bill Wendling authored Oct 15, 2012

Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.
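
The general pattern described here, a thin handle around an opaque object that a context owns and uniques, might look roughly like this. All names (`ToyContext`, `ToyAttributes`, `ToyAttributesImpl`) are invented for the sketch and do not mirror the real LLVMContext or Attributes internals.

```cpp
#include <cstdint>
#include <map>
#include <memory>

// Opaque payload; callers only ever reach it through ToyAttributes.
class ToyAttributesImpl {
public:
  explicit ToyAttributesImpl(std::uint64_t bits) : bits_(bits) {}
  std::uint64_t bits() const { return bits_; }

private:
  std::uint64_t bits_;
};

// Stand-in for a context object that owns and uniques the payloads.
class ToyContext {
public:
  // Returns the single payload for this bit pattern, creating it on first use.
  const ToyAttributesImpl *intern(std::uint64_t bits) {
    std::unique_ptr<ToyAttributesImpl> &slot = pool_[bits];
    if (!slot)
      slot.reset(new ToyAttributesImpl(bits));
    return slot.get();
  }

private:
  std::map<std::uint64_t, std::unique_ptr<ToyAttributesImpl>> pool_;
};

// Thin wrapper holding one pointer, so equality is a pointer comparison.
class ToyAttributes {
public:
  static ToyAttributes get(ToyContext &context, std::uint64_t bits) {
    return ToyAttributes(context.intern(bits));
  }
  bool hasAttribute(std::uint64_t mask) const {
    return (impl_->bits() & mask) != 0;
  }
  bool operator==(const ToyAttributes &other) const {
    return impl_ == other.impl_;
  }

private:
  explicit ToyAttributes(const ToyAttributesImpl *impl) : impl_(impl) {}
  const ToyAttributesImpl *impl_;
};
```

Because the payload lives behind a pointer, its representation can later grow beyond a bit mask without changing the wrapper's size or interface.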

llvm-svn: 165917

d079a446

Oct 14, 2012
- Don't pass in an Attributes object to something that expects an integral value. · 9e1eb4d1
  Bill Wendling authored Oct 14, 2012
```
llvm-svn: 165887
```
  9e1eb4d1
Oct 13, 2012

X86: Disable long nops for all cpus prior to pentiumpro/i686. · 35480284
Benjamin Kramer authored Oct 13, 2012
```
llvm-svn: 165878
```
35480284
X86: Fix accidentally swapped operands. · ecd15d7f
Benjamin Kramer authored Oct 13, 2012
```
llvm-svn: 165871
```
ecd15d7f

X86: Promote i8 cmov when both operands are coming from truncates of the same width. · d6b9362f

Benjamin Kramer authored Oct 13, 2012

X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.

I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.

llvm-svn: 165868

d6b9362f

[ms-inline asm] Remove the MatchInstruction() function. Previously, this was · 49963555

Chad Rosier authored Oct 13, 2012

the interface between the front-end and the MC layer when parsing inline
assembly.  Unfortunately, this is too deep into the parsing stack. Specifically,
we're unable to handle target-independent assembly (i.e., assembly directives,
labels, etc.).  Note that MatchAndEmitInstruction() isn't the correct
abstraction either.  I'll be exposing target-independent hooks shortly, so this
is really just a cleanup.

llvm-svn: 165858

49963555

ARM: tail-call inside a function where part of a byval argument is on caller's · 7e48b252

Manman Ren authored Oct 12, 2012

local frame causes problems.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause a problem with the tail-call, since part of s is passed via registers
and saved in f's local frame. When g tries to access s, part of s may be
corrupted, since f's local frame is popped before the tail-call.

The current fix is to disable the tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach: if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
the tail-call.

rdar://12442472

llvm-svn: 165853

7e48b252

[ms-inline asm] Capitalize per coding standard. · 4453e845
Chad Rosier authored Oct 12, 2012
```
llvm-svn: 165847
```
4453e845

ARM: Mark VSELECT as 'expand'. · 30af442a

Jim Grosbach authored Oct 12, 2012

The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

llvm-svn: 165845

30af442a

[ms-inline asm] Use the new API introduced in r165830 in lieu of the · 2f480a8a
Chad Rosier authored Oct 12, 2012
```
MapAndConstraints vector.  Also remove the unused Kind argument.

llvm-svn: 165833
```
2f480a8a

Oct 12, 2012
- Div, Rem int/unsigned int · cf11c59e
  Reed Kotler authored Oct 12, 2012
```
llvm-svn: 165783
```
  cf11c59e
- Remove unnecessary classof()'s · 506a1c5a
  Sean Silva authored Oct 11, 2012
```
isa<> et al. automatically infer when the cast is an upcast (including a
self-cast), so these are no longer necessary.

llvm-svn: 165767
```
  506a1c5a
Oct 11, 2012

Revert 165732 for further review. · 0c61134d
Micah Villmow authored Oct 11, 2012
```
llvm-svn: 165747
```
0c61134d

Add in the first iteration of support for llvm/clang/lldb to allow variable... · 08318973

Micah Villmow authored Oct 11, 2012

Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.

llvm-svn: 165726

08318973

This patch addresses PR13947. · 22162470

Bill Schmidt authored Oct 11, 2012

For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter.  The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area.  A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.

Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot.  This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.

This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.

llvm-svn: 165714

22162470