Commits · 4fe0e1908ecc28a59e8764d6f5b5460ecc4166a0 · Roger Ferrer / llvm-epi-0.8

Apr 28, 2012

Spring cleaning - Delete dead code. · 4fe0e190
Jakob Stoklund Olesen authored Apr 28, 2012
```
llvm-svn: 155765
```
4fe0e190

Fix a problem with blocks that need to be split twice. · ae7521d1

Jakob Stoklund Olesen authored Apr 28, 2012

The code could search past the end of the basic block when there was
already a constant pool entry after the block.

Test case with giant basic block in SingleSource/UnitTests/Vector/constpool.c

llvm-svn: 155753

ae7521d1

Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. · 833f0496

Andrew Trick authored Apr 28, 2012

This time, also fix the caller of AddGlue to properly handle
incomplete chains. AddGlue had failure modes, but shamefully hid them
from its caller. Its luck ran out.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155749

833f0496

ARM: Thumb add(sp plus register) asm constraints. · c6f32b32

Jim Grosbach authored Apr 27, 2012

Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In Thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.

rdar://11219154

llvm-svn: 155748

c6f32b32

ARM: Tweak tADDrSP definition for consistent operand order. · 9d8f6f3d
Jim Grosbach authored Apr 27, 2012
```
Make the operand order of the instruction match that of the asm syntax.

llvm-svn: 155747
```
9d8f6f3d
Revert r155745 · a99b1681
Derek Schuff authored Apr 27, 2012
```
llvm-svn: 155746
```
a99b1681

Fix fastcc structure return with fast-isel on x86-32 · bbf8b83e

Derek Schuff authored Apr 27, 2012

On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention in
X86TargetLowering::LowerCall, and is now also checked properly in
X86FastISel::DoSelectCall.

llvm-svn: 155745

bbf8b83e

Track worst case alignment padding more accurately. · 5f0d1b46

Jakob Stoklund Olesen authored Apr 27, 2012

Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:

  RoundUpToAlignment(Offset + UnknownPadding)

This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the worst case alignment padding
was already included.

This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.

The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.
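
To make the double counting concrete, here is a toy Python sketch (not the ARMConstantIslandPass code). It assumes 2-byte Thumb instructions, so the worst case padding in front of a 4-byte aligned block is 2 bytes; `round_up`, `old_estimate`, and `new_estimate` are hypothetical names chosen for this illustration.

```python
def round_up(offset, align):
    """Round offset up to the next multiple of align (a power of two)."""
    return (offset + align - 1) // align * align


def old_estimate(prev_end, unknown_padding, align):
    """Old scheme: add worst case padding, then round up as well."""
    return round_up(prev_end + unknown_padding, align)


def new_estimate(prev_end, unknown_padding):
    """New scheme: rely on the worst case padding alone."""
    return prev_end + unknown_padding


ALIGN = 4    # aligned block wants 4-byte alignment
PADDING = 2  # assumed worst case padding with 2-byte instructions

# A user instruction at offset 6, followed by code ending at offset 10.
before_old = old_estimate(10, PADDING, ALIGN) - 6
before_new = new_estimate(10, PADDING) - 6

# An earlier instruction shrinks by 2 bytes: everything moves down by 2.
after_old = old_estimate(8, PADDING, ALIGN) - 4
after_new = new_estimate(8, PADDING) - 4

print(before_old, after_old)  # 6 8: the old estimate grew after shrinking
print(before_new, after_new)  # 6 6: the new estimate did not grow
```

In this toy layout the new estimate for the block offset (10) is not a multiple of 4, which is the "weird effect" mentioned above, yet it is still no smaller than the real aligned address `round_up(8, 4) == 8`.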

<rdar://problem/11339352>

llvm-svn: 155744

5f0d1b46

Temporarily revert r155668: Fix the SD scheduler to avoid gluing. · 7a773ec0
Andrew Trick authored Apr 27, 2012
```
This definitely caused a regression with ARM -mno-thumb.

llvm-svn: 155743
```
7a773ec0
Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements. · 0fa6c7e5
Craig Topper authored Apr 27, 2012
```
llvm-svn: 155742
```
0fa6c7e5

Add x86-specific DAG combine to simplify: · 32c2178e

Chad Rosier authored Apr 27, 2012

 x == -y --> x+y == 0
 x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".
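
The identity behind this combine can be checked exhaustively on a small word size. The Python sketch below models 8-bit wraparound arithmetic as a stand-in for the 32-bit registers in the example; `neg`, `add`, and `fold_holds` are hypothetical helper names, not LLVM functions.

```python
BITS = 8                  # small width so the check is exhaustive and fast
MASK = (1 << BITS) - 1


def neg(y):
    """Two's complement negation with wraparound."""
    return (-y) & MASK


def add(x, y):
    """Addition with wraparound."""
    return (x + y) & MASK


def fold_holds():
    """Check (x == -y) <=> (x + y == 0) for every pair of values."""
    return all(
        (x == neg(y)) == (add(x, y) == 0)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
    )


print(fold_holds())  # True
```

The same argument carries over to any width, since both sides state that x + y is congruent to 0 modulo 2^BITS.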

Patch by Manman Ren.
rdar://11245199
PR12545

llvm-svn: 155739

32c2178e

Apr 27, 2012

[Support/YAMLParser] Fix ASan found bugs. · 6033113e
Michael J. Spencer authored Apr 27, 2012
```
llvm-svn: 155735
```
6033113e
Tidy up spacing. · 42cd8d2c
Craig Topper authored Apr 27, 2012
```
llvm-svn: 155733
```
42cd8d2c

Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). · 27c32461

Hal Finkel authored Apr 27, 2012

Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

llvm-svn: 155729

27c32461

Change recurse depth limit to uint32 to fix warning. · 84e4b399
David Blaikie authored Apr 27, 2012
```
llvm-svn: 155727
```
84e4b399
Miscellaneous accumulated cleanups. · dae3349a
Dan Gohman authored Apr 27, 2012
```
llvm-svn: 155725
```
dae3349a
Fix the order of the operands in the llvm.fma intrinsic patterns for ARM, · ea001225
Lang Hames authored Apr 27, 2012
```
<rdar://problem/11325085>.

llvm-svn: 155724
```
ea001225

Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks. · 6120cfb8

Mon P Wang authored Apr 27, 2012

The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.
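
As a rough illustration of a depth-limited bailout (not the GVN code itself), the Python sketch below walks predecessors recursively and gives up with a conservative "not available" answer once a depth limit is exceeded. The function name, the `preds` map, the acyclic-graph assumption, and the default limit of 100 are all assumptions made for this sketch; the commit uses a limit of 1000.

```python
def fully_available(block, preds, available, depth=0, max_depth=100):
    """Return True if a value is available along every path into block.

    preds maps a block to its list of predecessors (assumed acyclic here).
    available is the set of blocks where the value is known to be available.
    """
    if depth > max_depth:
        return False  # early bailout: answer conservatively, stop recursing
    if block in available:
        return True
    block_preds = preds.get(block, [])
    if not block_preds:
        return False  # reached an entry block without finding the value
    for pred in block_preds:
        if not fully_available(pred, preds, available, depth + 1, max_depth):
            return False
    return True


# A straight-line chain: block i has the single predecessor i - 1.
chain = {i: [i - 1] for i in range(1, 500)}

print(fully_available(50, chain, {0}))   # True: found within the limit
print(fully_available(499, chain, {0}))  # False: bailed out at the limit
```

The second query is answered pessimistically even though the value is in fact available, which is the price of bounding the stack depth.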

llvm-svn: 155722

6120cfb8

Reapply r155682, making constant folding more consistent, with a fix to work · 1ccecdb2
Dan Gohman authored Apr 27, 2012
```
properly with how the code handles all-undef PHI nodes.

llvm-svn: 155721
```
1ccecdb2
Fix ARM assembly parsing for upper case condition codes on IT instructions. · 82f95ea2
Richard Barton authored Apr 27, 2012
```
llvm-svn: 155720
```
82f95ea2

X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures. · 913da4b2

Benjamin Kramer authored Apr 27, 2012

* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.
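
The right-shift mentioned above can be pictured with a small Python model. It relies on the architectural bit positions of the x87 condition codes in the status word (C0 = bit 8, C2 = bit 10, C3 = bit 14) and of CF, PF, and ZF in EFLAGS (bits 0, 2, 6); the function name `fpsw_to_flags` is hypothetical, and the sketch only tracks the three flags relevant to comparisons.

```python
# x87 status word condition code bits
C0, C2, C3 = 1 << 8, 1 << 10, 1 << 14
# EFLAGS bits that SAHF loads from %ah and that FP comparisons care about
CF, PF, ZF = 1 << 0, 1 << 2, 1 << 6


def fpsw_to_flags(fpsw):
    """Model FNSTSW %ax followed by SAHF for the comparison flags."""
    ah = (fpsw >> 8) & 0xFF      # the %ax -> %ah extraction is a shift by 8
    return ah & (CF | PF | ZF)   # keep only the flags of interest


print(fpsw_to_flags(0) == 0)                            # True: ST(0) > operand
print(fpsw_to_flags(C0) == CF)                          # True: ST(0) < operand
print(fpsw_to_flags(C3) == ZF)                          # True: equal
print(fpsw_to_flags(C0 | C2 | C3) == (CF | PF | ZF))    # True: unordered
```

After the transfer, the usual integer condition codes (below, equal, parity) can be used on the result of a floating point compare.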

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704

913da4b2

[asan] small optimization: do not emit "x+0" instructions · 5a464f03
Kostya Serebryany authored Apr 27, 2012
```
llvm-svn: 155701
```
5a464f03
Refactor IT handling not to store the bottom bit of the condition code in the... · f435b09e
Richard Barton authored Apr 27, 2012
```
Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst.

llvm-svn: 155700
```
f435b09e
Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors" · 6008dfdb
NAKAMURA Takumi authored Apr 27, 2012
```
It broke stage2 build. stage1/clang sometimes crashed.

llvm-svn: 155699
```
6008dfdb
[tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov · a1259778
Kostya Serebryany authored Apr 27, 2012
```
llvm-svn: 155698
```
a1259778
Implement a bastardized ABI. · 1ec87ee0
Evan Cheng authored Apr 27, 2012
```
llvm-svn: 155686
```
1ec87ee0
- thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't support 32-bit Thumb2 · f52003de
Evan Cheng authored Apr 27, 2012
```
  instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541

llvm-svn: 155685
```
f52003de

Use ConstantExpr::getExtractElement when constant-folding vectors · 90f3798f

Dan Gohman authored Apr 27, 2012

instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
proceed even if an inner extract element cannot be folded.

Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with 
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.

This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.

This fixes rdar://11324230.

llvm-svn: 155682

90f3798f

Break up getProfitableChainIncrement(). · c90abc89

Jakob Stoklund Olesen authored Apr 26, 2012

The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().

Also cache the ExprBase in IVChain to avoid frequent recomputations.

No functional change intended.

llvm-svn: 155676

c90abc89

Turn IVChain into a struct. · a0337d7b
Jakob Stoklund Olesen authored Apr 26, 2012
```
No functional change intended.

llvm-svn: 155675
```
a0337d7b

Add instcombine patterns for the following transformations: · 7813dcee

Chad Rosier authored Apr 26, 2012

 (x & y) | (x ^ y) -> x | y 
 (x & y) + (x ^ y) -> x | y 
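
Both rewrites rest on the fact that `x & y` and `x ^ y` never have a set bit in common, so OR-ing or adding them yields the same result, and no carry can occur. A brute-force Python check over all 8-bit pairs (a stand-in for wider integer types; the function names are hypothetical) looks like this:

```python
BITS = 8
MASK = (1 << BITS) - 1
VALUES = range(1 << BITS)


def or_pattern_holds():
    """(x & y) | (x ^ y) == x | y for all 8-bit x, y."""
    return all(((x & y) | (x ^ y)) == (x | y) for x in VALUES for y in VALUES)


def add_pattern_holds():
    """(x & y) + (x ^ y) == x | y for all 8-bit x, y, with wraparound."""
    return all((((x & y) + (x ^ y)) & MASK) == (x | y)
               for x in VALUES for y in VALUES)


def operands_are_disjoint():
    """x & y and x ^ y share no set bits, so the addition cannot carry."""
    return all(((x & y) & (x ^ y)) == 0 for x in VALUES for y in VALUES)


print(or_pattern_holds(), add_pattern_holds(), operands_are_disjoint())
# True True True
```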

Patch by Manman Ren.
rdar://10770603

llvm-svn: 155674

7813dcee

Apr 26, 2012

Fix the SD scheduler to avoid gluing the same node twice. · 03fa574a

Andrew Trick authored Apr 26, 2012

DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
better fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155668

03fa574a

ARM: Thumb ldr(literal) base address alignment is 32-bits. · 3d6c629e

Jim Grosbach authored Apr 26, 2012

The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.
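
A small Python sketch of the address computation, assuming the architectural rule that in Thumb state PC reads as the address of the current instruction plus 4; `align_down` and `thumb_ldr_literal_address` are hypothetical names used only for this illustration.

```python
def align_down(value, alignment):
    """Align(value, alignment): clear the low bits (alignment is a power of two)."""
    return value & ~(alignment - 1)


def thumb_ldr_literal_address(instr_addr, imm):
    """Address loaded by a Thumb 'ldr Rt, [pc, #imm]' at instr_addr."""
    pc = instr_addr + 4           # assumed: PC reads as instruction address + 4
    return align_down(pc, 4) + imm


# Two 16-bit instructions sharing one 32-bit word use the same base address.
print(hex(thumb_ldr_literal_address(0x1000, 8)))  # 0x100c
print(hex(thumb_ldr_literal_address(0x1002, 8)))  # 0x100c
```

The second call shows the effect of the alignment: the instruction at 0x1002 does not load from 0x100e, because the base is the word-aligned PC rather than the raw PC.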

rdar://11314619

llvm-svn: 155659

3d6c629e

· 81290f4b

Preston Gurd authored Apr 26, 2012

Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655

81290f4b

[Support/YAML] Properly fix uninitialized variable warning by inserting a · a6c2c291
Michael J. Spencer authored Apr 26, 2012
```
'REPLACEMENT CHARACTER' (U+FFFD) when getAsInteger fails.

llvm-svn: 155653
```
a6c2c291

Use VLD1 in NEON extending-load patterns instead of VLDR. · 3de97b7a

Tim Northover authored Apr 26, 2012

On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.

llvm-svn: 155630

3de97b7a

Test commit. · 6699a60b
Tim Northover authored Apr 26, 2012
```
llvm-svn: 155626
```
6699a60b

Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to... · 08ccfbe5

Craig Topper authored Apr 26, 2012

Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.

llvm-svn: 155618

08ccfbe5

Teach the reassociate pass to fold chains of multiplies with repeated · 739ef80f

Chandler Carruth authored Apr 26, 2012

elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to do at least a decent job on a very diverse range of
inputs.
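
For readers unfamiliar with binary exponentiation, the Python sketch below shows the plain square-and-multiply scheme for a single repeated factor and counts the multiplies it uses. It is not the heuristic implemented in the reassociate pass, which handles several distinct factors; `power_by_squaring` is a hypothetical name.

```python
def power_by_squaring(x, n):
    """Return (x**n, number_of_multiplies) for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    result = None      # product accumulated so far (None means 'empty')
    base = x           # holds x**(2**k) on the k-th iteration
    multiplies = 0
    while n:
        if n & 1:
            if result is None:
                result = base          # first factor needs no multiply
            else:
                result = result * base
                multiplies += 1
        n >>= 1
        if n:
            base = base * base         # square only if more bits remain
            multiplies += 1
    return result, multiplies


print(power_by_squaring(3, 8))  # (6561, 3)
print(power_by_squaring(2, 5))  # (32, 3)
```

A naive chain for x**8 needs 7 multiplies, while the squaring chain above needs 3, which is the kind of saving the commit is after.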

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

llvm-svn: 155616

739ef80f

If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume · 9f7ad310

Evan Cheng authored Apr 26, 2012

the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438

llvm-svn: 155601

9f7ad310