Commits · 76c95562bc34c2819642e9756b31e508f838b052 · Roger Ferrer / llvm-epi-0.8

Feb 12, 2011

Add a note about SSE4.1 roundss/roundsd. · 69affe6a
Benjamin Kramer authored Feb 12, 2011
```
llvm-svn: 125438
```
69affe6a

Nadav Rotem authored Feb 12, 2011

The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.

llvm-svn: 125435

db2f5481

AsmMatcher custom operand parser failure enhancements. · 861e49ce

Jim Grosbach authored Feb 12, 2011

Teach the AsmMatcher handling to distinguish between an error custom-parsing
an operand and a failure to match. The former should propogate the error
upwards, while the latter should continue attempting to parse with
alternative matchers.

Update the ARM asm parser accordingly.

llvm-svn: 125426

861e49ce

Feb 11, 2011

Also fold (A+B) == A -> B == 0 when the add is commuted. · 1800d823
Benjamin Kramer authored Feb 11, 2011
```
llvm-svn: 125411
```
1800d823

Per discussion with Dan G, inbounds geps *certainly* can have · 7936a8a4

Chris Lattner authored Feb 11, 2011

unsigned overflow (e.g. "gep P, -1"), and while they can have
signed wrap in theoretical situations, modelling an AddRec as
not having signed wrap is going enough for any case we can 
think of today.  In the future if this isn't enough, we can
revisit this.  Modeling them as having NUW isn't causing any
known problems either FWIW.

llvm-svn: 125410

7936a8a4

When lowering an inbounds gep, the intermediate adds can have · d3c0e05f

Chris Lattner authored Feb 11, 2011

unsigned overflow (e.g. due to a negative array index), but
the scales on array size multiplications are known to not
sign wrap.

llvm-svn: 125409

d3c0e05f

Adds llvm::sys::path::is_separator() to test whether a char is a path separator · 606bb1a2
Zhanyong Wan authored Feb 11, 2011
```
on the host OS.  Reviewed by dgregor.

llvm-svn: 125406
```
606bb1a2
Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types. · fa62d504
Nate Begeman authored Feb 11, 2011
```
This avoids moving each element to the integer register file and calling __divsi3 etc. on it.

llvm-svn: 125402
```
fa62d504
SimplifySelectOps can only handle selects with a scalar condition. Add a check · a49a02a0
Nadav Rotem authored Feb 11, 2011
```
that the condition is not a vector.

llvm-svn: 125398
```
a49a02a0

Fix 9173. · 10134c33

Nadav Rotem authored Feb 11, 2011

Add more folding patterns to constant expressions of vector selects and vector
bitcasts.

llvm-svn: 125393

10134c33

· 18f6a334

Nadav Rotem authored Feb 11, 2011

Fix #9190

The bug happens when the DAGCombiner attempts to optimize one of the patterns
of the SUB opcode. It tries to create a zero of type v2i64. This type is legal
on 32bit machines, but the initializer of this vector (i64) is target dependent.
Currently, the initializer attempts to create an i64 zero constant, which fails.
Added a flag to tell the DAGCombiner to create a legal zero, if we require that
the pass would generate legal types.

llvm-svn: 125391

18f6a334

More whitespace cleanup... · 01af6c46
Jim Grosbach authored Feb 11, 2011
```
llvm-svn: 125388
```
01af6c46

Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about · 99de19b3

Cameron Zwarich authored Feb 11, 2011

a loop when unswitching it. It only does this in the complex case, because
everything should be fine already in the simple case.

llvm-svn: 125369

99de19b3

LoopInstSimplify preserves ScalarEvolution. · 25cb63c7
Cameron Zwarich authored Feb 11, 2011
```
llvm-svn: 125368
```
25cb63c7
make ConstantExpr::replaceUsesOfWithOnConstant preserve the inbounds · 603af188
Chris Lattner authored Feb 11, 2011
```
flag.  Noticed by Jin Gu Kang!

llvm-svn: 125366
```
603af188
make the constantexpr interfaces for inbounds GEPs follow the same style · 94c8d294
Chris Lattner authored Feb 11, 2011
```
as other constantexpr flags, reducing redundancy.

llvm-svn: 125365
```
94c8d294
Remove std::string version of getNameWithPrefix. · 34b59389
Rafael Espindola authored Feb 11, 2011
```
llvm-svn: 125363
```
34b59389

Fix buggy fcopysign lowering. · 2da1c959

Evan Cheng authored Feb 11, 2011

This
define float @foo(float %x, float %y) nounwind readnone {
entry:
  %0 = tail call float @copysignf(float %x, float %y) nounwind readnone
  ret float %0
}

Was compiled to:
    vmov     s0, r1
    bic      r0, r0, #-2147483648
    vmov     s1, r0
    vcmpe.f32    s0, #0
    vmrs         apsr_nzcv, fpscr
    it           lt
    vneglt.f32   s1, s1
    vmov         r0, s1
    bx           lr

This fails to copy the sign of -0.0f because it's lost during the float to int
conversion. Also, it's sub-optimal when the inputs are in GPR registers.

Now it uses integer and + or operations when it's profitable. And it's correct!
    lsrs    r1, r1, #31
    bfi     r0, r1, #31, #1
    bx      lr
rdar://8984306

llvm-svn: 125357

2da1c959

Tolerate degenerate phi nodes that can occur in the middle of optimization · ac0b62c2
Nick Lewycky authored Feb 10, 2011
```
passes. Fixes PR9112. Patch by Jakub Staszak!

llvm-svn: 125319
```
ac0b62c2
If we can't avoid running loop-simplify twice for now, at least avoid running · 97dae4d3
Cameron Zwarich authored Feb 10, 2011
```
iv-users twice.

llvm-svn: 125318
```
97dae4d3
Rename 'loopsimplify' to 'loop-simplify'. · d8e66038
Cameron Zwarich authored Feb 10, 2011
```
llvm-svn: 125317
```
d8e66038

· 79827a5a

David Greene authored Feb 10, 2011

[AVX] Implement 256-bit vector lowering for SCALAR_TO_VECTOR.  This
largely completes support for 128-bit fallback lowering for code that
is not 256-bit ready.

llvm-svn: 125315

79827a5a

Feb 10, 2011

Fix a lot of o32 CC issues and add a bunch of tests. Patch by Akira Hatanaka... · 61a61e9d
Bruno Cardoso Lopes authored Feb 10, 2011
```
Fix a lot of o32 CC issues and add a bunch of tests. Patch by Akira Hatanaka with some small modifications by me.

llvm-svn: 125292
```
61a61e9d

· ce318e49

David Greene authored Feb 10, 2011

[AVX] Implement 256-bit vector lowering for EXTRACT_VECTOR_ELT.

llvm-svn: 125284

ce318e49

ptx: add passing parameter to kernel functions · 84fde9ef
Che-Liang Chiou authored Feb 10, 2011
```
llvm-svn: 125279
```
84fde9ef

implement the first part of PR8882: when lowering an inbounds · d86ded17

Chris Lattner authored Feb 10, 2011

gep to explicit addressing, we know that none of the intermediate
computation overflows.

This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a 
negative index?

Previously the testcase would instcombine to:

define i1 @test(i64 %i) {
  %p1.idx.mask = and i64 %i, 4611686018427387903
  %cmp = icmp eq i64 %p1.idx.mask, 1000
  ret i1 %cmp
}

now we get:

define i1 @test(i64 %i) {
  %cmp = icmp eq i64 %i, 1000
  ret i1 %cmp
}

llvm-svn: 125271

d86ded17

switch the constantexpr, target folder, and IRBuilder interfaces · e9b4ad73

Chris Lattner authored Feb 10, 2011

for NSW/NUW binops to follow the pattern of exact binops.  This
allows someone to use Builder.CreateAdd(x, y, "tmp", MaybeNUW);

llvm-svn: 125270

e9b4ad73

Enhance a bunch of transformations in instcombine to start generating · 6b657aed

Chris Lattner authored Feb 10, 2011

exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267

6b657aed

Enhance the "compare with shift" and "compare with div" · 98457101

Chris Lattner authored Feb 10, 2011

optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts.  For example, these (which
are the same except the first is 'exact' sdiv:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %A = sdiv exact i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %A = sdiv i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

compile down to:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %1 = icmp eq i64 %X, 0
  ret i1 %1
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %X.off = add i64 %X, 4
  %1 = icmp ult i64 %X.off, 9
  ret i1 %1
}

This happens when you do something like:
  (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266

98457101

more cleanups, notably bitcast isn't used for "signed to unsigned type · dcef03fb
Chris Lattner authored Feb 10, 2011
```
conversions". :)

llvm-svn: 125265
```
dcef03fb

A bunch of cleanups and simplifications using the new PatternMatch predicates · 7d0e43ff

Chris Lattner authored Feb 10, 2011

and generally tidying things up.  Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.

 InstCombineAddSub.cpp |  296 +++++++++++++++++++++-----------------------------
 1 file changed, 126 insertions(+), 170 deletions(-)

llvm-svn: 125264

7d0e43ff

teach SimplifyDemandedBits that exact shifts demand the bits they · 768003c5
Chris Lattner authored Feb 10, 2011
```
are shifting out since they do require them to be zeros.  Similarly
for NUW/NSW bits of shl

llvm-svn: 125263
```
768003c5

After 3-addressifying a two-address instruction, update the register maps; add... · d4fcc053

Evan Cheng authored Feb 10, 2011

After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508.

llvm-svn: 125259

d4fcc053

Revert this in an attempt to bring the builders back. · da6bd450
Eric Christopher authored Feb 10, 2011
```
llvm-svn: 125257
```
da6bd450

Turn this pass ordering: · 58c8670a

Cameron Zwarich authored Feb 10, 2011

Natural Loop Information
 Loop Pass Manager
   Canonicalize natural loops
 Scalar Evolution Analysis
 Loop Pass Manager
   Induction Variable Users
   Canonicalize natural loops
   Induction Variable Users
   Loop Strength Reduction

into this:

Scalar Evolution Analysis
Loop Pass Manager
  Canonicalize natural loops
  Induction Variable Users
  Loop Strength Reduction

This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of
thing automatically, but it seems easier to just change the ordering of the
passes if this is the only case.

llvm-svn: 125254

58c8670a

Delete unused code for analyzing and splitting around loops. · 8dafc875

Jakob Stoklund Olesen authored Feb 09, 2011

Loop splitting is better handled by the more generic global region splitting
based on the edge bundle graph.

llvm-svn: 125243

8dafc875

Rip out realpath() support. It's expensive, and often a bad idea, and · 56b2ffda
Douglas Gregor authored Feb 09, 2011
```
I have another way to achieve the same goal.

llvm-svn: 125239
```
56b2ffda
Simplify using the new leaveIntvBefore() · 77ba1cf2
Jakob Stoklund Olesen authored Feb 09, 2011
```
llvm-svn: 125238
```
77ba1cf2
Use the LiveBLocks array for SplitEditor::splitSingleBlocks() as well. · 7cb57b30
Jakob Stoklund Olesen authored Feb 09, 2011
```
This fixes a bug where splitSingleBlocks() could split a live range after a
terminator instruction.

llvm-svn: 125237
```
7cb57b30
Attempt to fix the build after r125228. · f73f5ba2
Cameron Zwarich authored Feb 09, 2011
```
llvm-svn: 125236
```
f73f5ba2