1. Jan 02, 2015
• InstCombine: Detect when llvm.umul.with.overflow always overflows · c8a576b5
      David Majnemer authored
      We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero
      and LHSKnownOne * RHSKnownOne overflow.
      
      llvm-svn: 225077
• Analysis: Reformulate WillNotOverflowUnsignedMul for reusability · 491331ac
      David Majnemer authored
      WillNotOverflowUnsignedMul's smarts will live in ValueTracking as
      computeOverflowForUnsignedMul.  It now returns a tri-state result:
      never overflows, always overflows and sometimes overflows.
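For illustration, here is a minimal C++ sketch of that tri-state idea for a 32-bit unsigned multiply, which also covers the always-overflows condition from the entry above. The function and enumerator names are invented for this example and are not LLVM's actual API.

  #include <cstdint>

  enum class OverflowResult { NeverOverflows, MayOverflow, AlwaysOverflows };

  OverflowResult unsignedMulOverflow(uint32_t LHSKnownZero, uint32_t LHSKnownOne,
                                     uint32_t RHSKnownZero, uint32_t RHSKnownOne) {
    // The largest value an operand can take has a 1 in every bit not known to
    // be zero; the smallest has a 1 only in the bits known to be one.
    uint64_t LHSMax = (~LHSKnownZero) & 0xFFFFFFFFull, LHSMin = LHSKnownOne;
    uint64_t RHSMax = (~RHSKnownZero) & 0xFFFFFFFFull, RHSMin = RHSKnownOne;

    if (LHSMax * RHSMax <= UINT32_MAX)   // even the largest possible product fits
      return OverflowResult::NeverOverflows;
    if (LHSMin * RHSMin > UINT32_MAX)    // even the smallest possible product overflows
      return OverflowResult::AlwaysOverflows;
    return OverflowResult::MayOverflow;
  }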
      
      llvm-svn: 225076
• [SROA] Teach SROA to be more aggressive in splitting now that we have · 24ac830d
      Chandler Carruth authored
      a pre-splitting pass over loads and stores.
      
      Historically, splitting could cause enough problems that I hamstrung the
      entire process with a requirement that splittable integer loads and
      stores must cover the entire alloca. All smaller loads and stores were
      unsplittable to prevent chaos from ensuing. With the new pre-splitting
      logic that does load/store pair splitting I introduced in r225061, we
      can now very nicely handle arbitrarily splittable loads and stores. In
      order to fully benefit from these smarts, we need to mark all of the
      integer loads and stores as splittable.
      
      However, we don't actually want to rewrite partitions with all integer
      loads and stores marked as splittable. This will fail to extract scalar
      integers from aggregates, which is kind of the point of SROA. =] In
      order to resolve this, what we really want to do is only do
      pre-splitting on the alloca slices with integer loads and stores fully
      splittable. This allows us to uncover all non-integer uses of the alloca
      that would benefit from a split in an integer load or store (and where
      introducing the split is safe because it is just memory transfer from
      a load to a store). Once done, we make all the non-whole-alloca integer
      loads and stores unsplittable just as they have historically been,
      repartition and rewrite.
      
      The result is that when there are integer loads and stores anywhere
      within an alloca (such as from a memcpy of a sub-object of a larger
      object), we can split them up if there are non-integer components to the
      aggregate hiding beneath. I've added the challenging test cases to
      demonstrate how this is able to promote to scalars even a case where we
have *partially* overlapping loads and stores.
      
      This restores the single-store behavior for small arrays of i8s which is
      really nice. I've restored both the little endian testing and big endian
      testing for these exactly as they were prior to r225061. It also forced
      me to be more aggressive in an alignment test to actually defeat SROA.
      =] Without the added volatiles there, we actually split up the weird i16
      loads and produce nice double allocas with better alignment.
      
      This also uncovered a number of bugs where we failed to handle
splittable load and store slices which didn't have a beginning offset of
      zero. Those fixes are included, and without them the existing test cases
      explode in glorious fireworks. =]
      
      I've kept support for leaving whole-alloca integer loads and stores as
      splittable even for the purpose of rewriting, but I think that's likely
      no longer needed. With the new pre-splitting, we might be able to remove
all the splitting support for loads and stores from the rewriter. I'm not
doing that in this patch so that any performance regressions it causes stay
isolated in an easy-to-find-and-revert chunk.
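As an editor's illustration only (the struct and function names below are hypothetical, not from the commit or its tests), this is the shape of code involved: a memcpy of a sub-object lowers to an integer-style load/store pair over the alloca, while the element accesses are the non-integer uses that benefit from the split.

  #include <cstring>

  struct Inner { float a, b; };
  struct Outer { Inner in; int tag; };

  float use_inner(const Outer &o) {
    Inner tmp;
    std::memcpy(&tmp, &o.in, sizeof tmp); // sub-object copy: becomes an
                                          // integer-sized load/store pair
    return tmp.a * tmp.b;                 // scalar float uses of the same alloca
  }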
      
      llvm-svn: 225074
• [SROA] Make the computation of adjusted pointers not leak GEP · 5986b541
      Chandler Carruth authored
      instructions.
      
      I noticed this when working on dialing up how aggressively we can
      pre-split loads and stores. My test case wasn't passing because dead
      GEPs into the allocas persisted when they were built by this routine.
      This isn't terribly harmful, we still rewrote and promoted the alloca
      and I can't conceive of how to cause this to happen in a case where we
      will keep the exact same alloca but rewrite and promote the uses of it.
      If that ever happened, we'd get an assert out of mem2reg.
      
      So I don't have a direct test case yet, but the subsequent commit's test
      case wouldn't pass without this. There are other problems fixed by this
      patch that I spotted purely by inspection such as the fact that
      getAdjustedPtr could have actually deleted dead base pointers. I don't
      know how to get a base pointer to go into getAdjustedPtr today, so
      I think this bug could never have manifested (and I certainly can't
write a test case for it) but it wasn't the intent of the code. The
      code really just wanted to GC the new instructions built. That can be
      done more directly by comparing with the base pointer which is the only
      non-new instruction that this code can return.
      
      llvm-svn: 225073
• [SROA] Fix the loop exit placement to be prior to indexing the splits · 29c22fae
      Chandler Carruth authored
      array. This prevents it from walking out of bounds on the splits array.
      
      Bug found with the existing tests by ASan and by the MSVC debug build.
      
      llvm-svn: 225069
• [SROA] Fix two total think-os in r225061 that should have been caught on · c39eaa50
      Chandler Carruth authored
      a +asserts bootstrap, but my bootstrap had asserts off. Oops.
      
      Anyways, in some places it is reasonable to cast (as a sanity check) the
pointer operand of a load or store to an instruction within SROA --
      namely when the pointer operand is expected to be derived from an
      alloca, and thus always an instruction. However, the pre-splitting code
      also deals with loads and stores to non-alloca pointers and there we
      need to just use the Value*. Nothing about the code relied on the
instruction cast; it was only there essentially as an invariant
      assertion. Remove the two that don't actually hold.
      
      This should fix the proximate issue in PR22080, but I'm also doing an
      asserts bootstrap myself to see if there are other issues lurking.
      
      I'll craft a reduced test case in a moment, but I wanted to get the tree
      healthy as quickly as possible.
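A minimal sketch of the distinction being described, assuming only the standard LLVM casting utilities; the helper name is invented and this is not the actual SROA code.

  #include "llvm/IR/Instructions.h"
  #include "llvm/Support/Casting.h"

  // When the pointer is known to be derived from an alloca it is always an
  // Instruction, but in general it may be any Value (an argument, a global,
  // ...), so cast<Instruction> would assert where dyn_cast simply fails.
  void visitPointerOperand(llvm::LoadInst &LI) {
    llvm::Value *Ptr = LI.getPointerOperand();
    if (auto *I = llvm::dyn_cast<llvm::Instruction>(Ptr)) {
      (void)I; // instruction-only handling would go here
    }
    // Otherwise keep working in terms of the plain Value*.
  }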
      
      llvm-svn: 225068
2. Jan 01, 2015
• [SROA] Switch to using a more direct debug logging technique in one part · 6044c0bc
      Chandler Carruth authored
      of my new load and store splitting, and fix a bug where it logged
      a totally irrelevant slice rather than the actual slice in question.
      
      The logging here previously worked because we used to place new slices
      onto the back of the core sequence, but that caused other problems.
      I updated the actual code to store new slices in their own vector but
      didn't update the logging. There isn't a good way to reuse the logging
      any more, and frankly it wasn't needed. We can directly log this bit
      more easily.
      
      llvm-svn: 225063
• [SROA] Fix formatting with clang-format which I managed to fail to do · 994cde88
      Chandler Carruth authored
      prior to committing r225061. Sorry for that.
      
      llvm-svn: 225062
• [SROA] Teach SROA how to much more intelligently handle split loads and · 0715cba0
      Chandler Carruth authored
      stores.
      
      When there are accesses to an entire alloca with an integer
      load or store as well as accesses to small pieces of the alloca, SROA
      splits up the large integer accesses. In order to do that, it uses bit
      math to merge the small accesses into large integers. While this is
      effective, it produces insane IR that can cause significant problems in
      the rest of the optimizer:
      
      - It can cause load and store mismatches with GVN on the non-alloca side
        where we end up loading an i64 (or some such) rather than loading
        specific elements that are stored.
      - We can't always get rid of the integer bit math, which is why we can't
        always fix the loads and stores to work well with GVN.
      - This is especially bad when we have operations that mix poorly with
        integer bit math such as floating point operations.
      - It will block things like the vectorizer which might be able to handle
  the scalar stores that underlie the aggregate.
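To make the bit math described above concrete, here is a rough C++ analogue (an editor's illustration, not code from the commit): two 32-bit pieces packed into and extracted from a single 64-bit integer, which is exactly where the hops between floating point and integer come from.

  #include <cstdint>
  #include <cstring>

  uint64_t pack(float re, float im) {
    uint32_t re_bits, im_bits;
    std::memcpy(&re_bits, &re, sizeof re_bits);   // bitcast float -> i32
    std::memcpy(&im_bits, &im, sizeof im_bits);
    return (uint64_t)im_bits << 32 | re_bits;     // merge the pieces into one i64
  }

  float extract_re(uint64_t packed) {
    uint32_t bits = (uint32_t)packed;             // extract the low piece
    float re;
    std::memcpy(&re, &bits, sizeof re);           // bitcast i32 -> float
    return re;
  }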
      
      At the same time, we can't just directly split up these loads and stores
      in all cases. If there is actual integer arithmetic involved on the
      values, then using integer bit math is actually the perfect lowering
      because we can often combine it heavily with the surrounding math.
      
      The solution this patch provides is to find places where SROA is
      partitioning aggregates into small elements, and look for splittable
      loads and stores that it can split all the way to some other adjacent
      load and store. These are uniformly the cases where failing to split the
      loads and stores hurts the optimizer that I have seen, and I've looked
      extensively at the code produced both from more and less aggressive
      approaches to this problem.
      
      However, it is quite tricky to actually do this in SROA. We may have
      loads and stores to the same alloca, or other complex patterns that are
      hard to handle. This complexity leads to the somewhat subtle algorithm
      implemented here. We have to do this entire process as a separate pass
      over the partitioning of the alloca, and split up all of the loads prior
to splitting the stores so that we can safely handle the cases of
      overlapping, including partially overlapping, loads and stores to the
      same alloca. We also have to reconstitute the post-split slice
      configuration so we can avoid iterating again over all the alloca uses
      (the slow part of SROA). But we also have to ensure that when we split
      up loads and stores to *other* allocas, we *do* re-iterate over them in
      SROA to adapt to the more refined partitioning now required.
      
      With this, I actually think we can fix a long-standing TODO in SROA
      where I avoided splitting as many loads and stores as probably should be
      splittable. This limitation historically mitigated the fallout of all
      the bad things mentioned above. Now that we have more intelligent
      handling, I plan to remove the FIXME and more aggressively mark integer
      loads and stores as splittable. I'll do that in a follow-up patch to
      help with bisecting any fallout.
      
      The net result of this change should be more fine-grained and accurate
      scalars being formed out of aggregates. At the very least, Clang now
      generates perfect code for this high-level test case using
      std::complex<float>:
      
        #include <complex>
      
        void g1(std::complex<float> &x, float a, float b) {
          x += std::complex<float>(a, b);
        }
        void g2(std::complex<float> &x, float a, float b) {
          x -= std::complex<float>(a, b);
        }
      
        void foo(const std::complex<float> &x, float a, float b,
                 std::complex<float> &x1, std::complex<float> &x2) {
          std::complex<float> l1 = x;
          g1(l1, a, b);
          std::complex<float> l2 = x;
          g2(l2, a, b);
          x1 = l1;
          x2 = l2;
        }
      
      This code isn't just hypothetical either. It was reduced out of the hot
      inner loops of essentially every part of the Eigen math library when
      using std::complex<float>. Those loops would consistently and
      pervasively hop between the floating point unit and the integer unit due
      to bit math extraction and insertion of floating point values that were
      "stored" in a 64-bit integer register around the loop backedge.
      
So far, this change has passed a bootstrap and some other testing with no
issues. That doesn't mean there won't be any, so I'll be prepared to help
with any fallout. If you see performance swings in particular, please let
me know. I'm very curious what the impact of this change will be. Stay
tuned for the follow-up to also split more integer loads and stores.
      
      llvm-svn: 225061
3. Dec 30, 2014
• Some code improvements in Masked Load/Store. · 84d1997b
      Elena Demikhovsky authored
      No functional changes.
      
      llvm-svn: 224986
• Carry facts about nullness and undef across GC relocation · 9db26ffc
      Philip Reames authored
      This change implements four basic optimizations:
      
- If a relocated value isn't used, it doesn't need to be relocated.
- If the value being relocated is null, relocation doesn't change that.
  (Technically, this might be collector specific; I don't know of one for
  which it doesn't work.)
- If the value being relocated is undef, the relocation is meaningless.
- If the value being relocated was known nonnull, the relocated pointer
  also isn't null (since it points to the same source-language object).
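As an editor's sketch of the constant cases above (the helper name is invented; the real change operates on gc.relocate calls), a null or undef input already determines what the relocation must produce:

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Value.h"
  #include "llvm/Support/Casting.h"

  // Returns the value the relocation must produce when that can be decided
  // from the input alone, or nullptr when the relocate has to stay.
  llvm::Value *foldRelocatedValue(llvm::Value *DerivedPtr) {
    // Relocating null still yields null; relocating undef is meaningless.
    if (llvm::isa<llvm::ConstantPointerNull>(DerivedPtr) ||
        llvm::isa<llvm::UndefValue>(DerivedPtr))
      return DerivedPtr;
    return nullptr; // nonnull-ness can still be propagated to the result
  }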
      
      I outlined other planned work in comments.
      
      Differential Revision: http://reviews.llvm.org/D6600
      
      llvm-svn: 224968
• Refine the notion of MayThrow in LICM to include a header specific version · b35f46ce
      Philip Reames authored
In LICM, we have a check for whether an instruction is guaranteed to execute and thus can't introduce any new faults if moved to the preheader. To handle a function which might unconditionally throw when first called, we check for any potentially throwing call in the loop and give up.
      
This is unfortunate when the potentially throwing condition is down a rare path. It prevents essentially all LICM of potentially faulting instructions where the faulting condition is checked outside the loop. It also greatly diminishes the utility of loop unswitching, since control-dependent instructions - which are now likely in the loop's header block - will not be lifted by subsequent LICM runs.
      
      define void @nothrow_header(i64 %x, i64 %y, i1 %cond) {
      ; CHECK-LABEL: nothrow_header
      ; CHECK-LABEL: entry
      ; CHECK: %div = udiv i64 %x, %y
      ; CHECK-LABEL: loop
      ; CHECK: call void @use(i64 %div)
      entry:
        br label %loop
      loop: ; preds = %entry, %for.inc
        %div = udiv i64 %x, %y
        br i1 %cond, label %loop-if, label %exit
      loop-if:
        call void @use(i64 %div)
        br label %loop
      exit:
        ret void
      }
      
The current patch really only helps with non-memory instructions (i.e. divs, etc.) since the maythrow call down the rare path will be considered to alias an otherwise hoistable load.  The one exception is that it does kick in for loads which are known to be invariant without regard to other possible stores, i.e. those marked with either !invariant.load metadata or TBAA 'is constant memory' metadata.
      
      Differential Revision: http://reviews.llvm.org/D6725
      
      llvm-svn: 224965
4. Dec 29, 2014
• Loading from null is valid outside of addrspace 0 · 5ad26c35
      Philip Reames authored
This patch fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen.  This transform is perfectly legal in address space 0, but is not necessarily legal in other address spaces.
      
We really should introduce a hook to control this property on a per-target, per-address-space basis.  We may be losing valuable optimizations in some address spaces by being too conservative.
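For illustration (an editor's sketch using Clang's address_space attribute, which is assumed here and is not part of the patch): in a non-zero address space a null pointer may name a valid location, so the optimizer cannot treat the load below as unreachable.

  typedef int __attribute__((address_space(1))) as1_int;

  int read_as1(as1_int *p) {
    // Even if p is null, address space 1 may define this as a valid access,
    // unlike a null dereference in the default address space 0.
    return *p;
  }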
      
      Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me.
      
      llvm-svn: 224961
5. Dec 24, 2014
• [SROA] Update the documentation and names for accessing the slices · ffb7ce56
      Chandler Carruth authored
      within a partition of an alloca in SROA.
      
      This reflects the fact that the organization of the slices isn't really
      ideal for analysis, but is the naive way in which the slices are
      available while we're processing them in the core partitioning
      algorithm.
      
      It is possible we could improve matters, and I've left a FIXME with
      one of my ideas for how to do this, but it is a lot of work, the benefit
      is somewhat minor, and it isn't clear that it would be strictly better.
      =/ Not really satisfying, but I'm out of really good ideas.
      
      This also improves one place where the debug logging failed to mark some
      split partitions. Now we log in one place, slightly later, and with
      accurate information about whether the slice is split by the partition
      being rewritten.
      
      llvm-svn: 224800
• [SROA] Refactor the integer and vector promotion testing logic to · 5031bbe8
      Chandler Carruth authored
      operate in terms of the new Partition class, and generally have a more
      clear set of arguments. No functionality changed.
      
      The most notable improvements here are consistently using the
      terminology of 'partition' for a collection of slices that will be
      rewritten together and 'slice' for a region of an alloca that is used by
      a particular instruction.
      
      This also makes it more clear that the split things are actually slices
      as well, just ones that will be split by the proposed partition.
      
      This doesn't yet address the confusing aspects of the partition's
      interface where slices that will be split by the partition and start
prior to the partition are accessed via Partition::splitSlices() while
      the core range of slices exposed by a Partition includes both unsplit
      slices and slices which will be split by the end, but started within the
      offset range of the partition. This is particularly hard to address
      because the algorithm which computes partitions quite literally doesn't
      know which slices these will end up being until too late. I'm looking at
      whether I can fix that or not, but I'm not optimistic. I'll update the
      comments and/or names to further explain this either way. I've also
      added one FIXME in this patch relating to this confusion so that I don't
      forget about it.
      
      llvm-svn: 224798
6. Dec 23, 2014
• [asan] change the coverage collection scheme so that we can easily emit... · 9fdeb37b
      Kostya Serebryany authored
      [asan] change the coverage collection scheme so that we can easily emit coverage for the entire process as a single bit set, and if coverage_bitset=1 actually emit that bitset
      
      llvm-svn: 224789
• [SimplifyCFG] Revise common code sinking · 5313da32
      Michael Liao authored
- Fix the case where more than one common instruction derived from the same
  operand cannot be sunk. When a pair of values has more than one derived
  value in both branches, only one derived value could be sunk.
- Replace the BB1 -> (BB2, PN) map with a joint value map, i.e. a map of
  (BB1, BB2) -> PN, which tracks common ops more accurately.
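An illustrative sketch (an editor's example, not from the commit's tests): both branches end in instructions derived from the same operands, the kind of common code SimplifyCFG tries to sink into the join block, and the fix above allows more than one such derived value per operand pair to be sunk.

  void sink_candidates(bool c, int x, int *p, int *q) {
    if (c) {
      *p = x + 1;   // first value derived from x
      *q = x * 2;   // second value derived from x
    } else {
      *p = x + 3;
      *q = x * 4;
    }
    // After sinking, a single pair of stores below the branch uses values
    // selected (via phi) between the two sides.
  }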
      
      llvm-svn: 224757
• Remove a bad cast in CloneModule() · 0bf33ffd
      Michael Kuperstein authored
A cast that was introduced in r209007 was accidentally left in after the changes made to GlobalAlias rules in r210062. This crashes if the aliasee is a now-legal ConstantExpr.
      
      llvm-svn: 224756
• Revert r224739: Debug info: Teach SROA how to update debug info for · c7d1e24b
      Chandler Carruth authored
      fragmented variables.
      
      This caused codegen to start crashing when we built somewhat large
programs with debug info and optimizations. 'check-msan' hit it, and
      I suspect a bootstrap would as well. I mailed a test case to the
      review thread.
      
      llvm-svn: 224750
• Remove dynamic allocation/indirection from GCOVBlocks owned by GCOVFunction · ea37c117
      David Blaikie authored
      Since these are all created in the DenseMap before they are referenced,
      there's no problem with pointer validity by the time it's required. This
      removes another use of DeleteContainerSeconds/manual memory management
      which I'm cleaning up from time to time.
      
      llvm-svn: 224744
7. Dec 22, 2014
• [SROA] Lift the logic for traversing the alloca slices one partition at · e2f66cee
      Chandler Carruth authored
      a time into a partition iterator and a Partition class.
      
      There is a lot of knock-on simplification that this enables, largely
      stemming from having a Partition object to refer to in lots of helpers.
I've only done a minimal amount of that because enough stuff is changing
      as-is in this commit.
      
      This shouldn't change any observable behavior. I've worked hard to
      preserve the *exact* traversal semantics which were originally present
      even though some of them make no sense. I'll be changing some of this in
      subsequent commits now that the logic is carefully factored into
      a reusable place.
      
      The primary motivation for this change is to break the rewriting into
      phases in order to support more intelligent rewriting. For example, I'm
      planning to change how split loads and stores are rewritten to remove
      the significant overuse of integer bit packing in the resulting code and
      allow more effective secondary splitting of aggregates. For any of this
      to work, they have to share the exact traversal logic.
      
      llvm-svn: 224742
• [LCSSA] Handle PHI insertion in disjoint loops · bad65c3b
      Bruno Cardoso Lopes authored
      Take two disjoint Loops L1 and L2.
      
      LoopSimplify fails to simplify some loops (e.g. when indirect branches
      are involved). In such situations, it can happen that an exit for L1 is
the header of L2. Thus, when we create PHIs in one such exit, we are
also inserting PHIs in L2's header.
      
      This could break LCSSA form for L2 because these inserted PHIs can also
      have uses in L2 exits, which are never handled in the current
      implementation. Provide a fix for this corner case and test that we
      don't assert/crash on that.
      
      Differential Revision: http://reviews.llvm.org/D6624
      
      rdar://problem/19166231
      
      llvm-svn: 224740
• Debug info: Teach SROA how to update debug info for fragmented variables. · a47ace59
      Adrian Prantl authored
      This allows us to generate debug info for extremely advanced code such as
      
        typedef struct { long int a; int b;} S;
      
        int foo(S s) {
          return s.b;
        }
      
      which at -O1 on x86_64 is codegen'd into
      
        define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 {
          ret i32 %s.coerce1, !dbg !24
        }
      
      with this patch we emit the following debug info for this
      
        TAG_formal_parameter [3]
          AT_location( 0x00000000
                       0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004
                       0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 )
                       AT_name( "s" )
                       AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" )
      
      Thanks to chandlerc, dblaikie, and echristo for their feedback on all
      previous iterations of this patch!
      
      llvm-svn: 224739