Commits · 1d68112c4b9ec7502ed9776555fd499cb2483347 · Lorenzo Albano / LLVM bpEVL

Jan 25, 2018

[InstCombine] narrow masked zexted binops (PR35792) · 1d68112c

Sanjay Patel authored Jan 25, 2018

This is guarded by shouldChangeType(), so the tests show that
we don't do the fold if the narrower type is not legal. Note
that there is a proposal (D42424) that would change the results
for the specific cases shown in these tests. That difference is
also discussed in PR35792:
https://bugs.llvm.org/show_bug.cgi?id=35792

Alive proofs for the cases handled here as well as the bitwise 
logic binops that we should already do better on:
https://rise4fun.com/Alive/c97
https://rise4fun.com/Alive/Lc5E
https://rise4fun.com/Alive/kdf

llvm-svn: 323437

1d68112c

[InstCombine] add tests for PR35792; NFC · 0f95dd23
Sanjay Patel authored Jan 25, 2018
```
llvm-svn: 323436
```
0f95dd23

Jan 24, 2018

[InstCombine] fix datalayout in test file · 60c13c77

Sanjay Patel authored Jan 24, 2018

The only part of the datalayout that should matter for these tests
is the part that specifies the legal int widths ('n*'). But there
was a bug - that part of the string was not correctly separated with
the expected '-' character, so we were testing as if there were no
legal int widths at all. Removed the leading cruft so we have some 
legal ints to test with.

I noticed this while testing a potential change to the way we 
transform shifts and sexts in D42424.

llvm-svn: 323377

60c13c77

Jan 21, 2018

[InstCombine] (X << Y) / X -> 1 << Y · 9530f188

Sanjay Patel authored Jan 21, 2018

...when the shift is known to not overflow with the matching
signed-ness of the division.

This closes an optimization gap caused by canonicalizing mul
by power-of-2 to shl as shown in PR35709:
https://bugs.llvm.org/show_bug.cgi?id=35709

Patch by Anton Bikineev!

Differential Revision: https://reviews.llvm.org/D42032

llvm-svn: 323068

9530f188

Jan 20, 2018
- [InstCombine] add baseline tests for (X << Y) / X -> 1 << Y; NFC · 43913218
  Sanjay Patel authored Jan 20, 2018
```
This fold is proposed in D42032.

llvm-svn: 323043
```
  43913218
Jan 19, 2018

Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) · 1e68724d

Daniel Neilson authored Jan 19, 2018

Summary:
This is a resurrection of work first proposed and discussed in Aug 2015:
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
require that the alignments for source & dest be equal.

For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use
getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965

1e68724d

[InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr · 2867bd72

John Brawn authored Jan 19, 2018

Three (or more) operand getelementptrs could plausibly also be handled, but
handling only two-operand fits in easily with the existing BinaryOperator
handling.

Differential Revision: https://reviews.llvm.org/D39958

llvm-svn: 322930

2867bd72

Jan 17, 2018
- [InstCombine] add baseline tests for D39958; NFC · 218a0b51
  Sanjay Patel authored Jan 17, 2018
```
llvm-svn: 322733
```
  218a0b51
- [InstCombine] fix demanded-bits propagation for zext/trunc · aa766efd
  Sanjay Patel authored Jan 17, 2018
```
I was comparing the demanded-bits implementations between InstCombine
and TargetLowering as part of investigating questions in D42088 and
noticed that this was wrong in IR. We were losing all of the prior
known bits when we got back to the 'zext'.

llvm-svn: 322662
```
  aa766efd
- [InstCombine] add test to show hole in demanded bits; NFC · 178deccb
  Sanjay Patel authored Jan 17, 2018
```
llvm-svn: 322660
```
  178deccb
Jan 11, 2018

[InstCombine] Apply the fix from r322284 for sin / cos -> tan too · 738e6e7c
Benjamin Kramer authored Jan 11, 2018
```
llvm-svn: 322285
```
738e6e7c

[InstCombine] For cos/sin -> tan copy attributes from cos instead of the · 44993ede

Benjamin Kramer authored Jan 11, 2018

parent function

Ideally we should merge the attributes from the functions somehow, but
this is obviously an improvement over taking random attributes from the
caller which will trip up the verifier if they're nonsensical for an
unary intrinsic call.

llvm-svn: 322284

44993ede

[ValueTracking] recognize min/max-of-min/max with notted ops (PR35875) · e63d8dda

Sanjay Patel authored Jan 11, 2018

This was originally planned as the fix for:
https://bugs.llvm.org/show_bug.cgi?id=35834
...but simpler transforms handled that case, so I implemented a 
lesser solution. It turns out we need to handle the case with 'not'
ops too because the real code example that we are trying to solve:
https://bugs.llvm.org/show_bug.cgi?id=35875
...has extra uses of the intermediate values, so we can't rely on 
smaller canonicalizations to get us to the goal.

As with rL321672, I've tried to show every possibility in the
codegen tests because that's the simplest way to prove we're doing
the right thing in the wide variety of permutations of this pattern.

We can also show an InstCombine win because we added a fold for
this case in:
rL321998 / D41603

An Alive proof for one variant of the pattern to show that the 
InstCombine and codegen results are correct:
https://rise4fun.com/Alive/vd1

Name: min3_nots
  %nx = xor i8 %x, -1
  %ny = xor i8 %y, -1
  %nz = xor i8 %z, -1
  %cmpxz = icmp slt i8 %nx, %nz
  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
  %cmpyz = icmp slt i8 %ny, %nz
  %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
  %cmpyx = icmp slt i8 %y, %x
  %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
  %cmpxyz = icmp slt i8 %minxz, %ny
  %r = select i1 %cmpxyz, i8 %minxz, i8 %ny

Name: min3_nots_alt
  %nx = xor i8 %x, -1
  %ny = xor i8 %y, -1
  %nz = xor i8 %z, -1
  %cmpxz = icmp slt i8 %nx, %nz
  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
  %cmpyz = icmp slt i8 %ny, %nz
  %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
  %cmpyx = icmp slt i8 %y, %x
  %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
  %xz = icmp sgt i8 %x, %z
  %maxxz = select i1 %xz, i8 %x, i8 %z
  %xyz = icmp sgt i8 %maxxz, %y
  %maxxyz = select i1 %xyz, i8 %maxxz, i8 %y
  %r = xor i8 %maxxyz, -1

llvm-svn: 322283

e63d8dda

[InstCombine] add min3-with-nots test (PR35875); NFC · e0df4650
Sanjay Patel authored Jan 11, 2018
```
llvm-svn: 322281
```
e0df4650

[InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x) · e5fbf591

Dmitry Venikov authored Jan 11, 2018

Summary: This patch enables folding sin(x) / cos(x) -> tan(x), cos(x) / sin(x) -> 1 / tan(x) under -ffast-math flag

Reviewers: hfinkel, spatel

Reviewed By: spatel

Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits

Differential Revision: https://reviews.llvm.org/D41286

llvm-svn: 322255

e5fbf591

Jan 10, 2018

[InstCombine] add test to show missed bswap; NFC · d04026ea

Sanjay Patel authored Jan 10, 2018

D41353 / D41233 are proposing to alter the shl/and canonicalization,
but I think that would just move an existing pattern-matching hole
to a different place.

llvm-svn: 322206

d04026ea

Jan 09, 2018
- [InstCombine] weaken assertions for icmp folds (PR35846) · 6fb1357c
  Sanjay Patel authored Jan 09, 2018
```
Because of potential UB (known bits conflicts with an llvm.assume),
we have to check rather than assert here because InstSimplify doesn't
kill the compare:
https://bugs.llvm.org/show_bug.cgi?id=35846

llvm-svn: 322104
```
  6fb1357c
- [InstCombine] Check for out of range ashr values using APInt before calling getZExtValue · 5d909be9
  Simon Pilgrim authored Jan 09, 2018
```
Reduced from oss-fuzz #5032 test case

llvm-svn: 322078
```
  5d909be9
- [InstCombine] Add pow2 mul -> shl tests for vectors with uniform/non-uniform constants · 94357afd
  Simon Pilgrim authored Jan 09, 2018
```
llvm-svn: 322072
```
  94357afd
Jan 08, 2018

[ValueTracking] remove overzealous assert · 7dfe96ad

Sanjay Patel authored Jan 08, 2018

The test is derived from a failing fuzz test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5008

Credit to @rksimon for pointing out the problem.

llvm-svn: 322016

7dfe96ad

[InstCombine] fold min/max tree with common operand (PR35717) · 31b4b76f

Sanjay Patel authored Jan 08, 2018

There is precedence for factorization transforms in instcombine for FP ops with fast-math. 
We also have similar logic in foldSPFofSPF().

It would take more work to add this to reassociate because that's specialized for binops, 
and min/max are not binops (or even single instructions). Also, I don't have evidence that 
larger min/max trees than this exist in real code, but if we find that's true, we might
want to reorganize where/how we do this optimization.

In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have:

int test(int xc, int xm, int xy) {
  int xk;
  if (xc < xm)
    xk = xc < xy ? xc : xy;
  else
    xk = xm < xy ? xm : xy;
  return xk;
}

This patch solves that problem because we recognize more min/max patterns after rL321672

https://rise4fun.com/Alive/Qjne
https://rise4fun.com/Alive/3yg

Differential Revision: https://reviews.llvm.org/D41603

llvm-svn: 321998

31b4b76f

Jan 06, 2018

[InstCombine] relax use constraint for min/max (~a, ~b) --> ~min/max(a, b) · 26a6fcde

Sanjay Patel authored Jan 06, 2018

In the minimal case, this won't remove instructions, but it still improves
uses of existing values.

In the motivating example from PR35834, it does remove instructions, and
sets that case up to be optimized by something like D41603:
https://reviews.llvm.org/D41603

llvm-svn: 321936

26a6fcde

[InstCombine] add more tests for max(~a, ~b) and PR35834; NFC · f7e77529
Sanjay Patel authored Jan 06, 2018
```
llvm-svn: 321935
```
f7e77529

Jan 05, 2018

[InstCombine] add folds for min(~a, b) --> ~max(a, b) · 5b6aacf2

Sanjay Patel authored Jan 05, 2018

Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b),
the use checking and operand creation were off. We were potentially creating 
repeated identical instructions of existing values. This led to infinite
looping after I added the extra folds.

By using the simpler m_Not matcher and not creating new 'not' ops for a and b,
we avoid that problem. It's possible that not using IsFreeToInvert() here is
more limiting than the simpler matcher, but there are no tests for anything
more exotic. It's also possible that we should relax the use checking further
to handle a case like PR35834:
https://bugs.llvm.org/show_bug.cgi?id=35834
...but we can make that a follow-up if it is needed. 

llvm-svn: 321882

5b6aacf2

Jan 04, 2018
- [InstCombine] safely create a constant of the right type (PR35794) · c63f9014
  Sanjay Patel authored Jan 04, 2018
```
llvm-svn: 321801
```
  c63f9014
Jan 03, 2018
- [InstCombine] Check for out of range shift values using APInt before calling getZExtValue · 3bf2d645
  Simon Pilgrim authored Jan 03, 2018
```
Reduced from oss-fuzz #4871 test case

llvm-svn: 321748
```
  3bf2d645
- [InstCombine] Add test to remove VarArg casts (NFC) · dcc0ba9b
  Florian Hahn authored Jan 03, 2018
```
llvm-svn: 321706
```
  dcc0ba9b
Jan 02, 2018

[InstCombine] Missed optimization in math expression: squashing sqrt functions · a58d8deb

Dmitry Venikov authored Jan 02, 2018

Summary: This patch enables folding under -ffast-math flag sqrt(a) * sqrt(b) -> sqrt(a*b)

Reviewers: hfinkel, spatel, davide

Reviewed By: spatel, davide

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D41322

llvm-svn: 321637

a58d8deb

Jan 01, 2018
- [ValueTracking] Don't assume shift values are in range · 6720726d
  Simon Pilgrim authored Jan 01, 2018
```
Reduced (as best I could...) from oss-fuzz #4857 test case

llvm-svn: 321634
```
  6720726d
- [InstCombine] Regenerate udiv tests. · af35f5ec
  Simon Pilgrim authored Jan 01, 2018
```
llvm-svn: 321633
```
  af35f5ec
Dec 30, 2017

[instsimplify] consistently handle undef and out of bound indices for... · e499bc30

Philip Reames authored Dec 30, 2017

[instsimplify] consistently handle undef and out of bound indices for insertelement and extractelement

In one case, we were handling out of bounds, but not undef indices.  In the other, we were handling undef (with the comment making the analogy to out of bounds), but not out of bounds.  Be consistent and treat both undef and constant out of bounds indices as producing undefined results.

As a side effect, this also protects instcombine from having to handle large constant indices as we always simplify first.

llvm-svn: 321575

e499bc30

Add another test case for r321489 · 8e1abe4a

Philip Reames authored Dec 30, 2017

Went to reduce another fuzzer failure to find it's already been fixed, but the test case is slightly different so it's worth adding anyways.

Reduced from oss-fuzz #4768 test case

llvm-svn: 321573

8e1abe4a

Move tests associated with transforms moved in r321467 · 3e9c6719
Philip Reames authored Dec 30, 2017
```
llvm-svn: 321572
```
3e9c6719

Dec 28, 2017
- [InstCombine] Check for isa<Instruction> before using cast<> · 472689a1
  Simon Pilgrim authored Dec 28, 2017
```
Protects against casts from constexpr etc.

Reduced from oss-fuzz #4788 test case

llvm-svn: 321515
```
  472689a1
Dec 27, 2017
- [InstCombine] add tests for min/max folds (PR35717); NFC · 84d54c3d
  Sanjay Patel authored Dec 27, 2017
```
llvm-svn: 321500
```
  84d54c3d
- [InstCombine] Gracefully handle out of range extractelement indices · e7d032f1
  Simon Pilgrim authored Dec 27, 2017
```
InstSimplify is responsible for handling these, but we shouldn't just assert here.

Reduced from oss-fuzz #4808 test case

llvm-svn: 321489
```
  e7d032f1
- [instcombine] add powi(x, 2) -> x * x · cd13a663
  Philip Reames authored Dec 27, 2017
```
llvm-svn: 321468
```
  cd13a663
Dec 26, 2017

[InstCombine] fix miscompile of frem with 0.0 operand (PR34870) · 14adbacd

Sanjay Patel authored Dec 26, 2017

We might want to select NAN here or do this transform with fast-math,
but this should at least fix the miscompile.

llvm-svn: 321461

14adbacd

[InstCombine] add test for frem with 0.0 (PR34870); NFC · 546c43fd
Sanjay Patel authored Dec 26, 2017
```
llvm-svn: 321460
```
546c43fd

[ValueTracking] ignore FP signed-zero when detecting a casted-to-integer fmin/fmax pattern · 9a39979d

Sanjay Patel authored Dec 26, 2017

This is a preliminary step for the patch discussed in D41136 (and denoted here with the FIXME comment).

When we match an FP min/max that is cast to integer, any intermediate difference between +0.0 or -0.0
should be muted in the result by the conversion (either fptosi or fptoui) of the result. Thus, we can
enable 'nsz' for the purpose of matching fmin/fmax.

Note that there's probably room to generalize this more, possibly by fixing the current calls to the
weak version of isKnownNonZero() in matchSelectPattern() to the more powerful recursive version.

Differential Revision: https://reviews.llvm.org/D41333

llvm-svn: 321456

9a39979d