  1. Mar 23, 2015
  2. Mar 21, 2015
    • Benjamin Kramer's avatar
      [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. · 7857d723
      Benjamin Kramer authored
      strchr("123!", C) != nullptr is a common pattern to check if C is one
      of 1, 2, 3 or !. If the largest element of the string is smaller than
      the target's register size we can easily create a bitfield and just
      do a simple test for set membership.
      
      int foo(char C) { return strchr("123!", C) != nullptr; } now becomes
      
      	cmpl	$64, %edi ## range check
      	sbbb	%al, %al
      	movabsq	$0xE000200000001, %rcx
      	btq	%rdi, %rcx ## bit test
      	sbbb	%cl, %cl
      	andb	%al, %cl ## and the two conditions
      	andb	$1, %cl
      	movzbl	%cl, %eax ## returning an int
      	ret
      
      (imho the backend should expand this into a series of branches, but
      that's a different story)
      
      The code is currently limited to bit fields that fit in a register, so
      usually 64 or 32 bits. Sadly, this misses anything using alpha chars
      or {}. This could be fixed by just emitting an i128 bit field, but that
      can generate really ugly code, so we have to find a better way. To some
      degree this is also recreating switch lowering logic, but we can't
      simply emit a switch instruction and thus change the CFG within
      instcombine.
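
      For illustration, a minimal C++ sketch of the same set-membership trick
      (a hypothetical helper, not the actual SimplifyLibCalls code). Note that
      bit 0 is set because strchr also matches the terminating NUL, which is
      why the mask equals the 0xE000200000001 constant in the assembly above:

        #include <cassert>
        #include <cstdint>

        static bool inCharSet(unsigned char C) {
          // One bit per character of "123!", plus bit 0 for the NUL.
          const uint64_t Mask = (1ULL << '1') | (1ULL << '2') | (1ULL << '3') |
                                (1ULL << '!') | (1ULL << 0); // 0xE000200000001
          return C < 64 && ((Mask >> C) & 1); // range check + bit test
        }

        int main() {
          assert(inCharSet('2') && inCharSet('!') && !inCharSet('z'));
        }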
      
      llvm-svn: 232902
      7857d723
    • Benjamin Kramer's avatar
      SimplifyLibCalls: Add basic optimization of memchr calls. · 691363e7
      Benjamin Kramer authored
      This is just memchr(x, y, 0) -> nullptr and constant folding.
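
      For illustration, a small C++ check of the two folds (a sketch only;
      the actual transform operates on LLVM IR):

        #include <cassert>
        #include <cstring>

        int main() {
          const char *S = "hello";
          assert(std::memchr(S, 'x', 0) == nullptr); // zero length -> null
          assert(std::memchr(S, 'l', 5) == S + 2);   // fully-constant call folds
        }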
      
      llvm-svn: 232896
      691363e7
    • Kostya Serebryany's avatar
      [sanitizer] experimental tracing for cmp instructions · f4e35cc4
      Kostya Serebryany authored
      llvm-svn: 232873
      f4e35cc4
  3. Mar 20, 2015
  4. Mar 19, 2015
    • Duncan P. N. Exon Smith's avatar
      Verifier: Remove the separate -verify-di pass · ab58a568
      Duncan P. N. Exon Smith authored
      Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
      Instead, call into the `DebugInfoVerifier` from inside
      `VerifierLegacyPass::finalizeModule()`.  This better matches the logic
      in `verifyModule()` (used by the new PassManager), avoids requiring two
      separate passes to verify the IR, and makes the API for "add a pass to
      verify the IR" simple.
      
      Note: the `-verify-debug-info` flag still works (for now, at least;
      eventually it might make sense to just remove it).
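
      For reference, a hedged sketch of the consolidated entry point (assumed
      usage of verifyModule() from llvm/IR/Verifier.h, not code from this
      commit):

        #include "llvm/IR/Module.h"
        #include "llvm/IR/Verifier.h"
        #include "llvm/Support/raw_ostream.h"

        // Returns true if M is well-formed. With this change the debug info
        // checks run as part of the same verification rather than as a
        // separate pass.
        bool isModuleValid(const llvm::Module &M) {
          return !llvm::verifyModule(M, &llvm::errs());
        }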
      
      llvm-svn: 232772
      ab58a568
    • Peter Collingbourne's avatar
      LowerBitSets: Avoid reusing byte set addresses. · 994ba3d2
      Peter Collingbourne authored
      Each use of the byte array uses a different alias. This makes the
      backend less likely to reuse previously computed byte array addresses,
      improving the security of the CFI mechanism based on this pass.
      
      Differential Revision: http://reviews.llvm.org/D8455
      
      llvm-svn: 232770
      994ba3d2
    • Peter Collingbourne's avatar
      libLTO, llvm-lto, gold: Introduce flag for controlling optimization level. · 070843d6
      Peter Collingbourne authored
      This change also introduces a link-time optimization level of 1. This
      level runs only the globaldce pass, plus the cleanup passes that run at
      -O0, specifically simplifycfg, which cleans up after lowerbitsets.
      
      http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html
      
      llvm-svn: 232769
      070843d6
    • Duncan P. N. Exon Smith's avatar
      PassManagerBuilder: Remove effectively dead 'StripDebug' option · 0a93e2db
      Duncan P. N. Exon Smith authored
      `StripDebug` was only used by tools/opt/opt.cpp in
      `AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
      command-line flag before it calls `AddStandardLinkPasses()`.  Stripping
      debug info twice isn't very useful.
      
      llvm-svn: 232765
      0a93e2db
    • Peter Collingbourne's avatar
      GlobalDCE: Improve performance for large modules containing comdats. · 0dbc7088
      Peter Collingbourne authored
      When we encounter a global with a comdat, rather than iterating over
      every global in the module to find globals in the same comdat, store the
      members in a multimap. This effectively lowers the complexity to O(N log N),
      improving performance significantly for large modules such as might be
      encountered during LTO.
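
      The grouping idea, as a self-contained C++ sketch (hypothetical types,
      not the actual GlobalDCE code):

        #include <iostream>
        #include <map>
        #include <string>
        #include <vector>

        struct Global { std::string Name, Comdat; };

        int main() {
          std::vector<Global> Module = {{"a", "c1"}, {"b", "c1"}, {"c", "c2"}};

          // Built once up front: comdat -> members, O(N log N) total.
          std::multimap<std::string, const Global *> ComdatMembers;
          for (const Global &G : Module)
            if (!G.Comdat.empty())
              ComdatMembers.insert({G.Comdat, &G});

          // Marking one member of "c1" alive finds its siblings in O(log N)
          // instead of rescanning every global in the module.
          auto R = ComdatMembers.equal_range("c1");
          for (auto I = R.first; I != R.second; ++I)
            std::cout << I->second->Name << " kept alive via comdat c1\n";
        }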
      
      It looks like we used to do something like this until r219191.
      
      No functional change.
      
      Differential Revision: http://reviews.llvm.org/D8431
      
      llvm-svn: 232743
      0dbc7088
    • Daniel Jasper's avatar
      [InstCombine] Don't fold a GEP into itself through a PHI node · 5add63f2
      Daniel Jasper authored
      This can only occur (I think) through the back-edge of the loop.
      
      However, folding a GEP into itself means that the value of the previous
      iteration needs to be stored in the meantime, requiring an additional
      register to stay live without actually achieving anything (the GEP
      still needs to be executed once per loop iteration).
      
      The attached test case is derived from:
        typedef unsigned uint32;
        typedef unsigned char uint8;
        inline uint8 *f(uint32 value, uint8 *target) {
          while (value >= 0x80) {
            value >>= 7;
            ++target;
          }
          ++target;
          return target;
        }
        uint8 *g(uint32 b, uint8 *target) {
          target = f(b, f(42, target));
          return target;
        }
      
      What happens is that the GEP stored in incptr2 is folded into itself
      through the loop's back-edge and the phi-node stored in loopptr,
      effectively incrementing the ptr by "2" in each iteration instead of "1".
      
      In this case, it is actually increasing the number of GEPs required as
      the GEP before the loop can't be folded away anymore. For comparison:
      
      With this patch:
        define i8* @test4(i32 %value, i8* %buffer) {
        entry:
          %cmp = icmp ugt i32 %value, 127
          br i1 %cmp, label %loop.header, label %exit
      
        loop.header:                                      ; preds = %entry
          br label %loop.body
      
        loop.body:                                        ; preds = %loop.body, %loop.header
          %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
          %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
          %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
          %shr = lshr i32 %newval, 7
          %cmp2 = icmp ugt i32 %newval, 16383
          br i1 %cmp2, label %loop.body, label %loop.exit
      
        loop.exit:                                        ; preds = %loop.body
          br label %exit
      
        exit:                                             ; preds = %loop.exit, %entry
          %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
          %incptr3 = getelementptr inbounds i8, i8* %0, i64 2
          ret i8* %incptr3
        }
      
      Without this patch:
        define i8* @test4(i32 %value, i8* %buffer) {
        entry:
          %incptr = getelementptr inbounds i8, i8* %buffer, i64 1
          %cmp = icmp ugt i32 %value, 127
          br i1 %cmp, label %loop.header, label %exit
      
        loop.header:                                      ; preds = %entry
          br label %loop.body
      
        loop.body:                                        ; preds = %loop.body, %loop.header
          %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
          %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
          %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
          %shr = lshr i32 %newval, 7
          %incptr2 = getelementptr inbounds i8, i8* %0, i64 2
          %cmp2 = icmp ugt i32 %newval, 16383
          br i1 %cmp2, label %loop.body, label %loop.exit
      
        loop.exit:                                        ; preds = %loop.body
          br label %exit
      
        exit:                                             ; preds = %loop.exit, %entry
          %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
          %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
          ret i8* %incptr3
        }
      
      Review: http://reviews.llvm.org/D8245
      llvm-svn: 232718
      5add63f2
  5. Mar 18, 2015
    • Sanjoy Das's avatar
      [ConstantRange] Split makeICmpRegion in two. · 7182d36f
      Sanjoy Das authored
      Summary:
      This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
      `makeSatisfyingICmpRegion` with slightly different contracts.  The first
      one is useful for determining what values some expression //may// take,
      given that a certain `icmp` evaluates to true.  The second one is useful
      for determining what values are guaranteed to //satisfy// a given
      `icmp`.
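
      To make the two contracts concrete, a brute-force illustration in plain
      C++ (not the LLVM API) for Pred = ULT and Other = [5, 10) over 4-bit
      values: the allowed region is [0, 9), since those x are less than some
      y in the range, while the satisfying region is [0, 5), since those x
      are less than every y in the range.

        #include <iostream>

        int main() {
          for (unsigned X = 0; X < 16; ++X) {
            bool Allowed = false, Satisfying = true;
            for (unsigned Y = 5; Y < 10; ++Y) {
              if (X < Y)
                Allowed = true;     // some y in [5,10) satisfies X u< y
              else
                Satisfying = false; // not every y in [5,10) satisfies X u< y
            }
            std::cout << X << ": allowed=" << Allowed
                      << " satisfying=" << Satisfying << "\n";
          }
        }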
      
      Reviewers: nlewycky
      
      Reviewed By: nlewycky
      
      Subscribers: llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D8345
      
      llvm-svn: 232575
      7182d36f
  6. Mar 17, 2015
  7. Mar 16, 2015
    • Gabor Horvath's avatar
      [llvm] Replacing asserts with static_asserts where appropriate · fee04343
      Gabor Horvath authored
      Summary:
      This patch applies the suggestions of the clang-tidy
      misc-static-assert check.
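
      The flavor of the change, as a tiny hypothetical example (not taken
      from the patch): a runtime assert over a compile-time constant becomes
      a static_assert, moving the check to compile time.

        #include <cassert>
        #include <cstdint>

        void before() {
          assert(sizeof(void *) <= sizeof(uint64_t)); // checked at run time
        }
        static_assert(sizeof(void *) <= sizeof(uint64_t),
                      "pointer wider than 64 bits"); // checked at compile time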
      
      
      Reviewers: alexfh
      
      Reviewed By: alexfh
      
      Subscribers: xazax.hun, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D8343
      
      llvm-svn: 232366
      fee04343
    • Dmitry Vyukov's avatar
      asan: fix overflows in isSafeAccess · ee842385
      Dmitry Vyukov authored
      As pointed out in http://reviews.llvm.org/D7583, the current checks can
      cause overflows when the object size or access offset crosses a
      quintillion bytes.
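
      The overflow class being fixed, sketched in C++ (a hypothetical check,
      not the actual isSafeAccess code): near the top of the integer range,
      Offset + AccessSize wraps and an out-of-bounds access can look
      in-bounds.

        #include <cstdint>
        #include <cstdio>

        // Buggy: Offset + Size can wrap around and compare as "safe".
        bool overflowingCheck(uint64_t Offset, uint64_t Size, uint64_t ObjSize) {
          return Offset + Size <= ObjSize;
        }
        // Rearranged so no intermediate result can wrap.
        bool safeCheck(uint64_t Offset, uint64_t Size, uint64_t ObjSize) {
          return Size <= ObjSize && Offset <= ObjSize - Size;
        }

        int main() {
          uint64_t Huge = UINT64_MAX - 1;
          std::printf("%d %d\n", overflowingCheck(Huge, 4, 100),
                      safeCheck(Huge, 4, 100)); // prints "1 0"
        }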
      
      http://reviews.llvm.org/D8193
      
      llvm-svn: 232358
      ee842385
    • Michael Gottesman's avatar
      One more try with unused. · d63436fb
      Michael Gottesman authored
      llvm-svn: 232357
      d63436fb
    • Michael Gottesman's avatar
      a0d2d337
    • Michael Gottesman's avatar
      c219dd1d
    • Michael Gottesman's avatar
      [objc-arc] Make the ARC optimizer more conservative by forcing it to be... · dd60f9bb
      Michael Gottesman authored
      [objc-arc] Make the ARC optimizer more conservative by forcing it to be non-safe in both directions, but mitigate the problem by noting that we just care if there was a further use.
      
      The problem here is the infamous one-direction known safe. I was
      hesitant to turn it off before because of the potential for regressions
      without an actual bug from users hitting the problem. This is that
      bug ;).
      
      The main performance impact of having known safe in both directions is
      that it is often very difficult to find two releases without a use in
      between them, since we are so conservative in determining potential
      uses. The one-direction known safe gets around that problem by taking
      advantage of the many situations where we have two retains in a row.
      That being said, the one-direction known safe is unsafe. Consider the
      following situation:
      
      retain(x)
      retain(x)
      call(x)
      call(x)
      release(x)
      
      Then we know the following about the reference count of x:
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      retain(x)
      // rc(x) == N+2
      call A(x)
      call B(x)
      // rc(x) >= 1 (since we cannot release a deallocated pointer).
      release(x)
      // rc(x) >= 0
      
      That is all the information we can know statically. It means that A(x)
      and B(x) together can release x at most N+1 times. Let's say we remove
      the inner retain/release pair.
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      call A(x)
      call B(x)
      // rc(x) >= 1
      release(x)
      // rc(x) >= 0
      
      We knew before that A(x), B(x) could release x up to N+1 times, meaning
      that rc(x) may be zero at the release(x). That is not safe. On the
      other hand, consider the situation where we have a must use after the
      release(x), i.e. a use that x must be kept alive for. Then we know
      that:
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      retain(x)
      // rc(x) == N+2
      call A(x)
      call B(x)
      // rc(x) >= 2 (since we know we are going to release x and that release cannot be the last use of x).
      release(x)
      // rc(x) >= 1 (since we cannot deallocate the pointer: there is a use of x after the release(x)).
      …
      // rc(x) >= 1
      use(x)
      
      Thus we know statically that the calls to A(x) and B(x) can together
      release x at most N times. So if we remove the inner retain/release
      pair:
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      call A(x)
      call B(x)
      // rc(x) >= 1
      …
      // rc(x) >= 1
      use(x)
      
      We are still safe unless the final … contains unbalanced retains and
      releases, which would have caused the program to blow up anyway, even
      before optimization occurred. The simplest form of must use is an
      additional release that has not been paired up with any retain (if we
      had paired the release with a retain and removed it, we would not have
      the additional use). This fits nicely into the ARC framework: given any
      nested releases, regardless of what is in between them, the inner
      release is known safe. This lets us get back the lost performance.
      
      <rdar://problem/19023795>
      
      llvm-svn: 232351
      dd60f9bb