Commits · 01e5037fece0aed1cfb47ff7c604df29524e461d · Roger Ferrer / llvm-epi-0.8

Jan 13, 2014
- Re-sort #include lines again, prior to moving headers around. · 07baed53
  Chandler Carruth authored Jan 13, 2014
```
llvm-svn: 199080
```
  07baed53
Jan 12, 2014

Switch-to-lookup tables: Don't require a result for the default · ac114a3c

Hans Wennborg authored Jan 12, 2014

case when the lookup table doesn't have any holes.

This means we can build a lookup table for switches like this:

  switch (x) {
    case 0: return 1;
    case 1: return 2;
    case 2: return 3;
    case 3: return 4;
    default: exit(1);
  }

The default case doesn't yield a constant result here, but that doesn't matter,
since a default result is only necessary for filling holes in the lookup table,
and this table doesn't have any holes.

This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB
off the resulting clang binary.
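
As a rough illustration of the transformation described above, the sketch below pairs the switch from this message with a hand-written table lookup. The names `withSwitch` and `withTable` and the explicit bounds check are assumptions made for illustration; the optimizer works on LLVM IR, and its actual output is not shown in this commit.

```cpp
#include <cassert>
#include <cstdlib>

// Source-level form: the default case calls exit() and yields no constant.
int withSwitch(unsigned x) {
  switch (x) {
    case 0: return 1;
    case 1: return 2;
    case 2: return 3;
    case 3: return 4;
    default: std::exit(1);
  }
}

// Hand-written equivalent (hypothetical): the table covers 0..3 with no
// holes, so no default *value* is needed; out-of-range inputs still take
// the default path.
int withTable(unsigned x) {
  static const int Table[4] = {1, 2, 3, 4};
  if (x >= 4)
    std::exit(1);
  return Table[x];
}
```

Both functions return the same value for every in-range input, which is why a constant default result is only needed when the table has holes to fill.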

llvm-svn: 199025

ac114a3c

Jan 11, 2014

LoopVectorizer: Enable strided memory accesses versioning per default · 66c742ae
Arnold Schwaighofer authored Jan 11, 2014
```
I saw no compile or execution time regressions on x86_64 -mavx -O3.

radar://13075509

llvm-svn: 199015
```
66c742ae

LoopVectorize.cpp: Appease MSC16. · 41c409ce

NAKAMURA Takumi authored Jan 11, 2014

Excuse me, I hope msc16 builders would be fine till its end day.
Introduce nullptr then. ;)

llvm-svn: 199001

41c409ce

Extend and simplify the sample profile input file. · 9518b63b

Diego Novillo authored Jan 10, 2014

1- Use the line_iterator class to read profile files.

2- Allow comments in profile file. Lines starting with '#'
   are completely ignored while reading the profile.

3- Add parsing support for discriminators and indirect call samples.

   Our external profiler can emit more profile information that we are
   currently not handling. This patch does not add new functionality to
   support this information, but it allows profile files to provide it.

   I will add actual support later on (for at least one of these
   features, I need support for DWARF discriminators in Clang).

   A sample line may contain the following additional information:

   Discriminator. This is used if the sampled program was compiled with
   DWARF discriminator support
   (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This
   is currently only emitted by GCC and we just ignore it.

   Potential call targets and samples. If present, this line contains a
   call instruction. This models both direct and indirect calls. Each
   called target is listed together with the number of samples. For
   example,

                    130: 7  foo:3  bar:2  baz:7

   The above means that at relative line offset 130 there is a call
   instruction that calls one of foo(), bar() and baz(), with baz()
   being the most frequent call target.

   Differential Revision: http://llvm-reviews.chandlerc.com/D2355

4- Simplify format of profile input file.

   This implements earlier suggestions to simplify the format of the
   sample profile file. The symbol table is not necessary and function
   profiles do not need to know the number of samples in advance.

   Differential Revision: http://llvm-reviews.chandlerc.com/D2419
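
The sample-line layout from item 3 (`offset: samples target:count ...`) and the comment rule from item 2 can be sketched as a small standalone parser. This is not the code from the patch: `SampleLine` and `parseSampleLine` are hypothetical names, discriminators are left out because their syntax is not given here, and counts are assumed to fit in an `unsigned`.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one body line of a profile file.
struct SampleLine {
  unsigned Offset = 0;   // line offset relative to the function header
  unsigned Samples = 0;  // samples collected at that offset
  std::vector<std::pair<std::string, unsigned>> Targets;  // callee:count
};

// Returns false for comment lines, empty lines, and malformed input.
bool parseSampleLine(const std::string &Line, SampleLine &Out) {
  Out = SampleLine();
  if (Line.empty() || Line[0] == '#')
    return false;  // lines starting with '#' are ignored

  std::istringstream In(Line);
  char Colon = 0;
  if (!(In >> Out.Offset >> Colon) || Colon != ':')
    return false;
  if (!(In >> Out.Samples))
    return false;

  // Any remaining tokens are "name:count" call targets.
  std::string Tok;
  while (In >> Tok) {
    const std::string::size_type Pos = Tok.rfind(':');
    if (Pos == std::string::npos || Pos == 0 || Pos + 1 == Tok.size())
      return false;
    const std::string Count = Tok.substr(Pos + 1);
    if (Count.find_first_not_of("0123456789") != std::string::npos)
      return false;
    Out.Targets.emplace_back(Tok.substr(0, Pos),
                             static_cast<unsigned>(std::stoul(Count)));
  }
  return true;
}
```

Parsing the example line from item 3 with this sketch gives offset 130, 7 samples, and three call targets.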

llvm-svn: 198973

9518b63b

Propagation of profile samples through the CFG. · 0accb3d2

Diego Novillo authored Jan 10, 2014

This adds a propagation heuristic to convert instruction samples
into branch weights. It implements a similar heuristic to the one
implemented by Dehao Chen on GCC.

The propagation proceeds in 3 phases:

1- Assignment of block weights. All the basic blocks in the function
   are initially assigned the same weight as their most frequently
   executed instruction.

2- Creation of equivalence classes. Since samples may be missing from
   blocks, we can fill in the gaps by setting the weights of all the
   blocks in the same equivalence class to the same weight. To compute
   the concept of equivalence, we use dominance and loop information.
   Two blocks B1 and B2 are in the same equivalence class if B1
   dominates B2, B2 post-dominates B1 and both are in the same loop.

3- Propagation of block weights into edges. This uses a simple
   propagation heuristic. The following rules are applied to every
   block B in the CFG:

   - If B has a single predecessor/successor, then the weight
     of that edge is the weight of the block.

   - If all the edges are known except one, and the weight of the
     block is already known, the weight of the unknown edge will
     be the weight of the block minus the sum of all the known
     edges. If the sum of all the known edges is larger than B's weight,
     we set the unknown edge weight to zero.

   - If there is a self-referential edge, and the weight of the block is
     known, the weight for that edge is set to the weight of the block
     minus the weight of the other incoming edges to that block (if
     known).
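
The "all edges known except one" rule in phase 3 comes down to a clamped subtraction. A minimal sketch, assuming weights are plain 64-bit counters and using the hypothetical helper name `inferUnknownEdgeWeight` (the pass itself operates on CFG edges, not on a vector):

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Given the weight of block B and the weights of all its known edges,
// return the weight of the single remaining unknown edge. If the known
// edges already sum to more than B's weight, the result is zero.
uint64_t inferUnknownEdgeWeight(uint64_t BlockWeight,
                                const std::vector<uint64_t> &KnownEdges) {
  const uint64_t Sum =
      std::accumulate(KnownEdges.begin(), KnownEdges.end(), uint64_t{0});
  return Sum > BlockWeight ? 0 : BlockWeight - Sum;
}
```

With no known edges the function returns the block weight itself, which matches the single predecessor/successor rule.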

Since this propagation is not guaranteed to converge for every CFG, we
only allow it to proceed for a limited number of iterations (controlled
by -sample-profile-max-propagate-iterations). It currently uses the same
GCC default of 100.

Before propagation starts, the pass builds (for each block) a list of
unique predecessors and successors. This is necessary to handle
identical edges in multiway branches. Since we visit all blocks and all
edges of the CFG, it is cleaner to build these lists once at the start
of the pass.

Finally, the patch fixes the computation of relative line locations.
The profiler emits lines relative to the function header. To discover
it, we traverse the compilation unit looking for the subprogram
corresponding to the function. The line number of that subprogram is the
line where the function begins. That becomes line zero for all the
relative locations.

llvm-svn: 198972

0accb3d2

Jan 10, 2014

LoopVectorizer: Handle strided memory accesses by versioning · c2e9d759

Arnold Schwaighofer authored Jan 10, 2014

 for (i = 0; i < N; ++i)
   A[i * Stride1] += B[i * Stride2];

We version loops like this: a runtime check verifies that the symbolic strides
'Stride1' and 'Stride2' are one, and we fall back to the scalar loop if they
are not.

This is currently disabled by default and hidden behind the flag
'enable-mem-access-versioning'.
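
At the source level, the versioning described in this message corresponds roughly to the sketch below. The function name `addStrided` is an assumption, and the fast path is written as an ordinary unit-stride loop that a vectorizer could handle; the real transformation is done on LLVM IR.

```cpp
#include <cassert>
#include <cstddef>

void addStrided(int *A, const int *B, std::size_t N, std::size_t Stride1,
                std::size_t Stride2) {
  if (Stride1 == 1 && Stride2 == 1) {
    // Versioned loop: both strides are known to be one here, so the
    // accesses are contiguous and the loop is a vectorization candidate.
    for (std::size_t i = 0; i < N; ++i)
      A[i] += B[i];
  } else {
    // Fallback: the original scalar loop with symbolic strides.
    for (std::size_t i = 0; i < N; ++i)
      A[i * Stride1] += B[i * Stride2];
  }
}
```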

radar://13075509

llvm-svn: 198950

c2e9d759

Jan 09, 2014

Put the functionality for printing a value to a raw_ostream as an · d48cdbf0

Chandler Carruth authored Jan 09, 2014

operand into the Value interface just like the core print method is.
That gives a more consistent organization to the IR printing interfaces
-- they are all attached to the IR objects themselves. Also, update all
the users.

This removes the 'Writer.h' header which contained only a single function
declaration.

llvm-svn: 198836

d48cdbf0

Jan 08, 2014

Fix a bug about generating undef operand when optimising shuffle vector and... · 26abebbb

Hao Liu authored Jan 08, 2014

Fix a bug about generating undef operand when optimising shuffle vector and insert element in instruction combine.

llvm-svn: 198730

26abebbb

Jan 07, 2014

Move the LLVM IR asm writer header files into the IR directory, as they · 9aca918d

Chandler Carruth authored Jan 07, 2014

are part of the core IR library in order to support dumping and other
basic functionality.

Rename the 'Assembly' include directory to 'AsmParser' to match the
library name and the only functionality left there -- printing has been
in the core IR library for quite some time.

Update all of the #includes to match.

All of this started because I wanted to have the layering in good shape
before I started adding support for printing LLVM IR using the new pass
infrastructure, and commandline support for the new pass infrastructure.

llvm-svn: 198688

9aca918d

Re-sort all of the includes with ./utils/sort_includes.py so that · 8a8cd2ba

Chandler Carruth authored Jan 07, 2014

subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.

Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.

llvm-svn: 198685

8a8cd2ba

Reapply r198654 "indvars: sink truncates outside the loop." · e4a18605

Andrew Trick authored Jan 07, 2014

This doesn't seem to have actually broken anything. It was paranoia
on my part. Trying again now that bots are more stable.

This is a follow up of the r198338 commit that added truncates for
lcssa phi nodes. Sinking the truncates below the phis cleans up the
loop and simplifies subsequent analysis within the indvars pass.

llvm-svn: 198678

e4a18605

Revert "indvars: sink truncates outside the loop." · 3c0ed089
Andrew Trick authored Jan 07, 2014
```
This reverts commit r198654.

One of the bots reported a SciMark failure.

llvm-svn: 198659
```
3c0ed089

indvars: sink truncates outside the loop. · 0b8e3b2c

Andrew Trick authored Jan 07, 2014

This is a follow up of the r198338 commit that added truncates for
lcssa phi nodes. Sinking the truncates below the phis cleans up the
loop and simplifies subsequent analysis within the indvars pass.

llvm-svn: 198654

0b8e3b2c

80 col. comment. · b70d9780
Andrew Trick authored Jan 07, 2014
```
llvm-svn: 198653
```
b70d9780

Jan 06, 2014

Reapply r198478 "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things." · 6796ab42

Andrew Trick authored Jan 06, 2014

Now with a fix for PR18384: ValueHandleBase::ValueIsDeleted.

We need to invalidate SCEV's loop info when we delete a block, even if no values are hoisted.

llvm-svn: 198631

6796ab42

Jan 04, 2014

Add missed cleanup from r198456 · f929e09b

Alp Toker authored Jan 04, 2014

All other uses of this macro in LLVM/clang have been moved to the function
definition so follow suit (and the usage advice) here too for consistency.

llvm-svn: 198516

f929e09b

Revert "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things." · 5e9f3265

Alp Toker authored Jan 04, 2014

This commit was the source of crasher PR18384:

While deleting: label %for.cond127
An asserting value handle still pointed to this value!
UNREACHABLE executed at llvm/lib/IR/Value.cpp:671!

Reverting to get the builders green, feel free to re-land after fixing up.
(Renato has a handy isolated repro if you need it.)

This reverts commit r198478.

llvm-svn: 198503

5e9f3265

Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things. · aceac974

Andrew Trick authored Jan 04, 2014

getSCEV for an ashr instruction creates an intermediate zext
expression when it truncates its operand.

The operand is initially inside the loop, so the narrow zext
expression has a non-loop-invariant loop disposition.

LoopSimplify then runs on an outer loop, hoists the ashr operand, and
properly invalidates the SCEVs that are mapped to the value.

The SCEV expression for the ashr is now an AddRec with the hoisted
value as the now loop-invariant start value.

The LoopDisposition of this wide value was properly invalidated during
LoopSimplify.

However, if we later get the ashr SCEV again, we again try to create
the intermediate zext expression. We get the same SCEV that we did
earlier, and it is still cached because it was never mapped to a
Value. When we try to create a new AddRec we abort because we're using
the old non-loop-invariant LoopDisposition.

I don't have a solution for this other than to clear LoopDisposition
when LoopSimplify hoists things.

I think the long-term strategy should be to perform LoopSimplify on
all loops before computing SCEV and before running any loop opts on
individual loops. It's possible we may want to rerun LoopSimplify on
individual loops, but it should rarely do anything, so rarely require
invalidating SCEV.

llvm-svn: 198478

aceac974

Jan 03, 2014

Add a LLVM_DUMP_METHOD macro. · 7408c706

Nico Weber authored Jan 03, 2014

The motivation is to mark dump methods as used in debug builds so that they can
be called from lldb, but to not do so in release builds so that they can be
dead-stripped.
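
The idea can be sketched with a stand-in macro. `EXAMPLE_DUMP_METHOD` and `Node` are hypothetical names, and the definition below is an assumption (it relies on the GCC/Clang `used` attribute and on `NDEBUG` to tell debug from release builds); the actual LLVM_DUMP_METHOD definition is not reproduced in this message.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in: keep the method alive in debug builds so a
// debugger can call it, and let release builds dead-strip it.
#if !defined(NDEBUG) && defined(__GNUC__)
#define EXAMPLE_DUMP_METHOD __attribute__((used))
#else
#define EXAMPLE_DUMP_METHOD
#endif

struct Node {
  int Value;
  void dump() const;
};

// The macro goes on the function definition.
EXAMPLE_DUMP_METHOD void Node::dump() const {
  std::fprintf(stderr, "Node(%d)\n", Value);
}
```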

There's lots of potential follow-up work suggested in the thread
"Should dump methods be LLVM_ATTRIBUTE_USED only in debug builds?" on cfe-dev,
but everyone seems to agree on this subset.

Macro name chosen by fair coin toss.

llvm-svn: 198456

7408c706

Fix loop rerolling pass failure with non-constant loop lower bound · ea9ba446

David Peixotto authored Jan 03, 2014

The loop rerolling pass was failing with an assertion failure from a
failed cast on loops like this:

  void foo(int *A, int *B, int m, int n) {
    for (int i = m; i < n; i+=4) {
      A[i+0] = B[i+0] * 4;
      A[i+1] = B[i+1] * 4;
      A[i+2] = B[i+2] * 4;
      A[i+3] = B[i+3] * 4;
    }
  }

The code was casting the SCEV-expanded code for the new
induction variable to a phi-node. When the loop had a non-constant
lower bound, the SCEV expander would end the code expansion with an
add instead of a phi node and the cast would fail.

It looks like the cast to a phi node was only needed to get the
induction variable value coming from the backedge to compute the end
of loop condition. This patch changes the loop reroller to compare
the induction variable to the number of times the backedge is taken
instead of the iteration count of the loop. In other words, we stop
the loop when the current value of the induction variable ==
IterationCount-1. Previously, the comparison was comparing the
induction variable value from the next iteration == IterationCount.
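
The new exit test can be illustrated at the source level. In the sketch below, `fooUnrolled` is the loop from this message and `fooRerolled` is a hand-written approximation of the rerolled form; the names, the zero-based induction variable `IV`, and the explicit trip-count arithmetic are assumptions for illustration, not the IR the pass emits.

```cpp
#include <cassert>

// The loop from the commit message, unrolled by four.
void fooUnrolled(int *A, const int *B, int m, int n) {
  for (int i = m; i < n; i += 4) {
    A[i + 0] = B[i + 0] * 4;
    A[i + 1] = B[i + 1] * 4;
    A[i + 2] = B[i + 2] * 4;
    A[i + 3] = B[i + 3] * 4;
  }
}

// Hand-written rerolled form: one element per iteration.
void fooRerolled(int *A, const int *B, int m, int n) {
  if (m >= n)
    return;
  const int Trips = (n - m + 3) / 4;       // iterations of the original loop
  const int IterationCount = 4 * Trips;    // iterations of the rerolled loop
  const int BackedgeTakenCount = IterationCount - 1;
  for (int IV = 0;; ++IV) {
    A[m + IV] = B[m + IV] * 4;
    // Exit on the *current* IV value, so the incremented value from the
    // backedge is never needed for the comparison.
    if (IV == BackedgeTakenCount)
      break;
  }
}
```

Both versions write the same elements; only the form of the exit comparison differs.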

This problem only seems to occur on 32-bit targets. For some reason,
the loop is not rerolled on 64-bit targets.

PR18290

llvm-svn: 198425

ea9ba446

Jan 02, 2014

Disable compare sinking in CodeGenPrepare when multiple condition registers are available · decb024c

Hal Finkel authored Jan 02, 2014

As noted in the comment above CodeGenPrepare::OptimizeInst, which aggressively
sinks compares to reduce pressure on the condition register(s), for targets
such as PowerPC with multiple condition registers, this may not be the right
thing to do. This adds an HasMultipleConditionRegisters boolean to TLI, and
CodeGenPrepare::OptimizeInst is skipped when HasMultipleConditionRegisters is
true.

This functionality will be used by the PowerPC backend in an upcoming commit.
Especially when the PowerPC backend starts tracking individual condition
register bits as separate allocatable entities (which will happen in this
upcoming commit), this sinking from CodeGenPrepare::OptimizeInst is
significantly suboptimal.

llvm-svn: 198354

decb024c

indvars: cleanup the IV visitor. It does more than gather sext/zext info. · b6bc7830
Andrew Trick authored Jan 02, 2014
```
llvm-svn: 198353
```
b6bc7830
Delete unread globals through addrspacecast · 461c8e0a
Matt Arsenault authored Jan 02, 2014
```
llvm-svn: 198346
```
461c8e0a
Fix addrspacecast with metadata globals · da1deabb
Matt Arsenault authored Jan 02, 2014
```
llvm-svn: 198345
```
da1deabb

indvars: insert truncate at loop boundary to avoid redundant IVs. · 020dd898

Andrew Trick authored Jan 02, 2014

When widening an IV to remove s/zext, we generally try to eliminate
the original narrow IV. However, LCSSA phi nodes outside the loop were
still using the original IV. Clean this up more aggressively to avoid
redundancy in generated code.

llvm-svn: 198338

020dd898

Dec 30, 2013

Set LLVM_EXPORTED_SYMBOL_FILE in CMakeLists whose corresponding Makefiles do so. · 12265310

Nico Weber authored Dec 29, 2013

(unittests/ExecutionEngine/JIT/CMakeLists.txt is still missing for now, since
it handles export files in a strange way: It generates a .exports file from a
.def file instead of the other way round.)

llvm-svn: 198183

12265310

Dec 25, 2013

[ASan] Fix the test for __asan_gen_ globals and actually fix... · 4f0335f8

Alexander Potapenko authored Dec 25, 2013

[ASan] Fix the test for __asan_gen_ globals and actually fix http://llvm.org/bugs/show_bug.cgi?id=17976
by setting the correct linkage (as stated in the bug).

llvm-svn: 198018

4f0335f8

[ASan] Make sure none of the __asan_gen_ global strings end up in the symbol table, add a test. · daf96ae8

Alexander Potapenko authored Dec 25, 2013

This should fix http://llvm.org/bugs/show_bug.cgi?id=17976
Another test checking for the global variables' locations and prefixes on Darwin will be committed separately.

llvm-svn: 198017

daf96ae8

Dec 24, 2013

Add support to indvars for optimizing sadd.with.overflow. · 0ba77a07

Andrew Trick authored Dec 23, 2013

Split sadd.with.overflow into add + sadd.with.overflow to allow
analysis and optimization. This should ideally be done after
InstCombine, which can perform code motion (eventually indvars should
run after all canonical instcombines). We want ISEL to recombine the
add and the check, at least on x86.

This is currently under an option for reducing live induction
variables: -liv-reduce. The next step is reducing liveness of IVs that
are live out of the overflow check paths. Once the related
optimizations are fully developed, reviewed and tested, I do expect
this to become default.

llvm-svn: 197926

0ba77a07

Dec 23, 2013

Fix Scalarizer insertion point when replacing PHIs with insertelements · 1fb5c13e

Richard Sandiford authored Dec 23, 2013

If the Scalarizer scalarized a vector PHI but could not scalarize
all uses of it, it would insert a series of insertelements to reconstruct
the vector PHI value from the scalar ones.  The problem was that it would
emit these insertelements immediately after the PHI, even if there were
other PHIs after it.

llvm-svn: 197909

1fb5c13e

Fix Scalarizer handling of vector GEPs with multiple index operands · 3548cbb9
Richard Sandiford authored Dec 23, 2013
```
The old code only worked for one index operand.  Also handle "inbounds".

llvm-svn: 197908
```
3548cbb9

[asan] don't unpoison redzones on function exit in use-after-return mode. · 530e207d

Kostya Serebryany authored Dec 23, 2013

Summary:
Before this change the instrumented code before Ret instructions looked like:
  <Unpoison Frame Redzones>
  if (Frame != OriginalFrame) // I.e. Frame is fake
     <Poison Complete Frame>

Now the instrumented code looks like:
  if (Frame != OriginalFrame) // I.e. Frame is fake
     <Poison Complete Frame>
  else
     <Unpoison Frame Redzones>
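
The difference between the two epilogues can be modeled with a toy that only records which steps run. The function names and step labels below are hypothetical, and nothing here touches real shadow memory; it only mirrors the control flow shown above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Old epilogue: redzones are always unpoisoned, then a fake frame is
// poisoned in full.
std::vector<std::string> epilogueBefore(bool FrameIsFake) {
  std::vector<std::string> Steps;
  Steps.push_back("unpoison-redzones");
  if (FrameIsFake)
    Steps.push_back("poison-frame");
  return Steps;
}

// New epilogue: exactly one of the two steps runs.
std::vector<std::string> epilogueAfter(bool FrameIsFake) {
  std::vector<std::string> Steps;
  if (FrameIsFake)
    Steps.push_back("poison-frame");
  else
    Steps.push_back("unpoison-redzones");
  return Steps;
}
```

For a real (non-fake) frame both versions do the same work; for a fake frame the new version skips the unpoisoning that was about to be overwritten anyway.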

Reviewers: eugenis

Reviewed By: eugenis

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2458

llvm-svn: 197907

530e207d

[asan] produce fewer stores when poisoning stack shadow · ff7bde15
Kostya Serebryany authored Dec 23, 2013
```
llvm-svn: 197904
```
ff7bde15

Dec 20, 2013

Transforms: Don't create bad weights when eliminating dead cases · 0ba3f211

Justin Bogner authored Dec 20, 2013

If we happen to eliminate every case in a switch that has branch
weights, we currently try to create metadata for the one remaining
branch, triggering an assert. Instead, we need to check that the
metadata we're trying to create is sensible.

llvm-svn: 197791

0ba3f211

Dec 19, 2013

Stay classy (and legal) LLVM. Remove links to 3rd party SMT solver whose links... · e37d5209
Kay Tiong Khoo authored Dec 19, 2013
```
Stay classy (and legal) LLVM. Remove links to 3rd party SMT solver whose links may not be permanent.

llvm-svn: 197713
```
e37d5209

Improved fix for PR17827 (instcombine of shift/and/compare). · a570b5ad

Kay Tiong Khoo authored Dec 19, 2013

This change fixes the case of arithmetic shift right - do not attempt to fold that case.
This change also relaxes the conditions when attempting to fold the logical shift right and shift left cases.

No additional IR-level test cases included at this time. See http://llvm.org/bugs/show_bug.cgi?id=17827 for proofs that these are correct transformations.

llvm-svn: 197705

a570b5ad

[dfsan] Simplify code after r197677. · a284e559
Evgeniy Stepanov authored Dec 19, 2013
```
llvm-svn: 197679
```
a284e559

Add an explicit insert point argument to SplitBlockAndInsertIfThen. · a9164e9e

Evgeniy Stepanov authored Dec 19, 2013

Currently SplitBlockAndInsertIfThen requires that the branch condition be an
Instruction itself, which is very inconvenient, because it is sometimes an
Operator, or even a Constant.

llvm-svn: 197677

a9164e9e

Dec 17, 2013

LoopVectorizer: Don't if-convert constant expressions that can trap · 50b8302c

Arnold Schwaighofer authored Dec 17, 2013

A phi node operand or an instruction operand could be a constant expression that
can trap (division). Check that we don't vectorize such cases.

PR16729
radar://15653590

llvm-svn: 197449

50b8302c