Commits · 156227ac2bcac239f9dfd0a8314b6efcae9021b0 · Roger Ferrer / llvm-epi-0.8

Nov 13, 2013

Don't call doFinalization from verifyFunction. · 156227ac

Rafael Espindola authored Nov 13, 2013

verifyFunction needs to call doInitialization to collect metadata and avoid
crashing when verifying debug info in a function.

But it should not call doFinalization since that is where the verifier will
check declarations, variables and aliases, which is not desirable when one
only wants to verify a function.

A possible cleanup would be to split the class into a ModuleVerifier and
FunctionVerifier.

Issue reported by Ilia Filippov. Patch by Michael Kruse.

llvm-svn: 194574

156227ac

Fix bug in .gpword directive parsing. · e10c1125
Vladimir Medic authored Nov 13, 2013
```
llvm-svn: 194570
```
e10c1125
Support for microMIPS trap instruction with immediate operands. · ccb70caa
Zoran Jovanovic authored Nov 13, 2013
```
llvm-svn: 194569
```
ccb70caa
Fix -Wdelete-non-virtual-dtor warnings by making SampleProfile methods non-virtual · aa19c0a1
Alexey Samsonov authored Nov 13, 2013
```
llvm-svn: 194568
```
aa19c0a1

SampleProfileLoader pass. Initial setup. · 8d6568b5

Diego Novillo authored Nov 13, 2013

This adds a new scalar pass that reads a file with samples generated
by 'perf' during runtime. The samples read from the profile are
incorporated and emitted as IR metadata reflecting that profile.

The profile file is assumed to have been generated by an external
profile source. The profile information is converted into IR metadata,
which is later used by the analysis routines to estimate block
frequencies, edge weights and other related data.

External profile information files have no fixed format; each profiler
is free to define its own. This includes both the on-disk representation
of the profile and the kind of profile information stored in the file.
A common kind of profile is based on sampling (e.g., perf), which
essentially counts how many times each line of the program has been
executed during the run.

The SampleProfileLoader pass is organized as a scalar transformation.
On startup, it reads the file given in -sample-profile-file to
determine what kind of profile it contains.  This file is assumed to
contain profile information for the whole application. The profile
data in the file is read and incorporated into the internal state of
the corresponding profiler.

To facilitate testing, I've organized the profilers to support two file
formats: text and native. The native format is whatever on-disk
representation the profiler wants to support. I think this will mostly
be bitcode files, but it could be anything the profiler wants to
support. To do this, every profiler must implement the
SampleProfile::loadNative() function.

The text format is mostly meant for debugging. Records are separated by
newlines, but each profiler is free to interpret records as it sees fit.
Profilers must implement the SampleProfile::loadText() function.

Finally, the pass will call SampleProfile::emitAnnotations() for each
function in the current translation unit. This function needs to
translate the loaded profile into IR metadata, which the analyzer will
later be able to use.

This patch implements the first steps towards the above design. I've
implemented a sample-based flat profiler. The format of the profile is
fairly simplistic. Each sampled function contains a list of relative
line locations (from the start of the function) together with a count
representing how many samples were collected at that line during
execution. I generate this profile using perf and a separate converter
tool.

Currently, I have only implemented a text format for these profiles. I
am interested in initial feedback to the whole approach before I send
the other parts of the implementation for review.

This patch implements:

- The SampleProfileLoader pass.
- The base ExternalProfile class with the core interface.
- A SampleProfile sub-class using the above interface. The profiler
  generates branch weight metadata on every branch instruction that
  matches the profile.
- A text loader class to assist the implementation of
  SampleProfile::loadText().
- Basic unit tests for the pass.

Additionally, the patch uses profile information to compute branch
weights based on instruction samples.

This patch converts instruction samples into branch weights. It
does a fairly simplistic conversion:

Given a multi-way branch instruction, it calculates the weight of
each branch based on the maximum sample count gathered from each
target basic block.

Note that this assignment of branch weights is somewhat lossy and can be
misleading. If a basic block has more than one incoming branch, all the
incoming branches will get the same weight. In reality, it may be that
only one of them is the most heavily taken branch.
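The conversion described above can be modeled in a few lines of Python. This is only a simplified sketch, not the code from the patch: the block names, the `samples` mapping and the function `branch_weights` are all hypothetical, and the real pass attaches weights as IR metadata rather than returning a list.

```python
def branch_weights(successors, samples):
    """Return one weight per successor of a branch.

    successors: names of the target basic blocks, in branch order.
    samples:    maps a block name to the sample counts collected for the
                lines in that block (hypothetical data layout).
    The weight of an edge is the maximum sample count seen in its target
    block; a block without samples gets weight 0.
    """
    return [max(samples.get(block, []), default=0) for block in successors]


# Hypothetical sample counts per basic block.
samples = {"then": [10, 120, 115], "else": [3, 4], "merge": [130, 125]}

# A two-way branch to 'then' and 'else'.
weights_a = branch_weights(["then", "else"], samples)

# Two different branches that both jump to 'merge' receive the same
# weight, because the weight depends only on the target block.
weights_b = branch_weights(["merge"], samples)
weights_c = branch_weights(["merge"], samples)
```

Because the weight is derived from the target block alone, `weights_b` and `weights_c` are identical regardless of which predecessor is actually hot. That is the lossy behaviour the message warns about.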

I will adjust this assignment in subsequent patches.

llvm-svn: 194566

8d6568b5

FileCheck: fix a bug with multiple --check-prefix options. · 21a340fa

Alexey Samsonov authored Nov 13, 2013

Summary:
This fixes a subtle bug in the new FileCheck feature added
in r194343. When we search for the first satisfying check-prefix,
we should actually return the first encounter of some check-prefix as a
substring, even if it's not part of a valid check-line. Otherwise
"FileCheck --check-prefix=FOO --check-prefix=BAR" with check file:

  FOO not a valid check-line
  FOO: foo
  BAR: bar

incorrectly accepted file:

  fog
  bar

as it skipped the first two encounters of FOO, matching only the BAR: line.
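A rough Python model of the intended search is sketched below. It is not FileCheck's implementation: the helper names (`find_first_prefix`, `parse_checks`, `accepts`) are made up for illustration, and matching is reduced to plain in-order substring search, whereas the real tool supports regular expressions and more directive kinds.

```python
def find_first_prefix(buffer, prefixes, start=0):
    """Return (position, prefix) for the earliest occurrence of any prefix
    at or after `start`, or None. The occurrence does not have to be a
    valid check line; the caller decides that."""
    best = None
    for prefix in prefixes:
        pos = buffer.find(prefix, start)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, prefix)
    return best


def parse_checks(check_file, prefixes):
    """Collect (prefix, pattern) pairs for every 'PREFIX: pattern' line."""
    checks = []
    pos = 0
    while True:
        hit = find_first_prefix(check_file, prefixes, pos)
        if hit is None:
            return checks
        start, prefix = hit
        end = start + len(prefix)
        if check_file.startswith(":", end):
            eol = check_file.find("\n", end)
            eol = len(check_file) if eol == -1 else eol
            checks.append((prefix, check_file[end + 1:eol].strip()))
            pos = eol
        else:
            # Not a valid check line: step past this one occurrence only.
            pos = end


def accepts(checks, text):
    """Simplified matcher: every pattern must occur in order in `text`."""
    pos = 0
    for _, pattern in checks:
        found = text.find(pattern, pos)
        if found == -1:
            return False
        pos = found + len(pattern)
    return True


check_file = "FOO not a valid check-line\nFOO: foo\nBAR: bar\n"
checks = parse_checks(check_file, ["FOO", "BAR"])
```

With this model both `FOO: foo` and `BAR: bar` are collected, so the input `fog` / `bar` is rejected because `foo` never appears.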

Reviewers: arsenm, dsanders

Reviewed By: dsanders

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2166

llvm-svn: 194565

21a340fa

XCore target: implement exception handling · a83c0482
Robert Lytton authored Nov 13, 2013
```
llvm-svn: 194564
```
a83c0482
Fix typo + add URL · 59f661b9
Sylvestre Ledru authored Nov 13, 2013
```
llvm-svn: 194563
```
59f661b9
This patch fixes a bug in floating point operands parsing, when instruction... · 77ffd7af
Vladimir Medic authored Nov 13, 2013
```
This patch fixes a bug in floating point operands parsing, when instruction alias uses default register operand.

llvm-svn: 194562
```
77ffd7af
Add XFAIL:arm again on 4 MCJIT tests, since r194558. AArch64 has been left removed. · db5d18d2
NAKAMURA Takumi authored Nov 13, 2013
```
They are failing on clang-native-arm-cortex-a9.

Please tweak MCJIT/lit.local.cfg, if this didn't satisfy bots.

llvm-svn: 194561
```
db5d18d2
Remove XFAIL:aarch64,arm from 4 tests in test/ExecutionEngine/MCJIT. · b71b7baa
NAKAMURA Takumi authored Nov 13, 2013
```
They are reported as XPASSing.

llvm-svn: 194558
```
b71b7baa
Mips16InstrInfo.cpp: Use <cctype> instead of <ctype.h> · 435f62a8
NAKAMURA Takumi authored Nov 13, 2013
```
Also, prune <stdlib.h>, seems stray.

llvm-svn: 194557
```
435f62a8

Allow the code which returns the length for inline assembler to know · 5c8ae095

Reed Kotler authored Nov 13, 2013

specifically about the .space directive. This allows us to force large
blocks of code to appear in test cases for things like constant islands
without having to make giant test cases to force things like long 
branches to take effect.

llvm-svn: 194555

5c8ae095

Add myself to CODE_OWNERS for the OCaml bindings · de853be2
Peter Zotov authored Nov 13, 2013
```
Per discussion with Chris Lattner

llvm-svn: 194554
```
de853be2
Add a test case to verify that misusing anyregcc crashes as expected. · 5469ae8f
Andrew Trick authored Nov 13, 2013
```
llvm-svn: 194553
```
5469ae8f
Add another (perhaps better) video for Sean's talk. (Thanks Marshall!) · 3d7fd3da
Chandler Carruth authored Nov 13, 2013
```
llvm-svn: 194549
```
3d7fd3da

Fix a null pointer dereference when copying a null polymorphic pointer. · ccb19097

Chandler Carruth authored Nov 13, 2013

This bug only bit the C++98 build bots because all of the actual uses
really do move. ;] But not *quite* ready to do the whole C++11 switch
yet, so clean it up. Also add a unit test that catches this immediately.

llvm-svn: 194548

ccb19097

R600: Fix selection failure on EXTLOAD · 00a0d6f6
Matt Arsenault authored Nov 13, 2013
```
llvm-svn: 194547
```
00a0d6f6

SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. · 34c652d3

Juergen Ributzka authored Nov 13, 2013

This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 194542

34c652d3

Give folks a reference to some material on the fundamental design · a477d2ab
Chandler Carruth authored Nov 13, 2013
```
pattern in use here. Addresses review feedback from Sean (thanks!) and
others.

llvm-svn: 194541
```
a477d2ab

Introduce an AnalysisManager which is like a pass manager but with a lot · 74015a70

Chandler Carruth authored Nov 13, 2013

more smarts in it. This is where most of the interesting logic that used
to live in the implicit-scheduling-hackery of the old pass manager will
live.

Like the previous commits, note that this is a very early prototype!
I expect substantial changes before this is ready to use.

The core of the design is the following:

- We have an AnalysisManager which can be used across a series of
  passes over a module.
- The code setting up a pass pipeline registers the analyses available
  with the manager.
- Individual transform passes can check that an analysis manager
  provides the analyses they require in order to fail-fast.
- There is *no* implicit registration or scheduling.
- Analysis passes are different from other passes: they produce an
  analysis result that is cached and made available via the analysis
  manager.
- Cached results are invalidated automatically by the pass managers.
- When a transform pass requests an analysis result, either the analysis
  is run to produce the result or a cached result is provided.
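The core of the design listed above can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not LLVM's API: the class and method names (`AnalysisManager`, `register`, `require`, `get_result`, `invalidate`) and the `Function` stand-in are hypothetical, and invalidation is called by hand here where the message says the pass managers do it automatically.

```python
class Function:
    """Hypothetical stand-in for an IR unit."""

    def __init__(self, name, instructions):
        self.name = name
        self.instructions = instructions


class AnalysisManager:
    """Toy model: explicit registration, lazy runs, cached results."""

    def __init__(self):
        self._analyses = {}  # analysis name -> callable(ir_unit) -> result
        self._cache = {}     # (analysis name, ir_unit) -> cached result

    def register(self, name, run):
        # Nothing is registered implicitly; pipeline setup code does this.
        self._analyses[name] = run

    def require(self, name):
        # Lets a transform pass fail fast if an analysis is unavailable.
        if name not in self._analyses:
            raise KeyError(f"analysis {name!r} is not registered")

    def get_result(self, name, ir_unit):
        # Run the analysis on first request, then serve the cached result.
        self.require(name)
        key = (name, ir_unit)
        if key not in self._cache:
            self._cache[key] = self._analyses[name](ir_unit)
        return self._cache[key]

    def invalidate(self, ir_unit):
        # Drop every cached result for one IR unit (no preservation system).
        for key in [k for k in self._cache if k[1] is ir_unit]:
            del self._cache[key]


runs = []  # records each time the analysis actually executes


def count_instructions(fn):
    runs.append(fn.name)
    return len(fn.instructions)


am = AnalysisManager()
am.register("inst-count", count_instructions)

f = Function("f", ["add", "ret"])
first = am.get_result("inst-count", f)   # runs the analysis
second = am.get_result("inst-count", f)  # served from the cache

f.instructions.append("mul")             # a transform changed the function
am.invalidate(f)
third = am.get_result("inst-count", f)   # runs the analysis again
```

The analysis body executes twice rather than three times, which is the caching behaviour the last bullet describes.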

There are a few aspects of this design that I *know* will change in
subsequent commits:
- Currently there is no "preservation" system; that needs to be added.
- All of the analysis management should move up to the analysis library.
- The analysis management needs to support at least SCC passes. Maybe
  loop passes. Living in the analysis library will facilitate this.
- Need support for analyses which are *both* module and function passes.
- Need support for pro-actively running module analyses to have cached
  results within a function pass manager.
- Need a clear design for "immutable" passes.
- Need support for requesting cached results when available and not
  re-running the pass even if that would be necessary.
- Need more thorough testing of all of this infrastructure.

There are other aspects that I view as open questions I'm hoping to
resolve as I iterate a bit on the infrastructure, and especially as
I start writing actual passes against this.
- Should we have separate management layers for function, module, and
  SCC analyses? I think "yes", but I'm not yet ready to switch the code.
  Adding SCC support will likely resolve this definitively.
- How should the 'require' functionality work? Should *that* be the only
  way to request results to ensure that passes always require things?
- How should preservation work?
- Probably some other things I'm forgetting. =]

Look forward to more patches in shorter order now that this is in place.

llvm-svn: 194538

74015a70

Update the docs to match the function name. · ea186b95
Nadav Rotem authored Nov 13, 2013
```
llvm-svn: 194537
```
ea186b95

Removing llvm::huge_vald and llvm::huge_vall because they are not currently... · 4337e970

Aaron Ballman authored Nov 13, 2013

Removing llvm::huge_vald and llvm::huge_vall because they are not currently used, and HUGE_VALD does not appear to be supported everywhere anyways.

llvm-svn: 194535

4337e970

Replacing HUGE_VALF with llvm::huge_valf in order to work around a warning triggered in MSVC 12. · 04999041
Aaron Ballman authored Nov 13, 2013
```
Patch reviewed by Reid Kleckner and Jim Grosbach.

llvm-svn: 194533
```
04999041
Remove always true flag. · 6cd1b9ae
Rafael Espindola authored Nov 12, 2013
```
llvm-svn: 194530
```
6cd1b9ae

Nov 12, 2013

Cleanup the stackmap operand folding code and fix a corner case. · 0ef482ef
Andrew Trick authored Nov 12, 2013
```
I still don't know how to refer to the fixed operands symbolically. I
plan to look into it.

llvm-svn: 194529
```
0ef482ef

improve dependence analysis testcases · a1cc34b9

Sebastian Pop authored Nov 12, 2013

print the name of the function on which the dependence analysis is performed
such that changes to the testcase are easier to review.

llvm-svn: 194528

a1cc34b9

delinearization of arrays · c62c679c
Sebastian Pop authored Nov 12, 2013
```
llvm-svn: 194527
```
c62c679c
remove virtual methods in SCEVApplyRewriter and SCEVParameterRewriter · 9f8004fb
Sebastian Pop authored Nov 12, 2013
```
llvm-svn: 194526
```
9f8004fb

Fold (iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2) if we know that... · 0ed2fdb5

Nadav Rotem authored Nov 12, 2013

Fold (iszero(A&K1) | iszero(A&K2)) ->  (A&(K1|K2)) != (K1|K2) if we know that K1 and K2 are 'one-hot' (only one bit is on).
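The fold can be sanity-checked by brute force. The Python sketch below is only an illustration (the function names and the 8-bit width are arbitrary choices, not part of the patch): it compares both forms for every 8-bit value of A and every pair of one-hot constants, and shows one input where the forms disagree once the constants are not one-hot.

```python
def original(a, k1, k2):
    # iszero(A & K1) | iszero(A & K2)
    return (a & k1) == 0 or (a & k2) == 0


def folded(a, k1, k2):
    # (A & (K1 | K2)) != (K1 | K2)
    mask = k1 | k2
    return (a & mask) != mask


BITS = 8
one_hot = [1 << i for i in range(BITS)]  # constants with exactly one bit set

# Exhaustively compare both forms for one-hot K1 and K2.
equivalent = all(
    original(a, k1, k2) == folded(a, k1, k2)
    for a in range(1 << BITS)
    for k1 in one_hot
    for k2 in one_hot
)

# With K1 = K2 = 3 (two bits set) and A = 1 the two forms differ,
# which is why the one-hot precondition is needed.
differs_without_one_hot = original(1, 3, 3) != folded(1, 3, 3)
```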

llvm-svn: 194525

0ed2fdb5

FoldBranchToCommonDest merges branches into a single branch with or/and of the... · 53d32211

Nadav Rotem authored Nov 12, 2013

FoldBranchToCommonDest merges branches into a single branch with or/and of the condition. It has a heuristic for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristic that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable.

llvm-svn: 194524

53d32211

[mips] Fix a bug in function CC_MipsO32_FP64. The second double precision · d6c9f6eb
Akira Hatanaka authored Nov 12, 2013
```
argument was not being passed in $f14.
 

llvm-svn: 194522
```
d6c9f6eb
[mips] Run test case with command line option -mattr=+fp64. · c8e4bd15
Akira Hatanaka authored Nov 12, 2013
```
 

llvm-svn: 194519
```
c8e4bd15
Add a FIXME for 32-bit q modifiers. · a01c954d
Eric Christopher authored Nov 12, 2013
```
llvm-svn: 194515
```
a01c954d

Protect user-supplied runtime library functions in LTO · b10a520c

Justin Bogner authored Nov 12, 2013

Add user-supplied C runtime and compiler-rt library functions to
llvm.compiler.used to protect them from premature optimization by
passes like -globalopt and -ipsccp.  Calls to (seemingly unused)
runtime library functions can be added by -instcombine and instruction
lowering.

Patch by Duncan Exon Smith, thanks!

Fixes <rdar://problem/14740087>

llvm-svn: 194514

b10a520c

ARM: diagnose invalid system LDM/STM · 8eaf1543

Tim Northover authored Nov 12, 2013

The system LDM and STM instructions can't usually write back to the base
register. The one exception is when an LDM is actually an exception-return
(i.e. contains PC in the register list).

(There's already a test that "ldm sp!, {r0-r3, pc}^" works, which is why there
is no positive test).
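The rule stated above can be written as a small predicate. This Python sketch models only the rule as described in this message, not the assembler code in the patch; the function name and its parameters are hypothetical.

```python
def system_ldm_stm_writeback_ok(is_load, writeback, registers):
    """Check a system-form LDM/STM (the variant with a trailing '^').

    is_load:   True for LDM, False for STM.
    writeback: True if the base register is written back ('!').
    registers: register names in the register list.
    Writeback is accepted only for an LDM whose list contains PC,
    that is, an exception return.
    """
    if not writeback:
        return True
    return is_load and "pc" in registers


# ldm sp!, {r0-r3, pc}^  : exception return, writeback allowed
exception_return = system_ldm_stm_writeback_ok(
    True, True, ["r0", "r1", "r2", "r3", "pc"])

# ldm sp!, {r0-r3}^      : system load with writeback, diagnosed
plain_ldm = system_ldm_stm_writeback_ok(True, True, ["r0", "r1", "r2", "r3"])

# stm sp!, {r0-r3}^      : system store with writeback, diagnosed
plain_stm = system_ldm_stm_writeback_ok(False, True, ["r0", "r1", "r2", "r3"])
```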

rdar://problem/15223374

llvm-svn: 194512

8eaf1543

[mips] Revert part of r194510 that was accidentally committed. · d748e29c
Akira Hatanaka authored Nov 12, 2013
```
 

llvm-svn: 194511
```
d748e29c
[mips] Fix and re-enable a test case that has been disabled for a long time. · 937ce7c1
Akira Hatanaka authored Nov 12, 2013
```
 

llvm-svn: 194510
```
937ce7c1

[OCaml] Dynamically link LLVM on --enable-shared builds · 7b321f83

Peter Zotov authored Nov 12, 2013

This commit significantly speeds up both bytecode and native
builds of LLVM clients (from ~20 seconds to sub-second link time),
and allows invoking LLVM functions from the OCaml toplevel.

The behavior for --disable-shared builds is unchanged.

llvm-svn: 194509

7b321f83

[OCaml] Fix a typo · b1d1388e
Peter Zotov authored Nov 12, 2013
```
llvm-svn: 194508
```
b1d1388e