Commits · 03456a176dd8c83b050eb24986786a0dbae53bc9 · Roger Ferrer / llvm-epi-0.8

Feb 11, 2014

LTO API: add lto_module_create_from_memory_with_path. · 03456a17

Manman Ren authored Feb 10, 2014

This function adds an extra path argument to lto_module_create_from_memory.
The path argument will be passed to makeBuffer to make sure the MemoryBuffer
has a name and the created module has a module identifier.

This is mainly for emitting warning messages from the linker. When we emit
a warning message about a module, we can use the module identifier.

rdar://15985737

llvm-svn: 201114

03456a17

Feb 10, 2014

Mark the methods in the Mangler const. · efedd3aa

Rafael Espindola authored Feb 10, 2014

A const ObjectFile needs to be able to provide its name. For an IRObjectFile,
that means being able to call the mangler. Since each IRObjectFile can have
a different mangling, it is natural for them to contain a Mangler which is
therefore also const.

llvm-svn: 201113

efedd3aa

Change the begin and end methods in ObjectFile to match the style guide. · b5155a57
Rafael Espindola authored Feb 10, 2014
```
llvm-svn: 201108
```
b5155a57

ARM: use natural LLVM IR for vshll instructions · b0430415

Tim Northover authored Feb 10, 2014

Similarly to the vshrn instructions, these are simple zext/sext + shl
operations. Using normal LLVM IR should allow for better code, and more sharing
with the AArch64 backend.

llvm-svn: 201093

b0430415

Make succ_iterator a real random access iterator and clean up a couple of users. · 3c29c070
Benjamin Kramer authored Feb 10, 2014
```
llvm-svn: 201088
```
3c29c070

ARM: use LLVM IR to represent the vshrn operation · 170daafe

Tim Northover authored Feb 10, 2014

vshrn is just the combination of a right shift and a truncate (and the limits
on the immediate value actually mean the signedness of the shift doesn't
matter). Using that representation allows us to get rid of an ARM-specific
intrinsic, share more code with AArch64 and hopefully get better code out of
the mid-end optimisers.

llvm-svn: 201085

170daafe
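
The claim that the signedness of the shift does not matter can be checked with a small scalar model. The sketch below is not the ARM backend code; it assumes a 16-bit to 8-bit narrowing with an immediate in the range 1..8, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Narrow a 16-bit lane to 8 bits after a logical (zero-filling) right shift.
inline uint8_t narrowLogical(uint16_t X, unsigned N) {
  return static_cast<uint8_t>(X >> N);
}

// Narrow a 16-bit lane to 8 bits after an arithmetic (sign-filling) shift.
// Assumes the usual two's complement conversion and arithmetic shift for
// negative values, which compilers in practice provide (guaranteed in C++20).
inline uint8_t narrowArithmetic(uint16_t X, unsigned N) {
  int32_t Wide = static_cast<int16_t>(X); // sign-extend the lane
  return static_cast<uint8_t>(Wide >> N);
}

// Exhaustively compare both forms for every lane value and every
// immediate in the assumed range 1..8.
inline bool signednessIrrelevant() {
  for (unsigned N = 1; N <= 8; ++N)
    for (uint32_t V = 0; V <= 0xFFFF; ++V)
      if (narrowLogical(static_cast<uint16_t>(V), N) !=
          narrowArithmetic(static_cast<uint16_t>(V), N))
        return false;
  return true;
}
```

The two forms agree because the truncated result only keeps bits N..N+7 of the input, and with N at most 8 those bits never come from the fill bits of the shift.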

[mips][msa] Add DLSA instruction. · 4b27eb58
Matheus Almeida authored Feb 10, 2014
```
llvm-svn: 201081
```
4b27eb58

MCParser: add a single token lookahead · a879fab3

Saleem Abdulrasool authored Feb 09, 2014

Some of the more complex directive and macro handling for GAS compatibility
requires lookahead.  Add a single token lookahead in the MCAsmLexer.

llvm-svn: 201058

a879fab3
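
As an illustration of single-token lookahead, here is a minimal sketch of a whitespace-splitting lexer with a `peek` operation. `TinyLexer` and its methods are hypothetical and do not reflect the MCAsmLexer interface.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical lexer that splits its input on whitespace.
class TinyLexer {
  std::string Text;
  std::size_t Pos = 0;

  // Scan one token starting at From; returns the token and the position
  // just past it. Does not modify the lexer state.
  std::pair<std::string, std::size_t> scan(std::size_t From) const {
    while (From < Text.size() &&
           std::isspace(static_cast<unsigned char>(Text[From])))
      ++From;
    std::size_t End = From;
    while (End < Text.size() &&
           !std::isspace(static_cast<unsigned char>(Text[End])))
      ++End;
    return {Text.substr(From, End - From), End};
  }

public:
  explicit TinyLexer(std::string T) : Text(std::move(T)) {}

  // Consume and return the next token (empty string at end of input).
  std::string lex() {
    std::pair<std::string, std::size_t> R = scan(Pos);
    Pos = R.second;
    return R.first;
  }

  // Return the next token without consuming it.
  std::string peek() const { return scan(Pos).first; }
};
```

A directive handler could call `peek()` to decide how to parse the upcoming token and only then call `lex()` to consume it.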

Feb 09, 2014

Use a consistent argument order in TargetLoweringObjectFile. · 15b26696

Rafael Espindola authored Feb 09, 2014

These methods normally call each other and it is really annoying if the
arguments are in different order. The more common rule was that the arguments
specific to call are first (GV, Encoding, Suffix) and the auxiliary objects
(Mang, TM) come after. This patch changes the exceptions.

llvm-svn: 201044

15b26696

Feb 08, 2014

Pass the Mangler by reference. · fa0f7283
Rafael Espindola authored Feb 08, 2014
```
It is never null and it is not used in casts, so there is no reason to use a
pointer. This matches how we pass TM.

llvm-svn: 201025
```
fa0f7283
Add LLVM_OVERRIDE to a few declarations. · 10705015
Rafael Espindola authored Feb 08, 2014
```
llvm-svn: 201022
```
10705015

Feb 07, 2014

Comment cleanup. Don't repeat the function name in the comment. · 3d8a106f
Rafael Espindola authored Feb 07, 2014
```
llvm-svn: 201001
```
3d8a106f
Comment cleanup. Don't repeat the function name in the comment. · 8193d17f
Rafael Espindola authored Feb 07, 2014
```
llvm-svn: 200999
```
8193d17f
Remove trailing whitespace. · 4fb4b47f
Rafael Espindola authored Feb 07, 2014
```
llvm-svn: 200998
```
4fb4b47f

LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack · 1dc10342

Oliver Stannard authored Feb 07, 2014

According to the AAPCS, when a CPRC is allocated to the stack, all other
VFP registers should be marked as unavailable.

I have also modified the rules for allocating non-CPRCs to the stack, to make
it more explicit that all GPRs must be made unavailable. I cannot think of a
case where the old version would produce incorrect answers, so there is no test
for this.

llvm-svn: 200970

1dc10342

X86: Resolve a long standing FIXME and properly isel pextr[bw]. · e9008de6

Jim Grosbach authored Feb 07, 2014

Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957

e9008de6

Feb 06, 2014

Remove const_cast for STI when parsing inline asm · ea2bcb9e

David Peixotto authored Feb 06, 2014

In a previous commit (r199818) we added a const_cast to an existing
subtarget info instead of creating a new one so that we could reuse
it when creating the TargetAsmParser for parsing inline assembly.
This cast was necessary because we needed to reuse the existing STI
to avoid generating incorrect code when the inline asm contained
mode-switching directives (e.g. .code 16).

The root cause of the failure was that there was an implicit sharing
of the STI between the parser and the MCCodeEmitter. To fix a
different but related issue, we now explicitly pass the STI to the
MCCodeEmitter (see commits r200345-r200351).

The const_cast is no longer necessary and we can now create a fresh
STI for the inline asm parser to use.

Differential Revision: http://llvm-reviews.chandlerc.com/D2709

llvm-svn: 200929

ea2bcb9e

[PM] Fix horrible typos that somehow didn't cause a failure in a C++11 · d1ba2efb

Chandler Carruth authored Feb 06, 2014

build but spectacularly changed behavior of the C++98 build. =]

This shows my one problem with not having unittests -- basic API
expectations aren't well exercised by the integration tests because they
*happen* to not come up, even though they might later. I'll probably add
a basic unittest to complement the integration testing later, but
I wanted to revive the bots.

llvm-svn: 200905

d1ba2efb

[PM] Add a new "lazy" call graph analysis pass for the new pass manager. · bf71a34e

Chandler Carruth authored Feb 06, 2014

The primary motivation for this pass is to separate the call graph
analysis used by the new pass manager's CGSCC pass management from the
existing call graph analysis pass. That analysis pass is (somewhat
unfortunately) over-constrained by the existing CallGraphSCCPassManager
requirements. Those requirements make it *really* hard to cleanly layer
the needed functionality for the new pass manager on top of the existing
analysis.

However, there are also a bunch of things that the pass manager would
specifically benefit from doing differently from the existing call graph
analysis, and this new implementation tries to address several of them:

- Be lazy about scanning function definitions. The existing pass eagerly
  scans the entire module to build the initial graph. This new pass is
  significantly more lazy, and I plan to push this even further to
  maximize locality during CGSCC walks.
- Don't use a single synthetic node to partition functions with an
  indirect call from functions whose address is taken. This node creates
  a huge choke-point which would preclude good parallelization across
  the fanout of the SCC graph when we got to the point of looking at
  such changes to LLVM.
- Use a memory dense and lightweight representation of the call graph
  rather than value handles and tracking call instructions. This will
  require explicit update calls instead of some updates working
  transparently, but should end up being significantly more efficient.
  The explicit update calls ended up being needed in many cases for the
  existing call graph so we don't really lose anything.
- Doesn't explicitly model SCCs and thus doesn't provide an "identity"
  for an SCC which is stable across updates. This is essential for the
  new pass manager to work correctly.
- Only form the graph necessary for traversing all of the functions in
  an SCC friendly order. This is a much simpler graph structure and
  should be more memory dense. It does limit the ways in which it is
  appropriate to use this analysis. I wish I had a better name than
  "call graph". I've commented extensively on this aspect.

This is still very much a WIP, in fact it is really just the initial
bits. But it is about the fourth version of the initial bits that I've
implemented with each of the others running into really frustrating
problems. This looks like it will actually work and I'd like to split the
actual complexity across commits for the sake of my reviewers. =] The
rest of the implementation along with lots of wiring will follow
somewhat more rapidly now that there is a good path forward.

Naturally, this doesn't impact any of the existing optimizer. This code
is specific to the new pass manager.

A bunch of thanks are deserved for the various folks that have helped
with the design of this, especially Nick Lewycky who actually sat with
me to go through the fundamentals of the final version here.

llvm-svn: 200903

bf71a34e
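
The "lazy about scanning function definitions" idea can be sketched with a toy graph that records a function's callees only when they are first requested. This is an illustration under simplifying assumptions, not the pass added by this commit: `Module`, `LazyGraph`, and the string-based representation are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical module: function name -> names of the functions it calls.
using Module = std::map<std::string, std::vector<std::string>>;

class LazyGraph {
  const Module &M;
  // Callee lists, populated on demand rather than eagerly for every function.
  std::map<std::string, std::vector<std::string>> Scanned;
  unsigned ScanCount = 0;

public:
  explicit LazyGraph(const Module &Mod) : M(Mod) {}

  // Return the callees of F, scanning F's definition only on first request.
  const std::vector<std::string> &callees(const std::string &F) {
    auto It = Scanned.find(F);
    if (It == Scanned.end()) {
      ++ScanCount;
      auto Def = M.find(F);
      It = Scanned
               .emplace(F, Def == M.end() ? std::vector<std::string>()
                                          : Def->second)
               .first;
    }
    return It->second;
  }

  // Number of function definitions scanned so far.
  unsigned scans() const { return ScanCount; }
};
```

Constructing the graph scans nothing; functions that are never reached from the walk are never scanned.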

Disable most IR-level transform passes on functions marked 'optnone'. · af4e64d0

Paul Robinson authored Feb 06, 2014

Ideally only those transform passes that run at -O0 remain enabled;
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves; it is not the job of
the pass manager to do it for them.

llvm-svn: 200892

af4e64d0
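
A minimal sketch of a pass disabling itself, assuming mock types: `Function`, its `OptNone` flag, and `ShrinkPass` are hypothetical stand-ins, not LLVM classes. The point is only that the early exit lives inside the pass rather than in the pass manager.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an IR function.
struct Function {
  std::string Name;
  bool OptNone;         // models the 'optnone' attribute
  int InstructionCount; // toy measure of the function body
};

// Hypothetical transform pass. Returns true if it changed the function.
struct ShrinkPass {
  bool runOnFunction(Function &F) const {
    if (F.OptNone)
      return false; // the pass opts out on its own
    if (F.InstructionCount == 0)
      return false; // nothing to do
    --F.InstructionCount; // pretend one instruction was optimized away
    return true;
  }
};
```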

Add address space argument to allowsUnalignedMemoryAccess. · 25793a3f
Matt Arsenault authored Feb 05, 2014
```
On R600, some address spaces have more strict alignment
requirements than others.

llvm-svn: 200887
```
25793a3f

Feb 05, 2014

Fix layering StringRef copy using BumpPtrAllocator. · 4d6d9812

Nick Kledzik authored Feb 05, 2014

Now to copy a string into a BumpPtrAllocator and get a StringRef to the copy:

   StringRef myCopy = myStr.copy(myAllocator);
   

llvm-svn: 200885

4d6d9812
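
To show what such a `copy` does, here is a self-contained sketch using simplified stand-ins. `BumpAllocator` and `StrView` are hypothetical and much smaller than LLVM's `BumpPtrAllocator` and `StringRef`; only the shape of the `copy(allocator)` call follows the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical bump allocator: carves allocations out of fixed-size slabs
// and releases everything at once when it is destroyed.
class BumpAllocator {
  static constexpr std::size_t SlabSize = 4096;
  std::vector<std::unique_ptr<char[]>> Slabs;
  std::size_t Used = SlabSize; // forces a fresh slab on first allocation

public:
  char *allocate(std::size_t N) {
    if (N > SlabSize) { // oversized request gets a dedicated slab
      Slabs.emplace_back(new char[N]);
      Used = SlabSize; // next allocation starts a new slab
      return Slabs.back().get();
    }
    if (Used + N > SlabSize) {
      Slabs.emplace_back(new char[SlabSize]);
      Used = 0;
    }
    char *P = Slabs.back().get() + Used;
    Used += N; // "bump" the pointer
    return P;
  }
};

// Hypothetical non-owning string view.
struct StrView {
  const char *Data;
  std::size_t Length;

  // Copy the bytes into A and return a view of the copy.
  StrView copy(BumpAllocator &A) const {
    if (Length == 0)
      return StrView{nullptr, 0};
    char *S = A.allocate(Length);
    std::memcpy(S, Data, Length);
    return StrView{S, Length};
  }

  std::string str() const { return std::string(Data ? Data : "", Length); }
};
```

The returned view stays valid for as long as the allocator lives, independent of the lifetime of the original string.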

[PM] Don't require analysis results to be const in the new pass manager. · eedf9fca

Chandler Carruth authored Feb 05, 2014

I think this was just over-eagerness on my part. The analysis results
often need to be non-const because they need to (in some cases at least)
be updated by the transformation pass in order to remain correct. It
also makes lazy analyses (a common case) needlessly annoying to write in
order to make their entire state mutable.

llvm-svn: 200881

eedf9fca

Remove support for not using .loc directives. · b4eec1da
Rafael Espindola authored Feb 05, 2014
```
Clang itself was not using this. The only way to access it was via llc.

llvm-svn: 200862
```
b4eec1da
Revert "Fix an invalid check for duplicate option categories." · 0bca63a3
Rafael Espindola authored Feb 05, 2014
```
This reverts commit r200853.

It was causing clang/Analysis/checker-plugins.c to crash.

llvm-svn: 200858
```
0bca63a3

Fix an invalid check for duplicate option categories. · e88421b6

Alexander Kornienko authored Feb 05, 2014

Summary:
The check performed in the comparator is invalid, as some STL
implementations enforce strict weak ordering by calling the comparator with the
same value. This check was also in a wrong place: the assertion would only fire
when -help was used. The new check is performed each time the category is
registered (we are not going to have thousands of them, so it's fine to do it in
O(N^2)).

Reviewers: jordan_rose

Reviewed By: jordan_rose

CC: cfe-commits, alexmc

Differential Revision: http://llvm-reviews.chandlerc.com/D2699

llvm-svn: 200853

e88421b6
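
A check performed at registration time might look roughly like the sketch below. The names are hypothetical and this is not the cl::OptionCategory code; it throws an exception instead of asserting so that the behaviour is easy to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical option category, identified by its name.
struct OptionCategory {
  std::string Name;
};

class CategoryRegistry {
  std::vector<const OptionCategory *> Categories;

public:
  // Each registration scans the existing entries linearly, so registering
  // N categories costs O(N^2) overall. That is acceptable for small N.
  void registerCategory(const OptionCategory *C) {
    for (const OptionCategory *Existing : Categories)
      if (Existing->Name == C->Name)
        throw std::invalid_argument("duplicate option category: " + C->Name);
    Categories.push_back(C);
  }

  std::size_t size() const { return Categories.size(); }
};
```

Unlike a check inside a sort comparator, this one never sees an element compared with itself, and it runs whether or not `-help` is requested.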

AVX-512: Added intrinsic for cvtph2ps. · a30e4376

Elena Demikhovsky authored Feb 05, 2014

Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).

llvm-svn: 200823

a30e4376

Add CheckChildInteger to ISelMatcher operations. Removes nearly 2000 bytes from X86 matcher table. · 7ca1d180
Craig Topper authored Feb 05, 2014
```
llvm-svn: 200821
```
7ca1d180

Fix configure to find arc4random via header files. · 4ccfe392

Todd Fiala authored Feb 05, 2014

ISSUE:

On Ubuntu 12.04 LTS, arc4random is provided by libbsd.so, which is a
transitive dependency of libedit. If a system had libedit on it that
was implemented in terms of libbsd.so, then the arc4random test,
previously implemented as a linker test, would succeed with -ledit.
However, on Ubuntu this would also require a #include <bsd/stdlib.h>.
This caused a build breakage on configure-based Ubuntu 12.04 with
libedit installed.

FIX:

This fix changes configure to test for arc4random by searching for it
in the standard header files. On Ubuntu 12.04, this test now properly
fails to find arc4random as it is not defined in the default header
locations. It also tweaks the #define names to match the output of the
header check command, which is slightly different than the linker
function check #defines.

I tested the following scenarios:

(1) Ubuntu 12.04 without the libedit package [did not find arc4random,
as expected]

(2) Ubuntu 12.04 with libedit package [properly did not find
arc4random, as expected]

(3) Ubuntu 12.04 with most recent libedit, custom built, and not
dependent on libbsd.so [properly did not find arc4random, as
expected].

(4) FreeBSD 10.0B1 [properly found arc4random, as expected]

llvm-svn: 200819

4ccfe392

Feb 04, 2014

Remove unused SF_ThreadLocal. · 975e115e
Rafael Espindola authored Feb 04, 2014
```
llvm-svn: 200800
```
975e115e
SimplifyLibCalls: Push TLI through the exp2->ldexp transform. · 34f460ed
Benjamin Kramer authored Feb 04, 2014
```
For the odd case of platforms with exp2 available but not ldexp.

llvm-svn: 200795
```
34f460ed
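
The identity behind this transform is that exp2 of an integer n equals ldexp(1.0, n). A minimal sketch, assuming a hypothetical `LibInfo` struct in place of TargetLibraryInfo, of guarding the rewrite on ldexp being available:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the library information consulted by the
// transform; only records whether ldexp exists on the platform.
struct LibInfo {
  bool HasLdexp;
};

// Compute 2^N, using ldexp only when the platform is known to provide it.
inline double exp2OfInt(int N, const LibInfo &TLI) {
  if (TLI.HasLdexp)
    return std::ldexp(1.0, N); // scales 1.0 by 2^N
  return std::exp2(static_cast<double>(N)); // fall back to plain exp2
}
```
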
Every target uses .align. Simplify. · 7cbbd28c
Rafael Espindola authored Feb 04, 2014
```
llvm-svn: 200782
```
7cbbd28c

Fix PR18345: ldr= pseudo instruction produces incorrect code when used in inline assembly · b9b7362c

David Peixotto authored Feb 04, 2014

This patch fixes the ldr-pseudo implementation to work when used in
inline assembly.  The fix is to move arm assembler constant pools
from the ARMAsmParser class to the ARMTargetStreamer class.

Previously we kept the assembler generated constant pools in the
ARMAsmParser object. This does not work for inline assembly because
a new parser object is created for each blob of inline assembly.
This patch moves the constant pools to the ARMTargetStreamer class
so that the constant pool will remain alive for the entire code
generation process.

An ARMTargetStreamer class is now required for the arm backend.
There was no existing implementation for MachO, only Asm and ELF.
Instead of creating an empty MachO subclass, we decided to make the
ARMTargetStreamer a non-abstract class and provide default
(llvm_unreachable) implementations for the non constant-pool related
methods.

Differential Revision: http://llvm-reviews.chandlerc.com/D2638

llvm-svn: 200777

b9b7362c

OS X: the correct function is __sincospif_stret, not __sincospi_stretf · 103e648d
Tim Northover authored Feb 04, 2014
```
rdar://problem/13729466

llvm-svn: 200771
```
103e648d

ARM & AArch64: merge NEON absolute compare intrinsics · fdbdb4b6

Tim Northover authored Feb 04, 2014

There was an extremely confusing proliferation of LLVM intrinsics to implement
the vacge & vacgt instructions. This combines them all into two polymorphic
intrinsics, shared across both backends.

llvm-svn: 200768

fdbdb4b6

llvm-cov: Implement the preserve-paths flag · c6af3506

Justin Bogner authored Feb 04, 2014

Until now, when a path in a gcno file included a directory, we would
emit our .gcov file in that directory, whereas gcov always emits the
file in the current directory. In doing so, this implements gcov's
strange name-mangling -p flag, which is needed to avoid clobbering
files when two with the same name exist in different directories.

The path mangling is a bit ugly and only handles unix-like paths, but
it's simple, and it doesn't make any guesses as to how it should
behave outside of what gcov documents. If we decide this should be
cross platform later, we can consider the compatibility implications
then.

llvm-svn: 200754

c6af3506
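
For reference, gcov documents the -p mangling as: '/' becomes '#', '.' components are dropped, and '..' components become '^'. The sketch below follows that description for unix-like paths; `manglePath` is a hypothetical helper, not the function added to llvm-cov.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mangle a unix-like path into a single file name component, following
// gcov's documented --preserve-paths rules.
inline std::string manglePath(const std::string &Path) {
  std::string Out;
  std::size_t Start = 0;
  while (true) {
    std::size_t End = Path.find('/', Start);
    bool Last = (End == std::string::npos);
    if (Last)
      End = Path.size();
    std::string Part = Path.substr(Start, End - Start);
    if (Part == "..")
      Out += '^'; // parent directory
    else if (Part != ".")
      Out += Part; // ordinary component ("." is skipped entirely)
    if (Last)
      break;
    if (Part != ".")
      Out += '#'; // separator replaces '/'
    Start = End + 1;
  }
  return Out;
}
```

With this scheme, `a/x.c` and `b/x.c` map to `a#x.c` and `b#x.c`, so their coverage outputs no longer collide in the current directory.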

Feb 03, 2014

AArch64 & ARM: refactor crypto intrinsics to take scalars · 24979d8e

Tim Northover authored Feb 03, 2014

Some of the SHA instructions take a scalar i32 as one argument (largely because
they work on 160-bit hash fragments). This wasn't reflected in the IR
previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x
i32> respectively) which was ugly.

This makes all the affected intrinsics take a uniform "i32", allowing them to
become non-polymorphic at the same time.

llvm-svn: 200706

24979d8e

Remove outdated & incorrect part of comment. · 309e77fb

Eli Bendersky authored Feb 03, 2014

This comment was copied over from another class in r34170, where it made sense.

llvm-svn: 200697

309e77fb

Introduce SmallPtrSetImpl<T *> which allows insert, erase, count, and · 784de75c

Chandler Carruth authored Feb 03, 2014

iteration. This allows the majority of operations to be performed without
encoding a specific small size. It follows the model of
SmallVectorImpl<T>.

llvm-svn: 200688

784de75c
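
The layering described here can be sketched as a size-agnostic base class plus a derived class that supplies the inline storage. The `TinyPtrSet` types below are hypothetical and far simpler than SmallPtrSet (linear search, a vector for overflow); they only illustrate how an API can be written against the element type without naming a small size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Size-agnostic interface: knows the element type but not the inline size.
template <typename PtrT> class TinyPtrSetImpl {
  PtrT *Inline;               // buffer owned by the derived class
  std::size_t Capacity;       // number of inline slots
  std::size_t NumInline = 0;  // inline slots in use
  std::vector<PtrT> Overflow; // used once the inline buffer is full

protected:
  TinyPtrSetImpl(PtrT *Buf, std::size_t Cap) : Inline(Buf), Capacity(Cap) {}

public:
  TinyPtrSetImpl(const TinyPtrSetImpl &) = delete;
  TinyPtrSetImpl &operator=(const TinyPtrSetImpl &) = delete;

  std::size_t count(PtrT P) const {
    if (std::find(Inline, Inline + NumInline, P) != Inline + NumInline)
      return 1;
    return std::find(Overflow.begin(), Overflow.end(), P) != Overflow.end();
  }

  // Returns true if P was newly inserted.
  bool insert(PtrT P) {
    if (count(P))
      return false;
    if (NumInline < Capacity)
      Inline[NumInline++] = P;
    else
      Overflow.push_back(P);
    return true;
  }

  // Returns true if P was present and has been removed.
  bool erase(PtrT P) {
    PtrT *I = std::find(Inline, Inline + NumInline, P);
    if (I != Inline + NumInline) {
      *I = Inline[--NumInline]; // fill the hole with the last inline entry
      return true;
    }
    auto O = std::find(Overflow.begin(), Overflow.end(), P);
    if (O == Overflow.end())
      return false;
    Overflow.erase(O);
    return true;
  }

  std::size_t size() const { return NumInline + Overflow.size(); }
};

// Concrete set with N inline slots (N must be greater than zero).
template <typename PtrT, std::size_t N>
class TinyPtrSet : public TinyPtrSetImpl<PtrT> {
  PtrT Storage[N];

public:
  TinyPtrSet() : TinyPtrSetImpl<PtrT>(Storage, N) {}
};

// An API written against the Impl type accepts sets of any small size.
inline bool markVisited(TinyPtrSetImpl<int *> &Visited, int *P) {
  return Visited.insert(P);
}
```

`markVisited` can be called with a `TinyPtrSet<int *, 2>` or a `TinyPtrSet<int *, 8>` alike, which is the kind of API simplification the commit message refers to.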

Rename the non-templated base class of SmallPtrSet to · 173bd7ed

Chandler Carruth authored Feb 03, 2014

'SmallPtrSetImplBase'. This more closely matches the organization of
SmallVector and should allow introducing a SmallPtrSetImpl which serves
the same purpose as SmallVectorImpl: isolating the element type from the
particular small size chosen. This in turn allows a lot of
simplification of APIs by not coding them against a specific small size
which is rarely needed.

llvm-svn: 200687

173bd7ed