Commits · d0ed730f924ecd1a39b9d979741dc627229d0cf9 · Roger Ferrer / llvm-epi-0.8

Nov 27, 2013

Remove dead argument. · d0ed730f
Rafael Espindola authored Nov 27, 2013
```
llvm-svn: 195806
```
d0ed730f
[AArch64] Add support for NEON scalar floating-point absolute difference. · 75290c63
Chad Rosier authored Nov 27, 2013
```
llvm-svn: 195803
```
75290c63

Use simple section names for COMDAT sections on COFF. · 2d30ae2b

Rafael Espindola authored Nov 27, 2013

With this patch we use simple names for COMDAT sections (like .text or .bss).
This matches the MSVC behavior.

When merging it is the COMDAT symbol that is used to decide if two sections
should be merged, so there is no point in building a fancy name.

This survived a bootstrap on mingw32.

llvm-svn: 195798

2d30ae2b
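
The merging rule described in 2d30ae2b can be illustrated with a small sketch. It assumes a linker that keeps one section per COMDAT symbol (the "pick any" behaviour); the `merge_comdat_sections` helper and the symbol names `f` and `g` are hypothetical and are not part of LLVM or of any linker API.

```python
def merge_comdat_sections(sections):
    """Keep the first section seen for each COMDAT symbol.

    `sections` is a list of (section_name, comdat_symbol) pairs. The
    section name is carried along but never consulted when deciding
    whether two sections are duplicates.
    """
    kept = {}
    for name, comdat_symbol in sections:
        kept.setdefault(comdat_symbol, name)
    return kept


# Two object files both define `f`; one also defines `g`.
# Every COMDAT section simply uses the name ".text".
sections = [(".text", "f"), (".text", "g"), (".text", "f")]
merged = merge_comdat_sections(sections)
print(sorted(merged))  # ['f', 'g']
```

Because only the symbol is used as the key, a more elaborate section name would not change which sections survive.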

Nov 26, 2013

PR1860 - We can't save a list of ExtractElement instructions to CSE because... · b0082d24

Nadav Rotem authored Nov 26, 2013

PR1860 - We can't save a list of ExtractElement instructions to CSE because some of these instructions
may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE.

llvm-svn: 195791

b0082d24

80-column fixups. · f52eddf9
Eric Christopher authored Nov 26, 2013
```
llvm-svn: 195790
```
f52eddf9
[AArch64] Add support for NEON scalar floating-point to integer convert · 9653d5c9
Chad Rosier authored Nov 26, 2013
```
instructions.

llvm-svn: 195788
```
9653d5c9

LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary · a2c8e008

Arnold Schwaighofer authored Nov 26, 2013

In signed arithmetic we could end up with an i64 trip count for an i32 phi.
Because it is signed arithmetic we know that this is only defined if the i32
does not wrap. It is therefore safe to truncate the i64 trip count to an i32
value.

Fixes PR18049.

llvm-svn: 195787

a2c8e008

Fix a bug related to constant islands for Mips16 and mips16/32 dual mode. · 3aeb1d08
Reed Kotler authored Nov 26, 2013
```
The determination of when we are doing constant pools was being made too
early in the asm printer.

llvm-svn: 195781
```
3aeb1d08

Refactor some code in SampleProfile.cpp · c0dd1037

Diego Novillo authored Nov 26, 2013

I'm adding new functionality in the sample profiler. This will
require more data to be kept around for each function, so I moved
the structure SampleProfile that we keep for each function into
a separate class.

There are no functional changes in this patch. It simply provides
a new home for all the new data that I need to propagate
weights through edges.

There are some other name and minor edits throughout.

llvm-svn: 195780

c0dd1037

Fix PR18054 · d617a301

Michael Liao authored Nov 26, 2013

- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.

llvm-svn: 195779

d617a301

DwarfDebug: Include type units in accelerator tables. · fd1eff5a

David Blaikie authored Nov 26, 2013

Since type units aren't in the CUMap, use the DwarfUnits list to iterate
over units for tasks such as accelerator table building.

llvm-svn: 195776

fd1eff5a

Fix spurious return introduced by my earlier patch to DebugInfo · 1388f070
Renato Golin authored Nov 26, 2013
```
llvm-svn: 195775
```
1388f070

PR18060 - When we RAUW values with ExtractElement instructions in some cases · f9f8482e

Nadav Rotem authored Nov 26, 2013

we generate PHI nodes with multiple entries from the same basic block but
with different values. Enabling CSE on ExtractElement instructions makes sure
that all of the RAUWed instructions are the same.

llvm-svn: 195773

f9f8482e

Add return to DIType::Verify · 47f46fd4

Renato Golin authored Nov 26, 2013

A code scanner run by Sylvestre Ledru found a no_return bug
in DebugInfo.cpp. Adding the return statements that
should be there.

llvm-svn: 195772

47f46fd4

PR17925 bugfix. · abb8505d

Stepan Dyatkovskiy authored Nov 26, 2013

Short description.

This issue concerns treating pointers as integers.
We treat pointers as different if they reference different address spaces.
At the same time, we treat pointers as equal to integers (of machine address
width). This was a source of false positives. Consider the following case on a
32-bit machine:

void foo0(i32 addrspace(1)* %p)
void foo1(i32 addrspace(2)* %p)
void foo2(i32 %p)

foo0 != foo1, while
foo1 == foo2 and foo0 == foo2.

As you can see, this breaks transitivity. That means the result depends on the
order in which the functions appear in the module. The following order causes
foo0 and foo1 to be merged: foo2, foo0, foo1.
First, foo0 is merged with foo2 and foo0 is erased. Second, foo1 is
merged with foo2.
Depending on the order, things can be merged that we don't expect to be.

The fix:
Forbid treating any pointer as an integer, except for those that belong to address space 0.

llvm-svn: 195769

abb8505d
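
The transitivity problem in abb8505d can be reproduced with a toy model of the comparison. This is only a sketch under stated assumptions: `IntTy`, `PtrTy`, `equivalent_old` and `equivalent_new` are hypothetical names, not the MergeFunctions implementation, and the 32-bit pointer width is taken from the example in the commit message.

```python
from dataclasses import dataclass

POINTER_BITS = 32  # assumption: 32-bit target, as in the commit's example


@dataclass(frozen=True)
class IntTy:
    bits: int


@dataclass(frozen=True)
class PtrTy:
    addrspace: int


def _same_kind(a, b):
    """Compare two pointers or two integers; return None for mixed pairs."""
    if isinstance(a, PtrTy) and isinstance(b, PtrTy):
        return a.addrspace == b.addrspace
    if isinstance(a, IntTy) and isinstance(b, IntTy):
        return a.bits == b.bits
    return None


def equivalent_old(a, b):
    """Pre-fix rule: any pointer equals a pointer-width integer."""
    same = _same_kind(a, b)
    if same is not None:
        return same
    integer = a if isinstance(a, IntTy) else b
    return integer.bits == POINTER_BITS


def equivalent_new(a, b):
    """Post-fix rule: only address space 0 pointers equal such an integer."""
    same = _same_kind(a, b)
    if same is not None:
        return same
    ptr, integer = (a, b) if isinstance(a, PtrTy) else (b, a)
    return ptr.addrspace == 0 and integer.bits == POINTER_BITS


foo0, foo1, foo2 = PtrTy(1), PtrTy(2), IntTy(32)
print(equivalent_old(foo0, foo2), equivalent_old(foo1, foo2), equivalent_old(foo0, foo1))
print(equivalent_new(foo0, foo2), equivalent_new(foo1, foo2), equivalent_new(foo0, foo1))
```

Under the old rule the first line prints `True True False`, which is the broken transitivity the message describes; under the new rule all three comparisons are `False`.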

Rename DwarfException methods so the new names are consistent with DwarfDebug and the style guide · 119f3073
Timur Iskhodzhanov authored Nov 26, 2013
```
llvm-svn: 195763
```
119f3073
Darwin-ARM: use movw/movt for static relocations · fa36dfee
Tim Northover authored Nov 26, 2013
```
llvm-svn: 195759
```
fa36dfee

[PM] Factor the overwhelming majority of the interface boilerplate out · 16ea68e8

Chandler Carruth authored Nov 26, 2013

of the two analysis managers into a CRTP base class that can be shared
and re-used in building any analysis manager. This will in turn simplify
adding yet another analysis manager to the system.

The base class provides all of the interface sugar for the analysis
manager delegating the functionality back through DerivedT methods which
operate on simple pass IDs. It also provides the pass registration,
storage, and lookup system which is common across the various
formulations of analysis managers.

llvm-svn: 195747

16ea68e8

[SystemZ] Fix incorrect use of RISBG for a zero-extended right shift · dd7dd930

Richard Sandiford authored Nov 26, 2013

We would wrongly transform the testcase into the equivalent of an AND with 1.
The problem was that, when testing whether the shifted-in bits of the right
shift were significant, we used the width of the final zero-extended result
rather than the width of the shifted value.

llvm-svn: 195731

dd7dd930

[PM] Split the CallGraph out from the ModulePass which creates the · 6378cf53

Chandler Carruth authored Nov 26, 2013

CallGraph.

This makes the CallGraph a totally generic analysis object that is the
container for the graph data structure and the primary interface for
querying and manipulating it. The pass logic is separated into its own
class. For compatibility reasons, the pass provides wrapper methods for
most of the methods on CallGraph -- they all just forward.

This will allow the new pass manager infrastructure to provide its own
analysis pass that constructs the same CallGraph object and makes it
available. The idea is that in the new pass manager, the analysis pass's
'run' method returns a concrete analysis 'result'. Here, that result is
a 'CallGraph'. The 'run' method will typically do only minimal work,
deferring much of the work into the implementation of the result object
in order to be lazy about computing things, but when (like DomTree)
there is *some* up-front computation, the analysis does it prior to
handing the result back to the querying pass.

I know some of this is fairly ugly. I'm happy to change it around if
folks can suggest a cleaner interim state, but there is going to be some
amount of unavoidable ugliness during the transition period. The good
thing is that this is very limited and will naturally go away when the
old pass infrastructure goes away. It won't hang around to bother us
later.

Next up is the initial new-PM-style call graph analysis. =]

llvm-svn: 195722

6378cf53
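
The split described in 6378cf53 can be sketched as follows. This is a loose Python analogy, not the C++ interface: the class names, the `run_on_module` method and the dictionary standing in for a module are all assumptions made for illustration.

```python
class CallGraph:
    """Generic analysis result: owns the graph and answers queries lazily."""

    def __init__(self, module):
        self._module = module  # here: dict mapping function name -> list of callees
        self._edges = None     # built on first query

    def callees(self, function):
        if self._edges is None:
            self._edges = {f: tuple(cs) for f, cs in self._module.items()}
        return self._edges.get(function, ())


class CallGraphAnalysis:
    """New-style analysis: `run` hands back a concrete result object."""

    def run(self, module):
        return CallGraph(module)


class CallGraphWrapperPass:
    """Old-style pass: builds the same CallGraph and forwards queries to it."""

    def run_on_module(self, module):
        self.graph = CallGraph(module)
        return False  # the module was not modified

    def callees(self, function):
        return self.graph.callees(function)


module = {"main": ["parse", "emit"], "parse": [], "emit": ["parse"]}

new_style = CallGraphAnalysis().run(module)
old_style = CallGraphWrapperPass()
old_style.run_on_module(module)
print(new_style.callees("main"), old_style.callees("main"))
```

Both routes end up querying the same kind of `CallGraph` object, which is the point of separating the result from the pass that produces it.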

[PM] Reformat some code with clang-format as I'm going to be editing as · 878b5537
Chandler Carruth authored Nov 26, 2013
```
part of generalizing the call graph infrastructure for the new pass
manager.

llvm-svn: 195718
```
878b5537
Refactored the implementation of AArch64 NEON instruction ZIP, UZP · 599c47d0
Kevin Qin authored Nov 26, 2013
```
and TRN.
Fix a bug with mixed use of vget_high_u8() and vuzp_u8().

llvm-svn: 195716
```
599c47d0
[AArch64] Implement 128-bit register copy with NEON. · 33ca18fd
Kevin Qin authored Nov 26, 2013
```
llvm-svn: 195713
```
33ca18fd

StackMap: Implement support for DirectMemRefOp. · 391dbadb

Andrew Trick authored Nov 26, 2013

A Direct stack map location records the address of a frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address of an alloca, while IndirectMemRefOp
locations handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

llvm-svn: 195712

391dbadb
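
The difference between the two location kinds in 391dbadb can be sketched from the runtime's point of view. The `Location` record and `resolve` function below are hypothetical simplifications; the real stack map record carries more fields, and the frame base and memory contents are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    kind: str    # "Direct" or "Indirect"
    offset: int  # byte offset from the frame base


def resolve(location, frame_base, memory):
    """Return the value the runtime asked for at this stack map location."""
    address = frame_base + location.offset
    if location.kind == "Direct":
        # The address of the frame slot is itself the requested value.
        return address
    if location.kind == "Indirect":
        # The requested value was spilled to the slot and must be loaded.
        return memory[address]
    raise ValueError(f"unsupported location kind: {location.kind}")


frame_base = 0x7FF0
memory = {0x7FF0 - 16: 1234}  # a spilled value living in the frame

print(hex(resolve(Location("Direct", -16), frame_base, memory)))  # 0x7fe0
print(resolve(Location("Indirect", -16), frame_base, memory))     # 1234
```

Resolving the Direct location never touches `memory`, which matches the message's point that it can be determined at any time after compilation.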

whitespace · d3ab37cf
Andrew Trick authored Nov 26, 2013
```
llvm-svn: 195711
```
d3ab37cf

Lift self-copy protection up to the header file and add self-move · 480f5d26

Chandler Carruth authored Nov 26, 2013

protection to the same layer.

This is in line with Howard's advice on how best to handle self-move
assignment as he explained on SO[1]. It also ensures that implementing
swap with move assignment continues to work in the case of self-swap.

[1]: http://stackoverflow.com/questions/9322174/move-assignment-operator-and-if-this-rhs

llvm-svn: 195705

480f5d26

Fix a self-memcpy which only breaks under Valgrind's memcpy · 2664317b

Chandler Carruth authored Nov 26, 2013

implementation. Silliness, but it'll be a trivial performance
optimization. This should clear up a failure on the vg_leak bot.

llvm-svn: 195704

2664317b

[PM] Rename the 'Mod' member to the more idiomatic 'M'. No functionality · 9a398f45
Chandler Carruth authored Nov 26, 2013
```
changed.

llvm-svn: 195701
```
9a398f45

DebugInfo: Remove CompileUnit::constructTypeDIEImpl now that it's just a simple wrapper again. · fbd29eb3

David Blaikie authored Nov 26, 2013

r195698 moved the type unit checking up into getOrCreateTypeDIE so
remove the redundant check and fold the functions back together again.

llvm-svn: 195700

fbd29eb3

DebugInfo: Avoid emitting pubtype entries for type DIEs that just indirect to a type unit. · 8a263cbc
David Blaikie authored Nov 26, 2013
```
llvm-svn: 195698
```
8a263cbc
Add an intrinsic for the SSE2 PAUSE instruction. · c592e525
Cameron McInally authored Nov 26, 2013
```
llvm-svn: 195697
```
c592e525

DebugInfo: Pubtypes: Coalesce pubtype registration with accelerator type registration. · 9d861bed

David Blaikie authored Nov 26, 2013

It might be possible to eventually use one data structure, but I haven't
looked at the exact criteria used for accelerator tables and pubtypes to
see if there's good reason for the differences between the two or not.

llvm-svn: 195696

9d861bed

Nov 25, 2013

Do the string comparison in the constructor instead of once per nop. · a834e301
Rafael Espindola authored Nov 25, 2013
```
Thanks to Roman Divacky for the suggestion.

llvm-svn: 195684
```
a834e301

Don't use nopl in cpus that don't support it. · 1b8bfdaa

Rafael Espindola authored Nov 25, 2013

Patch by Mikulas Patocka. I added the test. I checked that for cpu names that
gas knows about, it also doesn't generate nopl.

The modified cpus:
i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta
        Crusoe, Microsoft VirtualBox - see
        https://bbs.archlinux.org/viewtopic.php?pid=775414
k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs
via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that
        Via c3 and c3-Nehemiah don't have nopl

llvm-svn: 195679

1b8bfdaa
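
The two nop-related commits above (a834e301 and 1b8bfdaa) can be pictured together with a short sketch. It is an illustration under assumptions, not the X86 backend code: `NopWriter` is a hypothetical class, the CPU table contains only the names the commit message lists as modified, and only two nop forms are used (the one-byte `0x90` and the three-byte `nopl` encoding `0f 1f 00`).

```python
# Hypothetical table: just the CPU names the commit message lists as modified.
CPUS_WITHOUT_NOPL = frozenset(
    {"i686", "k6", "k6-2", "k6-3", "winchip-c6", "winchip2", "c3", "c3-2"}
)

NOP = b"\x90"             # one-byte x86 nop
NOPL_3 = b"\x0f\x1f\x00"  # three-byte nopl (%eax)


class NopWriter:
    def __init__(self, cpu):
        # The CPU name is checked once here, not once per emitted nop.
        self.has_nopl = cpu not in CPUS_WITHOUT_NOPL

    def write_nops(self, count):
        """Return `count` bytes of padding (simplified to two nop forms)."""
        if not self.has_nopl:
            return NOP * count
        whole, rest = divmod(count, len(NOPL_3))
        return NOPL_3 * whole + NOP * rest


print(NopWriter("k6").write_nops(4).hex(" "))     # 90 90 90 90
print(NopWriter("core2").write_nops(4).hex(" "))  # 0f 1f 00 90
```

`core2` is used here only as an example of a CPU name that is absent from the table.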

ARM integrated assembler generates incorrect nop opcode · 7266731f

David Peixotto authored Nov 25, 2013

This patch fixes a bug in the assembler that was causing bad code to
be emitted.  When switching modes in an assembly file (e.g. arm to
thumb mode) we would always emit the opcode from the original mode.

Consider this small example:

$ cat align.s
.code 16
foo:
  add r0, r0
.align 3
  add r0, r0

$ llvm-mc -triple armv7-none-linux align.s -filetype=obj -o t.o
$ llvm-objdump -triple thumbv7 -d t.o
Disassembly of section .text:
foo:
       0:       00 44         add     r0, r0
       2:       00 f0 20 e3   blx #4195904
       6:       00 00         movs    r0, r0
       8:       00 44         add     r0, r0

This shows that we have actually emitted an arm nop (e320f000)
instead of a thumb nop. Unfortunately, this encodes to a thumb
branch which causes bad things to happen when compiling assembly
code with align directives.

The fix is to notify the ARMAsmBackend when we switch mode. The
MCMachOStreamer was already doing this correctly. This patch makes
the same change for the MCElfStreamer.

There is still a bug in the way nops are emitted for alignment
because the MCAlignment fragment does not store the correct mode.
The ARMAsmBackend will emit nops for the last mode it knew about. In
the example above, we still generate an arm nop if we add a `.code
32` to the end of the file.

PR18019

llvm-svn: 195677

7266731f
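
The byte pattern in the disassembly of 7266731f can be reproduced with a few lines. The ARM nop value `0xe320f000` is taken from the commit message; the Thumb nop encoding `0xbf00` and the helper names are additions made for this sketch, and the zero fill after the ARM nop simply mirrors the bytes shown in the listing.

```python
import struct

ARM_NOP = 0xE320F000  # 32-bit ARM-mode nop, as quoted in the commit message
THUMB_NOP = 0xBF00    # 16-bit Thumb nop (background fact, not from the commit)


def pad_with_arm_nops(count):
    """Pad `count` bytes as in the buggy listing: ARM nops, then zero fill."""
    whole, rest = divmod(count, 4)
    return struct.pack("<I", ARM_NOP) * whole + bytes(rest)


def pad_with_thumb_nops(count):
    """Pad `count` bytes with 16-bit Thumb nops; `count` must be even."""
    if count % 2:
        raise ValueError("Thumb padding must be a multiple of 2 bytes")
    return struct.pack("<H", THUMB_NOP) * (count // 2)


# In align.s the first add ends at offset 2 and `.align 3` asks for offset 8.
gap = 8 - 2
print(pad_with_arm_nops(gap).hex(" "))    # 00 f0 20 e3 00 00
print(pad_with_thumb_nops(gap).hex(" "))  # 00 bf 00 bf 00 bf
```

The first line matches the bytes at offsets 2 through 7 in the listing, which a Thumb disassembler reads as `blx` followed by `movs r0, r0`.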

Unrevert r195599 with testcase fix. · 9200bb08
Bill Wendling authored Nov 25, 2013
```
I'm not sure how it was checking for the wrong values...
PR18023.

llvm-svn: 195670
```
9200bb08
Fix indentation typo · d34094e5
Tim Northover authored Nov 25, 2013
```
llvm-svn: 195660
```
d34094e5

ARM: remove special cases for Darwin dynamic-no-pic mode. · db962e2c

Tim Northover authored Nov 25, 2013

These are handled almost identically to static mode (and ELF's global address
materialisation), except that a symbol may have "$non_lazy_ptr" appended. This
can be handled by passing appropriate flags along with the instruction instead
of using entirely separate pseudo-instructions.

llvm-svn: 195655

db962e2c

Fix .comm and .lcomm on COFF. · edcf1ff7

Rafael Espindola authored Nov 25, 2013

These should not use COMDATs. GNU as uses .bss for .lcomm and section 0 for
.comm.

Given

static int a;
int b;

MSVC puts both in .bss. This patch then puts both .comm and .lcomm on .bss. With
this change we agree with gas on .lcomm, are much closer on .comm and clang-cl
matches msvc on the above example.

llvm-svn: 195654

edcf1ff7

Refactor to make the .bss, .data and .text sections available for other uses. · 3294e057
Rafael Espindola authored Nov 25, 2013
```
No functionality change.

llvm-svn: 195653
```
3294e057