Commits · 5f54c655c140305abbb71546b88b4324f273a105 · Roger Ferrer / llvm-epi-0.8

Nov 08, 2013

CalculateSpillWeights does not need to be a pass · ed812f65

Arnaud A. de Grandmaison authored Nov 08, 2013

Based on discussions with Lang Hames and Jakob Stoklund Olesen at the hacker's lab, and in light of upcoming work on the PBQP register allocator, it was thought that CalcSpillWeights does not need to be a pass. This change will make it possible to customize or tune the spill weight computation depending on the allocator.

Update the documentation style while there.

No functional change.

llvm-svn: 194269


CalculateSpillWeights cleanup: remove unneeded includes · 3b52f0b1
Arnaud A. de Grandmaison authored Nov 08, 2013
```
llvm-svn: 194259
```

Nov 05, 2013

Slightly change the way stackmap and patchpoint intrinsics are lowered. · 6664df12

Andrew Trick authored Nov 05, 2013

MorphNodeTo is not safe to call during DAG building. It eagerly
deletes dependent DAG nodes which invalidates the NodeMap. We could
expose a safe interface for morphing nodes, but I don't think it's
worth it. Just create a new MachineNode and replaceAllUsesWith.

My understanding of the SD design has been that we want to support
early target opcode selection. That isn't very well supported, but
generally works. It seems reasonable to rely on this feature even if
it isn't widely used.

llvm-svn: 194102


Nov 02, 2013

Comment some and reformat for clarity beginFunction. · fedfa449
Eric Christopher authored Nov 01, 2013
```
llvm-svn: 193894
```

Nov 01, 2013

[Stackmap] Remove erroneous assert. · 359c532d
Juergen Ributzka authored Nov 01, 2013
```
llvm-svn: 193871
```

Remove linkonce_odr_auto_hide. · 716e7405

Rafael Espindola authored Nov 01, 2013

linkonce_odr_auto_hide was an incomplete attempt to implement a way
for the linker to hide symbols that are known to be available in every
TU and whose addresses are not relevant for a particular DSO.

It was redundant in that all its uses are equivalent to
linkonce_odr+unnamed_addr. Unlike those, it has never been connected
to clang or llvm's optimizers, so it was effectively dead.

Given that nothing produces it, this patch just nukes it
(other than the llvm-c enum value).

llvm-svn: 193865


Commenting out this assert because it is causing the build bots to fail. This... · 2b7a733b

Aaron Ballman authored Nov 01, 2013

Commenting out this assert because it is causing the build bots to fail.  This effectively reverts r193861, but needs to be fixed as part of r193769.

llvm-svn: 193862


Fixing an order of evaluation error in an assert. · 96321aa5
Aaron Ballman authored Nov 01, 2013
```
llvm-svn: 193861
```
DebugInfo: Emit member variable locations as data instead of expressions in blocks · 71d34a2e
David Blaikie authored Nov 01, 2013
```
Drive by space optimization. Also makes the DIEs more regular which
might speed up DWARF parsing.

llvm-svn: 193835
```

Oct 31, 2013

Unused variable · c21d86f7
Andrew Trick authored Oct 31, 2013
```
llvm-svn: 193819
```
Add support for stack map generation in the X86 backend. · 153ebe6d
Andrew Trick authored Oct 31, 2013
```
Originally implemented by Lang Hames.

llvm-svn: 193811
```

Debug Info: remove duplication of DIEs when a DIE can be shared across CUs. · 4dbdc902

Manman Ren authored Oct 31, 2013

We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the
corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs,
that is why we keep the maps in DwarfDebug instead of CompileUnit.

We assume that a DIE that has not yet been added to an owner belongs to the
current CU. Since DIEs for the type system are added to their owners
immediately after creation, and other DIEs belong to the current CU, the
assumption should hold.

A test case is added to show that we only create a single DIE for a type
MDNode and that we use ref_addr to refer to the type DIE.

We also add a test case to show ref_addr relocations for non-darwin
platforms.

llvm-svn: 193779


Lower stackmap intrinsics directly to their target opcode in the DAG builder. · 74f4c749
Andrew Trick authored Oct 31, 2013
```
llvm-svn: 193769
```
whitespace · d4d1d9c0
Andrew Trick authored Oct 31, 2013
```
llvm-svn: 193765
```
Remove the --shrink-wrap option. · dbec9d9b
Rafael Espindola authored Oct 31, 2013
```
It had no tests, was unused and was "experimental at best".

llvm-svn: 193749
```

Legalize: Improve legalization of long vector extends. · 72366786

Jim Grosbach authored Oct 31, 2013

When an extend more than doubles the size of the elements (e.g., a zext
from v16i8 to v16i32), the normal legalization method of splitting the
vectors will run into problems as by the time the destination vector is
legal, the source vector is illegal. The end result is the operation
often becoming scalarized, with the typical horrible performance. For
example, on x86_64, the simple input of:
define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
  %tmp = zext <16 x i8> %a to <16 x i32>
  store <16 x i32> %tmp, <16 x i32>*%p
  ret void
}

Generates:
  .section  __TEXT,__text,regular,pure_instructions
  .section  __TEXT,__const
  .align  5
LCPI0_0:
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _bar
  .align  4, 0x90
_bar:
  vpunpckhbw  %xmm0, %xmm0, %xmm1
  vpunpckhwd  %xmm0, %xmm1, %xmm2
  vpmovzxwd %xmm1, %xmm1
  vinsertf128 $1, %xmm2, %ymm1, %ymm1
  vmovaps LCPI0_0(%rip), %ymm2
  vandps  %ymm2, %ymm1, %ymm1
  vpmovzxbw %xmm0, %xmm3
  vpunpckhwd  %xmm0, %xmm3, %xmm3
  vpmovzxbd %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vandps  %ymm2, %ymm0, %ymm0
  vmovaps %ymm0, (%rdi)
  vmovaps %ymm1, 32(%rdi)
  vzeroupper
  ret

So instead we can check if there are legal types that enable us to split
more cleverly when the input vector is already legal such that we don't
turn it into an illegal type. If the extend is such that it's more than
doubling the size of the input we check if
  - the number of vector elements is even,
  - the source type is legal,
  - the type of a split source is illegal,
  - the type of an extended (by doubling element size) source is legal, and
  - the type of that extended source when split is legal.
If the conditions are met, instead of just splitting both the
destination and the source types, we create an extend that only goes up
one "step" (doubling the element width), and then continue legalizing the
rest of the operation normally. The result is that this operates as a
new, more efficient, termination condition for the loop of "split the
operation until the destination type is legal."
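
The conditions listed above can be sketched as a small predicate. This is a toy model under stated assumptions, not the legalizer's actual code: `VecTy`, `LegalSet`, `isLegal`, and `shouldExtendOneStep` are hypothetical names, and the target's legal vector types are represented as a plain set.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Toy vector type: NumElts elements of EltBits bits each.
// Hypothetical stand-in, not LLVM's EVT.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  bool operator<(const VecTy &O) const {
    return std::make_pair(NumElts, EltBits) <
           std::make_pair(O.NumElts, O.EltBits);
  }
};

// The set of vector types the (hypothetical) target treats as legal.
using LegalSet = std::set<VecTy>;

static bool isLegal(const LegalSet &Legal, VecTy T) {
  return Legal.count(T) != 0;
}

// Returns true when an extend from Src to Dst should first be performed as a
// single "one step" extend (doubling the element width) instead of splitting
// both the source and the destination.
bool shouldExtendOneStep(const LegalSet &Legal, VecTy Src, VecTy Dst) {
  // Only applies when the extend more than doubles the element size.
  if (Dst.EltBits <= 2 * Src.EltBits)
    return false;
  // The number of vector elements must be even.
  if (Src.NumElts % 2 != 0)
    return false;

  VecTy SplitSrc{Src.NumElts / 2, Src.EltBits};
  VecTy ExtSrc{Src.NumElts, Src.EltBits * 2};
  VecTy SplitExtSrc{Src.NumElts / 2, Src.EltBits * 2};

  return isLegal(Legal, Src) &&       // the source type is legal
         !isLegal(Legal, SplitSrc) && // a split source is illegal
         isLegal(Legal, ExtSrc) &&    // the one-step extended source is legal
         isLegal(Legal, SplitExtSrc); // and so is that type when split
}
```

With an assumed legal set of v16i8, v8i16, v16i16, v4i32, and v8i32, the predicate holds for the v16i8 to v16i32 zext from the example, and does not hold for a plain v16i8 to v16i16 extend, which only doubles the element width.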

With this change, the above example now compiles to:
_bar:
  vpxor %xmm1, %xmm1, %xmm1
  vpunpcklbw  %xmm1, %xmm0, %xmm2
  vpunpckhwd  %xmm1, %xmm2, %xmm3
  vpunpcklwd  %xmm1, %xmm2, %xmm2
  vinsertf128 $1, %xmm3, %ymm2, %ymm2
  vpunpckhbw  %xmm1, %xmm0, %xmm0
  vpunpckhwd  %xmm1, %xmm0, %xmm3
  vpunpcklwd  %xmm1, %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vmovaps %ymm0, 32(%rdi)
  vmovaps %ymm2, (%rdi)
  vzeroupper
  ret

This generalizes a custom lowering that was added a while back to the
ARM backend. That lowering is no longer necessary, and is removed. The
testcases for it, however, provide excellent ARM tests for this change
and so remain.

rdar://14735100

llvm-svn: 193727


Fix CodeGen for unaligned loads with address spaces · 2ba54c3d
Matt Arsenault authored Oct 30, 2013
```
llvm-svn: 193721
```

Oct 30, 2013

Produce .weak_def_can_be_hidden for some linkonce_odr values · 6f1b2852

Rafael Espindola authored Oct 30, 2013

With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr values
if they are also unnamed_addr or don't have their address taken.
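
The rule stated above can be written as a one-line predicate. This is only an illustration of the condition: `Linkage`, `GlobalInfo`, and `canBeHidden` are hypothetical names and do not mirror the actual LLVM data structures.

```cpp
#include <cassert>

// Hypothetical, reduced set of linkage kinds for illustration.
enum class Linkage { External, LinkOnceODR, WeakODR };

// Hypothetical summary of the properties the rule looks at.
struct GlobalInfo {
  Linkage L;
  bool UnnamedAddr;  // the value is marked unnamed_addr
  bool AddressTaken; // the address of the value is taken somewhere
};

// A linkonce_odr value can be emitted as hideable when its address is not
// significant: it is unnamed_addr, or its address is never taken.
bool canBeHidden(const GlobalInfo &G) {
  if (G.L != Linkage::LinkOnceODR)
    return false;
  return G.UnnamedAddr || !G.AddressTaken;
}
```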

There is not a lot of documentation about .weak_def_can_be_hidden, but
from the old discussion about linkonce_odr_auto_hide and the name of
the directive this looks correct: these symbols can be hidden.

Testing this with the ld64 in Xcode 5 linking clang reduces the number of
exported symbols from 21053 to 19049.

llvm-svn: 193718


DebugInfo: Push header handling down into CompileUnit · 6b288cfa

David Blaikie authored Oct 30, 2013

This is a preliminary step to handling type units by abstracting over
all (type or compile) units.

llvm-svn: 193714


DwarfDebug: Change Abbreviations member from pointer to reference · 2d4e1122
David Blaikie authored Oct 30, 2013
```
llvm-svn: 193699
```
Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." · 3bd686d4
Juergen Ributzka authored Oct 30, 2013
```
Now Hexagon and SystemZ are not happy with it :-(

llvm-svn: 193677
```

SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. · 6ad05d6b

Juergen Ributzka authored Oct 30, 2013

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is too wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask
usually has the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to successfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 193676


Reformat code with clang-format. · 7245f1d8

Josh Magee authored Oct 30, 2013

Differential Revision: http://llvm-reviews.chandlerc.com/D2057

llvm-svn: 193672


Debug Info: code clean up. · 251a1bd2

Manman Ren authored Oct 29, 2013

On Darwin platforms, use EmitLabelOffsetDifference where non-Darwin platforms
use EmitLabelPlusOffset.

Also fix a bug in EmitLabelOffsetDifference where the size is hard-coded
to 4 even though Size is passed in as an argument.

llvm-svn: 193660


Oct 29, 2013

Debug Info: support for DW_FORM_ref_addr. · ce20d460

Manman Ren authored Oct 29, 2013

To support ref_addr, we calculate the section offset of a DIE (i.e. offset
of a DIE from beginning of the debug info section). The Offset field in DIE
is currently CU-relative. To calculate the section offset, we add a
DebugInfoOffset field in CompileUnit to store the offset of a CU from beginning
of the debug info section. We set the value in DwarfUnits::computeSizeAndOffset
for each CompileUnit.

A helper function DIE::getCompileUnit is added to return the CU DIE that
the input DIE belongs to. We also add a map CUDieMap in DwarfDebug to help
find the CU for a given CU DIE.

For a cross-referenced DIE, we first find the CU DIE it belongs to with
getCompileUnit, then we use CUDieMap to get the corresponding CU for the CU DIE.
Adding the section offset of the CU with the CU-relative offset of a DIE gives
us the section offset of the DIE.
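
The offset arithmetic described above can be sketched as follows. This is a simplified model, not the DwarfDebug code: `UnitModel`, `DieModel`, `assignUnitOffsets`, and `sectionOffset` are hypothetical names, and unit sizes are assumed to be known up front.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a compile unit within the debug info section.
struct UnitModel {
  uint64_t Size;            // total size of this unit in bytes
  uint64_t DebugInfoOffset; // offset of the unit from the section start
};

// Hypothetical model of a DIE that knows its owning unit.
struct DieModel {
  const UnitModel *Unit; // unit the DIE belongs to
  uint64_t Offset;       // unit-relative offset of the DIE
};

// Lay the units out back to back and record where each one starts.
void assignUnitOffsets(std::vector<UnitModel> &Units) {
  uint64_t Running = 0;
  for (UnitModel &U : Units) {
    U.DebugInfoOffset = Running;
    Running += U.Size;
  }
}

// Section offset = start of the owning unit + unit-relative offset.
uint64_t sectionOffset(const DieModel &D) {
  return D.Unit->DebugInfoOffset + D.Offset;
}
```

For example, with two units of 100 and 60 bytes, a DIE at unit-relative offset 11 in the second unit has section offset 111.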

We correctly emit ref_addr with relocation using EmitLabelPlusOffset when
doesDwarfUseRelocationsAcrossSections is true.

This commit handles the emission of DW_FORM_ref_addr when we have an attribute
with FORM_ref_addr. A follow-on patch will start using ref_addr when adding a
DIEEntry. This commit will be tested and verified in the follow-on patch.

Reviewed off-list by Eric, Thanks.

llvm-svn: 193658


Debug Info: instead of calling addToContextOwner which constructs the context · f4c339e0

Manman Ren authored Oct 29, 2013

after the DIE creation, we construct the context first.

Ensure that we create the context before we create a type so that we can add
the newly created type to the parent. Remove last use of addToContextOwner
now that it's not needed.

We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs
should be added to their parents right after the creation.

Reviewed off-list by Eric, Thanks.

llvm-svn: 193657


[stackprotector] Update the StackProtector pass to perform datalayout analysis. · 3f1c0e35

Josh Magee authored Oct 29, 2013

This modifies the pass to classify every SSP-triggering AllocaInst according to
an SSPLayoutKind (LargeArray, SmallArray, AddrOf).  This analysis is collected
by the pass and made available for use, but no other pass uses it yet.

The next patch will make use of this analysis in PEI and StackSlot
passes.  The end goal is to support ssp-strong stack layout rules.

WIP.

Differential Revision: http://llvm-reviews.chandlerc.com/D1789

llvm-svn: 193653


Move getSymbol to TargetLoweringObjectFile. · e133ed88
Rafael Espindola authored Oct 29, 2013
```
This allows constructing a Mangler with just a TargetMachine.

llvm-svn: 193630
```
Add a helper getSymbol to AsmPrinter. · 79858aa3
Rafael Espindola authored Oct 29, 2013
```
llvm-svn: 193627
```

Debug Info: instead of calling addToContextOwner which constructs the context · f6b936bc

Manman Ren authored Oct 29, 2013

after the DIE creation, we construct the context first.

This touches creation of namespaces and global variables. The purpose is to
handle all DIE creations similarly: construct the context first, then create
the DIE and immediately add the DIE to its parent.

We use createAndAddDIE to wrap around "new DIE(".

llvm-svn: 193589


Fix "existant" typos · 6a033745
Alp Toker authored Oct 29, 2013
```
llvm-svn: 193579
```

Debug Info: use createAndAddDIE to wrap around "new DIE" in DwarfDebug. · 4a841a86

Manman Ren authored Oct 29, 2013

This commit ensures DIEs are constructed within a compile unit and
immediately added to their parents.

Reviewed off-list by Eric.

llvm-svn: 193568


Debug Info: use createAndAddDIE for newly-created Subprogram DIEs. · 73d697c6

Manman Ren authored Oct 29, 2013

More patches will be submitted to convert "new DIE(" to use createAndAddDIE in
DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where
we have to decide between ref4 and ref_addr, because DIEs that can be shared
across CU will be added to a CU already.

Reviewed off-list by Eric.

llvm-svn: 193567


Debug Info: add a helper function createAndAddDIE. · b987e517

Manman Ren authored Oct 29, 2013

It wraps around "new DIE(" and handles the bookkeeping part of the newly-created
DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes
sure that bookkeeping is done at the earliest time and we should not see
parentless DIEs if all constructions of DIEs go through this helper function.

Later on, we can use an allocator for DIE allocation, and will only need to
change createAndAddDIE instead of modifying all the "new DIE(".
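
The shape of such a helper might look like the sketch below. It is a minimal model of the idea only: the `Die` struct and `createAndAddDie` are hypothetical, and the insertDIE bookkeeping mentioned above is left out.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, minimal DIE: a tag, a parent link, and owned children.
struct Die {
  unsigned Tag;
  Die *Parent = nullptr;
  std::vector<std::unique_ptr<Die>> Children;
  explicit Die(unsigned T) : Tag(T) {}
};

// Create a DIE and attach it to its parent in one step, so callers never
// hold a parentless DIE. Allocation happens in exactly one place, which is
// what would make a later switch to a custom allocator a local change.
Die *createAndAddDie(unsigned Tag, Die &Parent) {
  auto Child = std::make_unique<Die>(Tag);
  Child->Parent = &Parent;
  Die *Raw = Child.get();
  Parent.Children.push_back(std::move(Child));
  return Raw;
}
```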

Reviewed off-list by Eric.

llvm-svn: 193566


Oct 28, 2013

[DAGCombiner] Respect volatility when checking for aliases · 981fdeb4

Richard Sandiford authored Oct 28, 2013

Making useAA() default to true for SystemZ showed that the combiner alias
analysis wasn't handling volatile accesses.  This hit many of the SystemZ
tests, but I arbitrarily picked one for the purpose of this patch.

llvm-svn: 193518


Keep TBAA info when rewriting SelectionDAG loads and stores · 39c1ce4d

Richard Sandiford authored Oct 28, 2013

Most SelectionDAG code drops the TBAA info when creating a new form of a
load and store (e.g. during legalization, or when converting a plain
load to an extending one). This patch tries to catch all cases where
the TBAA information can legitimately be carried over.

The patch adds alternative forms of getLoad() and getExtLoad() that take
a MachineMemOperand instead of individual fields. (The corresponding
getTruncStore() already exists.) The idea is to use the MachineMemOperand
forms when all fields are carried over (size, pointer info, isVolatile,
isNonTemporal, alignment and TBAA info). If some adjustment is being
made, e.g. to narrow the load, then we still pass the individual fields
but also pass the TBAA info.

llvm-svn: 193517


Oct 25, 2013

DIEHash: Summary hashing of member functions · 8bc7db77
David Blaikie authored Oct 25, 2013
```
llvm-svn: 193432
```
DIEHash: Summary hashing of nested types · 65cc969f
David Blaikie authored Oct 25, 2013
```
llvm-svn: 193427
```

LegalizeDAG: allow libcalls for max/min atomic operations · a564d329

Tim Northover authored Oct 25, 2013

ARM processors without ldrex/strex need to be able to make libcalls for all
atomic operations, including the newer min/max versions.

The alternative would probably be expanding these operations in terms of
cmpxchg (as x86 always does), but in the configurations where this matters,
code size tends to be paramount, so the libcall is more desirable.

llvm-svn: 193398


Optimize concat_vectors(X, undef) -> scalar_to_vector(X). · d369d4bd

Nadav Rotem authored Oct 25, 2013

This optimization is not SSE specific so I am moving it to DAGco.
The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add.

llvm-svn: 193393
