Commits · e891c5f2648c85a31c57c2cc63f76a5b95da86d3 · Roger Ferrer / llvm-epi-0.8

Jan 03, 2014

[AArch64][NEON] Added SXTL and SXTL2 instruction aliases · e891c5f2
Ana Pazos authored Jan 03, 2014
```
llvm-svn: 198437
```
e891c5f2

Revert "Revert "Debug Info: Type Units: Simplify type hashing using IR-provided unique names."" · cfb2115e

David Blaikie authored Jan 03, 2014

This reverts commit r198398, thus reapplying r198397.

I had accidentally introduced an endianness issue when applying the hash
to the type unit. Using support::ulittle64_t in the reinterpret_cast in
addDwarfTypeUnitType fixes this issue.

Original commit message:

Debug Info: Type Units: Simplify type hashing using IR-provided unique
names.

What's good for LTO metadata size problems ought to be good for non-LTO
debug info size too, so let's rely on the same uniqueness in both cases.
If it's insufficient for non-LTO for whatever reason (since we now won't
be uniquing CU-local types or any C types - but these are likely to not
be the most significant contributors to type bloat) we should consider a
frontend solution that'll help both LTO and non-LTO alike, rather than
using DWARF-level DIE-hashing that only helps non-LTO debug info size.

It's also much simpler this way and benefits C++ even more since we can
deduplicate lexically separate definitions of the same C++ type since
they have the same mangled name.

llvm-svn: 198436

cfb2115e
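
To illustrate the endianness issue mentioned in the reapplied commit above: reading eight hash bytes through a `reinterpret_cast` to a native 64-bit integer gives a value that depends on the host byte order, whereas assembling the value byte by byte does not. The following is only a sketch in plain C++. `loadLittle64` and `storeLittle64` are hypothetical helpers introduced for illustration; they are not LLVM's `support::ulittle64_t` and not the code in `addDwarfTypeUnitType`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: interpret 8 bytes as a little-endian 64-bit value.
// The result is the same on little-endian and big-endian hosts.
std::uint64_t loadLittle64(const std::uint8_t *in) {
  std::uint64_t value = 0;
  for (int i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(in[i]) << (8 * i);
  return value;
}

// Hypothetical helper: write a 64-bit value least-significant byte first.
void storeLittle64(std::uint8_t *out, std::uint64_t value) {
  for (int i = 0; i < 8; ++i)
    out[i] = static_cast<std::uint8_t>(value >> (8 * i));
}
```

A type with a fixed byte order, as the commit message describes, serves the same purpose as these helpers: the bytes are decoded in a defined order rather than in whatever order the host happens to use.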

Fix loop rerolling pass failure with non-constant loop lower bound · ea9ba446

David Peixotto authored Jan 03, 2014

The loop rerolling pass was failing with an assertion failure from a
failed cast on loops like this:

  void foo(int *A, int *B, int m, int n) {
    for (int i = m; i < n; i+=4) {
      A[i+0] = B[i+0] * 4;
      A[i+1] = B[i+1] * 4;
      A[i+2] = B[i+2] * 4;
      A[i+3] = B[i+3] * 4;
    }
  }

The code was casting the SCEV-expanded code for the new
induction variable to a phi-node. When the loop had a non-constant
lower bound, the SCEV expander would end the code expansion with an
add instead of a phi node and the cast would fail.

It looks like the cast to a phi node was only needed to get the
induction variable value coming from the backedge to compute the end
of loop condition. This patch changes the loop reroller to compare
the induction variable to the number of times the backedge is taken
instead of the iteration count of the loop. In other words, we stop
the loop when the current value of the induction variable ==
IterationCount-1. Previously, the comparison was comparing the
induction variable value from the next iteration == IterationCount.

This problem only seems to occur on 32-bit targets. For some reason,
the loop is not rerolled on 64-bit targets.

PR18290

llvm-svn: 198425

ea9ba446
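
As a rough illustration of the loop rerolling commit above, the sketch below places the manually unrolled loop from the commit message next to a hand-written rerolled equivalent whose exit test compares the current induction variable with `IterationCount - 1` (the backedge-taken count). This is not the pass's output; `fooRerolled`, `sameResult`, and the variable names are assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

// The manually unrolled loop from the commit message.
void fooUnrolled(int *A, const int *B, int m, int n) {
  for (int i = m; i < n; i += 4) {
    A[i + 0] = B[i + 0] * 4;
    A[i + 1] = B[i + 1] * 4;
    A[i + 2] = B[i + 2] * 4;
    A[i + 3] = B[i + 3] * 4;
  }
}

// Hand-written rerolled equivalent (illustration only, not compiler output).
void fooRerolled(int *A, const int *B, int m, int n) {
  if (m >= n)
    return; // the original loop body never runs
  int originalTrips = (n - m + 3) / 4;    // trips of the unrolled loop
  int iterationCount = originalTrips * 4; // trips of the rerolled loop
  int backedgeTakenCount = iterationCount - 1;
  for (int iv = 0;; ++iv) {
    A[m + iv] = B[m + iv] * 4;
    // Exit test on the current induction variable value.
    if (iv == backedgeTakenCount)
      break;
  }
}

// Runs both versions on identical inputs (0 <= m, 0 <= n) and reports
// whether they wrote the same values.
bool sameResult(int m, int n) {
  const int size = n + 8; // room for the last group of four stores
  std::vector<int> B(size), A1(size, -1), A2(size, -1);
  for (int i = 0; i < size; ++i)
    B[i] = i + 1;
  fooUnrolled(A1.data(), B.data(), m, n);
  fooRerolled(A2.data(), B.data(), m, n);
  return A1 == A2;
}
```

Because the exit test uses the current induction variable value, no value from the next iteration is needed, which is the dependency on the phi node that the commit message says was removed.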

BasicAA: Use reachability instead of dominance for checking value equality in phi cycles · 833a82ec

Arnold Schwaighofer authored Jan 03, 2014

This allows the value equality check to work even if we don't have a dominator
tree. Also add some more comments.

I was worried about compile time impacts and did not implement reachability but
used the dominance check in the initial patch. The trade-off was that the
dominator tree was required.
The llvm utility function isPotentiallyReachable cuts off the recursive search
after 32 visits. Testing did not show any compile time regressions, so my
worries turned out to be unjustified.

No compile time or performance regressions at O3 -flto -mavx on test-suite +
externals.

Addresses review comments from r198290.

llvm-svn: 198400

833a82ec
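
The BasicAA commit above relies on a reachability query whose search is cut off after a fixed number of visits. A minimal sketch of that idea follows, assuming a plain adjacency-list graph: once the visit budget is used up, the query answers "potentially reachable", which is the conservative result. `Graph`, `potentiallyReachable`, and `makeChainWithIsolatedNode` are hypothetical names for this example and do not reproduce LLVM's `isPotentiallyReachable`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical graph type: successors[b] lists the blocks reachable in one
// step from block b.
using Graph = std::vector<std::vector<int>>;

// Returns false only when the search proves that `to` cannot be reached from
// `from`. If the visit budget runs out first, it conservatively returns true.
bool potentiallyReachable(const Graph &successors, int from, int to,
                          int budget = 32) {
  std::vector<bool> seen(successors.size(), false);
  std::vector<int> worklist{from};
  while (!worklist.empty()) {
    int block = worklist.back();
    worklist.pop_back();
    if (seen[block])
      continue;
    seen[block] = true;
    if (block == to)
      return true;
    if (budget-- == 0)
      return true; // out of budget: give the conservative answer
    for (int next : successors[block])
      worklist.push_back(next);
  }
  return false; // search finished without finding `to`
}

// Builds the chain 0 -> 1 -> ... -> (length - 1) plus an isolated node
// numbered `length`.
Graph makeChainWithIsolatedNode(int length) {
  Graph g(length + 1);
  for (int i = 0; i + 1 < length; ++i)
    g[i].push_back(i + 1);
  return g;
}
```

The cutoff trades precision for bounded compile time: on a large graph the query may report "reachable" for a node that is not, which is safe for an alias analysis but can make it less precise.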

Revert "Debug Info: Type Units: Simplify type hashing using IR-provided unique names." · ab0ba249
David Blaikie authored Jan 03, 2014
```
Reverting due to bot failure I won't have time to investigate until
tomorrow.

This reverts commit r198397.

llvm-svn: 198398
```
ab0ba249

Debug Info: Type Units: Simplify type hashing using IR-provided unique names. · ddb66281

David Blaikie authored Jan 03, 2014

What's good for LTO metadata size problems ought to be good for non-LTO
debug info size too, so let's rely on the same uniqueness in both cases.
If it's insufficient for non-LTO for whatever reason (since we now won't
be uniquing CU-local types or any C types - but these are likely to not
be the most significant contributors to type bloat) we should consider a
frontend solution that'll help both LTO and non-LTO alike, rather than
using DWARF-level DIE-hashing that only helps non-LTO debug info size.

It's also much simpler this way and benefits C++ even more since we can
deduplicate lexically separate definitions of the same C++ type since
they have the same mangled name.

llvm-svn: 198397

ddb66281

80-column. · 4d214b9e
Eric Christopher authored Jan 03, 2014
```
llvm-svn: 198394
```
4d214b9e
Remove TextSectionSym as it is unused. · 50effa04
Eric Christopher authored Jan 03, 2014
```
llvm-svn: 198393
```
50effa04

Revert "Reverting r193835 due to weirdness with Go..." · 22b29a5f

David Blaikie authored Jan 03, 2014

The cgo problem was that it wants dwarf2 which doesn't support direct
constant encoding of the location. So let's add support for dwarf2
encoding (using a location expression) of data member locations.

This reverts commit r198385.

llvm-svn: 198389

22b29a5f

Reverting r193835 due to weirdness with Go... · 2ada116a

David Blaikie authored Jan 03, 2014

Apologies for the noise - we're seeing some Go failures with cgo
interacting with Clang's debug info due to this change.

llvm-svn: 198385

2ada116a

Jan 02, 2014

[RegAlloc] Make tryInstructionSplit less aggressive. · 1fb3362a

Quentin Colombet authored Jan 02, 2014

The greedy register allocator tries to split a live-range around each
instruction where it is used or defined to relax the constraints on the entire
live-range (this is a last chance split before falling back to spill).
The goal is to have a big live-range that is unconstrained (i.e., that can use
the largest legal register class) and several small local live-ranges that carry
the constraints implied by each instruction.
E.g.,
Let csti be the constraints on operation i.

V1=
op1 V1(cst1)
op2 V1(cst2)

V1 live-range is constrained on the intersection of cst1 and cst2.

tryInstructionSplit relaxes those constraints by aggressively splitting each
def/use point:
V1=
V2 = V1
V3 = V2
op1 V3(cst1)
V4 = V2
op2 V4(cst2)

Because of how the coalescer infrastructure works, each new variable (V3, V4)
that is alive at the same time as V1 (or its copy, here V2) interferes with V1.
Thus, we end up with an uncoalescable copy for each split point.

To make tryInstructionSplit less aggressive, we check if the split point
actually relaxes the constraints on the whole live-range. If it does not, we do
not insert it.
Indeed, it will not help the global allocation problem:
- V1 will have the same constraints.
- V1 will have the same interference + possibly the newly added split variable
  VS.
- VS will produce an uncoalescable copy if alive at the same time as V1.

<rdar://problem/15570057>

llvm-svn: 198369

1fb3362a

[PPC] Fix comment to match function name · 860fa905
Hal Finkel authored Jan 02, 2014
```
llvm-svn: 198362
```
860fa905
Remove comments on CU skeleton construction, they're probably obvious. · 94932438
Eric Christopher authored Jan 02, 2014
```
llvm-svn: 198361
```
94932438

[PPC] Fix the scheduling of CR logicals on the P7 · 1d429f2e

Hal Finkel authored Jan 02, 2014

CR logicals (crand, crxor, etc.) on the P7 need to be in the first slot of each
dispatch group. The old itinerary entry was just wrong (but has not mattered
because we don't generate these instructions).

This will matter when, in an upcoming commit, we start generating these
instructions.

llvm-svn: 198359

1d429f2e

Elaborate on comment for skeleton CU construction. · d8beca3b
Eric Christopher authored Jan 02, 2014
```
llvm-svn: 198358
```
d8beca3b
Revert seemingly unnecessary section sym for the data section. · 40734c4c
Eric Christopher authored Jan 02, 2014
```
llvm-svn: 198357
```
40734c4c

[PPC] Use the correct immediate operands on 64-bit instructions · 77c8dc1d

Hal Finkel authored Jan 02, 2014

Several of the 64-bit fixed-point instructions with immediate operands were
using the 32-bit (i32) operand nodes instead of the corresponding 64-bit (i64)
operand definitions (u16imm instead of u16imm64, for example).

This error has had no effect so far, but would have caused type-checking
violations with an upcoming change.

llvm-svn: 198356

77c8dc1d

Disable compare sinking in CodeGenPrepare when multiple condition registers are available · decb024c

Hal Finkel authored Jan 02, 2014

As noted in the comment above CodeGenPrepare::OptimizeInst, which aggressively
sinks compares to reduce pressure on the condition register(s), for targets
such as PowerPC with multiple condition registers, this may not be the right
thing to do. This adds a HasMultipleConditionRegisters boolean to TLI, and
CodeGenPrepare::OptimizeInst is skipped when HasMultipleConditionRegisters is
true.

This functionality will be used by the PowerPC backend in an upcoming commit.
Especially when the PowerPC backend starts tracking individual condition
register bits as separate allocatable entities (which will happen in this
upcoming commit), this sinking from CodeGenPrepare::OptimizeInst is
significantly suboptimal.

llvm-svn: 198354

decb024c

indvars: cleanup the IV visitor. It does more than gather sext/zext info. · b6bc7830
Andrew Trick authored Jan 02, 2014
```
llvm-svn: 198353
```
b6bc7830

Fix up a couple of review comments: · d4368fde

Eric Christopher authored Jan 02, 2014

Use an if statement instead of a pair of ternary operators checking
the same condition.
Use a cheap method call rather than returning the local symbol.

llvm-svn: 198351

d4368fde

Simplify conditional. · 8bdb6e1d
Eric Christopher authored Jan 02, 2014
```
llvm-svn: 198350
```
8bdb6e1d
Allow addrspacecast in global aliases · 00436ea1
Matt Arsenault authored Jan 02, 2014
```
llvm-svn: 198349
```
00436ea1

[TableGen] Correctly generate implicit anonymous prototype defs in multiclasses · a8c1f467

Hal Finkel authored Jan 02, 2014

Even within a multiclass, we had been generating concrete implicit anonymous
defs when parsing values (generally in value lists). This behavior was
incorrect, and led to errors when multiclass parameters were used in the
parameter list of the implicit anonymous def.

If we had some multiclass:

multiclass mc<string n> {

 ... : SomeClass<SomeOtherClass<n> >

The capture of the multiclass parameter 'n' would not work correctly, and
depending on how the implicit SomeOtherClass was used, either TableGen would
ignore something it shouldn't, or would crash.

To fix this problem, when inside a multiclass, we generate prototype anonymous
defs for implicit anonymous defs (just as we do for explicit anonymous defs).
Within the multiclass, the current record prototype is populated with a node
that is essentially: !cast<SomeOtherClass>(!strconcat(NAME, anon_value_name)).
This is then resolved to the correct concrete anonymous def, in the usual way,
when NAME is resolved during multiclass instantiation.

llvm-svn: 198348

a8c1f467

Delete unread globals through addrspacecast · 461c8e0a
Matt Arsenault authored Jan 02, 2014
```
llvm-svn: 198346
```
461c8e0a
Fix addrspacecast with metadata globals · da1deabb
Matt Arsenault authored Jan 02, 2014
```
llvm-svn: 198345
```
da1deabb
Remove redundant fold call introduced in r195944. Thanks very much to Juergen for pointing this out. · 8e6e6abf
Lang Hames authored Jan 02, 2014
```
llvm-svn: 198341
```
8e6e6abf

[TableGen] Use the same anonymous name as the prefix on all multiclass defs · f2a0b2b3

Hal Finkel authored Jan 02, 2014

TableGen had been generating a different name for an anonymous multiclass's
NAME for every def in the multiclass. This had an unfortunate side effect: it
was impossible to reference one def within the multiclass from another (in the
parameter list, for example). By making sure we only generate an anonymous name
once per multiclass (which, as it turns out, requires only changing the name
parameter to reference type), we can now concatenate NAME within the multiclass
with a def name in order to generate a reference to that def.

This does not matter so much, in and of itself, but is necessary for a
follow-up commit that will fix variable capturing in implicit anonymous
multiclass defs (and that is important).

llvm-svn: 198340

f2a0b2b3

indvars: insert truncate at loop boundary to avoid redundant IVs. · 020dd898

Andrew Trick authored Jan 02, 2014

When widening an IV to remove s/zext, we generally try to eliminate
the original narrow IV. However, LCSSA phi nodes outside the loop were
still using the original IV. Clean this up more aggressively to avoid
redundancy in generated code.

llvm-svn: 198338

020dd898

Mark REX64_PREFIX as In64BitMode, remove hack from X86RecognizableInstr. · 66c20f34
Craig Topper authored Jan 02, 2014
```
llvm-svn: 198336
```
66c20f34
Make llvm::Regex non-copyable but movable. · 7a238048
David Blaikie authored Jan 02, 2014
```
Based on a patch by Maciej Piechotka.

llvm-svn: 198334
```
7a238048
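
For readers unfamiliar with the "non-copyable but movable" pattern named in the commit above, here is a generic C++11 sketch of a class that owns a resource, forbids copying, and transfers ownership on move. `Handle` is a hypothetical class for illustration; it is not the `llvm::Regex` implementation, which may express the same idea differently.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical resource-owning class: copying is disabled, moving transfers
// ownership and leaves the source empty.
class Handle {
public:
  explicit Handle(int id) : resource(new int(id)) {}
  ~Handle() { delete resource; }

  Handle(const Handle &) = delete;
  Handle &operator=(const Handle &) = delete;

  Handle(Handle &&other) noexcept : resource(other.resource) {
    other.resource = nullptr;
  }
  Handle &operator=(Handle &&other) noexcept {
    if (this != &other) {
      delete resource;
      resource = other.resource;
      other.resource = nullptr;
    }
    return *this;
  }

  bool valid() const { return resource != nullptr; }
  int id() const { return *resource; } // only call when valid()

private:
  int *resource;
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "Handle must not be copyable");
static_assert(std::is_move_constructible<Handle>::value,
              "Handle must be movable");
```

Disabling copies avoids two objects freeing the same resource, while the move operations still allow such an object to be returned from functions or stored in containers.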
Revert "Debug info: Add enumerators to the __apple_names accelerator table." · fd3279f2
Adrian Prantl authored Jan 02, 2014
```
This reverts r197927 until the discussion on llvm-commits comes to a
conclusion.

llvm-svn: 198333
```
fd3279f2
Mark PUSHFS64/PUSHGS64/POPFS64/POPGS64 as In64BitMode and remove the hack from the disassembler table builder. · eabdbcb8
Craig Topper authored Jan 02, 2014
```
llvm-svn: 198327
```
eabdbcb8

Mark all x86 Int_ and _Int patterns as isCodeGenOnly so the disassembler table builder doesn't need to string match them to exclude them. · 9dd48c8e

Craig Topper authored Jan 02, 2014

llvm-svn: 198323

9dd48c8e

[arm] Add softvfp to supported FPU names. · 05ae7448
Logan Chien authored Jan 02, 2014
```
llvm-svn: 198313
```
05ae7448

Make the ARM ABI selectable via SubtargetFeature. · d89b16dc

Rafael Espindola authored Jan 02, 2014

This patch makes it possible to select the ABI with -mattr. It will be used to
forward clang's -target-abi option to llvm's CodeGen.

llvm-svn: 198304

d89b16dc

BasicAA: Fix value equality and phi cycles · 0d10a9d5

Arnold Schwaighofer authored Jan 02, 2014

When there are cycles in the value graph we have to be careful interpreting
"Value*" identity as "value" equivalence. We interpret the value of a phi node
as the value of its operands.
When we check for value equivalence now we make sure that the "Value*" dominates
all cycles (phis).

%0 = phi [%noaliasval, %addr2]
%l = load %ptr
%addr1 = gep @a, 0, %l
%addr2 = gep @a, 0, (%l + 1)
store %ptr ...

Before this patch we would return NoAlias for (%0, %addr1) which is wrong
because the value of the load is from different iterations of the loop.

Tested on x86_64 -mavx at O3 and O3 -flto with no performance or compile time
regressions.

PR18068
radar://15653794

llvm-svn: 198290

0d10a9d5

Jan 01, 2014

Remove the 's' DataLayout specification · 6994fdf3

Rafael Espindola authored Jan 01, 2014

Over the years there have been several attempts at figuring out how to
align byval arguments. A look at the commit log suggests that they
were:

* Use the ABI alignment.
* When that was not sufficient for x86-64, I added the 's' specification to
  DataLayout.
* When that was not sufficient, Evan added the virtual getByValTypeAlignment.
* When even that was not sufficient, we just got the FE to add the alignment
  to the byval.

This patch is just a simple cleanup that removes my first attempt at fixing the
problem. I also added an AArch64 implementation of getByValTypeAlignment to
make sure this patch is a nop. I also left the 's' parsing for backward
compatibility.

I will send a short email to llvmdev about the change for anyone maintaining
an out of tree target.

llvm-svn: 198287

6994fdf3

[Sparc] Handle atomic loads/stores in sparc backend. · 9a3da52e
Venkatraman Govindaraju authored Jan 01, 2014
```
llvm-svn: 198286
```
9a3da52e

Remove modifierType/Base from X86 disassembler tables as they are no longer used. · 3321c99a

Craig Topper authored Jan 01, 2014

Removes ~11.5K from static tables.

llvm-svn: 198284

3321c99a

[SparcV9]: Custom lower UMULO/SMULO so that the arguments are sent to __multi3() in the correct order. · 77011e86
Venkatraman Govindaraju authored Jan 01, 2014
```
llvm-svn: 198281
```
77011e86