Commits · 06f621d3496ed2549edcc763403a64812d35e2a0 · Roger Ferrer / llvm-epi

Aug 04, 2014

Don't destroy MacroInfos if we find the macro definition is invalid; it'll get · 06f621d3
Richard Smith authored Aug 03, 2014
```
destroyed on shutdown regardless. Fixes a double-delete.

llvm-svn: 214675
```
Account for possible leading '.' in label string. · 065cabf4
Sanjay Patel authored Aug 03, 2014
```
llvm-svn: 214674
```

[x86] Don't add nodes to the combined set (and prune subsequent · cde4eb56

Chandler Carruth authored Aug 03, 2014

combines) until they are legal.

Doing it the old way could, when the stars align *just* right, cause
a node to get into the combine set prior to being legalized. Then, when
the same node showed up as an operand to another node later on (but not
so much later on that it had been deleted as dead) we would fail to add
it back to the worklist thinking it had already been combined. This
would in turn cause it to not be legalized. Fortunately, we can also
walk the operands looking for uncombined (and thus potentially
un-legalized) nodes late. It will still ensure that we walk all operands
of all nodes and send all of them through both the legalizer without
changes and the combiner at least once. (Which was the original goal of
this).

I have a test case for this bug, but it is terribly brittle. For
example, it will stop finding the bug the moment I enable the new
shuffle lowering. I don't yet have any test case that reliably exercises
this bug, and it isn't clear that it will be possible to craft one. It
is entirely possible that with the new shuffle lowering the two forms of
doing this are precisely equivalent. That doesn't mean we shouldn't take
the more conservative approach of insisting on things in the combined
set having survived the legalizer.

llvm-svn: 214673

X86: silence warning (-Wparentheses) · 557023e3

Saleem Abdulrasool authored Aug 03, 2014

GCC 4.8.2 points out the ambiguity in evaluation of the assertion condition:

lib/Target/X86/X86FloatingPoint.cpp:949:49: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
assert(STReturns == 0 || isMask_32(STReturns) && N <= 2);

llvm-svn: 214672
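
The warning concerns readability, not behavior: in C++ `&&` binds tighter than `||`, so the condition above already parses as `A || (B && C)`. A minimal sketch of the explicit grouping next to the other possible reading follows. It assumes a hypothetical `isMask32` helper that mirrors what LLVM's `isMask_32` checks. The function names are illustrative, and the committed fix itself is not shown in this log.

```cpp
#include <cstdint>

// Hypothetical stand-in for LLVM's isMask_32: true if Value is a non-empty
// run of ones starting at bit 0 (for example 0x1, 0x3, 0x7).
constexpr bool isMask32(std::uint32_t Value) {
  return Value != 0 && ((Value + 1) & Value) == 0;
}

// Explicit grouping that matches how the compiler already parses the
// unparenthesized condition.
constexpr bool checkExplicit(std::uint32_t STReturns, unsigned N) {
  return STReturns == 0 || (isMask32(STReturns) && N <= 2);
}

// The other possible reading, which gives a different answer for some inputs.
constexpr bool checkOtherGrouping(std::uint32_t STReturns, unsigned N) {
  return (STReturns == 0 || isMask32(STReturns)) && N <= 2;
}
```

With `STReturns == 0` and `N == 5`, the explicit form is true while the other grouping is false, which is the ambiguity the warning asks the author to rule out.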

CodeGen: silence a warning · befa2153

Saleem Abdulrasool authored Aug 03, 2014

GCC 4.8.2 objects to the tautological condition in the assert as the unsigned
value is guaranteed to be >= 0.  Simplify the assertion by dropping the
tautological condition.

llvm-svn: 214671
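
The assertion itself is not quoted in this log, so the following is only a sketch of the general pattern, with hypothetical names (`inRange`, `checkedIndex`). For an unsigned value, `Index >= 0` is always true, so only the upper bound carries information. A negative value converted to unsigned becomes very large and is caught by that upper bound.

```cpp
#include <cassert>

// Hypothetical bounds check, not the assertion changed in this commit.
// Tautological form for unsigned operands: Index >= 0 && Index < Size.
// Simplified form: only the upper bound is needed.
inline bool inRange(unsigned Index, unsigned Size) { return Index < Size; }

// Example use of the simplified condition inside an assert.
inline unsigned checkedIndex(unsigned Index, unsigned Size) {
  assert(inRange(Index, Size) && "index out of range");
  return Index;
}
```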

fix for PR20354 - Miscompile of fabs due to vectorization · 2ef67440

Sanjay Patel authored Aug 03, 2014

This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation.

This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon.

There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too.

llvm-svn: 214670
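
For context, the 'and' op mentioned above clears the sign bit of each element. A scalar sketch of that masking follows, assuming a 32-bit IEEE-754 `float`. The function name `fabsViaMask` is hypothetical, and this is not the DAG combine code from the patch.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Clear the sign bit (bit 31) of a 32-bit float by and-ing its bit pattern
// with a constant mask, the scalar analogue of the vector 'and'.
inline float fabsViaMask(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);  // reinterpret without aliasing issues
  Bits &= 0x7FFFFFFFu;                  // keep exponent and mantissa only
  std::memcpy(&X, &Bits, sizeof X);
  return X;
}
```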

MachineCombiner Pass for selecting faster instruction · 35ba4671

Gerolf Hoflehner authored Aug 03, 2014

 sequence -  AArch64 target support

 This patch turns off madd/msub generation in the DAGCombiner and generates
 them in the MachineCombiner instead. It replaces the original code sequence
 with the combined sequence when it is beneficial to do so.

 When there is no machine model support it always generates the madd/msub
 instruction. This is also true when the objective is to optimize for code
 size: when the combined sequence is shorter, it is always chosen and is not
 evaluated.

 When there is a machine model the combined instruction sequence
 is evaluated for critical path and resource length using machine
 trace metrics and the original code sequence is replaced when it is
 determined to be faster.

 rdar://16319955

llvm-svn: 214669

Aug 03, 2014

Driver: Simplify a use of the path API · 6bcf724f

Justin Bogner authored Aug 03, 2014

It's a bit more obvious what's going on if we use path::filename
rather than decrementing an iterator here.

llvm-svn: 214668

Change ProcessGDBRemote::DidLaunchOrAttach to · 921c01b5

Jason Molenda authored Aug 03, 2014

call Target::SetArchitecture instead of modifying a
reference to the target's architecture so that the
target logging can show that the arch has been changed.

llvm-svn: 214667

MachineCombiner Pass for selecting faster instruction · 5e1207e5

Gerolf Hoflehner authored Aug 03, 2014

 sequence -  target independent framework

 When the DAGCombiner selects instruction sequences
 it could increase the critical path or resource length.

 For example, on arm64 there are multiply-accumulate instructions (madd,
 msub). If, e.g., the equivalent multiply-add sequence is not on the
 critical path, it makes sense to select it instead of the combined,
 single accumulate instruction (madd/msub). The reason is that the
 conversion from add+mul to the madd could lengthen the critical path
 by the latency of the multiply.

 But the DAGCombiner would always combine and select the madd/msub
 instruction.

 This patch uses machine trace metrics to estimate critical path length
 and resource length of an original instruction sequence vs a combined
 instruction sequence and picks the faster code based on its estimates.

 This patch only commits the target independent framework that evaluates
 and selects code sequences. The machine instruction combiner is turned
 off for all targets and is expected to evolve over time by gradually
 handling DAGCombiner patterns in the target specific code.

 This framework lays the groundwork for fixing
 rdar://16319955

llvm-svn: 214666
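
The critical-path argument in this message can be made concrete with a toy latency model. Everything here is an illustrative assumption and not taken from any real machine model or from the patch: the latencies are made up, the names are hypothetical, and the fused instruction is assumed to issue only once all three operands are ready.

```cpp
#include <algorithm>

// Toy model: all times are in cycles and all latencies are hypothetical.
struct Latencies {
  unsigned Mul;   // separate multiply
  unsigned Add;   // separate add
  unsigned MAdd;  // fused multiply-add
};

// Finish time of "mul; add": the multiply starts as soon as its two inputs
// are ready, so it can overlap with the wait for a late addend.
inline unsigned finishSeparate(unsigned MulInputsReady, unsigned AddendReady,
                               Latencies L) {
  unsigned MulDone = MulInputsReady + L.Mul;
  return std::max(MulDone, AddendReady) + L.Add;
}

// Finish time of "madd": assumed to wait for all three operands.
inline unsigned finishFused(unsigned MulInputsReady, unsigned AddendReady,
                            Latencies L) {
  return std::max(MulInputsReady, AddendReady) + L.MAdd;
}
```

With hypothetical latencies `{Mul = 4, Add = 1, MAdd = 4}`, multiply inputs ready at cycle 0 and the addend ready at cycle 10, the separate sequence finishes at cycle 11 and the fused form at cycle 14. If the addend is also ready at cycle 0, the fused form wins (cycle 4 versus cycle 5).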

Do allow negative offsets in the outermost array dimension · f57d63f9

Tobias Grosser authored Aug 03, 2014

There is no need for either 1-dimensional or higher-dimensional arrays to
require positive offsets in the outermost array dimension.

We originally introduced this assumption with the support for delinearizing
multi-dimensional arrays.

llvm-svn: 214665

MC: virtualise EmitWindowsUnwindTables · 4544c16e

Saleem Abdulrasool authored Aug 03, 2014

This makes EmitWindowsUnwindTables a virtual function and lowers the
implementation of the function to the X86WinCOFFStreamer.  This method is a
target specific operation.  This enables making the behaviour target dependent
by isolating it entirely to the target specific streamer.

llvm-svn: 214664

MC: rename Win64EHFrameInfo to WinEH::FrameInfo · b3be7371

Saleem Abdulrasool authored Aug 03, 2014

The frame information stored in this structure is driven by the requirements for
Windows NT unwinding rather than Windows 64 specifically.  As a result, this
type can be shared across multiple architectures (ARM, AXP, MIPS, PPC, SH).
Rename this class in preparation for adding support for unwinding
information for Windows on ARM.

Take the opportunity to constify the members as everything except the
ChainedParent is read-only.  This required some adjustment to the label
handling.

llvm-svn: 214663

[Mips] Add the `mips64-linux-gnu` target to the test case to check `int128` · 3ab94b91
Simon Atanasyan authored Aug 03, 2014
```
type handling.

llvm-svn: 214662
```

R600/SI: Fix extra whitespace in asm str · 9215b17e

Matt Arsenault authored Aug 03, 2014

This slipped in in r214467, so something like

V_MOV_B32_e32  v0, ... is now printed with 2 spaces
between the instruction name and first operand.

llvm-svn: 214660

Fix the modifiable access creation · a63b2579

Johannes Doerfert authored Aug 03, 2014

  + Remove the class IslGenerator which duplicates the functionality of
    IslExprBuilder.
  + Use the IslExprBuilder to create code for memory access relations.
    + Also handle array types during access creation.
  + Enable scev codegen for one of the transformed memory access tests,
    thus access creation without canonical induction variables available.
  + Update one test case to the new output.

llvm-svn: 214659

Allow the IslExprBuilder to generate access operations · ed878311
Johannes Doerfert authored Aug 03, 2014
```
llvm-svn: 214658
```

Update the jscop tests and port them to isl codegen. · b5d1c322

Johannes Doerfert authored Aug 03, 2014

  The updated tests use a different context than the old ones did.
  Other than that only their path and the code generation we use
  changed.

llvm-svn: 214657

Tools.cpp: Avoid std::to_string() on -fbuild-session-timestamp to appease mingw32 builder. · 0f9447d1
NAKAMURA Takumi authored Aug 03, 2014
```
llvm-svn: 214656
```

[SimplifyCFG] fix accessing deleted PHINodes in switch-to-table conversion. · 062f58d5

Manman Ren authored Aug 02, 2014

When we have a covered lookup table, make sure we don't delete PHINodes that
are cached in PHIs.

rdar://17887153

llvm-svn: 214642

Aug 02, 2014

[Mips] Replace assembler code by YAML to make the 'gotsym.test' test · 0670abdd
Simon Atanasyan authored Aug 02, 2014
```
target independent.

llvm-svn: 214641
```
tlbia support · c03105ba
Joerg Sonnenberger authored Aug 02, 2014
```
llvm-svn: 214640
```
mfdcr / mtdcr support · e8a167ce
Joerg Sonnenberger authored Aug 02, 2014
```
llvm-svn: 214639
```
fix bug 20513 - Crash in SLP Vectorizer · 26a1bf7d
Erik Eckstein authored Aug 02, 2014
```
llvm-svn: 214638
```
Update test to use a more modern AArch64 triple, as requested by Renato. · 6b999ae6
James Molloy authored Aug 02, 2014
```
llvm-svn: 214637
```

Don't use additional arguments for dss and friends to satisfy DSS_Form, · 99ab590a

Joerg Sonnenberger authored Aug 02, 2014

when let can do the same thing. Keep the 64bit variants as codegen-only.
While they have a different register class, the encoding is the same for
32bit and 64bit mode. Having both present would otherwise confuse the
disassembler.

llvm-svn: 214636

vcfsx and dss instructions require immediates, variables are not valid. · 466a31eb
Joerg Sonnenberger authored Aug 02, 2014
```
llvm-svn: 214635
```

[AArch64] Teach DAGCombiner that converting two consecutive loads into a... · ce45be04

James Molloy authored Aug 02, 2014

[AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available.

The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers!

llvm-svn: 214634

Mark a GPGPU test case as XFAIL · 8c112d83

Tobias Grosser authored Aug 02, 2014

This area of code is currently not well tested. It will hopefully be
superseded by Yabin's GSoC project.

llvm-svn: 214633

No need to run -mem2reg twice · 5b5fd4e2
Tobias Grosser authored Aug 02, 2014
```
llvm-svn: 214632
```
[x86] Remove the FIXME that was implemented in r214628. Managed to · 16c13cad
Chandler Carruth authored Aug 02, 2014
```
forget to update the comment here... =/

llvm-svn: 214630
```

[x86] Give this test a bare metal triple so it doesn't use the weird · bec57b40

Chandler Carruth authored Aug 02, 2014

Darwin x86 asm comment prefix designed to work around GAS on that
platform. That makes the comment-matching of the test much more stable.

llvm-svn: 214629

[x86] Largely complete the use of PSHUFB in the new vector shuffle · 4c57955f

Chandler Carruth authored Aug 02, 2014

lowering with a small addition to it and adding PSHUFB combining.

There is one obvious place in the new vector shuffle lowering where we
should form PSHUFBs directly: when without them we will unpack a vector
of i8s across two different registers and do a potentially 4-way blend
as i16s only to re-pack them into i8s afterward. This is the crazy
expensive fallback path for i8 shuffles and we can just directly use
pshufb here as it will always be cheaper (the unpack and pack are
two instructions so even a single shuffle between them hits our
three instruction limit for forming PSHUFB).

However, this doesn't generate very good code in many cases, and it
leaves a bunch of common patterns not using PSHUFB. So this patch also
adds support for extracting a shuffle mask from PSHUFB in the X86
lowering code, and uses it to handle PSHUFBs in the recursive shuffle
combining. This allows us to combine through them, combine multiple ones
together, and generally produce sufficiently high quality code.

Extracting the PSHUFB mask is annoyingly complex because it could be
either pre-legalization or post-legalization. At least this doesn't have
to deal with re-materialized constants. =] I've added decode routines to
handle the different patterns that show up at this level and we dispatch
through them as appropriate.

The two primary test cases are updated. For the v16 test case there is
still a lot of room for improvement. Since I was going through it
systematically I left behind a bunch of FIXME lines that I'm hoping to
turn into ALL lines by the end of this.

llvm-svn: 214628
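
To illustrate what extracting a shuffle mask from PSHUFB involves at the byte level, here is a minimal sketch of decoding the control bytes of the 128-bit instruction: a set high bit zeroes the lane, otherwise the low four bits select a source byte. The function name and the `ZeroLane` sentinel are hypothetical and are not the decode routines added by the patch. The sketch also assumes the control bytes are already available as plain constants, which sidesteps the pre- and post-legalization handling the message describes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sentinel meaning "this output lane is zeroed".
constexpr int ZeroLane = -1;

// Decode PSHUFB control bytes into a shuffle mask. For each control byte:
// high bit set -> lane is zero; otherwise low 4 bits index the source byte.
inline std::vector<int>
decodePshufbControl(const std::vector<std::uint8_t> &Control) {
  std::vector<int> Mask;
  Mask.reserve(Control.size());
  for (std::uint8_t C : Control)
    Mask.push_back((C & 0x80) ? ZeroLane : (C & 0x0F));
  return Mask;
}
```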

[x86] Switch to using the variable we extracted this operand into. · d10b2924

Chandler Carruth authored Aug 02, 2014

Spotted this missed refactoring by inspection when reading code, and it
doesn't change the functionality at all.

llvm-svn: 214627

[x86] Fix a few typos in my comments spotted in passing. · 5219d4ef
Chandler Carruth authored Aug 02, 2014
```
llvm-svn: 214626
```

[x86] Teach the target shuffle mask extraction to recognize unary forms · 34f9a987

Chandler Carruth authored Aug 02, 2014

of normally binary shuffle instructions like PUNPCKL and MOVLHPS.

This detects cases where a single register is used for both operands
making the shuffle behave in a unary way. We detect this and adjust the
mask to use the unary form which allows the existing DAG combine for
shuffle instructions to actually work at all.
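
The mask adjustment described here can be sketched as follows, assuming the usual convention that a binary shuffle mask indexes the concatenation of two N-element inputs. The function name `toUnaryMask` is hypothetical, and the sketch only applies when both operands are known to be the same value.

```cpp
#include <vector>

// Mask entries index the concatenation of two N-element inputs:
// [0, N) selects from the first input, [N, 2N) from the second.
// When both inputs are the same register, fold the second range onto the
// first so the mask describes a unary shuffle. Negative entries (commonly
// used for "undefined") are left untouched.
inline std::vector<int> toUnaryMask(std::vector<int> Mask, int N) {
  for (int &M : Mask)
    if (M >= N)
      M -= N;
  return Mask;
}
```

For a four-element low unpack with mask `{0, 4, 1, 5}`, the unary form is `{0, 0, 1, 1}`. For a MOVLHPS-style mask `{0, 1, 4, 5}`, it is `{0, 1, 0, 1}`.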

As a consequence, this uncovered a number of obvious bugs in the
existing DAG combine which are fixed. It also now canonicalizes several
shuffles even with the existing lowering. These typically are trying to
match the shuffle to the domain of the input where before we only really
modeled them with the floating point variants. All of the cases which
change to an integer shuffle here have something in the integer domain, so
there are no more or fewer domain crosses here AFAICT. Technically, it
might be better to go from a GPR directly to the floating point domain,
but detecting floating point *outputs* despite integer inputs is a lot
more code and seems unlikely to be worthwhile in practice. If folks are
seeing domain-crossing regressions here though, let me know and I can
hack something up to fix it.

Also as a consequence, a bunch of missed opportunities to form pshufb
now can be formed. Notably, splats of i8s now form pshufb.
Interestingly, this improves the existing splat lowering too. We go from
3 instructions to 1. Yes, we may tie up a register, but it seems very
likely to be worth it, especially if splatting the 0th byte (the
common case) as then we can use a zeroed register as the mask.

llvm-svn: 214625
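
The remark about using a zeroed register as the mask follows directly from PSHUFB's semantics. A portable scalar emulation of the 128-bit form is sketched below. The names are hypothetical and this is not how the lowering code is written. It only shows that an all-zero control selects source byte 0 for every lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar emulation of 128-bit PSHUFB: a control byte with its high bit set
// zeroes the lane; otherwise its low 4 bits pick a byte from Src.
inline Bytes16 pshufbEmulated(const Bytes16 &Src, const Bytes16 &Control) {
  Bytes16 Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = (Control[I] & 0x80) ? std::uint8_t{0} : Src[Control[I] & 0x0F];
  return Out;
}

// Splat byte 0: an all-zero control vector selects Src[0] in every lane.
inline Bytes16 splatByte0(const Bytes16 &Src) {
  return pshufbEmulated(Src, Bytes16{});
}
```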

[x86] Teach my pshufb comment printer to handle VPSHUFB forms as well as · 2ad69eea
Chandler Carruth authored Aug 02, 2014
```
PSHUFB forms. This will be important to update some AVX tests when I add
PSHUFB combining.

llvm-svn: 214624
```

[SDAG] Refactor the code which deletes nodes in the DAG combiner to do · 18066974

Chandler Carruth authored Aug 02, 2014

so using a single helper which adds operands back onto the worklist.

Several places didn't rigorously do this but a couple already did.
Factoring them together and doing it rigorously is important to delete
things recursively early on in the combiner and get a chance to see
accurate hasOneUse values. While no existing test cases change, an
upcoming patch to add DAG combining logic for PSHUFB requires this to
work correctly.

llvm-svn: 214623

Fix issues with ISD::FNEG and ISD::FMA SDNodes where they would not be constant-folded · 9d5a8c28

Owen Anderson authored Aug 02, 2014

during DAGCombine in certain circumstances. Unfortunately, the circumstances required
to trigger the issue seem to require a pretty specific interaction of DAGCombines,
and I haven't been able to find a testcase that reproduces on X86, ARM, or AArch64.
The functionality added here is replicated in essentially every other DAG combine,
so it seems pretty obviously correct.

llvm-svn: 214622

Changed tool-template to use CommonOptionsParser. · da2734d4

Alexander Kornienko authored Aug 02, 2014

Reviewers: pcc, klimek

Reviewed By: klimek

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4765

llvm-svn: 214621
