Commits · d4eb47c110f278c4bd503daec771b587de86f9aa · Roger Ferrer / llvm-epi-0.8

Apr 06, 2013

Reapply r178845 with fix - Fix bug in PEI's virtual-register scavenging · 3005c299

Hal Finkel authored Apr 05, 2013

This fixes PEI as previously described, but correctly handles the case where
the instruction defining the virtual register to be scavenged is the first in
the block. Arnold provided me with a bugpoint-reduced test case, but even that
seems too large to use as a regression test. If I'm successful in cleaning it
up then I'll commit that as well.

Original commit message:

This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spill slots, the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.

Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.

llvm-svn: 178919

3005c299

Apr 05, 2013

Use the target options specified on a function to reset the back-end. · eb108bad

Bill Wendling authored Apr 05, 2013

During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.

llvm-svn: 178917

eb108bad

Revert r178845 - Fix bug in PEI's virtual-register scavenging · 81c46d08

Hal Finkel authored Apr 05, 2013

Reverting because this breaks one of the LTO builders. Original commit message:

Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.

llvm-svn: 178916

81c46d08

Fix bug in PEI's virtual-register scavenging · e6f48e4e

Hal Finkel authored Apr 05, 2013

Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.

llvm-svn: 178845

e6f48e4e

RegisterPressure heuristics currently require signed comparisons. · 80e66ce0
Andrew Trick authored Apr 05, 2013
```
llvm-svn: 178823
```
80e66ce0

Disable DFSResult for ConvergingScheduler. · 96ce3848

Andrew Trick authored Apr 05, 2013

For now, just save the compile time since the ConvergingScheduler
heuristics don't use this analysis. We'll probably enable it later
after compile-time investigation.

llvm-svn: 178822

96ce3848

MachineScheduler: format DEBUG output. · 419d4917

Andrew Trick authored Apr 05, 2013

I'm getting more serious about tuning and enabling on x86/ARM. Start
by making the trace readable.

llvm-svn: 178821

419d4917

CostModel: Add parameter to instruction cost to further classify operand values · b9773871

Arnold Schwaighofer authored Apr 04, 2013

On certain architectures we can support efficient vectorized versions of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.

We can efficiently support

for (i = 0 ; i < ; i += 4)
  w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

for (i = 0; i < ; i += 4)
  w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.

Targets can then choose to return a different cost for instructions with such
operand values.
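
As a rough illustration of the idea, the sketch below models a cost query that takes an extra operand-kind argument. The enum, the function name `toyShiftCost`, and the cost numbers are all hypothetical; they are not LLVM's actual interface or values.

```cpp
#include <cassert>

// Hypothetical classification of the shift-amount operand.
enum class OperandKind { Any, Uniform, UniformConstant };

// Toy cost for a 4-wide vector shift. The numbers are made up; they only
// express that a uniform or constant shift amount can be cheaper than a
// per-lane variable one on some targets.
inline unsigned toyShiftCost(OperandKind ShiftAmount) {
  switch (ShiftAmount) {
  case OperandKind::UniformConstant:
  case OperandKind::Uniform:
    return 1; // assume a single vector shift instruction
  case OperandKind::Any:
    return 4 * 3; // assume scalarization: extract, shift, insert per lane
  }
  assert(false && "unknown operand kind");
  return 0;
}
```

A target that has no cheaper form for uniform shifts could simply return the same cost for every kind.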

A follow-up commit will test this feature on x86.

radar://13576547

llvm-svn: 178807

b9773871

Debug Info: revert 178722 for now. · bdcb4464

Manman Ren authored Apr 04, 2013

There is a difference for FORM_ref_addr between DWARF 2 and DWARF 3+.
Since Eric is against guarding DWARF 2 ref_addr with DarwinGDBCompat, we are
still in discussion on how to handle this.

The correct solution is to update our header to say version 4 instead of version
2 and update tool chains as well.

rdar://problem/13559431

llvm-svn: 178806

bdcb4464

typo · 322f41d0
Adrian Prantl authored Apr 04, 2013
```
llvm-svn: 178804
```
322f41d0

Apr 04, 2013

Formatting · fc186358
Eli Bendersky authored Apr 04, 2013
```
llvm-svn: 178771
```
fc186358

Debug Info: according to DWARF 2, FORM_ref_addr the same size as an address on · 5a15c9ed

Manman Ren authored Apr 04, 2013

the target system.

It was hard-coded to 4 bytes before. I can't get llvm to generate a
ref_addr on a reasonably sized testing case.

rdar://problem/13559431

llvm-svn: 178722

5a15c9ed

Apr 03, 2013
- Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. · 92e26646
  Bill Schmidt authored Apr 03, 2013
```
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
```
  92e26646
- Fix grammar. · 14c2067c
  Eric Christopher authored Apr 03, 2013
```
llvm-svn: 178624
```
  14c2067c
- Remove ZeroOrMore from the option description. We don't need it here. · 5590949f
  Eric Christopher authored Apr 03, 2013
```
llvm-svn: 178623
```
  5590949f
- Allow MachineTraceMetrics to be used when the model has no resources. · aeb69a54
  Jakob Stoklund Olesen authored Apr 02, 2013
```
It is still possible to extract information from itineraries, for
example.

llvm-svn: 178582
```
  aeb69a54
Apr 02, 2013

Don't attempt MTM heuristics without a scheduling model present. · 8fbfc591
Jakob Stoklund Olesen authored Apr 02, 2013
```
This should fix the PPC buildbots.

llvm-svn: 178558
```
8fbfc591

Count processor resources individually in MachineTraceMetrics. · 3ca14772

Jakob Stoklund Olesen authored Apr 02, 2013

The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.

The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.

This gives more precise resource information, particularly in traces
that use one resource a lot more than others.
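
A minimal sketch of the "limiting resource" idea, assuming a simplified model in which each processor resource has a total busy-cycle count for the trace. The helper `toyResourceLength` and its parameters are hypothetical and are not the MachineTraceMetrics API.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper. CyclesPerResource[k] is the total number of cycles
// the trace keeps processor resource k busy.
inline unsigned toyResourceLength(unsigned NumInstrs, unsigned IssueWidth,
                                  const std::vector<unsigned> &CyclesPerResource) {
  // Simple estimate: instructions divided by issue width, rounded up.
  unsigned Length =
      IssueWidth ? (NumInstrs + IssueWidth - 1) / IssueWidth : NumInstrs;
  // The busiest resource is also a lower bound on the trace length.
  for (unsigned Cycles : CyclesPerResource)
    Length = std::max(Length, Cycles);
  return Length;
}
```

With 8 instructions on a 4-wide machine the simple estimate is 2 cycles, but if one resource is busy for 6 cycles the sketch reports 6.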

llvm-svn: 178553

3ca14772

DAGCombiner: Merge store/loads when we have extload/truncstores · d6c6e868

Arnold Schwaighofer authored Apr 02, 2013

This helps on architectures where i8 and i16 are not legal but byte and short
loads/stores are available, allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;
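
The snippet above is cut off in the commit message. One possible completion is sketched below; the loop tail, the function name `copyPairs`, and the meaning of `n` (number of byte pairs, at least 1) are assumptions, not part of the original.

```cpp
// Assumed completion: copy n pairs of bytes, two bytes per iteration.
// Requires n > 0 because the body runs before the condition is checked.
inline void copyPairs(const char *a, char *b, int n) {
  do {
    // Each byte is widened to int on load (an extending load) ...
    int t0 = a[0];
    int t1 = a[1];
    // ... and narrowed back to a byte on store (a truncating store).
    b[0] = static_cast<char>(t0);
    b[1] = static_cast<char>(t1);
    a += 2;
    b += 2;
  } while (--n > 0);
}
```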

radar://13536387

llvm-svn: 178546

d6c6e868

Apr 01, 2013

Merge load/store sequences with addresses: base + index + offset · 6752366e

Arnold Schwaighofer authored Apr 01, 2013

We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++;
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.
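
The adjacency test this enables can be sketched as follows, assuming a much simplified address representation. `ToyAddress` and `isConsecutive` are hypothetical names; the real code works on SelectionDAG nodes rather than integer ids.

```cpp
#include <cstdint>

// Hypothetical, simplified view of an address: Base + Index + Offset.
// Base and Index are opaque ids standing in for DAG nodes.
struct ToyAddress {
  int Base;
  int Index; // -1 means "no index"
  std::int64_t Offset;
};

// Two accesses of Size bytes are adjacent if they share base and index and
// B's constant offset is exactly Size bytes past A's.
inline bool isConsecutive(const ToyAddress &A, const ToyAddress &B,
                          std::int64_t Size) {
  return A.Base == B.Base && A.Index == B.Index &&
         B.Offset - A.Offset == Size;
}
```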

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1)))))

 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

radar://13536387

llvm-svn: 178483

6752366e

Mar 30, 2013
- DAGCombine: visitXOR can replace a node without returning it, bail out in that case. · 93354432
  Benjamin Kramer authored Mar 30, 2013
```
Fixes the crash reported in PR15608.

llvm-svn: 178429
```
  93354432
- Use SmallVectorImpl instead of SmallVector at the uses. · 4887c8f4
  Eric Christopher authored Mar 29, 2013
```
llvm-svn: 178386
```
  4887c8f4
Mar 29, 2013

Use 12 as the magic number for our abbreviation data and our · 9c8414f8

Eric Christopher authored Mar 29, 2013

die values. A lot of DIEs have 10 attributes in C++ code (example
clang), none had more than 12. Seems like a good default.

llvm-svn: 178366

9c8414f8

Move the construction of the skeleton compile unit after the · 6be35037
Eric Christopher authored Mar 29, 2013
```
entire original compile unit has been constructed.

llvm-svn: 178365
```
6be35037

Remove the old CodePlacementOpt pass. · 70671b99

Benjamin Kramer authored Mar 29, 2013

It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349

70671b99

Fix a typo · 6036f581
Nadav Rotem authored Mar 29, 2013
```
llvm-svn: 178346
```
6036f581

[fast-isel] Add a preemptive fix for the case where we fail to materialize an · dbac025d

Chad Rosier authored Mar 28, 2013

immediate in a register.  I don't believe this should ever fail, but I see no
harm in trying to make this code bulletproof.

I've added an assert to ensure my assumption is correct.  If the assertion fires
something is wrong and we should fix it, rather than just silently fall back to
SelectionDAG isel.

llvm-svn: 178305

dbac025d

Mar 27, 2013

Fix target-customized spilling in the register scavenger · 35dd5c59

Hal Finkel authored Mar 27, 2013

This is a follow-up to r178073 (which should actually make target-customized
spilling work again).

I still don't have a regression test for this (but it would be good to have
one; Thumb 1 and Mips16 use this callback as well).

Patch by Richard Sandiford.

llvm-svn: 178137

35dd5c59

Mar 26, 2013

Fix the register scavenger for targets that provide custom spilling · 1fa2f945

Hal Finkel authored Mar 26, 2013

As pointed out by Richard Sandiford, my recent updates to the register
scavenger broke targets that use custom spilling (because the new code assumed
that if there were no valid spill slots, then spilling would be impossible).

I don't have a test case, but it should be possible to create one for Thumb 1,
Mips 16, etc.

llvm-svn: 178073

1fa2f945

Update PEI's virtual-register-based scavenging to support multiple simultaneous mappings · 4e05788c

Hal Finkel authored Mar 26, 2013

The previous algorithm could not deal properly with scavenging multiple virtual
registers because it kept only one live virtual -> physical mapping (and
iterated through operands in order). Now we don't maintain a current mapping,
but rather use replaceRegWith to completely remove the virtual register as
soon as the mapping is established.

In order to allow the register scavenger to return a physical register killed
by an instruction for definition by that same instruction, we now call
RS->forward(I) prior to eliminating virtual registers defined in I. This
requires a minor update to forward to ignore virtual registers.

These new features will be tested in forthcoming commits.

llvm-svn: 178058

4e05788c

Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx) · bb05a1d7

Michael Liao authored Mar 25, 2013

- Handle the case where the result of 'insert_subvec' is bitcast
  before 'extract_subvec'. This removes the redundant insertf128/extractf128
  pair on unaligned 256-bit vector loads/stores of vectors of non-64-bit integers.

llvm-svn: 177945

bb05a1d7

Mar 25, 2013
- Disable some unsafe-fp-math DAG-combine transformations after legalization. · 93b1f12a
  Shuxin Yang authored Mar 25, 2013
```
For instance, the following transformation will be disabled:
    x + x + x => 3.0f * x;

The problem with these transformations is that they introduce an FP constant,
which the following instruction-selection pass cannot handle.

Reviewed by Nadav, thanks a lot!

rdar://13445387

llvm-svn: 177933
```
  93b1f12a
- Couple more sets of tidying. · 3820184a
  Eric Christopher authored Mar 25, 2013
```
llvm-svn: 177920
```
  3820184a
- Formatting. · 7f44037c
  Eric Christopher authored Mar 25, 2013
```
llvm-svn: 177898
```
  7f44037c
- Teach cmake about the new Erlang GC files. · d58611a4
  Duncan Sands authored Mar 25, 2013
```
llvm-svn: 177869
```
  d58611a4
- Add a GC plugin for Erlang · dbb4adf1
  Yiannis Tsiouris authored Mar 25, 2013
```
llvm-svn: 177867
```
  dbb4adf1
Mar 23, 2013

Remove the type legality check from the SelectionDAGBuilder when it lowers... · c81616b0

Owen Anderson authored Mar 23, 2013

Remove the type legality check from the SelectionDAGBuilder when it lowers @llvm.fmuladd to ISD::FMA nodes.
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that *did* support FMA.
For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.

NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook. They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.

llvm-svn: 177820

c81616b0

Fix comparison of mixed signedness · 446122ed

Hal Finkel authored Mar 23, 2013

177774 broke the lld-x86_64-darwin11 builder; error:
error: comparison of integers of different signs: 'int' and 'size_type' (aka 'unsigned long')
  for (SI = 0; SI < Scavenged.size(); ++SI)
               ~~ ^ ~~~~~~~~~~~~~~~~

Fix this by making SI also unsigned.
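
A minimal sketch of the pattern, using a toy container in place of the scavenger's internal state. The function `firstFreeSlot` and the use of `std::size_t` (rather than `unsigned`) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for the loop in the error message: find the first free slot.
// An unsigned index type avoids comparing a signed value against size().
inline std::size_t firstFreeSlot(const std::vector<bool> &InUse) {
  std::size_t SI;
  for (SI = 0; SI < InUse.size(); ++SI)
    if (!InUse[SI])
      break;
  return SI; // equals InUse.size() when every slot is taken
}
```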

llvm-svn: 177780

446122ed

Allow the register scavenger to spill multiple registers · 9e331c2f

Hal Finkel authored Mar 22, 2013

This patch lets the register scavenger make use of multiple spill slots in
order to guarantee that it will be able to provide multiple registers
simultaneously.

To support this, the RS's API has changed slightly: setScavengingFrameIndex /
getScavengingFrameIndex have been replaced by addScavengingFrameIndex /
isScavengingFrameIndex / getScavengingFrameIndices.
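
The shape of the new interface can be sketched with a toy class. The method names follow the commit message, but the signatures, the storage, and the class name `ToyScavenger` are assumptions and may not match the real register scavenger.

```cpp
#include <algorithm>
#include <vector>

// Toy model only: tracks a set of frame indices reserved for scavenging.
class ToyScavenger {
  std::vector<int> FrameIndices; // one entry per emergency spill slot

public:
  // Register another spill slot instead of overwriting a single one.
  void addScavengingFrameIndex(int FI) { FrameIndices.push_back(FI); }

  // Query whether a frame index is one of the scavenging slots.
  bool isScavengingFrameIndex(int FI) const {
    return std::find(FrameIndices.begin(), FrameIndices.end(), FI) !=
           FrameIndices.end();
  }

  // Expose all registered slots.
  const std::vector<int> &getScavengingFrameIndices() const {
    return FrameIndices;
  }
};
```

A target that needs two GPRs at once would register two slots up front by calling `addScavengingFrameIndex` twice.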

In forthcoming commits, the PowerPC backend will use this capability in order
to implement the spilling of condition registers, and some special-purpose
registers, without relying on r0 being reserved. In some cases, spilling these
registers requires two GPRs: one for addressing and one to hold the value being
transferred.

llvm-svn: 177774

9e331c2f

Mar 22, 2013

Remove ScavengedRC from RegisterScavenging · 7dbe0f06

Hal Finkel authored Mar 22, 2013

ScavengedRC was a dead private variable (set, but not otherwise used). No
functionality change intended.

llvm-svn: 177708

7dbe0f06