- Apr 06, 2013
-
Michael Gottesman authored
llvm-svn: 178932
-
Tom Stellard authored
v2: - Use the ADDR64 bit
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178931
-
Tom Stellard authored
The code emitter knows how to encode operands whose name matches one of the encoding fields. If there is no match, the code emitter relies on the order of the operand and field definitions to determine how operands should be encoded. Matching by order makes it easy to accidentally break the instruction encodings, so we prefer to match by name.
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178930
-
Tom Stellard authored
This is an R600 GPU with double support.
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178929
-
Tom Stellard authored
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178928
-
Tom Stellard authored
SITargetLowering::analyzeImmediate() was converting the 64-bit values to 32-bit and then checking if they were an inline immediate. Some of these conversions caused this check to succeed and produced S_MOV instructions with 64-bit immediates, which are illegal.
v2: - Clean up logic
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178927
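A minimal C++ sketch of the pitfall; the helper name and the inline-immediate range are illustrative, not the actual SITargetLowering code:

    #include <cstdint>
    #include <cstdio>

    // Illustrative only: SI inline immediates include small integers such
    // as those in [-16, 64]; the real set is wider than this sketch.
    static bool isInlineImmediate32(int32_t V) { return V >= -16 && V <= 64; }

    int main() {
      // A 64-bit immediate whose low 32 bits happen to look inline-encodable.
      int64_t Imm = 0x100000040LL;
      // The pre-fix logic effectively truncated first, wrongly succeeding:
      bool Truncated = isInlineImmediate32(static_cast<int32_t>(Imm));
      // The fix must also reject values that do not even fit in 32 bits:
      bool Checked = Imm == static_cast<int32_t>(Imm) &&
                     isInlineImmediate32(static_cast<int32_t>(Imm));
      printf("truncated: %d, checked: %d\n", Truncated, Checked); // 1, 0
    }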
-
Hal Finkel authored
On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926
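For context, a hedged C++ illustration of what the conversion buys: both functions compute the same max, but the second shape maps directly to a compare plus isel, with no conditional branch to mispredict:

    // Branchy form: naively lowered as a compare plus conditional branch.
    int maxBranchy(int a, int b) {
      if (a > b)
        return a;
      return b;
    }

    // Select form: on cores with isel this can lower to cmpw + isel. Early
    // if conversion rewrites small branch sequences into this shape.
    int maxSelect(int a, int b) { return a > b ? a : b; }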
-
Hal Finkel authored
The manual states that there is a minimum of 13 cycles from when the mispredicted branch is issued to when the correct branch target is issued. llvm-svn: 178925
-
Michael Gottesman authored
This is the counterpart to commit r160637, except it performs the action in the bottomup portion of the data flow analysis. llvm-svn: 178922
-
Michael Gottesman authored
The normal dataflow sequence in the ARC optimizer consists of the following states:
Retain -> CanRelease -> Use -> Release
Before this patch, the optimizer stored the uses that determine the lifetime of the retainable object pointer when the bottom-up walk hits a retain or the top-down walk hits a release. This is correct for the imprecise lifetime scenario, since what we are trying to do is remove retains/releases while making sure that no "CanRelease" (which is usually a call) deallocates the given pointer before we get to the "Use" (since that would cause a segfault). If we are considering the precise lifetime scenario, though, this is not correct. In that situation we *DO* care about the previous sequence, but additionally we wish to track the uses resulting from the following incomplete sequences:
Retain -> CanRelease -> Release (TopDown)
Retain <- Use <- Release (BottomUp)
*NOTE* This patch looks large, but most of it consists of updating test cases. This fix also exposed an additional bug; I removed the test case that exercised it and will recommit it with the fix shortly. llvm-svn: 178921
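A schematic of the shorter top-down sequence the patch starts tracking; the stubbed calls are stand-ins so the sketch is self-contained (real code involves the Objective-C runtime and the richer state machine in ObjCARCOpts.cpp):

    static void *retain(void *P) { return P; }  // stands in for objc_retain
    static void release(void *) {}              // stands in for objc_release
    static void opaqueCall() {}                 // a potential "CanRelease"

    void preciseLifetime(void *Obj) {
      retain(Obj);    // Retain
      opaqueCall();   // CanRelease: might free Obj if the retain vanished
      release(Obj);   // Release -- Retain -> CanRelease -> Release, no Use;
    }                 // under precise lifetime this sequence is now tracked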
-
Hal Finkel authored
This fixes PEI as previously described, but correctly handles the case where the instruction defining the virtual register to be scavenged is the first in the block. Arnold provided me with a bugpoint-reduced test case, but even that seems too large to use as a regression test. If I'm successful in cleaning it up then I'll commit that as well. Original commit message: This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spills slots the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined. Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix. llvm-svn: 178919
-
- Apr 05, 2013
-
Bill Wendling authored
During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon. llvm-svn: 178917
-
Hal Finkel authored
Reverting because this breaks one of the LTO builders. Original commit message: This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spills slots the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined. Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix. llvm-svn: 178916
-
Jim Grosbach authored
llvm-svn: 178915
-
Shuxin Yang authored
This optimization is unstable at this moment; it 1) blocks us on a very important application, 2) causes PR15200, and 3) breaks test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK commands compare the output against the wrong result). I personally believe this optimization should not have any impact on autovectorized code, as the auto-vectorizer is supposed to put gather/scatter in the "right" way. Although in theory downstream optimizers might reveal some gather/scatter optimization opportunities, the chance is quite slim. For hand-crafted vectorized code, in terms of redundancy elimination, load-CSE, copy propagation, and DSE can collectively achieve the same result, but in a much simpler way. On the other hand, these optimizers are able to improve the code incrementally; in contrast, SROA is a sort of all-or-none approach. However, SROA might win slightly in stack size, as it tries to figure out a stretch of memory that tightly covers the area accessed by the dynamic index. rdar://13174884 PR15200 llvm-svn: 178912
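A hedged C++ sketch of the kind of source behind a "dynamic vector GEP" (the pattern test6/test7 exercise); the union spill is one way such code ends up indexing a vector-backed stack slot with a runtime value:

    typedef float v4f __attribute__((vector_size(16)));  // Clang/GCC extension

    float pick(v4f V, int I) {
      union { v4f Vec; float Lanes[4]; } U = {V};  // vector in a stack slot
      return U.Lanes[I];  // dynamic index: the GEP offset is not a constant,
                          // which is what the reverted SROA change handled
    }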
-
Douglas Gregor authored
It's possible for the lock file to disappear and the owning process to return before we're able to see the generated file. Spin for a little while to see if it shows up before failing. rdar://problem/13551789 llvm-svn: 178909
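A minimal POSIX-flavored sketch of the spin; the interval, retry cap, and function name are made up (the real logic lives in LLVM's LockFileManager):

    #include <chrono>
    #include <thread>
    #include <sys/stat.h>

    static bool waitForGeneratedFile(const char *Path) {
      struct stat SB;
      for (int Try = 0; Try < 50; ++Try) {   // ~500 ms in total
        if (stat(Path, &SB) == 0)
          return true;                       // the file finally showed up
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
      }
      return false;                          // still missing: report failure
    }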
-
Douglas Gregor authored
If the directory that will contain the unique file doesn't exist when we try to create the file, but another process creates it before we get a chance to, we would bail out rather than try to create the unique file. rdar://problem/13551789 llvm-svn: 178908
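A POSIX sketch of the fix's shape (function name hypothetical): treat EEXIST from mkdir as success, because another process may create the directory between our check and our call:

    #include <cerrno>
    #include <fcntl.h>
    #include <sys/stat.h>

    static int openUniqueFile(const char *Dir, const char *FilePath) {
      if (mkdir(Dir, 0777) != 0 && errno != EEXIST)
        return -1;                    // a real failure: propagate it
      // EEXIST just means someone beat us to the mkdir; keep going.
      return open(FilePath, O_CREAT | O_EXCL | O_RDWR, 0600);
    }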
-
Michael J. Spencer authored
llvm-svn: 178905
-
Rafael Espindola authored
These should really be templated like ELF, but this is a start. llvm-svn: 178896
-
Michael Gottesman authored
llvm-svn: 178895
-
Rafael Espindola authored
llvm-svn: 178894
-
Michael Gottesman authored
llvm-svn: 178893
-
Renato Golin authored
llvm-svn: 178883
-
Chad Rosier authored
memory operands. Essentially, this layers an infix calculator on top of the parsing state machine. The scale on the index register is still expected to be an immediate:
__asm mov eax, [eax + ebx*4]
and will not work with more complex expressions such as:
__asm mov eax, [eax + ebx*(2*2)]
The plus and minus binary operators assume the numeric value of a register is zero, so as to not change the displacement. Register operands should never be an operand for a multiply or divide operation; the scale*indexreg expression is always replaced with a zero on the operand stack to prevent such a case. rdar://13521380 llvm-svn: 178881
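A self-contained sketch of the two-stack infix evaluation being described, operating on characters instead of asm tokens and omitting parentheses to mirror the stated limitation:

    #include <cctype>
    #include <stack>
    #include <string>

    static void apply(std::stack<long> &Vals, char Op) {
      long R = Vals.top(); Vals.pop();
      long L = Vals.top(); Vals.pop();
      Vals.push(Op == '+' ? L + R : Op == '-' ? L - R :
                Op == '*' ? L * R : L / R);
    }

    long evalInfix(const std::string &S) {
      std::stack<long> Vals;                 // operand stack
      std::stack<char> Ops;                  // operator stack
      auto Prec = [](char C) { return (C == '*' || C == '/') ? 2 : 1; };
      for (size_t I = 0; I < S.size();) {
        if (std::isdigit((unsigned char)S[I])) {   // accumulate a number
          long V = 0;
          while (I < S.size() && std::isdigit((unsigned char)S[I]))
            V = V * 10 + (S[I++] - '0');
          Vals.push(V);
        } else if (S[I] == ' ') {
          ++I;
        } else {                             // one of + - * /
          while (!Ops.empty() && Prec(Ops.top()) >= Prec(S[I])) {
            apply(Vals, Ops.top());
            Ops.pop();
          }
          Ops.push(S[I++]);
        }
      }
      while (!Ops.empty()) { apply(Vals, Ops.top()); Ops.pop(); }
      return Vals.top();                     // evalInfix("4 + 2*3") == 10
    }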
-
Reid Kleckner authored
Summary: Sets a report hook that emulates pressing "retry" in the "abort, retry, ignore" dialog box that _CrtDbgReport normally raises. There are many other ways to disable assertion reports, but this was the only way I could find that still calls our exception handler.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D625
llvm-svn: 178880
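The mechanism, sketched (see the review above for the actual patch): a hook that answers every CRT report as if the user pressed "Retry", so the failure reaches our exception handler instead of a dialog box:

    #ifdef _MSC_VER
    #include <crtdbg.h>

    static int noDialogHook(int /*ReportType*/, char * /*Message*/, int *Ret) {
      *Ret = 1;   // makes _CrtDbgReport return 1, i.e. break, like "Retry"
      return 1;   // report fully handled: no dialog box is shown
    }

    static void installHook() { _CrtSetReportHook(noDialogHook); }
    #endif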
-
Rafael Espindola authored
InMemoryStruct is extremely dangerous as it returns data from an internal buffer when the endianness doesn't match. This should fix the tests on big endian hosts. llvm-svn: 178875
-
Ulrich Weigand authored
Respect Addend when processing MCJIT relocations to local/global symbols. When the RuntimeDyldELF::processRelocationRef routine finds the target symbol of a relocation in the local or global symbol table, it performs a section-relative relocation:
Value.SectionID = lsi->second.first;
Value.Addend = lsi->second.second;
At this point, however, any Addend that might have been specified in the original relocation record is lost. This is somewhat difficult to trigger for relocations within the code section, since they usually do not contain non-zero Addends (when built with the default JIT code model, in any case). However, the problem can be reliably triggered by a relocation within the data section caused by code like:
int test[2] = { -1, 0 };
int *p = &test[1];
The initializer of "p" will need a relocation to "test + 4". On platforms using RelA relocations this means an Addend of 4 is required. Current code ignores this addend when processing the relocation, resulting in incorrect execution. Fixed by taking the Addend into account when processing relocations to symbols found in the local or global symbol table. Tested on x86_64-linux and powerpc64-linux. llvm-svn: 178869
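A toy model of the fix (names invented; the real code is in RuntimeDyldELF): the symbol table yields a section-relative offset, and the RelA record's own addend must be added on top of it:

    #include <cstdint>
    #include <cstdio>

    struct SymbolLoc { unsigned SectionID; int64_t Offset; };

    // Before the fix, RelAddend was effectively dropped, so a reference to
    // "test + 4" resolved to "test".
    int64_t resolveTarget(const SymbolLoc &Sym, int64_t RelAddend) {
      return Sym.Offset + RelAddend;
    }

    int main() {
      SymbolLoc Test = {1, 0x40};  // hypothetical location of "test"
      printf("target offset = %#llx\n",
             (long long)resolveTarget(Test, 4));  // 0x44, i.e. test + 4
    }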
-
Stepan Dyatkovskiy authored
llvm-svn: 178854
-
Stepan Dyatkovskiy authored
Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position". The patch introduces memory operand tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti: for each register it keeps the order of load operations as it was before the optimization pass. It is a deeper version of the fix proposed by Hao (http://llvm.org/bugs/show_bug.cgi?id=14824#c4), and it also tracks conflicts between different register classes (e.g. D2 and S5). For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html
llvm-svn: 178851
-
Hal Finkel authored
llvm-svn: 178850
-
Hal Finkel authored
llvm-svn: 178848
-
Hal Finkel authored
This change fixes a bug that I introduced in r178058. After a register is scavenged using one of the available spills slots the instruction defining the virtual register needs to be moved to after the spill code. The scavenger has already processed the defining instruction so that registers killed by that instruction are available for definition in that same instruction. Unfortunately, after this, the scavenger needs to iterate through the spill code and then visit, again, the instruction that defines the now-scavenged register. In order to avoid confusion, the register scavenger needs the ability to 'back up' through the spill code so that it can again process the instructions in the appropriate order. Prior to this fix, once the scavenger reached the just-moved instruction, it would assert if it killed any registers because, having already processed the instruction, it believed they were undefined. Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar for diagnosing the problem and testing this fix. llvm-svn: 178845
-
Arnold Schwaighofer authored
descriptions for compares llvm-svn: 178844
-
Arnold Schwaighofer authored
llvm-svn: 178842
-
Andrew Trick authored
llvm-svn: 178823
-
Andrew Trick authored
For now, just save the compile time since the ConvergingScheduler heuristics don't use this analysis. We'll probably enable it later after compile-time investigation. llvm-svn: 178822
-
Andrew Trick authored
I'm getting more serious about tuning and enabling on x86/ARM. Start by making the trace readable. llvm-svn: 178821
-
Arnold Schwaighofer authored
Pass down the fact that an operand is going to be a vector of constants. This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86 back. It had degraded to scalar performance due to my previous shift cost change, which made all shifts expensive on x86. radar://13576547 llvm-svn: 178809
-
Arnold Schwaighofer authored
SSE2 has efficient support for shifts by a scalar. My previous change, which made shifts expensive, did not take this into account and marked all shifts as expensive. This would prevent vectorization from happening where it is actually beneficial. With this change we differentiate between shifts by constants and other shifts. radar://13576547 llvm-svn: 178808
-
Arnold Schwaighofer authored
On certain architectures we can support efficient vectorized versions of instructions if the operand value is uniform (splat) or a constant scalar. An example of this is a vector shift on x86. We can efficiently support
for (i = 0 ; i < ; i += 4) w[0:3] = v[0:3] << <2, 2, 2, 2>
but not
for (i = 0; i < ; i += 4) w[0:3] = v[0:3] << x[0:3]
This patch adds a parameter to getArithmeticInstrCost to further qualify operand values as uniform or uniform constant. Targets can then choose to return a different cost for instructions with such operand values. A follow-up commit will test this feature on x86. radar://13576547 llvm-svn: 178807
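A hedged sketch of the shape of the new parameter (LLVM's real getArithmeticInstrCost signature and cost numbers differ): a target can answer more cheaply when the shift amount is uniform or a uniform constant:

    enum OperandValueKind { OK_AnyValue, OK_UniformValue, OK_UniformConstantValue };

    // Illustrative costs only: a uniform shift maps to one cheap vector
    // instruction, while a per-lane variable shift may scalarize.
    unsigned getVectorShiftCost(unsigned NumLanes, OperandValueKind ShiftAmt) {
      if (ShiftAmt == OK_UniformConstantValue || ShiftAmt == OK_UniformValue)
        return 1;          // e.g. SSE2 shifts all lanes by one scalar amount
      return NumLanes;     // model scalarization: one shift per lane
    }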
-