Commits · edaf66b056f6cb9fe08e43cc6d23f0dadb7060d6 · Roger Ferrer / llvm-epi-0.8

Apr 07, 2013

Implement LowerReturn_64 for SPARC v9. · edaf66b0

Jakob Stoklund Olesen authored Apr 06, 2013

Integer return values are sign or zero extended by the callee, and
structs up to 32 bytes in size can be returned in registers.

The CC_Sparc64 CallingConv definition is shared between
LowerFormalArguments_64 and LowerReturn_64. Function arguments and
return values are passed in the same registers.

The inreg flag is also used for return values. This is required to handle
C functions returning structs containing floats and ints:

  struct ifp {
    int i;
    float f;
  };

  struct ifp f(void);

LLVM IR:

  define inreg { i32, float } @f() {
     ...
     ret { i32, float } %retval
  }

The ABI requires that %retval.i is returned in the high bits of %i0
while %retval.f goes in %f1.

Without the inreg return value attribute, %retval.i would go in %i0 and
%retval.f would go in %f3, which is a more efficient way of returning
multiple values, but it is not ABI compliant for returning C structs.

llvm-svn: 178966

edaf66b0

Apr 06, 2013

SPARC v9 stack pointer bias. · 03d9f7fd

Jakob Stoklund Olesen authored Apr 06, 2013

64-bit SPARC v9 processes use biased stack and frame pointers, so the
current function's stack frame is located at %sp+BIAS .. %fp+BIAS where
BIAS = 2047.

This makes more local variables directly accessible via [%fp+simm13]
addressing.
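The arithmetic behind that claim can be sketched as follows. This is only an illustration, not code from the patch: the helper names are hypothetical, and it assumes simm13 is a 13-bit signed immediate covering -4096 to 4095.

```cpp
#include <cassert>
#include <cstdint>

// Stack bias for 64-bit SPARC v9 processes, as stated in the commit message.
constexpr std::int64_t kStackBias = 2047;

// Assumed range of a 13-bit signed immediate (simm13).
constexpr std::int64_t kSimm13Min = -4096;
constexpr std::int64_t kSimm13Max = 4095;

constexpr bool fitsSimm13(std::int64_t imm) {
  return imm >= kSimm13Min && imm <= kSimm13Max;
}

// Hypothetical helper: convert an offset relative to the real frame address
// (%fp + BIAS) into the immediate used in a [%fp+simm13] operand.
constexpr std::int64_t biasedImmediate(std::int64_t offsetFromFrame) {
  return offsetFromFrame + kStackBias;
}

// Hypothetical helper: can a slot at this frame-relative offset be reached
// with a single [%fp+simm13] access?
constexpr bool directlyAddressable(std::int64_t offsetFromFrame) {
  return fitsSimm13(biasedImmediate(offsetFromFrame));
}

// Locals live at negative offsets. Without the bias the reach would stop at
// -4096; with the bias it extends to -6143.
static_assert(!fitsSimm13(-6143), "out of reach without the bias");
static_assert(directlyAddressable(-6143), "reachable with the bias");
static_assert(!directlyAddressable(-6144), "one byte too far");
```

Under these assumptions the bias shifts the addressable window downward by 2047 bytes, toward the side of the frame where locals are stored.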

llvm-svn: 178965

03d9f7fd

Implement PPCInstrInfo::FoldImmediate · d61d4f80

Hal Finkel authored Apr 06, 2013

There are certain PPC instructions into which we can fold a zero immediate
operand. We can detect such cases by looking at the register class required
by the using operand (so long as it is not otherwise constrained).

llvm-svn: 178961

d61d4f80

PPC ISEL is a select and never has side effects · 8fc33e5d
Hal Finkel authored Apr 06, 2013
```
llvm-svn: 178960
```
8fc33e5d

Add a comment to TargetInstrInfo about FoldImmediate · 537ec717

Hal Finkel authored Apr 06, 2013

This comment documents the current behavior of the ARM implementation of this
callback, and also the soon-to-be-committed PPC version.

llvm-svn: 178959

537ec717

Complete formal arguments for the SPARC v9 64-bit ABI. · 1c9a95ab

Jakob Stoklund Olesen authored Apr 06, 2013

All arguments are formally assigned to stack positions and then promoted
to floating point and integer registers. Since there are more floating
point registers than integer registers, this can cause situations where
floating point arguments are assigned to registers after integer
arguments that were assigned to the stack.
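A rough sketch of that slot-to-register promotion is shown below. It is not the CC_Sparc64 code itself. The register numbering is an assumption: 8-byte slots, the first 6 slots promoted to %i0-%i5 for integers, and the first 16 slots promoted to %d0-%d30 for doubles or to the odd %f register of the slot for floats. That numbering agrees with the %f1 and %f3 examples in the LowerReturn_64 message, but ArgKind and locationForSlot are hypothetical names.

```cpp
#include <cassert>
#include <string>

enum class ArgKind { Integer, Float, Double };

// Hypothetical helper: where does the argument in 8-byte slot `slot` end up?
// Assumed rules (see lead-in): integers use %i<slot> for slots 0..5,
// doubles use %d<2*slot> and floats use %f<2*slot+1> for slots 0..15,
// everything else stays in its stack slot.
std::string locationForSlot(int slot, ArgKind kind) {
  assert(slot >= 0 && "slot index must be non-negative");
  switch (kind) {
  case ArgKind::Integer:
    if (slot < 6)
      return "%i" + std::to_string(slot);
    break;
  case ArgKind::Double:
    if (slot < 16)
      return "%d" + std::to_string(2 * slot);
    break;
  case ArgKind::Float:
    if (slot < 16)
      return "%f" + std::to_string(2 * slot + 1);
    break;
  }
  return "stack slot " + std::to_string(slot);
}
```

With these rules, a seventh integer argument (slot 6) stays on the stack, while a double in slot 7 is still promoted to %d14, which is the situation the paragraph above describes.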

Use the inreg flag to indicate 32-bit fragments of structs containing
both float and int members.

The three-way shadowing between stack, integer, and floating point
registers requires custom argument lowering. The good news is that
return values are passed in the exact same way, and we can share the
code.

Still missing:

 - Update LowerReturn to handle structs returned in registers.
 - LowerCall.
 - Variadic functions.

llvm-svn: 178958

1c9a95ab

typo · c4bd84c1
Nadav Rotem authored Apr 06, 2013
```
llvm-svn: 178949
```
c4bd84c1
Remove last use of InMemoryStruct from MachOObjectFile.cpp. · 91af8e84
Rafael Espindola authored Apr 06, 2013
```
llvm-svn: 178948
```
91af8e84
Don't use InMemoryStruct<macho::SymtabLoadCommand>. · 15e2a9cd
Rafael Espindola authored Apr 06, 2013
```
This also required not using the RegisterStringTable API, which is also a
good thing.

llvm-svn: 178947
```
15e2a9cd
Don't use InMemoryStruct in getSymbol64TableEntry. · a65f5de4
Rafael Espindola authored Apr 06, 2013
```
llvm-svn: 178946
```
a65f5de4
Don't use InMemoryStruct in getSymbolTableEntry. · 2a34c2d8
Rafael Espindola authored Apr 06, 2013
```
llvm-svn: 178945
```
2a34c2d8
Don't use InMemoryStruct in getRelocation. · 7caf2fbd
Rafael Espindola authored Apr 06, 2013
```
llvm-svn: 178943
```
7caf2fbd

Dwarf: use utostr on CUID to append to SmallString. · 5b22f9fe

Manman Ren authored Apr 06, 2013

We used to do "SmallString += CUID", which is incorrect, since CUID will
be truncated to a char.
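The pitfall can be reproduced with standard library types. In this sketch std::string stands in for SmallString and std::to_string stands in for utostr; both substitutions are assumptions made so the example is self-contained, and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Buggy shape: appending an integer with += selects the single-character
// overload, so the ID is narrowed to one char instead of being printed.
std::string appendAsChar(std::string s, unsigned id) {
  s += static_cast<char>(id); // what "s += id" effectively does
  return s;
}

// Fixed shape: convert the ID to its decimal text first, then append.
std::string appendAsDecimal(std::string s, unsigned id) {
  s += std::to_string(id);
  return s;
}
```

For an ID of 65, the first function appends the single character 'A', while the second appends the two characters "65".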

rdar://problem/13573833

llvm-svn: 178941

5b22f9fe

Removed trailing whitespace. · 7924997c
Michael Gottesman authored Apr 05, 2013
```
llvm-svn: 178932
```
7924997c

R600/SI: Add support for buffer stores v2 · 754f80ff

Tom Stellard authored Apr 05, 2013

v2:
  - Use the ADDR64 bit

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178931

754f80ff

R600/SI: Use same names for corresponding MUBUF operands and encoding fields · 6db08eb4

Tom Stellard authored Apr 05, 2013

The code emitter knows how to encode operands whose name matches one of
the encoding fields.  If there is no match, the code emitter relies on
the order of the operand and field definitions to determine how operands
should be encoded.  Matching by order makes it easy to accidentally break
the instruction encodings, so we prefer to match by name.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178930

6db08eb4

R600: Add RV670 processor · 60174bb9

Tom Stellard authored Apr 05, 2013

This is an R600 GPU with double support.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178929

60174bb9

R600/SI: Add processor types for each SI variant · 2f21c7e5
Tom Stellard authored Apr 05, 2013
```
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178928
```
2f21c7e5

R600/SI: Avoid generating S_MOVs with 64-bit immediates v2 · edbf1eb4

Tom Stellard authored Apr 05, 2013

SITargetLowering::analyzeImmediate() was converting the 64-bit values
to 32-bit and then checking if they were an inline immediate.  Some
of these conversions caused this check to succeed and produced
S_MOV instructions with 64-bit immediates, which are illegal.
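The failure mode can be illustrated with a small sketch. This is not the code from SITargetLowering::analyzeImmediate(): the function names are hypothetical, and the inline range of -16 to 64 for integers is an assumption used only to make the example concrete.

```cpp
#include <cassert>
#include <cstdint>

// Assumed range of integer values that can be encoded inline.
constexpr bool isInlineInt(std::int64_t v) { return v >= -16 && v <= 64; }

// Buggy shape: narrow the 64-bit value to 32 bits first, then test it.
// A value whose low 32 bits happen to be small passes by accident.
constexpr bool truncatedCheck(std::int64_t v) {
  return isInlineInt(static_cast<std::int32_t>(v));
}

// Safer shape: test the value at its full width.
constexpr bool fullWidthCheck(std::int64_t v) { return isInlineInt(v); }
```

For example, 0x100000001 narrows to 1, so the truncating check accepts it even though the full 64-bit value is far outside the inline range.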

v2:
  - Clean up logic

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178927

edbf1eb4

Enable early if conversion on PPC · ed6a2859

Hal Finkel authored Apr 05, 2013

On cores for which we know the misprediction penalty and which have
the isel instruction, we can profitably perform early if conversion.
This enables us to replace some small branch sequences with selects
and avoid the potential stalls from mispredicting the branches.

Enabling this feature required implementing canInsertSelect and
insertSelect in PPCInstrInfo; isel code in PPCISelLowering was
refactored to use these functions as well.

llvm-svn: 178926

ed6a2859

Correct the PPC A2 misprediction penalty · 85526f2e

Hal Finkel authored Apr 05, 2013

The manual states that there is a minimum of 13 cycles from when the
mispredicted branch is issued to when the correct branch target is
issued.

llvm-svn: 178925

85526f2e

An objc_retain can serve as a use for a different pointer. · 31ba23aa

Michael Gottesman authored Apr 05, 2013

This is the counterpart to commit r160637, except it performs the action
in the bottom-up portion of the data flow analysis.

llvm-svn: 178922

31ba23aa

Properly model precise lifetime when given an incomplete dataflow sequence. · 1d8d2577

Michael Gottesman authored Apr 05, 2013

The normal dataflow sequence in the ARC optimizer consists of the following
states:

    Retain -> CanRelease -> Use -> Release

The optimizer before this patch stored the uses that determine the lifetime of
the retainable object pointer when it bottom up hits a retain or when top down
it hits a release. This is correct for an imprecise lifetime scenario since what
we are trying to do is remove retains/releases while making sure that no
``CanRelease'' (which is usually a call) deallocates the given pointer before we
get to the ``Use'' (since that would cause a segfault).

If we are considering the precise lifetime scenario though, this is not
correct. In such a situation, we *DO* care about the previous sequence, but
additionally, we wish to track the uses resulting from the following incomplete
sequences:

  Retain -> CanRelease -> Release   (TopDown)
  Retain <- Use <- Release          (BottomUp)

*NOTE* This patch looks large, but most of it consists of updating
test cases. Additionally this fix exposed an additional bug. I removed
the test case that expressed said bug and will recommit it with the fix
in a little bit.

llvm-svn: 178921

1d8d2577

Reapply r178845 with fix - Fix bug in PEI's virtual-register scavenging · 3005c299

Hal Finkel authored Apr 05, 2013

This fixes PEI as previously described, but correctly handles the case where
the instruction defining the virtual register to be scavenged is the first in
the block. Arnold provided me with a bugpoint-reduced test case, but even that
seems too large to use as a regression test. If I'm successful in cleaning it
up then I'll commit that as well.

Original commit message:

This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spill slots, the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.

Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.

llvm-svn: 178919

3005c299

Apr 05, 2013

Use the target options specified on a function to reset the back-end. · eb108bad

Bill Wendling authored Apr 05, 2013

During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.

llvm-svn: 178917

eb108bad

Revert r178845 - Fix bug in PEI's virtual-register scavenging · 81c46d08

Hal Finkel authored Apr 05, 2013

Reverting because this breaks one of the LTO builders. Original commit message:

Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.

llvm-svn: 178916

81c46d08

Tidy up a bit. No functional change. · bdbd7346
Jim Grosbach authored Apr 05, 2013
```
llvm-svn: 178915
```
bdbd7346

Disable the optimization about promoting vector-element-access with symbolic index. · 95adf525

Shuxin Yang authored Apr 05, 2013

This optimization is unstable at the moment. Known problems:
  1) It blocks us on a very important application.
  2) PR15200.
  3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
     (the CHECK lines compare the output against a wrong result).

I personally believe this optimization should not have any impact on
autovectorized code, as the auto-vectorizer is supposed to emit gather/scatter
in the "right" way.  Although in theory downstream optimizers might reveal
some gather/scatter optimization opportunities, the chance is quite slim.

For hand-crafted vectorized code, in terms of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result
in a much simpler way. Moreover, these optimizers are able to
improve the code incrementally; in contrast, SROA is an all-or-nothing
approach. However, SROA might win slightly on stack size, as it tries to find
a stretch of memory that tightly covers the area accessed by the dynamic index.

 rdar://13174884
 PR15200

llvm-svn: 178912

95adf525

[mips] XFAIL test-interp-vec-loadstore.ll in an attempt to turn builder · fac2db4a

Akira Hatanaka authored Apr 05, 2013

llvm-mips-linux green.

llvm-mips-linux runs on a big endian machine. This test passes if I change 'e'
to 'E' in the target data layout string.

llvm-svn: 178910

fac2db4a

<rdar://problem/13551789> Fix a race in the LockFileManager. · 0cb68460

Douglas Gregor authored Apr 05, 2013

It's possible for the lock file to disappear and the owning process to
return before we're able to see the generated file. Spin for a little
while to see if it shows up before failing. 
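The "spin for a little while" idea can be sketched as a bounded polling loop. This is a generic illustration rather than the LockFileManager code: spinUntil is a hypothetical helper, and the readiness check is passed in as a callable so the sketch does not depend on any particular file system API.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: call `ready` up to maxAttempts times, sleeping for
// `delay` between attempts, and report whether it ever returned true.
template <typename Predicate>
bool spinUntil(Predicate ready, int maxAttempts,
               std::chrono::milliseconds delay) {
  assert(maxAttempts > 0 && "need at least one attempt");
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    if (ready())
      return true;
    // Do not sleep after the final failed attempt.
    if (attempt + 1 < maxAttempts)
      std::this_thread::sleep_for(delay);
  }
  return false;
}
```

A caller would pass a predicate that checks whether the generated file exists and treat a false result as the failure case.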

llvm-svn: 178909

0cb68460

<rdar://problem/13551789> Fix yet another race in unique_file. · 6bd4d8cf

Douglas Gregor authored Apr 05, 2013

If the directory that will contain the unique file doesn't exist when
we tried to create the file, but another process creates it before we
get a chance to try creating it, we would bail out rather than try to
create the unique file.

llvm-svn: 178908

6bd4d8cf

[Support][FileSystem] Fix identify_magic for big endian ELF. · b8055cbc
Michael J. Spencer authored Apr 05, 2013
```
llvm-svn: 178905
```
b8055cbc
Move yaml2obj to tools too. · 3add3e9c
Rafael Espindola authored Apr 05, 2013
```
llvm-svn: 178904
```
3add3e9c
Define versions of Section that are explicitly marked as little endian. · 4386fa99
Rafael Espindola authored Apr 05, 2013
```
These should really be templated like ELF, but this is a start.

llvm-svn: 178896
```
4386fa99
Added two debug logging messages to VisitInstructionsTopDown to match VisitInstructionsBottomUp. · bab49e97
Michael Gottesman authored Apr 05, 2013
```
llvm-svn: 178895
```
bab49e97
Don't use InMemoryStruct in getSection and getSection64. · 8622f2c1
Rafael Espindola authored Apr 05, 2013
```
llvm-svn: 178894
```
8622f2c1
Cleaned up whitespace and made debug logging less verbose. · 89279f83
Michael Gottesman authored Apr 05, 2013
```
llvm-svn: 178893
```
89279f83
Make the test/CodeGen/X86/win32_sret.ll reliable on any CPU by explicitly specifying the -mcpu · dcf44ca4
Timur Iskhodzhanov authored Apr 05, 2013
```
llvm-svn: 178885
```
dcf44ca4
Reverting 178851 as it broke buildbots · 91de828f
Renato Golin authored Apr 05, 2013
```
llvm-svn: 178883
```
91de828f

[ms-inline asm] Add support for numeric displacement expressions in bracketed · 4a7005e9

Chad Rosier authored Apr 05, 2013

memory operands.

Essentially, this layers an infix calculator on top of the parsing state
machine.  The scale on the index register is still expected to be an immediate

 __asm mov eax, [eax + ebx*4]

and will not work with more complex expressions.  For example,

 __asm mov eax, [eax + ebx*(2*2)]

The plus and minus binary operators assume the numeric value of a register is
zero so as to not change the displacement.  Register operands should never
be an operand for a multiply or divide operation; the scale*indexreg
expression is always replaced with a zero on the operand stack to prevent
such a case.

rdar://13521380
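The displacement arithmetic described above can be sketched with a tiny evaluator. This is a recursive-descent stand-in, not the operand-stack state machine from the patch, and every name in it is hypothetical. It follows the rule from the commit message that a register contributes zero to the displacement.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Evaluates +, -, *, / and parentheses over decimal numbers. Identifiers are
// treated as register names and evaluate to zero.
class DisplacementEvaluator {
public:
  explicit DisplacementEvaluator(const std::string &text) : Text(text) {}
  std::int64_t evaluate() { Pos = 0; return parseSum(); }

private:
  std::string Text;
  std::size_t Pos = 0;

  bool at(int (*classify)(int)) const {
    return Pos < Text.size() &&
           classify(static_cast<unsigned char>(Text[Pos])) != 0;
  }
  void skipSpaces() { while (at(std::isspace)) ++Pos; }
  bool consume(char c) {
    skipSpaces();
    if (Pos < Text.size() && Text[Pos] == c) { ++Pos; return true; }
    return false;
  }

  std::int64_t parseSum() {
    std::int64_t value = parseProduct();
    for (;;) {
      if (consume('+')) value += parseProduct();
      else if (consume('-')) value -= parseProduct();
      else return value;
    }
  }

  std::int64_t parseProduct() {
    std::int64_t value = parsePrimary();
    for (;;) {
      if (consume('*')) {
        value *= parsePrimary();
      } else if (consume('/')) {
        std::int64_t divisor = parsePrimary();
        assert(divisor != 0 && "division by zero in displacement");
        value /= divisor;
      } else {
        return value;
      }
    }
  }

  std::int64_t parsePrimary() {
    if (consume('(')) {
      std::int64_t value = parseSum();
      bool closed = consume(')');
      assert(closed && "expected ')'");
      (void)closed;
      return value;
    }
    if (at(std::isalpha)) { // register name: skip it, contribute zero
      while (at(std::isalnum)) ++Pos;
      return 0;
    }
    std::int64_t value = 0;
    while (at(std::isdigit)) value = value * 10 + (Text[Pos++] - '0');
    return value;
  }
};

std::int64_t evaluateDisplacement(const std::string &text) {
  return DisplacementEvaluator(text).evaluate();
}
```

In this sketch a term such as ebx*4 evaluates to zero because the register itself is zero, which mirrors the effect of replacing the scale*indexreg expression with a zero so that only the numeric displacement remains.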

llvm-svn: 178881

4a7005e9