  1. May 15, 2013
    • Implement PPC counter loops as a late IR-level pass · 25c1992b
      Hal Finkel authored
      The old PPCCTRLoops pass, like the Hexagon pass version from which it was
      derived, could only handle some simple loops in canonical form. We cannot
      directly adapt the new Hexagon hardware loops pass, however, because the
      Hexagon pass contains a fundamental assumption that non-constant-trip-count
      loops will contain a guard, and this is not always true (the result being that
      incorrect negative counts can be generated). With this commit, we replace the
      pass with a late IR-level pass which makes use of ScalarEvolution (SE) to
      calculate the
      backedge-taken counts and safely generate the loop-count expressions (including
      any necessary max() parts). This IR level pass inserts custom intrinsics that
      are lowered into the desired decrement-and-branch instructions.
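      
      As a rough sketch of the SE query involved (assumed shape, not the
      actual pass code; the real pass must also expand the SCEV back into IR
      and insert the counter intrinsics):
      
        #include "llvm/Analysis/LoopInfo.h"
        #include "llvm/Analysis/ScalarEvolution.h"
        using namespace llvm;
        
        // Trip count = backedge-taken count + 1; SE folds in any umax()
        // parts needed to keep the expression safe.
        static const SCEV *getTripCount(Loop *L, ScalarEvolution &SE) {
          const SCEV *BECount = SE.getBackedgeTakenCount(L);
          if (isa<SCEVCouldNotCompute>(BECount))
            return 0; // Unknown trip count: leave the loop alone.
          return SE.getAddExpr(BECount,
                               SE.getConstant(BECount->getType(), 1));
        }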
      
      The most fragile part of this new implementation is that interfering uses of
      the counter register must be detected on the IR level (and, on PPC, this also
      includes any indirect branches in addition to function calls). Also, to make
      all of this work, we need a variant of the mtctr instruction that is marked
      as having side effects. Without this, machine-code level CSE, DCE, etc.
      illegally transform the resulting code. Hopefully, this can be improved
      in the future.
      
      This new pass is smaller than the original (and much smaller than the new
      Hexagon hardware loops pass), and can handle many additional cases correctly.
      In addition, the preheader-creation code has been copied from LoopSimplify, and
      once we decide where it belongs, this code will be refactored so that it
      can be explicitly shared (making this implementation even smaller).
      
      The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for
      the new Hexagon pass. There are a few classes of loops that this pass does not
      transform (noted by FIXMEs in the files), but these deficiencies can be
      addressed within the SE infrastructure (thus helping many other passes as well).
      
      llvm-svn: 181927
    • Fix legalization of SETCC with promoted integer intrinsics · 1f6a7f53
      Hal Finkel authored
      If the input operands to SETCC are promoted, we need to make sure that we
      either use the promoted form of both operands (or neither); a mixture is not
      allowed. This can happen, for example, if a target has a custom promoted
      i1-returning intrinsic (where i1 is not a legal type). In this case, we need to
      use the promoted form of both operands.
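      
      A sketch of the rule in type-legalizer style (helper names follow
      DAGTypeLegalizer conventions; the exact code here is assumed):
      
        // If either SETCC operand was promoted, promote the other one too,
        // so the comparison sees a consistent pair of types.
        if (getTypeAction(LHS.getValueType()) ==
            TargetLowering::TypePromoteInteger)
          LHS = GetPromotedInteger(LHS);
        if (getTypeAction(RHS.getValueType()) ==
            TargetLowering::TypePromoteInteger)
          RHS = GetPromotedInteger(RHS);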
      
      This change only augments the behavior of the existing logic in the case where
      the input types (which may or may not have already been legalized) disagree,
      and should not affect existing target code because this case would otherwise
      cause an assert in the SETCC operand promotion code.
      
      This will be covered by (essentially all of the) tests for the new PPCCTRLoops
      infrastructure.
      
      llvm-svn: 181926
    • Fix miscompile due to StackColoring incorrectly merging stack slots (PR15707) · d2c42d76
      Derek Schuff authored
      IR optimisation passes can result in a basic block that contains:
      
        llvm.lifetime.start(%buf)
        ...
        llvm.lifetime.end(%buf)
        ...
        llvm.lifetime.start(%buf)
      
      Before this change, calculateLiveIntervals() was ignoring the second
      lifetime.start() and was regarding %buf as being dead from the
      lifetime.end() through to the end of the basic block.  This can cause
      StackColoring to incorrectly merge %buf with another stack slot.
      
      Fix by removing the incorrect Starts[pos].isValid() and
      Finishes[pos].isValid() checks.
      
      Just doing:
            Starts[pos] = Indexes->getMBBStartIdx(MBB);
            Finishes[pos] = Indexes->getMBBEndIdx(MBB);
      unconditionally would be enough to fix the bug, but it causes some
      test failures due to stack slots not being merged when they were
      before.  So, in order to keep the existing tests passing, treat LiveIn
      and LiveOut separately rather than approximating the live ranges by
      merging LiveIn and LiveOut.
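      
      A sketch of that shape (the liveness tests here are illustrative
      names, not the actual code):
      
        // Extend the interval to a block boundary only in the directions
        // the slot is actually live, instead of one merged approximation.
        if (isLiveIn(pos))
          Starts[pos] = Indexes->getMBBStartIdx(MBB);
        if (isLiveOut(pos))
          Finishes[pos] = Indexes->getMBBEndIdx(MBB);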
      
      This fixes PR15707.
      Patch by Mark Seaborn.
      
      llvm-svn: 181922
    • Cleanup relocation sorting for ELF. · 0f2a6fe6
      Rafael Espindola authored
      We want the order to be deterministic on all platforms. NAKAMURA Takumi
      fixed that in r181864. This patch is just two small cleanups:
      
      * Move the function to the cpp file. It is only passed to array_pod_sort.
      * Remove the PPC implementation, which is now redundant.
      
      llvm-svn: 181910
    • PPCISelLowering.h: Escape \@ in comments. [-Wdocumentation] · dc9f013a
      NAKAMURA Takumi authored
      llvm-svn: 181907
    • Whitespace. · dcc66456
      NAKAMURA Takumi authored
      llvm-svn: 181906
    • [objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods. · b4e7f4d8
      Michael Gottesman authored
      
      llvm-svn: 181901
    • Support unaligned load/store on more ARM targets · 72ddaba7
      Derek Schuff authored
      This patch matches GCC behavior: the code used to allow unaligned
      load/store on ARM only for v6+ Darwin; it will now allow unaligned
      load/store for v6+ Darwin as well as for v7+ on other targets.
      
      The distinction is made because v6 doesn't guarantee support (but LLVM
      assumes that Apple controls hardware+kernel and therefore has conformant
      v6 CPUs), whereas v7 does provide this guarantee (and Linux behaves
      sanely).
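      
      In subtarget-predicate terms, the new policy looks roughly like this
      (a sketch using ARMSubtarget-style helpers, not the patch itself):
      
        // Allow unaligned accesses on v7+ everywhere, and on v6+ only for
        // Darwin, where conformant hardware is assumed.
        bool AllowsUnaligned =
            Subtarget->hasV7Ops() ||
            (Subtarget->hasV6Ops() && Subtarget->isTargetDarwin());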
      
      Overall this should slightly improve performance in most cases because of
      reduced I$ pressure.
      
      Patch by JF Bastien.
      
      llvm-svn: 181897
    • Remove MCELFObjectTargetWriter::adjustFixupOffset hack · 06840768
      Ulrich Weigand authored
      
      Now that PowerPC no longer uses adjustFixupOffset, and no other
      back-end (ever?) did, we can remove the infrastructure itself
      (incidentally addressing a FIXME to that effect).
      
      llvm-svn: 181895
    • [PowerPC] Remove need for adjustFixupOffset hack · 2fb140ef
      Ulrich Weigand authored
      
      Now that applyFixup understands differently-sized fixups, we can define
      fixup_ppc_lo16/fixup_ppc_lo16_ds/fixup_ppc_ha16 to properly be 2-byte
      fixups, applied at an offset of 2 relative to the start of the 
      instruction text.
      
      This has the benefit that if we actually need to generate a real
      relocation record, its address will come out correctly automatically,
      without having to fiddle with the offset in adjustFixupOffset.
      
      Tested on both 64-bit and 32-bit PowerPC, using external and
      integrated assembler.
      
      llvm-svn: 181894
    • [SystemZ] Make use of SUBTRACT HALFWORD · ffd14417
      Richard Sandiford authored
      Thanks to Ulrich Weigand for noticing that this instruction was missing.
      
      llvm-svn: 181893
    • [PowerPC] Correctly handle fixups of other than 4 byte size · 56f5b28d
      Ulrich Weigand authored
      
      The PPCAsmBackend::applyFixup routine handles the case where a
      fixup can be resolved within the same object file.  However,
      this routine is currently hard-coded to assume the size of
      any fixup is always exactly 4 bytes.
      
      This is sort-of correct for fixups on instruction text, but only
      because several fixups that really ought to be 2-byte fixups are
      presented as 4-byte fixups instead (requiring another hack in
      PPCELFObjectWriter::adjustFixupOffset to clean it up).
      
      However, this assumption breaks down completely for fixups
      on data, which legitimately can be of any size (1, 2, 4, or 8).
      
      This patch makes applyFixup aware of fixups of varying sizes,
      introducing a new helper routine getFixupKindNumBytes (along
      the lines of what the ARM back end does).  Note that in order
      to handle fixups of size 8, we also need to fix the return type
      of adjustFixupValue to uint64_t to avoid truncation.
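      
      The helper has roughly this shape (a sketch: FK_Data_* are generic MC
      fixup kinds, the PPC kinds are from PPCFixupKinds.h, and the exact
      case list is assumed):
      
        // Map each fixup kind to the number of bytes it actually patches.
        static unsigned getFixupKindNumBytes(unsigned Kind) {
          switch (Kind) {
          default: llvm_unreachable("Unknown fixup kind!");
          case FK_Data_1:           return 1;
          case FK_Data_2:           return 2;
          case FK_Data_4:           return 4;
          case FK_Data_8:           return 8;
          case PPC::fixup_ppc_lo16:
          case PPC::fixup_ppc_ha16: return 2; // 2-byte fixups on text
          case PPC::fixup_ppc_br24: return 4;
          }
        }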
      
      Tested on both 64-bit and 32-bit PowerPC, using external and
      integrated assembler.
      
      llvm-svn: 181891
    • [SystemZ] Add more future work items to the README · 619859f4
      Richard Sandiford authored
      Based on an analysis by Ulrich Weigand.
      
      llvm-svn: 181882
    • Fix build on Windows · 0588513e
      Timur Iskhodzhanov authored
      llvm-svn: 181873
    • Use only explicit bool conversion operators · 041f1aa3
      David Blaikie authored
      BitVector/SmallBitVector::reference::operator bool remain implicit since
      they model a bool more exactly, rather than something else that can be
      boolean tested.
      
      The most common (non-buggy) case are where such objects are used as
      return expressions in bool-returning functions or as boolean function
      arguments. In those cases I've used (& added if necessary) a named
      function to provide the equivalent (or sometimes negative, depending on
      convenient wording) test.
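      
      For illustration, a minimal example of the pattern (hypothetical
      class, not taken from the patch):
      
        struct Handle {
          void *Ptr;
          // Explicit: usable in if/while conditions, but not convertible
          // to int or implicitly passed where a bool is expected.
          explicit operator bool() const { return Ptr != 0; }
          // Named alternative for return expressions and bool arguments.
          bool isValid() const { return Ptr != 0; }
        };
        
        // if (H) { ... }       // OK: contextual conversion to bool
        // bool B = H;          // error: the conversion is explicit
        // return H.isValid();  // named function reads clearly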
      
      One behavior change (YAMLParser) was made, though no test case is
      included as I'm not sure how to reach that code path. Essentially any
      comparison of llvm::yaml::document_iterators would be invalid if neither
      iterator was at the end.
      
      This helped uncover a couple of bugs in Clang - test cases provided for
      those in a separate commit along with similar changes to `operator bool`
      instances in Clang.
      
      llvm-svn: 181868
    • LoopVectorize: Fix comments · 09cee972
      Arnold Schwaighofer authored
      No functionality change.
      
      llvm-svn: 181862
    • LoopVectorize: Hoist conditional loads if possible · 2d920477
      Arnold Schwaighofer authored
      InstCombine can be uncooperative toward vectorization by sinking loads
      into conditional blocks, which prevents vectorization.
      
      Undo this optimization if there are unconditional memory accesses to the same
      addresses in the loop.
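      
      A hypothetical example of the pattern (not from the test suite):
      InstCombine may sink the unconditional load of a[i] into the if-block;
      because the same address is also accessed unconditionally, hoisting it
      back out is safe and re-enables vectorization.
      
        void f(int *a, const int *b, int n) {
          for (int i = 0; i < n; ++i) {
            if (b[i])
              a[i] += 1; // conditional load/store of a[i]
            a[i] *= 2;   // unconditional access to the same address
          }
        }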
      
      radar://13815763
      
      llvm-svn: 181860
    • Speed up Value::isUsedInBasicBlock() for long use lists. · 0925b24d
      Jakob Stoklund Olesen authored
      This is expanding Ben's original heuristic for short basic blocks to
      also work for longer basic blocks and huge use lists.
      
      Scan the basic block and the use list in parallel, terminating the
      search when the shorter list ends. In almost all cases, either the basic
      block or the use list is short, and the function returns quickly.
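      
      A sketch of the lock-step scan (simplified; this follows the idea
      described above rather than quoting the patch):
      
        #include "llvm/IR/BasicBlock.h"
        #include "llvm/IR/Instruction.h"
        #include <algorithm>
        using namespace llvm;
        
        // Walk the block's instructions and V's uses in parallel; the
        // shorter of the two lists bounds the total work.
        static bool usedInBlock(const Value *V, const BasicBlock *BB) {
          BasicBlock::const_iterator BI = BB->begin(), BE = BB->end();
          Value::const_use_iterator UI = V->use_begin(), UE = V->use_end();
          for (; BI != BE && UI != UE; ++BI, ++UI) {
            // Block side: is V an operand of this instruction?
            if (std::find(BI->op_begin(), BI->op_end(), V) != BI->op_end())
              return true;
            // Use-list side: does this user live in BB?
            const Instruction *User = dyn_cast<Instruction>(*UI);
            if (User && User->getParent() == BB)
              return true;
          }
          return false;
        }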
      
      In one crazy test case with very long use chains, CodeGenPrepare runs
      400x faster. When compiling ARMDisassembler.cpp it is 5x faster.
      
      <rdar://problem/13840497>
      
      llvm-svn: 181851
    • Fix two typos · 149e281a
      Sylvestre Ledru authored
      llvm-svn: 181848
    • Object: Fix Mach-O relocation printing. · 9dab0cc6
      Ahmed Bougacha authored
      There were two problems that made llvm-objdump -r crash:
      - for non-scattered relocations, the symbol/section index is actually in the
        (aptly named) symbolnum field.
      - sections are 1-indexed (see the sketch below).
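      
      A sketch of the corrected lookup (field names from the Mach-O
      relocation_info layout; the surrounding variables are assumed):
      
        // Non-scattered: the index lives in r_symbolnum, and section
        // ordinals are 1-based, so subtract one before indexing.
        uint32_t Val = RE.r_symbolnum;
        if (IsExtern)
          SymbolIndex = Val;      // index into the symbol table
        else
          SectionIndex = Val - 1; // sections are 1-indexed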
      
      llvm-svn: 181843
    • ARM ISel: Don't create illegal types during LowerMUL · af85f608
      Arnold Schwaighofer authored
      The transformation here turns a "mul(ext(X), ext(X))" into a
      "vmull(X, X)", stripping off the extension. We have to make sure that X
      still has a valid vector type - possibly by recreating an extension to a
      smaller type. In the case of an extload of a memory type smaller than
      64 bits, we used to create an ext(load()). The problem with doing this -
      instead of recreating an extload - is that an illegal type is exposed.
      
      This patch fixes this by creating extloads instead of ext(load()) sequences.
      
      Fixes PR15970.
      
      radar://13871383
      
      llvm-svn: 181842