Commits · 90b1729af947850c3e9267125a20619085a2e509 · Roger Ferrer / llvm-epi-0.8

Sep 14, 2013

Make PrettyStackTraceEntry use ManagedStatic for its ThreadLocal. · 67d97093

Filip Pizlo authored Sep 13, 2013

This was somewhat tricky because ~PrettyStackTraceEntry() may run after
llvm_shutdown() has been called. This is rare and only happens for a common idiom
used in the main() functions of command-line tools. This works around the idiom by
skipping the stack clean-up if the PrettyStackTraceHead ManagedStatic is not
constructed (i.e. llvm_shutdown() has been called).
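
The guard described above can be sketched in standalone C++. This is a minimal illustration, not LLVM's code: `LazyStatic`, `Entry`, `StackHead`, and `shutdownAll` are hypothetical stand-ins for ManagedStatic, PrettyStackTraceEntry, PrettyStackTraceHead, and llvm_shutdown(), and a plain pointer replaces the thread-local.

```cpp
#include <cassert>

// Hypothetical stand-in for a lazily constructed global that can be torn
// down explicitly (an assumption for illustration, not LLVM's ManagedStatic).
template <typename T> class LazyStatic {
  T *Ptr = nullptr;

public:
  bool isConstructed() const { return Ptr != nullptr; }
  T &get() {
    if (!Ptr)
      Ptr = new T(); // constructed on first use
    return *Ptr;
  }
  void destroy() {
    delete Ptr;
    Ptr = nullptr;
  }
};

struct Entry;

// Head of the entry stack; thread-local in the real code, global here.
static LazyStatic<Entry *> StackHead;

struct Entry {
  Entry *Next;

  // Push this entry onto the stack.
  Entry() : Next(StackHead.get()) { StackHead.get() = this; }

  ~Entry() {
    // After shutdown the head no longer exists. Skip the clean-up instead
    // of re-creating the head just to pop from it.
    if (!StackHead.isConstructed())
      return;
    assert(StackHead.get() == this && "entries must be popped in LIFO order");
    StackHead.get() = Next;
  }
};

// Hypothetical stand-in for llvm_shutdown().
void shutdownAll() { StackHead.destroy(); }
```

With this shape, an entry declared at the top of main() can outlive the call to `shutdownAll()` without touching freed state.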

llvm-svn: 190730

67d97093

Sep 13, 2013

Add missing break statement in PPCISelLowering · c3cfbf86
Hal Finkel authored Sep 13, 2013
```
As it turns out, not a problem in practice, but it should be there.

llvm-svn: 190720
```
c3cfbf86

Adds support for Atom Silvermont (SLM) - -march=slm · 3fe264d6

Preston Gurd authored Sep 13, 2013

Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.

llvm-svn: 190717

3fe264d6

[Peephole] Rewrite copies to avoid cross register banks copies. · cf71c632

Quentin Colombet authored Sep 13, 2013

By definition copies across register banks are not coalescable. Still, it may be
possible to get rid of such a copy when the value is available in another
register of the same register file.
Consider the following example, where uppercase and lowercase letters denote
different register files:
b = copy A <-- cross-bank copy
...
C = copy b <-- cross-bank copy

This could have been optimized this way:
b = copy A  <-- cross-bank copy
...
C = copy A <-- same-bank copy

Note: b and C's definitions may be in different basic blocks.

This patch adds a peephole optimization that looks through a chain of copies
leading to a cross-bank copy and reuses a source that is on the same register
file if available.

This solution could also be used to get rid of some copies (e.g., A could have
been used instead of C). However, we do not do so because:
- It may over-constrain the coloring of the source register for coalescing.
- The register allocator may not be able to find a nice split point for the
  longer live-range, leading to more spill.
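
The look-through can be sketched on a toy representation. This is an illustration under stated assumptions, not the pass itself: registers are plain integers, and `Func`, `findSameBankSource`, and `rewriteCrossBankCopy` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

enum class Bank { Upper, Lower };

// Toy function body: each register has a bank, and some registers are
// defined by a copy from another register (hypothetical representation).
struct Func {
  std::map<int, Bank> BankOf;  // register -> register bank
  std::map<int, int> CopySrc;  // register -> source of its defining copy
};

// Walk up the chain of copies feeding Src and return the first register
// that lives in the Wanted bank, or -1 if the chain ends without one.
int findSameBankSource(const Func &F, int Src, Bank Wanted) {
  int Cur = Src;
  // Bound the walk by the number of copies so a cycle cannot loop forever.
  for (std::size_t Step = 0; Step <= F.CopySrc.size(); ++Step) {
    auto It = F.CopySrc.find(Cur);
    if (It == F.CopySrc.end())
      return -1; // Cur is not defined by a copy: end of the chain
    Cur = It->second;
    if (F.BankOf.at(Cur) == Wanted)
      return Cur;
  }
  return -1;
}

// If Dst is defined by a cross-bank copy and an equivalent source exists in
// Dst's own bank, rewrite the copy to use it. Returns true on a rewrite.
bool rewriteCrossBankCopy(Func &F, int Dst) {
  auto It = F.CopySrc.find(Dst);
  if (It == F.CopySrc.end())
    return false;
  Bank DstBank = F.BankOf.at(Dst);
  if (F.BankOf.at(It->second) == DstBank)
    return false; // already a same-bank copy
  int NewSrc = findSameBankSource(F, It->second, DstBank);
  if (NewSrc < 0)
    return false;
  It->second = NewSrc;
  return true;
}
```

Encoding the example above as A = 0, b = 1, C = 2, calling `rewriteCrossBankCopy` on C changes its source from b to A, while the copy defining b is left alone because no same-bank source exists for it.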

<rdar://problem/14742333>

llvm-svn: 190713

cf71c632

[ARMv8] Change hasV8Fp to hasFPARMv8, and other command line options · ccd04894
Joey Gouly authored Sep 13, 2013
```
to be more consistent.

llvm-svn: 190692
```
ccd04894
[msan] Add source file:line to stack origin reports. · 0435ecd1
Evgeniy Stepanov authored Sep 13, 2013
```
Compiler part.

llvm-svn: 190689
```
0435ecd1
[ARMv8] Emit the proper .fpu directive. · 3c0e5567
Joey Gouly authored Sep 13, 2013
```
Patch by Bradley Smith!

llvm-svn: 190683
```
3c0e5567
Test commit to verify that commit access works. · def5d347
Zoran Jovanovic authored Sep 13, 2013
```
llvm-svn: 190676
```
def5d347
[SystemZ] Use getTarget{Insert,Extract}Subreg rather than getMachineNode · d8163208
Richard Sandiford authored Sep 13, 2013
```
Just a clean-up, no behavioral change intended.

llvm-svn: 190673
```
d8163208
[SystemZ] Try to fold shifts into TMxx · 030c1657
Richard Sandiford authored Sep 13, 2013
```
E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4".

llvm-svn: 190672
```
030c1657
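
The fold in the example above relies on a simple bit identity: testing a mask after a logical right shift is the same as testing the left-shifted mask on the unshifted value. A minimal sketch of that arithmetic only (it does not model the condition codes TMLL sets; the function names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Test the masked bits after a logical right shift.
bool testAfterShift(std::uint64_t X, unsigned Shift, std::uint64_t Mask) {
  return ((X >> Shift) & Mask) != 0;
}

// Same test without the shift: move the mask instead of the value.
// Only valid when no set mask bit is shifted out of the 64-bit width.
bool testFolded(std::uint64_t X, unsigned Shift, std::uint64_t Mask) {
  std::uint64_t Folded = Mask << Shift;
  assert((Folded >> Shift) == Mask && "mask bits lost by the fold");
  return (X & Folded) != 0;
}
```

For the commit's example, a shift of 2 with mask 1 becomes a test against mask 4.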
Avoid a compiler warning about Found not being used when assertions are · c9e95ad0
Duncan Sands authored Sep 13, 2013
```
disabled.

llvm-svn: 190668
```
c9e95ad0

AArch64: use RegisterOperand for NEON registers. · 635a9790

Tim Northover authored Sep 13, 2013

Previously we modelled VPR128 and VPR64 as essentially identical
register-classes containing V0-V31 (which had Q0-Q31 as "sub_alias"
sub-registers). This model is starting to cause significant problems
for code generation, particularly writing EXTRACT/INSERT_SUBREG
patterns for converting between the two.

The change here switches to classifying VPR64 & VPR128 as
RegisterOperands, which are essentially aliases for RegisterClasses
with different parsing and printing behaviour. This fits almost
exactly with their real status (VPR128 == FPR128 printed strangely,
VPR64 == FPR64 printed strangely).

llvm-svn: 190665

635a9790

Move operator to end of previous line to match coding standards. · 21a916b6
Craig Topper authored Sep 13, 2013
```
llvm-svn: 190659
```
21a916b6

Add initial support for handling gnu style pubnames accepted by some · dd1a0120

Eric Christopher authored Sep 13, 2013

versions of gold. This support is designed to allow gold to produce
gdb_index sections similar to the accelerator tables and consumable
by gdb.

llvm-svn: 190649

dd1a0120

Reformat and hoist section grabbing to top level. · 8b3737fb
Eric Christopher authored Sep 13, 2013
```
llvm-svn: 190648
```
8b3737fb
R600: Move clamp handling code to R600ISelLowering.cpp · 0167a313
Vincent Lejeune authored Sep 12, 2013
```
llvm-svn: 190645
```
0167a313
R600: Move code handling literal folding into R600ISelLowering. · 9a248e5c
Vincent Lejeune authored Sep 12, 2013
```
llvm-svn: 190644
```
9a248e5c
R600: Move fabs/fneg/sel folding logic into PostProcessIsel · ab3baf80
Vincent Lejeune authored Sep 12, 2013
```
This move makes it possible to correctly handle multiple instructions
from a single pattern.

llvm-svn: 190643
```
ab3baf80
Remove an unused variable, fixing -Werror build with latest Clang. · 51428e36
Chandler Carruth authored Sep 12, 2013
```
llvm-svn: 190640
```
51428e36

Fix PPC ABI for ByVal structs with vector members · 262a2247

Hal Finkel authored Sep 12, 2013

When a structure is passed by value, and that structure contains a vector
member, according to the PPC ABI, the structure will receive enhanced alignment
(so that the vector within the structure will always be aligned).
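
The effect on layout can be illustrated in portable C++. This is only a sketch: `Vec4` is a hypothetical 16-byte-aligned stand-in for an Altivec vector type, and `alignUp` shows the rounding an over-aligned by-value argument needs when it is placed at some offset.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 16-byte vector type.
struct alignas(16) Vec4 {
  float Lanes[4];
};

// A structure with a vector member inherits the vector's alignment.
struct WithVector {
  char Tag;
  Vec4 V;
};

// Round Offset up to the next multiple of Align.
constexpr std::size_t alignUp(std::size_t Offset, std::size_t Align) {
  return (Offset + Align - 1) / Align * Align;
}

static_assert(alignof(WithVector) == 16, "struct takes the vector's alignment");
static_assert(offsetof(WithVector, V) == 16, "vector member stays aligned");
```

If such a structure is copied to an offset that is only 8-byte aligned, the vector inside it ends up misaligned, which is why the by-value copy itself needs the enhanced alignment.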

This should resolve PR16641.

llvm-svn: 190636

262a2247

Patch provided by Tom Roeder! · 1a6e7708

Joe Abbey authored Sep 12, 2013

Reviewed by Joe Abbey and Tobias Grosser

Here is a patch that fixes decoding of CE_SELECT in BitcodeReader,
along with a simple test case. The problem in the current code is that
it generates but doesn't accept bitcode that uses vectors for the
first element of a select in this context.

llvm-svn: 190634

1a6e7708

Sep 12, 2013

In AliasSetTracker, do not change the alias set to "mod/ref" when adding · de7485af
Krzysztof Parzyszek authored Sep 12, 2013
```
a volatile load, or a volatile store.

llvm-svn: 190631
```
de7485af

Make the PPC fast-math sqrt expansion safe at 0 · 1e2e3ea5

Hal Finkel authored Sep 12, 2013

In fast-math mode sqrt(x) is calculated using the fast expansion of the
reciprocal of the reciprocal sqrt expansion. The reciprocal and reciprocal
sqrt expansions use the associated estimate instructions along with some Newton
iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN,
which is not correct. Now we explicitly return a result of zero if the input is
zero.
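
Why zero yields NaN can be reproduced with ordinary IEEE-754 doubles. The sketch below is an assumption-laden model, not the PPC lowering: `rsqrtEstimate` is a hypothetical stand-in for the hardware estimate instruction, and a single Newton step is used.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the hardware reciprocal-sqrt estimate.
// An input of zero gives +infinity.
double rsqrtEstimate(double X) {
  if (X == 0.0)
    return std::numeric_limits<double>::infinity();
  return 1.0 / std::sqrt(X);
}

// One Newton step for y ~ 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y).
double refineRsqrt(double X, double Y) { return Y * (1.5 - 0.5 * X * Y * Y); }

// sqrt(x) as the reciprocal of the refined reciprocal sqrt.
// For x == 0 the Newton step computes 0 * infinity, which is NaN.
double fastSqrtUnsafe(double X) {
  double Y = refineRsqrt(X, rsqrtEstimate(X));
  return 1.0 / Y;
}

// The fix described above: return zero explicitly for a zero input.
double fastSqrtSafe(double X) {
  if (X == 0.0)
    return 0.0;
  return fastSqrtUnsafe(X);
}
```

The NaN comes from the `X * Y * Y` term in the Newton step, so it appears regardless of how accurate the initial estimate is.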

llvm-svn: 190624

1e2e3ea5

Implement asm support for a few PowerPC bookIII instructions that are needed for assembling · 62cb6354
Roman Divacky authored Sep 12, 2013
```
FreeBSD kernel.

llvm-svn: 190618
```
62cb6354

This switches CrashRecoveryContext to using ManagedStatic for its global Mutex and · f2189bf3

Filip Pizlo authored Sep 12, 2013

global ThreadLocals, thereby getting rid of the load-time initialization of those 
objects and also getting rid of their destruction unless the LLVM client calls 
llvm_shutdown.

llvm-svn: 190617

f2189bf3

Partial support for Intel SHA Extensions (sha1rnds4) · 1650175d

Ben Langmuir authored Sep 12, 2013

Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes feature flag, and test cases.

Support for the remaining instructions will follow in a separate patch.

llvm-svn: 190611

1650175d

Mark PPC MFTB and DST (and friends) as deprecated · 0096dbd5

Hal Finkel authored Sep 12, 2013

Use the new instruction deprecation feature to mark mftb (now replaced with
mfspr) and dst (along with the other Altivec cache control instructions) as
deprecated when targeting cores supporting at least ISA v2.03.

llvm-svn: 190605

0096dbd5

LLVM Interpreter: implementation of "insertvalue" and "extractvalue"; · 8e97f016

Elena Demikhovsky authored Sep 12, 2013

undef constant for structure and test for these functions.

done by Yuri Veselov (mailto:Yuri.Veselov@intel.com)

llvm-svn: 190599

8e97f016

Add an instruction deprecation feature to TableGen. · 0e76fa7d

Joey Gouly authored Sep 12, 2013

The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.

The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
  ComplexDeprecationPredicate<"MCR">

would mean you would have to define the following function:
  bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                             std::string &Info)

It returns 'false' for not deprecated, and 'true' for deprecated, in which
case it stores the warning message in 'Info'.

The MCTargetAsmParser constructor was changed to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.
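
A predicate with the shape described above might look like the following. This is a sketch only: the `sketch::MCInst` and `sketch::MCSubtargetInfo` types are hypothetical stubs standing in for the LLVM classes, and the deprecation condition (a single `HasNewerISA` flag) is invented for illustration.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical stubs; the real MCInst and MCSubtargetInfo are LLVM classes.
struct MCInst {
  unsigned Opcode;
};
struct MCSubtargetInfo {
  bool HasNewerISA; // invented feature flag for this example
};

// Returns false if the instruction is not deprecated. Returns true and
// fills Info with a warning message if it is.
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                           std::string &Info) {
  (void)MI; // a real predicate would usually inspect the operands
  if (!STI.HasNewerISA)
    return false;
  Info = "instruction is deprecated on this subtarget";
  return true;
}

} // namespace sketch
```

Leaving `Info` untouched on the 'false' path means a caller only needs to read it after a 'true' result.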

llvm-svn: 190598

0e76fa7d

AVX-512: implemented extractelement with variable index. · 8952974e
Elena Demikhovsky authored Sep 12, 2013
```
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}.

llvm-svn: 190595
```
8952974e

PPC: Enable aggressive anti-dependency breaking · 7fe6a539

Hal Finkel authored Sep 12, 2013

Aggressive anti-dependency breaking is enabled by default for all PPC cores.
This provides a general speedup on the P7 and other platforms (among other
factors, the instruction group formation for the non-embedded PPC cores is done
during post-RA scheduling). In order to do this safely, the incompatibility
between uses of the MFOCRF instruction and anti-dependency breaking is
resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed
FIXME, the problem was that MFOCRF's output is sensitive to the identity of the
source register, and always paired with a shift to undo this effect. Because
anti-dependency breaking is unaware of this hidden dependency of the shift
amount on the source register of the MFOCRF instruction, changing that register
must be inhibited.
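
The hidden dependency can be modelled with a few lines of bit arithmetic. This is a simplified sketch, assuming the 4-bit condition register field n lands at its own position in a 32-bit image and is extracted with a shift of 4 * (7 - n); `mfocrfModel` and `extractField` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Model of the move: the 4-bit contents of CR field Field appear at that
// field's position in a 32-bit value (field 0 is the most significant).
std::uint32_t mfocrfModel(unsigned Field, std::uint32_t FieldBits) {
  assert(Field < 8 && FieldBits < 16);
  return FieldBits << (4 * (7 - Field));
}

// The paired shift: its amount is derived from the field number, so it is
// only correct for the field the move actually read.
std::uint32_t extractField(std::uint32_t Value, unsigned Field) {
  return (Value >> (4 * (7 - Field))) & 0xF;
}
```

If the source field is renamed from 2 to 5 while the shift still assumes field 2, `extractField(mfocrfModel(5, Bits), 2)` no longer recovers `Bits`. That mismatch is what renaming the register behind the scheduler's back would produce.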

Two test cases were adjusted: The SjLj test was made more insensitive to
register choices and scheduling; the saveCR test disabled anti-dependency
breaking because part of what it is testing is proper register reuse.

llvm-svn: 190587

7fe6a539

Fix crash in AggressiveAntiDepBreaker with empty CriticalPathSet · 6f1ff8e1

Hal Finkel authored Sep 12, 2013

If no register classes are added to CriticalPathRCs, then the CriticalPathSet
bitmask will be empty. In that case, ExcludeRegs must remain NULL or else this
line will cause a segfault:

  } else if ((ExcludeRegs != NULL) && ExcludeRegs->test(AntiDepReg)) {

I have no in-tree test case.
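
The failure mode can be sketched with a toy bit set. This is an illustration, not the committed fix: `RegSet`, `testBit`, `pickExcludeRegs`, and `isExcluded` are hypothetical, and the rule "only point at the set when it has a bit set" is an assumption about how ExcludeRegs is kept NULL.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using RegSet = std::vector<bool>;

// Like a bit-vector test: querying past the end is a bug.
bool testBit(const RegSet &S, unsigned Reg) {
  assert(Reg < S.size() && "out-of-range bit query");
  return S[Reg];
}

// Only hand out a pointer to the critical-path set when it has at least one
// bit set. An empty set yields nullptr.
const RegSet *pickExcludeRegs(const RegSet &CriticalPathSet) {
  bool Any = std::find(CriticalPathSet.begin(), CriticalPathSet.end(), true) !=
             CriticalPathSet.end();
  return Any ? &CriticalPathSet : nullptr;
}

// Mirrors the guarded condition quoted above: the null check short-circuits
// before the set is queried.
bool isExcluded(const RegSet *ExcludeRegs, unsigned Reg) {
  return ExcludeRegs != nullptr && testBit(*ExcludeRegs, Reg);
}
```

With an empty set, `pickExcludeRegs` returns nullptr and `isExcluded` never reaches the out-of-range query.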

llvm-svn: 190584

6f1ff8e1

R600/SI: expose TBUFFER_STORE_FORMAT_* for OpenGL transform feedback · afcf12f3

Tom Stellard authored Sep 12, 2013



For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist.

The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take
a resource descriptor might be nicer.

The maximum number of input SGPRs is bumped to 17.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190575

afcf12f3

R600: Don't use trans slot for instructions that read LDS source registers · 7f6fa4c4

Tom Stellard authored Sep 12, 2013

This fixes some regressions in the piglit local memory store tests
introduced by recent commits which made the scheduler aware of the trans
slot.

It's not possible to test this using lit, because there is no way to
determine from the assembly dumps whether or not an instruction is in
the trans slot.

Even if this were possible, the test would be highly sensitive to
changes in the scheduler and might generate confusing false negatives.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 190574

7f6fa4c4

Move variable under condition where it is used · bed5bf2e
Matt Arsenault authored Sep 12, 2013
```
llvm-svn: 190567
```
bed5bf2e
Remove pointless assertion after r190376 · bc08ddba
Matt Arsenault authored Sep 12, 2013
```
llvm-svn: 190565
```
bc08ddba

Greatly simplify the PPC A2 scheduling itinerary · f574c277

Hal Finkel authored Sep 11, 2013

As Andy pointed out to me a long time ago, there are no structural hazards in
the later pipeline stages of the A2, and so modeling them is useless. Also,
modeling the top pre-dispatch stages is deceiving because, when multiple
hardware threads are active, those resources are shared among the threads. The
bypass definitions were mostly wrong, and so those have been removed. The
resulting itinerary is much simpler, and more accurate.

llvm-svn: 190562

f574c277

Enable MI scheduling (and CodeGen AA) by default for embedded PPC cores · 21442b24

Hal Finkel authored Sep 11, 2013

For embedded PPC cores (especially the A2 core), using the MI scheduler with AA
is far superior to the other scheduling options.

llvm-svn: 190558

21442b24

Sep 11, 2013
Use the appropriate return type for the compact unwind encoding. · 7b650a75
Bill Wendling authored Sep 11, 2013
```
llvm-svn: 190551
```
7b650a75
Implement TTI getUnrollingPreferences for PowerPC · 71780ec4
Hal Finkel authored Sep 11, 2013
```
The PowerPC A2 core greatly benefits from aggressive concatenation unrolling;
use the new getUnrollingPreferences to enable this by default when targeting
the PPC A2 core.

llvm-svn: 190549
```
71780ec4