Commits · 310f501ef07dd2ba454fb99471703c930195e9e3 · Roger Ferrer / llvm-epi-0.8

Jan 29, 2014

Use a raw_stream to implement the mangler. · 310f501e

Rafael Espindola authored Jan 29, 2014

This is a bit more convenient for some callers, but more importantly, it is
easier to implement correctly. Doing this removes the patching of already
printed data that was used for fastcall, fixing a crash with private fastcall
symbols.

llvm-svn: 200367

310f501e

[AArch64 NEON] Lower SELECT_CC with vector operand. · 92d64d2d

Kevin Qin authored Jan 29, 2014

When the scalar compare is between floating-point values and the operands
are vectors, we custom lower SELECT_CC to use a NEON SIMD compare, which
generates fewer instructions.

llvm-svn: 200365

92d64d2d

Remove unnecessary call to pthread_mutexattr_setpshared() · efe919ff

Mark Seaborn authored Jan 29, 2014

The default value of this attribute is PTHREAD_PROCESS_PRIVATE, so
there's no point in calling pthread_mutexattr_setpshared() to set
that.

See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_getpshared.html

This removes some ifdefs that tend to need to be extended for other
platforms (e.g. for NaCl).

Note that this call was in the first implementation of Mutex, added in
r22403, so it doesn't appear to have been added in response to a
performance problem.

Differential Revision: http://llvm-reviews.chandlerc.com/D2633

llvm-svn: 200360

efe919ff
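The claim in the commit above (that a freshly initialized mutex attribute object is already process-private) can be checked with a small sketch. It uses only the POSIX calls pthread_mutexattr_init(), pthread_mutexattr_getpshared() and pthread_mutexattr_destroy(); the helper name defaultPshared is made up for illustration and is not part of LLVM.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical helper (not LLVM code): returns the process-shared setting
// of a freshly initialized mutex attribute object, or -1 if a call fails.
int defaultPshared() {
  pthread_mutexattr_t Attr;
  if (pthread_mutexattr_init(&Attr) != 0)
    return -1;
  int Pshared = -1;
  int RC = pthread_mutexattr_getpshared(&Attr, &Pshared);
  pthread_mutexattr_destroy(&Attr);
  return RC == 0 ? Pshared : -1;
}
```

On a POSIX-conforming system, defaultPshared() should compare equal to PTHREAD_PROCESS_PRIVATE, which is why the explicit pthread_mutexattr_setpshared() call could be removed.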

MC: Clean up error paths in AsmParser::parseMacroArgument · 1625245b

David Majnemer authored Jan 29, 2014

Use an RAII object instead of inserting a call to
AsmLexer::setSkipSpace(true) in all error paths.

No functional change.

llvm-svn: 200358

1625245b
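The RAII cleanup described in the commit above can be sketched roughly as follows. This is not the code from the patch: ToyLexer, SkipSpaceGuard and parseArgument are hypothetical stand-ins, and the assumption that the guard turns space skipping off on entry is mine. Only the setSkipSpace(true) call on every exit path comes from the commit message.

```cpp
#include <cassert>

// Hypothetical stand-in for a lexer with a skip-space flag.
struct ToyLexer {
  bool SkipSpace = true;
  void setSkipSpace(bool V) { SkipSpace = V; }
};

// RAII guard: disables space skipping on construction (an assumption) and
// re-enables it when the scope ends, whichever return statement is taken.
class SkipSpaceGuard {
  ToyLexer &Lexer;

public:
  explicit SkipSpaceGuard(ToyLexer &L) : Lexer(L) { Lexer.setSkipSpace(false); }
  ~SkipSpaceGuard() { Lexer.setSkipSpace(true); }
  SkipSpaceGuard(const SkipSpaceGuard &) = delete;
  SkipSpaceGuard &operator=(const SkipSpaceGuard &) = delete;
};

// Toy parse routine with several early returns. It returns true on error,
// and no return path has to restore the flag by hand.
bool parseArgument(ToyLexer &Lexer, int Token) {
  SkipSpaceGuard Guard(Lexer);
  if (Token < 0)
    return true; // first error path
  if (Token == 0)
    return true; // second error path
  return false;  // success
}
```

Because the destructor runs on every exit, adding a new error path later cannot forget the restore call.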

Make createObjectFile's signature a bit less error prone. · c3ceeb6f

Rafael Espindola authored Jan 29, 2014

This will be better with C++11, but right now file_magic converts to bool,
which makes the API really easy to misuse.

llvm-svn: 200357

c3ceeb6f

[Sparc] Fix breakage in r200345 · a86694ca

David Woodhouse authored Jan 28, 2014

Oops. Don't do build tests on patches like that with --enable-targets=x86_64

llvm-svn: 200355

a86694ca

Delete MCSubtargetInfo data members from target MCCodeEmitter classes · d2cca113

David Woodhouse authored Jan 28, 2014

The subtarget info is explicitly passed to the EncodeInstruction
method and we should use that subtarget info to influence any
encoding decisions.

llvm-svn: 200350

d2cca113

Propagate MCSubtargetInfo through TableGen's getBinaryCodeForInstr() · 3fa98a65

David Woodhouse authored Jan 28, 2014

llvm-svn: 200349

3fa98a65

Explicitly pass MCSubtargetInfo to MCCodeEmitter::EncodeInstruction() · 9784cef3

David Woodhouse authored Jan 28, 2014

llvm-svn: 200348

9784cef3

Keep the MCSubtargetInfo in the MCRelaxableFragment class. · f5199f68

David Woodhouse authored Jan 28, 2014

Needed to fix PR18303 to correctly re-encode the instruction if it
is relaxed.

We keep a copy of the MCSubtargetInfo to make sure that we are not
affected by future changes to the subtarget info coming from the
assembler (e.g. when parsing .code 16 directives).

llvm-svn: 200347

f5199f68

Modify MCObjectStreamer EmitInstTo* interface · 6f3c73f7

David Woodhouse authored Jan 28, 2014

Add MCSubtargetInfo parameter
virtual void EmitInstToFragment(const MCInst &Inst, const MCSubtargetInfo &);
virtual void EmitInstToData(const MCInst &Inst, const MCSubtargetInfo &);

llvm-svn: 200346

6f3c73f7

Change MCStreamer EmitInstruction interface to take subtarget info · e6c13e4a

David Woodhouse authored Jan 28, 2014

llvm-svn: 200345

e6c13e4a

Jan 28, 2014

Add line table debug info to COFF files when using a win32 triple. · 2c659648

Timur Iskhodzhanov authored Jan 28, 2014

Reviewed at http://llvm-reviews.chandlerc.com/D2232

llvm-svn: 200340

2c659648

[mips] Fix ELF header flags. · 2e03f243

Matheus Almeida authored Jan 28, 2014

Unlike GCC/GAS, the default ABI for Mips64 is n64.
The compatibility bit should be set if the o32 ABI is used when targeting Mips64.

llvm-svn: 200332

2e03f243

[NVPTX] Fix emitting aggregate parameters · 2c283400

Gautam Chakrabarti authored Jan 28, 2014

The code was missing the case for aggregate parameters and
hence was emitting them as .b0 type. Also fixed a couple
of comments.

llvm-svn: 200325

2c283400

[X86] Add extra rules for combining vselect dag nodes into movsd. · 2ea61f17

Andrea Di Biagio authored Jan 28, 2014

This improves the fix committed at revision 199683 adding the
following new target specific combine rules:

1) fold (v4i32: vselect <0,0,-1,-1>, A, B) ->
        (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) ))

2) fold (v4f32: vselect <0,0,-1,-1>, A, B) ->
        (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) ))

3) fold (v4i32: vselect <-1,-1,0,0>, A, B) ->
        (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) ))

4) fold (v4f32: vselect <-1,-1,0,0>, A, B) ->
        (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) ))

llvm-svn: 200324

2ea61f17
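Why the folds in the commit above are valid can be seen with a lane-level model. The sketch below is an illustration only, not LLVM code: it treats a 128-bit register as four 32-bit lanes, assumes that an all-ones mask lane selects the first vselect operand, and models movsd as taking the low 64 bits from its second operand and the high 64 bits from its first. The names V4, vselect and movsdModel are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4 = std::array<std::int32_t, 4>;

// Lane-wise select: a non-zero (all-ones) mask lane picks A, a zero lane
// picks B.
V4 vselect(const V4 &Mask, const V4 &A, const V4 &B) {
  V4 R{};
  for (int I = 0; I < 4; ++I)
    R[I] = Mask[I] ? A[I] : B[I];
  return R;
}

// Model of movsd on four 32-bit lanes: the low 64 bits (lanes 0 and 1)
// come from Src, the high 64 bits (lanes 2 and 3) come from Dst.
V4 movsdModel(const V4 &Dst, const V4 &Src) {
  return {Src[0], Src[1], Dst[2], Dst[3]};
}
```

Under this model, vselect with mask <0,0,-1,-1> on (A, B) gives the same lanes as movsdModel(A, B), and mask <-1,-1,0,0> gives the same lanes as movsdModel(B, A). This matches the operand order in rules 1 and 3; the bitcasts in the rules only change how the same 128 bits are typed.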

typo · c67655a7

Adrian Prantl authored Jan 28, 2014

llvm-svn: 200323

c67655a7

Fix pr14893. · ab73c493

Rafael Espindola authored Jan 28, 2014

When simplifycfg moves an instruction, it must drop any metadata that it
doesn't know is still valid under the changed preconditions. In particular,
it must drop the range and tbaa metadata.

The patch implements this with a utility function to drop all metadata not
in a white list.

llvm-svn: 200322

ab73c493
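The "drop everything not in a white list" idea from the commit above can be sketched with standard containers. This is a toy model, not the LLVM utility: MetadataMap stands in for the metadata attached to an instruction, and dropMetadataNotIn is a hypothetical name. Which kinds belong in the real white list is not stated in the commit message, so the caller supplies the list.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction's metadata: kind name -> payload.
using MetadataMap = std::map<std::string, std::string>;

// Erase every entry whose kind is not listed in KnownKinds.
void dropMetadataNotIn(MetadataMap &MD,
                       const std::vector<std::string> &KnownKinds) {
  for (auto It = MD.begin(); It != MD.end();) {
    bool Keep = std::find(KnownKinds.begin(), KnownKinds.end(), It->first) !=
                KnownKinds.end();
    // erase() returns the iterator following the removed element.
    It = Keep ? std::next(It) : MD.erase(It);
  }
}
```

The benefit of an allow-list over a deny-list is that a metadata kind added later is dropped by default, so a pass that has never heard of it cannot keep it alive by accident.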

[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. · b6d39afb

Andrea Di Biagio authored Jan 28, 2014

Make sure that we don't introduce illegal build_vector dag nodes
when trying to fold a sign_extend of a build_vector.

This fixes a regression introduced by r200234.
Added test CodeGen/X86/fold-vector-sext-crash.ll
to verify that llc no longer crashes with an assertion failure
due to an illegal build_vector of type MVT::v4i64.

Thanks to Ilia Filippov for spotting this regression and for
providing a reproducible test case.

llvm-svn: 200313

b6d39afb

Provide a stub Target Streamer implementation for PPC MachO · 625b65a9

Iain Sandoe authored Jan 28, 2014

At present, this handles .tc (error) and needs to be expanded to deal properly with .machine

llvm-svn: 200309

625b65a9

[vectorizer] Completely disable the block frequency guidance of the loop · b7836285

Chandler Carruth authored Jan 28, 2014

vectorizer, placing it behind an off-by-default flag.

It turns out that block frequency isn't what we want at all, here or
elsewhere. This has been I think a nagging feeling for several of us
working with it, but Arnold has given some really nice simple examples
where the results are so comprehensively wrong that they aren't useful.

I'm planning to email the dev list with a summary of why it's not really
useful and a couple of ideas about how to better structure these types
of heuristics.

llvm-svn: 200294

b7836285

Handle spilling the PPC GPRC_NOR0 register class · 4e703bce

Hal Finkel authored Jan 28, 2014

GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo
register). As a result, we also need to check for it in the spilling code.

llvm-svn: 200288

4e703bce

MC: Add a .debug section that we'll soon use to emit debug info into COFF files · 31377c54

Timur Iskhodzhanov authored Jan 28, 2014

llvm-svn: 200285

31377c54

R600/SI: Add pattern for truncating i32 to i1 · bf1a6410

Michel Danzer authored Jan 28, 2014



Fixes half a dozen piglit tests with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 200283

bf1a6410

Fix the DWARF EH encodings for Sparc PIC code. · 83c67735

Jakob Stoklund Olesen authored Jan 28, 2014

Also emit the stubs that were generated for references to typeinfo
symbols.

llvm-svn: 200282

83c67735

Update optimization passes to handle inalloca arguments · 26af2cae

Reid Kleckner authored Jan 28, 2014

Summary:
I searched Transforms/ and Analysis/ for 'ByVal' and updated those call
sites to check for inalloca if appropriate.

I added tests for any change that would allow an optimization to fire on
inalloca.

Reviewers: nlewycky

Differential Revision: http://llvm-reviews.chandlerc.com/D2449

llvm-svn: 200281

26af2cae

x86: add implicit defs for cpuid · b2340d4c

Reid Kleckner authored Jan 28, 2014

This avoids miscompiling MS inline asm in LLVM where we have to infer
clobbers.  Test case forthcoming in Clang.

llvm-svn: 200279

b2340d4c

[LPM] Fix PR18616 where the shifts to the loop pass manager to extract · d84f776e

Chandler Carruth authored Jan 28, 2014

LCSSA from it caused a crasher with the LoopUnroll pass.

This crasher is really nasty. We destroy LCSSA form in a surprising way.
When unrolling a loop into an outer loop, we not only need to restore
LCSSA form for the outer loop, but for all children of the outer loop.
This is somewhat obvious in retrospect, but hey!

While this seems pretty heavy-handed, it's not that bad. Fundamentally,
we only do this when we unroll a loop, which is already a heavyweight
operation. We're unrolling all of these hypothetical inner loops as
well, so their size and complexity is already on the critical path. This
is just adding another pass over them to re-canonicalize.

I have a test case from PR18616 that is great for reproducing this, but
pretty useless to check in as it relies on many 10s of nested empty
loops that get unrolled and deleted in just the right order. =/ What's
worse is that investigating this has exposed another source of failure
that is likely to be even harder to test. I'll try to come up with test
cases for these fixes, but I want to get the fixes into the tree first
as they're causing crashes in the wild.

llvm-svn: 200273

d84f776e

[TLI] Add a new hook to TargetLowering to query the target if a load of a... · 659ce00d

Juergen Ributzka authored Jan 28, 2014

[TLI] Add a new hook to TargetLowering to query the target if a load of a constant should be converted to simply the constant itself.

Before this patch we used getIntImmCost from TargetTransformInfo to determine if
a load of a constant should be converted to just a constant, but the threshold
for this was set to an arbitrary value. This value works well for the two
targets (X86 and ARM) that implement this target-hook, but it isn't
target-independent at all.

Now targets have the possibility to decide directly if this optimization should
be performed. The default value is set to false to preserve the current
behavior. The target hook has been moved to TargetLowering, which removed the
last use and need of TargetTransformInfo in SelectionDAG.

llvm-svn: 200271

659ce00d

LoopVectorize: Support conditional stores by scalarizing · 18865db3

Arnold Schwaighofer authored Jan 28, 2014

The vectorizer takes a loop like this and widens all instructions except for the
store. The stores are scalarized/unrolled and hidden behind an "if" block.

  for (i = 0; i < 128; ++i) {
    if (a[i] < 10)
      a[i] += val;
  }

  for (i = 0; i < 128; i+=2) {
    v = a[i:i+1];
    v0 = (extract v, 0) + val;
    v1 = (extract v, 1) + val;
    if ((extract v, 0) < 10)
      a[i] = v0;
    if ((extract v, 1) < 10)
      a[i+1] = v1;
  }

The vectorizer relies on subsequent optimizations to sink instructions into the
conditional block where they are anticipated.

The flag "vectorize-num-stores-pred" controls whether and how many stores to
handle this way. Vectorization of conditional stores is disabled by default for
now.

This patch also adds a change to the heuristic when the flag
"enable-loadstore-runtime-unroll" is enabled (off by default). It unrolls small
loops until load/store ports are saturated. This heuristic uses TTI's
getMaxUnrollFactor as a measure for load/store ports.

I also added a second flag, -enable-cond-stores-vec. It enables vectorization
of conditional stores, but there is no cost model for vectorization of
conditional stores in place yet, so it will not do much good at the moment.

rdar://15892953

Results for x86-64 -O3 -mavx +/- -mllvm -enable-loadstore-runtime-unroll
-vectorize-num-stores-pred=1 (before the BFI change):

 Performance Regressions:
   Benchmarks/Ptrdist/yacr2/yacr2 7.35% (maze3() is identical but 10% slower)
   Applications/siod/siod         2.18%
 Performance Improvements:
   mesa                          -4.42%
   libquantum                    -4.15%

 With a patch that slightly changes the register heuristics (by subtracting the
 induction variable on both sides of the register pressure equation, as the
 induction variable is probably not really unrolled):

 Performance Regressions:
   Benchmarks/Ptrdist/yacr2/yacr2  7.73%
   Applications/siod/siod          1.97%

 Performance Improvements:
   libquantum                    -13.05% (we now also unroll quantum_toffoli)
   mesa                           -4.27%

llvm-svn: 200270

18865db3
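The transformation described in the commit above can be modelled by hand in plain C++. The sketch below is not vectorizer output: it assumes a vectorization factor of 2, represents the two vector lanes as small arrays, and uses made-up names (Buffer, makeInput, scalarLoop, predicatedStoreLoop). The load, compare and add are done on both lanes at once, and only the stores are scalarized, each behind its own "if".

```cpp
#include <array>
#include <cassert>

constexpr int N = 128;
using Buffer = std::array<int, N>;

// Deterministic test input with values on both sides of the threshold 10.
Buffer makeInput() {
  Buffer A{};
  for (int I = 0; I < N; ++I)
    A[I] = I % 20;
  return A;
}

// The original scalar loop from the commit message.
void scalarLoop(Buffer &A, int Val) {
  for (int I = 0; I < N; ++I) {
    if (A[I] < 10)
      A[I] += Val;
  }
}

// Hand-written model of the widened loop with predicated scalar stores.
void predicatedStoreLoop(Buffer &A, int Val) {
  for (int I = 0; I < N; I += 2) {
    // "Vector" load, compare and add on two lanes.
    int Lane[2] = {A[I], A[I + 1]};
    bool Mask[2] = {Lane[0] < 10, Lane[1] < 10};
    int Sum[2] = {Lane[0] + Val, Lane[1] + Val};
    // Scalarized stores, each guarded by its own lane of the mask.
    if (Mask[0])
      A[I] = Sum[0];
    if (Mask[1])
      A[I + 1] = Sum[1];
  }
}
```

Running both loops on the same input leaves identical buffers, which is the property the vectorizer has to preserve. Note that the adds are computed for every lane whether or not the store happens; that is why the commit relies on later passes to sink instructions into the conditional blocks.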

Revert r199871 and replace it with a simple check in the debug info · 2037caf8

Eric Christopher authored Jan 28, 2014

code to see if we're emitting a function into a non-default
text section. This is still a less-than-ideal solution, but more
contained than r199871 to determine whether or not we're emitting
code into an array of comdat sections.

llvm-svn: 200269

2037caf8

Reformat slightly. · f07ee3ae

Eric Christopher authored Jan 27, 2014

llvm-svn: 200264

f07ee3ae

PGO branch weight: keep halving the weights until they can fit into · f1cb16e4

Manman Ren authored Jan 27, 2014

uint32.

When folding branches to a common destination, the updated branch weights
can exceed uint32 by more than a factor of 2. We should keep halving the
weights until they can fit into uint32.

llvm-svn: 200262

f1cb16e4
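The halving loop described in the commit above might look roughly like this. It is a minimal sketch with a hypothetical function name (fitWeights), not the code from the patch: weights are held as 64-bit values and all of them are halved together until the largest fits into 32 bits, so their ratios are approximately preserved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Halve all weights together until the largest one fits into uint32_t.
void fitWeights(std::vector<std::uint64_t> &Weights) {
  const std::uint64_t Max = std::numeric_limits<std::uint32_t>::max();
  auto AnyTooBig = [&] {
    return std::any_of(Weights.begin(), Weights.end(),
                       [&](std::uint64_t W) { return W > Max; });
  };
  // A single halving is not enough when a weight exceeds the limit by
  // more than a factor of 2, so repeat until everything fits.
  while (AnyTooBig())
    for (std::uint64_t &W : Weights)
      W /= 2;
}
```

For example, the weights {2^34, 8} need three halvings and end up as {2^31, 1}; stopping after one halving would leave 2^33, which still does not fit.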

Jan 27, 2014

Fix the "#ifndef HAVE_SYS_WAIT_H" code path in Program.inc to compile · 8d5b0e2f

Mark Seaborn authored Jan 27, 2014

Without this fix, WaitResult is not defined.

llvm-svn: 200259

8d5b0e2f

ARM MC: Fix the initial DWARF CFI unwind info at the start of a function · ba86cf51

Mark Seaborn authored Jan 27, 2014

This brings MC into line with GNU 'as' on ARM, and it brings the ARM
target into line with most other LLVM targets, which declare the
initial CFI state with addInitialFrameState().

Without this, functions generated with .cfi_startproc/endproc on ARM
will tend to cause GDB to abort with:
  gdb/dwarf2-frame.c:1132: internal-error: Unknown CFA rule.

I've also tested this by comparing the output of "readelf -w" on the
object files produced by llvm-mc and gas when given the .s file added
here.

This change is part of addressing PR18636.

Differential Revision: http://llvm-reviews.chandlerc.com/D2597

llvm-svn: 200255

ba86cf51

Fix sext(setcc) -> select_cc using wrong type for setcc. · 5f2a92a2

Matt Arsenault authored Jan 27, 2014

Also update the comment, since it actually produces a
select (setcc) instead of select_cc.

It was checking and using the setcc result type for the
type of the sext, instead of the type of the compared items.

In my problem case, the sext was to i32 and was used as the setcc type,
but the expected type was i64.

No test since I haven't been able to hit the problem with
this on any in-tree targets.

llvm-svn: 200249

5f2a92a2

Fix unsupported addressing mode assertion for pld · b76f55f7

David Peixotto authored Jan 27, 2014

Summary:
This commit gives an address mode to the PLD instruction. We
were getting an assertion failure in the frame lowering code
because we had code that was doing a pld of a stack allocated
address. The frame lowering was checking the address mode and
then asserting because pld had none defined.

This commit fixes pld for arm mode. There was a previous fix for
thumb mode in a separate commit. The commit for thumb mode
added a test in a separate file because it would otherwise fail
for arm. This commit moves the thumb test back into the prefetch.ll
file and adds the corresponding arm test.

Differential Revision: http://llvm-reviews.chandlerc.com/D2622

llvm-svn: 200248

b76f55f7

test commit: add minor comment · 35bd952a

Gautam Chakrabarti authored Jan 27, 2014

llvm-svn: 200244

35bd952a

[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. · f09a3577

Andrea Di Biagio authored Jan 27, 2014

This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when
the input operand is a build vector of constants (or UNDEFs).

The inability to fold a sext/zext of a constant build_vector was the root
cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support.

Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a
ConstantSDNode.

llvm-svn: 200234

f09a3577

MC: Add support for .cfi_startproc simple · e035cf9c

David Majnemer authored Jan 27, 2014

This commit allows LLVM MC to process .cfi_startproc directives when
they are followed by an additional `simple' identifier. This signals to
elide the emission of target specific CFI instructions that would
normally occur initially.

This fixes PR16587.

Differential Revision: http://llvm-reviews.chandlerc.com/D2624

llvm-svn: 200227

e035cf9c