Commits · fc165f143454813dc062c98534763593261307b5 · Roger Ferrer / llvm-epi

Mar 05, 2015

Instructions: Use delegated constructors to reduce duplication · fc165f14
Benjamin Kramer authored Mar 05, 2015
```
NFC.

llvm-svn: 231411
```
fc165f14
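
For readers unfamiliar with the technique named in the commit title, here is a minimal sketch of C++11 delegating constructors. The `Operation` class and its members are hypothetical and are not taken from LLVM's instruction classes; the point is only that one primary constructor owns the member initialization and the convenience overloads forward to it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration; not part of LLVM.
class Operation {
public:
  // Primary constructor: the only place where members are initialized.
  Operation(std::string Name, unsigned NumOperands, bool IsVolatile)
      : Name(std::move(Name)), NumOperands(NumOperands),
        IsVolatile(IsVolatile) {
    assert(!this->Name.empty() && "an operation needs a name");
  }

  // Convenience overloads delegate instead of repeating the member list.
  Operation(std::string Name, unsigned NumOperands)
      : Operation(std::move(Name), NumOperands, /*IsVolatile=*/false) {}

  explicit Operation(std::string Name)
      : Operation(std::move(Name), /*NumOperands=*/0) {}

  const std::string &getName() const { return Name; }
  unsigned getNumOperands() const { return NumOperands; }
  bool isVolatile() const { return IsVolatile; }

private:
  std::string Name;
  unsigned NumOperands;
  bool IsVolatile;
};
```

Because every overload funnels into the three-argument constructor, a new member or invariant only has to be handled in one place.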

[AVX] Lower / fast-isel scalar FP selects into VBLENDV instructions (PR22483) · 302404b2

Sanjay Patel authored Mar 05, 2015

This patch reduces code size for all AVX targets and increases speed for some chips.

SSE 4.1 introduced the useless (see code comments) 2-register form of BLENDV and
only in the packed float/double flavors.

AVX subsequently made the instruction useful by adding a 4-register operand form.

So we just need to paper over the lack of scalar forms of this instruction, complicate
the code to choose float or double forms, and use blendv on scalars since all FP is in
xmm registers anyway.

This gives us an approximately 50% speedup for a blendv microbenchmark sequence
on SandyBridge and Haswell:
blendv : 29.73 cycles/iter
logic : 43.15 cycles/iter

No new test cases with this patch because:

1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel
2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX
3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants.

http://llvm.org/bugs/show_bug.cgi?id=22483

Differential Revision: http://reviews.llvm.org/D8063

llvm-svn: 231408

302404b2
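
The difference between the "logic" and "blendv" sequences measured above can be modeled in portable C++. This is only a sketch of the semantics, not of the code the patch emits: the function names are made up for illustration, and the 32-bit integer mask stands in for the all-ones or all-zeros result that a scalar FP compare leaves in an xmm register.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable bit casts between float and its 32-bit representation.
std::uint32_t toBits(float F) {
  std::uint32_t U;
  std::memcpy(&U, &F, sizeof U);
  return U;
}
float fromBits(std::uint32_t U) {
  float F;
  std::memcpy(&F, &U, sizeof F);
  return F;
}

// Models a scalar FP compare: all ones when true, all zeros when false.
std::uint32_t compareLess(float A, float B) {
  return A < B ? 0xFFFFFFFFu : 0u;
}

// "logic" model: and + andn + or, three bitwise operations per select.
float selectWithLogic(std::uint32_t Mask, float T, float F) {
  assert((Mask == 0u || Mask == 0xFFFFFFFFu) && "expected a compare mask");
  return fromBits((Mask & toBits(T)) | (~Mask & toBits(F)));
}

// "blendv" model: a single operation that picks an operand by looking
// at the sign bit of the mask element.
float selectWithBlend(std::uint32_t Mask, float T, float F) {
  return (Mask >> 31) ? T : F;
}
```

Both helpers return the same value for any compare mask; the blend form simply needs one operation where the logic form needs three.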

SelectionDAGBuilder: Merge 3 copies of the limited precision exp2 emission code. · fb0abceb
Benjamin Kramer authored Mar 05, 2015
```
NFC intended.

llvm-svn: 231406
```
fb0abceb
Fix uninitialized memory references in WinEHPrepare · 05ee8bd4
Andrew Kaylor authored Mar 05, 2015
```
llvm-svn: 231405
```
05ee8bd4

SDAG: Merge the meat of two ExpandAtomic implementations. · c54c38e0

Benjamin Kramer authored Mar 05, 2015

The copies have already diverged; don't let them become any worse. Reduce
redundancy in code with a little macro metaprogramming.

llvm-svn: 231401

c54c38e0

[AArch64] Teach AsmPrinter about GlobalAddress operands. · 1b67630c
Ahmed Bougacha authored Mar 05, 2015
```
Fixes PR22761, rdar://20024866.
Differential Revision: http://reviews.llvm.org/D8042

llvm-svn: 231400
```
1b67630c

[RewriteStatepointsForGC] Add additional tests around relocation · 03ea8642

Philip Reames authored Mar 05, 2015

These are focused around the actual relocation rewriting itself, not the rest of the infrastructure.

llvm-svn: 231399

03ea8642

Use the correct func begin symbol in all places in ppc. · 092b619e
Rafael Espindola authored Mar 05, 2015
```
I missed an occurrence of the old symbol in my previous patch.

llvm-svn: 231398
```
092b619e

TableGen: Initialize ErrorInfo to ~0ULL in the MatchInstructionImpl · 5698d633

Tom Stellard authored Mar 05, 2015

This is what all the targets check for and is consistent with the
initialized value of MissingFeatures, which is sometimes assigned
to ErrorInfo.

llvm-svn: 231397

5698d633
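
A rough sketch of the sentinel convention described above, assuming a much simplified matcher. `matchInstruction`, `NoErrorInfo`, and the integer operand model are hypothetical stand-ins; the generated MatchInstructionImpl is considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the result codes of a generated matcher.
enum MatchResult { Match_Success, Match_InvalidOperand, Match_MnemonicFail };

// Sentinel meaning "no operand index was recorded".
constexpr std::uint64_t NoErrorInfo = ~0ULL;

// Operands are modeled as ints; a negative value stands for an operand
// that fails to match.
MatchResult matchInstruction(const std::vector<int> &Operands,
                             std::uint64_t &ErrorInfo) {
  // Initialize up front so every early return leaves a defined value.
  ErrorInfo = NoErrorInfo;
  if (Operands.empty())
    return Match_MnemonicFail; // ErrorInfo stays at the sentinel
  for (std::uint64_t I = 0; I != Operands.size(); ++I) {
    if (Operands[I] < 0) {
      ErrorInfo = I; // index of the offending operand
      return Match_InvalidOperand;
    }
  }
  return Match_Success;
}
```

A caller can then compare `ErrorInfo` against `~0ULL` to decide whether an operand location is available for the diagnostic.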

[ARM] Enable vector extload combine for legal types. · 4200cc95

Ahmed Bougacha authored Mar 05, 2015

This commit enables forming vector extloads for ARM.
It only does so for legal types, and when we can't fold the extension
in a wide/long form of the user instruction.

Enabling it for larger types isn't as good an idea on ARM as it is on
X86, because: 
- we pretend that extloads are legal, but end up generating vld+vmov
- we have instructions like vld {dN, dM}, which can't be generated
  when we "manually expand" extloads to vld+vmov.

For legal types, the combine doesn't fire that often: in the
integration tests only in a big endian testcase, where it removes a
pointless AND.

Related to rdar://19723053
Differential Revision: http://reviews.llvm.org/D7423

llvm-svn: 231396

4200cc95

Replace PrintStackTrace(FILE*) with PrintStackTrace(raw_ostream&) · cd132c9b

Zachary Turner authored Mar 05, 2015

This will be followed by a change on the clang side to update
the only user of this function with the new version.

Differential Revision: http://reviews.llvm.org/D8074
Reviewed By: Reid Kleckner

llvm-svn: 231392

cd132c9b

Remove accidental errs() call in Verifier · 286b1007
Reid Kleckner authored Mar 05, 2015
```
llvm-svn: 231391
```
286b1007
Use the generic Lfunc_begin label on ppc. · 86bd6a12
Rafael Espindola authored Mar 05, 2015
```
This removes yet another custom label to mark the start of a function.

llvm-svn: 231390
```
86bd6a12
X86: Optimize address mode matching for FRAME_ALLOC_RECOVER nodes · 71b9b6be
David Majnemer authored Mar 05, 2015
```
We know that the absolute symbol will be less than 2GB and thus will
always fit.

llvm-svn: 231389
```
71b9b6be
Revert busted CallSite change from r231386 · caf7444b
Reid Kleckner authored Mar 05, 2015
```
llvm-svn: 231388
```
caf7444b

Silence -Wmissing-braces warning from clang-cl · e658058c

Reid Kleckner authored Mar 05, 2015

The first element of STACKFRAME64 is a struct and Clang wants us to put
braces around its initialization. Instead, drop the zero. The result
should be the same.

llvm-svn: 231387

e658058c
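
The warning and the fix can be reproduced with a small stand-in. `Address` and `Frame` below are hypothetical types that only mimic the relevant shape of STACKFRAME64 (a struct whose first member is itself a struct); they are not the Windows definitions.

```cpp
#include <cassert>

// Hypothetical stand-ins; only the nesting matters here.
struct Address {
  unsigned long long Offset;
  unsigned short Segment;
  int Mode;
};

struct Frame {
  Address PC; // first member is a struct
  Address Stack;
  int Far;
};

Frame makeZeroedFrame() {
  // Frame F = {0}; // the 0 lands in PC.Offset without inner braces,
  //                // which is what -Wmissing-braces complains about
  Frame F = {}; // empty braces value-initialize every member
  return F;
}
```

Either spelling zeroes the whole object, which is why dropping the zero should not change the result.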

Replace llvm.frameallocate with llvm.frameescape · cfb9ce53

Reid Kleckner authored Mar 05, 2015

Turns out it's pretty straightforward and simplifies the implementation.

Reviewers: andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D8051

llvm-svn: 231386

cfb9ce53

Revert r231276 (including r231277): Add a lock() function in PassRegistry to... · 8c76e669

Erik Eckstein authored Mar 05, 2015

Revert r231276 (including r231277): Add a lock() function in PassRegistry to speed up multi-thread synchronization.

llvm-svn: 231385

8c76e669

[Windows] Implement PrintStackTrace(FILE*) · 62b7b617

Zachary Turner authored Mar 05, 2015

llvm::sys::PrintStackTrace(FILE*) is supposed to print a backtrace
of the current thread given the current PC.  This function was
unimplemented on Windows, and instead the only time we could
print a backtrace was as the result of an exception through
LLVMUnhandledExceptionFilter.

This patch implements backtracing of self by using
RtlCaptureContext to get a CONTEXT for the current thread, and
moving the printing and StackWalk64 code to a common method that
both printing our own stack trace and printing the stack trace of an
exception can use.

Differential Revision: http://reviews.llvm.org/D8068
Reviewed by: Reid Kleckner

llvm-svn: 231382

62b7b617

[DagCombiner] Allow shuffles to merge through bitcasts · 7189084b

Simon Pilgrim authored Mar 05, 2015

Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening).

This patch allows a single-input shuffle to peek through bitcasts and, if the input is another shuffle, merge the two, shuffling using the smallest-sized type and re-applying the bitcasts at the inputs and output instead.

Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow.

Differential Revision: http://reviews.llvm.org/D7939

llvm-svn: 231380

7189084b
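
The mask arithmetic behind such a merge can be sketched as follows, assuming single-input shuffles and masks stored as plain integer vectors. `scaleMask` and `mergeShuffles` are illustrative helpers, not the DAG combiner's own functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mask = std::vector<int>; // -1 marks an undefined lane

// Rewrite a mask over wide elements as a mask over elements that are
// Scale times narrower, so both shuffles use the smallest element type.
Mask scaleMask(const Mask &M, int Scale) {
  assert(Scale >= 1 && "scale must be positive");
  Mask Out;
  for (int Idx : M)
    for (int J = 0; J < Scale; ++J)
      Out.push_back(Idx < 0 ? -1 : Idx * Scale + J);
  return Out;
}

// Compose two single-input shuffles: result lane i reads the lane of the
// original input that Inner selected for position Outer[i].
Mask mergeShuffles(const Mask &Inner, const Mask &Outer) {
  assert(Inner.size() == Outer.size() && "masks must use the same type");
  Mask Out;
  for (int Idx : Outer)
    Out.push_back(Idx < 0 ? -1 : Inner[static_cast<std::size_t>(Idx)]);
  return Out;
}
```

For example, a four-lane inner mask `{1, 0, 3, 2}` followed by a two-lane outer mask `{1, 0}` on the bitcast value scales the outer mask to `{2, 3, 0, 1}` and merges to `{3, 2, 1, 0}`, a single shuffle on the narrower type.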

FileCheck: Document CHECK-SAME, follow-up to r230612 · cffbbe92
Duncan P. N. Exon Smith authored Mar 05, 2015
```
llvm-svn: 231379
```
cffbbe92

While reviewing the changes to Clang to add builtin support for the vsld,... · e48b1e1c

Kit Barton authored Mar 05, 2015

While reviewing the changes to Clang to add builtin support for the vsld, vsrd, and vsrad instructions, it was pointed out that the builtins are generating the LLVM opcodes (shl, lshr, and ashr), not calls to the intrinsics. This patch changes the implementation of the vsld, vsrd, and vsrad instructions from intrinsics to VXForm_1 instructions and makes them legal with P8 Altivec. It also removes the definition of the int_ppc_altivec_vsld, int_ppc_altivec_vsrd, and int_ppc_altivec_vsrad intrinsics.

llvm-svn: 231378

e48b1e1c

Revert change r231366 as it broke clang-native-arm-cortex-a9 Analysis/properties.m test. · 8d0851f5
Igor Laevsky authored Mar 05, 2015
```
llvm-svn: 231374
```
8d0851f5

AVX-512, SKX: Enabled masked_load/store operations for this target. · de05f10d

Elena Demikhovsky authored Mar 05, 2015

Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors;
this is needed to pass all masked_memop.ll tests for SKX.

llvm-svn: 231371

de05f10d

Fix -Woverflow warning in unittest. · 0d94ef9b
Frederic Riss authored Mar 05, 2015
```
llvm-svn: 231368
```
0d94ef9b

Teach lowering to correctly handle invoke statepoint and gc results tied to... · 1725997f

Igor Laevsky authored Mar 05, 2015

Teach lowering to correctly handle invoke statepoint and gc results tied to them. Note that we still cannot lower gc.relocates for invoke statepoints.
It also extracts the getCopyFromRegs helper function in SelectionDAGBuilder, as we need to be able to customize the type of the register exported from the basic block during lowering of the gc.result.

llvm-svn: 231366

1725997f

[PBQP] Use a local bit-matrix to speedup searching an edge in the graph. · d8ed0d37

Arnaud A. de Grandmaison authored Mar 05, 2015

Build time (user time) for building llvm+clang+lldb in release mode:
 - default allocator: 9086 seconds
 - with PBQP: 9126 seconds
 - with PBQP + local bit matrix cache: 9097 seconds

llvm-svn: 231360

d8ed0d37
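
The idea of a bit matrix for edge lookup can be illustrated with a small adjacency structure. `EdgeBitMatrix` is a hypothetical helper written for this note; it does not reproduce the cache added to the PBQP allocator, only the constant-time membership test that replaces a search through a node's edge list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper modeling a symmetric adjacency bit matrix.
class EdgeBitMatrix {
public:
  explicit EdgeBitMatrix(std::size_t NumNodes)
      : N(NumNodes), Bits(NumNodes * NumNodes, false) {}

  // Record an undirected edge by setting both (A, B) and (B, A).
  void addEdge(std::size_t A, std::size_t B) {
    assert(A < N && B < N && "node id out of range");
    Bits[A * N + B] = true;
    Bits[B * N + A] = true;
  }

  // Constant-time lookup instead of scanning an adjacency list.
  bool hasEdge(std::size_t A, std::size_t B) const {
    assert(A < N && B < N && "node id out of range");
    return Bits[A * N + B];
  }

private:
  std::size_t N;
  std::vector<bool> Bits; // packed, one bit per node pair
};
```

The trade-off is memory that grows quadratically with the node count, which is presumably why the commit describes the matrix as local.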

[InstCombine] Fix an assertion when fmul has a ConstantExpr operand · bcb26d68

Michael Kuperstein authored Mar 05, 2015

isNormalFp and isFiniteNonZeroFp should not assume vector operands cannot be constant expressions.

Patch by Pawel Jurek <pawel.jurek@intel.com>
Differential Revision: http://reviews.llvm.org/D8053

llvm-svn: 231359

bcb26d68

Revert "[TableGen] Implement at least some support for multiple explicit... · 35b3dbc4

Craig Topper authored Mar 05, 2015

Revert "[TableGen] Implement at least some support for multiple explicit results in an instruction pattern. No functional change to existing patterns."

This is failing on several build bots.

llvm-svn: 231358

35b3dbc4

[TableGen] Implement at least some support for multiple explicit results in an... · 5edbf1cc

Craig Topper authored Mar 05, 2015

[TableGen] Implement at least some support for multiple explicit results in an instruction pattern. No functional change to existing patterns.

This should help with the AVX512 masked gather changes Elena is working on. This patch is derived from some of the changes Elena made to tablegen, but modified by me to support an arbitrary number of results.

llvm-svn: 231357

5edbf1cc

[TableGen] Add support for constraining a vector type in a pattern to have a... · 0be34580

Craig Topper authored Mar 05, 2015

[TableGen] Add support for constraining a vector type in a pattern to have a specific element type and for constraining a vector type to have the same number of elements as another vector type. This is useful for AVX512 mask operations so that we can relate the mask type to the type of the other arguments.

llvm-svn: 231356

0be34580

[X86] Use vmovss to handle inserting an element into index 0 of a v8f32 vector of zeros. · 0ee8470a
Craig Topper authored Mar 05, 2015
```
llvm-svn: 231354
```
0ee8470a
Remove useless break after return. · 6e56345d
Frederic Riss authored Mar 05, 2015
```
Pointed out by Paul Robinson.

llvm-svn: 231353
```
6e56345d

Add a few more performance tips · 34843ae5

Philip Reames authored Mar 05, 2015

These came from my own experience and may not apply equally to all use cases. Any alternate perspective anyone has should be used to refine these.

As always, grammar and spelling adjustments are more than welcome. Please just directly commit a fix if you see something problematic.

llvm-svn: 231352

34843ae5

Revert "[dsymutil] MSVC does generate move constructors, but it should accept to default them" · 2838f9ed

Frederic Riss authored Mar 05, 2015

This reverts commit r231350.

It turns out MSVC doesn't generate implicit move constructors and also doesn't accept defaulting them...
See for example http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc/builds/2786

llvm-svn: 231351

2838f9ed

[dsymutil] MSVC does generate move constructors, but it should accept to default them · 1e9cd291
Frederic Riss authored Mar 05, 2015
```
llvm-svn: 231350
```
1e9cd291
Add a link to the new PerformanceTips docs from the 3.7 release notes · aedd404a
Philip Reames authored Mar 05, 2015
```
llvm-svn: 231349
```
aedd404a
Revert r231324 "Remove the conditional addition of the execution dependency fixing" · 6d8e6d5e
Hans Wennborg authored Mar 05, 2015
```
See PR22799.

llvm-svn: 231348
```
6d8e6d5e

[MBP] Use range based for-loops throughout this code. Several had · 7a715dae

Chandler Carruth authored Mar 05, 2015

already been added and the inconsistency made choosing names and
changing code more annoying. Plus, wow are they better for this code!

llvm-svn: 231347

7a715dae

[MBP] NFC, run clang-format over this code and tweak things to make the · 2fc3fe12

Chandler Carruth authored Mar 05, 2015

result reasonable.

This code predated clang-format and so there was a reasonable amount of
crufty formatting that had accumulated. This should ensure that neither
myself nor others end up with formatting-only changes sneaking into
other fixes.

llvm-svn: 231341

2fc3fe12