Commits · 994fed689f252ff096f838af1039a8c3c8afc48f · Roger Ferrer / llvm-epi-0.8

Jan 12, 2012

Make SplitAnalysis::UseSlots private. · 994fed68
Jakob Stoklund Olesen authored Jan 12, 2012
```
llvm-svn: 148031
```
994fed68

After Jakob's r147938 exception handling on i386 was completely broken. · 9ece950d

Benjamin Kramer authored Jan 12, 2012

Restore the (obviously wrong) behavior from before r147938 without relying on
undefined behavior. Add a fat FIXME note.

This should fix nightly tester failures.

llvm-svn: 148030

9ece950d

Fix a bug in the AVX 256-bit shuffle code in cases where the splat element is... · 0a0a829b

Nadav Rotem authored Jan 12, 2012

Fix a bug in the AVX 256-bit shuffle code in cases where the splat element is on the boundary of two 128-bit vectors.
The attached testcase was stuck in an endless loop.

llvm-svn: 148027

0a0a829b

X86: Generalize the x << (y & const) optimization to also catch masks with... · 5b3aa60b
Benjamin Kramer authored Jan 12, 2012
```
X86: Generalize the x << (y & const) optimization to also catch masks with more bits set than 31 or 63.

llvm-svn: 148024
```
5b3aa60b
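
A minimal sketch of why such a mask can be dropped, assuming the usual x86 behaviour that a 32-bit shift by a register count only reads the low 5 bits of the count (6 bits for 64-bit shifts). The helper names `x86_shl32` and `mask_is_redundant32` are hypothetical and model the idea only; they are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit x86 SHL with a register count: the hardware only
// looks at the low 5 bits of the count.
inline uint32_t x86_shl32(uint32_t x, uint32_t count) {
  return x << (count & 31);
}

// True when an explicit `count & mask` before the shift is redundant,
// i.e. the mask keeps every one of the low 5 bits the hardware reads.
// Masks such as 63 or 255 qualify, not just 31.
inline bool mask_is_redundant32(uint32_t mask) {
  return (mask & 31) == 31;
}

// Small self-check: a redundant mask never changes the result.
inline void check_shift_mask_example() {
  assert(mask_is_redundant32(63));
  assert(x86_shl32(1, 37 & 63) == x86_shl32(1, 37));
}
```

A mask such as 15 clears a bit the hardware would read, so it has to stay: `x86_shl32(1, 24 & 15)` and `x86_shl32(1, 24)` differ.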

Add predicate method to check that the memory operand size matches, if available. · fc6be102

Devang Patel authored Jan 12, 2012

In AT&T-style asm syntax, the memory operand size is derived from the suffix attached to the mnemonic. In Intel-style asm syntax it is part of the memory operand, hence a predicate method check is required to select the appropriate instruction.

llvm-svn: 148006

fc6be102

A DenseMap of a std::map isn't a very good idea because the "grow()" method will · 58c75698

Bill Wendling authored Jan 12, 2012

need to make a deep copy of each of the std::maps. Use a std::map of the
std::map instead. This improves the compile time of sqlite3 by ~2%.

llvm-svn: 148003

58c75698
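
The cost described in r148003 can be illustrated with standard containers only. This is a sketch by analogy, not the LLVM code: `std::vector` stands in for a flat table that relocates its contents when it grows, and `CountedValue` is a hypothetical stand-in for an expensive-to-copy value such as a `std::map`. It assumes C++11 or later, where `std::map::operator[]` constructs the value in place.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Stand-in for an expensive value: it counts how often it is copied.
// Declaring the copy constructor suppresses the implicit move
// constructor, so container growth has to copy.
struct CountedValue {
  static std::size_t copies;
  CountedValue() = default;
  CountedValue(const CountedValue &) { ++copies; }
};
std::size_t CountedValue::copies = 0;

// Flat storage: every reallocation copies all existing values into the
// new buffer, which is the analogue of a deep copy in grow().
inline std::size_t copies_with_flat_storage(std::size_t n) {
  CountedValue::copies = 0;
  std::vector<CountedValue> table;
  for (std::size_t i = 0; i < n; ++i)
    table.emplace_back();
  return CountedValue::copies;
}

// Node-based storage: std::map allocates each value once and never
// relocates it when other elements are inserted.
inline std::size_t copies_with_node_storage(std::size_t n) {
  CountedValue::copies = 0;
  std::map<std::size_t, CountedValue> table;
  for (std::size_t i = 0; i < n; ++i)
    table[i];  // default-constructs the value in place
  return CountedValue::copies;
}

// Small self-check of the contrast between the two layouts.
inline void check_growth_cost() {
  assert(copies_with_flat_storage(1000) > 0);
  assert(copies_with_node_storage(1000) == 0);
}
```

The exact number of copies in the flat case depends on the growth policy of the standard library, so the sketch only checks that it is non-zero.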

Add intel style operand parser skeleton. · 46831de2
Devang Patel authored Jan 12, 2012
```
This is a work in progress.

llvm-svn: 148002
```
46831de2

Switch all of the uses of my InsertDAGNode helper to follow the exact · eb21da06

Chandler Carruth authored Jan 12, 2012

same pattern. We already had this pattern in a few places, but others
tried to make a rough approximation of an actual DAG structure. As not
everywhere went to this trouble, nothing could rely on this being done.
In fact, I've checked all references to these node Ids, and the ones
that are using the topo-sort properties are actually satisfied with
a strict-weak-ordering. The requirement appears to be that Use >= Def.

I've added a big blurb of comments to this bit of the transform to
clarify why the order is so important for the next reader of the code.

I'm starting with this change as it is very small, and trivially
reverted if something breaks or the >= above really does need to be >.
If that proves the case, we can hide the problem by reverting this
patch, but the problem exists elsewhere as well, and so a more
comprehensive solution will be needed.

llvm-svn: 148001

eb21da06
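
The ordering requirement mentioned in r148001 (Use >= Def) can be written down as a small checker. This is a sketch over a hypothetical, simplified node type; `Node` and `ids_respect_use_ge_def` are illustrative names and not part of the SelectionDAG API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified DAG node: an integer id plus the indices of
// the nodes it uses as operands.
struct Node {
  int id;
  std::vector<std::size_t> operands;  // indices into the node list
};

// Returns true when every use has an id no smaller than the id of each
// node it uses. Equal ids are allowed, matching ">=" rather than ">".
inline bool ids_respect_use_ge_def(const std::vector<Node> &nodes) {
  for (const Node &use : nodes)
    for (std::size_t def_index : use.operands)
      if (use.id < nodes[def_index].id)
        return false;
  return true;
}

// Small self-check: a use that shares its def's id is accepted, a use
// with a smaller id is rejected.
inline void check_id_ordering_example() {
  std::vector<Node> ok = {{1, {}}, {1, {0}}, {2, {0, 1}}};
  std::vector<Node> bad = {{5, {}}, {3, {0}}};
  assert(ids_respect_use_ge_def(ok));
  assert(!ids_respect_use_ge_def(bad));
}
```

If the requirement turned out to be a strict Use > Def, as the message allows for, the comparison would change from `<` to `<=`.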

Revert r147978. A DenseMap's iterators may become invalidated here. · 4ec081a4
Bill Wendling authored Jan 11, 2012
```
llvm-svn: 147980
```
4ec081a4
Make data structures private. · 20f19eb9
Jakob Stoklund Olesen authored Jan 11, 2012
```
llvm-svn: 147979
```
20f19eb9

Jan 11, 2012

Use a DenseMap. · f0275df9

Bill Wendling authored Jan 11, 2012

This appears to improve sqlite3's compile time by ~2%.

llvm-svn: 147978

f0275df9

Sink spillInterferences into RABasic. · 73edbf16
Jakob Stoklund Olesen authored Jan 11, 2012
```
This helper method is too simplistic for RAGreedy.

llvm-svn: 147976
```
73edbf16
Cleanup. · 06ec4203
Jakob Stoklund Olesen authored Jan 11, 2012
```
llvm-svn: 147975
```
06ec4203
Move RegAllocBase into its own cpp file separate from RABasic. · a818d804
Jakob Stoklund Olesen authored Jan 11, 2012
```
No functional change.

llvm-svn: 147972
```
a818d804

Re-fix the issue Bill fixed in r147899 in a slightly different way, which... · b31c627b

Eli Friedman authored Jan 11, 2012

Re-fix the issue Bill fixed in r147899 in a slightly different way, which doesn't abuse the semantics of linker_private.  We don't really want to merge any string constant with a weak_odr global.

llvm-svn: 147971

b31c627b

Fix assert. · d284c1d8
Eric Christopher authored Jan 11, 2012
```
llvm-svn: 147966
```
d284c1d8
Disable the crash reporter when running lit tests. · cd8fe08e
Argyrios Kyrtzidis authored Jan 11, 2012
```
llvm-svn: 147965
```
cd8fe08e

On AVX, we can load v8i32 at a time. The bug happens when two uneven loads are used. · b5ce6ee8

Nadav Rotem authored Jan 11, 2012

When we load the v12i32 type, the GenWidenVectorLoads method generates two loads: v8i32 and v4i32
and attempts to use CONCAT_VECTORS to join them. In this fix I concat undef values to widen
the smaller value. The test "widen_load-2.ll" also exposes this bug on AVX.

llvm-svn: 147964

b5ce6ee8

Support segmented stacks on mac. · d90466bc

Rafael Espindola authored Jan 11, 2012

This uses TLS slot 90, which actually belongs to JavaScriptCore. We only support
frames with static size.
Patch by Brian Anderson.

llvm-svn: 147960

d90466bc

Generate the segmented stack prologue for fastcc too. · 4eecacb9
Rafael Espindola authored Jan 11, 2012
```
Patch by Brian Anderson.

llvm-svn: 147958
```
4eecacb9

Revert r147945 which disabled an addressing mode transformation. I had · 3212a342

Chandler Carruth authored Jan 11, 2012

hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't, so it doesn't appear that my transform is the culprit.

If anyone else is seeing failures, please let me know!

llvm-svn: 147957

3212a342

Use unsigned comparison in segmented stack prologue. · 2b89448d

Rafael Espindola authored Jan 11, 2012

This is a comparison of two addresses, and GCC does the comparison unsigned.

Patch by Brian Anderson.

llvm-svn: 147954

2b89448d
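
A small sketch of why the signedness matters when comparing addresses, using hypothetical 32-bit values that straddle 0x80000000. The function names are illustrative; the conversion to `int32_t` assumes a two's-complement target, which holds on mainstream compilers and is guaranteed from C++20.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned comparison orders addresses by their position in memory.
inline bool below_limit_unsigned(uint32_t sp, uint32_t limit) {
  return sp < limit;
}

// Signed comparison reinterprets addresses at or above 0x80000000 as
// negative, so they sort below every address under 0x80000000.
inline bool below_limit_signed(uint32_t sp, uint32_t limit) {
  return static_cast<int32_t>(sp) < static_cast<int32_t>(limit);
}

// Small self-check with a stack pointer just above the limit.
inline void check_address_compare_example() {
  const uint32_t sp = 0x80000010u;
  const uint32_t limit = 0x7ffffff0u;
  assert(!below_limit_unsigned(sp, limit));  // correct: sp is above limit
  assert(below_limit_signed(sp, limit));     // wrong answer from signed
}
```

When both addresses lie on the same side of 0x80000000 the two comparisons agree, which is why this kind of bug can go unnoticed.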

[asan] extend the workaround for http://llvm.org/bugs/show_bug.cgi?id=11395 :... · 687d0781

Kostya Serebryany authored Jan 11, 2012

[asan] extend the workaround for http://llvm.org/bugs/show_bug.cgi?id=11395: don't instrument the function at all on x86_32 if it has a large asm blob

llvm-svn: 147953

687d0781

Explicitly set the scale to 1 on some segstack prologue instrs. · 6635ae1c
Rafael Espindola authored Jan 11, 2012
```
Patch by Brian Anderson.

llvm-svn: 147952
```
6635ae1c

The error check for using -g with a .s file already containing dwarf .file · 6223cf72

Kevin Enderby authored Jan 11, 2012

directives was in the wrong place and getting triggered incorrectly with a
cpp .file directive.  This change fixes that and adds a test case.

llvm-svn: 147951

6223cf72

Add XOP Intrinsics and tests · 21f83d9f
Jan Sjödin authored Jan 11, 2012
```
llvm-svn: 147949
```
21f83d9f

Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not... · baae7e45

Nadav Rotem authored Jan 11, 2012

Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not zero untouched elements. Use INSERT_VECTOR_ELT instead.

llvm-svn: 147948

baae7e45

Don't try to create a GEP when the pointee type is unsized (such GEPs · 0bf46b53
Duncan Sands authored Jan 11, 2012
```
are invalid).  Fixes a crash on array1.C from the GCC testsuite when
compiled with dragonegg.

llvm-svn: 147946
```
0bf46b53

Disable the transformation I added in r147936 to see if it fixes some · 9bc48e52

Chandler Carruth authored Jan 11, 2012

strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

llvm-svn: 147945

9bc48e52

Hoist a really redundant code pattern into a helper function, and delete · 3eacfb83
Chandler Carruth authored Jan 11, 2012
```
lots of lines of code. No functionality changed.

llvm-svn: 147942
```
3eacfb83
Simplify the AND-rooted mask+shift checking code to match that of the · b0049f4a
Chandler Carruth authored Jan 11, 2012
```
SRL-rooted code.

llvm-svn: 147941
```
b0049f4a

Unify the interface of the three mask+shift transform helpers, and · 3dbcda84

Chandler Carruth authored Jan 11, 2012

factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.

llvm-svn: 147940

3dbcda84

Clarify and make explicit some of the requirements for transforming · aa01e666

Chandler Carruth authored Jan 11, 2012

mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.

llvm-svn: 147939

aa01e666

Fix undefined code and reenable test case. · 60399837

Jakob Stoklund Olesen authored Jan 11, 2012

I don't think the compact encoding code is right, but at least it has
defined behavior now.

llvm-svn: 147938

60399837

Hoist the logic to transform shift+mask combinations into sub-register · 51d3076b

Chandler Carruth authored Jan 11, 2012

extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.

llvm-svn: 147937

51d3076b

Teach the X86 instruction selection to do some heroic transforms to · 55b2cdee

Chandler Carruth authored Jan 11, 2012

detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936

55b2cdee
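
The arithmetic behind r147936 can be checked directly. The sketch below uses the shift amounts from the commit message (11 and 2) and plain 32-bit integers; the function names are hypothetical, and the addressing-mode scale is only modelled as a multiplication rather than emitted as an x86 instruction.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset as written in the source: shift right by 11, then scale
// by the 4-byte element size (shift left by 2).
inline uint32_t offset_as_written(uint32_t input) {
  return (input >> 11) << 2;
}

// Canonical form: a smaller shift right plus a mask that clears the
// two low bits.
inline uint32_t offset_canonicalized(uint32_t input) {
  return (input >> 9) & ~uint32_t{3};
}

// Form the transform recovers: the longer shift right stays explicit
// and the multiply by 4 is left to the addressing-mode scale.
inline uint32_t offset_with_scale(uint32_t input) {
  const uint32_t index = input >> 11;
  const uint32_t scale = 4;  // x86 addressing modes allow 1, 2, 4 or 8
  return index * scale;
}

// Small self-check: all three forms compute the same offset.
inline void check_forms_agree(uint32_t input) {
  assert(offset_as_written(input) == offset_canonicalized(input));
  assert(offset_as_written(input) == offset_with_scale(input));
}
```

Calling `check_forms_agree` on a handful of inputs, including `0xffffffff`, exercises the boundary bits that the mask clears.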

Improved compile time: · 82165698

Stepan Dyatkovskiy authored Jan 11, 2012

1. Size heuristics changed. Now we calculate the number of unswitching
branches only once per loop.
2. Some checks were moved from UnswitchIfProfitable to
processCurrentLoop, since what they check does not change during a
processCurrentLoop iteration. This allows us to skip some loops at an
early stage.

Extended statistics:
- Added total number of instructions analyzed.
llvm-svn: 147935

82165698

Clarified the SCEV getSmallConstantTripCount interface with in-your-face comments. · e81211f4
Andrew Trick authored Jan 11, 2012
```
This interface is misleading and dangerous, but it is actually what we need for unrolling.

llvm-svn: 147926
```
e81211f4
Add big endian mips support. Based on a patch by Jack Carter. · 647841b1
Rafael Espindola authored Jan 11, 2012
```
llvm-svn: 147924
```
647841b1
Add the skeleton of an asm parser for mips. · 870c4e92
Rafael Espindola authored Jan 11, 2012
```
llvm-svn: 147923
```
870c4e92