Commits · 4a87aa0c31370f4890a4bf52acddffa380259cc4 · Roger Ferrer / llvm-epi-0.8

Nov 23, 2011

Fix a crash in block placement due to an inner loop that happened to be · 4a87aa0c

Chandler Carruth authored Nov 23, 2011

reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

llvm-svn: 145094

4a87aa0c

[asan] do not instrument threadlocal globals, this is buggy · 8b5c7a56
Kostya Serebryany authored Nov 23, 2011
```
llvm-svn: 145092
```
8b5c7a56

Nov 22, 2011

Undo test commit · bcf6a37a
Anshuman Dasgupta authored Nov 22, 2011
```
llvm-svn: 145079
```
bcf6a37a
Test commit · 9ff08947
Anshuman Dasgupta authored Nov 22, 2011
```
llvm-svn: 145078
```
9ff08947
add basic PPC register-pressure feedback; adjust the vaarg test to match the... · 6f0ae783
Hal Finkel authored Nov 22, 2011
```
add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern

llvm-svn: 145065
```
6f0ae783

More fixes to the X86InstComments for shuffle instructions. In particular add... · 83c45926

Craig Topper authored Nov 22, 2011

More fixes to the X86InstComments for shuffle instructions. In particular add AVX flavors of many instructions and fix the destination operand for some of the existing AVX entries.

llvm-svn: 145063

83c45926

Fix a devilish miscompile exposed by block placement. The · ee54feb6

Chandler Carruth authored Nov 22, 2011

updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad successor, and assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

llvm-svn: 145062

ee54feb6
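
The successor search described in this message can be sketched roughly as follows. This is a minimal model, not the actual updateTerminator code: `Block` and `findFallthroughSuccessor` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine basic block (not LLVM's real class).
struct Block {
  bool isLandingPad = false;
  std::vector<Block *> successors;
};

// Returns the unique non-landing-pad successor, or nullptr if there is none.
// Asserts if more than one candidate is found, instead of blindly assuming
// that the first successor is the fallthrough destination.
Block *findFallthroughSuccessor(const Block &bb) {
  Block *fallthrough = nullptr;
  for (Block *succ : bb.successors) {
    if (succ->isLandingPad)
      continue; // EH edges are never the fallthrough target.
    assert(fallthrough == nullptr && "multiple non-landing-pad successors");
    fallthrough = succ;
  }
  return fallthrough;
}
```

With a successor list of {landing pad, normal block}, the sketch picks the normal block even though the landing pad comes first.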

Add configure checking for pread(2) and use it to save a syscall when reading files. · e1effb0d
Benjamin Kramer authored Nov 22, 2011
```
llvm-svn: 145061
```
e1effb0d
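
A rough sketch of why pread(2) can save a syscall, assuming the call being saved is a separate lseek(2) before read(2): pread takes the file offset as an argument. `readFileRange` is a hypothetical helper, not the function changed by this commit.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Reads up to `size` bytes of `path` starting at byte `offset`.
// Returns false on an I/O error; `out` may be shorter than `size` at EOF.
bool readFileRange(const char *path, off_t offset, size_t size,
                   std::string &out) {
  assert(offset >= 0 && "offset must be non-negative");
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return false;
  out.assign(size, '\0');
  size_t done = 0;
  bool ok = true;
  while (done < size) {
    // The position is passed directly, so no lseek call is needed.
    ssize_t n = pread(fd, &out[done], size - done,
                      offset + static_cast<off_t>(done));
    if (n < 0) {
      if (errno == EINTR)
        continue; // Interrupted by a signal: retry.
      ok = false;
      break;
    }
    if (n == 0)
      break; // End of file.
    done += static_cast<size_t>(n);
  }
  close(fd);
  out.resize(ok ? done : 0);
  return ok;
}
```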

Fix an obvious omission in the SelectionDAGBuilder where we were · e2530dc8

Chandler Carruth authored Nov 22, 2011

dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060

e2530dc8

Turn error recovery into an assert. · f22623b7

Benjamin Kramer authored Nov 22, 2011

This was put in because in a certain version of DragonFlyBSD stat(2) lied about the
size of some files. This was fixed a long time ago so we can remove the workaround.

llvm-svn: 145059

f22623b7

Add triple to the test. · c55e1af1
Rafael Espindola authored Nov 22, 2011
```
llvm-svn: 145057
```
c55e1af1
If a register is both an early clobber and part of a tied use, handle the use · 2021f382
Rafael Espindola authored Nov 22, 2011
```
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
```
2021f382

Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors... · ccb70975

Craig Topper authored Nov 22, 2011

Fix shuffle decoding logic to handle UNPCKLPS/UNPCKLPD on 256-bit vectors correctly. Add support for decoding UNPCKHPS/UNPCKHPD for AVX 128-bit and 256-bit forms.

llvm-svn: 145055

ccb70975
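
For reference, the 256-bit unpack instructions interleave within each 128-bit lane rather than across the whole vector. The mask computation can be sketched like this; `decodeUnpackMask` is a hypothetical helper written for illustration, not the decoder touched by this commit.

```cpp
#include <cassert>
#include <vector>

// Builds a two-source shuffle mask for UNPCKL (high == false) or
// UNPCKH (high == true). Indices 0..numElts-1 select from the first
// source, numElts..2*numElts-1 from the second.
std::vector<unsigned> decodeUnpackMask(unsigned numElts, unsigned eltBits,
                                       bool high) {
  unsigned vecBits = numElts * eltBits;
  assert(vecBits != 0 && vecBits % 128 == 0 && "expected 128-bit lanes");
  unsigned numLanes = vecBits / 128;
  unsigned eltsPerLane = numElts / numLanes;
  unsigned half = eltsPerLane / 2;
  std::vector<unsigned> mask;
  for (unsigned lane = 0; lane < numLanes; ++lane) {
    // Start at the low or high half of this lane.
    unsigned base = lane * eltsPerLane + (high ? half : 0);
    for (unsigned i = 0; i < half; ++i) {
      mask.push_back(base + i);           // element from the first source
      mask.push_back(base + i + numElts); // matching element from the second
    }
  }
  return mask;
}
```

For eight 32-bit elements (the 256-bit UNPCKLPS case) this gives 0, 8, 1, 9, 4, 12, 5, 13, which differs from a naive whole-vector interleave.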

Add methods for querying minimum SSE version along with AVX. Simplifies all... · f5639777

Craig Topper authored Nov 22, 2011

Add methods for querying minimum SSE version along with AVX. Simplifies all the places that had to check a version of SSE and AVX.

llvm-svn: 145053

f5639777
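
The kind of combined query this message describes might look like the sketch below. It assumes the SSE level and the AVX flag are tracked separately; `Features` and its method names are illustrative only and are not taken from the commit.

```cpp
#include <cassert>

// Hypothetical feature model, for illustration only.
enum SSELevel { NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42 };

struct Features {
  SSELevel sse = NoSSE;
  bool avx = false;

  bool hasSSE2() const { return sse >= SSE2; }
  bool hasAVX() const { return avx; }

  // Combined query: callers no longer write `hasSSE2() || hasAVX()`.
  bool hasSSE2orAVX() const { return hasSSE2() || hasAVX(); }
};
```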

Nov 21, 2011
- fix typo in comment · 74e1bc79
  Sebastian Pop authored Nov 21, 2011
```
llvm-svn: 145048
```
  74e1bc79
- Fix crasher in GVN due to my recent capture tracking changes. · 063ae589
  Nick Lewycky authored Nov 21, 2011
```
llvm-svn: 145047
```
  063ae589
- Add virtual destructor. Whoops! · aa2a00db
  Nick Lewycky authored Nov 21, 2011
```
llvm-svn: 145044
```
  aa2a00db
- Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled. · 6270d072
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145028
```
  6270d072
- Test case for r145026 · d12d6f4b
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145027
```
  d12d6f4b
- Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled. · 669199ca
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145026
```
  669199ca
- Fixing a comment · 96e89f64
  Joe Abbey authored Nov 21, 2011
```
llvm-svn: 145025
```
  96e89f64
- Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use... · a065238c
  Craig Topper authored Nov 21, 2011
```
Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled.

llvm-svn: 145022
```
  a065238c

Nov 20, 2011

Less template, more virtual! Refactoring suggested by Chris in code review. · 6ae03c33
Nick Lewycky authored Nov 20, 2011
```
llvm-svn: 145014
```
6ae03c33
Refactor code to use new attribute getters on CallSite for NoCapture and ByVal. · 612d70b1
Nick Lewycky authored Nov 20, 2011
```
Suggested in code review by Eli.

That code in InstCombine looks kinda suspicious.

llvm-svn: 145013
```
612d70b1
test/CodeGen/X86/block-placement.ll: Relax expressions for Win32. · 76dfa038
NAKAMURA Takumi authored Nov 20, 2011
```
llvm-svn: 145011
```
76dfa038

The logic for breaking the CFG in the presence of hot successors didn't · 18dfac38

Chandler Carruth authored Nov 20, 2011

properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010

18dfac38
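
The two-part check can be sketched as below. This is a simplified model under stated assumptions: `EdgeStats`, `isHotEnoughToMerge`, and the 80% threshold are all hypothetical, chosen only to show the local and global tests side by side.

```cpp
#include <cassert>

// Hypothetical inputs describing one CFG edge.
struct EdgeStats {
  double edgeProb;      // probability of taking this edge out of the source
  double srcFreq;       // execution frequency of the source block
  double otherPredFreq; // frequency reaching the destination from other,
                        // not-yet-merged predecessors
};

// An edge is merged against the CFG only if it is hot both locally
// (relative to the source's other successors) and globally (relative to
// the destination's other unmerged predecessors).
bool isHotEnoughToMerge(const EdgeStats &e, double hotProb = 0.8) {
  assert(hotProb > 0.0 && hotProb < 1.0);
  bool locallyHot = e.edgeProb >= hotProb;
  double edgeFreq = e.srcFreq * e.edgeProb;
  double total = edgeFreq + e.otherPredFreq;
  bool globallyHot = total > 0.0 && edgeFreq / total >= hotProb;
  return locallyHot && globallyHot;
}
```

An unconditional branch from a cold block is locally hot (probability 1.0) but fails the global test when the destination is mostly reached from elsewhere, which is the case the message describes.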

Make an obviously const interface actually be marked as const. · bcb5f395
Chandler Carruth authored Nov 20, 2011
```
llvm-svn: 145009
```
bcb5f395
XFAIL this test until I figure out what indvars is doing here (or find someone who does) · 650c09aa
Benjamin Kramer authored Nov 20, 2011
```
llvm-svn: 145008
```
650c09aa
SCEV: Actually set overflow flags on add expressions. · b5ba2eef
Benjamin Kramer authored Nov 20, 2011
```
setFlags doesn't modify its arguments.

llvm-svn: 145007
```
b5ba2eef
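
The bug pattern behind this fix, reduced to a toy model: a helper that returns the combined flags by value has no effect if its result is discarded. The enum values and the `AddExpr` type here are simplified stand-ins, not the real SCEV definitions.

```cpp
#include <cassert>

// Simplified stand-in for a no-wrap flag set.
enum NoWrapFlags : unsigned { FlagAnyWrap = 0, FlagNUW = 1, FlagNSW = 2 };

// Pure helper: returns the union and leaves both arguments unchanged.
inline NoWrapFlags setFlags(NoWrapFlags flags, NoWrapFlags onFlags) {
  return static_cast<NoWrapFlags>(flags | onFlags);
}

// Hypothetical expression node carrying flags.
struct AddExpr {
  NoWrapFlags flags = FlagAnyWrap;
};

// Buggy: the result is dropped, so the expression's flags never change.
void markNoSignedWrapBuggy(AddExpr &e) { setFlags(e.flags, FlagNSW); }

// Fixed: the returned value is stored back.
void markNoSignedWrap(AddExpr &e) { e.flags = setFlags(e.flags, FlagNSW); }
```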

Add some comments to the latest test case I added here to document what · 20df3953

Chandler Carruth authored Nov 20, 2011

is actually being tested. Also add some FileCheck goodness to much more
carefully ensure that the result is the desired result. Before, this test
would only have failed through an assert failure if the underlying fix
were reverted.

Also, add some weight metadata and a comment explaining exactly what is
going on to a tricky section of the test case. Originally, we were
getting very unlucky and trying to form a block chain that isn't
actually profitable. I'm working on a fix to avoid forming these
unprofitable chains, and that would also have masked any failure from
this test case. The easy solution is to add some metadata that makes it
*really* profitable to form the bad chain here.

llvm-svn: 145006

20df3953

Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift... · e79761df

Craig Topper authored Nov 20, 2011

Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.

llvm-svn: 145005

e79761df
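
The message does not spell out the lowering. One common way to get per-byte shifts from hardware that only offers 16-bit immediate shifts is to shift the wider lanes and then mask off the bits that crossed a byte boundary. The sketch below models that on a single 16-bit lane holding two bytes; whether this commit uses exactly this scheme is an assumption, and `shlBytesViaWordShift` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Shifts both bytes packed in `twoBytes` left by `amount` using one
// 16-bit shift, then clears the bits that leaked from the low byte into
// the high byte.
uint16_t shlBytesViaWordShift(uint16_t twoBytes, unsigned amount) {
  assert(amount < 8 && "byte shift amount out of range");
  uint16_t shifted = static_cast<uint16_t>(twoBytes << amount);
  uint8_t byteMask = static_cast<uint8_t>(0xFFu << amount);
  uint16_t mask = static_cast<uint16_t>((byteMask << 8) | byteMask);
  return static_cast<uint16_t>(shifted & mask);
}
```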

Nov 19, 2011

Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled. · a3a65836
Craig Topper authored Nov 19, 2011
```
llvm-svn: 145004
```
a3a65836

Remove some of the special classes that worked around an old tablegen... · bac86038

Craig Topper authored Nov 19, 2011

Remove some of the special classes that worked around an old tablegen limitation of not being able to remove redundant bitconverts from patterns.

llvm-svn: 145003

bac86038

Custom lower AVX2 variable shift intrinsics to shl/srl/sra nodes and remove the intrinsic patterns. · 3af6ae08
Craig Topper authored Nov 19, 2011
```
llvm-svn: 144999
```
3af6ae08

Move the handling of unanalyzable branches out of the loop-driven chain · f3dc9eff

Chandler Carruth authored Nov 19, 2011

formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fall back on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can cause major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994

f3dc9eff
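
The pre-merge step can be pictured with a small sketch. It assumes blocks are numbered in their incoming layout order and that a per-block flag says whether the block falls through to its layout successor in a way that cannot be analyzed; `preMergeChains` is a hypothetical function, not the pass's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Groups block indices into initial chains during a single walk. A block
// is appended to the previous block's chain when that previous block has
// an unanalyzable fallthrough into it; otherwise it starts a new chain.
std::vector<std::vector<std::size_t>>
preMergeChains(const std::vector<bool> &unanalyzableFallthrough) {
  std::vector<std::vector<std::size_t>> chains;
  for (std::size_t i = 0; i < unanalyzableFallthrough.size(); ++i) {
    bool gluedToPrev = i > 0 && unanalyzableFallthrough[i - 1];
    if (!gluedToPrev)
      chains.emplace_back();
    assert(!chains.empty());
    chains.back().push_back(i);
  }
  return chains;
}
```

Because glued blocks already share a chain before any loop-driven placement runs, later reordering moves them together and never separates the pair.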

Test cases for SSSE3/AVX integer horizontal add/sub. · 6d77f4ae
Craig Topper authored Nov 19, 2011
```
llvm-svn: 144990
```
6d77f4ae

Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from... · f984efbf

Craig Topper authored Nov 19, 2011

Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.

llvm-svn: 144989

f984efbf
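
The equivalence being exploited can be checked with a scalar model: a 128-bit horizontal add of 32-bit elements equals an ordinary add of the even-index shuffle and the odd-index shuffle of the two inputs. The helper names below are hypothetical, and unsigned elements are used so that wraparound is well defined.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Two-source shuffle: indices 0..3 pick from a, 4..7 pick from b.
V4 shuffle(const V4 &a, const V4 &b, const std::array<int, 4> &mask) {
  V4 r{};
  for (int i = 0; i < 4; ++i) {
    assert(mask[i] >= 0 && mask[i] < 8);
    r[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
  }
  return r;
}

V4 add(const V4 &x, const V4 &y) {
  V4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = x[i] + y[i];
  return r;
}

// Reference semantics of a 128-bit horizontal add on 32-bit elements.
V4 hadd(const V4 &a, const V4 &b) {
  return V4{{a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]}};
}

// The same result expressed as an add of two shuffles.
V4 haddViaShuffles(const V4 &a, const V4 &b) {
  V4 evens = shuffle(a, b, {{0, 2, 4, 6}});
  V4 odds = shuffle(a, b, {{1, 3, 5, 7}});
  return add(evens, odds);
}
```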

Collapse X86 PSIGNB/PSIGNW/PSIGND node types. · 81390be0
Craig Topper authored Nov 19, 2011
```
llvm-svn: 144988
```
81390be0
Extend VPBLENDVB and VPSIGN lowering to work for AVX2. · de6b73bb
Craig Topper authored Nov 19, 2011
```
llvm-svn: 144987
```
de6b73bb
Remove some unnecessary filtering checks from X86 disassembler table build. · 75ffc5fb
Craig Topper authored Nov 19, 2011
```
llvm-svn: 144986
```
75ffc5fb