Commits · 00b1a3cd7e9fb9b578a39414c7048d4c23b7105a · Roger Ferrer / llvm-epi-0.8

Jan 07, 2012
- Added a late machine instruction copy propagation pass. This catches · 00b1a3cd
  Evan Cheng authored Jan 07, 2012
```
opportunities that only present themselves after late optimizations
such as tail duplication, e.g.
## BB#1:
        movl    %eax, %ecx
        movl    %ecx, %eax
        ret

The register allocator also leaves some of them around (due to false
dependencies between copies from phi-elimination, etc.)

This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They previously ran before branch folding because it did
not always update block live-ins. That's fixed now. The pass change makes
sense independently since we want to properly schedule instructions after
branch folding / tail duplication.

rdar://10428165
rdar://10640363

llvm-svn: 147716
```
  00b1a3cd
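
The redundant-copy case in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the LLVM pass: instructions are hypothetical tuples, `("mov", src, dst)` in AT&T operand order or `(opcode, [clobbered registers])` for everything else, and the helper names `clobber` and `remove_redundant_copies` are invented for illustration.

```python
def clobber(avail, reg):
    """Forget every recorded copy that reads or writes `reg`."""
    avail.pop(reg, None)
    for dst in [d for d, s in avail.items() if s == reg]:
        del avail[dst]


def remove_redundant_copies(block):
    """Drop copies whose source and destination already hold the same value.

    `avail` maps a register to the register it was last copied from,
    as long as neither of the two has been overwritten since.
    """
    avail = {}
    kept = []
    for insn in block:
        if insn[0] == "mov":
            _, src, dst = insn
            # Either an earlier "mov dst, src" or an identical earlier copy
            # is still valid, so this copy changes nothing.
            if avail.get(src) == dst or avail.get(dst) == src:
                continue
            clobber(avail, dst)
            avail[dst] = src
        else:
            # Conservatively invalidate everything the instruction writes.
            for reg in insn[1]:
                clobber(avail, reg)
        kept.append(insn)
    return kept


block = [("mov", "eax", "ecx"), ("mov", "ecx", "eax"), ("ret", [])]
print(remove_redundant_copies(block))
```

On the three-instruction block from the commit message, the second `movl` is dropped because `%eax` still holds the value that was copied into `%ecx`. Any intervening write to either register keeps the copy in place.
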
- Copy implicit defs (e.g. r0) when changing tBX_RET to tPOP_RET. This bug is · 501e3095
  Evan Cheng authored Jan 07, 2012
```
exposed by an upcoming change that would delete the copy to the return register
because there is no use! It's amazing anything works.

llvm-svn: 147715
```
  501e3095
- Use movw+movt in ARMFastISel::ARMMaterializeGV. · 68f034ee
  Jakob Stoklund Olesen authored Jan 07, 2012
```
This eliminates a lot of constant pool entries for -O0 builds of code
with many global variable accesses.

This speeds up -O0 codegen of consumer-typeset by 2x because the
constant island pass no longer has to look at thousands of constant pool
entries.

<rdar://problem/10629774>

llvm-svn: 147712
```
  68f034ee
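
As a rough illustration of why a movw+movt pair can stand in for a constant pool load: `movw` writes a 16-bit immediate into the low half of a register and clears the upper half, and `movt` then writes the upper half while preserving the lower. The sketch below models only that arithmetic; the function names are hypothetical and none of this is the FastISel code.

```python
def movw_movt_halves(value):
    """Split a 32-bit value into the (movw, movt) 16-bit immediates."""
    value &= 0xFFFFFFFF
    return value & 0xFFFF, value >> 16


def simulate_movw_movt(lo16, hi16):
    """Model the register contents after movw followed by movt."""
    reg = lo16 & 0xFFFF                      # movw: low half set, upper half zeroed
    reg = (reg & 0xFFFF) | ((hi16 & 0xFFFF) << 16)  # movt: upper half set, low half kept
    return reg


lo, hi = movw_movt_halves(0x12345678)
assert simulate_movw_movt(lo, hi) == 0x12345678
```

Because the address is carried in the two instruction immediates, no literal needs to be placed in a constant pool, which is the effect the commit message describes.
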
- LSR: run DeleteDeadPhis before replaceCongruentPhis. · 2ec61a89
  Andrew Trick authored Jan 07, 2012
```
llvm-svn: 147711
```
  2ec61a89
- Cleanup comments and argument types related to my previous replaceCongruentPhis checkin. · f730f39f
  Andrew Trick authored Jan 07, 2012
```
llvm-svn: 147709
```
  f730f39f
- Extended replaceCongruentPhis to handle mixed phi types. · 5adedf5d
  Andrew Trick authored Jan 07, 2012
```
llvm-svn: 147707
```
  5adedf5d
- Make the 'x' constraint work for AVX registers as well. · c206d467
  Eric Christopher authored Jan 07, 2012
```
Fixes rdar://10614894

llvm-svn: 147704
```
  c206d467
- Missing raw_ostream.h breaks MSVC build. · ff4e2b7d
  Andrew Trick authored Jan 07, 2012
```
llvm-svn: 147703
```
  ff4e2b7d
- Expose isNonConstantNegative to users of ScalarEvolution. · 881a7768
  Andrew Trick authored Jan 07, 2012
```
llvm-svn: 147700
```
  881a7768
- Add comment. · 73a3fab4
  Chad Rosier authored Jan 06, 2012
```
llvm-svn: 147696
```
  73a3fab4
- Add a comment and ensure that anyone else looking at this code doesn't start · 8ea8e4fc
  Eric Christopher authored Jan 06, 2012
```
to bleed from the eyes.

llvm-svn: 147695
```
  8ea8e4fc
- Use const vector references instead of a vector copy. Spotted by Devang. · 090fcc1a
  Eric Christopher authored Jan 06, 2012
```
llvm-svn: 147694
```
  090fcc1a
- Use -> instead of (*iter). · 5a28a6ee
  Eric Christopher authored Jan 06, 2012
```
llvm-svn: 147693
```
  5a28a6ee
Jan 06, 2012
- Enable aligned NEON spilling by default. · 68a922c0
  Jakob Stoklund Olesen authored Jan 06, 2012
```
Experiments show this to be a small speedup for modern ARM cores.

llvm-svn: 147689
```
  68a922c0
- Put all IVUsers in the processed set. Allow querying IVUsers with isIVUserOrOperand. · 9a5b242d
  Andrew Trick authored Jan 06, 2012
```
llvm-svn: 147686
```
  9a5b242d
- Abort AdjustBBOffsetsAfter early when possible. · 69051113
  Jakob Stoklund Olesen authored Jan 06, 2012
```
llvm-svn: 147685
```
  69051113
- SCEVExpander: hoistStep should check strict dominance. · b8045cbc
  Andrew Trick authored Jan 06, 2012
```
llvm-svn: 147683
```
  b8045cbc
- Tracing to help investigate issues with SjLj spill code. · 85460d0d
  Andrew Trick authored Jan 06, 2012
```
llvm-svn: 147682
```
  85460d0d
- Initializing to false makes better sense. Thanks, David. · 64dc8aa4
  Chad Rosier authored Jan 06, 2012
```
llvm-svn: 147679
```
  64dc8aa4
- Fix uninitialized variable warning. · a3d90a94
  Chad Rosier authored Jan 06, 2012
```
llvm-svn: 147676
```
  a3d90a94
- Fix uninitialized variable warning. · 6b64c3c6
  Chad Rosier authored Jan 06, 2012
```
llvm-svn: 147675
```
  6b64c3c6
- Fix a leak I noticed while reviewing the accelerator table changes. Passes · 667a074b
  Eric Christopher authored Jan 06, 2012
```
lldb testsuite.

rdar://10652330

llvm-svn: 147673
```
  667a074b
- [asan] cleanup: remove the SIGILL-related code (compiler part) · 3411f2ea
  Kostya Serebryany authored Jan 06, 2012
```
llvm-svn: 147667
```
  3411f2ea
- Fix typo in string · d8e25729
  Eli Bendersky authored Jan 06, 2012
```
llvm-svn: 147654
```
  d8e25729
- As part of the ongoing work in finalizing the accelerator tables, extend · 21bde87b
  Eric Christopher authored Jan 06, 2012
```
the debug type accelerator tables to contain the tag and a flag
stating whether or not a compound type is a complete type.

rdar://10652330

llvm-svn: 147651
```
  21bde87b
- Fix SpeculativelyExecuteBB to either speculate all or none of the phis · 5ab9c0a9
  Dan Gohman authored Jan 05, 2012
```
present in the bottom of the CFG triangle, as the transformation isn't
ever valuable if the branch can't be eliminated.

Also, unify some heuristics between SimplifyCFG's multiple
if-converters, for consistency.

This fixes rdar://10627242.

llvm-svn: 147630
```
  5ab9c0a9
- PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into... · 55fa49f3
  Eli Friedman authored Jan 05, 2012
```
PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation.

llvm-svn: 147625
```
  55fa49f3
- Link symbols with different visibilities according to the rules in the · 23f8d64b
  Rafael Espindola authored Jan 05, 2012
```
System V Application Binary Interface. This lets us use
-fvisibility-inlines-hidden with LTO.
Fixes PR11697.

llvm-svn: 147624
```
  23f8d64b
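
The System V gABI says that when symbols with different visibilities are combined, the most constraining visibility is propagated to the result. A minimal sketch of that rule, assuming only the three visibilities LLVM models (default, protected, hidden) and using a hypothetical `merged_visibility` helper rather than the linker's actual code:

```python
# Ordered from least to most constraining, following the gABI ordering.
RANK = {"default": 0, "protected": 1, "hidden": 2}


def merged_visibility(a, b):
    """Return the more constraining of two symbol visibilities."""
    return a if RANK[a] >= RANK[b] else b


print(merged_visibility("default", "hidden"))
```

Under this rule, a definition that one module marks hidden (for example through -fvisibility-inlines-hidden) stays hidden after linking even if another module declares it with default visibility.
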
Jan 05, 2012

- Revert r56315. When the instruction to speculate is a load, this · 52672118
  Dan Gohman authored Jan 05, 2012

  code can incorrectly move the load across a store. This never
  happens in practice today, but only because the current
  heuristics accidentally preclude it.

  llvm-svn: 147623

  52672118

- Kill ObjectCodeEmitter and BinaryObject, they were unused and superseded by MC. · 69eab4e0
  Benjamin Kramer authored Jan 05, 2012
```
llvm-svn: 147618
```
  69eab4e0
- SCCCaptured is trivially false on entry to this loop and not modified inside it. · f740db31
  Nick Lewycky authored Jan 05, 2012
```
Eliminate the dead test for it on each loop iteration. No functionality change.

llvm-svn: 147616
```
  f740db31
- Remove the old ELF writer. · afcf571e
  Rafael Espindola authored Jan 05, 2012
```
llvm-svn: 147615
```
  afcf571e
- A small refactoring of JIT/MCJIT::getPointerToNamedFunction(), so it could be... · 7e325789
  Danil Malyshev authored Jan 05, 2012
```
A small refactoring of JIT/MCJIT::getPointerToNamedFunction(), so it can be called through the base class.

llvm-svn: 147610
```
  7e325789
- Revert r147542 after comments from Joerg Sonnenberger · 99ab273a
  Sebastian Pop authored Jan 05, 2012
```
llvm-svn: 147608
```
  99ab273a
- Remove an unused variable. · eab50299
  Chandler Carruth authored Jan 05, 2012
```
llvm-svn: 147605
```
  eab50299

- Prevent a DAGCombine from firing where there are two uses of · e041a30b
  Chandler Carruth authored Jan 05, 2012

  a combined-away node and the result of the combine isn't substantially
  smaller than the input, it's just canonicalized. This is the first part
  of a significant (7%) performance gain for Snappy's hot decompression
  loop.

  llvm-svn: 147604

  e041a30b

- Mark scalar FMA4 instructions as ignoring the VEX.L bit. · 29b07374
  Craig Topper authored Jan 05, 2012
```
llvm-svn: 147602
```
  29b07374

- Peephole optimization of ptest-conditioned branch in X86 arch. Performs... · 9255b6d9
  Victor Umansky authored Jan 05, 2012

  Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics into a ptest+jcc pair for SSE and AVX.

  Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

  Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
  llvm-svn: 147601

  9255b6d9

- Minor postra scheduler cleanup. It could result in more precise antidependence... · 100af0ad
  Andrew Trick authored Jan 05, 2012

  Minor postra scheduler cleanup. It could result in more precise antidependence latency on ARM in exceedingly rare cases.

  llvm-svn: 147594

  100af0ad

- Replace the uint64_t -> double conversion algorithm with one that's more efficient. · ac27f0c8
  Bill Wendling authored Jan 05, 2012

  This small bit of ASM code is sufficient to do what the old algorithm did:

       movq       %rax,  %xmm0
       punpckldq  (c0),  %xmm0  // c0: (uint4){ 0x43300000U, 0x45300000U, 0U, 0U }
       subpd      (c1),  %xmm0  // c1: (double2){ 0x1.0p52, 0x1.0p52 * 0x1.0p32 }
     #ifdef __SSE3__
       haddpd   %xmm0, %xmm0
     #else
       pshufd   $0x4e, %xmm0, %xmm1
       addpd    %xmm1, %xmm0
     #endif

  It's arguably faster. One caveat: the 'haddpd' instruction isn't very fast on
  all processors.
  <rdar://problem/7719814>

  llvm-svn: 147593

  ac27f0c8
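
The arithmetic behind the assembly sequence above can be traced in scalar form. This is a sketch of the idea only, written in Python rather than SSE: the two constants in c0 are the high words of the doubles 2^52 and 2^84, so pairing them with the low and high 32-bit halves of the input gives the bit patterns of 2^52 + lo and 2^84 + hi * 2^32. Subtracting c1 recovers lo and hi * 2^32 exactly, and the final add rounds once. The helper names are hypothetical.

```python
import struct


def bits_to_double(bits):
    """Reinterpret a 64-bit integer bit pattern as an IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def u64_to_double(x):
    """Convert an unsigned 64-bit integer to double via the two-halves trick."""
    lo = x & 0xFFFFFFFF
    hi = x >> 32
    # punpckldq with c0: place each half in the mantissa of a magic constant.
    d_lo = bits_to_double((0x43300000 << 32) | lo)   # equals 2**52 + lo
    d_hi = bits_to_double((0x45300000 << 32) | hi)   # equals 2**84 + hi * 2**32
    # subpd with c1: both subtractions are exact.
    d_lo -= 2.0 ** 52
    d_hi -= 2.0 ** 84
    # haddpd (or pshufd + addpd): the only step that rounds.
    return d_hi + d_lo


for value in (0, 1, 2**32, 2**53 + 1, 2**64 - 1):
    assert u64_to_double(value) == float(value)
```

Since only the last addition can round, the result matches a correctly rounded conversion under the default round-to-nearest mode, which is why the loop compares against Python's own `float()`.
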