Commits · 985b1dc2d8497a17c4b32209aa458ec1fdb757da · Roger Ferrer / llvm-epi-0.8

Sep 20, 2012

Re-work X86 code generation of atomic ops with spin-loop · 3237662b

Michael Liao authored Sep 20, 2012

- Rewrite/merge pseudo-atomic instruction emitters to address the
  following issue:
  * Reduce one unnecessary load in spin-loop

    Previously the spin-loop looked like

        thisMBB:
        newMBB:
          ld  t1 = [bitinstr.addr]
          op  t2 = t1, [bitinstr.val]
          not t3 = t2  (if Invert)
          mov EAX = t1
          lcs dest = [bitinstr.addr], t3  [EAX is implicit]
          bz  newMBB
          fallthrough -->nextMBB

    the 'ld' at the beginning of newMBB should be lifted out of the loop,
    as lcs (or CMPXCHG on x86) already loads the current memory value into
    EAX when the compare fails. The loop is refined as:

        thisMBB:
          EAX = LOAD [MI.addr]
        mainMBB:
          t1 = OP [MI.val], EAX
          LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
          JNE mainMBB
        sinkMBB:

  * Remove immopc since, so far, all pseudo-atomic instructions have
    an all-register form only; there is no immediate operand.

  * Remove unnecessary attributes/modifiers in pseudo-atomic instruction
    td

  * Fix issues in PR13458

- Add comprehensive tests on atomic ops on various data types.
  NOTE: Some of them are turned off due to missing functionality.

- Revise tests due to the new spin-loop generated.

llvm-svn: 164281
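
The refined loop shape has a close source-level analogue in a compare-and-swap loop. The following is a minimal sketch, not the code from this commit: `fetch_nand` is a hypothetical helper, and NAND is chosen only because it matches the "op, then optional not" pattern in the old loop. It relies on `std::atomic::compare_exchange_weak` writing the observed memory value back into `expected` on failure, which is why a single load before the loop is enough.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): atomically replaces `mem` with
// ~(mem & val) and returns the value that was in memory before the update.
inline std::uint32_t fetch_nand(std::atomic<std::uint32_t>& mem,
                                std::uint32_t val) {
    // Single load before the loop, like `EAX = LOAD [MI.addr]` in thisMBB.
    std::uint32_t expected = mem.load(std::memory_order_relaxed);
    std::uint32_t desired;
    do {
        // Corresponds to `t1 = OP [MI.val], EAX` plus the optional inversion.
        desired = ~(expected & val);
        // On failure, compare_exchange_weak stores the value currently in
        // memory into `expected`, so the loop body never reloads explicitly.
    } while (!mem.compare_exchange_weak(expected, desired,
                                        std::memory_order_seq_cst,
                                        std::memory_order_relaxed));
    return expected;
}
```

On x86 a loop of this form is typically lowered to a `lock cmpxchg` retry loop, though the exact instruction sequence depends on the compiler and its version.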

3237662b

Aug 02, 2012

X86 Peephole: fold loads to the source register operand if possible. · ba8122cc

Manman Ren authored Aug 02, 2012

Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207

ba8122cc

X86 Peephole: fold loads to the source register operand if possible. · 5759d012

Manman Ren authored Aug 02, 2012

Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152

5759d012

Jul 29, 2012
- Revert r160920 and r160919 due to dragonegg and clang selfhost failure · f87dd7c0
  Manman Ren authored Jul 29, 2012
```
llvm-svn: 160927
```
  f87dd7c0
Jul 28, 2012

X86 Peephole: fold loads to the source register operand if possible. · 0fa3ab88

Manman Ren authored Jul 28, 2012

Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919

0fa3ab88

Jul 06, 2012

X86: peephole optimization to remove cmp instruction · c9656737

Manman Ren authored Jul 06, 2012

For each Cmp, we check whether there is an earlier Sub which makes Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.

llvm-svn: 159838

c9656737

Jul 04, 2012

Add early if-conversion support to X86. · 49e4d4b3

Jakob Stoklund Olesen authored Jul 04, 2012

Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.

Early if-conversion is still not enabled by default.

llvm-svn: 159695

49e4d4b3

Jun 23, 2012
- Make helper method static since it doesn't use anything in the class. · d9c7d0dd
  Craig Topper authored Jun 23, 2012
```
llvm-svn: 159071
```
  d9c7d0dd
Jun 07, 2012

Revert r157755. · 9c964181

Manman Ren authored Jun 06, 2012

The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122

9c964181

Jun 03, 2012
- Revert r157831 · 5097e4f3
  Manman Ren authored Jun 03, 2012
```
llvm-svn: 157896
```
  5097e4f3
Jun 01, 2012

X86: peephole optimization to remove cmp instruction · 879ca9d4

Manman Ren authored Jun 01, 2012

This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use the flags from "sub", then we can eliminate
the "cmp" instruction.

llvm-svn: 157831
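
The "bge L1 or ble L1" alternative above reflects that, when the removed compare had its operands in the opposite order from the subtract, the branch condition has to be swapped (not negated). A minimal sketch of that mapping follows. The `Cond` enum and both helpers are hypothetical and model signed comparisons abstractly; they are not LLVM's condition-code API.

```cpp
#include <cassert>

// Hypothetical condition codes for signed comparisons.
enum class Cond { EQ, NE, LT, LE, GT, GE };

// Evaluates "lhs <cond> rhs", i.e. what a branch tests after comparing
// lhs against rhs.
inline bool holds(Cond c, int lhs, int rhs) {
    switch (c) {
    case Cond::EQ: return lhs == rhs;
    case Cond::NE: return lhs != rhs;
    case Cond::LT: return lhs <  rhs;
    case Cond::LE: return lhs <= rhs;
    case Cond::GT: return lhs >  rhs;
    case Cond::GE: return lhs >= rhs;
    }
    return false;
}

// Condition to use when the flags come from an instruction whose two
// source operands are in the opposite order.
inline Cond swapOperands(Cond c) {
    switch (c) {
    case Cond::LT: return Cond::GT;
    case Cond::LE: return Cond::GE;
    case Cond::GT: return Cond::LT;
    case Cond::GE: return Cond::LE;
    case Cond::EQ: return Cond::EQ;  // symmetric, unchanged
    case Cond::NE: return Cond::NE;  // symmetric, unchanged
    }
    return c;
}
```

For every condition `c`, `holds(c, a, b)` equals `holds(swapOperands(c), b, a)`, which is the property that makes reusing the subtract's flags safe in this simplified model.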

879ca9d4

Remove a trailing space and fix a comment. · 9eadcfdf
Craig Topper authored Jun 01, 2012
```
llvm-svn: 157801
```
9eadcfdf

May 31, 2012

X86: replace SUB with CMP if possible · 9bccb64e

Manman Ren authored May 31, 2012

This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
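
As an illustration only, source of roughly the following shape consumes nothing but the sign of a difference, which is the situation the message describes. `pickBySign` is a hypothetical function, and whether a particular compiler emits exactly the sequences shown above for it is not claimed here.

```cpp
#include <cassert>

// Hypothetical example: the difference itself is never used, only its sign,
// so a compare that sets the same flags can stand in for the subtract.
inline long pickBySign(long a, long b) {
    // Subtract in unsigned arithmetic to avoid signed-overflow undefined
    // behaviour, then look at the sign of the wrapped result.
    const unsigned long diff =
        static_cast<unsigned long>(a) - static_cast<unsigned long>(b);
    return static_cast<long>(diff) < 0 ? b : a;
}
```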

9bccb64e

Added FMA3 Intel instructions. · 602f3a26

Elena Demikhovsky authored May 31, 2012

I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for CodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737

602f3a26

Mar 17, 2012
- Reorder includes in Target backends to follow coding standards. Remove some... · b25fda95
  Craig Topper authored Mar 17, 2012
```
Reorder includes in Target backends to follow coding standards. Remove some superfluous forward declarations.

llvm-svn: 152997
```
  b25fda95
Feb 18, 2012
- Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430,... · b22310fd
  Jia Liu authored Feb 18, 2012
```
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.

llvm-svn: 150878
```
  b22310fd
Nov 15, 2011

Break false dependencies before partial register updates. · f8ad336b

Jakob Stoklund Olesen authored Nov 15, 2011

Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previous N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

llvm-svn: 144602

f8ad336b

Sep 29, 2011

Expand the x86 V_SET0* pseudos right after register allocation. · dd1904e7

Jakob Stoklund Olesen authored Sep 29, 2011

This also makes it possible to reduce the number of pseudo instructions
and get rid of the encoding information.

llvm-svn: 140776

dd1904e7

Sep 28, 2011

Promote the X86 Get/SetSSEDomain functions to TargetInstrInfo. · b48c994c

Jakob Stoklund Olesen authored Sep 27, 2011

I am going to unify the SSEDomainFix and NEONMoveFix passes into a
single target independent pass.  They are essentially doing the same
thing.

llvm-svn: 140652

b48c994c

Sep 08, 2011

* Combines Alignment, AuxInfo, and TB_NOT_REVERSABLE flag into a · 23eb5265

Bruno Cardoso Lopes authored Sep 08, 2011

single field (Flags), which is a bitwise OR of items from the TB_*
enum. This makes it easier to add new information in the future.

* Gives every static array an equivalent layout: { RegOp, MemOp, Flags }

* Adds a helper function, AddTableEntry, to avoid duplication of the
insertion code.

* Renames TB_NOT_REVERSABLE to TB_NO_REVERSE.

* Adds TB_NO_FORWARD, which is analogous to TB_NO_REVERSE, except that
it prevents addition of the Reg->Mem entry. (This is going to be used
by Native Client, in the next CL).

Patch by David Meyer

llvm-svn: 139311
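
The table layout and helper described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the patch: the flag values, the `TableEntry` and `OpMap` types, and the opcode numbers used with them are hypothetical. Only the names `AddTableEntry`, `TB_NO_REVERSE` and `TB_NO_FORWARD` and the `{ RegOp, MemOp, Flags }` layout come from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>

// Hypothetical flag encodings; the real TB_* enum differs.
enum : std::uint32_t {
    TB_NO_REVERSE = 1u << 0,  // do not add the Mem->Reg entry
    TB_NO_FORWARD = 1u << 1,  // do not add the Reg->Mem entry
};

// Every static table row has the same layout: { RegOp, MemOp, Flags }.
struct TableEntry {
    std::uint16_t RegOp;
    std::uint16_t MemOp;
    std::uint32_t Flags;  // bitwise OR of TB_* items
};

// opcode -> (paired opcode, flags); a stand-in for the real map types.
using OpMap =
    std::unordered_map<std::uint16_t, std::pair<std::uint16_t, std::uint32_t>>;

// Single place that inserts into both directions, honouring the flags.
inline void AddTableEntry(OpMap& regToMem, OpMap& memToReg,
                          const TableEntry& e) {
    if (!(e.Flags & TB_NO_FORWARD)) {
        const bool inserted =
            regToMem.emplace(e.RegOp, std::make_pair(e.MemOp, e.Flags)).second;
        assert(inserted && "duplicate RegOp in table");
        (void)inserted;  // silence unused warning when asserts are disabled
    }
    if (!(e.Flags & TB_NO_REVERSE)) {
        // Several register forms may share one memory form; keep the first.
        memToReg.emplace(e.MemOp, std::make_pair(e.RegOp, e.Flags));
    }
}
```

Keeping all per-entry information in one `Flags` word means a new property only needs a new bit, not a new column in every table.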

23eb5265

Aug 08, 2011

Hoist hasLoadFromStackSlot and hasStoreToStackSlot. · daa2cad7

Jakob Stoklund Olesen authored Aug 08, 2011

These methods are target-independent since they simply scan the
memory operands.  They can live in TargetInstrInfoImpl.

llvm-svn: 137063

daa2cad7

Jul 25, 2011
- Refactor X86 target to separate MC code from Target code. · 7e763d86
  Evan Cheng authored Jul 25, 2011
```
llvm-svn: 135930
```
  7e763d86
Jul 01, 2011
- Hide the call to InitMCInstrInfo into tblgen generated ctor. · 703a0fbf
  Evan Cheng authored Jul 01, 2011
```
llvm-svn: 134244
```
  703a0fbf
May 25, 2011
- Remove unused OpcodeMask enumerator. · 85ec5212
  Francois Pichet authored May 25, 2011
```
llvm-svn: 132062
```
  85ec5212
- Fix MSVC warning: "is out of range for enum constant" · 58b09c93
  Francois Pichet authored May 25, 2011
```
MSVC doesn't support 64 bit enum. 
OpcodeMask is not used anywhere in the code base.

llvm-svn: 132057
```
  58b09c93
Apr 15, 2011
- Fix a ton of comment typos found by codespell. Patch by · 0ab5e2cd
  Chris Lattner authored Apr 15, 2011
```
Luis Felipe Strano Moraes!

llvm-svn: 129558
```
  0ab5e2cd
Apr 04, 2011
- Make OpcodeMask an unsigned long long literal to deal with overflow. · 418f186a
  Joerg Sonnenberger authored Apr 04, 2011
```
llvm-svn: 128847
```
  418f186a
- Add support for the VIA PadLock instructions. · fc4789da
  Joerg Sonnenberger authored Apr 04, 2011
```
llvm-svn: 128826
```
  fc4789da
- Expand Op0Mask by one bit in preparation for the PadLock prefixes. · cc53d991
  Joerg Sonnenberger authored Apr 04, 2011
```
Define most shift masks incrementally to reduce the redundant
hard-coding. Introduce new shift for the VEX flags to replace the
magic constant 32 in various places.

llvm-svn: 128822
```
  cc53d991
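
Defining shifts incrementally, as the message above describes, can look like the sketch below. All field names, widths and positions here are hypothetical and do not reproduce the actual X86 flag layout. The point is only that widening one field moves every later field automatically, with no hard-coded positions to update.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: each shift is derived from the previous field.
constexpr unsigned FormShift = 0;
constexpr unsigned FormBits  = 6;
constexpr unsigned Op0Shift  = FormShift + FormBits;
constexpr unsigned Op0Bits   = 5;  // widening this moves everything after it
constexpr unsigned VEXShift  = Op0Shift + Op0Bits;  // named, not a magic number

constexpr std::uint64_t FormMask =
    ((std::uint64_t{1} << FormBits) - 1) << FormShift;
constexpr std::uint64_t Op0Mask =
    ((std::uint64_t{1} << Op0Bits) - 1) << Op0Shift;
constexpr std::uint64_t VEX = std::uint64_t{1} << VEXShift;

// Extracts the Op0 field from a packed flags word.
constexpr unsigned getOp0(std::uint64_t flags) {
    return static_cast<unsigned>((flags & Op0Mask) >> Op0Shift);
}

// Derived positions cannot overlap by construction; check it anyway.
static_assert((FormMask & Op0Mask) == 0, "Form and Op0 overlap");
static_assert((Op0Mask & VEX) == 0, "Op0 and VEX overlap");
```
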
Mar 05, 2011

Increased the register pressure limit on x86_64 from 8 to 12 · 641e2d4f

Andrew Trick authored Mar 05, 2011

regs. This is the only change in this checkin that may affect the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.

llvm-svn: 127067

641e2d4f

whitespace · 27c079e1
Andrew Trick authored Mar 05, 2011
```
llvm-svn: 127065
```
27c079e1

Feb 22, 2011
- Implement xgetbv and xsetbv. · e3906219
  Rafael Espindola authored Feb 22, 2011
```
Patch by Jai Menon.

llvm-svn: 126165
```
  e3906219
Nov 28, 2010
- Move callee-saved regs spills / reloads to TFI · d08fbd19
  Anton Korobeynikov authored Nov 27, 2010
```
llvm-svn: 120228
```
  d08fbd19
Nov 15, 2010
- tidy up, no functionality change. · ea857d35
  Chris Lattner authored Nov 14, 2010
```
llvm-svn: 119092
```
  ea857d35
Oct 19, 2010

Re-enable register pressure aware machine licm with fixes. Hoist() may have · 63c7608c
Evan Cheng authored Oct 19, 2010
```
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.

llvm-svn: 116845
```
63c7608c
Revert r116781 "- Add a hook for target to determine whether an instruction def · 418204e5
Daniel Dunbar authored Oct 19, 2010
```
is", which breaks some nightly tests.

llvm-svn: 116816
```
418204e5

- Add a hook for target to determine whether an instruction def is · 8249dfe6

Evan Cheng authored Oct 19, 2010

  "long latency" enough to hoist even if it may increase spilling. Reloading
  a value from spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.

llvm-svn: 116781
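
A source-level analogy of the hoisting described above is shown below. Machine LICM works on machine instructions rather than on C++ source, so this is only a sketch of the effect. Both functions are hypothetical examples.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// As written: sqrt(scale) is loop-invariant but sits inside the loop body.
inline double sumScaled(const std::vector<double>& v, double scale) {
    double sum = 0.0;
    for (double x : v)
        sum += x * std::sqrt(scale);  // expensive, same value every iteration
    return sum;
}

// Hand-hoisted form: the long-latency computation runs once. The cost is
// one more value kept live across the loop, which may increase register
// pressure and lead to a spill and reload.
inline double sumScaledHoisted(const std::vector<double>& v, double scale) {
    const double root = std::sqrt(scale);
    double sum = 0.0;
    for (double x : v)
        sum += x * root;
    return sum;
}
```

The trade-off the hook addresses is visible here: reloading `root` from a stack slot is usually cheaper than recomputing a square root or a division on each iteration.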

8249dfe6

Oct 08, 2010
- Reduce casting in various tables by defining the table · 1c090c00
  Chris Lattner authored Oct 07, 2010
```
with the right types.

llvm-svn: 116001
```
  1c090c00
Oct 03, 2010

Implement support for the bizarre 3DNow! encoding (which is unlike anything · 45270db9

Chris Lattner authored Oct 03, 2010

else in X86), and add support for pavgusb.  This is apparently the
only instruction (other than movsx) that is preventing ffmpeg from building
with clang.

If someone else is interested in banging out the rest of the 3DNow! 
instructions, it should be quite easy now.

llvm-svn: 115466

45270db9

Sep 17, 2010
- fix rdar://8444631 - encoder crash on 'enter' · cea0a8d7
  Chris Lattner authored Sep 17, 2010
```
What a weird instruction.

llvm-svn: 114190
```
  cea0a8d7