  1. Aug 08, 2012
      X86: enable CSE between CMP and SUB · 1be131ba
      Manman Ren authored
      We perform the following:
      1> Use SUB instead of CMP for i8, i16, i32 and i64 in ISel lowering.
      2> Modify MachineCSE to correctly handle implicit defs.
      3> Convert SUB back to CMP if possible at peephole.
      
      Removed the pattern matching of (a > b) ? (a - b) : 0 and the like, since
      such cases are handled by the peephole now.
      
      rdar://11873276
      
      llvm-svn: 161462
  2. Jul 18, 2012
  3. Jul 06, 2012
      X86: peephole optimization to remove cmp instruction · c9656737
      Manman Ren authored
      For each Cmp, we check whether there is an earlier Sub that makes the Cmp
      redundant. We handle the case where the SUB operates on the same source
      operands as the Cmp, including the case where the two source operands are
      swapped.
      
      llvm-svn: 159838
  4. Jun 03, 2012
  5. Jun 01, 2012
      X86: peephole optimization to remove cmp instruction · 879ca9d4
      Manman Ren authored
      This patch will optimize the following:
        sub r1, r3
        cmp r3, r1 or cmp r1, r3
        bge L1
      TO
        sub r1, r3
        bge L1 or ble L1
      
      If the branch instruction can use the flags set by "sub", then we can
      eliminate the "cmp" instruction.
      
      llvm-svn: 157831
  6. Apr 09, 2012
  7. Feb 18, 2012
  8. Feb 02, 2012
      Instruction scheduling itinerary for Intel Atom. · 8523b16f
      Andrew Trick authored
      Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
      
      Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
      
      Adds a test to verify that the scheduler is working.
      
      Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
      
      Patch by Preston Gurd!
      
      llvm-svn: 149558
  9. Oct 23, 2011
  10. Oct 14, 2011
  11. Oct 08, 2011
      Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies. · 729abd36
      Jakob Stoklund Olesen authored
      In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX
      instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot
      target all GR8 registers, only those in GR8_NOREX.
      
      To enforce this, we ensure that all instructions using the
      EXTRACT_SUBREG are GR8_NOREX constrained.
      
      This fixes PR11088.
      
      llvm-svn: 141499
  12. Oct 02, 2011
  13. Sep 11, 2011
  14. Apr 15, 2011
  15. Dec 20, 2010
      Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which · 846c20d4
      Chris Lattner authored
      model their carry dependencies with MVT::Flag operands) and use clean and beautiful
      EFLAGS dependences instead.
      
      We do this by changing the modelling of SBB/ADC to have EFLAGS input and outputs
      (which is what requires the previous scheduler change) and change X86 ISelLowering
      to custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes.
      
      With the previous series of changes, this causes no changes in the testsuite, woo.
      
      llvm-svn: 122213
  16. Dec 05, 2010
      it turns out that when ".with.overflow" intrinsics were added to the X86 · 364bb0a0
      Chris Lattner authored
      backend, they were all implemented except umul.  This one fell back
      to the default implementation that did a hi/lo multiply and compared the
      top.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
      
      llvm-svn: 120935
  17. Oct 08, 2010
      fix a subtle bug I introduced in my refactoring, where we stopped preferring · 35e6ce47
      Chris Lattner authored
      the i8 versions of instructions in some cases.  In test6, we started 
      generating:
      
      	cmpq	$0, -8(%rsp)            ## encoding: [0x48,0x81,0x7c,0x24,0xf8,0x00,0x00,0x00,0x00]
                                              ## <MCInst #478 CMP64mi32
                                              ##  <MCOperand Reg:114>
                                              ##  <MCOperand Imm:1>
                                              ##  <MCOperand Reg:0>
                                              ##  <MCOperand Imm:-8>
                                              ##  <MCOperand Reg:0>
                                              ##  <MCOperand Imm:0>>
      
      instead of:
      
      	cmpq	$0, -8(%rsp)            ## encoding: [0x48,0x83,0x7c,0x24,0xf8,0x00]
                                              ## <MCInst #479 CMP64mi8
                                              ##  <MCOperand Reg:114>
                                              ##  <MCOperand Imm:1>
                                              ##  <MCOperand Reg:0>
                                              ##  <MCOperand Imm:-8>
                                              ##  <MCOperand Reg:0>
                                              ##  <MCOperand Imm:0>>
      
      Fix this and add some comments.
      
      llvm-svn: 116053
  18. Oct 07, 2010
  19. Oct 06, 2010