Commits · bfacef45eb217477c76049ff48549217822e478e · Roger Ferrer / llvm-epi-0.8

Sep 13, 2012

Don't fold indexed loads into TCRETURNmi64. · bfacef45

Jakob Stoklund Olesen authored Sep 13, 2012

We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>

llvm-svn: 163761

bfacef45

Jun 01, 2012

Implement the local-dynamic TLS model for x86 (PR3985) · 789acfb6

Hans Wennborg authored Jun 01, 2012

This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

llvm-svn: 157818

789acfb6

May 09, 2012

Use ptr_rc_tailcall instead of GR32_TC. · 7e21d617

Jakob Stoklund Olesen authored May 09, 2012

The getPointerRegClass() hook will return GR32_TC, or whatever is
appropriate for the current function.

Patch by Yiannis Tsiouris!

llvm-svn: 156459

7e21d617

May 07, 2012

X86: optimization for -(x != 0) · ef4e0479

Manman Ren authored May 07, 2012

This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709
llvm-svn: 156312

ef4e0479

Apr 04, 2012

Always compute all the bits in ComputeMaskedBits. · ba0a6cab

Rafael Espindola authored Apr 04, 2012

This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011

ba0a6cab

Mar 29, 2012
- Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode. · 5569ce7d
  Lang Hames authored Mar 29, 2012
```
llvm-svn: 153680
```
  5569ce7d
Mar 19, 2012

This patch adds X86 instruction itineraries for non-pseudo opcodes in · 48ccc4df

Preston Gurd authored Mar 19, 2012

X86InstrCompiler.td.
 
It also adds –mcpu-generic to the legalize-shift-64.ll test so the test
will pass if run on an Intel Atom CPU, which would otherwise
produce an instruction schedule which differs from that which the test expects.

llvm-svn: 153033

48ccc4df

Feb 24, 2012
- Add WIN_FTOL_* psudo-instructions to model the unique calling convention · 248d65e7
  Michael J. Spencer authored Feb 24, 2012
```
used by the Win32 _ftol2 runtime function. Patch by Joe Groff!

llvm-svn: 151382
```
  248d65e7
Feb 16, 2012

Use the same CALL instructions for Windows as for everything else. · 97e3115d

Jakob Stoklund Olesen authored Feb 16, 2012

The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.

llvm-svn: 150708

97e3115d

Jan 16, 2012
- Make sure the non-SSE lowering for fences correctly clobbers EFLAGS. PR11768. · 206ca569
  Eli Friedman authored Jan 16, 2012
```
llvm-svn: 148240
```
  206ca569
- Get rid of unused codegen-only instruction. · 75e3db4c
  Eli Friedman authored Jan 16, 2012
```
llvm-svn: 148239
```
  75e3db4c
Jan 12, 2012
- X86: Generalize the x << (y & const) optimization to also catch masks with... · 5b3aa60b
  Benjamin Kramer authored Jan 12, 2012
```
X86: Generalize the x << (y & const) optimization to also catch masks with more set bits set than 31 or 63.

llvm-svn: 148024
```
  5b3aa60b
Dec 24, 2011

Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the · 7e9453e9

Chandler Carruth authored Dec 24, 2011

X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improve on in this
type of code.

llvm-svn: 147244

7e9453e9

Dec 20, 2011

Begin teaching the X86 target how to efficiently codegen patterns that · 24680c24

Chandler Carruth authored Dec 20, 2011

use the zero-undefined variants of CTTZ and CTLZ. These are just simple
patterns for now, there is more to be done to make real world code using
these constructs be optimized and codegen'ed properly on X86.

The existing tests are spiffed up to check that we no longer generate
unnecessary cmov instructions, and that we generate the very important
'xor' to transform bsr which counts the index of the most significant
one bit to the number of leading (most significant) zero bits. Also they
now check that when the variant with defined zero result is used, the
cmov is still produced.

llvm-svn: 146974

24680c24

Oct 26, 2011

Fixes an issue reported by -verify-machineinstrs. · b3285224
Rafael Espindola authored Oct 26, 2011
```
Patch by Sanjoy Das.

llvm-svn: 143064
```
b3285224

This commit introduces two fake instructions MORESTACK_RET and · 66393c12

Rafael Espindola authored Oct 26, 2011

MORESTACK_RET_RESTORE_R10; which are lowered to a RET and a RET
followed by a MOV respectively.  Having a fake instruction prevents
the verifier from seeing a MachineBasicBlock end with a
non-terminator (MOV).  It also prevents the rather eccentric case of a
MachineBasicBlock ending with RET but having successors nevertheless.

Patch by Sanjoy Das.

llvm-svn: 143062

66393c12

Sep 13, 2011

Fix the assembler strings for a couple of atomic instructions. Doesn't really... · d68a727b

Eli Friedman authored Sep 13, 2011

Fix the assembler strings for a couple of atomic instructions.  Doesn't really matter much in practice, but it's a bit cleaner.

llvm-svn: 139563

d68a727b

Sep 07, 2011

Fix atomic load and store on x86 to pass -verify-machineinstrs (and possibly... · 02f2f89a

Eli Friedman authored Sep 07, 2011

Fix atomic load and store on x86 to pass -verify-machineinstrs (and possibly fix some subtle bugs involving passes which check mayStore()).

This isn't exactly ideal, but it is good enough for the moment.

llvm-svn: 139245

02f2f89a

Sep 03, 2011

Pseudo CMOV instructions don't clobber EFLAGS. · 1f72dd40

Jakob Stoklund Olesen authored Sep 02, 2011

The explanation about a 0 argument being materialized as xor is no
longer valid.  Rematerialization will check if EFLAGS is live before
clobbering it.

The code produced by X86TargetLowering::EmitLoweredSelect does not
clobber EFLAGS.

This causes one less testb instruction to be generated in the cmov.ll
test case.

llvm-svn: 139057

1f72dd40

Aug 30, 2011

Adds a SelectionDAG node X86SegAlloca which will be custom lowered · 33530176

Rafael Espindola authored Aug 30, 2011

from DYNAMIC_STACKALLOC.

Two new pseudo instructions (SEG_ALLOCA_32 and SEG_ALLOCA_64) which
will match X86SegAlloca (based on word size) are also added.  They
will be custom emitted to inject the actual stack handling code.

Patch by Sanjoy Das.

llvm-svn: 138814

33530176

Aug 26, 2011
- Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction. · 5e570427
  Eli Friedman authored Aug 26, 2011
```
llvm-svn: 138660
```
  5e570427
Aug 24, 2011
- Basic x86 code generation for atomic load and store instructions. · 342e8df0
  Eli Friedman authored Aug 24, 2011
```
llvm-svn: 138478
```
  342e8df0
Aug 10, 2011
- Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 · 72323966
  Bruno Cardoso Lopes authored Aug 09, 2011
```
llvm-svn: 137179
```
  72323966
- Fix a couple ridiculous copy-paste errors. rdar://9914773 . · 4ef2426b
  Eli Friedman authored Aug 09, 2011
```
llvm-svn: 137160
```
  4ef2426b
Jul 27, 2011

X86ISD::MEMBARRIER does not require SSE2; it doesn't actually generate any... · e6d1853e

Eli Friedman authored Jul 27, 2011

X86ISD::MEMBARRIER does not require SSE2; it doesn't actually generate any code, and all x86 processors will honor the required semantics.

llvm-svn: 136249

e6d1853e

Jun 16, 2011
- Add a comment describing why transforming (shl x, 1) to (add x, x) is to be · 8eb36ef4
  Dan Gohman authored Jun 16, 2011
```
considered safe enough in this context.

llvm-svn: 133159
```
  8eb36ef4
May 21, 2011
- X86: smulo -> add is now done target-independently in DAGCombiner, remove the patterns. · e30b7007
  Benjamin Kramer authored May 21, 2011
```
llvm-svn: 131801
```
  e30b7007
May 20, 2011
- Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends. · 91f1d247
  Stuart Hastings authored May 20, 2011
```
rdar://problem/8614450

llvm-svn: 131746
```
  91f1d247
May 19, 2011
- Reverting 131641 to investigate 'bot complaint. · c72240bb
  Stuart Hastings authored May 19, 2011
```
llvm-svn: 131654
```
  c72240bb
- Revise MOVSX16rr8/MOVZX16rr8 (and rm variants) to no longer be · b476b0cc
  Stuart Hastings authored May 19, 2011
```
pseudos.  rdar://problem/8614450

llvm-svn: 131641
```
  b476b0cc
May 17, 2011
- Support XOR and AND optimization with no return value. · a1d9e295
  Eric Christopher authored May 17, 2011
```
Finishes off rdar://8470697

llvm-svn: 131458
```
  a1d9e295
May 11, 2011
- Optimize atomic lock or that doesn't use the result value. · 4a34e61e
  Eric Christopher authored May 10, 2011
```
Next up: xor and and.

Part of rdar://8470697

llvm-svn: 131171
```
  4a34e61e
May 10, 2011
- Refactor lock versions of binary operators to be a little less · e3346466
  Eric Christopher authored May 10, 2011
```
cut and paste.

llvm-svn: 131139
```
  e3346466
May 08, 2011

X86: Add a bunch of peeps for add and sub of SETB. · d724a590

Benjamin Kramer authored May 08, 2011

"b + ((a < b) ? 1 : 0)" compiles into
	cmpl	%esi, %edi
	adcl	$0, %esi
instead of
	cmpl	%esi, %edi
	sbbl	%eax, %eax
	andl	$1, %eax
	addl	%esi, %eax

This saves a register, a false dependency on %eax
(Intel's CPUs still don't ignore it) and it's shorter.

llvm-svn: 131070

d724a590

Feb 17, 2011
- The labyrinthine X86 backend no longer appears to require · f0f8e143
  Dan Gohman authored Feb 17, 2011
```
these patterns.

llvm-svn: 125759
```
  f0f8e143
Jan 26, 2011
- Target/X86: Tweak win64's tailcall. · 0cfdac07
  NAKAMURA Takumi authored Jan 26, 2011
```
llvm-svn: 124272
```
  0cfdac07
- Fix whitespace. · 9d29eff1
  NAKAMURA Takumi authored Jan 26, 2011
```
llvm-svn: 124270
```
  9d29eff1
Jan 18, 2011
- The stub routine that we're calling uses test and so clobbers · 542f8a52
  Eric Christopher authored Jan 18, 2011
```
the flags.

llvm-svn: 123712
```
  542f8a52
Dec 20, 2010

We lower setb to sbb with the hope that the and will go away, when it · 46b9efca

Chris Lattner authored Dec 20, 2010

doesn't, match it back to setb.

On a 64-bit version of the testcase before we'd get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	sbbb	%dl, %dl
	andb	$1, %dl
	ret

now we get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	setb	%dl
	ret

llvm-svn: 122217

46b9efca

Dec 19, 2010

improve the setcc -> setcc_carry optimization to happen more · 9edf3f50

Chris Lattner authored Dec 19, 2010

consistently by moving it out of lowering into dag combine.

Add some missing patterns for matching away extended versions of setcc_c.

llvm-svn: 122201

9edf3f50