- Jul 29, 2010
-
Jim Grosbach authored
llvm-svn: 109691
-
- Jul 28, 2010
-
Jakob Stoklund Olesen authored
The size of this object isn't used for anything; technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot and fixes PR7735. llvm-svn: 109652
-
Dan Gohman authored
of a std::vector. llvm-svn: 109597
-
Dan Gohman authored
to avoid undefined behavior on overflow, noticed by John Regehr. llvm-svn: 109594
-
Nate Begeman authored
This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566
-
Nate Begeman authored
~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches. For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                   ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>   ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                         ## @shl
  pslld   $23, %xmm1
  paddd   LCPI0_0, %xmm1
  cvttps2dq %xmm1, %xmm1
  pmulld  %xmm1, %xmm0
  ret

Instead of:

_shl:                         ## @shl
  pshufd  $3, %xmm0, %xmm2
  movd    %xmm2, %eax
  pshufd  $3, %xmm1, %xmm2
  movd    %xmm2, %ecx
  shll    %cl, %eax
  movd    %eax, %xmm2
  pshufd  $1, %xmm0, %xmm3
  movd    %xmm3, %eax
  pshufd  $1, %xmm1, %xmm3
  movd    %xmm3, %ecx
  shll    %cl, %eax
  movd    %eax, %xmm3
  punpckldq %xmm2, %xmm3
  movd    %xmm0, %eax
  movd    %xmm1, %ecx
  shll    %cl, %eax
  movd    %eax, %xmm2
  movhlps %xmm0, %xmm0
  movd    %xmm0, %eax
  movhlps %xmm1, %xmm1
  movd    %xmm1, %ecx
  shll    %cl, %eax
  movd    %eax, %xmm0
  punpckldq %xmm0, %xmm2
  movdqa  %xmm2, %xmm0
  punpckldq %xmm3, %xmm0
  ret

llvm-svn: 109549
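For reference, the fast sequence is the standard trick of materializing 2^a in a float's exponent field and then multiplying. A minimal C++ sketch with SSE intrinsics, assuming SSE4.1 and shift amounts in [0, 31] (the helper name is illustrative, not from the patch):

#include <smmintrin.h>  // SSE4.1, for _mm_mullo_epi32 (pmulld)

// Per-lane variable shift-left of four 32-bit lanes, mirroring the
// codegen above.
static __m128i shl_v4i32(__m128i r, __m128i a) {
  // (a << 23) + 0x3f800000 builds the float 2^a in each lane: the bias
  // constant is the bit pattern of 1.0f (biased exponent 127).
  __m128i exp = _mm_add_epi32(_mm_slli_epi32(a, 23),
                              _mm_set1_epi32(0x3f800000));
  // Truncating float->int conversion (cvttps2dq) recovers the integer 2^a.
  __m128i pow2 = _mm_cvttps_epi32(_mm_castsi128_ps(exp));
  // r << a == r * 2^a in the low 32 bits (pmulld).
  return _mm_mullo_epi32(r, pow2);
}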
-
- Jul 27, 2010
-
Michael J. Spencer authored
llvm-svn: 109494
-
Jakob Stoklund Olesen authored
subregister operands like this:

  %reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)

Make them return false when subreg operands are present. VirtRegRewriter is making bad assumptions otherwise. This fixes PR7713. llvm-svn: 109489
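The kind of guard this implies looks roughly like the following C++ sketch against the MachineInstr API (the helper is hypothetical, not the actual patch):

#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineOperand.h"

using namespace llvm;

// Before reporting an instruction as a plain stack-slot load/store, bail
// out if any register operand carries a subregister index: the memory
// access then touches only part of the register, so it is not a full
// spill or reload.
static bool hasSubregOperand(const MachineInstr &MI) {
  for (const MachineOperand &MO : MI.operands())
    if (MO.isReg() && MO.getSubReg())
      return true;
  return false;
}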
-
Jakob Stoklund Olesen authored
with a too-big register class. llvm-svn: 109488
-
Eli Friedman authored
llvm-svn: 109458
-
Anton Korobeynikov authored
llvm-svn: 109456
-
- Jul 26, 2010
-
Evan Cheng authored
llvm-svn: 109450
-
Anton Korobeynikov authored
llvm-svn: 109448
-
Bruno Cardoso Lopes authored
we are using AVX and no AVX version of the desired instruction is present; this is better for incremental development (without fallbacks it's easier to spot what's missing). Not sure this is the best hack though (we could also disable all HasSSE* predicates by dynamically marking them 'false' if AVX is present). llvm-svn: 109434
-
Anton Korobeynikov authored
This assumption is not satisfied due to global merging. Work around the issue by temporarily disabling merging of const globals. Also, ignore LLVM "special" globals. This fixes PR7716. llvm-svn: 109423
-
Evan Cheng authored
llvm-svn: 109421
-
- Jul 25, 2010
-
Douglas Gregor authored
llvm-svn: 109373
-
Douglas Gregor authored
llvm-svn: 109372
-
- Jul 24, 2010
-
Anton Korobeynikov authored
llvm-svn: 109359
-
Evan Cheng authored
appropriate for targets without detailed instruction itineraries. The scheduler schedules for increased instruction-level parallelism in low register pressure situations; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300
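A minimal sketch of that two-mode heuristic, in plain C++ with illustrative names (none of these types or thresholds are the actual scheduler code): favor the critical path while pressure is low, and favor freeing registers once pressure is high.

#include <algorithm>
#include <vector>

struct Candidate {
  int Height;    // critical-path height; larger = more latency-critical
  int RegDelta;  // net change in live registers if scheduled; negative frees regs
};

// Pick the next instruction from a non-empty ready list.
static const Candidate *pick(const std::vector<Candidate> &Ready,
                             int LiveRegs, int RegLimit) {
  bool HighPressure = LiveRegs >= RegLimit;  // threshold is illustrative
  return &*std::min_element(
      Ready.begin(), Ready.end(),
      [&](const Candidate &A, const Candidate &B) {
        if (HighPressure)
          return A.RegDelta < B.RegDelta;  // prefer freeing registers
        return A.Height > B.Height;        // prefer the critical path (ILP)
      });
}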
-
Bruno Cardoso Lopes authored
llvm-svn: 109295
-
Jim Grosbach authored
function live-in set. This will give us tGPR for Thumb1 and GPR otherwise, so the copy will be spillable. rdar://8224931 llvm-svn: 109293
-
Dale Johannesen authored
comments explaining why it was wrong. 8225024. Fix the real problem in 8213383: the code that splits very large blocks when no other place to put constants can be found was not considering the case where the block contained a Thumb tablejump. llvm-svn: 109282
-
Evan Cheng authored
it's too late to start backing off aggressive latency scheduling when most of the registers are in use, so the threshold should be a bit tighter.
- Correctly handle live-outs, extract_subreg, etc.
- Enable register pressure aware scheduling by default for the hybrid scheduler. For ARM, this is almost always a win on number of instructions. It's runtime-neutral for most of the tests, but for some kernels with high register pressure it can be a huge win; e.g., 464.h264ref reduced the number of spills by 54 and sped up by 20%. llvm-svn: 109279
-
Bruno Cardoso Lopes authored
llvm-svn: 109276
-
- Jul 23, 2010
-
Bruno Cardoso Lopes authored
llvm-svn: 109248
-
Gabor Greif authored
llvm-svn: 109224
-
Gabor Greif authored
llvm-svn: 109222
-
Bruno Cardoso Lopes authored
llvm-svn: 109207
-
Bruno Cardoso Lopes authored
llvm-svn: 109206
-
Bruno Cardoso Lopes authored
Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual. llvm-svn: 109204
-
Dale Johannesen authored
SSE, so we can't return floating point values if this is disabled. Detect this error for clang. With SSE1 only, f64 is a problem; it can be done, but neither llvm-gcc nor clang has ever generated correct code for it. Since nobody noticed this, I think it's OK to treat it as an error for now. This also handles SSE-sized vectors of floating point. 8207686, 8204109. llvm-svn: 109201
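For illustration, this is the sort of source that now gets a diagnostic instead of silently wrong code when SSE is disabled on x86-64 (the exact flags are an assumption about typical usage):

// Under the x86-64 SysV calling convention, floating-point arguments and
// return values travel in %xmm registers, which do not exist without SSE.
// Reproduce with something like: clang -target x86_64 -mno-sse -c fp.c
double half(double x) { return x * 0.5; }  // FP arg/return require SSE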
-
Bruno Cardoso Lopes authored
Fix some AVX instructions which didn't have the HasAVX prefix, and also a problem with PINSRW, which was totally wrong because of a typo I introduced previously. llvm-svn: 109198
-
- Jul 22, 2010
-
Chris Lattner authored
ARM/PPC/MSP430-specific code (these are the only targets that implement the hook) can directly reference their target-specific InstrInfo classes. llvm-svn: 109171
-
Bruno Cardoso Lopes authored
Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA, but we still miss instructions behind the FMA3 and CLMUL feature flags, which are the next step. llvm-svn: 109168
-
Chris Lattner authored
llvm-svn: 109167
-
Chris Lattner authored
This is probably not the best way to implement "force LR to be spilled if the Thumb function size is > 2048"; to do this right, it should use the branch shortening infrastructure, but I'm just preserving functionality here. llvm-svn: 109165
-
Chris Lattner authored
llvm-svn: 109154
-
Chris Lattner authored
rip out the implementation of X86InstrInfo::GetInstSizeInBytes. The code being ripped out just implemented a copied and hacked-up version of the (old) instruction encoder, and is buggy and terrible in other ways. Since "GetInstSizeInBytes" is really only there to support the JIT's "NeedsExactSize" hook (which no one is using), just rip out the code. I will rip out the NeedsExactSize hook next. This resolves rdar://7617809 - switch X86InstrInfo::GetInstSizeInBytes to use X86MCCodeEmitter. llvm-svn: 109149
-
Xerxes Ranby authored
llvm-svn: 109125
-