Commits · f7f7ddfd7e553cd3bf9e4cf235d503941521832b · Roger Ferrer / llvm-epi-0.8

Nov 05, 2007

[ARM] Fix code generation for: · 1a30c18e
Lauro Ramos Venancio authored Nov 05, 2007
```
static __thread struct {
    int a;
    int b;
} teste = {0, 0};

llvm-svn: 43722
```
Use movups to spill / restore SSE registers on targets where stack alignment is · 9337929a
Evan Cheng authored Nov 05, 2007
```
less than 16. This is a temporary solution until dynamic stack alignment is
implemented.

llvm-svn: 43703
```
Added support for PIC code with "explicit relocations" *only*. · 3e0d030d
Bruno Cardoso Lopes authored Nov 05, 2007
```
Removed all macro code for PIC (goodbye "la").
Support tested with shootout bench.

llvm-svn: 43697
```

Eliminate the remaining uses of getTypeSize. This · 283207a7

Duncan Sands authored Nov 05, 2007

should only affect x86 when using long double.  Now
12/16 bytes are output for long double globals (the
exact amount depends on the alignment).  This brings
globals in line with the rest of LLVM: the space
reserved for an object is now always the ABI size.
One tricky point is that only 10 bytes should be
output for long double if it is a field in a packed
struct, which is the reason for the additional
argument to EmitGlobalConstant.

llvm-svn: 43688


Nov 04, 2007
- Fix PR1761 by not printing (rip) suffix when in -static mode. · 9329e780
  Chris Lattner authored Nov 04, 2007
```
Evan, please review this.

llvm-svn: 43680
```
- Fix crash before main on ppc/linux with static constructors. PR1771 · d954dcd1
  Nick Lewycky authored Nov 04, 2007
```
llvm-svn: 43676
```
- Fix PR1763 by allowing the 'q' constraint to work with 64-bit · 296160d4
  Chris Lattner authored Nov 04, 2007
```
regs on x86-64.

llvm-svn: 43669
```
Nov 02, 2007
- Unbreak tailcall opt. · 2b93a20b
  Evan Cheng authored Nov 02, 2007
```
llvm-svn: 43646
```
- add a note · 389d430c
  Chris Lattner authored Nov 02, 2007
```
llvm-svn: 43642
```
- Missing a getNumOperands check. · e453ff49
  Evan Cheng authored Nov 02, 2007
```
llvm-svn: 43630
```
Nov 01, 2007

Executive summary: getTypeSize -> getTypeStoreSize / getABITypeSize. · 44b8721d

Duncan Sands authored Nov 01, 2007

The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment.  This gives a primitive type for
which getTypeSize differed from getABITypeSize.  For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwritten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).

This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition).  Instead there is:

(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type.  For a primitive type, this is the minimum number
of bits.  For an i36 this is 36 bits.  For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.

(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it).  For an
i36 this is 40 bits, for an x86 long double it is 80 bits.  This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes).  There doesn't seem to be anything
corresponding to this in gcc.

(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment.  For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS.  This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes).  This is
TYPE_SIZE in gcc.
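
The relationship between the three sizes can be sketched as plain arithmetic. This is only an illustration, not LLVM's implementation: `roundUpTo`, `storeSizeInBits` and `abiSizeInBits` are hypothetical helper names, and the alignments passed in (64 bits for i36, 32 or 128 bits for x86 long double) are assumptions chosen to reproduce the numbers quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to the next multiple of `multiple` (must be non-zero).
inline std::uint64_t roundUpTo(std::uint64_t value, std::uint64_t multiple) {
  assert(multiple != 0 && "multiple must be non-zero");
  return (value + multiple - 1) / multiple * multiple;
}

// (2) Bits read or written by a load/store: the type size (1) rounded up
// to a whole number of bytes.
inline std::uint64_t storeSizeInBits(std::uint64_t typeSizeInBits) {
  return roundUpTo(typeSizeInBits, 8);
}

// (3) The store size rounded up to a multiple of the ABI alignment,
// with the alignment given in bits (an assumed input, not queried from LLVM).
inline std::uint64_t abiSizeInBits(std::uint64_t typeSizeInBits,
                                   std::uint64_t abiAlignInBits) {
  return roundUpTo(storeSizeInBits(typeSizeInBits), abiAlignInBits);
}
```

Under these assumptions, `storeSizeInBits(36)` gives 40 and `abiSizeInBits(36, 64)` gives 64, while an 80-bit long double stays at 80 bits for stores and grows to 96 or 128 bits once rounded to a 32-bit or 128-bit alignment.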

Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize.  This means that the size of an array
is the length times the getABITypeSize.  It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize.  Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case.  So allocas and mallocs should use getABITypeSize.  Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.
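
As a rough sketch of the array and GEP rule above, element offsets and total sizes are multiples of the ABI size rather than the store size. `TypeInfo`, `abiSize`, `elementOffset` and `arraySize` are hypothetical names used only for this illustration, and the byte sizes in the usage note (a 10-byte store size with 4-byte or 16-byte alignment for x86 long double) are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a type: store size and ABI alignment, in bytes.
struct TypeInfo {
  std::uint64_t storeSize;
  std::uint64_t abiAlign;
};

// ABI size: the store size rounded up to a multiple of the alignment.
inline std::uint64_t abiSize(const TypeInfo &type) {
  assert(type.abiAlign != 0 && "alignment must be non-zero");
  return (type.storeSize + type.abiAlign - 1) / type.abiAlign * type.abiAlign;
}

// Byte offset of element `index` in an array (the GEP-style computation).
inline std::uint64_t elementOffset(const TypeInfo &element,
                                   std::uint64_t index) {
  return index * abiSize(element);
}

// Total bytes for an array, or for an alloca of `count` elements.
inline std::uint64_t arraySize(const TypeInfo &element, std::uint64_t count) {
  return count * abiSize(element);
}
```

With a 10-byte, 4-byte-aligned element, `elementOffset` places element 3 at byte 36 and a four-element array takes 48 bytes; with 16-byte alignment the same array takes 64 bytes.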

Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.

In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases).  I will get around to auditing these too at some point,
but I could do with some help.

Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize.  I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only affects structs containing long doubles and arbitrary
precision integers.  If someone wants to pack these types more
tightly they can always use a packed struct.
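
The effect on struct layout can be approximated with a small model. This is a simplified sketch, not the layout code from the patch: `Field`, `alignTo` and `structSize` are hypothetical names, the unpacked branch assumes each field is aligned to its ABI alignment and occupies its ABI size, and the packed branch assumes fields are laid out back to back at their store size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical field description: store size and ABI alignment, in bytes.
struct Field {
  std::uint64_t storeSize;
  std::uint64_t abiAlign;
};

// Round `value` up to a multiple of `align` (must be non-zero).
inline std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  assert(align != 0 && "alignment must be non-zero");
  return (value + align - 1) / align * align;
}

// Size in bytes of a struct made of `fields`.
inline std::uint64_t structSize(const std::vector<Field> &fields, bool packed) {
  std::uint64_t offset = 0;
  std::uint64_t maxAlign = 1;
  for (const Field &field : fields) {
    if (packed) {
      // Packed: no padding, each field takes exactly its store size.
      offset += field.storeSize;
    } else {
      // Unpacked: align the field, then reserve its full ABI size.
      offset = alignTo(offset, field.abiAlign);
      offset += alignTo(field.storeSize, field.abiAlign);
      if (field.abiAlign > maxAlign)
        maxAlign = field.abiAlign;
    }
  }
  // An unpacked struct is padded out to its largest field alignment.
  return packed ? offset : alignTo(offset, maxAlign);
}
```

For a 10-byte, 4-byte-aligned long double followed by a single byte, this model reserves 12 bytes for the first field and gives a 16-byte unpacked struct, while the packed form takes 11 bytes.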

llvm-svn: 43620


Silence, accursed warning · b7cabbe2
Bill Wendling authored Nov 01, 2007
```
llvm-svn: 43609
```

Oct 31, 2007
- Make ARM and X86 LowerMEMCPY identical by moving the isThumb check into getMaxInlineSizeThreshold · 419b6d7c
  Rafael Espindola authored Oct 31, 2007
```
and by restructuring the X86 version.

Now I just have to move this to a common place :-)

llvm-svn: 43554
```
- Make ARM and X86 memcpy expansion more similar to each other. · 063f1773
  Rafael Espindola authored Oct 31, 2007
```
Now both subtargets define getMaxInlineSizeThreshold and the expansion uses it.

This should not change generated code.

llvm-svn: 43552
```
- Make i64=expand_vector_elt(v2i64) work in 32-bit mode. · b066c1f2
  Dale Johannesen authored Oct 31, 2007
```
llvm-svn: 43535
```
Oct 30, 2007
- Add missing SSE builtins: CVTPD2PI, CVTPS2PI, · d50c8bce
  Dale Johannesen authored Oct 30, 2007
```
CVTTPD2PI, CVTTPS2PI, CVTPI2PD, CVTPI2PS.

llvm-svn: 43523
```
- Fix for visibility warnings generated by gcc-4.2. · b508c53c
  Duncan Sands authored Oct 30, 2007
```
llvm-svn: 43500
```
- Add missing MMX PSUBQ. · 6aa304e5
  Dale Johannesen authored Oct 30, 2007
```
llvm-svn: 43488
```
Oct 29, 2007
- Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) · e106e2f1
  Evan Cheng authored Oct 29, 2007
```
transformation. Previously, it was restricted to loads with exactly one use.
Now the restriction is loosened by allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).

llvm-svn: 43465
```
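
The setcc rewrite in the message above depends on an equality comparison giving the same answer before and after both operands are sign-extended. A small exhaustive check of that property, assuming 8-bit operands widened to 32 bits (the widths and all function names here are illustrative choices, not taken from the commit):

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit pattern to 32 bits without relying on
// implementation-defined narrowing conversions.
inline std::int32_t signExtend8(std::uint8_t bits) {
  return (bits & 0x80u) ? static_cast<std::int32_t>(bits) - 256
                        : static_cast<std::int32_t>(bits);
}

// Models "setcc x, c, eq" on the narrow bit patterns.
inline bool narrowEq(std::uint8_t x, std::uint8_t c) { return x == c; }

// Models "setcc sext(x), sext(c), eq" on the widened values.
inline bool extendedEq(std::uint8_t x, std::uint8_t c) {
  return signExtend8(x) == signExtend8(c);
}

// Exhaustively compare the two forms over all 8-bit operand pairs.
inline bool extensionPreservesEq() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned c = 0; c < 256; ++c) {
      const std::uint8_t nx = static_cast<std::uint8_t>(x);
      const std::uint8_t nc = static_cast<std::uint8_t>(c);
      if (narrowEq(nx, nc) != extendedEq(nx, nc))
        return false;
    }
  }
  return true;
}
```

Sign extension maps distinct bit patterns to distinct values, so the two comparisons agree for every pair and `extensionPreservesEq()` returns true.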
- Avoid doing something dumb like rewriting using a 64-bit iv in 32-bit mode. · 7b3f7fea
  Evan Cheng authored Oct 29, 2007
```
llvm-svn: 43446
```
- add a note. · 909a54cc
  Chris Lattner authored Oct 29, 2007
```
llvm-svn: 43444
```
- Add support for the x86-64 'q' register modifier, and add support for the · 5e99fd8c
  Chris Lattner authored Oct 29, 2007
```
b/h/w/k/q inline asm memory modifiers, which are just ignored.  This fixes
PR1748 and CodeGen/X86/2007-10-28-inlineasm-q-modifier.ll

llvm-svn: 43430
```
- Fix PR1749 and InstCombine/2007-10-28-EmptyField.ll by handling · 9a641510
  Chris Lattner authored Oct 29, 2007
```
zero-length fields better.

llvm-svn: 43427
```
Oct 28, 2007
- New entry. · c826ac53
  Evan Cheng authored Oct 28, 2007
```
llvm-svn: 43420
```
Oct 26, 2007

Fix off-by-one stack offset computations (dwarf information) for callee-saved · d07d6a41

Anton Korobeynikov authored Oct 26, 2007

registers in the case where the frame pointer was eliminated. This should fix miscellaneous random
EH-related crashes when code is compiled with -fomit-frame-pointer.
Thanks Duncan for nailing this bug!

llvm-svn: 43381


clo/clz aren't supported on mips I. Keep them around for when we'll · 18063916
Eric Christopher authored Oct 26, 2007
```
want them later (mips32/64).

llvm-svn: 43380
```

Loosen up iv reuse to allow reuse of the same stride but a larger type when... · 7f3d0247

Evan Cheng authored Oct 26, 2007

Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free.
e.g.
Turns this loop:
LBB1_1: # entry.bb_crit_edge
        xorl    %ecx, %ecx
        xorw    %dx, %dx
        movw    %dx, %si
LBB1_2: # bb
        movl    L_X$non_lazy_ptr, %edi
        movw    %si, (%edi)
        movl    L_Y$non_lazy_ptr, %edi
        movw    %dx, (%edi)
        addw    $4, %dx
        incw    %si
        incl    %ecx
        cmpl    %eax, %ecx
        jne     LBB1_2  # bb

into

LBB1_1: # entry.bb_crit_edge
        xorl    %ecx, %ecx
        xorw    %dx, %dx
LBB1_2: # bb
        movl    L_X$non_lazy_ptr, %esi
        movw    %cx, (%esi)
        movl    L_Y$non_lazy_ptr, %esi
        movw    %dx, (%esi)
        addw    $4, %dx
        incl    %ecx
        cmpl    %eax, %ecx
        jne     LBB1_2  # bb
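
In source-level terms, the second listing drops the separate 16-bit counter (`%si`) and stores the low 16 bits of the 32-bit counter (`%cx`) instead. A minimal sketch of why the two produce the same stored values, using hypothetical function names and an output vector in place of the store to X:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Before: a dedicated 16-bit induction variable runs next to the
// 32-bit trip counter.
inline std::vector<std::uint16_t> separateIvs(std::uint32_t tripCount) {
  std::vector<std::uint16_t> stored;
  std::uint16_t narrow = 0;
  for (std::uint32_t i = 0; i != tripCount; ++i) {
    stored.push_back(narrow);
    ++narrow;  // wraps modulo 2^16
  }
  return stored;
}

// After: only the 32-bit counter remains; truncating it to 16 bits
// (free on x86, which can read the low half of the register) gives
// the value that is stored.
inline std::vector<std::uint16_t> reusedIv(std::uint32_t tripCount) {
  std::vector<std::uint16_t> stored;
  for (std::uint32_t i = 0; i != tripCount; ++i)
    stored.push_back(static_cast<std::uint16_t>(i));
  return stored;
}
```

Both functions return identical sequences, including past the point where the 16-bit value wraps, because truncation and increment both work modulo 2^16.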

llvm-svn: 43375


Oct 25, 2007
- Support non-POSIX hosts by removing use of strncasecmp. · 65e804a9
  Dale Johannesen authored Oct 25, 2007
```
llvm-svn: 43364
```
Oct 24, 2007
- Disable a couple more things for ppcf128. · 10f41524
  Dale Johannesen authored Oct 23, 2007
```
llvm-svn: 43267
```
Oct 23, 2007
- Temporary solution: added a different set of BCTRL_Macho / BCTRL_ELF with... · ec271b10
  Evan Cheng authored Oct 23, 2007
```
Temporary solution: added a different set of BCTRL_Macho / BCTRL_ELF with right callee-saved defs set for ppc64.

llvm-svn: 43248
```
- Fix memcpy lowering when addresses are 4-byte aligned but size is not multiple of 4. · 1f2dd358
  Evan Cheng authored Oct 22, 2007
```
llvm-svn: 43234
```
Oct 22, 2007
- Fix the folding of multiplication into addresses on x86, which was broken · bf474959
  Dan Gohman authored Oct 22, 2007
```
by the recent {U,S}MUL_LOHI changes.

llvm-svn: 43230
```
- Use ptr type in the immediate field of a BxA instruction so we don't end up... · bdbed663
  Evan Cheng authored Oct 22, 2007
```
Use ptr type in the immediate field of a BxA instruction so we don't end up selecting 32-bit call instruction for ppc64.

llvm-svn: 43228
```
- Fix an unfolding bug. · c92446af
  Evan Cheng authored Oct 22, 2007
```
llvm-svn: 43212
```
Oct 21, 2007
- Allow for copysign having f80 second argument. · 8ee70112
  Dale Johannesen authored Oct 21, 2007
```
Fixes 5550319.

llvm-svn: 43205
```
Oct 20, 2007
- Resolve unfold tables ambiguity. · 45e096c7
  Evan Cheng authored Oct 19, 2007
```
llvm-svn: 43194
```
Oct 19, 2007

Local spiller optimization: · 35ff7937

Evan Cheng authored Oct 19, 2007

Turn a store folding instruction into a load folding instruction. e.g.
     xorl  %edi, %eax
     movl  %eax, -32(%ebp)
     movl  -36(%ebp), %eax
     orl   %eax, -32(%ebp)
=>
     xorl  %edi, %eax
     orl   -36(%ebp), %eax
     movl  %eax, -32(%ebp)
This enables the unfolding optimization for a subsequent instruction which will
also eliminate the newly introduced store instruction.
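
The two sequences can be modelled on a tiny machine state to see what the rewrite preserves. This is an illustrative model only: the `State` struct and both function names are hypothetical, with `slot32` and `slot36` standing for the stack slots at -32(%ebp) and -36(%ebp).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical machine state: two registers and two stack slots.
struct State {
  std::uint32_t eax;
  std::uint32_t edi;
  std::uint32_t slot32;  // -32(%ebp)
  std::uint32_t slot36;  // -36(%ebp)
};

// Original sequence: the final orl folds a store into -32(%ebp).
inline State storeFolding(State s) {
  s.eax ^= s.edi;      // xorl %edi, %eax
  s.slot32 = s.eax;    // movl %eax, -32(%ebp)
  s.eax = s.slot36;    // movl -36(%ebp), %eax
  s.slot32 |= s.eax;   // orl  %eax, -32(%ebp)
  return s;
}

// Rewritten sequence: the orl folds a load from -36(%ebp) instead.
inline State loadFolding(State s) {
  s.eax ^= s.edi;      // xorl %edi, %eax
  s.eax |= s.slot36;   // orl  -36(%ebp), %eax
  s.slot32 = s.eax;    // movl %eax, -32(%ebp)
  return s;
}
```

In this model both versions leave the same value in `slot32`. The final `eax` differs (the original ends with the reloaded `slot36` value, the rewrite with the combined result), so the model suggests the rewrite is only applicable when the reloaded value in `%eax` is not needed afterwards; that condition is an inference from the model, not something the commit message states.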

llvm-svn: 43192


split LowerMEMCPY into LowerMEMCPYCall and LowerMEMCPYInline in the ARM backend. · 18a831d7
Rafael Espindola authored Oct 19, 2007
```
llvm-svn: 43176
```

Add support for byval function whose argument is not 32 bit aligned. · 846c19dd

Rafael Espindola authored Oct 19, 2007

To do this it is necessary to add an "always inline" argument to the
memcpy node. For completeness I have also added this node to memmove
and memset.  I have also added getMem* functions, because the extra
argument makes it cumbersome to use getNode and because I get confused
by it :-)

llvm-svn: 43172


comment fixes · b193576b
Chris Lattner authored Oct 19, 2007
```
llvm-svn: 43168
```