Commits · 0b271eb7af4877f914528b6bcbbb8fffcb563720 · Roger Ferrer / llvm-epi-0.8

Nov 02, 2007
- Fix a thinko. · 04059dd3
  Duncan Sands authored Nov 02, 2007
```
llvm-svn: 43639
```
  04059dd3
Nov 01, 2007

Executive summary: getTypeSize -> getTypeStoreSize / getABITypeSize. · 44b8721d

Duncan Sands authored Nov 01, 2007

The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment.  This gives a primitive type for
which getTypeSize differed from getABITypeSize.  For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwritten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).

This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition).  Instead there is:

(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type.  For a primitive type, this is the minimum number
of bits.  For an i36 this is 36 bits.  For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.

(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it).  For an
i36 this is 40 bits, for an x86 long double it is 80 bits.  This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes).  There doesn't seem to be anything
corresponding to this in gcc.

(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment.  For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS.  This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes).  This is
TYPE_SIZE in gcc.
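
The three notions of size above can be sketched as plain arithmetic. This is only an illustrative model, not the LLVM implementation: `round_up`, `type_sizes` and the alignment values passed in are assumptions chosen to reproduce the numbers quoted in the message (real alignments come from the target description).

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return -(-value // multiple) * multiple


def type_sizes(size_in_bits: int, abi_align_bits: int) -> dict:
    """Return the three sizes, in bits, for a primitive type.

    abi_align_bits is an assumed input; it is target-specific in practice.
    """
    store_bits = round_up(size_in_bits, 8)           # whole bytes touched by a store
    abi_bits = round_up(store_bits, abi_align_bits)  # spacing between array elements
    return {"size": size_in_bits, "store": store_bits, "abi": abi_bits}


# i36, assuming a 64-bit ABI alignment as implied by the message.
i36 = type_sizes(36, abi_align_bits=64)
# x86 long double with an assumed 32-bit and an assumed 128-bit alignment.
long_double_align32 = type_sizes(80, abi_align_bits=32)
long_double_align128 = type_sizes(80, abi_align_bits=128)
```

Under these assumptions the i36 comes out as 36 / 40 / 64 bits, and the long double as 80 / 80 / 96 or 80 / 80 / 128 bits.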

Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize.  This means that the size of an array
is the length times the getABITypeSize.  It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize.  Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case.  So allocas and mallocs should use getABITypeSize.  Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.
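
As a rough illustration of the paragraph above, array sizes and element offsets both scale with the ABI size rather than the store size. The helper names below are hypothetical, and the 10-byte store size with 4-byte alignment for an x86 long double is an assumption matching the 96-bit case mentioned earlier.

```python
def abi_size_bytes(store_size_bytes: int, abi_align_bytes: int) -> int:
    """Store size rounded up to a multiple of the (assumed) ABI alignment."""
    return -(-store_size_bytes // abi_align_bytes) * abi_align_bytes


def array_size_bytes(length: int, elem_abi_size: int) -> int:
    """Size of an array: length times the element's ABI size."""
    return length * elem_abi_size


def element_offset_bytes(index: int, elem_abi_size: int) -> int:
    """Byte offset of element `index`, as a GEP-style computation would need."""
    return index * elem_abi_size


# Assumed: x86 long double stores 10 bytes and is 4-byte aligned.
LONG_DOUBLE_ABI = abi_size_bytes(10, 4)
four_long_doubles = array_size_bytes(4, LONG_DOUBLE_ABI)
third_element_offset = element_offset_bytes(3, LONG_DOUBLE_ABI)
```

Using the store size (10 bytes) here instead would place element 3 at byte 30, which is not 4-byte aligned.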

Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.

In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/CodeGen, lib/Target (the hard
cases).  I will get around to auditing these too at some point,
but I could do with some help.

Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize.  I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only affects structs containing long doubles and arbitrary
precision integers.  If someone wants to pack these types more
tightly they can always use a packed struct.
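
The effect of this choice on field offsets can be sketched with a simplified layout model. This is not LLVM's struct layout code; `struct_offsets` is a hypothetical helper, and the field sizes and alignments are assumptions for illustration.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return -(-value // multiple) * multiple


def struct_offsets(fields, use_abi_size: bool):
    """Compute field offsets for an unpacked struct.

    fields: list of (store_size_bytes, abi_align_bytes) pairs.
    use_abi_size: reserve the ABI size per field (new behaviour)
                  or only the store size (the tighter alternative).
    """
    offsets = []
    offset = 0
    for store_size, align in fields:
        offset = round_up(offset, align)   # align the start of the field
        offsets.append(offset)
        reserved = round_up(store_size, align) if use_abi_size else store_size
        offset += reserved
    return offsets


# Assumed: x86 long double (10-byte store, 4-byte alignment) followed by an i8.
fields = [(10, 4), (1, 1)]
tight = struct_offsets(fields, use_abi_size=False)
uniform = struct_offsets(fields, use_abi_size=True)
```

In this model the i8 lands at offset 10 when only the store size is reserved and at offset 12 when the ABI size is reserved, which is the extra padding the message refers to.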

llvm-svn: 43620

44b8721d

- Coalesce extract_subreg when both intervals are relatively small. · fe1ac528
Evan Cheng authored Nov 01, 2007
```
- Some code clean up.

llvm-svn: 43606
```
fe1ac528

Oct 31, 2007
- Promotion of sdiv/srem/udiv/urem. · 3b4668a5
  Duncan Sands authored Oct 31, 2007
```
llvm-svn: 43551
```
  3b4668a5
- Add a newline at the end of the file. · 21ca9396
  Duncan Sands authored Oct 31, 2007
```
llvm-svn: 43550
```
  21ca9396
- Add the skeleton of a better PHI elimination pass. · 0b59fa06
  Owen Anderson authored Oct 31, 2007
```
llvm-svn: 43542
```
  0b59fa06
- Some fixes to get MachineDomTree working better. · 9b8f34f2
  Owen Anderson authored Oct 31, 2007
```
llvm-svn: 43541
```
  9b8f34f2
- Make i64=expand_vector_elt(v2i64) work in 32-bit mode. · b066c1f2
  Dale Johannesen authored Oct 31, 2007
```
llvm-svn: 43535
```
  b066c1f2
Oct 30, 2007

Typo. · 0747bc1d
Evan Cheng authored Oct 30, 2007
```
llvm-svn: 43511
```
0747bc1d

Add support for expanding trunc stores. Consider · 9ad54650

Duncan Sands authored Oct 30, 2007

storing an i170 on a 32 bit machine.  This is first
promoted to a trunc-i170 store of an i256.  On a
little-endian machine this expands to a store of
an i128 and a trunc-i42 store of an i128.  The
trunc-i42 store is further expanded to a trunc-i42
store of an i64, then to a store of an i32 and a
trunc-i10 store of an i32.  At this point the operand
type is legal (i32) and expansion stops (legalization
of the trunc-i10 needs to be handled in LegalizeDAG.cpp).
On big-endian machines the high bits are stored first,
and some bit-fiddling is needed in order to generate
aligned stores.
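
The little-endian expansion described above can be sketched as a small recursion. This is an illustrative model only, not the legalizer's code: `expand_trunc_store`, its tuple format and the byte offsets are assumptions made for the example.

```python
def expand_trunc_store(stored_bits, operand_bits, legal_bits=32, offset=0):
    """Split a trunc-store of `stored_bits` bits out of an `operand_bits` value.

    Returns a list of (kind, stored_bits, operand_bits, byte_offset) tuples,
    assuming a little-endian target.
    """
    if stored_bits == operand_bits:
        # Nothing is truncated: this is a normal store.
        return [("store", stored_bits, operand_bits, offset)]
    if operand_bits <= legal_bits:
        # The operand type is legal: expansion stops here.
        return [("truncstore", stored_bits, operand_bits, offset)]
    half = operand_bits // 2
    if stored_bits <= half:
        # Only the low half carries stored bits.
        return expand_trunc_store(stored_bits, half, legal_bits, offset)
    # Low half is stored in full; the rest is a trunc-store of the high half.
    low = [("store", half, half, offset)]
    high = expand_trunc_store(stored_bits - half, half, legal_bits, offset + half // 8)
    return low + high


# trunc-i170 store of an i256 on a machine whose legal integer type is i32.
pieces = expand_trunc_store(170, 256, legal_bits=32)
```

For the i170 case this yields a store of an i128, a store of an i32 and a trunc-i10 store of an i32, matching the sequence in the message. The i128 store would itself still need ordinary store expansion, which the sketch does not model.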

llvm-svn: 43499

9ad54650

If a call to getTruncStore is for a normal store, · 341f093b

Duncan Sands authored Oct 30, 2007

offload to getStore rather than trying to handle
both cases at once (the assertions for example
assume the store really is truncating).

llvm-svn: 43498

341f093b

Oct 29, 2007
- Fix a DAGCombiner abort on a bitcast from a scalar to a vector. · ae95d72a
  Dan Gohman authored Oct 29, 2007
```
llvm-svn: 43470
```
  ae95d72a
- Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) · e106e2f1
  Evan Cheng authored Oct 29, 2007
```
transformation. Previously, it was restricted to loads with exactly one use.
Now the restriction is loosened by allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).

llvm-svn: 43465
```
  e106e2f1
- Add explicit keywords. · 1961c28d
  Dan Gohman authored Oct 29, 2007
```
llvm-svn: 43464
```
  1961c28d
Oct 28, 2007

The guaranteed alignment of ptr+offset is only the minimum of · 1826deda

Duncan Sands authored Oct 28, 2007

offset and the alignment of ptr if these are both powers of
2.  While the ptr alignment is guaranteed to be a power of 2,
there is no reason to think that offset is.  For example, if
offset is 12 (the size of a long double on x86-32 linux) and
the alignment of ptr is 8, then the alignment of ptr+offset
will in general be 4, not 8.  Introduce a function MinAlign,
lifted from gcc, for computing the minimum guaranteed alignment.
I've tried to fix up everywhere under lib/CodeGen/SelectionDAG/.
I also changed some places that weren't wrong (because both values
were a power of 2), as a defensive change against people copying
and pasting the code.
Hopefully someone who cares about alignment will review the rest
of LLVM and fix up the remaining places.  Since I'm on x86 I'm
not very motivated to do this myself...
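
A minimal sketch of the idea, assuming the usual lowest-set-bit formulation (the largest power of two dividing both values); the name `min_align` is chosen here for illustration and the commit's exact code may differ:

```python
def min_align(a: int, b: int) -> int:
    """Largest power of two that divides both a and b.

    Assumes a and b are non-negative and not both zero.
    """
    combined = a | b
    # Isolate the lowest set bit of (a | b).
    return combined & -combined


# The example from the message: ptr aligned to 8, offset 12.
guaranteed = min_align(8, 12)
# When both values are powers of two this agrees with min().
both_pow2 = min_align(8, 16)
```

With alignment 8 and offset 12 this gives 4, whereas taking the plain minimum would wrongly claim 8.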

llvm-svn: 43421

1826deda

Oct 26, 2007

- Remove the hacky code that forces a memcpy. Alignment is taken care of in the · 6d15b32c

Bill Wendling authored Oct 26, 2007

  FE.
- Explicitly pass in the alignment of the load & store.
- XFAIL 2007-10-23-UnalignedMemcpy.ll because llc has a bug that crashes on
  unaligned pointers.

llvm-svn: 43398

6d15b32c

Oct 25, 2007
- Changed XXX to FIXME, and added comment to the README file · f73340ef
  Bill Wendling authored Oct 25, 2007
```
llvm-svn: 43359
```
  f73340ef
- Added comment explaining why we are doing this check. · 5f7ed00d
  Bill Wendling authored Oct 25, 2007
```
llvm-svn: 43353
```
  5f7ed00d
- Small formatting changes. Add a sanity check. · d385f075
  Duncan Sands authored Oct 25, 2007
```
Use NVT rather than looking it up, since we have
it to hand.

llvm-svn: 43341
```
  d385f075
- Promote SETCC operands. · a8f4ba6e
  Duncan Sands authored Oct 25, 2007
```
llvm-svn: 43340
```
  a8f4ba6e
- Correctly extract the ValueType from a VTSDNode. · cf0da033
  Duncan Sands authored Oct 25, 2007
```
llvm-svn: 43339
```
  cf0da033
- Another expansion for i64 multiply, suitable for PPC. · a4a972e3
  Dale Johannesen authored Oct 24, 2007
```
llvm-svn: 43314
```
  a4a972e3
Oct 24, 2007
- Fix comment and use the "Size" variable that's already provided. · 38ccabca
  Bill Wendling authored Oct 23, 2007
```
llvm-svn: 43271
```
  38ccabca
- If there's an unaligned memcpy to/from the stack, don't lower it. Just call the · e3b85929
  Bill Wendling authored Oct 23, 2007
```
memcpy library function instead.

llvm-svn: 43270
```
  e3b85929
- This broke lots. Reverting. · 6f149c05
  Bill Wendling authored Oct 23, 2007
```
llvm-svn: 43264
```
  6f149c05
Oct 23, 2007
- Lowering a memcpy to the stack is killing PPC. The ARM and X86 backends already · 8971440e
  Bill Wendling authored Oct 23, 2007
```
have their own custom memcpy lowering code. This code needs to be factored out
into a target-independent lowering method with hooks to the backend. In the
meantime, just call memcpy if we're trying to copy onto a stack.

llvm-svn: 43262
```
  8971440e
- It's possible to commute instructions with more than 3 operands. · 5d7032bb
  Evan Cheng authored Oct 23, 2007
```
llvm-svn: 43256
```
  5d7032bb
- isSubRegOf() is a dup of isSubRegister. · 847d42a8
  Evan Cheng authored Oct 23, 2007
```
llvm-svn: 43249
```
  847d42a8
Oct 22, 2007

Add missing parentheses. · 5163a8f5
Evan Cheng authored Oct 22, 2007
```
llvm-svn: 43227
```
5163a8f5
Support for expanding extending loads of integers with · 941db4da
Duncan Sands authored Oct 22, 2007
```
funky bit-widths.

llvm-svn: 43225
```
941db4da

Fix up the logic for result expanding the various extension · 8fc99506

Duncan Sands authored Oct 22, 2007

operations so they work right for integers with funky
bit-widths.  For example, consider extending i48 to i64
on a 32 bit machine.  The i64 result is expanded to 2 x i32.
We know that the i48 operand will be promoted to i64, then
also expanded to 2 x i32.  If we had the expanded promoted
operand to hand, then expanding the result would be trivial.
Unfortunately at this stage we can only get hold of the
promoted operand.  So instead we kind of hand-expand, doing
explicit shifting and truncating to get the top and bottom
halves of the i64 operand into 2 x i32, which are then used
to expand the result.  This is harmless, because when the
promoted operand is finally expanded all this bit fiddling
turns into trivial operations which are eliminated either
by the expansion code itself or the DAG combiner.
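
The hand-expansion for the sign-extending case can be sketched on plain integers. This only models the arithmetic (extend inside the promoted i64, then shift and truncate to obtain two i32 halves); the function names are hypothetical and nothing here reflects the actual SelectionDAG node construction.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext_in_i64(promoted: int, from_bits: int) -> int:
    """Sign-extend the low `from_bits` bits of a promoted 64-bit value.

    The high bits of a promoted operand are treated as garbage and ignored.
    """
    value = promoted & ((1 << from_bits) - 1)
    sign_bit = 1 << (from_bits - 1)
    return ((value ^ sign_bit) - sign_bit) & MASK64


def split_i64(value: int):
    """Return (lo, hi) i32 halves via truncate and shift-then-truncate."""
    lo = value & MASK32
    hi = (value >> 32) & MASK32
    return lo, hi


def expand_sext_i48_to_i64(promoted: int):
    """Expanded result (lo, hi) of sign-extending an i48 to i64."""
    return split_i64(sext_in_i64(promoted, 48))


# An i48 with its sign bit (bit 47) set.
negative_halves = expand_sext_i48_to_i64(0x800000000001)
# A small positive i48 whose promoted container has garbage in the top 16 bits.
positive_halves = expand_sext_i48_to_i64(0xABCD000000000005)
```

The first call produces a high half of 0xFFFF8000 and a low half of 1; the second ignores the garbage bits and produces (5, 0).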

llvm-svn: 43223

8fc99506

- Only perform the unfolding optimization when the folding in question is modref. · 85576037
Evan Cheng authored Oct 22, 2007
```
- Remove a bogus assertion.

llvm-svn: 43211
```
85576037

Oct 21, 2007
- Add promote operand support for [su]int_to_fp. · 36f06c80
  Chris Lattner authored Oct 20, 2007
```
llvm-svn: 43204
```
  36f06c80
Oct 20, 2007
- Add result promotion of FP_TO_*INT, fixing CodeGen/X86/trunc-to-bool.ll · 2ba4b148
  Chris Lattner authored Oct 20, 2007
```
with the new legalizer.

llvm-svn: 43199
```
  2ba4b148
- simplify some code. · 1c87f0c6
  Chris Lattner authored Oct 20, 2007
```
llvm-svn: 43198
```
  1c87f0c6
- Implement promote and expand for operands of memcpy and friends. · 2bcac640
  Chris Lattner authored Oct 20, 2007
```
This fixes CodeGen/X86/mem*.ll.

llvm-svn: 43197
```
  2bcac640
- Added missing curly braces; without them the if clause was useless in debug builds. · f1296712
  Evan Cheng authored Oct 20, 2007
```
llvm-svn: 43196
```
  f1296712
- Fix a few places vector operations were not getting · 771188cf
  Dale Johannesen authored Oct 20, 2007
```
the operand's type from the right place.

llvm-svn: 43195
```
  771188cf
Oct 19, 2007

Local spiller optimization: · 35ff7937

Evan Cheng authored Oct 19, 2007

Turn a store folding instruction into a load folding instruction. e.g.
     xorl  %edi, %eax
     movl  %eax, -32(%ebp)
     movl  -36(%ebp), %eax
     orl   %eax, -32(%ebp)
=>
     xorl  %edi, %eax
     orl   -36(%ebp), %eax
     mov   %eax, -32(%ebp)
This enables the unfolding optimization for a subsequent instruction which will
also eliminate the newly introduced store instruction.

llvm-svn: 43192

35ff7937

Don't branch fold inline asm statements. · ac5c9304
Bill Wendling authored Oct 19, 2007
```
llvm-svn: 43191
```
ac5c9304