Commits · ef206f19a4be428092b173339df138a3c17e1687 · Roger Ferrer / llvm-epi-0.8

Oct 15, 2012

Update the memcpy rewriting to fully support widened int rewriting. This · 49c8eea3

Chandler Carruth authored Oct 15, 2012

includes extracting ints for copying elsewhere and inserting ints when
copying into the alloca. This should fix the CanSROA assertion coming
out of Clang's regression test suite.

llvm-svn: 165931

49c8eea3

Follow-up fix to r165928: handle memset rewriting for widened integers, · 9d966a20

Chandler Carruth authored Oct 15, 2012

and generally clean up the memset handling. It had rotted a bit as the
other rewriting logic got polished more.

llvm-svn: 165930

9d966a20

First major step toward addressing PR14059. This teaches SROA to handle · 435c4e07

Chandler Carruth authored Oct 15, 2012

cases where we have partial integer loads and stores to an otherwise
promotable alloca to widen[1] those loads and stores to cover the entire
alloca and bitcast them into the appropriate type such that promotion
can proceed.

These partial loads and stores stem from an annoying confluence of ARM's
calling convention and ABI lowering and the FCA pre-splitting which
takes place in SROA. Clang lowers a { double, double } in-register
function argument as a [4 x i32] function argument to ensure it is
placed into integer 32-bit registers (a really unnerving implicit
contract between Clang and the ARM backend I would add). This results in
an FCA load of [4 x i32]* from the { double, double } alloca, and SROA
decomposes this into a sequence of i32 loads and stores. Inlining
proceeds, code gets folded, but at the end of the day, we still have i32
stores to the low and high halves of a double alloca. Widening these to
be i64 operations, and bitcasting them to double prior to loading or
storing allows promotion to proceed for these allocas.

I looked quite a bit at changing the IR which Clang produces for this case
to be more friendly, but small changes seem unlikely to help. I think
the best representation we could use currently would be to pass 4 i32
arguments thereby avoiding any FCAs, but that would still require this
fix. It seems like it might eventually be nice to somehow encode the ABI
register selection choices outside of the parameter type system so that
the parameter can be a { double, double }, but the CC register
annotations indicate that this should be passed via 4 integer registers.

This patch does not address the second problem in PR14059, which is the
reverse: when a struct alloca is loaded as a *larger* single integer.

This patch also does not address some of the code quality issues with
the FCA-splitting. Those don't actually impede any optimizations really,
but they're on my list to clean up.

[1]: Pedantic footnote: for those concerned about memory model issues
here, this is safe. For the alloca to be promotable, it cannot escape or
have any use of its address that could allow these loads or stores to be
racing. Thus, widening is always safe.

llvm-svn: 165928

435c4e07
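
As a rough illustration of the widening in r165928, the Python sketch below models two i32 stores to the halves of a double-sized slot as insertions into one 64-bit integer, followed by a bitcast to double. It is a toy model, not SROA's code: the helper names (`insert_int`, `double_to_bits`, `bits_to_double`) are hypothetical, and a little-endian layout is assumed.

```python
import struct


def insert_int(wide: int, value: int, byte_offset: int,
               value_bytes: int, wide_bytes: int = 8) -> int:
    """Overwrite value_bytes of a wide_bytes-wide integer at byte_offset.

    Assumes a little-endian layout: byte offset 0 is the least
    significant byte of the wide integer.
    """
    assert 0 <= byte_offset and byte_offset + value_bytes <= wide_bytes
    shift = 8 * byte_offset
    mask = ((1 << (8 * value_bytes)) - 1) << shift
    return (wide & ~mask) | ((value << shift) & mask)


def double_to_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a double (the 'bitcast')."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# Split 1.5 into two 32-bit halves, as the [4 x i32] lowering would.
bits = double_to_bits(1.5)
low_half, high_half = bits & 0xFFFFFFFF, bits >> 32

# Two partial i32 stores become two inserts into one i64 value.
slot = 0
slot = insert_int(slot, low_half, byte_offset=0, value_bytes=4)
slot = insert_int(slot, high_half, byte_offset=4, value_bytes=4)
result = bits_to_double(slot)
```

In this model the slot is only ever read and written as a whole value, which is the property that lets promotion proceed.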

Hoist the canConvertValue predicate and the convertValue transform out · aa6afbb8

Chandler Carruth authored Oct 15, 2012

into static helper functions. They're really quite generic and are going
to be needed elsewhere shortly.

llvm-svn: 165927

aa6afbb8

Add an enum for the return and function indexes into the AttrListPtr object.... · fbd38fe2

Bill Wendling authored Oct 15, 2012

Add an enum for the return and function indexes into the AttrListPtr object. This gets rid of some magic numbers.

llvm-svn: 165924

fbd38fe2

Attributes Rewrite · d079a446

Bill Wendling authored Oct 15, 2012

Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.

llvm-svn: 165917

d079a446
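
The "thin wrapper around a uniqued opaque object" design in r165917 resembles ordinary interning. The Python sketch below shows that pattern under stated assumptions: `Context`, `Attributes`, `_AttributesImpl`, and `get_attributes` are hypothetical names for illustration and do not reproduce LLVM's actual classes or signatures.

```python
class _AttributesImpl:
    """Opaque, immutable payload owned by the context."""

    def __init__(self, names: frozenset):
        self.names = names


class Attributes:
    """Thin wrapper: holds only a reference to the uniqued payload."""

    def __init__(self, impl: _AttributesImpl):
        self._impl = impl

    def has_attribute(self, name: str) -> bool:
        return name in self._impl.names

    def __eq__(self, other) -> bool:
        # Uniquing makes equality a cheap identity comparison.
        return isinstance(other, Attributes) and self._impl is other._impl

    def __hash__(self) -> int:
        return id(self._impl)


class Context:
    """Owns and uniques every payload created through it."""

    def __init__(self):
        self._unique = {}

    def get_attributes(self, names) -> Attributes:
        key = frozenset(names)
        impl = self._unique.get(key)
        if impl is None:
            impl = _AttributesImpl(key)
            self._unique[key] = impl
        return Attributes(impl)


ctx = Context()
first = ctx.get_attributes(["noalias", "nocapture"])
second = ctx.get_attributes(["nocapture", "noalias"])
```

Here `first` and `second` share one payload, so the wrapper stays cheap to copy and compare while the context decides how the data is stored.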

instcombine: Migrate strcmp and strncmp optimizations · 40b6fac3

Meador Inge authored Oct 15, 2012

This patch migrates the strcmp and strncmp optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

llvm-svn: 165915

40b6fac3

Oct 14, 2012

Unquadratize SetVector removal loops in DSE. · 650b1dbd

Benjamin Kramer authored Oct 14, 2012

Erasing from the beginning or middle of the vector is expensive; remove_if can
do it in linear time even though it's a bit ugly without lambdas.

No functionality change.

llvm-svn: 165903

650b1dbd
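
The cost difference behind r165903 can be sketched in Python, with a list standing in for the vector. The two function names are hypothetical; the second mimics the compaction that the C++ erase/remove_if idiom performs, and neither is the DSE code itself.

```python
def erase_one_at_a_time(items: list, is_dead) -> None:
    """Quadratic in the worst case: each pop(i) shifts every later
    element down by one slot."""
    i = 0
    while i < len(items):
        if is_dead(items[i]):
            items.pop(i)
        else:
            i += 1


def remove_if(items: list, is_dead) -> None:
    """Linear: copy survivors toward the front, then truncate once."""
    write = 0
    for value in items:
        if not is_dead(value):
            items[write] = value
            write += 1
    del items[write:]


slow = list(range(10))
fast = list(range(10))
erase_one_at_a_time(slow, lambda v: v % 3 == 0)
remove_if(fast, lambda v: v % 3 == 0)
```

Both leave the same elements in the same order, which matches the "no functionality change" note; only the amount of element shifting differs.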

Remove the bitwise assignment OR operator from the Attributes class. Replace... · 722b26c0

Bill Wendling authored Oct 14, 2012

Remove the bitwise assignment OR operator from the Attributes class. Replace it with the equivalent from the builder class.

llvm-svn: 165895

722b26c0

Remove the bitwise XOR operator from the Attributes class. Replace it with the... · a05b043c

Bill Wendling authored Oct 14, 2012

Remove the bitwise XOR operator from the Attributes class. Replace it with the equivalent from the builder class.

llvm-svn: 165893

a05b043c

Oct 13, 2012

instcombine: Migrate strchr and strrchr optimizations · 17418508

Meador Inge authored Oct 13, 2012

This patch migrates the strchr and strrchr optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

llvm-svn: 165875

17418508

instcombine: Migrate strcat and strncat optimizations · 7fb2f737

Meador Inge authored Oct 13, 2012

This patch migrates the strcat and strncat optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.

llvm-svn: 165874

7fb2f737

Teach SROA to cope with wrapper aggregates. These show up a lot in ABI · ba931992

Chandler Carruth authored Oct 13, 2012

type coercion code, especially when targeting ARM. Things like [1
x i32] instead of i32 are very common there.

The goal of this logic is to ensure that when we are picking an alloca
type, we look through such wrapper aggregates and across any zero-length
aggregate elements to find the simplest type possible to form a type
partition.

This logic should (generally speaking) rarely fire. It only ends up
kicking in when an alloca is accessed using two different types (for
instance, i32 and float), and the underlying alloca type has wrapper
aggregates around it. I noticed a significant amount of this occurring
looking at stepanov_abstraction generated code for arm, and suspect it
happens elsewhere as well.

Note that this doesn't yet address truly heinous IR productions such as
those PR14059 is concerned with. Those result in mismatched *sizes* of types in
addition to mismatched access and alloca types.

llvm-svn: 165870

ba931992
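
The "look through wrapper aggregates" idea in r165870 can be sketched with a toy type model in Python. Types are plain tuples, `size_in_bytes` and `strip_wrappers` are hypothetical helpers, and padding and alignment are ignored, so this is an illustration of the idea rather than the logic SROA uses.

```python
def size_in_bytes(ty) -> int:
    """Size of a toy type; padding and alignment are ignored."""
    kind = ty[0]
    if kind == "int":          # ("int", bit_width)
        return ty[1] // 8
    if kind == "float":        # ("float",)
        return 4
    if kind == "array":        # ("array", count, element_type)
        return ty[1] * size_in_bytes(ty[2])
    if kind == "struct":       # ("struct", (field_type, ...))
        return sum(size_in_bytes(field) for field in ty[1])
    raise ValueError(f"unknown type kind: {kind}")


def strip_wrappers(ty):
    """Peel single-element arrays and structs whose only non-empty
    field is a single member, skipping zero-length elements."""
    while True:
        if ty[0] == "array":
            if ty[1] != 1:
                return ty
            ty = ty[2]
        elif ty[0] == "struct":
            nonempty = [f for f in ty[1] if size_in_bytes(f) > 0]
            if len(nonempty) != 1:
                return ty
            ty = nonempty[0]
        else:
            return ty


i32 = ("int", 32)
wrapped = ("array", 1, i32)                              # [1 x i32]
padded = ("struct", (("array", 0, ("int", 8)), wrapped))  # { [0 x i8], [1 x i32] }
pair = ("struct", (i32, i32))                            # { i32, i32 }
```

With these definitions, `wrapped` and `padded` both reduce to `i32`, while `pair` is left alone because it holds two real elements.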

Speculatively harden the conversion logic. I have no idea if this will · 482c6178

Chandler Carruth authored Oct 13, 2012

help the dragonegg builders, and no test case at this point, but this
was one dimly plausible case I spotted by inspection. Hopefully will get
a testcase from those bots soon-ish, and will tidy this up with proper
testing.

llvm-svn: 165869

482c6178

Silence a warning in -assert builds. · 0fb8a778

Chandler Carruth authored Oct 13, 2012

llvm-svn: 165867

0fb8a778

Clean up how we rewrite loads and stores to the whole alloca. When these · 891fec0b

Chandler Carruth authored Oct 13, 2012

are single value types, the load and store should be directly based upon
the alloca and then bitcasting can fix the type as needed afterward.
This might in theory improve some of the IR coming out of SROA, but
I don't expect big changes yet and don't have any test cases on hand.
This is really just a cleanup/refactoring patch. The next patch will
cause this code path to be hit a lot more, actually get SROA to promote
more allocas and include several more test cases.

llvm-svn: 165864

891fec0b

Oct 11, 2012

Revert 165732 for further review. · 0c61134d

Micah Villmow authored Oct 11, 2012

llvm-svn: 165747

0c61134d

Add in the first iteration of support for llvm/clang/lldb to allow variable... · 08318973

Micah Villmow authored Oct 11, 2012

Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.

llvm-svn: 165726

08318973

· e1032873

Nadav Rotem authored Oct 10, 2012

Add a new interface to allow IR-level passes to access codegen-specific information.

llvm-svn: 165665

e1032873

Oct 10, 2012

Remove the final bits of Attributes being declared in the Attribute · bbcdf4e2

Bill Wendling authored Oct 10, 2012

namespace. Use the attribute's enum value instead. No functionality change
intended.

llvm-svn: 165610

bbcdf4e2

Oct 09, 2012

Update EarlyCSE's SimpleValues to use Hashing.h for their hashes. Expanded the... · 336cb79f

Michael Ilseman authored Oct 09, 2012

Update EarlyCSE's SimpleValues to use Hashing.h for their hashes. Expanded the hashing and equality to allow for equality modulo commutativity for binary ops, and comparisons with swapping of predicates.

llvm-svn: 165509

336cb79f
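
"Equality modulo commutativity" in r165509 can be sketched as key canonicalization in Python. The opcode and predicate tables and the `canonical_key` helper are assumptions made for illustration; the commit itself builds on Hashing.h and is not reproduced here.

```python
COMMUTATIVE = {"add", "mul", "and", "or", "xor"}

# Predicate to use when the two compared operands are swapped.
SWAPPED_PREDICATE = {
    "eq": "eq", "ne": "ne",
    "slt": "sgt", "sgt": "slt",
    "sle": "sge", "sge": "sle",
}


def canonical_key(opcode: str, lhs: str, rhs: str, predicate=None):
    """Build a key that is identical for equivalent expressions.

    Operands are value names; ordering them by name stands in for
    whatever stable operand order a real implementation would pick.
    """
    if opcode in COMMUTATIVE and rhs < lhs:
        lhs, rhs = rhs, lhs
    elif opcode == "icmp" and rhs < lhs:
        lhs, rhs = rhs, lhs
        predicate = SWAPPED_PREDICATE[predicate]
    return (opcode, predicate, lhs, rhs)


# Equal keys hash equally, so a dict keyed this way finds both forms.
available = {canonical_key("add", "a", "b"): "%sum"}
found = available.get(canonical_key("add", "b", "a"))
```

Non-commutative operations such as `sub` keep their operand order, so `a - b` and `b - a` still get distinct keys.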

Use the enum value of the attributes when adding them to the attributes builder. · 93f70b78

Bill Wendling authored Oct 09, 2012

llvm-svn: 165494

93f70b78

Create enums for the different attributes. · c9b22d73

Bill Wendling authored Oct 09, 2012

We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488

c9b22d73

Fix PR14034, an infloop / heap corruption / crash bug in the new SROA. · 503eb2bb

Chandler Carruth authored Oct 09, 2012

Thanks to Benjamin for the raw test case. This one took about 50 times
longer to reduce than to fix. =/

llvm-svn: 165476

503eb2bb

Fix. Apply the no capture attribute to the correct parameter. · f1c60d6d

Bill Wendling authored Oct 09, 2012

llvm-svn: 165469

f1c60d6d

Convert to using the Attributes::Builder class to create attributes. · c1e8e74c

Bill Wendling authored Oct 09, 2012

llvm-svn: 165468

c1e8e74c

· 35315fea

Nadav Rotem authored Oct 08, 2012

Refactor the AddrMode class out of TLI to its own header file.
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.

llvm-svn: 165455

35315fea

Oct 08, 2012

Move TargetData to DataLayout. · cdfe20b9

Micah Villmow authored Oct 08, 2012

llvm-svn: 165402

cdfe20b9

Oct 05, 2012

SROA.cpp: Fix a warning, [-Wunused-variable] · 605fe78a

NAKAMURA Takumi authored Oct 05, 2012

llvm-svn: 165309

605fe78a

Move this test a bit later, after the point at which we know that we either · 933db779

Duncan Sands authored Oct 05, 2012

have an alloca or a parameter, since then the alloca test should make sense
to readers, while before it probably appears too specific.  No functionality
change.

llvm-svn: 165306

933db779

Teach the new SROA a new trick. Now we zap any memcpy or memmoves which · e5b7a2cc

Chandler Carruth authored Oct 05, 2012

are in fact identity operations. We detect these and kill their
partitions so that even splitting is unaffected by them. This is
particularly important because Clang relies on emitting identity memcpy
operations for struct copies, and these fold away to constants very
often after inlining.

Fixes the last big performance FIXME I have on my plate.

llvm-svn: 165285

e5b7a2cc
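
A minimal Python sketch of the identity-transfer check described in r165285 follows. It assumes a transfer can be summarized by a base name and byte offset for each side; `MemTransfer`, `is_identity`, and `drop_identity_transfers` are hypothetical, and the real pass works on IR pointers rather than names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemTransfer:
    """Toy summary of a memcpy or memmove."""
    dst: str
    dst_offset: int
    src: str
    src_offset: int
    length: int


def is_identity(op: MemTransfer) -> bool:
    """A transfer that copies a range onto itself changes nothing."""
    return op.dst == op.src and op.dst_offset == op.src_offset


def drop_identity_transfers(ops):
    """Filter identity transfers out before any partitioning happens,
    so they cannot influence how the alloca is split."""
    return [op for op in ops if not is_identity(op)]


ops = [
    MemTransfer("a", 0, "a", 0, 16),   # identity: a[0:16] -> a[0:16]
    MemTransfer("a", 0, "b", 0, 16),   # real copy from b into a
    MemTransfer("a", 8, "a", 0, 8),    # same base, different offsets
]
remaining = drop_identity_transfers(ops)
```

Only the first transfer is dropped; the third moves data within the same object and therefore still matters.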

Lift the speculation visitor above all the helpers that are targeted at · 90c4a3ae

Chandler Carruth authored Oct 05, 2012

the rewrite visitor to make the fact that the speculation is completely
independent a bit more clear.

I promise that this is just a cut/paste of the one visitor and adding
the anonymous namespace wrappings. The diff may look completely
preposterous; it does in git for some reason.

llvm-svn: 165284

90c4a3ae

Oct 04, 2012

This patch corrects commit 165126 by using an integer bit width instead of · 0d67f510

Preston Gurd authored Oct 04, 2012

a pointer to a type, in order to remove the uses of getGlobalContext().

Patch by Tyler Nowicki.

llvm-svn: 165255

0d67f510

Add a comment to the commit r165187. · e076cac0

Jakub Staszak authored Oct 04, 2012

llvm-svn: 165238

e076cac0

In my recent change to avoid use of underaligned memory I didn't notice that · a6d20010

Duncan Sands authored Oct 04, 2012

cpyDest can be mutated in some cases, which would then cause a crash later if
indeed the memory was underaligned.  This brought down several buildbots, so
I guess the underaligned case is much more common than I thought!

llvm-svn: 165228

a6d20010

Fix PR13969, a mini-phase-ordering issue with the new SROA pass. · ac8317fd

Chandler Carruth authored Oct 04, 2012

Currently, we re-visit allocas when something changes about the way they
might be *split* to allow better scalarization to take place. However,
we weren't handling the case when the *promotion* is what would change
the behavior of SROA. When an address derived from an alloca is stored
into another alloca, we consider the first to have escaped. If the
second is ever promoted to an SSA value, we will suddenly be able to run
the SROA pass on the first alloca.

This patch adds explicit support for this form of iteration. When we
detect a store of a pointer derived from an alloca, we flag the
underlying alloca for reprocessing after promotion. The logic works hard
to only do this when there is definitely going to be promotion and it
might remove impediments to the analysis of the alloca.

Thanks to Nick for the great test case and Benjamin for some sanity
check review.

llvm-svn: 165223

ac8317fd
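
The reprocessing idea in r165223 can be sketched as a small worklist in Python. The model is deliberately crude: an alloca counts as promotable once no other alloca holds its address, and `promote_with_revisit` is a hypothetical name. The real pass is far more careful about when promotion is certain.

```python
from collections import deque


def promote_with_revisit(stored_into: dict) -> list:
    """stored_into maps an alloca name to the set of allocas that
    currently hold a pointer derived from it (its 'escapes').

    Returns the allocas in the order they get promoted.
    """
    pending = {name: set(holders) for name, holders in stored_into.items()}
    worklist = deque(pending)
    promoted = []
    while worklist:
        name = worklist.popleft()
        if name in promoted or pending[name]:
            continue  # already done, or its address still escapes
        promoted.append(name)
        # Promoting `name` removes the stores into it, so anything whose
        # address lived there is flagged for another look.
        for other, holders in pending.items():
            if name in holders:
                holders.discard(name)
                worklist.append(other)
    return promoted


# The address of `a` is stored into `b`; `b` itself is promotable.
order = promote_with_revisit({"a": {"b"}, "b": set()})
```

Without the re-queue step, `a` would be visited once, judged to have escaped, and never reconsidered after `b` goes away.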

The memcpy optimizer was happily doing call slot forwarding when the new memory · c6ada69a

Duncan Sands authored Oct 04, 2012

was less aligned than the old.  In the testcase this results in an overaligned
memset: the memset alignment was correct for the original memory but is too much
for the new memory.  Fix this by either increasing the alignment of the new
memory or bailing out if that isn't possible.  Should fix the gcc-4.7 self-host
buildbot failure.

llvm-svn: 165220

c6ada69a
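
The "raise the alignment or bail out" decision in r165220 can be sketched as a single function in Python. Treating "possible" as "the destination is an alloca whose alignment we control" is an assumption made for this example, and `forwarded_dest_alignment` is a hypothetical name.

```python
def forwarded_dest_alignment(src_align: int, dest_align: int,
                             dest_is_alloca: bool):
    """Return the alignment the destination must have for call slot
    forwarding to be safe, or None if the transform must be skipped."""
    if dest_align >= src_align:
        return dest_align      # already aligned well enough
    if dest_is_alloca:
        return src_align       # we own the memory: raise its alignment
    return None                # cannot change the alignment: bail out


raised = forwarded_dest_alignment(16, 8, dest_is_alloca=True)
skipped = forwarded_dest_alignment(16, 8, dest_is_alloca=False)
unchanged = forwarded_dest_alignment(8, 16, dest_is_alloca=False)
```

The point of the check is that operations written against the original memory may rely on its alignment, so the new memory must be at least as aligned before it can stand in.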

Teach the integer-promotion rewrite strategy to be endianness aware. · 43c8b46d

Chandler Carruth authored Oct 04, 2012

Sorry for this being broken so long. =/

As part of this, switch all of the existing tests to be Little Endian,
which is the behavior I was asserting in them anyways! Add in a new
big-endian test that checks the interesting behavior there.

Another part of this is to tighten the rules about when we perform the
full-integer promotion. This logic now rejects cases where the fully
promoted integer is a non-multiple-of-8 bitwidth or cases where the
loads or stores touch bits which are in the allocated space of the
alloca but are not loaded or stored when accessing the integer. Sadly,
these aren't really observable today as the rest of the pass will
already ensure the invariants hold. However, the latter situation is
likely to become a potential concern in the future.

Thanks to Benjamin and Duncan for early review of this patch. I'm still
looking into whether there are further endianness issues, please let me
know if anyone sees BE failures persisting past this.

llvm-svn: 165219

43c8b46d
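
To illustrate why the rewrite in r165219 has to know the target's endianness, the Python sketch below extracts a narrower integer from a wider one at a given byte offset and checks the result against a byte-level load. `extract_int` and `load_from_memory` are hypothetical helpers, and only whole-byte widths are modeled.

```python
def extract_int(wide: int, wide_bytes: int, byte_offset: int,
                value_bytes: int, big_endian: bool) -> int:
    """Extract value_bytes from a wide_bytes-wide integer, where
    byte_offset counts from the lowest memory address."""
    assert 0 <= byte_offset and byte_offset + value_bytes <= wide_bytes
    if big_endian:
        # Lowest address holds the most significant byte.
        shift = 8 * (wide_bytes - value_bytes - byte_offset)
    else:
        # Lowest address holds the least significant byte.
        shift = 8 * byte_offset
    return (wide >> shift) & ((1 << (8 * value_bytes)) - 1)


def load_from_memory(memory: bytes, byte_offset: int,
                     value_bytes: int, big_endian: bool) -> int:
    """Reference behavior: read the bytes directly from memory."""
    order = "big" if big_endian else "little"
    return int.from_bytes(memory[byte_offset:byte_offset + value_bytes], order)


memory = bytes(range(1, 9))  # 01 02 03 04 05 06 07 08
mismatches = []
for big_endian in (False, True):
    wide = int.from_bytes(memory, "big" if big_endian else "little")
    for size in (1, 2, 4, 8):
        for offset in range(0, 8 - size + 1):
            got = extract_int(wide, 8, offset, size, big_endian)
            want = load_from_memory(memory, offset, size, big_endian)
            if got != want:
                mismatches.append((big_endian, offset, size))
```

Using the little-endian shift on a big-endian target would pick the wrong half of the integer, which is the kind of error a big-endian test is meant to catch.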

Use method to query for attributes. · e8619aa1

Bill Wendling authored Oct 04, 2012

llvm-svn: 165209

e8619aa1

Fix PR13967. · f8a81295

Jakub Staszak authored Oct 03, 2012

llvm-svn: 165187

f8a81295