Commits · cba4c33949d991b50bd83988d1e0bfe1beb12420 · Roger Ferrer / llvm-epi-0.8

Jan 16, 2011

Only put unnamed_addr constants in mergeable sections. Fixes PR8927. · cba4c339
Rafael Espindola authored Jan 16, 2011
```
llvm-svn: 123585
```

Don't merge two constants if we care about the address of both. · 751677a0

Rafael Espindola authored Jan 16, 2011

This fixes the original testcase in PR8927. It also causes a clang
binary built with a patched clang to increase in size by 0.21%.

We can probably get some of the size back by writing a pass that
detects that a global never has its pointer compared and adds
unnamed_addr to it (maybe extend global opt). It is also possible that
there are some other cases clang could add unnamed_addr to.

I will investigate extending globalopt next.

llvm-svn: 123584
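
A minimal C++ sketch of why merging matters when addresses are compared. It is not the PR testcase; the names `kFirst`, `kSecond`, and `addressesDiffer` are made up for illustration. Two distinct constant objects hold the same bytes, and the program observes their addresses, so they cannot safely be folded into one unless the address is known to be insignificant (which is what unnamed_addr marks).

```cpp
#include <cassert>

// Two distinct constant objects with identical contents (hypothetical names).
const char kFirst[] = "same bytes";
const char kSecond[] = "same bytes";

// The program observes the addresses here. Distinct objects must compare
// unequal, so merging the two arrays into one would change the result.
inline bool addressesDiffer() {
  return static_cast<const void *>(kFirst) !=
         static_cast<const void *>(kSecond);
}
```

If nothing in the program ever compared or otherwise depended on these addresses, folding the two arrays into one would be unobservable, which is the case the proposed globalopt extension would try to detect.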

Make a mandatory call to DestroyThread() in ~LinuxThread(). · 811975d5
Stephen Wilson authored Jan 16, 2011
```
llvm-svn: 123583
```
Emit an extension diagnostic for C99 designated initializers that appear in C++ code · c124e59c
Douglas Gregor authored Jan 16, 2011
```
llvm-svn: 123582
```

Tweak the partial ordering rules for function templates to prefer a · cef1a03e

Douglas Gregor authored Jan 16, 2011

non-variadic function template over a variadic one. This matches GCC
and the intent of the C++0x wording, in a way that I think is likely
to be acceptable to the committee.

llvm-svn: 123581
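
A small sketch of the rule described above, using a hypothetical overload set named `pick` (not taken from the commit or its tests). Both templates accept a single argument; partial ordering breaks the tie in favor of the non-variadic one.

```cpp
#include <cassert>

// Non-variadic template: takes exactly one argument.
template <typename T>
int pick(T) { return 1; }

// Variadic template: takes any number of arguments, including one.
template <typename... Ts>
int pick(Ts...) { return 2; }

// Both templates are viable for a one-argument call; partial ordering
// prefers the non-variadic template there.
inline void demoPartialOrdering() {
  assert(pick(0) == 1);     // one argument: non-variadic wins
  assert(pick(0, 1) == 2);  // two arguments: only the variadic is viable
  assert(pick() == 2);      // no arguments: only the variadic is viable
}
```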

Simplify the construction and destruction of Uses. Simplify · bbb91f2b
Jay Foad authored Jan 16, 2011
```
User::dropHungOffUses().

llvm-svn: 123580
```
Reduce and merge testcases. · ec3b10fc
Owen Anderson authored Jan 16, 2011
```
llvm-svn: 123579
```

fix PR8514, a bug where the "heroic" transformation of shift/and · 35a2e65b

Chris Lattner authored Jan 16, 2011

into and/shift would cause nodes to move around and leave a dangling
pointer.  The code tried to avoid this with a HandleSDNode, but got the
details wrong.

llvm-svn: 123578

Remove unnecessary specialization OperandTraits<User>. · 5ded9df8
Jay Foad authored Jan 16, 2011
```
llvm-svn: 123577
```

improve compatibility with GCC: when generating the ".d" filename to use · 906bb904

Chris Lattner authored Jan 16, 2011

and the filename has multiple .'s in it, use the last.  For example, "foo.bar.cpp"
should produce "foo.bar.d" not "foo.d".  Patch by Johan Boule in PR8391

llvm-svn: 123576
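
A rough sketch of the naming rule, not the actual clang driver code. The helper `depFileName` is hypothetical, and appending ".d" when the name has no extension is an assumption made for this example only.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive a dependency-file name by replacing only the
// last extension of the file name component with ".d".
inline std::string depFileName(const std::string &input) {
  // Only consider a '.' that comes after the last '/', so dots in
  // directory names are ignored.
  std::string::size_type slash = input.find_last_of('/');
  std::string::size_type start = (slash == std::string::npos) ? 0 : slash + 1;
  std::string::size_type dot = input.find_last_of('.');
  if (dot == std::string::npos || dot < start)
    return input + ".d";  // assumption: no extension, so just append ".d"
  return input.substr(0, dot) + ".d";
}
```

With this sketch, `depFileName("foo.bar.cpp")` yields `"foo.bar.d"`, matching the example in the commit message.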

Move the implementation of the User class into a new source file, · 59809c7a
Jay Foad authored Jan 16, 2011
```
User.cpp.

llvm-svn: 123575
```
fix PR8932, a case where arg promotion could infinitely promote. · e5f8de86
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123574
```
simplify a little · ed1fb92c
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123573
```
add some commentary · c326ebd1
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123572
```

if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, · 6fab2e94

Chris Lattner authored Jan 16, 2011

then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

Before this change we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.

llvm-svn: 123571

Use an irbuilder to get some trivial constant folding when doing a store · 7cd8cf7d
Chris Lattner authored Jan 16, 2011
```
of a constant.

llvm-svn: 123570
```
remove a dead check, this was needed before we had an explicit veto on uses of phis. · adb1a233
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123569
```

enhance FoldOpIntoPhi in instcombine to try harder when a phi has · d55581de

Chris Lattner authored Jan 16, 2011

multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568

Spill R4 if it's going to be used to restore SP from FP. · 572756ac
Evan Cheng authored Jan 16, 2011
```
llvm-svn: 123567
```
remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the · ea7131a0
Chris Lattner authored Jan 16, 2011
```
first line of the function because it isn't a good idea, even for compares.

llvm-svn: 123566
```
more cleanups: use the IR builder. · ff2e7377
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123565
```
tidy up code. · 25ce2805
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123564
```
Improve the safety of my globalopt enhancement by ensuring that the bitcast · 4e54efd6
Owen Anderson authored Jan 16, 2011
```
of the stored value to the new store type is always valid.  Also, add a testcase.

llvm-svn: 123563
```
fix PR8983, a broken assertion. · 08f43456
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123562
```
Implement AnalyzeBranch in Sparc Backend. · 1b0e2cbf
Venkatraman Govindaraju authored Jan 16, 2011
```
llvm-svn: 123561
```
fix PR8981, a crash trying to form a conditional inc with a floating point compare. · 218092e6
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123560
```
reapply my fix for PR8961 with a tweak to properly handle · 2d186574
Chris Lattner authored Jan 16, 2011
```
multi-instruction sequences like calls.  Many thanks to Jakob for
finding a testcase.

llvm-svn: 123559
```
simplify this code; it is still broken, but I will follow up on llvm-commits. · 8b4952fc
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123558
```
Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1." · 2ff30b84
Michael J. Spencer authored Jan 16, 2011
```
llvm-svn: 123557
```
Simplify a README.txt entry significantly to expose the core issue. · ef28abef
Chandler Carruth authored Jan 16, 2011
```
llvm-svn: 123556
```
one of michael's recent patches broke this, temporarily disable · c703334f
Chris Lattner authored Jan 16, 2011
```
it so the bots go green

llvm-svn: 123555
```
remove the partial specialization pass. It is unmaintained and has bugs. · 1e209b87
Chris Lattner authored Jan 16, 2011
```
llvm-svn: 123554
```

Jan 15, 2011
- AST/InheritViz: Remove all internal uses of PathV1. · a0acb46b
  Michael J. Spencer authored Jan 15, 2011
```
llvm-svn: 123553
```
- Archive: Fix spelling. · 53dcdc74
  Michael J. Spencer authored Jan 15, 2011
```
llvm-svn: 123552
```
- Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1. · a0ce7632
  Michael J. Spencer authored Jan 15, 2011
```
llvm-svn: 123551
```
- Support/GraphWriter: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1. · 8685f387
  Michael J. Spencer authored Jan 15, 2011
```
llvm-svn: 123550
```
- Add an assert so we don't silently miscompile ctpop for bit widths > 128. · bec03ea7
  Benjamin Kramer authored Jan 15, 2011
```
llvm-svn: 123549
```
- Support/PathV2: Add identify_magic. · 94b2ab35
  Michael J. Spencer authored Jan 15, 2011
```
llvm-svn: 123548
```
- Reimplement CTPOP legalization with the "best" algorithm from · fff2517e
  Benjamin Kramer authored Jan 15, 2011
```
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old
code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter,
especially when counting 64 bit population on a 32 bit target.

I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.

llvm-svn: 123547
```
- Unittests/Support/Path: Tweak test. · b587180f
  Michael J. Spencer authored Jan 15, 2011
```
llvm-svn: 123546
```
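
For reference, the parallel bit-count scheme named in the CTPOP legalization commit above (fff2517e) can be sketched in plain C++ for a 32-bit value. This illustrates the arithmetic only, not the SelectionDAG code from the commit; the function names are hypothetical, and the Kernighan-style loop is included for comparison.

```cpp
#include <cassert>
#include <cstdint>

// Parallel bit count for a 32-bit value (the "CountBitsSetParallel" scheme).
inline unsigned popcountParallel(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                  // 2-bit partial sums
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  // 4-bit partial sums
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  // 8-bit partial sums
  return (v * 0x01010101u) >> 24;                    // add the four bytes
}

// Kernighan-style loop: one iteration per set bit.
inline unsigned popcountKernighan(std::uint32_t v) {
  unsigned count = 0;
  for (; v != 0; ++count)
    v &= v - 1;  // clear the lowest set bit
  return count;
}
```

The parallel version does a fixed amount of work regardless of the input, while the loop's cost grows with the number of set bits, which is why sparse inputs are the interesting comparison.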