- Jan 16, 2011
-
-
Chris Lattner authored
then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

    define i64 @test2(i64 %X) {
      br label %L2

    L2:                                               ; preds = %0
      ret i64 %X
    }

before we generated:

    define i64 @test2(i64 %X) {
      %sroa.store.elt = lshr i64 %X, 56
      %1 = trunc i64 %sroa.store.elt to i8
      %sroa.store.elt8 = lshr i64 %X, 48
      %2 = trunc i64 %sroa.store.elt8 to i8
      %sroa.store.elt9 = lshr i64 %X, 40
      %3 = trunc i64 %sroa.store.elt9 to i8
      %sroa.store.elt10 = lshr i64 %X, 32
      %4 = trunc i64 %sroa.store.elt10 to i8
      %sroa.store.elt11 = lshr i64 %X, 24
      %5 = trunc i64 %sroa.store.elt11 to i8
      %sroa.store.elt12 = lshr i64 %X, 16
      %6 = trunc i64 %sroa.store.elt12 to i8
      %sroa.store.elt13 = lshr i64 %X, 8
      %7 = trunc i64 %sroa.store.elt13 to i8
      %8 = trunc i64 %X to i8
      br label %L2

    L2:                                               ; preds = %0
      %9 = zext i8 %1 to i64
      %10 = shl i64 %9, 56
      %11 = zext i8 %2 to i64
      %12 = shl i64 %11, 48
      %13 = or i64 %12, %10
      %14 = zext i8 %3 to i64
      %15 = shl i64 %14, 40
      %16 = or i64 %15, %13
      %17 = zext i8 %4 to i64
      %18 = shl i64 %17, 32
      %19 = or i64 %18, %16
      %20 = zext i8 %5 to i64
      %21 = shl i64 %20, 24
      %22 = or i64 %21, %19
      %23 = zext i8 %6 to i64
      %24 = shl i64 %23, 16
      %25 = or i64 %24, %22
      %26 = zext i8 %7 to i64
      %27 = shl i64 %26, 8
      %28 = or i64 %27, %25
      %29 = zext i8 %8 to i64
      %30 = or i64 %29, %28
      ret i64 %30
    }

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place.

llvm-svn: 123571
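For readers less used to IR, the "before" sequence in this commit message amounts to splitting a 64-bit value into eight bytes and then gluing them back together. The following is only a rough C++ sketch of that round trip, not code from the commit; `splitBytes` and `joinBytes` are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: mirrors the lshr/trunc pairs in the "before" IR.
// Byte 0 holds the top 8 bits (shift by 56), byte 7 the low 8 bits.
std::array<std::uint8_t, 8> splitBytes(std::uint64_t x) {
  std::array<std::uint8_t, 8> bytes{};
  for (int i = 0; i < 8; ++i)
    bytes[i] = static_cast<std::uint8_t>(x >> (56 - 8 * i));
  return bytes;
}

// Hypothetical helper: mirrors the zext/shl/or chain in the "before" IR.
std::uint64_t joinBytes(const std::array<std::uint8_t, 8> &bytes) {
  std::uint64_t x = 0;
  for (int i = 0; i < 8; ++i)
    x |= static_cast<std::uint64_t>(bytes[i]) << (56 - 8 * i);
  return x;
}
```

Since `joinBytes(splitBytes(x))` returns `x` for any input, the whole sequence is a no-op, which is why the optimized testcase reduces to `ret i64 %X`.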
-
Chris Lattner authored
of a constant. llvm-svn: 123570
-
Chris Lattner authored
llvm-svn: 123569
-
Chris Lattner authored
multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568
-
Evan Cheng authored
llvm-svn: 123567
-
Chris Lattner authored
first line of the function because it isn't a good idea, even for compares. llvm-svn: 123566
-
Chris Lattner authored
llvm-svn: 123565
-
Chris Lattner authored
llvm-svn: 123564
-
Owen Anderson authored
of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563
-
Chris Lattner authored
llvm-svn: 123562
-
Venkatraman Govindaraju authored
llvm-svn: 123561
-
Chris Lattner authored
llvm-svn: 123560
-
Chris Lattner authored
multi-instruction sequences like calls. Many thanks to Jakob for finding a testcase. llvm-svn: 123559
-
Chris Lattner authored
llvm-svn: 123558
-
Michael J. Spencer authored
Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1." llvm-svn: 123557
-
Chandler Carruth authored
llvm-svn: 123556
-
Chris Lattner authored
it so the bots go green llvm-svn: 123555
-
Chris Lattner authored
llvm-svn: 123554
-
- Jan 15, 2011
-
-
Michael J. Spencer authored
llvm-svn: 123553
-
Michael J. Spencer authored
llvm-svn: 123552
-
Michael J. Spencer authored
llvm-svn: 123551
-
Michael J. Spencer authored
Support/GraphWriter: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1. llvm-svn: 123550
-
Benjamin Kramer authored
llvm-svn: 123549
-
Michael J. Spencer authored
llvm-svn: 123548
-
Benjamin Kramer authored
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter, especially when counting 64 bit population on a 32 bit target. I hope this is fast enough to replace Kernighan-style counting loops even when the input is rather sparse. llvm-svn: 123547
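The linked bithacks page describes a parallel (SWAR) bit count. Below is a minimal C++ sketch of that technique next to a Kernighan-style loop for comparison. The function names are hypothetical and this is not the code from the commit, only an illustration of the two approaches the message compares.

```cpp
#include <cstdint>

// Parallel bit count for 32 bits: sum bits in 2-bit, then 4-bit, then
// 8-bit fields, and finally add the four byte sums with a multiply.
inline unsigned popcountParallel32(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;
  return (v * 0x01010101u) >> 24;
}

// The same idea widened to 64 bits.
inline unsigned popcountParallel64(std::uint64_t v) {
  v = v - ((v >> 1) & 0x5555555555555555ULL);
  v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
  v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return static_cast<unsigned>((v * 0x0101010101010101ULL) >> 56);
}

// Kernighan-style loop: clears the lowest set bit each iteration, so it
// runs once per set bit rather than in a fixed number of steps.
inline unsigned popcountKernighan(std::uint64_t v) {
  unsigned count = 0;
  for (; v != 0; v &= v - 1)
    ++count;
  return count;
}
```

The parallel versions take the same number of operations regardless of input, whereas the loop is cheapest when few bits are set; that trade-off is what the remark about sparse inputs refers to.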
-
Michael J. Spencer authored
llvm-svn: 123546
-
Michael J. Spencer authored
llvm-svn: 123545
-
Michael J. Spencer authored
llvm-svn: 123544
-
Nick Lewycky authored
llvm-svn: 123543
-
Ken Dyck authored
Add toCharUnitsInBits() to simplify the many calls to CharUnits::fromQuantity() of the form CharUnits::fromQuantity(bitSize, Context.getCharWidth()). llvm-svn: 123542
-
Nick Lewycky authored
opportunities. Fixes PR8978. llvm-svn: 123541
-
Oscar Fuentes authored
Patch by arrowdodger! llvm-svn: 123539
-
Francois Pichet authored
llvm-svn: 123538
-
Benjamin Kramer authored
llvm-svn: 123537
-
Nick Lewycky authored
Also, replace tabs with spaces. Yes, it's 2011. llvm-svn: 123535
-
Nick Lewycky authored
half a million non-local queries, each of which would otherwise have triggered a linear scan over a basic block. Also fix a fixme for memory intrinsics which dereference pointers. With this, we prove that a pointer is non-null because it was dereferenced by an intrinsic 112 times in llvm-test. llvm-svn: 123533
-
Rafael Espindola authored
llvm-svn: 123531
-
Rafael Espindola authored
llvm-svn: 123530
-
Rafael Espindola authored
llvm-svn: 123529
-
Chris Lattner authored
realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527
-