- Dec 13, 2010
-
Chris Lattner authored
when the wider type is legal.  This allows us to compile:

define zeroext i16 @test1(i16 zeroext %x) nounwind {
entry:
  %div = udiv i16 %x, 33
  ret i16 %div
}

into:

test1:                                  # @test1
        movzwl  4(%esp), %eax
        imull   $63551, %eax, %eax      # imm = 0xF83F
        shrl    $21, %eax
        ret

instead of:

test1:                                  # @test1
        movw    $-1985, %ax             # imm = 0xFFFFFFFFFFFFF83F
        mulw    4(%esp)
        andl    $65504, %edx            # imm = 0xFFE0
        movl    %edx, %eax
        shrl    $5, %eax
        ret

Implementing rdar://8760399 and example #4 from:
http://blog.regehr.org/archives/320

We should implement the same thing for [su]mul_hilo, but I don't have immediate plans to do this.

llvm-svn: 121696
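As a quick standalone check of the sequence above (an editor's C++ sketch, not part of the commit), the following exhaustively verifies that (x * 63551) >> 21, computed in 32 bits, matches x / 33 for every 16-bit input:

#include <cassert>
#include <cstdint>

int main() {
  // Mirrors the imull $63551 / shrl $21 pair from the generated code above.
  for (uint32_t x = 0; x <= 0xFFFF; ++x)
    assert(((x * 63551u) >> 21) == x / 33);
  return 0;
}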
-
Chris Lattner authored
'and' case. llvm-svn: 121695
-
Chandler Carruth authored
llvm-svn: 121694
-
Chris Lattner authored
I can track down a miscompile. This should bring the buildbots back to life. llvm-svn: 121693
-
Chandler Carruth authored
would return true if the initializer pointer union had *any* non-null pointer in it, even if the pointer wasn't one that would actually be returned via getInit(). This makes it more accurately model the logic of 'getInit() != NULL'. This still isn't completely satisfying. From a principled stance, I suspect we should make hasInit() and getInit() *always* return false and NULL (resp.) for ParmVarDecl. We shouldn't at the API level treat initializers and default arguments as the same thing. llvm-svn: 121692
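A minimal mock of the semantic point (an illustrative sketch only, not Clang's actual VarDecl or its pointer union; MockVarDecl, InitExpr, and OtherPayload are hypothetical names):

struct InitExpr {};
struct OtherPayload {};  // stands in for a union member that getInit() would not return

struct MockVarDecl {
  // Mimics a two-member pointer union; at most one is non-null.
  InitExpr *Init = nullptr;
  OtherPayload *Other = nullptr;

  InitExpr *getInit() const { return Init; }

  // Old behavior: any non-null member counted as "has an initializer",
  // even though getInit() could still return null.
  bool hasInitOld() const { return Init != nullptr || Other != nullptr; }

  // Fixed behavior: model exactly 'getInit() != NULL'.
  bool hasInit() const { return getInit() != nullptr; }
};

With only Other set, hasInitOld() answers true while getInit() is null; the fixed hasInit() always agrees with getInit().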
-
Chris Lattner authored
for each constant pool entry. Using WriteTypeSymbolic here takes time proportional to the size of the module, for each constant pool entry. This speeds up -verbose-asm llc on 252.eon (a random testcase at my disposal) from 4.4s to 2.137s. llc takes 2.11s with asm-verbose off, so this is now a pretty reasonable cost for verbose comments. llvm-svn: 121691
-
Chris Lattner authored
when simplifying, allowing them to be eagerly turned into switches.  This is the last step required to get "Example 7" from this blog post:
http://blog.regehr.org/archives/320

On X86, we now generate this machine code, which (to my eye) seems better than the ICC generated code:

_crud:                                  ## @crud
## BB#0:                                ## %entry
        cmpb    $33, %dil
        jb      LBB0_4
## BB#1:                                ## %switch.early.test
        addb    $-34, %dil
        cmpb    $58, %dil
        ja      LBB0_3
## BB#2:                                ## %switch.early.test
        movzbl  %dil, %eax
        movabsq $288230376537592865, %rcx ## imm = 0x400000017001421
        btq     %rax, %rcx
        jb      LBB0_4
LBB0_3:                                 ## %lor.rhs
        xorl    %eax, %eax
        ret
LBB0_4:                                 ## %lor.end
        movl    $1, %eax
        ret

llvm-svn: 121690
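The interesting piece of the code above is the range-check-plus-bit-test lowering: bias the character by 34 and test one bit of the 64-bit constant 0x400000017001421. Here is a C++ sketch of the same trick (illustrative only; inCharSet is a hypothetical name, not the SimplifyCFG or switch-lowering code):

#include <cassert>
#include <cstdint>

// True for c <= 32 and for the characters 34, 39, 44, 46, 58, 59, 60, 62
// and 92 -- the set tested by the machine code above.
static bool inCharSet(unsigned char c) {
  if (c <= 32)
    return true;                                // cmpb $33, %dil ; jb
  unsigned idx = c - 34u;                       // addb $-34, %dil
  if (idx > 58)
    return false;                               // cmpb $58, %dil ; ja
  const uint64_t Mask = 0x400000017001421ULL;   // one bit per character in the set
  return (Mask >> idx) & 1;                     // btq %rax, %rcx
}

int main() {
  for (int c = 0; c < 256; ++c) {
    bool expected = c <= 32 || c == 46 || c == 44 || c == 58 || c == 59 ||
                    c == 60 || c == 62 || c == 34 || c == 92 || c == 39;
    assert(inCharSet((unsigned char)c) == expected);
  }
  return 0;
}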
-
Chris Lattner authored
llvm-svn: 121689
-
Chris Lattner authored
per terminator kind. llvm-svn: 121688
-
Chris Lattner authored
llvm-svn: 121687
-
Chris Lattner authored
doing a cfg search for every block simplified. llvm-svn: 121686
-
Chris Lattner authored
llvm-svn: 121685
-
Chris Lattner authored
llvm-svn: 121684
-
Chris Lattner authored
getSinglePredecessor to simplify code. llvm-svn: 121683
-
Chris Lattner authored
llvm-svn: 121682
-
Chris Lattner authored
llvm-svn: 121681
-
Chris Lattner authored
'or sequence' that it doesn't understand.  This allows us to optimize something insane like this:

int crud (unsigned char c, unsigned x)
{
  if(((((((((( (int) c <= 32 || (int) c == 46) || (int) c == 44)
      || (int) c == 58) || (int) c == 59) || (int) c == 60)
      || (int) c == 62) || (int) c == 34) || (int) c == 92)
      || (int) c == 39) != 0)
    foo();
}

into:

define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone {
entry:
  %cmp = icmp ult i8 %c, 33
  br i1 %cmp, label %if.then, label %switch.early.test

switch.early.test:                                ; preds = %entry
  switch i8 %c, label %if.end [
    i8 39, label %if.then
    i8 44, label %if.then
    i8 58, label %if.then
    i8 59, label %if.then
    i8 60, label %if.then
    i8 62, label %if.then
    i8 46, label %if.then
    i8 92, label %if.then
    i8 34, label %if.then
  ]

by pulling the < comparison out ahead of the newly formed switch.

llvm-svn: 121680
-
Chris Lattner authored
llvm-svn: 121679
-
Chris Lattner authored
llvm-svn: 121678
-
Evan Cheng authored
llvm-svn: 121677
-
Chris Lattner authored
llvm-svn: 121676
-
Chris Lattner authored
llvm-svn: 121675
-
Chris Lattner authored
bootstrap buildbot tripped over. llvm-svn: 121674
-
Chris Lattner authored
llvm-svn: 121673
-
Chris Lattner authored
llvm-svn: 121672
-
Chris Lattner authored
or'd conditions.  Previously we'd compile something like this:

int crud (unsigned char c) {
  return c == 62 || c == 34 || c == 92;
}

into:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  %cmp8 = icmp eq i8 %c, 92
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which failed to merge the compare-with-92 into the switch.  With this patch we simplify this all the way to:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
    i8 92, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which is much better for codegen's switch lowering stuff.  This kicks in 33 times on 176.gcc (for example) cutting 103 instructions off the generated code.

llvm-svn: 121671
-
Chris Lattner authored
llvm-svn: 121670
-
Chris Lattner authored
llvm-svn: 121669
-
Chris Lattner authored
location in simplifycfg. In the old days, SimplifyCFG was never run on the entry block, so we had to scan over all preds of the BB passed into simplifycfg to do this xform; now we can just check blocks ending with a condbranch. This avoids a scan over all preds of every simplified block, which should be a significant compile-time perf win on functions with lots of edges. No functionality change. llvm-svn: 121668
-
Chris Lattner authored
llvm-svn: 121667
-
Bill Wendling authored
class A<bit a, bits<3> x, bits<3> y> {
  bits<3> z;
  let z = !if(a, x, y);
}

The variable z will get the value of x when 'a' is 1 and the value of y when 'a' is 0.

llvm-svn: 121666
-
Chandler Carruth authored
cases. First, omit all builtin overloads when no non-record type is in the set of candidate types. Second, avoid arithmetic type overloads for non-arithmetic or enumeral types (counting vector types as arithmetic due to Clang extensions). When heavily using constructs such as STL's '<<' based stream logging, this can have a significant impact. One logging-heavy test case's compile time dropped by 10% with this. Self-host shows 1-2% improvement in compile time, but that's likely in the noise. llvm-svn: 121665
-
Chris Lattner authored
llvm-svn: 121664
-
Sean Callanan authored
very minor changes, changing how we get the target type from a TypedefType, adding a parameter to EnumDecl::Create(), and other minor tweaks. llvm-svn: 121663
-
Chris Lattner authored
llvm-svn: 121662
-
Bill Wendling authored
llvm-svn: 121661
-
Bill Wendling authored
llvm-svn: 121660
-
Chris Lattner authored
llvm-svn: 121659
-
Chris Lattner authored
llvm-svn: 121658
-
Chris Lattner authored
llvm-svn: 121657
-