  1. Jan 04, 2010
  2. Jan 03, 2010
    • Chris Lattner's avatar
      fix PR5930, allowing the asmprinter to emit difference between · 1dae8766
      Chris Lattner authored
      two labels as a truncate.
      
      llvm-svn: 92455
      1dae8766
    • Chris Lattner's avatar
      add PR# · f6a585fc
      Chris Lattner authored
      llvm-svn: 92451
      f6a585fc
    • Chris Lattner's avatar
      differences between two blockaddresses don't cause a · a7cfc43a
      Chris Lattner authored
      global variable initializer to require relocations.
      
      llvm-svn: 92450
      a7cfc43a
    • Chris Lattner's avatar
      generalize the previous transformation to handle indexing into · fca0c8f9
      Chris Lattner authored
      arrays of structs and other arrays, so long as all the subsequent
      indexes are constants.  This triggers frequently for stuff like:
      
      @divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50]
      
      	  %623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1]
      	   %684 = icmp eq i32 %683, 999 
      
      also for the "my_defs" table in 'gs', etc.
      
      llvm-svn: 92444
      fca0c8f9
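A minimal C sketch of the shape this commit describes: a variable index into the outer array followed by a constant index into the inner one. The `pairs` table and all function names are hypothetical and not taken from any benchmark; the folded form is written by hand to show what a fold of this kind could produce, not what instcombine is guaranteed to emit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant table of two-element rows (an assumption for illustration). */
static const int pairs[6][2] = {
    {0, 1}, {0, 2}, {0, 1}, {1, 0}, {1, 2}, {1, 1},
};

/* Source form: variable row index, constant trailing index 0. */
bool column0_is_zero_load(size_t i) { return pairs[i][0] == 0; }

/* Hand-folded form: column 0 is zero exactly for rows 0..2. */
bool column0_is_zero_folded(size_t i) { return i < 3; }

/* Compare both forms over every valid row index. */
void check_pairs(void) {
    for (size_t i = 0; i < 6; ++i)
        assert(column0_is_zero_load(i) == column0_is_zero_folded(i));
}
```

Because the trailing index is a constant, only one column of the table matters, so the comparison depends on the row index alone.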
  3. Jan 02, 2010
    • Chris Lattner's avatar
      teach instcombine to optimize idioms like A[i]&42 == 0. This · 98ad2b56
      Chris Lattner authored
      occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
      is copied in multiple apps) in _sch_istable, etc.
      
      llvm-svn: 92427
      98ad2b56
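A small C sketch of the A[i]&42 == 0 idiom. The `flags` table and the function names are hypothetical; the folded form uses the bit-test idiom from the earlier table-lookup commits, which is one plausible result and an assumption here, since the commit message does not say which form is generated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constant table (an assumption for illustration). */
static const unsigned flags[8] = {0, 2, 8, 40, 1, 4, 16, 42};

/* Source form: load, mask with 42, compare against zero. */
bool masked_is_zero_load(unsigned i) { return (flags[i] & 42u) == 0; }

/* Hand-folded form: the predicate holds at indexes 0, 4, 5 and 6,
   so bit i of 0x71 (binary 0111 0001) records the answer for index i. */
bool masked_is_zero_folded(unsigned i) { return ((0x71u >> i) & 1u) != 0; }

/* Compare both forms over every valid index. */
void check_flags(void) {
    for (unsigned i = 0; i < 8; ++i)
        assert(masked_is_zero_load(i) == masked_is_zero_folded(i));
}
```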
    • Chris Lattner's avatar
      Teach the table lookup optimization to generate range compares · b56bef45
      Chris Lattner authored
      when a consecutive sequence of elements all satisfies the
      predicate.  Like the double compare case, this generates better
      code than the magic constant case and generalizes to more than
      32/64 element array lookups.
      
      Here are some examples where it triggers.  From 403.gcc, most
      accesses to the rtx_class array are handled, e.g.:
      
      @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
         %142 = icmp eq i8 %141, 105
      @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
      	   %165 = icmp eq i8 %164, 60      
      
      Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) 
      optimized before are actually range compares.  This lets 32-bit
      machines optimize them.
      
      400.perlbmk has stuff like this:
      
      400.perlbmk: PL_regkind, even for 32-bit:
      @PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
      	   %811 = icmp ne i8 %810, 33 
      
      @PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
      	   %12 = icmp ult i8 %10, 2
                 
      etc.
      
      llvm-svn: 92426
      b56bef45
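A C sketch of the range-compare case. The `kind` table is a hypothetical eight-element stand-in (the real tables above are much longer), and the function names are assumptions. The folded form uses the usual single unsigned comparison for a contiguous index range.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical table where the matching elements are consecutive. */
static const char kind[8] = {'x', 'x', 'm', 'm', 'm', 'm', 'x', 'x'};

/* Source form: load from the table and compare with a constant. */
bool is_m_load(unsigned i) { return kind[i] == 'm'; }

/* Hand-folded form: indexes 2..5 match. Subtracting the lower bound
   wraps around for i < 2, so one unsigned compare covers both ends. */
bool is_m_range(unsigned i) { return i - 2u < 4u; }

/* Compare both forms over every valid index. */
void check_kind(void) {
    for (unsigned i = 0; i < 8; ++i)
        assert(is_m_load(i) == is_m_range(i));
}
```

Unlike the magic-constant form, this does not need one bit per element, which is why it can apply to tables longer than 32 or 64 entries.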
    • Nick Lewycky's avatar
      a67519be
    • Nick Lewycky's avatar
      Optimize pointer comparison into the typesafe form, now that the backends will · 357d41b3
      Nick Lewycky authored
      handle them efficiently. This is the opposite direction of the transformation
      we used to have here.
      
      llvm-svn: 92418
      357d41b3
    • Chris Lattner's avatar
      Generalize the previous xform to handle cases where exactly · cfda435c
      Chris Lattner authored
      two elements match or don't match with two comparisons.  For
      example, the testcase compiles into:
      
      define i1 @test5(i32 %X) {
        %1 = icmp eq i32 %X, 2                          ; <i1> [#uses=1]
        %2 = icmp eq i32 %X, 7                          ; <i1> [#uses=1]
        %R = or i1 %1, %2                               ; <i1> [#uses=1]
        ret i1 %R
      }
      
      This generalizes the previous xforms when the array is larger than
      64 elements (and this case matches) and generates better code for
      cases where it overlaps with the magic bitshift case.
      
      This generalizes more cases than you might expect.  For example,
      400.perlbmk has:
      
      @PL_utf8skip = constant [256 x i8] c"\01\01\01\...
      %15 = icmp ult i8 %7, 7
      
      403.gcc has:
      @rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
      %18 = icmp eq i16 %16, 295 
      
      and xalancbmk has a bunch of examples, such as 
      _ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.
      
      llvm-svn: 92417
      cfda435c
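A C sketch of a lookup that would reduce to the two comparisons shown for test5. The actual table behind that testcase is not given in the message, so `table` below is a hypothetical one chosen so that exactly indexes 2 and 7 match; the function names are assumptions as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical table: only elements 2 and 7 are zero. */
static const int table[10] = {1, 1, 0, 1, 1, 1, 1, 0, 1, 1};

/* Source form: load and compare. */
bool is_zero_load(unsigned x) { return table[x] == 0; }

/* Hand-folded form, mirroring the IR above: two index compares or'd together. */
bool is_zero_two_compares(unsigned x) { return (x == 2) | (x == 7); }

/* Compare both forms over every valid index. */
void check_two_compares(void) {
    for (unsigned x = 0; x < 10; ++x)
        assert(is_zero_load(x) == is_zero_two_compares(x));
}
```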
    • Chris Lattner's avatar
      enhance the compare/load/index optimization to work on *any* load · 935a4a60
      Chris Lattner authored
      from a global with 32/64 elements or less (depending on whether
      i64 is native on the target), generating a bitshift idiom to 
      determine the result.  For example, on test4 we produce:
      
      define i1 @test4(i32 %X) {
        %1 = lshr i32 933, %X                           ; <i32> [#uses=1]
        %2 = and i32 %1, 1                              ; <i32> [#uses=1]
        %R = icmp ne i32 %2, 0                          ; <i1> [#uses=1]
        ret i1 %R
      }
      
      This triggers in a number of interesting cases, for example, here's an
      fp case:
      @A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]*> [#uses=7]
      ...
      	   %7 = fcmp olt double %3, 0.000000e+00
      
      In this case we make the slen2_tab global dead, which is nice:
      @slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]*> [#uses=1]
      ...
      	   %204 = icmp eq i32 %46, 0     
      
      Perl has a bunch of these, also on the 'Perl_regkind' array:
      @Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]*> [#uses=1]
      ...
        %1364 = icmp eq i16 %1361, 0
      
      186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this:
      @white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]*> [#uses=2]
      
      However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc.
      
      go 64-bit machines :)
      
      llvm-svn: 92415
      935a4a60
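A C sketch of how a magic constant such as the 933 in test4 can arise. The table behind test4 is not shown in the message, so `table` here is a hypothetical one whose predicate happens to hold at indexes 0, 2, 5, 7, 8 and 9; all names are assumptions. Bit i of the constant records whether the comparison is true for element i.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { TABLE_LEN = 10 };

/* Hypothetical table (an assumption; test4's real table is not shown). */
static const int table[TABLE_LEN] = {1, 0, 1, 0, 0, 1, 0, 1, 1, 1};

/* Source form: load and compare. */
bool nonzero_load(uint32_t x) { return table[x] != 0; }

/* Evaluate the predicate once per element and pack the results into bits. */
uint32_t build_magic(void) {
    uint32_t magic = 0;
    for (uint32_t i = 0; i < TABLE_LEN; ++i)
        if (table[i] != 0)
            magic |= UINT32_C(1) << i;
    return magic;
}

/* Bit-test form, mirroring the lshr/and/icmp sequence above. */
bool nonzero_bittest(uint32_t x) { return ((UINT32_C(933) >> x) & 1u) != 0; }

/* Check the constant and compare both forms over every valid index. */
void check_magic(void) {
    assert(build_magic() == 933);
    for (uint32_t x = 0; x < TABLE_LEN; ++x)
        assert(nonzero_load(x) == nonzero_bittest(x));
}
```

The one-bit-per-element encoding is what limits this form to 32 or 64 elements, depending on the widest native integer.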
    • Chris Lattner's avatar
      enhance the previous optimization to work with fcmp in addition · b1567bd5
      Chris Lattner authored
      to icmp.
      
      llvm-svn: 92412
      b1567bd5
    • Chris Lattner's avatar
      Teach instcombine to fold compares of loads from constant · a061859c
      Chris Lattner authored
      arrays with variable indices into a comparison of the index
      with a constant.  The most common occurrence of this that
      I see by far is stuff like:
      
      if ("foobar"[i] == '\0') ...
      
      which we compile into: if (i == 6), saving a load and 
      materialization of the global address.  This also exposes 
      loop trip count information to later passes in many cases.
      
      This triggers hundreds of times in xalancbmk, which is where I first
      noticed it, but it also triggers in many other apps.  Here are a few 
      interesting ones from various apps:
      
      @must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
        %scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
        %17 = load ...
        %18 = icmp eq i8* %17, null                     ; <i1> [#uses=1]
      -> icmp eq i64 %indvar.i, 7 
      
      
      @yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
        %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
         %mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
      load ...
         %64 = icmp eq i8 %58, 4                         ; <i1> [#uses=1]
      -> icmp eq i64 %.pn, 35             ; <i1> [#uses=0]
      
      
      @gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
      %scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
      %425 = load %scevgep.i
      %426 = icmp eq i16 %425, -32768                 ; <i1> [#uses=0]
      -> false
      
      llvm-svn: 92411
      a061859c
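The "foobar" case can be checked directly in C. The function names below are hypothetical; the folded form is the `i == 6` comparison stated in the message, and the check only covers valid indexes 0..6 (the six characters plus the terminator).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Source form: load a byte from the constant string and compare with NUL. */
bool at_terminator_load(size_t i) { return "foobar"[i] == '\0'; }

/* Folded form from the message: only index 6 holds the terminator. */
bool at_terminator_folded(size_t i) { return i == 6; }

/* Compare both forms over every valid index of the 7-byte array. */
void check_foobar(void) {
    for (size_t i = 0; i <= 6; ++i)
        assert(at_terminator_load(i) == at_terminator_folded(i));
}
```

The folded form needs neither the load nor the address of the string, and a loop that exits on this condition now has a visible bound on `i`.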
    • Chris Lattner's avatar
      remove the instcombine transformations that are inserting nasty · 2e4be2c3
      Chris Lattner authored
      pointer to int casts that confuse later optimizations.  See PR3351
      for details.
      
      This improves but doesn't completely fix 483.xalancbmk because llvm-gcc
      does this xform in GCC's "fold" routine as well.  Clang++ will do
      better I guess.
      
      llvm-svn: 92408
      2e4be2c3
    • Chris Lattner's avatar
      allow this to work on linux hosts. · 909c71c9
      Chris Lattner authored
      llvm-svn: 92407
      909c71c9
    • Chris Lattner's avatar
      Teach codegen to handle: · 1eea3b0a
      Chris Lattner authored
       (X != null) | (Y != null) --> (X|Y) != 0
       (X == null) & (Y == null) --> (X|Y) == 0
      
      so that instcombine can stop doing this for pointers.  This is part of PR3351,
      which is a case where instcombine doing this for pointers (inserting ptrtoint)
      is pessimizing code.
      
      llvm-svn: 92406
      1eea3b0a
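A C sketch of the two identities. It assumes a null pointer converts to the integer 0 through `uintptr_t`, which holds on mainstream targets but is implementation-defined in C; the function names are hypothetical. The `fused` versions show the single-compare result, the other versions show the source form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* (X != null) | (Y != null) */
bool either_nonnull(const void *x, const void *y) {
    return (x != NULL) | (y != NULL);
}

/* (X|Y) != 0, assuming null converts to integer 0 */
bool either_nonnull_fused(const void *x, const void *y) {
    return ((uintptr_t)x | (uintptr_t)y) != 0;
}

/* (X == null) & (Y == null) */
bool both_null(const void *x, const void *y) {
    return (x == NULL) & (y == NULL);
}

/* (X|Y) == 0, same assumption */
bool both_null_fused(const void *x, const void *y) {
    return ((uintptr_t)x | (uintptr_t)y) == 0;
}

/* Compare source and fused forms over all null/non-null combinations. */
void check_null_folds(void) {
    int a = 0, b = 0;
    const void *ptrs[3] = {NULL, &a, &b};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            assert(either_nonnull(ptrs[i], ptrs[j]) ==
                   either_nonnull_fused(ptrs[i], ptrs[j]));
            assert(both_null(ptrs[i], ptrs[j]) ==
                   both_null_fused(ptrs[i], ptrs[j]));
        }
    }
}
```

Doing this in codegen rather than instcombine keeps the pointer-to-integer casts out of the IR, which is the pessimization described above.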
    • Chris Lattner's avatar
      rename file. · 6eef072e
      Chris Lattner authored
      llvm-svn: 92405
      6eef072e
    • Chris Lattner's avatar
      add a simple instcombine xform, simplify another one to use hasAllZeroIndices() · faf1337a
      Chris Lattner authored
      instead of hand rolling a loop.
      
      llvm-svn: 92403
      faf1337a
  4. Jan 01, 2010
  5. Dec 31, 2009
  6. Dec 30, 2009
  7. Dec 29, 2009