  1. Jan 05, 2010
  2. Jan 04, 2010
  3. Jan 03, 2010
    • pull my debug hooks out, I'm done with this xform for now. · 48218e42
      Chris Lattner authored
      llvm-svn: 92446
      48218e42
    • Small cleanups, refactor some duplicated code into a single method. No · 475d3d12
      Nick Lewycky authored
      functionality change.
      
      llvm-svn: 92445
      475d3d12
    • generalize the previous transformation to handle indexing into · fca0c8f9
      Chris Lattner authored
      arrays of structs and other arrays, so long as all the subsequent
      indexes are constants.  This triggers frequently for stuff like:
      
      @divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50]
      
      	  %623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1]
      	   %684 = icmp eq i32 %683, 999 
      
      also for the "my_defs" table in 'gs', etc.
      
      llvm-svn: 92444
      fca0c8f9
    • Cleanup. · ff9cd7ac
      Nick Lewycky authored
      llvm-svn: 92436
      ff9cd7ac
  4. Jan 02, 2010
    • teach instcombine to optimize idioms like A[i]&42 == 0. This · 98ad2b56
      Chris Lattner authored
      occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
      is copied in multiple apps) in _sch_istable, etc.
      
      llvm-svn: 92427
      98ad2b56
    • Teach the table lookup optimization to generate range compares · b56bef45
      Chris Lattner authored
      when a consecutive sequence of elements all satisfies the
      predicate.  Like the double compare case, this generates better
      code than the magic constant case and generalizes to more than
      32/64 element array lookups.
      
      Here are some examples where it triggers.  From 403.gcc, most
      accesses to the rtx_class array are handled, e.g.:
      
      @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
         %142 = icmp eq i8 %141, 105
      @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
      	   %165 = icmp eq i8 %164, 60      
      
      Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) 
      optimized before are actually range compares.  This lets 32-bit
      machines optimize them.
      
      400.perlbmk has stuff like this:
      
      400.perlbmk: PL_regkind, even for 32-bit:
      @PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
      	   %811 = icmp ne i8 %810, 33 
      
      @PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
      	   %12 = icmp ult i8 %10, 2
                 
      etc.
      
      llvm-svn: 92426
      b56bef45
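
The range-compare idea in b56bef45 can be sketched outside the compiler. This is a minimal illustration, not the instcombine code: `RunResult`, `findSingleRun`, and `indexInRun` are hypothetical names, and the tables used with them are made up. It scans a constant table, checks whether the elements satisfying the predicate form one consecutive run, and, if so, answers the original "load and compare" question with a single unsigned range check on the index.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical result type: an inclusive index range [first, last].
struct RunResult {
  bool isSingleRun;
  uint64_t first;
  uint64_t last;
};

// Returns the single consecutive run of indices whose element satisfies
// pred, or isSingleRun == false if there is no match or more than one run.
template <typename Pred>
RunResult findSingleRun(const std::string &table, Pred pred) {
  RunResult r{false, 0, 0};
  bool inRun = false, runEnded = false;
  for (uint64_t i = 0; i < table.size(); ++i) {
    if (pred(table[i])) {
      if (runEnded)
        return RunResult{false, 0, 0}; // a second run started
      if (!inRun) {
        r.first = i;
        inRun = true;
      }
      r.last = i;
    } else if (inRun) {
      inRun = false;
      runEnded = true;
    }
  }
  r.isSingleRun = inRun || runEnded;
  return r;
}

// Replacement for "pred(table[index])": one subtract and one unsigned
// compare. Indices below r.first wrap around to a huge value and fail.
inline bool indexInRun(const RunResult &r, uint64_t index) {
  assert(r.isSingleRun);
  return (index - r.first) < (r.last - r.first + 1);
}
```

Because the check only involves the run's endpoints, it does not depend on the table fitting into a 32- or 64-bit mask, which matches the commit's remark about generalizing beyond 32/64-element lookups.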
    • theoretically the negate we find could be in a different function, check · e199d2df
      Chris Lattner authored
      for this case.
      
      llvm-svn: 92425
      e199d2df
    • use enums for the over/underdefined markers for clarity. Switch · 2fa4ec70
      Chris Lattner authored
      to using -2/-3 instead of -1/-2 for a future xform.
      
      llvm-svn: 92423
      2fa4ec70
    • remove the random sampling framework, which is not maintained anymore. · 351e22aa
      Chris Lattner authored
      If there is interest, it can be resurrected from SVN.  PR4912.
      
      llvm-svn: 92422
      351e22aa
    • a67519be
      Nick Lewycky authored
    • Optimize pointer comparison into the typesafe form, now that the backends will · 357d41b3
      Nick Lewycky authored
      handle them efficiently. This is the opposite direction of the transformation
      we used to have here.
      
      llvm-svn: 92418
      357d41b3
    • Generalize the previous xform to handle cases where exactly · cfda435c
      Chris Lattner authored
      two elements match or don't match with two comparisons.  For
      example, the testcase compiles into:
      
      define i1 @test5(i32 %X) {
        %1 = icmp eq i32 %X, 2                          ; <i1> [#uses=1]
        %2 = icmp eq i32 %X, 7                          ; <i1> [#uses=1]
        %R = or i1 %1, %2                               ; <i1> [#uses=1]
        ret i1 %R
      }
      
      This generalizes the previous xforms when the array is larger than
      64 elements (and this case matches) and generates better code for
      cases where it overlaps with the magic bitshift case.
      
      This generalizes more cases than you might expect.  For example,
      400.perlbmk has:
      
      @PL_utf8skip = constant [256 x i8] c"\01\01\01\...
      %15 = icmp ult i8 %7, 7
      
      403.gcc has:
      @rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
      %18 = icmp eq i16 %16, 295 
      
      and xalancbmk has a bunch of examples, such as 
      _ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.
      
      llvm-svn: 92417
      cfda435c
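
As a rough illustration of the two-comparison case in cfda435c: the sketch below is an assumption-laden model, not the pass itself. `TwoIndexFold`, `foldTwoIndices`, and `evalTwoIndexFold` are hypothetical names. If exactly two indices satisfy the predicate, the lookup becomes `(i == a) | (i == b)`; if exactly two indices fail it, the lookup becomes the negation of that.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of the folded comparison.
struct TwoIndexFold {
  bool applies;  // false when neither two-element case matches
  bool negated;  // true when a and b are the two indices that do NOT match
  uint64_t a;
  uint64_t b;
};

template <typename T, typename Pred>
TwoIndexFold foldTwoIndices(const std::vector<T> &table, Pred pred) {
  std::vector<uint64_t> hits, misses;
  for (uint64_t i = 0; i < table.size(); ++i)
    (pred(table[i]) ? hits : misses).push_back(i);
  if (hits.size() == 2)
    return {true, false, hits[0], hits[1]};
  if (misses.size() == 2)
    return {true, true, misses[0], misses[1]};
  return {false, false, 0, 0};
}

// Evaluates the folded form without touching the table.
inline bool evalTwoIndexFold(const TwoIndexFold &f, uint64_t index) {
  assert(f.applies);
  bool isOneOfTwo = (index == f.a) | (index == f.b);
  return f.negated ? !isOneOfTwo : isOneOfTwo;
}
```

For a made-up table whose only matching entries sit at indices 2 and 7, `foldTwoIndices` yields `a == 2`, `b == 7`, which has the same shape as the `@test5` output quoted in the commit message.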
    • fix a miscompilation I introduced of cdecl with a late change. · c6ac0784
      Chris Lattner authored
      llvm-svn: 92416
      c6ac0784
    • enhance the compare/load/index optimization to work on *any* load · 935a4a60
      Chris Lattner authored
      from a global with 32/64 elements or less (depending on whether
      i64 is native on the target), generating a bitshift idiom to 
      determine the result.  For example, on test4 we produce:
      
      define i1 @test4(i32 %X) {
        %1 = lshr i32 933, %X                           ; <i32> [#uses=1]
        %2 = and i32 %1, 1                              ; <i32> [#uses=1]
        %R = icmp ne i32 %2, 0                          ; <i1> [#uses=1]
        ret i1 %R
      }
      
      This triggers in a number of interesting cases, for example, here's an
      fp case:
      @A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]*> [#uses=7]
      ...
      	   %7 = fcmp olt double %3, 0.000000e+00
      
      In this case we make the slen2_tab global dead, which is nice:
      @slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]*> [#uses=1]
      ...
      	   %204 = icmp eq i32 %46, 0     
      
      Perl has a bunch of these, also on the 'Perl_regkind' array:
      @Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]*> [#uses=1]
      ...
        %1364 = icmp eq i16 %1361, 0
      
      186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this:
      @white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]*> [#uses=2]
      
      However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc.
      
      go 64-bit machines :)
      
      llvm-svn: 92415
      935a4a60
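
The bitshift idiom from 935a4a60 can be modeled in a few lines. This is a sketch under stated assumptions: `buildMagicMask` and `lookupViaMask` are hypothetical helpers, and the 10-element table in the usage note is invented so that the mask comes out to the constant 933 seen in the `@test4` output; the commit message does not show the actual test input. Bit `i` of the mask records whether element `i` satisfies the comparison, so the load turns into a shift and a mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a mask with bit i set when table[i] satisfies pred.
// Only meaningful for tables of at most 32 entries (one bit per entry).
template <typename T, typename Pred>
uint32_t buildMagicMask(const std::vector<T> &table, Pred pred) {
  assert(table.size() <= 32 && "table too large for a 32-bit mask");
  uint32_t mask = 0;
  for (size_t i = 0; i < table.size(); ++i)
    if (pred(table[i]))
      mask |= uint32_t{1} << i;
  return mask;
}

// The folded form: ((mask >> index) & 1) != 0, mirroring the lshr/and/icmp
// sequence. The index must be in range; shifting by 32 or more is undefined.
inline bool lookupViaMask(uint32_t mask, uint32_t index) {
  assert(index < 32);
  return ((mask >> index) & 1u) != 0;
}
```

With the invented table `{35, 82, 69, 81, 85, 73, 82, 69, 68, 0}` and the predicate `v <= 73`, the matching indices are 0, 2, 5, 7, 8, and 9, giving 1 + 4 + 32 + 128 + 256 + 512 = 933. A 64-entry table such as `@white_outpost` would need a 64-bit mask, which is why the commit ties the element limit to whether i64 is native on the target.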
    • enhance the previous optimization to work with fcmp in addition · b1567bd5
      Chris Lattner authored
      to icmp.
      
      llvm-svn: 92412
      b1567bd5
    • Teach instcombine to fold compares of loads from constant · a061859c
      Chris Lattner authored
      arrays with variable indices into a comparison of the index
      with a constant.  The most common occurrence of this that
      I see by far is stuff like:
      
      if ("foobar"[i] == '\0') ...
      
      which we compile into: if (i == 6), saving a load and 
      materialization of the global address.  This also exposes 
      loop trip count information to later passes in many cases.
      
      This triggers hundreds of times in xalancbmk, which is where I first
      noticed it, but it also triggers in many other apps.  Here are a few 
      interesting ones from various apps:
      
      @must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
        %scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
        %17 = load ...
        %18 = icmp eq i8* %17, null                     ; <i1> [#uses=1]
      -> icmp eq i64 %indvar.i, 7 
      
      
      @yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
        %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
         %mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
      load ...
         %64 = icmp eq i8 %58, 4                         ; <i1> [#uses=1]
      -> icmp eq i64 %.pn, 35             ; <i1> [#uses=0]
      
      
      @gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
      %scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
      %425 = load %scevgep.i
      %426 = icmp eq i16 %425, -32768                 ; <i1> [#uses=0]
      -> false
      
      llvm-svn: 92411
      a061859c
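
The basic fold in a061859c amounts to evaluating the comparison against every element of the constant array ahead of time. The sketch below is a simplified model with hypothetical names (`FoldKind`, `Fold`, `foldTableCompare`), not the instcombine implementation. It classifies the result as always false, always true, or equivalent to comparing the index with one constant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical classification of what the load+compare folds to.
enum class FoldKind { AlwaysFalse, AlwaysTrue, SingleIndex, NoFold };

struct Fold {
  FoldKind kind;
  size_t index; // only meaningful when kind == SingleIndex
};

// Evaluates pred on every element of a constant table and reports
// whether "pred(table[i])" can be replaced by a constant or by "i == index".
template <typename T, typename Pred>
Fold foldTableCompare(const std::vector<T> &table, Pred pred) {
  size_t matches = 0, last = 0;
  for (size_t i = 0; i < table.size(); ++i) {
    if (pred(table[i])) {
      ++matches;
      last = i;
    }
  }
  if (matches == 0)
    return {FoldKind::AlwaysFalse, 0};
  if (matches == table.size())
    return {FoldKind::AlwaysTrue, 0};
  if (matches == 1)
    return {FoldKind::SingleIndex, last};
  return {FoldKind::NoFold, 0};
}
```

Applied to the characters of `"foobar"` plus its terminating NUL with the predicate `c == '\0'`, this reports a single matching index of 6, matching the `if (i == 6)` rewrite described in the commit. Applied to the four `@gsm_DLB` values with the predicate `v == -32768`, nothing matches and the compare folds to false.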
    • remove the instcombine transformations that are inserting nasty · 2e4be2c3
      Chris Lattner authored
      pointer to int casts that confuse later optimizations.  See PR3351
      for details.
      
      This improves but doesn't completely fix 483.xalancbmk because llvm-gcc
      does this xform in GCC's "fold" routine as well.  Clang++ will do
      better I guess.
      
      llvm-svn: 92408
      2e4be2c3
    • add a simple instcombine xform, simplify another one to use hasAllZeroIndices() · faf1337a
      Chris Lattner authored
      instead of hand rolling a loop.
      
      llvm-svn: 92403
      faf1337a
  5. Jan 01, 2010