  Jan 11, 2012
    • Unify the interface of the three mask+shift transform helpers, and · 3dbcda84
      Chandler Carruth authored
      factor the differences that were hiding in one of them into its other
      caller, the SRL handling code. No change in behavior.
      
      llvm-svn: 147940
    • Clarify and make explicit some of the requirements for transforming · aa01e666
      Chandler Carruth authored
      mask+shift pairs at the beginning of the ISD::AND case block, and then
      hoist the final pattern into a helper function, simplifying and
      reflowing it appropriately. This should have no observable behavior
      change, but several simplifications fell out of this such as directly
      computing the new mask constant, etc.
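
      For illustration, a minimal C sketch of the kind of identity such a
      transform relies on, assuming it is the standard commutation of a mask
      with a right shift (the constants here are hypothetical, not the ones
      this commit handles):

        #include <assert.h>

        int main(void) {
          /* (x & M) >> C == (x >> C) & (M >> C), so the new mask
             constant can be computed directly as M >> C. */
          unsigned x = 0xDEADBEEFu, M = 0x0000FFF0u;
          unsigned C = 4;
          assert(((x & M) >> C) == ((x >> C) & (M >> C)));
          return 0;
        }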
      
      llvm-svn: 147939
    • Fix undefined code and reenable test case. · 60399837
      Jakob Stoklund Olesen authored
      I don't think the compact encoding code is right, but at least it has
      defined behavior now.
      
      llvm-svn: 147938
    • Hoist the logic to transform shift+mask combinations into sub-register · 51d3076b
      Chandler Carruth authored
      extracts and scaled addressing modes into its own helper function. No
      functionality changed here, just hoisting and layout fixes falling out
      of that hoisting.
      
      llvm-svn: 147937
    • Teach the X86 instruction selection to do some heroic transforms to · 55b2cdee
      Chandler Carruth authored
      detect a pattern which can be implemented with a small 'shl' embedded in
      the addressing mode scale. This happens in real code as follows:
      
        unsigned x = my_accelerator_table[input >> 11];
      
      Here we have some lookup table that we look into using the high bits of
      'input'. Each entry in the table is 4 bytes, which means this
      implicitly gets turned into (once lowered out of a GEP):
      
        *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));
      
      The shift right followed by a shift left is canonicalized to a smaller
      shift right and masking off the low bits. That hides the small shift
      left, which x86's addressing-mode scale is designed to support. We now detect
      masks of this form, and produce the longer shift right followed by the
      proper addressing mode. In addition to saving a (rather large)
      instruction, this also reduces stalls in Intel chips on benchmarks I've
      measured.
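
      The equivalence can be checked directly; here is a small self-contained
      C sketch using the constants from the example above (the assembly in
      the comment is a hand-written sketch of the intended selection, not
      compiler output):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          uint32_t input = 0xDEADBEEF;
          /* Canonicalized DAG form: smaller shift right, then mask. */
          uint32_t masked = (input >> 9) & ~3u;
          /* Form the addressing mode wants: shift right by 11, with the
             '<< 2' absorbed into the scale, roughly:
               shrl $11, %ecx
               movl my_accelerator_table(,%rcx,4), %eax */
          uint32_t scaled = (input >> 11) << 2;
          assert(masked == scaled);
          return 0;
        }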
      
      In order for all of this to work, one part of the DAG needs to be
      canonicalized *still further* than it currently is. This involves
      removing pointless 'trunc' nodes between a zextload and a zext. Without
      that, we end up generating spurious masks and hiding the pattern.
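
      A minimal C sketch of why that trunc is removable, assuming the
      truncation width matches the zextload's memory type (the real
      transform operates on SelectionDAG nodes; the C below only
      demonstrates the value identity):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          uint16_t mem = 0xBEEF;
          uint32_t loaded = mem;                 /* zextload i16 -> i32 */
          uint32_t roundtrip = (uint16_t)loaded; /* zext (trunc ...) */
          /* Truncating to i16 a value zero-extended from i16 loses
             nothing, so zext (trunc (zextload x)) == zextload x and the
             trunc can be removed. */
          assert(roundtrip == loaded);
          return 0;
        }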
      
      llvm-svn: 147936