  1. Jan 13, 2014
    • Re-sort #include lines again, prior to moving headers around. · 07baed53
      Chandler Carruth authored
      llvm-svn: 199080
      07baed53
    • [PM] Wire up support for writing bitcode with new PM. · b7bdfd65
      Chandler Carruth authored
      This moves the old pass creation functionality to its own header and
      updates the callers of that routine. Then it adds a new PM supporting
      bitcode writer to the header file, and wires that up in the opt tool.
      A test is added that round-trips code into bitcode and back out using
      the new pass manager.
      
      llvm-svn: 199078
      b7bdfd65
    • [AArch64 NEON] Add missing patterns for bitcast from or to v1f64 · cfef55d6
      Kevin Qin authored
      llvm-svn: 199070
      cfef55d6
    • [AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector · 21e8f1c4
      Kevin Qin authored
      This patch covers two more scenarios:
      
      1. The two operands of shuffle_vector are the same, as in
      %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>
      
      2. One of the operands is undef, as in
      %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>
      
      After this patch, perm instructions can be emitted in these cases instead of many INS instructions.
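To see what the two masks above select, here is a minimal sketch that models only the element-selection semantics of the IR `shufflevector` instruction. The helper name, the `UNDEF` stand-in, and the sample lane values are illustrative assumptions; the sketch says nothing about which AArch64 instruction the backend actually picks.

```python
UNDEF = None  # illustrative stand-in for an undef lane


def shufflevector(a, b, mask):
    """Model shufflevector: each mask index selects a lane from a ++ b."""
    if b is UNDEF:
        b = [UNDEF] * len(a)
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


a = [10, 11, 12, 13, 14, 15, 16, 17]      # hypothetical <8 x i8> value
mask = [0, 2, 4, 6, 8, 10, 12, 14]        # mask from the commit message

same_operands = shufflevector(a, a, mask)      # scenario 1
undef_operand = shufflevector(a, UNDEF, mask)  # scenario 2

print(same_operands)
print(undef_operand)
```

In both scenarios the result is built from the even lanes of `%a`, which is the regular pattern that a permute-style instruction can express in one step.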
      
      llvm-svn: 199069
      21e8f1c4
    • correct target directive handling error handling · a6505ca4
      Saleem Abdulrasool authored
      The target specific parser should return `false' if the target AsmParser handles
      the directive, and `true' if the generic parser should handle the directive.
      Many of the target specific directive handlers would `return Error' which does
      not follow these semantics.  This change updates the target specific
      routines to conform to the semantics of ParseDirective.
      
      Conformance to the semantics improves diagnostics emitted for the invalid
      directives.  X86 is taken as a sample to ensure that multiple diagnostics are
      not presented for a single error.
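The return convention described above can be illustrated with a toy model. This is only a sketch, not LLVM code: `run_parser`, both handlers, the `.mydirective` name, and the generic directive set are hypothetical, and it assumes the error helper reports a diagnostic and returns true.

```python
GENERIC_DIRECTIVES = {".text", ".data"}  # hypothetical generic directive set


def run_parser(directive, target_handler):
    """Toy driver: ask the target first, then fall back to the generic parser."""
    diagnostics = []

    def error(message):
        # Assumption: the error helper records a diagnostic and returns True.
        diagnostics.append(message)
        return True

    not_handled = target_handler(directive, error)
    if not_handled and directive not in GENERIC_DIRECTIVES:
        error("unknown directive " + directive)
    return diagnostics


def buggy_target_handler(directive, error):
    if directive == ".mydirective":
        # 'return Error' yields True, so the generic parser also runs.
        return error("invalid operand for .mydirective")
    return True


def fixed_target_handler(directive, error):
    if directive == ".mydirective":
        error("invalid operand for .mydirective")
        return False  # the target handled the directive, even though it failed
    return True


print(run_parser(".mydirective", buggy_target_handler))
print(run_parser(".mydirective", fixed_target_handler))
```

With the buggy handler the single mistake produces two diagnostics; with the fixed handler only the target's own diagnostic is reported.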
      
      llvm-svn: 199068
      a6505ca4
  2. Jan 12, 2014
  3. Jan 11, 2014
    • LoopVectorizer: Enable strided memory accesses versioning per default · 66c742ae
      Arnold Schwaighofer authored
      I saw no compile or execution time regressions on x86_64 -mavx -O3.
      
      radar://13075509
      
      llvm-svn: 199015
      66c742ae
    • [Sparc] Bundle instruction with delay slot and its filler. Now, we can use... · 0653218b
      Venkatraman Govindaraju authored
      [Sparc] Bundle instruction with delay slot and its filler. Now, we can use -verify-machineinstrs with SPARC backend.
      
      llvm-svn: 199014
      0653218b
    • Fix 'ned' typo in doc comment · 798060e0
      Alp Toker authored
      Patch by Jasper Neumann!
      
      llvm-svn: 199007
      798060e0
    • [PM] Add names to passes under the new pass manager, and a debug output · a13f27cc
      Chandler Carruth authored
      mode that can be used to debug the execution of everything.
      
      No support for analyses here, that will come later. This already helps
      show parts of the opt commandline integration that aren't working. Tests
      of that will start using it as the bugs are fixed.
      
      llvm-svn: 199004
      a13f27cc
    • LoopVectorize.cpp: Appease MSC16. · 41c409ce
      NAKAMURA Takumi authored
      Excuse me, I hope msc16 builders would be fine till its end day.
      Introduce nullptr then. ;)
      
      llvm-svn: 199001
      41c409ce
    • [anyregcc] Fix callee-save mask for anyregcc · 976d94b8
      Juergen Ributzka authored
      Use separate callee-save masks for XMM and YMM registers for anyregcc on X86 and
      select the proper mask depending on the target cpu we compile for.
      
      llvm-svn: 198985
      976d94b8
    • Revert r198979 - accidental commit. · 942f22c4
      Eric Christopher authored
      llvm-svn: 198981
      942f22c4
    • Reformat. · ceec7b02
      Eric Christopher authored
      llvm-svn: 198980
      ceec7b02
    • Update function name and add some helpful comments. · 67cde9ac
      Eric Christopher authored
      llvm-svn: 198979
      67cde9ac
    • Extend and simplify the sample profile input file. · 9518b63b
      Diego Novillo authored
      1- Use the line_iterator class to read profile files.
      
      2- Allow comments in profile file. Lines starting with '#'
         are completely ignored while reading the profile.
      
      3- Add parsing support for discriminators and indirect call samples.
      
         Our external profiler can emit more profile information that we are
         currently not handling. This patch does not add new functionality to
         support this information, but it allows profile files to provide it.
      
         I will add actual support later on (for at least one of these
         features, I need support for DWARF discriminators in Clang).
      
         A sample line may contain the following additional information:
      
         Discriminator. This is used if the sampled program was compiled with
         DWARF discriminator support
         (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This
         is currently only emitted by GCC and we just ignore it.
      
         Potential call targets and samples. If present, this line contains a
         call instruction. This models both direct and indirect calls. Each
         called target is listed together with the number of samples. For
         example,
      
                          130: 7  foo:3  bar:2  baz:7
      
         The above means that at relative line offset 130 there is a call
         instruction that calls one of foo(), bar() and baz(), with baz()
         being the most frequent call target.
      
         Differential Revision: http://llvm-reviews.chandlerc.com/D2355
      
      4- Simplify format of profile input file.
      
         This implements earlier suggestions to simplify the format of the
         sample profile file. The symbol table is not necessary and function
         profiles do not need to know the number of samples in advance.
      
         Differential Revision: http://llvm-reviews.chandlerc.com/D2419
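As a rough illustration of the sample line shown in item 3 and the comment rule in item 2, the following sketch parses one line into an offset, a sample count, and a table of call targets. The function names are hypothetical, discriminators are not handled, and this is not the parser added by the patch.

```python
def is_comment(line):
    """Lines starting with '#' are ignored when reading the profile."""
    return line.startswith("#")


def parse_sample_line(line):
    """Parse 'OFFSET: SAMPLES [TARGET:COUNT ...]' into a tuple.

    Assumption: no discriminator is present on the line.
    """
    head, _, rest = line.partition(":")
    offset = int(head.strip())
    fields = rest.split()
    samples = int(fields[0])
    targets = {}
    for field in fields[1:]:
        name, _, count = field.rpartition(":")
        targets[name] = int(count)
    return offset, samples, targets


lines = ["# hypothetical comment line", "130: 7  foo:3  bar:2  baz:7"]
parsed = [parse_sample_line(l) for l in lines if not is_comment(l)]
offset, samples, targets = parsed[0]
hottest = max(targets, key=targets.get)
print(offset, samples, targets, hottest)
```

Here `hottest` names the call target with the largest sample count on that line.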
      
      llvm-svn: 198973
      9518b63b
    • Propagation of profile samples through the CFG. · 0accb3d2
      Diego Novillo authored
      This adds a propagation heuristic to convert instruction samples
      into branch weights. It implements a similar heuristic to the one
      implemented by Dehao Chen on GCC.
      
      The propagation proceeds in 3 phases:
      
      1- Assignment of block weights. All the basic blocks in the function
         are initially assigned the same weight as their most frequently
         executed instruction.
      
      2- Creation of equivalence classes. Since samples may be missing from
         blocks, we can fill in the gaps by setting the weights of all the
         blocks in the same equivalence class to the same weight. To compute
         the concept of equivalence, we use dominance and loop information.
         Two blocks B1 and B2 are in the same equivalence class if B1
         dominates B2, B2 post-dominates B1 and both are in the same loop.
      
      3- Propagation of block weights into edges. This uses a simple
         propagation heuristic. The following rules are applied to every
         block B in the CFG:
      
         - If B has a single predecessor/successor, then the weight
           of that edge is the weight of the block.
      
         - If all the edges are known except one, and the weight of the
           block is already known, the weight of the unknown edge will
           be the weight of the block minus the sum of all the known
           edges. If the sum of all the known edges is larger than B's weight,
           we set the unknown edge weight to zero.
      
         - If there is a self-referential edge, and the weight of the block is
           known, the weight for that edge is set to the weight of the block
           minus the weight of the other incoming edges to that block (if
           known).
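The "all edges known except one" rule from phase 3 can be sketched as follows. This is a simplified illustration with hypothetical names and data layout (a dictionary from edge to weight, with `None` for unknown), not the code from the patch.

```python
def infer_unknown_edge(block_weight, edge_weights):
    """Fill in the single unknown edge of a block, if there is exactly one.

    edge_weights maps an edge name to its weight, or None when unknown.
    Returns a new dictionary; the input is left untouched.
    """
    result = dict(edge_weights)
    unknown = [edge for edge, weight in edge_weights.items() if weight is None]
    if block_weight is None or len(unknown) != 1:
        return result  # nothing can be inferred yet
    known_sum = sum(w for w in edge_weights.values() if w is not None)
    # Clamp at zero when the known edges already exceed the block weight.
    result[unknown[0]] = max(block_weight - known_sum, 0)
    return result


print(infer_unknown_edge(100, {"B->C": 30, "B->D": None}))
print(infer_unknown_edge(100, {"B->C": 120, "B->D": None}))
print(infer_unknown_edge(100, {"B->C": None}))
```

The last call shows that the single predecessor/successor rule falls out as a special case: with no other edges, the lone edge receives the full block weight.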
      
      Since this propagation is not guaranteed to converge for every CFG, we
      only allow it to proceed for a limited number of iterations (controlled
      by -sample-profile-max-propagate-iterations). The default is 100, the
      same as in GCC.
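A bounded fixed-point loop of this kind might look like the sketch below. The `propagate` function and the `step` callback protocol (return true while something changed) are assumptions made for illustration; only the limit of 100 comes from the text above.

```python
def propagate(step, max_iterations=100):
    """Run step() until it reports no change or the iteration limit is hit.

    Returns (iterations_run, converged).
    """
    for iteration in range(1, max_iterations + 1):
        changed = step()
        if not changed:
            return iteration, True
    return max_iterations, False


def make_step(changes_remaining):
    """Hypothetical step that reports a change a fixed number of times."""
    state = {"left": changes_remaining}

    def step():
        if state["left"] > 0:
            state["left"] -= 1
            return True
        return False

    return step


print(propagate(make_step(3)))
print(propagate(lambda: True))  # never settles, so the limit applies
```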
      
      Before propagation starts, the pass builds (for each block) a list of
      unique predecessors and successors. This is necessary to handle
      identical edges in multiway branches. Since we visit all blocks and all
      edges of the CFG, it is cleaner to build these lists once at the start
      of the pass.
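One way to build such lists is sketched here. The CFG representation (a dictionary from block name to a list of successors, possibly with repeats) and the function name are hypothetical.

```python
def build_unique_neighbors(successors):
    """Return (preds, succs) with duplicate edges collapsed.

    successors maps each block to its successor list, which may repeat a
    target (for example, several switch cases jumping to the same block).
    """
    succs = {}
    preds = {block: [] for block in successors}
    for block, targets in successors.items():
        unique = list(dict.fromkeys(targets))  # drop repeats, keep order
        succs[block] = unique
        for target in unique:
            preds.setdefault(target, []).append(block)
    return preds, succs


cfg = {
    "entry": ["a", "a", "b"],  # two identical edges entry->a
    "a": ["exit"],
    "b": ["exit"],
    "exit": [],
}
preds, succs = build_unique_neighbors(cfg)
print(succs["entry"], preds["a"], preds["exit"])
```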
      
      Finally, the patch fixes the computation of relative line locations.
      The profiler emits lines relative to the function header. To find the
      header line, we traverse the compilation unit looking for the subprogram
      corresponding to the function. The line number of that subprogram is the
      line where the function begins. That becomes line zero for all the
      relative locations.
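The relative-line computation can be illustrated with a small sketch. The list-of-dictionaries layout standing in for the compilation unit's subprograms, the function names, and the line numbers are hypothetical.

```python
def find_header_line(subprograms, function_name):
    """Return the line where the named function begins, or None if absent."""
    for subprogram in subprograms:
        if subprogram["name"] == function_name:
            return subprogram["line"]
    return None


def relative_line(subprograms, function_name, absolute_line):
    """Line offset of an instruction relative to its function header."""
    header = find_header_line(subprograms, function_name)
    if header is None:
        return None
    return absolute_line - header


subprograms = [{"name": "main", "line": 40}, {"name": "foo", "line": 12}]
print(relative_line(subprograms, "foo", 142))
print(relative_line(subprograms, "foo", 12))
```

The function header itself maps to offset zero, and every other line in the function is counted from there.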
      
      llvm-svn: 198972
      0accb3d2
  4. Jan 10, 2014