  1. Jan 12, 2014
  2. Jan 11, 2014
    • Benjamin Kramer's avatar
      Fix broken CHECK lines. · c10563d1
      Benjamin Kramer authored
      llvm-svn: 199016
      c10563d1
    • Arnold Schwaighofer's avatar
      LoopVectorizer: Enable strided memory accesses versioning per default · 66c742ae
      Arnold Schwaighofer authored
      I saw no compile or execution time regressions on x86_64 -mavx -O3.
      
      radar://13075509
      
      llvm-svn: 199015
      66c742ae
    • Venkatraman Govindaraju's avatar
      [Sparc] Bundle instruction with delay slot and its filler. Now, we can use... · 0653218b
      Venkatraman Govindaraju authored
      [Sparc] Bundle instruction with delay slot and its filler. Now, we can use -verify-machineinstrs with SPARC backend.
      
      llvm-svn: 199014
      0653218b
    • Alp Toker's avatar
      lit: Provide source locations in cfg files with older Python versions · 74997190
      Alp Toker authored
      This commit prospectively brings the benefits of r198766 to older supported
      Python versions (2.5+).
      
      Tested with Python 2.6, 2.7, 3.1 and 3.3 (!)
      
      llvm-svn: 199009
      74997190
    • Alp Toker's avatar
      Fix 'ned' typo in doc comment · 798060e0
      Alp Toker authored
      Patch by Jasper Neumann!
      
      llvm-svn: 199007
      798060e0
    • Alp Toker's avatar
      lit: execfile() isn't present in Python 3.3 · f0a24594
      Alp Toker authored
      On the other hand, exec(compile()) doesn't work in older Python versions in the
      2.x series.
      
      This commit introduces exec(compile()) with a fallback to plain exec(). That'll
      hopefully hit the sweet spot in terms of version support.
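
A minimal sketch of the fallback pattern described above, not the actual lit code. The function name `exec_config` and the choice of exceptions that trigger the fallback are assumptions made for illustration; the commit message does not say how the older interpreters fail.

```python
def exec_config(source, path, namespace):
    """Run config `source` in `namespace`, attributing errors to `path` when possible."""
    try:
        # Compiling with the real path makes tracebacks name the config file.
        code = compile(source, path, "exec")
    except (TypeError, ValueError):
        # Assumed failure mode: the interpreter rejects this compile() call.
        # Fall back to plain exec(), which loses the file name in tracebacks.
        exec(source, namespace)
    else:
        exec(code, namespace)


# Usage: the config text is executed and its names land in `namespace`.
namespace = {}
exec_config("name = 'demo'\nvalue = 6 * 7\n", "lit.cfg", namespace)
```

When the compiled path is taken, an error raised inside the config is reported against `lit.cfg` and its line number rather than against `<string>`.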
      
      Followup to r198766 which added enhanced source locations for lit cfg parsing.
      
      llvm-svn: 199006
      f0a24594
    • Chandler Carruth's avatar
      [PM] Actually nest pass managers correctly when parsing the pass · 258dbb3b
      Chandler Carruth authored
      pipeline string. Add tests that cover this now that we have execution
      dumping in the pass managers.
      
      llvm-svn: 199005
      258dbb3b
    • Chandler Carruth's avatar
      [PM] Add names to passes under the new pass manager, and a debug output · a13f27cc
      Chandler Carruth authored
      mode that can be used to debug the execution of everything.
      
      No support for analyses here; that will come later. This already helps
      show parts of the opt commandline integration that aren't working. Tests
      of that will start using it as the bugs are fixed.
      
      llvm-svn: 199004
      a13f27cc
    • Chandler Carruth's avatar
      [PM] Somehow I missed the header guards on this file. Yikes! · d7693d84
      Chandler Carruth authored
      llvm-svn: 199003
      d7693d84
    • NAKAMURA Takumi's avatar
      LoopVectorize.cpp: Appease MSC16. · 41c409ce
      NAKAMURA Takumi authored
      Excuse me, I hope msc16 builders would be fine till its end day.
      Introduce nullptr then. ;)
      
      llvm-svn: 199001
      41c409ce
    • NAKAMURA Takumi's avatar
      llvm/test/CodeGen/X86/anyregcc.ll: Add explicit -mtriple=x86_64-unknown-unknown. · 80a474c1
      NAKAMURA Takumi authored
      XMM(s) are really spilling for targeting Win64.
      
      llvm-svn: 198999
      80a474c1
    • Chandler Carruth's avatar
      [PM] Add (very skeletal) support to opt for running the new pass · 66445382
      Chandler Carruth authored
      manager. I cannot emphasize enough that this is a WIP. =] I expect it
      to change a great deal as things stabilize, but I think it's really
      important to get *some* functionality here so that the infrastructure
      can be tested more traditionally from the commandline.
      
      The current design is looking something like this:
      
        ./bin/opt -passes='module(pass_a,pass_b,function(pass_c,pass_d))'
      
      So rather than custom-parsed flags, there is a single flag with a string
      argument that is parsed into the pass pipeline structure. This makes it
      really easy to have nice structural properties that are very explicit.
      There is one obvious and important shortcut. You can start off the
      pipeline with a pass, and the minimal context of pass managers will be
      built around the entire specified pipeline. This makes the common case
      for tests super easy:
      
        ./bin/opt -passes=instcombine,sroa,gvn
      
      But this won't introduce any of the complexity of the fully inferred old
      system -- we only ever do this for the *entire* argument, and we only
      look at the first pass. If the other passes don't fit in the pass
      manager selected it is a hard error.
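
To make the structure of the pipeline string concrete, here is a small Python sketch of a parser for the two forms shown above. It is not the parser embedded in opt; the names `parse_pipeline`, `_parse_list` and `MANAGERS` are hypothetical, and the sketch only builds the nested structure. It does not implement the shortcut of wrapping a bare pass list in a minimal pass manager, since that needs to know which manager each pass belongs to.

```python
MANAGERS = ("module", "function")  # only the two managers named in the message


def _parse_list(text):
    """Parse comma-separated elements until ')' or end of input.

    Returns (nodes, remaining_text). A node is either a pass name (str)
    or a (manager_name, child_nodes) tuple.
    """
    nodes = []
    while True:
        end = 0
        while end < len(text) and text[end] not in ",()":
            end += 1
        name, text = text[:end], text[end:]
        if not name:
            raise ValueError("empty pass name")
        if text.startswith("("):
            if name not in MANAGERS:
                raise ValueError("unknown pass manager: %r" % name)
            children, text = _parse_list(text[1:])
            if not text.startswith(")"):
                raise ValueError("missing ')'")
            text = text[1:]
            nodes.append((name, children))
        else:
            nodes.append(name)
        if text.startswith(","):
            text = text[1:]
            continue
        return nodes, text


def parse_pipeline(text):
    """Parse a whole pipeline string; any leftover text is a hard error."""
    nodes, rest = _parse_list(text)
    if rest:
        raise ValueError("unexpected text: %r" % rest)
    return nodes


nested = parse_pipeline("module(pass_a,pass_b,function(pass_c,pass_d))")
flat = parse_pipeline("instcombine,sroa,gvn")
```

Here `nested` holds one `module` node whose children are two passes and a `function` node, while `flat` is a plain list of three pass names.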
      
      The other interesting aspect here is that I'm not relying on any
      registration facilities. Such facilities may be unavoidable for
      supporting plugins, but I have alternative ideas for plugins that I'd
      like to try first. My plan is essentially to build everything without
      registration until we hit an absolute requirement.
      
      Instead of registration of pass names, there will be a library dedicated
      to parsing pass names and the pass pipeline strings described above.
      Currently, this is directly embedded into opt for simplicity as it is
      very early, but I plan to eventually pull this into a library that opt,
      bugpoint, and even Clang can depend on. It should end up as a good home
      for things like the existing PassManagerBuilder as well.
      
      There are a bunch of FIXMEs in the code for the parts of this that are
      just stubbed out to make the patch more incremental. A quick list of
      what's coming up directly after this:
      - Support for function passes and building the structured nesting.
      - Support for printing the pass structure, and FileCheck tests of all of
        this code.
      - The .def-file based pass name parsing.
      - IR printing passes and the corresponding tests.
      
      Some obvious things that I'm not going to do right now, but am
      definitely planning on as the pass manager work gets a bit further:
      - Pull the parsing into a library, including the builders.
      - Thread the rest of the target stuff into the new pass manager.
      - Wire support for the new pass manager up to llc.
      - Plugin support.
      
      Some things that I'd like to have, but are significantly lower on my
      priority list. I'll get to these eventually, but they may also be places
      where others want to contribute:
      - Adding nice error reporting for broken pass pipeline descriptions.
      - Typo-correction for pass names.
      
      llvm-svn: 198998
      66445382
    • Juergen Ributzka's avatar
      [anyregcc] Fix callee-save mask for anyregcc · 976d94b8
      Juergen Ributzka authored
      Use separate callee-save masks for XMM and YMM registers for anyregcc on X86 and
      select the proper mask depending on the target cpu we compile for.
      
      llvm-svn: 198985
      976d94b8
    • Eric Christopher's avatar
      Revert r198979 - accidental commit. · 942f22c4
      Eric Christopher authored
      llvm-svn: 198981
      942f22c4
    • Eric Christopher's avatar
      Reformat. · ceec7b02
      Eric Christopher authored
      llvm-svn: 198980
      ceec7b02
    • Eric Christopher's avatar
      Update function name and add some helpful comments. · 67cde9ac
      Eric Christopher authored
      llvm-svn: 198979
      67cde9ac
    • Eric Christopher's avatar
      Fix odd whitespace. · a052e12c
      Eric Christopher authored
      llvm-svn: 198978
      a052e12c
    • Diego Novillo's avatar
      Extend and simplify the sample profile input file. · 9518b63b
      Diego Novillo authored
      1- Use the line_iterator class to read profile files.
      
      2- Allow comments in profile file. Lines starting with '#'
         are completely ignored while reading the profile.
      
      3- Add parsing support for discriminators and indirect call samples.
      
         Our external profiler can emit more profile information that we are
         currently not handling. This patch does not add new functionality to
         support this information, but it allows profile files to provide it.
      
         I will add actual support later on (for at least one of these
         features, I need support for DWARF discriminators in Clang).
      
         A sample line may contain the following additional information:
      
         Discriminator. This is used if the sampled program was compiled with
         DWARF discriminator support
         (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This
         is currently only emitted by GCC and we just ignore it.
      
         Potential call targets and samples. If present, this line contains a
         call instruction. This models both direct and indirect calls. Each
         called target is listed together with the number of samples. For
         example,
      
                          130: 7  foo:3  bar:2  baz:7
      
         The above means that at relative line offset 130 there is a call
         instruction that calls one of foo(), bar() and baz(), with baz()
         being the most frequent call target.
      
         Differential Revision: http://llvm-reviews.chandlerc.com/D2355
      
      4- Simplify format of profile input file.
      
         This implements earlier suggestions to simplify the format of the
         sample profile file. The symbol table is not necessary and function
         profiles do not need to know the number of samples in advance.
      
         Differential Revision: http://llvm-reviews.chandlerc.com/D2419
      
      llvm-svn: 198973
      9518b63b
    • Diego Novillo's avatar
      Propagation of profile samples through the CFG. · 0accb3d2
      Diego Novillo authored
      This adds a propagation heuristic to convert instruction samples
      into branch weights. It implements a similar heuristic to the one
      implemented by Dehao Chen on GCC.
      
      The propagation proceeds in 3 phases:
      
      1- Assignment of block weights. All the basic blocks in the function
         are initially assigned the same weight as their most frequently
         executed instruction.
      
      2- Creation of equivalence classes. Since samples may be missing from
         blocks, we can fill in the gaps by setting the weights of all the
         blocks in the same equivalence class to the same weight. To compute
         the concept of equivalence, we use dominance and loop information.
         Two blocks B1 and B2 are in the same equivalence class if B1
         dominates B2, B2 post-dominates B1 and both are in the same loop.
      
      3- Propagation of block weights into edges. This uses a simple
         propagation heuristic. The following rules are applied to every
         block B in the CFG:
      
         - If B has a single predecessor/successor, then the weight
           of that edge is the weight of the block.
      
         - If all the edges are known except one, and the weight of the
           block is already known, the weight of the unknown edge will
           be the weight of the block minus the sum of all the known
           edges. If the sum of all the known edges is larger than B's weight,
           we set the unknown edge weight to zero.
      
         - If there is a self-referential edge, and the weight of the block is
           known, the weight for that edge is set to the weight of the block
           minus the weight of the other incoming edges to that block (if
           known).
      
      Since this propagation is not guaranteed to terminate for every CFG, we
      only allow it to proceed for a limited number of iterations (controlled
      by -sample-profile-max-propagate-iterations). It currently uses the same
      GCC default of 100.
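
The first two edge rules of phase 3 and the iteration cap can be sketched in Python as follows. This is a simplified illustration, not the pass itself: the name `propagate`, the dictionary-based CFG encoding and the example weights are all hypothetical, edges are assumed to be unique already, and the sketch leaves out equivalence classes and the self-referential edge rule.

```python
MAX_ITERATIONS = 100  # the default limit named in the message


def propagate(block_weights, edges, max_iterations=MAX_ITERATIONS):
    """Infer edge weights from known block weights.

    block_weights: dict block -> weight, for blocks whose weight is known.
    edges: list of unique (src, dst) pairs.
    Returns dict edge -> weight, with None for edges left unresolved.
    """
    weights = {edge: None for edge in edges}
    for _ in range(max_iterations):
        changed = False
        for block, block_weight in block_weights.items():
            # side 0 selects outgoing edges, side 1 selects incoming edges.
            for side in (0, 1):
                group = [e for e in edges if e[side] == block]
                unknown = [e for e in group if weights[e] is None]
                if len(unknown) != 1:
                    continue
                known = sum(weights[e] for e in group if weights[e] is not None)
                # Block weight minus the known edges, clamped at zero. With a
                # single edge and nothing known, this is the block weight.
                weights[unknown[0]] = max(0, block_weight - known)
                changed = True
        if not changed:
            break
    return weights


# A diamond CFG where the "else" block has no samples of its own.
edges = [("entry", "then"), ("entry", "else"), ("then", "exit"), ("else", "exit")]
result = propagate({"entry": 100, "then": 30, "exit": 100}, edges)
```

In this example the two edges around `then` get weight 30 from the single-edge rule, and the two edges around `else` get 70 from the subtraction rule.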
      
      Before propagation starts, the pass builds (for each block) a list of
      unique predecessors and successors. This is necessary to handle
      identical edges in multiway branches. Since we visit all blocks and all
      edges of the CFG, it is cleaner to build these lists once at the start
      of the pass.
      
      Finally, the patch fixes the computation of relative line locations.
      The profiler emits lines relative to the function header. To discover
      it, we traverse the compilation unit looking for the subprogram
      corresponding to the function. The line number of that subprogram is the
      line where the function begins. That becomes line zero for all the
      relative locations.
      
      llvm-svn: 198972
      0accb3d2
    • Tom Roeder's avatar
      Space formatting fix for r198966. · 583a77e0
      Tom Roeder authored
      llvm-svn: 198971
      583a77e0
  3. Jan 10, 2014