- Jan 12, 2014
-
-
Venkatraman Govindaraju authored
llvm-svn: 199033
-
Saleem Abdulrasool authored
The implicit immediate 0 forms are assembly aliases, not distinct instruction encodings. Fix the initial implementation introduced in r198914 to an alias to avoid two separate instruction definitions for the same encoding. An InstAlias is insufficient in this case as the necessary due to the need to add a new additional operand for the implicit zero. By using the AsmPsuedoInst, fall back to the C++ code to transform the instruction to the equivalent _POST_IMM form, inserting the additional implicit immediate 0. llvm-svn: 199032
-
Venkatraman Govindaraju authored
llvm-svn: 199031
-
Jakob Stoklund Olesen authored
This is different from the argument passing convention which puts the first float argument in %f1. With this patch, all returned floats are treated as if the 'inreg' flag were set. This means multiple float return values get packed in %f0, %f1, %f2, ... Note that when returning a struct in registers, clang will set the 'inreg' flag on the return value, so that behavior is unchanged. This also happens when returning a float _Complex. llvm-svn: 199028
-
Joerg Sonnenberger authored
llvm-svn: 199027
-
Joerg Sonnenberger authored
assemble the various mul instructions. llvm-svn: 199026
-
Hans Wennborg authored
case when the lookup table doesn't have any holes. This means we can build a lookup table for switches like this: switch (x) { case 0: return 1; case 1: return 2; case 2: return 3; case 3: return 4; default: exit(1); } The default case doesn't yield a constant result here, but that doesn't matter, since a default result is only necessary for filling holes in the lookup table, and this table doesn't have any holes. This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB off the resulting clang binary. llvm-svn: 199025
-
Venkatraman Govindaraju authored
llvm-svn: 199024
-
Saleem Abdulrasool authored
A 32-bit immediate value can be formed from a constant expression and loaded into a register. Add support to emit this into an object file. Because this value is a constant, a relocation must *not* be produced for it. llvm-svn: 199023
-
- Jan 11, 2014
-
-
Benjamin Kramer authored
llvm-svn: 199016
-
Arnold Schwaighofer authored
I saw no compile or execution time regressions on x86_64 -mavx -O3. radar://13075509 llvm-svn: 199015
-
Venkatraman Govindaraju authored
[Sparc] Bundle instruction with delay slow and its filler. Now, we can use -verify-machineinstrs with SPARC backend. llvm-svn: 199014
-
Alp Toker authored
This commit prospectively brings the benefits of r198766 to older supported Python versions (2.5+). Tested with Python 2.6, 2.7, 3.1 and 3.3 (!) llvm-svn: 199009
-
Alp Toker authored
Patch by Jasper Neumann! llvm-svn: 199007
-
Alp Toker authored
On the other hand, exec(compile()) doesn't work in older Python versions in the 2.x series. This commit introduces exec(compile()) with a fallback to plain exec(). That'll hopefully hit the sweet spot in terms of version support. Followup to r198766 which added enhanced source locations for lit cfg parsing. llvm-svn: 199006
-
Chandler Carruth authored
pipeline string. Add tests that cover this now that we have execution dumping in the pass managers. llvm-svn: 199005
-
Chandler Carruth authored
mode that can be used to debug the execution of everything. No support for analyses here, that will come later. This already helps show parts of the opt commandline integration that isn't working. Tests of that will start using it as the bugs are fixed. llvm-svn: 199004
-
Chandler Carruth authored
llvm-svn: 199003
-
NAKAMURA Takumi authored
Excuse me, I hope msc16 builders would be fine till its end day. Introduce nullptr then. ;) llvm-svn: 199001
-
NAKAMURA Takumi authored
llvm-svn: 199000
-
NAKAMURA Takumi authored
XMM(s) are really spilling for targeting Win64. llvm-svn: 198999
-
Chandler Carruth authored
manager. I cannot emphasize enough that this is a WIP. =] I expect it to change a great deal as things stabilize, but I think its really important to get *some* functionality here so that the infrastructure can be tested more traditionally from the commandline. The current design is looking something like this: ./bin/opt -passes='module(pass_a,pass_b,function(pass_c,pass_d))' So rather than custom-parsed flags, there is a single flag with a string argument that is parsed into the pass pipeline structure. This makes it really easy to have nice structural properties that are very explicit. There is one obvious and important shortcut. You can start off the pipeline with a pass, and the minimal context of pass managers will be built around the entire specified pipeline. This makes the common case for tests super easy: ./bin/opt -passes=instcombine,sroa,gvn But this won't introduce any of the complexity of the fully inferred old system -- we only ever do this for the *entire* argument, and we only look at the first pass. If the other passes don't fit in the pass manager selected it is a hard error. The other interesting aspect here is that I'm not relying on any registration facilities. Such facilities may be unavoidable for supporting plugins, but I have alternative ideas for plugins that I'd like to try first. My plan is essentially to build everything without registration until we hit an absolute requirement. Instead of registration of pass names, there will be a library dedicated to parsing pass names and the pass pipeline strings described above. Currently, this is directly embedded into opt for simplicity as it is very early, but I plan to eventually pull this into a library that opt, bugpoint, and even Clang can depend on. It should end up as a good home for things like the existing PassManagerBuilder as well. There are a bunch of FIXMEs in the code for the parts of this that are just stubbed out to make the patch more incremental. A quick list of what's coming up directly after this: - Support for function passes and building the structured nesting. - Support for printing the pass structure, and FileCheck tests of all of this code. - The .def-file based pass name parsing. - IR priting passes and the corresponding tests. Some obvious things that I'm not going to do right now, but am definitely planning on as the pass manager work gets a bit further: - Pull the parsing into library, including the builders. - Thread the rest of the target stuff into the new pass manager. - Wire support for the new pass manager up to llc. - Plugin support. Some things that I'd like to have, but are significantly lower on my priority list. I'll get to these eventually, but they may also be places where others want to contribute: - Adding nice error reporting for broken pass pipeline descriptions. - Typo-correction for pass names. llvm-svn: 198998
-
Juergen Ributzka authored
Use separate callee-save masks for XMM and YMM registers for anyregcc on X86 and select the proper mask depending on the target cpu we compile for. llvm-svn: 198985
-
Eric Christopher authored
llvm-svn: 198981
-
Eric Christopher authored
llvm-svn: 198980
-
Eric Christopher authored
llvm-svn: 198979
-
Eric Christopher authored
llvm-svn: 198978
-
Diego Novillo authored
1- Use the line_iterator class to read profile files. 2- Allow comments in profile file. Lines starting with '#' are completely ignored while reading the profile. 3- Add parsing support for discriminators and indirect call samples. Our external profiler can emit more profile information that we are currently not handling. This patch does not add new functionality to support this information, but it allows profile files to provide it. I will add actual support later on (for at least one of these features, I need support for DWARF discriminators in Clang). A sample line may contain the following additional information: Discriminator. This is used if the sampled program was compiled with DWARF discriminator support (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This is currently only emitted by GCC and we just ignore it. Potential call targets and samples. If present, this line contains a call instruction. This models both direct and indirect calls. Each called target is listed together with the number of samples. For example, 130: 7 foo:3 bar:2 baz:7 The above means that at relative line offset 130 there is a call instruction that calls one of foo(), bar() and baz(). With baz() being the relatively more frequent call target. Differential Revision: http://llvm-reviews.chandlerc.com/D2355 4- Simplify format of profile input file. This implements earlier suggestions to simplify the format of the sample profile file. The symbol table is not necessary and function profiles do not need to know the number of samples in advance. Differential Revision: http://llvm-reviews.chandlerc.com/D2419 llvm-svn: 198973
-
Diego Novillo authored
This adds a propagation heuristic to convert instruction samples into branch weights. It implements a similar heuristic to the one implemented by Dehao Chen on GCC. The propagation proceeds in 3 phases: 1- Assignment of block weights. All the basic blocks in the function are initial assigned the same weight as their most frequently executed instruction. 2- Creation of equivalence classes. Since samples may be missing from blocks, we can fill in the gaps by setting the weights of all the blocks in the same equivalence class to the same weight. To compute the concept of equivalence, we use dominance and loop information. Two blocks B1 and B2 are in the same equivalence class if B1 dominates B2, B2 post-dominates B1 and both are in the same loop. 3- Propagation of block weights into edges. This uses a simple propagation heuristic. The following rules are applied to every block B in the CFG: - If B has a single predecessor/successor, then the weight of that edge is the weight of the block. - If all the edges are known except one, and the weight of the block is already known, the weight of the unknown edge will be the weight of the block minus the sum of all the known edges. If the sum of all the known edges is larger than B's weight, we set the unknown edge weight to zero. - If there is a self-referential edge, and the weight of the block is known, the weight for that edge is set to the weight of the block minus the weight of the other incoming edges to that block (if known). Since this propagation is not guaranteed to finalize for every CFG, we only allow it to proceed for a limited number of iterations (controlled by -sample-profile-max-propagate-iterations). It currently uses the same GCC default of 100. Before propagation starts, the pass builds (for each block) a list of unique predecessors and successors. This is necessary to handle identical edges in multiway branches. Since we visit all blocks and all edges of the CFG, it is cleaner to build these lists once at the start of the pass. Finally, the patch fixes the computation of relative line locations. The profiler emits lines relative to the function header. To discover it, we traverse the compilation unit looking for the subprogram corresponding to the function. The line number of that subprogram is the line where the function begins. That becomes line zero for all the relative locations. llvm-svn: 198972
-
Tom Roeder authored
llvm-svn: 198971
-
- Jan 10, 2014
-
-
Roman Divacky authored
llvm-svn: 198969
-
Tom Roeder authored
llvm-svn: 198966
-
Tom Roeder authored
is needed for LLVMgold to load in ld. llvm-svn: 198965
-
Rafael Espindola authored
llvm-svn: 198960
-
Rafael Espindola authored
llvm-svn: 198959
-
Rafael Espindola authored
llvm-svn: 198958
-
Rafael Espindola authored
llvm-svn: 198955
-
Duncan P. N. Exon Smith authored
llvm-svn: 198954
-
Arnold Schwaighofer authored
for (i = 0; i < N; ++i) A[i * Stride1] += B[i * Stride2]; We take loops like this and check that the symbolic strides 'Strided1/2' are one and drop to the scalar loop if they are not. This is currently disabled by default and hidden behind the flag 'enable-mem-access-versioning'. radar://13075509 llvm-svn: 198950
-
Arnold Schwaighofer authored
An upcoming loop vectorizer commit will want to replace a SCEVUnknown(Value*) by a SCEVConstant. This commit modifies the SCEVParameterRewriter to support this. The SCEVParameterRewriter constructor can optionally specify to follow this behavior. llvm-svn: 198949
-