- Jan 13, 2014
-
-
Chandler Carruth authored
llvm-svn: 199080
-
Chandler Carruth authored
This moves the old pass creation functionality to its own header and updates the callers of that routine. Then it adds a new PM supporting bitcode writer to the header file, and wires that up in the opt tool. A test is added that round-trips code into bitcode and back out using the new pass manager. llvm-svn: 199078
-
Kevin Qin authored
llvm-svn: 199070
-
Kevin Qin authored
This patch covered 2 more scenarios: 1. Two operands of shuffle_vector are the same, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> 2. One of operands is undef, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> After this patch, perm instructions will have chance to be emitted instead of lots of INS. llvm-svn: 199069
-
Saleem Abdulrasool authored
The target specific parser should return `false' if the target AsmParser handles the directive, and `true' if the generic parser should handle the directive. Many of the target specific directive handlers would `return Error' which does not follow these semantics. This change simply changes the target specific routines to conform to the semantis of the ParseDirective correctly. Conformance to the semantics improves diagnostics emitted for the invalid directives. X86 is taken as a sample to ensure that multiple diagnostics are not presented for a single error. llvm-svn: 199068
-
- Jan 12, 2014
-
-
Jakob Stoklund Olesen authored
Targets like SPARC and MIPS have delay slots and normally bundle the delay slot instruction with the corresponding terminator. Teach isBlockOnlyReachableByFallthrough to find any MBB operands on bundled terminators so SPARC doesn't need to specialize this function. llvm-svn: 199061
-
NAKAMURA Takumi authored
raw_fd_ostream: Don't change STDERR to O_BINARY, or w*printf() (in assert()) would barf wide chars after llvm::errs(). llvm-svn: 199057
-
NAKAMURA Takumi authored
FIXME: It should be generic to C++11. For now, it is dedicated to mingw-w64. llvm-svn: 199052
-
Nico Rieck authored
Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050
-
Chandler Carruth authored
This implements the legacy passes in terms of the new ones. It adds basic testing using explicit runs of the passes. Next up will be wiring the basic output mechanism of opt up when the new pass manager is engaged unless bitcode writing is requested. llvm-svn: 199049
-
Chandler Carruth authored
API is exposed. This removes the support for deleting the ostream, switches the member and constructor order arround to be consistent with the creation routines, and switches to using references. llvm-svn: 199047
-
Chandler Carruth authored
Nothing was using the ability of the pass to delete the raw_ostream it printed to, and nothing was trying to pass it a pointer to the raw_ostream. Also, the function variant had a different order of arguments from all of the others which was just really confusing. Now the interface accepts a reference, doesn't offer to delete it, and uses a consistent order. The implementation of the printing passes haven't been updated with this simplification, this is just the API switch. llvm-svn: 199044
-
Chandler Carruth authored
the heavy factoring needed to share logic between the new pass manager and the old. llvm-svn: 199043
-
Chandler Carruth authored
name to match the source file which I got earlier. Update the include sites. Also modernize the comments in the header to use the more recommended doxygen style. llvm-svn: 199041
-
Saleem Abdulrasool authored
An improper qualifier would result in a superfluous error due to the parser not consuming the remainder of the statement. Simply consume the remainder of the statement to avoid the error. llvm-svn: 199035
-
Venkatraman Govindaraju authored
llvm-svn: 199033
-
Saleem Abdulrasool authored
The implicit immediate 0 forms are assembly aliases, not distinct instruction encodings. Fix the initial implementation introduced in r198914 to an alias to avoid two separate instruction definitions for the same encoding. An InstAlias is insufficient in this case as the necessary due to the need to add a new additional operand for the implicit zero. By using the AsmPsuedoInst, fall back to the C++ code to transform the instruction to the equivalent _POST_IMM form, inserting the additional implicit immediate 0. llvm-svn: 199032
-
Venkatraman Govindaraju authored
llvm-svn: 199031
-
Jakob Stoklund Olesen authored
This is different from the argument passing convention which puts the first float argument in %f1. With this patch, all returned floats are treated as if the 'inreg' flag were set. This means multiple float return values get packed in %f0, %f1, %f2, ... Note that when returning a struct in registers, clang will set the 'inreg' flag on the return value, so that behavior is unchanged. This also happens when returning a float _Complex. llvm-svn: 199028
-
Joerg Sonnenberger authored
assemble the various mul instructions. llvm-svn: 199026
-
Hans Wennborg authored
case when the lookup table doesn't have any holes. This means we can build a lookup table for switches like this: switch (x) { case 0: return 1; case 1: return 2; case 2: return 3; case 3: return 4; default: exit(1); } The default case doesn't yield a constant result here, but that doesn't matter, since a default result is only necessary for filling holes in the lookup table, and this table doesn't have any holes. This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB off the resulting clang binary. llvm-svn: 199025
-
Venkatraman Govindaraju authored
llvm-svn: 199024
-
Saleem Abdulrasool authored
A 32-bit immediate value can be formed from a constant expression and loaded into a register. Add support to emit this into an object file. Because this value is a constant, a relocation must *not* be produced for it. llvm-svn: 199023
-
- Jan 11, 2014
-
-
Arnold Schwaighofer authored
I saw no compile or execution time regressions on x86_64 -mavx -O3. radar://13075509 llvm-svn: 199015
-
Venkatraman Govindaraju authored
[Sparc] Bundle instruction with delay slow and its filler. Now, we can use -verify-machineinstrs with SPARC backend. llvm-svn: 199014
-
Alp Toker authored
Patch by Jasper Neumann! llvm-svn: 199007
-
Chandler Carruth authored
mode that can be used to debug the execution of everything. No support for analyses here, that will come later. This already helps show parts of the opt commandline integration that isn't working. Tests of that will start using it as the bugs are fixed. llvm-svn: 199004
-
NAKAMURA Takumi authored
Excuse me, I hope msc16 builders would be fine till its end day. Introduce nullptr then. ;) llvm-svn: 199001
-
Juergen Ributzka authored
Use separate callee-save masks for XMM and YMM registers for anyregcc on X86 and select the proper mask depending on the target cpu we compile for. llvm-svn: 198985
-
Eric Christopher authored
llvm-svn: 198981
-
Eric Christopher authored
llvm-svn: 198980
-
Eric Christopher authored
llvm-svn: 198979
-
Diego Novillo authored
1- Use the line_iterator class to read profile files. 2- Allow comments in profile file. Lines starting with '#' are completely ignored while reading the profile. 3- Add parsing support for discriminators and indirect call samples. Our external profiler can emit more profile information that we are currently not handling. This patch does not add new functionality to support this information, but it allows profile files to provide it. I will add actual support later on (for at least one of these features, I need support for DWARF discriminators in Clang). A sample line may contain the following additional information: Discriminator. This is used if the sampled program was compiled with DWARF discriminator support (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This is currently only emitted by GCC and we just ignore it. Potential call targets and samples. If present, this line contains a call instruction. This models both direct and indirect calls. Each called target is listed together with the number of samples. For example, 130: 7 foo:3 bar:2 baz:7 The above means that at relative line offset 130 there is a call instruction that calls one of foo(), bar() and baz(). With baz() being the relatively more frequent call target. Differential Revision: http://llvm-reviews.chandlerc.com/D2355 4- Simplify format of profile input file. This implements earlier suggestions to simplify the format of the sample profile file. The symbol table is not necessary and function profiles do not need to know the number of samples in advance. Differential Revision: http://llvm-reviews.chandlerc.com/D2419 llvm-svn: 198973
-
Diego Novillo authored
This adds a propagation heuristic to convert instruction samples into branch weights. It implements a similar heuristic to the one implemented by Dehao Chen on GCC. The propagation proceeds in 3 phases: 1- Assignment of block weights. All the basic blocks in the function are initial assigned the same weight as their most frequently executed instruction. 2- Creation of equivalence classes. Since samples may be missing from blocks, we can fill in the gaps by setting the weights of all the blocks in the same equivalence class to the same weight. To compute the concept of equivalence, we use dominance and loop information. Two blocks B1 and B2 are in the same equivalence class if B1 dominates B2, B2 post-dominates B1 and both are in the same loop. 3- Propagation of block weights into edges. This uses a simple propagation heuristic. The following rules are applied to every block B in the CFG: - If B has a single predecessor/successor, then the weight of that edge is the weight of the block. - If all the edges are known except one, and the weight of the block is already known, the weight of the unknown edge will be the weight of the block minus the sum of all the known edges. If the sum of all the known edges is larger than B's weight, we set the unknown edge weight to zero. - If there is a self-referential edge, and the weight of the block is known, the weight for that edge is set to the weight of the block minus the weight of the other incoming edges to that block (if known). Since this propagation is not guaranteed to finalize for every CFG, we only allow it to proceed for a limited number of iterations (controlled by -sample-profile-max-propagate-iterations). It currently uses the same GCC default of 100. Before propagation starts, the pass builds (for each block) a list of unique predecessors and successors. This is necessary to handle identical edges in multiway branches. Since we visit all blocks and all edges of the CFG, it is cleaner to build these lists once at the start of the pass. Finally, the patch fixes the computation of relative line locations. The profiler emits lines relative to the function header. To discover it, we traverse the compilation unit looking for the subprogram corresponding to the function. The line number of that subprogram is the line where the function begins. That becomes line zero for all the relative locations. llvm-svn: 198972
-
- Jan 10, 2014
-
-
Rafael Espindola authored
llvm-svn: 198958
-
Rafael Espindola authored
llvm-svn: 198955
-
Duncan P. N. Exon Smith authored
llvm-svn: 198954
-
Arnold Schwaighofer authored
for (i = 0; i < N; ++i) A[i * Stride1] += B[i * Stride2]; We take loops like this and check that the symbolic strides 'Strided1/2' are one and drop to the scalar loop if they are not. This is currently disabled by default and hidden behind the flag 'enable-mem-access-versioning'. radar://13075509 llvm-svn: 198950
-
Artyom Skrobov authored
llvm-svn: 198945
-
Saleem Abdulrasool authored
The disassembler would no longer be able to disambiguage between the two variants (explicit immediate #0 vs implicit, omitted #0) for the ldrt, strt, ldrbt, strbt mnemonics as both versions indicated the disassembler routine. llvm-svn: 198944
-