- Aug 07, 2012
-
-
Jakob Stoklund Olesen authored
Compare the critical paths of the two traces through an if-conversion candidate. If the difference is larger than the branch brediction penalty, reject the if-conversion. If would never pay. llvm-svn: 161433
-
Jim Grosbach authored
llvm-svn: 161430
-
Rafael Espindola authored
a use or a BB, but it is inline in the handling of the invoke instruction. This patch refactors it so that it can be used in other cases. For example, in define i32 @f(i32 %x) { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb2, label %bb1 bb1: br label %bb2 bb2: %cond = phi i32 [ %x, %bb0 ], [ 0, %bb1 ] %foo = add i32 %cond, %x ret i32 %foo } GVN should be able to replace %x with 0 in any use that is dominated by the true edge out of bb0. In the above example the only such use is the one in the phi. llvm-svn: 161429
-
Hal Finkel authored
Thanks to Alex Rosenberg for the suggestion. llvm-svn: 161428
-
Alexey Samsonov authored
and "instruction address -> file/line" lookup. Instead of plain collection of rows, debug line table for compilation unit is now treated as the number of row ranges, describing sequences (series of contiguous machine instructions). The sequences are not always listed in the order of increasing address, so previously used std::lower_bound() sometimes produced wrong results. Now the instruction address lookup consists of two stages: finding the correct sequence, and searching for address in range of rows for this sequence. llvm-svn: 161414
-
Benjamin Kramer authored
We give a bonus for every argument because the argument setup is not needed anymore when the function is inlined. With this patch we interpret byval arguments as a compact representation of many arguments. The byval argument setup is implemented in the backend as an inline memcpy, so to model the cost as accurately as possible we take the number of pointer-sized elements in the byval argument and give a bonus of 2 instructions for every one of those. The bonus is capped at 8 elements, which is the number of stores at which the x86 backend switches from an expanded inline memcpy to a real memcpy. It would be better to use the real memcpy threshold from the backend, but it's not available via TargetData. This change brings the performance of c-ray in line with gcc 4.7. The included test case tries to reproduce the c-ray problem to catch regressions for this benchmark early, its performance is dominated by the inline decision of a specific call. This only has a small impact on most code, more on x86 and arm than on x86_64 due to the way the ABI works. When building LLVM for x86 it gives a small inline cost boost to virtually any function using StringRef or STL allocators, but only a 0.01% increase in overall binary size. The size of gcc compiled by clang actually shrunk by a couple bytes with this patch applied, but not significantly. llvm-svn: 161413
-
Chandler Carruth authored
instsimplify+inline strategy. The crux of the problem is that instsimplify was reasonably relying on an invariant that is true within any single function, but is no longer true mid-inline the way we use it. This invariant is that an argument pointer != a local (alloca) pointer. The fix is really light weight though, and allows instsimplify to be resiliant to these situations: when checking the relation ships to function arguments, ensure that the argumets come from the same function. If they come from different functions, then none of these assumptions hold. All credit to Benjamin Kramer for coming up with this clever solution to the problem. llvm-svn: 161410
-
Chandler Carruth authored
Previously, MBP essentially aligned every branch target it could. This bloats code quite a bit, especially non-looping code which has no real reason to prefer aligned branch targets so heavily. As Andy said in review, it's still a bit odd to do this without a real cost model, but this at least has much more plausible heuristics. Fixes PR13265. llvm-svn: 161409
-
Manman Ren authored
If the result of a common subexpression is used at all uses of the candidate expression, CSE should not increase the live range of the common subexpression. rdar://11393714 and rdar://11819721 llvm-svn: 161396
-
Bill Wendling authored
--- Reverse-merging r161371 into '.': U include/llvm/Target/TargetData.h U lib/Target/TargetData.cpp llvm-svn: 161394
-
Jack Carter authored
initialize fields of the class that it used. The result was nonsense code. Before: 0000000000000000 <foo>: 0: 00441100 0x441100 4: 03e00008 jr ra 8: 00000000 nop After: 0000000000000000 <foo>: 0: 00041000 sll v0,a0,0x0 4: 03e00008 jr ra 8: 00000000 nop llvm-svn: 161377
-
Bill Wendling authored
llvm-svn: 161371
-
Andrew Trick authored
This allows codegen passes to query properties like InstrItins->SchedModel->IssueWidth. It also ensure's that computeOperandLatency returns the X86 defaults for loads and "high latency ops". This should have no significant impact on existing schedulers because X86 defaults happen to be the same as global defaults. llvm-svn: 161370
-
Jack Carter authored
I hit this in a very large program (spirit.cpp), but have not figured out how to make a small make check test for it. llvm-svn: 161366
-
Jack Carter authored
were using a class defined for 32 bit instructions and thus the instruction was for addiu instead of daddiu. This was corrected by adding the instruction opcode as a field in the base class to be filled in by the defs. llvm-svn: 161359
-
Bill Wendling authored
llvm-svn: 161356
-
Jakob Stoklund Olesen authored
llvm-svn: 161354
-
- Aug 06, 2012
-
-
Bill Wendling authored
When the command line target options were removed from the LLVM libraries, LTO lost its ability to specify things like `-disable-fp-elim'. Add this back by adding the command line variables to the `lto' project. <rdar://problem/12038729> llvm-svn: 161353
-
Jack Carter authored
These 2 relocations gain access to the highest and the second highest 16 bits of a 64 bit object. R_MIPS_HIGHER %higher(A+S) The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ]. R_MIPS_HIGHEST %highest(A+S) The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ]. llvm-svn: 161348
-
Hal Finkel authored
The MFTB instruction itself is being phased out, and its functionality is provided by MFSPR. According to the ISA docs, using MFSPR works on all known chips except for the 601 (which did not have a timebase register anyway) and the POWER3. Thanks to Adhemerval Zanella for pointing this out! llvm-svn: 161346
-
Eric Christopher authored
Patch by David Hill. llvm-svn: 161344
-
Simon Atanasyan authored
The patch reviewed by Akira Hatanaka. llvm-svn: 161332
-
Jakob Stoklund Olesen authored
llvm-svn: 161329
-
Roman Divacky authored
llvm-svn: 161328
-
Craig Topper authored
Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318
-
- Aug 05, 2012
-
-
Craig Topper authored
llvm-svn: 161307
-
Craig Topper authored
llvm-svn: 161306
-
Craig Topper authored
Use a COPY node instead of an explicit MOVA opcode in the custom insterter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevent wasted identity moves. llvm-svn: 161305
-
- Aug 04, 2012
-
-
Hal Finkel authored
On PPC64, this can be done with a simple TableGen pattern. To enable this, I've added the (otherwise missing) readcyclecounter SDNode definition to TargetSelectionDAG.td. llvm-svn: 161302
-
Anton Korobeynikov authored
llvm-svn: 161301
-
Anton Korobeynikov authored
(this corresponds by spilling/reloading regs in DTriple / DQuad reg classes). No testcase, found by inspection. llvm-svn: 161300
-
Anton Korobeynikov authored
were missed for no reason. This fixes PR13377 llvm-svn: 161299
-
Bill Wendling authored
llvm-svn: 161298
-
Benjamin Kramer authored
llvm-svn: 161297
-
Benjamin Kramer authored
Postpone the deletion of the old name in StructType::setName to allow using a slice of the old name. Fixes PR13522. Add a rudimentary unit test to exercise the behavior. llvm-svn: 161296
-
NAKAMURA Takumi authored
[CMake] add_lit_target: Remove comments about add_dependencies. It is not a bug in cmake that add_custom_target(DEPENDS) would not accept targets but file-level dependencies. llvm-svn: 161295
-
NAKAMURA Takumi authored
FIXME: Fix several tests on i686-win32 due to lacking of many libraries. llvm-svn: 161292
-
Jakob Stoklund Olesen authored
TwoAddressInstructionPass doesn't remat any more. llvm-svn: 161285
-
Jakob Stoklund Olesen authored
llvm-svn: 161284
-
Bob Wilson authored
This patch is mostly just refactoring a bunch of copy-and-pasted code, but it also adds a check that the call instructions are readnone or readonly. That check was already present for sin, cos, sqrt, log2, and exp2 calls, but it was missing for the rest of the builtins being handled in this code. llvm-svn: 161282
-