- Jan 29, 2014
-
-
David Woodhouse authored
llvm-svn: 200349
-
David Woodhouse authored
llvm-svn: 200348
-
David Woodhouse authored
Needed to fix PR18303 to correctly re-encode the instruction if it is relaxed. We keep a copy of the MCSubtargetInfo to make sure that we are not effected by future changes to the subtarget info coming from the assembler (e.g. when parsing .code 16 directived). llvm-svn: 200347
-
David Woodhouse authored
Add MCSubtargetInfo parameter virtual void EmitInstToFragment(const MCInst &Inst, const MCSubtargetInfo &); virtual void EmitInstToData(const MCInst &Inst, const MCSubtargetInfo &); llvm-svn: 200346
-
David Woodhouse authored
llvm-svn: 200345
-
- Jan 28, 2014
-
-
Timur Iskhodzhanov authored
llvm-svn: 200341
-
Timur Iskhodzhanov authored
Reviewed at http://llvm-reviews.chandlerc.com/D2232 llvm-svn: 200340
-
Owen Anderson authored
llvm-svn: 200335
-
Nick Kledzik authored
llvm-svn: 200333
-
Matheus Almeida authored
As opposed to GCC/GAS the default ABI for Mips64 is n64. Compatibility bit should be set if o32 ABI is used when targeting Mips64. llvm-svn: 200332
-
Nick Kledzik authored
Makes it easy to use BumpPtrAllocator to make a copy of StringRef strings. llvm-svn: 200331
-
Gautam Chakrabarti authored
The code was missing the case for aggregate parameters and hence was emitting them as .b0 type. Also fixed a couple of comments. llvm-svn: 200325
-
Andrea Di Biagio authored
This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) llvm-svn: 200324
-
Adrian Prantl authored
llvm-svn: 200323
-
Rafael Espindola authored
When simplifycfg moves an instruction, it must drop metadata it doesn't know is still valid with the preconditions changes. In particular, it must drop the range and tbaa metadata. The patch implements this with an utility function to drop all metadata not in a white list. llvm-svn: 200322
-
Aaron Ballman authored
The llvm_headers_do_not_build project needs to be excluded from the default build, otherwise it gets built (at least in Visual Studio 2013). Thanks to chapuni200000 for help with this in IRC! llvm-svn: 200321
-
Andrea Di Biagio authored
Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313
-
NAKAMURA Takumi authored
[CMake] llvm_update_compile_flags(name) doesn't require source files. TARGET PROPERTY SOURCES has them. llvm-svn: 200311
-
Iain Sandoe authored
At present, this handles .tc (error) and needs to be expanded to deal properly with .machine llvm-svn: 200309
-
NAKAMURA Takumi authored
It is the final step to deprecate contextual CMAKE_CXX_FLAGS. llvm-svn: 200303
-
NAKAMURA Takumi authored
[CMake] Enhance llvm_update_compile_flags(name sources) to handle LLVM_REQUIRES_EH and LLVM_REQUIRES_RTTI. LLVM_REQUIRES_EH implies LLVM_REQUIRES_RTTI. It is as same behavior as Makefile.rule's. llvm/examples/ExceptionDemo is affected. (It was built with -fno-rtti.) For MSVC, Remove flags like "/EHsc /GR" in HandleLLVMOptions, or CL.EXE complains with flags like "/GR /GR-". llvm_update_compile_flags() updates source file property if the target contains *.c. COMPILE_FLAGS in target properties affects both C++ and C! LLVM_NO_RTTI is deprecated. It was introduced by me and was my mistake. llvm-svn: 200301
-
NAKAMURA Takumi authored
llvm-svn: 200300
-
NAKAMURA Takumi authored
With this tweaks, also unittests are compiled with -ffunction-sections. It's hard to control contextual CMAKE_CXX_FLAGS. We should get rid of twiddling it as possible. llvm-svn: 200299
-
NAKAMURA Takumi authored
llvm-svn: 200298
-
NAKAMURA Takumi authored
llvm-svn: 200297
-
Chandler Carruth authored
vectorizer, placing it behind an off-by-default flag. It turns out that block frequency isn't what we want at all, here or elsewhere. This has been I think a nagging feeling for several of us working with it, but Arnold has given some really nice simple examples where the results are so comprehensively wrong that they aren't useful. I'm planning to email the dev list with a summary of why its not really useful and a couple of ideas about how to better structure these types of heuristics. llvm-svn: 200294
-
Hal Finkel authored
GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288
-
Craig Topper authored
Improve handling of EnforceSmallerThan. Remove all types that are smaller from the larger set not just the smallest type from the smaller set. Ensure 'smaller' vectors have the same or fewer total bits. Similar for 'larger' vectors. llvm-svn: 200287
-
Timur Iskhodzhanov authored
llvm-svn: 200285
-
Michel Danzer authored
Fixes half a dozen piglit tests with radeonsi. Reviewed-by:
Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283
-
Jakob Stoklund Olesen authored
Also emit the stubs that were generated for references to typeinfo symbols. llvm-svn: 200282
-
Reid Kleckner authored
Summary: I searched Transforms/ and Analysis/ for 'ByVal' and updated those call sites to check for inalloca if appropriate. I added tests for any change that would allow an optimization to fire on inalloca. Reviewers: nlewycky Differential Revision: http://llvm-reviews.chandlerc.com/D2449 llvm-svn: 200281
-
Reid Kleckner authored
This avoids miscompiling MS inline asm in LLVM where we have to infer clobbers. Test case forthcoming in Clang. llvm-svn: 200279
-
Chandler Carruth authored
LCSSA from it caused a crasher with the LoopUnroll pass. This crasher is really nasty. We destroy LCSSA form in a suprising way. When unrolling a loop into an outer loop, we not only need to restore LCSSA form for the outer loop, but for all children of the outer loop. This is somewhat obvious in retrospect, but hey! While this seems pretty heavy-handed, it's not that bad. Fundamentally, we only do this when we unroll a loop, which is already a heavyweight operation. We're unrolling all of these hypothetical inner loops as well, so their size and complexity is already on the critical path. This is just adding another pass over them to re-canonicalize. I have a test case from PR18616 that is great for reproducing this, but pretty useless to check in as it relies on many 10s of nested empty loops that get unrolled and deleted in just the right order. =/ What's worse is that investigating this has exposed another source of failure that is likely to be even harder to test. I'll try to come up with test cases for these fixes, but I want to get the fixes into the tree first as they're causing crashes in the wild. llvm-svn: 200273
-
Juergen Ributzka authored
[TLI] Add a new hook to TargetLowering to query the target if a load of a constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271
-
Arnold Schwaighofer authored
The vectorizer takes a loop like this and widens all instructions except for the store. The stores are scalarized/unrolled and hidden behind an "if" block. for (i = 0; i < 128; ++i) { if (a[i] < 10) a[i] += val; } for (i = 0; i < 128; i+=2) { v = a[i:i+1]; v0 = (extract v, 0) + 10; v1 = (extract v, 1) + 10; if (v0 < 10) a[i] = v0; if (v1 < 10) a[i] = v1; } The vectorizer relies on subsequent optimizations to sink instructions into the conditional block where they are anticipated. The flag "vectorize-num-stores-pred" controls whether and how many stores to handle this way. Vectorization of conditional stores is disabled per default for now. This patch also adds a change to the heuristic when the flag "enable-loadstore-runtime-unroll" is enabled (off by default). It unrolls small loops until load/store ports are saturated. This heuristic uses TTI's getMaxUnrollFactor as a measure for load/store ports. I also added a second flag -enable-cond-stores-vec. It will enable vectorization of conditional stores. But there is no cost model for vectorization of conditional stores in place yet so this will not do good at the moment. rdar://15892953 Results for x86-64 -O3 -mavx +/- -mllvm -enable-loadstore-runtime-unroll -vectorize-num-stores-pred=1 (before the BFI change): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.35% (maze3() is identical but 10% slower) Applications/siod/siod 2.18% Performance improvements: mesa -4.42% libquantum -4.15% With a patch that slightly changes the register heuristics (by subtracting the induction variable on both sides of the register pressure equation, as the induction variable is probably not really unrolled): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.73% Applications/siod/siod 1.97% Performance Improvements: libquantum -13.05% (we now also unroll quantum_toffoli) mesa -4.27% llvm-svn: 200270
-
Eric Christopher authored
code to see if we're emitting a function into a non-default text section. This is still a less-than-ideal solution, but more contained than r199871 to determine whether or not we're emitting code into an array of comdat sections. llvm-svn: 200269
-
Eric Christopher authored
llvm-svn: 200264
-
Manman Ren authored
uint32. When folding branches to common destination, the updated branch weights can exceed uint32 by more than factor of 2. We should keep halving the weights until they can fit into uint32. llvm-svn: 200262
-
- Jan 27, 2014
-
-
Mark Seaborn authored
Without this fix, WaitResult is not defined. llvm-svn: 200259
-