- Jan 29, 2014
-
-
Renato Golin authored
After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also remove the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and update some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388
-
Venkatraman Govindaraju authored
This makes MCAsmInfo::getExprForFDESymbol() a virtual function and overrides it in SparcMCAsmInfo. llvm-svn: 200376
-
NAKAMURA Takumi authored
It was incompatible with --target=i686-win32. llvm-svn: 200375
-
Venkatraman Govindaraju authored
Otherwise, assembler (gas) fails to assemble them with error message "operation combines symbols in different segments". This is because MC computes pc_rel entries with subtract expression between labels from different sections. llvm-svn: 200373
-
Chandler Carruth authored
because of the inside-out run of LoopSimplify in the LoopPassManager and the fact that LoopSimplify couldn't be "preserved" across two independent LoopPassManagers. Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI node because it thought it was rewriting (via SCEV) the incoming value to a loop invariant value. While it may well be invariant for the current loop, it may be rewritten in terms of an enclosing loop's values. This in and of itself is fine, as the LCSSA PHI node in the enclosing loop for the inner loop value we're rewriting will have its own LCSSA PHI node if used outside of the enclosing loop. With me so far? Well, the current loop and the enclosing loop may share an exiting block and exit block, and when they do they also share LCSSA PHI nodes. In this case, its not valid to RAUW through the LCSSA PHI node. Expected crazy test included. llvm-svn: 200372
-
Arnold Schwaighofer authored
When estimating register pressure, don't count the induction variable mulitple times. It is unlikely to be unrolled. This is currently disabled and hidden behind a flag ("enable-ind-var-reg-heur"). llvm-svn: 200371
-
Venkatraman Govindaraju authored
[SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9. llvm-svn: 200368
-
Rafael Espindola authored
This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. llvm-svn: 200367
-
Kevin Qin authored
When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365
-
Mark Seaborn authored
The default value of this attribute is PTHREAD_PROCESS_PRIVATE, so there's no point in calling pthread_mutexattr_setpshared() to set that. See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_getpshared.html This removes some ifdefs that tend to need to be extended for other platforms (e.g. for NaCl). Note that this call was in the first implementation of Mutex, added in r22403, so it doesn't appear to have been added in response to a performance problem. Differential Revision: http://llvm-reviews.chandlerc.com/D2633 llvm-svn: 200360
-
David Majnemer authored
Use an RAII object Instead of inserting a call to AsmLexer::setSkipSpace(true) in all error paths. No functional change. llvm-svn: 200358
-
Rafael Espindola authored
This will be better with c++11, but right now file_magic converts to bool, which makes the api really easy to misuse. llvm-svn: 200357
-
David Woodhouse authored
Oops. Don't do build tests on patches like that with --enable-targets=x86_64 llvm-svn: 200355
-
David Woodhouse authored
The subtarget info is explicitly passed to the EncodeInstruction method and we should use that subtarget info to influence any encoding decisions. llvm-svn: 200350
-
David Woodhouse authored
llvm-svn: 200349
-
David Woodhouse authored
llvm-svn: 200348
-
David Woodhouse authored
Needed to fix PR18303 to correctly re-encode the instruction if it is relaxed. We keep a copy of the MCSubtargetInfo to make sure that we are not effected by future changes to the subtarget info coming from the assembler (e.g. when parsing .code 16 directived). llvm-svn: 200347
-
David Woodhouse authored
Add MCSubtargetInfo parameter virtual void EmitInstToFragment(const MCInst &Inst, const MCSubtargetInfo &); virtual void EmitInstToData(const MCInst &Inst, const MCSubtargetInfo &); llvm-svn: 200346
-
David Woodhouse authored
llvm-svn: 200345
-
- Jan 28, 2014
-
-
Timur Iskhodzhanov authored
Reviewed at http://llvm-reviews.chandlerc.com/D2232 llvm-svn: 200340
-
Matheus Almeida authored
As opposed to GCC/GAS the default ABI for Mips64 is n64. Compatibility bit should be set if o32 ABI is used when targeting Mips64. llvm-svn: 200332
-
Gautam Chakrabarti authored
The code was missing the case for aggregate parameters and hence was emitting them as .b0 type. Also fixed a couple of comments. llvm-svn: 200325
-
Andrea Di Biagio authored
This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) llvm-svn: 200324
-
Adrian Prantl authored
llvm-svn: 200323
-
Rafael Espindola authored
When simplifycfg moves an instruction, it must drop metadata it doesn't know is still valid with the preconditions changes. In particular, it must drop the range and tbaa metadata. The patch implements this with an utility function to drop all metadata not in a white list. llvm-svn: 200322
-
Andrea Di Biagio authored
Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313
-
Iain Sandoe authored
At present, this handles .tc (error) and needs to be expanded to deal properly with .machine llvm-svn: 200309
-
Chandler Carruth authored
vectorizer, placing it behind an off-by-default flag. It turns out that block frequency isn't what we want at all, here or elsewhere. This has been I think a nagging feeling for several of us working with it, but Arnold has given some really nice simple examples where the results are so comprehensively wrong that they aren't useful. I'm planning to email the dev list with a summary of why its not really useful and a couple of ideas about how to better structure these types of heuristics. llvm-svn: 200294
-
Hal Finkel authored
GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288
-
Timur Iskhodzhanov authored
llvm-svn: 200285
-
Michel Danzer authored
Fixes half a dozen piglit tests with radeonsi. Reviewed-by:
Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283
-
Jakob Stoklund Olesen authored
Also emit the stubs that were generated for references to typeinfo symbols. llvm-svn: 200282
-
Reid Kleckner authored
Summary: I searched Transforms/ and Analysis/ for 'ByVal' and updated those call sites to check for inalloca if appropriate. I added tests for any change that would allow an optimization to fire on inalloca. Reviewers: nlewycky Differential Revision: http://llvm-reviews.chandlerc.com/D2449 llvm-svn: 200281
-
Reid Kleckner authored
This avoids miscompiling MS inline asm in LLVM where we have to infer clobbers. Test case forthcoming in Clang. llvm-svn: 200279
-
Chandler Carruth authored
LCSSA from it caused a crasher with the LoopUnroll pass. This crasher is really nasty. We destroy LCSSA form in a suprising way. When unrolling a loop into an outer loop, we not only need to restore LCSSA form for the outer loop, but for all children of the outer loop. This is somewhat obvious in retrospect, but hey! While this seems pretty heavy-handed, it's not that bad. Fundamentally, we only do this when we unroll a loop, which is already a heavyweight operation. We're unrolling all of these hypothetical inner loops as well, so their size and complexity is already on the critical path. This is just adding another pass over them to re-canonicalize. I have a test case from PR18616 that is great for reproducing this, but pretty useless to check in as it relies on many 10s of nested empty loops that get unrolled and deleted in just the right order. =/ What's worse is that investigating this has exposed another source of failure that is likely to be even harder to test. I'll try to come up with test cases for these fixes, but I want to get the fixes into the tree first as they're causing crashes in the wild. llvm-svn: 200273
-
Juergen Ributzka authored
[TLI] Add a new hook to TargetLowering to query the target if a load of a constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271
-
Arnold Schwaighofer authored
The vectorizer takes a loop like this and widens all instructions except for the store. The stores are scalarized/unrolled and hidden behind an "if" block. for (i = 0; i < 128; ++i) { if (a[i] < 10) a[i] += val; } for (i = 0; i < 128; i+=2) { v = a[i:i+1]; v0 = (extract v, 0) + 10; v1 = (extract v, 1) + 10; if (v0 < 10) a[i] = v0; if (v1 < 10) a[i] = v1; } The vectorizer relies on subsequent optimizations to sink instructions into the conditional block where they are anticipated. The flag "vectorize-num-stores-pred" controls whether and how many stores to handle this way. Vectorization of conditional stores is disabled per default for now. This patch also adds a change to the heuristic when the flag "enable-loadstore-runtime-unroll" is enabled (off by default). It unrolls small loops until load/store ports are saturated. This heuristic uses TTI's getMaxUnrollFactor as a measure for load/store ports. I also added a second flag -enable-cond-stores-vec. It will enable vectorization of conditional stores. But there is no cost model for vectorization of conditional stores in place yet so this will not do good at the moment. rdar://15892953 Results for x86-64 -O3 -mavx +/- -mllvm -enable-loadstore-runtime-unroll -vectorize-num-stores-pred=1 (before the BFI change): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.35% (maze3() is identical but 10% slower) Applications/siod/siod 2.18% Performance improvements: mesa -4.42% libquantum -4.15% With a patch that slightly changes the register heuristics (by subtracting the induction variable on both sides of the register pressure equation, as the induction variable is probably not really unrolled): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.73% Applications/siod/siod 1.97% Performance Improvements: libquantum -13.05% (we now also unroll quantum_toffoli) mesa -4.27% llvm-svn: 200270
-
Eric Christopher authored
code to see if we're emitting a function into a non-default text section. This is still a less-than-ideal solution, but more contained than r199871 to determine whether or not we're emitting code into an array of comdat sections. llvm-svn: 200269
-
Eric Christopher authored
llvm-svn: 200264
-
Manman Ren authored
uint32. When folding branches to common destination, the updated branch weights can exceed uint32 by more than factor of 2. We should keep halving the weights until they can fit into uint32. llvm-svn: 200262
-