- Jan 21, 2011
-
-
Venkatraman Govindaraju authored
Rename FLUSH to FLUSHW. Output "ta 3" instead of a "flushw" instruction if v8 instruction set is used. llvm-svn: 123997
-
Owen Anderson authored
A == B, and A > B, does not mean we can fold it to true. We still need to check for A ? B (A unordered B). llvm-svn: 123993
-
Evan Cheng authored
1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991
-
Renato Golin authored
Clang was not parsing target triples involving EABI and was generating wrong IR (wrong PCS) and passing the wrong information down llc via the target-triple printed in IR. I've fixed this by adding the parsing of EABI into LLVM's Triple class and using it to choose the correct PCS in Clang's Tools. A Clang patch is on its way to use this infrastructure. llvm-svn: 123990
-
Oscar Fuentes authored
Patch by arrowdodger! llvm-svn: 123976
-
Bruno Cardoso Lopes authored
qadd and qdadd uses "rd, rm, rn", the same applies to the 'sub' variants. This is described in ARM manuals and matches the encoding used by the gnu assembler. llvm-svn: 123975
-
Venkatraman Govindaraju authored
llvm-svn: 123974
-
Nick Lewycky authored
instructions. llvm-svn: 123973
-
Andrew Trick authored
DAG. Disable using "-disable-sched-cycles". For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen. Although the flag is not target-specific, it should have very little affect on the default scheduler used by x86. The only two changes that affect x86 are: - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles. - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point). llvm-svn: 123971
-
Andrew Trick authored
flags. They are still not enable in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. llvm-svn: 123969
-
Chris Lattner authored
llvm-svn: 123968
-
Chris Lattner authored
llvm-svn: 123965
-
Nick Lewycky authored
a select. A vector select is pairwise on each element so we'd need a new condition with the right number of elements to select on. Fixes PR8994. llvm-svn: 123963
-
Michael J. Spencer authored
llvm-svn: 123962
-
Nick Lewycky authored
While here, I'd like to complain about how vector is not an aggregate type according to llvm::Type::isAggregateType(), but they're listed under aggregate types in the LangRef and zero vectors are stored as ConstantAggregateZero. llvm-svn: 123956
-
Evan Cheng authored
value, the "add pc" must be CSE'ed at the same time. We could follow the same approach as T2 by adding pseudo instructions that combine the ldr + "add pc". But the better approach is to use movw + movt (which I will enable soon), so I'll leave this as a TODO. llvm-svn: 123949
-
- Jan 20, 2011
-
-
Tobias Grosser authored
The PassManager did not implement the transitivity of requiredTransitive. This was unnoticed since 2006. llvm-svn: 123942
-
Bruno Cardoso Lopes authored
llvm-svn: 123936
-
Bruno Cardoso Lopes authored
llvm-svn: 123930
-
Bruno Cardoso Lopes authored
llvm-svn: 123929
-
Bruno Cardoso Lopes authored
- Use a more appropriate name for Owen's ARM Parser isMCR hack since the same operands can be present in cdp/cdp2 instructions. Also increase the hack with cdp/cdp2 instructions. - Fix the encoding of cdp/cdp2 instructions for ARM (no thumb and thumb2 yet) and add testcases for t hem. llvm-svn: 123927
-
Jakob Stoklund Olesen authored
The value mapping gets confused about which original values have multiple new definitions so they may need phi insertions. This could probably be simplified by letting enterIntvBefore() take a live range to be added following the instruction. As long as the range stays inside the same basic block, value mapping shouldn't be a problem. llvm-svn: 123926
-
Jakob Stoklund Olesen authored
llvm-svn: 123925
-
Bruno Cardoso Lopes authored
llvm-svn: 123919
-
Bruno Cardoso Lopes authored
llvm-svn: 123917
-
Kalle Raiskila authored
llvm-svn: 123912
-
Duncan Sands authored
auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0. This patch adds this transform and some related logic to InstructionSimplify and removes some of the logic from instcombine (unfortunately not all because there are several situations in which instcombine can improve things by making new instructions, whereas instsimplify is not allowed to do this). At -O2 this often results in more than 15% more simplifications by early-cse, and results in hundreds of lines of bitcode being eliminated from the testsuite. I did see some small negative effects in the testsuite, for example a few additional instructions in three programs. One program, 483.xalancbmk, got an additional 35 instructions, which seems to be due to a function getting an additional instruction and then being inlined all over the place. llvm-svn: 123911
-
Bruno Cardoso Lopes authored
llvm-svn: 123910
-
Eric Christopher authored
llvm-svn: 123909
-
Eric Christopher authored
to add/sub by doing the normal operation and then checking for overflow afterwards. This generally relies on the DAG handling the later invalid operations as well. Fixes the 64-bit part of rdar://8622122 and rdar://8774702. llvm-svn: 123908
-
Evan Cheng authored
llvm-svn: 123907
-
Evan Cheng authored
TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevent CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look pass some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improve code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperform the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905
-
Michael J. Spencer authored
llvm-svn: 123896
-
Michael J. Spencer authored
llvm-svn: 123895
-
Andrew Trick authored
Added a check for already live regs before claiming HighRegPressure. Fixed a few cases of checking the wrong number of successors. Added some tracing until these heuristics are better understood. llvm-svn: 123892
-
Jakob Stoklund Olesen authored
The live range may have been deleted earlier because of rematerialization. llvm-svn: 123891
-
Jakob Stoklund Olesen authored
register coalescing. llvm-svn: 123890
-
Venkatraman Govindaraju authored
with useful instructions. llvm-svn: 123884
-
Cameron Zwarich authored
llvm-svn: 123879
-
Jakob Stoklund Olesen authored
llvm-svn: 123872
-