- Sep 20, 2012
-
-
Michael Liao authored
- Rewrite/merge pseudo-atomic instruction emitters to address the following issue: * Reduce one unnecessary load in spin-loop previously the spin-loop looks like thisMBB: newMBB: ld t1 = [bitinstr.addr] op t2 = t1, [bitinstr.val] not t3 = t2 (if Invert) mov EAX = t1 lcs dest = [bitinstr.addr], t3 [EAX is implicit] bz newMBB fallthrough -->nextMBB the 'ld' at the beginning of newMBB should be lift out of the loop as lcs (or CMPXCHG on x86) will load the current memory value into EAX. This loop is refined as: thisMBB: EAX = LOAD [MI.addr] mainMBB: t1 = OP [MI.val], EAX LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined] JNE mainMBB sinkMBB: * Remove immopc as, so far, all pseudo-atomic instructions has all-register form only, there is no immedidate operand. * Remove unnecessary attributes/modifiers in pseudo-atomic instruction td * Fix issues in PR13458 - Add comprehensive tests on atomic ops on various data types. NOTE: Some of them are turned off due to missing functionality. - Revise tests due to the new spin-loop generated. llvm-svn: 164281
-
- Aug 02, 2012
-
-
Manman Ren authored
Add more comments and use early returns to reduce nesting in isLoadFoldable. Also disable folding for V_SET0 to avoid introducing a const pool entry and a const pool load. rdar://10554090 and rdar://11873276 llvm-svn: 161207
-
Manman Ren authored
Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152
-
- Jul 29, 2012
-
-
Manman Ren authored
llvm-svn: 160927
-
- Jul 28, 2012
-
-
Manman Ren authored
Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919
-
- Jul 06, 2012
-
-
Manman Ren authored
For each Cmp, we check whether there is an earlier Sub which make Cmp redundant. We handle the case where SUB operates on the same source operands as Cmp, including the case where the two source operands are swapped. llvm-svn: 159838
-
- Jul 04, 2012
-
-
Jakob Stoklund Olesen authored
Implement the TII hooks needed by EarlyIfConversion to create cmov instructions and estimate their latency. Early if-conversion is still not enabled by default. llvm-svn: 159695
-
- Jun 23, 2012
-
-
Craig Topper authored
llvm-svn: 159071
-
- Jun 07, 2012
-
-
Manman Ren authored
The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. llvm-svn: 158122
-
- Jun 03, 2012
-
-
Manman Ren authored
llvm-svn: 157896
-
- Jun 01, 2012
-
-
Manman Ren authored
This patch will optimize the following: sub r1, r3 cmp r3, r1 or cmp r1, r3 bge L1 TO sub r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can eliminate the "cmp" instruction. llvm-svn: 157831
-
Craig Topper authored
llvm-svn: 157801
-
- May 31, 2012
-
-
Manman Ren authored
This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 157755
-
Elena Demikhovsky authored
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for GodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. llvm-svn: 157737
-
- Mar 17, 2012
-
-
Craig Topper authored
Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations. llvm-svn: 152997
-
- Feb 18, 2012
-
-
Jia Liu authored
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878
-
- Nov 15, 2011
-
-
Jakob Stoklund Olesen authored
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix about instructions with partial register updates causing false unwanted dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previoius N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602
-
- Sep 29, 2011
-
-
Jakob Stoklund Olesen authored
This also makes it possible to reduce the number of pseudo instructions and get rid of the encoding information. llvm-svn: 140776
-
- Sep 28, 2011
-
-
Jakob Stoklund Olesen authored
I am going to unify the SSEDomainFix and NEONMoveFix passes into a single target independent pass. They are essentially doing the same thing. llvm-svn: 140652
-
- Sep 08, 2011
-
-
Bruno Cardoso Lopes authored
single field (Flags), which is a bitwise OR of items from the TB_* enum. This makes it easier to add new information in the future. * Gives every static array an equivalent layout: { RegOp, MemOp, Flags } * Adds a helper function, AddTableEntry, to avoid duplication of the insertion code. * Renames TB_NOT_REVERSABLE to TB_NO_REVERSE. * Adds TB_NO_FORWARD, which is analogous to TB_NO_REVERSE, except that it prevents addition of the Reg->Mem entry. (This is going to be used by Native Client, in the next CL). Patch by David Meyer llvm-svn: 139311
-
- Aug 08, 2011
-
-
Jakob Stoklund Olesen authored
These the methods are target-independent since they simply scan the memory operands. They can live in TargetInstrInfoImpl. llvm-svn: 137063
-
- Jul 25, 2011
-
-
Evan Cheng authored
llvm-svn: 135930
-
- Jul 01, 2011
-
-
Evan Cheng authored
llvm-svn: 134244
-
- May 25, 2011
-
-
Francois Pichet authored
llvm-svn: 132062
-
Francois Pichet authored
MSVC doesn't support 64 bit enum. OpcodeMask is not used anywhere in the code base. llvm-svn: 132057
-
- Apr 15, 2011
-
-
Chris Lattner authored
Luis Felipe Strano Moraes! llvm-svn: 129558
-
- Apr 04, 2011
-
-
Joerg Sonnenberger authored
llvm-svn: 128847
-
Joerg Sonnenberger authored
llvm-svn: 128826
-
Joerg Sonnenberger authored
Define most shift masks incrementally to reduce the redundant hard-coding. Introduce new shift for the VEX flags to replace the magic constant 32 in various places. llvm-svn: 128822
-
- Mar 05, 2011
-
-
Andrew Trick authored
regs. This is the only change in this checkin that may affects the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It is currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067
-
Andrew Trick authored
llvm-svn: 127065
-
- Feb 22, 2011
-
-
Rafael Espindola authored
Patch by Jai Menon. llvm-svn: 126165
-
- Nov 28, 2010
-
-
Anton Korobeynikov authored
llvm-svn: 120228
-
- Nov 15, 2010
-
-
Chris Lattner authored
llvm-svn: 119092
-
- Oct 19, 2010
-
-
Evan Cheng authored
erased the instruction during LICM so UpdateRegPressureAfter() should not reference it afterwards. llvm-svn: 116845
-
Daniel Dunbar authored
is", which breaks some nightly tests. llvm-svn: 116816
-
Evan Cheng authored
"long latency" enough to hoist even if it may increase spilling. Reloading a value from spill slot is often cheaper than performing an expensive computation in the loop. For X86, that means machine LICM will hoist SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON instructions. - Enable register pressure aware machine LICM by default. llvm-svn: 116781
-
- Oct 08, 2010
-
-
Chris Lattner authored
with the right types. llvm-svn: 116001
-
- Oct 03, 2010
-
-
Chris Lattner authored
else in X86), and add support for pavgusb. This is apparently the only instruction (other than movsx) that is preventing ffmpeg from building with clang. If someone else is interested in banging out the rest of the 3DNow! instructions, it should be quite easy now. llvm-svn: 115466
-
- Sep 17, 2010
-
-