- Oct 13, 2012
-
-
Benjamin Kramer authored
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this level is often not a good idea because it's too late for many optimizations to kick in. This solution doesn't add any extensions (truncs are free) and tries to avoid introducing partial register stalls by filtering direct copyfromregs. I'm seeing a ~10% speedup on reading a random .png file with libpng15 via graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture. llvm-svn: 165868
-
- Oct 11, 2012
-
-
Micah Villmow authored
llvm-svn: 165747
-
Micah Villmow authored
Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly. llvm-svn: 165726
-
NAKAMURA Takumi authored
It broke stage2 clang and test-suite/MultiSource/Benchmarks/mediabench/g721/g721encode. llvm-svn: 165692
-
Evan Cheng authored
llvm-svn: 165677
-
- Oct 10, 2012
-
-
Nadav Rotem authored
Original message: The attached is the fix to radar://11663049. The optimization can be outlined by the following rules: (select (x != c), e, c) -> (select (x != c), e, x), (select (x == c), c, e) -> (select (x == c), x, e), where <c> is an integer constant. The reason for this change is that on x86, conditional-move-from-constant needs two instructions, whereas conditional-move-from-register needs only one. While LowerSELECT() seems to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that replacing the constant <c> with a symbolic value obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to the last instruction-combining phase. The change passes the tests of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource". llvm-svn: 165661
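The two rewrite rules rely on the fact that whenever the constant arm is selected, x is known to equal c, so x can stand in for c. A minimal C++ sketch of that equivalence follows; the function names and the value 7 are hypothetical and are not taken from the patch, which operates on SelectionDAG nodes rather than source code.

```cpp
#include <cassert>
#include <cstdint>

// (select (x != c), e, c): the false arm materializes the constant c.
inline std::int32_t select_from_constant(std::int32_t x, std::int32_t c, std::int32_t e) {
    return (x != c) ? e : c;
}

// (select (x != c), e, x): the false arm reuses x, which equals c on that path.
inline std::int32_t select_from_register(std::int32_t x, std::int32_t c, std::int32_t e) {
    return (x != c) ? e : x;
}

// Compares both forms for a few values of x around c (assumes c is not
// within 2 of the int32 limits, so the loop bounds do not overflow).
inline void check_equivalence(std::int32_t c) {
    for (std::int32_t x = c - 2; x <= c + 2; ++x)
        assert(select_from_constant(x, c, 7) == select_from_register(x, c, 7));
}
```

Both forms return the same value for every x; the second only changes which operand feeds the conditional move.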
-
Michael Liao authored
- Due to the current matching-vector-elements constraint in ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from v2f32) is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPROUND to work around this constraint. llvm-svn: 165631
-
Michael Liao authored
- Due to the current matching-vector-elements constraint in ISD::FP_EXTEND, extending from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPEXT to work around this constraint. This patch also reverts a previous attempt to fix this issue by recovering the scalarized ISD::FP_EXTEND pattern and thus significantly reduces the overhead of supporting non-power-2 vector FP extend. llvm-svn: 165625
-
-
- Oct 09, 2012
-
-
Bill Wendling authored
We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488
-
- Oct 08, 2012
-
-
Micah Villmow authored
llvm-svn: 165402
-
- Oct 04, 2012
-
-
Preston Gurd authored
a pointer to a type, in order to remove the uses of getGlobalContext(). Patch by Tyler Nowicki. llvm-svn: 165255
-
Michael Liao authored
- Add 'HwEncoding' for X86 registers and call getEncodingValue() to retrieve their encoding values. - This is the first step toward adopting the new scheme. Further revision is ongoing. llvm-svn: 165241
-
Bill Wendling authored
llvm-svn: 165205
-
Michael Liao authored
llvm-svn: 165182
-
- Sep 30, 2012
-
-
Craig Topper authored
Change getX86SubSuperRegister to take an MVT::SimpleValueType rather than an EVT and add llvm_unreachable to the switches. Helps it compile to dramatically better code. llvm-svn: 164919
-
- Sep 26, 2012
-
-
Bill Wendling authored
The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725
-
- Sep 25, 2012
-
-
Michael Liao authored
- Turn on atomic6432.ll and add specific test case as well llvm-svn: 164616
-
Evan Cheng authored
Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511 llvm-svn: 164588
-
- Sep 21, 2012
-
-
Michael Liao authored
- Fix PR5145 and turn on the tests for 8-bit atomic ops llvm-svn: 164358
-
Michael Liao authored
- Rewrite most atomic instructions in templates for both better maintenance and future extensions, such as HLE in TSX. llvm-svn: 164357
-
- Sep 20, 2012
-
-
Michael Liao authored
- Rewrite/merge pseudo-atomic instruction emitters to address the following issues:
  * Reduce one unnecessary load in the spin-loop. Previously the spin-loop looked like:

        thisMBB:
        newMBB:
          ld  t1 = [bitinstr.addr]
          op  t2 = t1, [bitinstr.val]
          not t3 = t2  (if Invert)
          mov EAX = t1
          lcs dest = [bitinstr.addr], t3  [EAX is implicit]
          bz  newMBB
          fallthrough -->nextMBB

    The 'ld' at the beginning of newMBB should be lifted out of the loop, as lcs (or CMPXCHG on x86) will load the current memory value into EAX. The loop is refined as:

        thisMBB:
          EAX = LOAD [MI.addr]
        mainMBB:
          t1 = OP [MI.val], EAX
          LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
          JNE mainMBB
        sinkMBB:

  * Remove immopc as, so far, all pseudo-atomic instructions have an all-register form only; there is no immediate operand.
  * Remove unnecessary attributes/modifiers in the pseudo-atomic instruction td
  * Fix issues in PR13458
- Add comprehensive tests on atomic ops on various data types. NOTE: Some of them are turned off due to missing functionality.
- Revise tests due to the new spin-loop generated.
llvm-svn: 164281
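The refined loop shape (one load before the loop, then a compare-exchange that refreshes the expected value on failure) has a close analogue in portable C++. The sketch below is only an illustration using std::atomic, not the emitter code from the patch; the function name atomic_fetch_nand and the choice of NAND as the operation are assumptions.

```cpp
#include <atomic>
#include <cstdint>

// Atomically stores ~(old & val) into addr and returns the old value.
std::uint32_t atomic_fetch_nand(std::atomic<std::uint32_t>& addr, std::uint32_t val) {
    // Single load before the loop, like "EAX = LOAD [MI.addr]" in thisMBB.
    std::uint32_t expected = addr.load();
    std::uint32_t desired;
    do {
        // "t1 = OP [MI.val], EAX", here with the optional inversion applied.
        desired = ~(expected & val);
        // On failure, compare_exchange_weak writes the current memory value
        // back into 'expected', so no reload is needed inside the loop.
    } while (!addr.compare_exchange_weak(expected, desired));
    return expected;
}
```

The role played by EAX in the commit message is played here by `expected`: it is both the comparison input and, on a failed exchange, the place where the freshly observed value lands.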
-
- Sep 15, 2012
-
-
Benjamin Kramer authored
This was only an issue if sse is disabled. llvm-svn: 163967
-
- Sep 13, 2012
-
-
Michael Liao authored
llvm-svn: 163835
-
Michael Liao authored
- Enhance the fix to PR12312 to support wider integers, such as 256-bit integers. If more than one fully evaluated vector is found, POR them first, followed by the final PTEST. llvm-svn: 163832
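The idea of OR-ing the vectors together before a single test can be modelled in plain C++. This is a sketch under stated assumptions: Vec128, por, ptest_is_zero and is_zero_256 are hypothetical names that stand in for 128-bit registers and the POR and PTEST instructions, and the check shown is a whole-value comparison against zero, which may not cover every pattern the patch handles.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit vector register as two 64-bit lanes.
struct Vec128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Models POR: lane-wise bitwise OR of two vectors.
inline Vec128 por(Vec128 a, Vec128 b) {
    return {a.lo | b.lo, a.hi | b.hi};
}

// Models the zero-flag result of "PTEST v, v": true when every bit of v is 0.
inline bool ptest_is_zero(Vec128 v) {
    return (v.lo | v.hi) == 0;
}

// A 256-bit integer held as two 128-bit halves is zero exactly when the
// OR of its halves is zero, so one POR plus one PTEST is enough.
inline bool is_zero_256(const std::array<Vec128, 2>& x) {
    return ptest_is_zero(por(x[0], x[1]));
}
```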
-
- Sep 12, 2012
-
-
Michael Liao authored
- BlockAddress has no support for the BA + offset form and there is no way to propagate that offset into a machine operand; - Add BA + offset support and a new interface 'getTargetBlockAddress' to simplify target block address forming; - All targets are modified to use the new interface and the X86 backend is enhanced to support BA + offset addressing. llvm-svn: 163743
-
Craig Topper authored
llvm-svn: 163682
-
- Sep 11, 2012
-
-
Craig Topper authored
llvm-svn: 163596
-
- Sep 10, 2012
-
-
Dmitri Gribenko authored
llvm-svn: 163547
-
Michael Liao authored
- Fix a remaining issue of PR11674 as well llvm-svn: 163528
-
Michael Liao authored
- If a boolean value is generated from CMOV and tested as a boolean value, simplify the use of the test result by referencing the original condition. The RDRAND intrinsic is one such case. llvm-svn: 163516
-
Elena Demikhovsky authored
The VPSHUFB 256-bit instruction may be generated when one of the input vectors is undefined or zeroinitializer. I've added the "zeroinitializer" case in this patch. llvm-svn: 163506
-
- Sep 08, 2012
-
-
Craig Topper authored
llvm-svn: 163473
-
Craig Topper authored
llvm-svn: 163463
-
Craig Topper authored
llvm-svn: 163461
-
Craig Topper authored
Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458
-
- Sep 06, 2012
-
-
Elena Demikhovsky authored
Added generation of the VPSHUFB instruction for <32 x i8> vector shuffle when possible. llvm-svn: 163312
-
Michael Liao authored
llvm-svn: 163295
-
Craig Topper authored
Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder. llvm-svn: 163293
-
Roman Divacky authored
llvm-svn: 163258
-