  1. Nov 14, 2016
  2. Nov 08, 2016
  3. Oct 31, 2016
  4. Oct 27, 2016
  5. Oct 23, 2016
  6. Oct 20, 2016
  7. Oct 18, 2016
    • [X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for fptosi v2f64/v2f32 to v2i32 · 4ddc92b6
      Simon Pilgrim authored
      As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types.
      
      This patch adds support for fptosi to 2i32 from both 2f64 and 2f32.
      
      It also recognises that cvttpd2dq zeroes the upper 64 bits of the xmm result (similar to D23797). We still don't do this for the cvttpd2dq/cvttps2dq intrinsics; that can be done in a future patch.
      
      Differential Revision: https://reviews.llvm.org/D23808
      
      llvm-svn: 284459
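      A minimal LLVM IR sketch of the pattern this change targets; the function and value names are illustrative and not taken from the patch:

        define <2 x i32> @fptosi_v2f64(<2 x double> %arg) {
          ; Before this change the illegal <2 x i32> result forced scalarization;
          ; with it, SSE2 targets can select a single cvttpd2dq, which also
          ; zeroes the upper 64 bits of the xmm result.
          %res = fptosi <2 x double> %arg to <2 x i32>
          ret <2 x i32> %res
        }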
  8. Oct 12, 2016
    • NFC: The Cost Model specialization, by Andrey Tischenko · b271a58e
      Alexey Bataev authored
      The current Cost Model implementation is very inaccurate and has to be
      updated, improved, and re-implemented so that it can take into account the
      concrete CPU models and the concrete targets where the Cost Model is
      being used. For example, the Latency Cost Model should differ from the
      Code Size Cost Model, and so on.
      This patch is the first step toward developing and implementing a new
      generation of the Cost Model.
      
      Differential Revision: https://reviews.llvm.org/D25186
      
      llvm-svn: 284012
  9. Aug 17, 2016
  10. Aug 08, 2016
    • Revert "[X86] Support the "ms-hotpatch" attribute." · e9c32c7e
      Charles Davis authored
      This reverts commit r278048. Something changed between the last time I
      built this (it takes a while on my ridiculously slow and ancient
      computer) and now, and that change broke this.
      
      llvm-svn: 278053
    • [X86] Support the "ms-hotpatch" attribute. · 0822aa11
      Charles Davis authored
      Summary:
      Based on two patches by Michael Mueller.
      
      This is a target attribute that causes a function marked with it to be
      emitted as "hotpatchable". This particular mechanism was originally
      devised by Microsoft for patching their binaries (which they are
      constantly updating to stay ahead of crackers, script kiddies, and other
      ne'er-do-wells on the Internet), but is now commonly abused by Windows
      programs to hook API functions.
      
      This mechanism is target-specific. For x86, a two-byte no-op instruction
      is emitted at the function's entry point; the entry point must be
      immediately preceded by 64 (32-bit) or 128 (64-bit) bytes of padding.
      This padding is where the patch code is written. The two-byte no-op is
      then overwritten with a short jump into this code. The no-op is usually
      a `movl %edi, %edi` instruction; this is used as a magic value
      indicating that this is a hotpatchable function.
      
      Reviewers: majnemer, sanjoy, rnk
      
      Subscribers: dberris, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D19908
      
      llvm-svn: 278048
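      A hypothetical LLVM IR sketch of a function carrying this target attribute; the attribute spelling is taken only from the commit title and the function name is illustrative, not from the patch:

        ; "ms-hotpatch" marks the function as hotpatchable; for x86 the backend
        ; described above would emit a two-byte no-op at the entry point and
        ; rely on the padding before the function to hold patch code reached
        ; via a short jump written over the no-op.
        define void @hotpatchable_fn() "ms-hotpatch" {
          ret void
        }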
  11. Aug 05, 2016
    • [LV, X86] Be more optimistic about vectorizing shifts. · 3ceac2bb
      Michael Kuperstein authored
      Shifts with a uniform but non-constant count were considered very expensive to
      vectorize, because the splat of the uniform count and the shift would tend to
      appear in different blocks. That made the splat invisible to ISel, and we'd
      scalarize the shift at codegen time.
      
      Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
      are able to select the appropriate vector shifts. This updates the cost model
      to take this into account by making shifts by a uniform value cheap again.
      
      Differential Revision: https://reviews.llvm.org/D23049
      
      llvm-svn: 277782
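      A minimal LLVM IR sketch of a shift by a uniform, non-constant count; the names are illustrative, not taken from the patch:

        define <4 x i32> @shift_by_uniform(<4 x i32> %v, i32 %amt) {
          ; The splat of the scalar count is what CodeGenPrepare now sinks next
          ; to the shift, so ISel can select a single vector shift instruction.
          %ins = insertelement <4 x i32> undef, i32 %amt, i32 0
          %splat = shufflevector <4 x i32> %ins, <4 x i32> undef, <4 x i32> zeroinitializer
          %res = shl <4 x i32> %v, %splat
          ret <4 x i32> %res
        }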
  12. Aug 04, 2016
  13. Aug 02, 2016
  14. Jul 20, 2016
  15. Jul 17, 2016
  16. Jul 11, 2016
  17. Jul 06, 2016
  18. Jun 21, 2016
  19. Jun 11, 2016
  20. Jun 10, 2016
  21. May 25, 2016
  22. May 24, 2016
  23. May 09, 2016
    • [X86][SSE] Improve cost model for i64 vector comparisons on pre-SSE42 targets · eec3a95f
      Simon Pilgrim authored
      As discussed on PR24888, until SSE42 we don't have access to PCMPGTQ for v2i64 comparisons, but the cost models don't reflect this, resulting in over-optimistic vectorization.
      
      This patch adds SSE2 'base level' costs that match what a typical target is capable of and only reduces the v2i64 costs at SSE42.
      
      Technically SSE41 provides a PCMPEQQ v2i64 equality test, but as getCmpSelInstrCost doesn't give us a way to discriminate between comparison test types we can't easily make use of this. Otherwise we could split the costs of integer equality and greater-than tests to give better costings of each.
      
      Differential Revision: http://reviews.llvm.org/D20057
      
      llvm-svn: 268972
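      A minimal LLVM IR sketch of the kind of comparison whose cost this patch adjusts; the names are illustrative, not taken from the patch:

        define <2 x i1> @cmp_v2i64(<2 x i64> %a, <2 x i64> %b) {
          ; Without SSE42's PCMPGTQ this is expanded from 32-bit compares,
          ; so the SSE2 'base level' cost is noticeably higher.
          %c = icmp sgt <2 x i64> %a, %b
          ret <2 x i1> %c
        }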
  24. Apr 22, 2016
    • [X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode. · 468558a0
      Ashutosh Nema authored
      Summary:
      rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
      operations during DAG combine. This generates better code for truncate, so the
      cost of truncate needs to be changed, but it looks like it was changed only in
      the SSE2 table, whereas the change is also applicable to SSE4.1, so the cost of
      truncate needs to be changed there as well. The cost of “TRUNCATE v16i32 to
      v16i8” and “TRUNCATE v16i16 to v16i8” should be the same in the SSE4.1 and SSE2
      tables, so their entries are removed from SSE4.1 and the costs fall back to SSE2.
      
      Reviewers: Simon Pilgrim
      llvm-svn: 267123
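      A minimal LLVM IR sketch of one of the truncations named above; the names are illustrative, not taken from the patch:

        define <16 x i8> @trunc_v16i32(<16 x i32> %v) {
          ; With rL256194 this lowers to a short PACKUS/PACKSS sequence on
          ; SSE4.1 as well as SSE2, so both tables should use the same cost.
          %t = trunc <16 x i32> %v to <16 x i8>
          ret <16 x i8> %t
        }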
  25. Apr 14, 2016
    • Do not use getGlobalContext()... ever. · 867e9146
      Mehdi Amini authored
      This code was creating a new type in the global context, regardless of
      which context the user is sitting in. What could possibly go wrong?
      
      From: Mehdi Amini <mehdi.amini@apple.com>
      llvm-svn: 266275
  26. Apr 05, 2016
  27. Mar 09, 2016
  28. Mar 06, 2016
  29. Jan 25, 2016
  30. Dec 28, 2015
  31. Dec 21, 2015
    • [X86][SSE] Transform truncations between vectors of integers into... · 8df93ce4
      Cong Hou authored
      [X86][SSE] Transform truncations between vectors of integers into X86ISD::PACKUS/PACKSS operations during DAG combine.
      
      This patch transforms truncation between vectors of integers into
      X86ISD::PACKUS/PACKSS operations during DAG combine. We don't do it in the
      lowering phase because after type legalization the original truncation
      will have been turned into a BUILD_VECTOR whose elements are each extracted
      from a vector and then truncated, and from that form it is difficult to do
      this optimization. This greatly improves the performance of truncations
      on some specific types.
      
      The cost table is updated accordingly.
      
      Differential revision: http://reviews.llvm.org/D14588
      
      llvm-svn: 256194
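      A minimal LLVM IR sketch of a truncation that this combine turns into PACKUS/PACKSS nodes; the names are illustrative, not taken from the patch:

        define <8 x i16> @trunc_v8i32(<8 x i32> %v) {
          ; After this patch the DAG combine produces X86ISD::PACKUS/PACKSS here
          ; instead of scalarizing through a BUILD_VECTOR after legalization.
          %t = trunc <8 x i32> %v to <8 x i16>
          ret <8 x i16> %t
        }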
  32. Dec 20, 2015
    • [X86] Prevent constant hoisting for a couple compare immediates that the... · 074e8452
      Craig Topper authored
      [X86] Prevent constant hoisting for a couple compare immediates that the selection DAG knows how to optimize into a shift.
      
      This allows "icmp ugt %a, 4294967295" and "icmp uge %a, 4294967296" to be optimized into right shifts by 32 which can fold the immediate into the shift instruction. These patterns show up with some regularity in real code.
      
      Unfortunately, since getImmCost can't see the icmp predicate we can't tell if we're only catching these specific cases.
      
      llvm-svn: 256126
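      A minimal LLVM IR sketch of one of the compares named above; the function name is illustrative and the i64 operand type is assumed, as the commit message omits it:

        define i1 @cmp_above_u32max(i64 %a) {
          ; icmp ugt i64 %a, 4294967295 is equivalent to (%a >> 32) != 0, so
          ; keeping the immediate next to the compare lets the DAG fold it into
          ; a shift instead of loading a hoisted constant.
          %c = icmp ugt i64 %a, 4294967295
          ret i1 %c
        }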
  33. Dec 11, 2015
  34. Dec 02, 2015
  35. Nov 19, 2015