  1. Jun 11, 2012
  2. Jun 10, 2012
    • InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare. · 8b8a7697
      Benjamin Kramer authored
      This saves a cast, and zext is more expensive than trunc on platforms with
      subreg support. This pattern occurs in the BSD implementation of memchr(3),
      see PR12750. On the synthetic benchmark from that bug, stupid_memchr and
      bsd_memchr now have the same performance when neither function is inlined.
      
      stupid_memchr: 323.0us
      bsd_memchr: 321.0us
      memchr: 479.0us
      
      where memchr is the llvm-gcc-compiled bsd_memchr from OS X Lion's libc. When
      inlining is enabled, bsd_memchr still regresses to the llvm-gcc memchr time.
      I haven't fully understood the issue yet; something is grossly mangling the
      loop after inlining.
      
      llvm-svn: 158297
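
The rewrite named in the commit title above can be checked with a small model. The following is a minimal sketch, not the InstCombine code itself: it assumes A is 8 bits wide, B is 32 bits wide, and X equals the width of A, so the mask keeps exactly the bits that a truncation would keep. The function names are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Original form: zero-extend A to 32 bits, mask B down to its low 8 bits,
// and compare at 32 bits. The 8-bit width is an assumption of this sketch.
constexpr bool wide_compare(std::uint8_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a) == (b & ((1u << 8) - 1u));
}

// Rewritten form: truncate B to 8 bits and compare at 8 bits.
constexpr bool narrow_compare(std::uint8_t a, std::uint32_t b) {
    return a == static_cast<std::uint8_t>(b);
}

// Compare both forms for every 8-bit A and every 16-bit B, once with the
// upper half of B clear and once with arbitrary bits set there.
bool forms_agree() {
    for (std::uint32_t a = 0; a <= 0xFFu; ++a) {
        for (std::uint32_t lo = 0; lo <= 0xFFFFu; ++lo) {
            const std::uint8_t narrow_a = static_cast<std::uint8_t>(a);
            const std::uint32_t candidates[] = {lo, lo | 0xABCD0000u};
            for (std::uint32_t b : candidates) {
                if (wide_compare(narrow_a, b) != narrow_compare(narrow_a, b)) {
                    return false;
                }
            }
        }
    }
    return true;
}

// Usage (inside any function): assert(forms_agree());
```

Both forms ignore every bit of B above the width of A, which is why the mask and the zero-extension can be replaced by a single truncation under these assumptions.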
    • Enable ILP scheduling for all nodes by default on PPC. · 4e9f1a85
      Hal Finkel authored
      Over the entire test-suite, this has an insignificantly negative average
      performance impact, but reduces some of the worst slowdowns from the
      anti-dep. change (r158294).
      
      Largest speedups:
      SingleSource/Benchmarks/Stanford/Quicksort - 28%
      SingleSource/Benchmarks/Stanford/Towers - 24%
      SingleSource/Benchmarks/Shootout-C++/matrix - 23%
      MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
      MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
      (matrix and automotive-bitcount were both in the top-5 slowdown list from the
      anti-dep. change)
      
      Largest slowdowns:
      MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
      MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
      MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
      SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
      MultiSource/Applications/d/make_dparser - 16%
      
      llvm-svn: 158296
    • Add AutoUpgrade support for the SSE4 ptest intrinsics. · 17ee58a7
      Nadav Rotem authored
      Patch by Michael Kuperstein.
      
      llvm-svn: 158295
    • Use critical anti-dep. breaking on all PPC targets, but also add other register classes. · a8100281
      Hal Finkel authored
      Using 'all' instead of 'critical' would be better because it would make it easier to
      satisfy the bundling constraints, but, as noted in the FIXME, that is currently not
      possible with the crs.
      
      This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups:
      SingleSource/Benchmarks/Shootout-C++/moments - 40%
      MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
      SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26%
      SingleSource/Benchmarks/McGill/misr - 23%
      MultiSource/Applications/JM/ldecod/ldecod - 22%
      
      Largest slowdowns:
      SingleSource/Benchmarks/Shootout-C++/matrix - 29%
      SingleSource/Benchmarks/Shootout-C++/ary3 - 22%
      MultiSource/Benchmarks/BitBench/uuencode/uuencode - 18%
      SingleSource/Benchmarks/Shootout-C++/ary - 17%
      MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
      
      llvm-svn: 158294
    • PR12964: __int128 and unsigned __int128 are promoted integral types, be sure to · 521ecc1f
      Richard Smith authored
      consider them when enumerating builtin operator candidates.
      
      llvm-svn: 158293
    • Add XOP vprot* instruction intrinsics · a3c5fbf5
      Craig Topper authored
      llvm-svn: 158292
    • Add intrinsics for immediate form of XOP vprot instructions. Use i128mem instead of f128mem for integer XOP instructions. · 7afe343b
      Craig Topper authored
      
      llvm-svn: 158291
    • Remove CXXRecordDecl flags which are unused after r158289. · 4086a13d
      Richard Smith authored
      We need an efficient mechanism to determine whether a defaulted default
      constructor is constexpr, in order to determine whether a class is a literal
      type, so keep the incrementally-built form on CXXRecordDecl. Remove the
      on-demand computation of same, so that we only have one method for determining
      whether a default constructor is constexpr. This doesn't affect correctness,
      since default constructor lookup is much simpler than selecting a constructor
      for copying or moving.
      
      We don't need a corresponding mechanism for defaulted copy or move constructors,
      since they can't affect whether a type is a literal type. Conversely, checking
      whether such functions are constexpr can require non-trivial effort, so we defer
      such checks until the copy or move constructor is required.
      
      Thus we now only compute whether a copy or move constructor is constexpr on
      demand, and only compute whether a default constructor is constexpr in advance.
      This is unfortunate, but seems like the best solution.
      
      llvm-svn: 158290
    • Fix PR13052 properly, by performing special member lookup to determine whether · b5800095
      Richard Smith authored
      an explicitly-defaulted default constructor would be constexpr. This is
      necessary in weird (but well-formed) cases where a class has more than one copy
      or move constructor.
      
      Cleanup of now-unused parts of CXXRecordDecl to follow.
      
      llvm-svn: 158289
    • PR13064: Store whether an in-class initializer uses direct or copy · 2b013185
      Richard Smith authored
      initialization, and use that information to produce the right kind of
      initialization during template instantiation.
      
      llvm-svn: 158288
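
The distinction this commit message relies on can be illustrated as follows. This is a minimal sketch and not the test case from PR13064: `Meters`, `Holder`, and `direct_value` are hypothetical names. It shows why the kind of initialization written in an in-class initializer has to survive into template instantiation: a direct initializer may call an explicit constructor, while a copy initializer may not.

```cpp
// Hypothetical type whose only constructor is explicit.
struct Meters {
    explicit constexpr Meters(int v) : value(v) {}
    int value;
};

template <typename T>
struct Holder {
    // Direct-list-initialization: allowed to call an explicit constructor.
    T direct{3};
    // Copy-initialization: fine here because int has no explicit constructor.
    // Writing `T copied = 3;` instead would be ill-formed for T = Meters when
    // that initializer is used.
    int count = 3;
};

// Instantiates Holder<Meters> and runs its in-class initializers.
constexpr int direct_value() {
    return Holder<Meters>().direct.value;
}
```

If instantiation treated the direct initializer as a copy initializer, `Holder<Meters>` would be rejected even though the template is well-formed as written.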
    • More XOP intrinsics · 02b3d81a
      Craig Topper authored
      llvm-svn: 158287
    • Begin adding XOP intrinsics · 33b6d5e2
      Craig Topper authored
      llvm-svn: 158286
    • Fix the top-of-file comment in Attr.h to say that it's about attributes, not · f3d90890
      James Dennett authored
      expressions.
      
      llvm-svn: 158285
    • Add XOP feature flag. · f561a956
      Craig Topper authored
      llvm-svn: 158284
    • Improve ext/trunc patterns on PPC64. · 2edfbddc
      Hal Finkel authored
      The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
      would leave self-moves in the final assembly. Replacing those patterns with ones
      based on the SUBREG builtins yields better-looking code.
      
      Thanks to Jakob and Owen for their suggestions in this matter.
      
      llvm-svn: 158283
  3. Jun 09, 2012