  1. Dec 05, 2013
    • Yi Jiang's avatar
      01cfa942
    • Yuchen Wu's avatar
      llvm-cov: Further improved error messages. · 4c9f19d6
      Yuchen Wu authored
      llvm-svn: 196542
      4c9f19d6
    • Yuchen Wu's avatar
      llvm-cov: Conformed headers. · c3e64247
      Yuchen Wu authored
      llvm-svn: 196541
      c3e64247
    • Renato Golin's avatar
      Move test to X86 dir · e593fea5
      Renato Golin authored
      Test is platform independent, but I don't want to force vector-width, or
      that could spoil the pragma test.
      
      llvm-svn: 196539
      e593fea5
    • Renato Golin's avatar
      Add #pragma vectorize enable/disable to LLVM · 729a3ae9
      Renato Golin authored
      The intended behaviour is to force vectorization on the presence
      of the flag (either turn on or off), and to continue the behaviour
      as expected in its absence. Tests were added to make sure all the
      cases are covered in opt. No tests were added in other tools with
      the assumption that they should use the PassManagerBuilder in the
      same way.
      
      This patch also removes the outdated -late-vectorize flag, which was
      on by default and not helping much.
      
      The pragma metadata is being attached to the same place as other loop
      metadata, but nothing forbids one from attaching it to a function
      (to enable #pragma optimize) or basic blocks (to hint the basic-block
      vectorizers), etc. The logic should be the same all around.
      
      Patches to Clang to produce the metadata will be produced after the
      initial implementation is agreed upon and committed. Patches to other
      vectorizers (such as SLP and BB) will be added once we're happy with
      the pass manager changes.
      
      llvm-svn: 196537
      729a3ae9
    • Aditya Nandakumar's avatar
      Check hint registers for interference only once before evictions · 73f3d33d
      Aditya Nandakumar authored
      llvm-svn: 196536
      73f3d33d
    • Ana Pazos's avatar
      Implemented vget/vset_lane_f16 intrinsics · 6b0a8c50
      Ana Pazos authored
      llvm-svn: 196533
      6b0a8c50
    • Yuchen Wu's avatar
      llvm-cov: Changed extension from .llcov to .gcov. · 9af3938b
      Yuchen Wu authored
      llvm-svn: 196530
      9af3938b
    • Matt Arsenault's avatar
      Revert part of GCC warning fix to fix debug build. · 79d55f5c
      Matt Arsenault authored
      The typedef is used inside the DEBUG(), and apparently can't be moved
      inside of it.
      
      llvm-svn: 196528
      79d55f5c
    • Matt Arsenault's avatar
      Fix minor GCC warnings. · c44a3ff6
      Matt Arsenault authored
      Unused typedefs and unused variables.
      
      llvm-svn: 196526
      c44a3ff6
    • Michael Gottesman's avatar
      Change std::deque => std::vector. No functionality change. · 2bf0173b
      Michael Gottesman authored
      There is no reason to use std::deque here over std::vector. Thus, given the
      performance differences between the two, it makes sense to change deque to
      vector.
      
      llvm-svn: 196524
      2bf0173b
    • Yunzhong Gao's avatar
      f5b769e4
    • Rafael Espindola's avatar
      Fix non-deterministic behavior. · cdbde3aa
      Rafael Espindola authored
      We use CSEBlocks to initialize a worklist:
      
      SmallVector<BasicBlock *, 8> CSEWorkList(CSEBlocks.begin(), CSEBlocks.end());
      
      so it must have a deterministic order.
      
      llvm-svn: 196520
      cdbde3aa
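The ordering requirement in the commit above can be illustrated with a minimal sketch. It assumes an insertion-ordered set; the commit text does not say which container was actually used. `OrderedSet`, `Block`, and `buildWorklist` are hypothetical names for illustration only, not LLVM's API. A set keyed on pointer values alone would iterate in an order that depends on where the blocks were allocated, so a worklist copied from it could differ from run to run.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical insertion-ordered set (illustration only, not an LLVM class).
// Iteration follows insertion order, never the numeric value of the keys.
template <typename T>
class OrderedSet {
public:
  // Returns true if Value was newly inserted, false if it was already present.
  bool insert(const T &Value) {
    if (!Seen.insert(Value).second)
      return false;
    Order.push_back(Value);
    return true;
  }
  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
  std::size_t size() const { return Order.size(); }

private:
  std::unordered_set<T> Seen; // membership test only; never iterated
  std::vector<T> Order;       // defines the iteration order
};

// Hypothetical stand-in for a basic block.
struct Block {
  int Id;
};

// Copies the set into a worklist, as the commit message does with CSEBlocks,
// and returns the block ids in the order they would be processed.
std::vector<int> buildWorklist(const OrderedSet<Block *> &Blocks) {
  std::vector<Block *> WorkList(Blocks.begin(), Blocks.end());
  std::vector<int> Ids;
  for (Block *B : WorkList) {
    assert(B && "worklist entries must be non-null");
    Ids.push_back(B->Id);
  }
  return Ids;
}
```

With this container the worklist order depends only on the sequence of `insert` calls, so two runs over the same input visit the blocks in the same order regardless of pointer values.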
    • Eric Christopher's avatar
      f8194853
    • Andrew Trick's avatar
      MI-Sched: Model "reserved" processor resources. · 5a22df49
      Andrew Trick authored
      This allows a target to use MI-Sched as an in-order scheduler that
      will model strict resource conflicts without defining a processor
      itinerary. Instead, the target can now use the new per-operand machine
      model and define in-order resources with BufferSize=0. For example,
      this would allow restricting the type of operations that can be formed
      into a dispatch group. (Normally NumMicroOps is sufficient to enforce
      dispatch groups).
      
      If the intent is to model latency in an in-order pipeline, as opposed to
      resource conflicts, then a resource with BufferSize=1 should be
      defined instead.
      
      This feature is only casually tested as there are no in-tree targets
      using it yet. However, Hal will be experimenting with POWER7.
      
      llvm-svn: 196517
      5a22df49
    • Andrew Trick's avatar
      MI-Sched: handle latency of in-order operations with the new machine model. · 880e573d
      Andrew Trick authored
      The per-operand machine model allows the target to define "unbuffered"
      processor resources. This change is a quick, cheap way to model stalls
      caused by the latency of operations that use such resources. This only
      applies when the processor's micro-op buffer size is non-zero
      (Out-of-Order). We can't precisely model in-order stalls during
      out-of-order execution, but this is an easy and effective
      heuristic. It benefits cortex-a9 scheduling when using the new
      machine model, which is not yet on by default.
      
      MI-Sched for armv7 was evaluated on Swift (and was not enabled only because
      of a performance bug related to predication). However, we never
      evaluated Cortex-A9 performance on MI-Sched in its current form. This
      change adds MI-Sched functionality to reach performance goals on
      A9. The only remaining change is to allow MI-Sched to run as a PostRA
      pass.
      
      I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:
      -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false
      
      For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
      (min run time over 2 runs, filtering tiny changes)
      
      Speedups:
      | Benchmarks/BenchmarkGame/recursive         |  52.39% |
      | Benchmarks/VersaBench/beamformer           |  20.80% |
      | Benchmarks/Misc/pi                         |  19.97% |
      | Benchmarks/Misc/mandel-2                   |  19.95% |
      | SPEC/CFP2000/188.ammp                      |  18.72% |
      | Benchmarks/McCat/08-main/main              |  18.58% |
      | Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
      | Benchmarks/Olden/power                     |  17.11% |
      | Benchmarks/Misc-C++/mandel-text            |  16.47% |
      | Benchmarks/Misc/oourafft                   |  15.94% |
      | Benchmarks/Misc/flops-7                    |  14.99% |
      | Benchmarks/FreeBench/distray               |  14.26% |
      | SPEC/CFP2006/470.lbm                       |  14.00% |
      | mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
      | Benchmarks/SmallPT/smallpt                 |  10.36% |
      | Benchmarks/Misc-C++/Large/ray              |   8.97% |
      | Benchmarks/Misc/fp-convert                 |   8.75% |
      | Benchmarks/Olden/perimeter                 |   7.10% |
      | Benchmarks/Bullet/bullet                   |   7.03% |
      | Benchmarks/Misc/mandel                     |   6.75% |
      | Benchmarks/Olden/voronoi                   |   6.26% |
      | Benchmarks/Misc/flops-8                    |   5.77% |
      | Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
      | Benchmarks/MiBench/security-rijndael       |   5.15% |
      | Benchmarks/Misc/flops-6                    |   5.10% |
      | Benchmarks/Olden/tsp                       |   4.46% |
      | Benchmarks/MiBench/consumer-lame           |   4.28% |
      | Benchmarks/Misc/flops-5                    |   4.27% |
      | Benchmarks/mafft/pairlocalalign            |   4.19% |
      | Benchmarks/Misc/himenobmtxpa               |   4.07% |
      | Benchmarks/Misc/lowercase                  |   4.06% |
      | SPEC/CFP2006/433.milc                      |   3.99% |
      | Benchmarks/tramp3d-v4                      |   3.79% |
      | Benchmarks/FreeBench/pifft                 |   3.66% |
      | Benchmarks/Ptrdist/ks                      |   3.21% |
      | Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
      | SPEC/CINT2000/175.vpr                      |   3.12% |
      | Benchmarks/nbench                          |   2.98% |
      | SPEC/CFP2000/183.equake                    |   2.91% |
      | Benchmarks/Misc/perlin                     |   2.85% |
      | Benchmarks/Misc/flops-1                    |   2.82% |
      | Benchmarks/Misc-C++-EH/spirit              |   2.80% |
      | Benchmarks/Misc/flops-2                    |   2.77% |
      | Benchmarks/NPB-serial/is                   |   2.42% |
      | Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
      | Benchmarks/BenchmarkGame/n-body            |   2.28% |
      | Benchmarks/SciMark2-C/scimark2             |   2.27% |
      | Benchmarks/Olden/bh                        |   2.03% |
      | skidmarks10/skidmarks                      |   1.81% |
      | Benchmarks/Misc/flops                      |   1.72% |
      
      Slowdowns:
      | Benchmarks/llubenchmark/llu                | -14.14% |
      | Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
      | Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
      | Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
      | Benchmarks/Shootout/hash                   |  -2.35% |
      | Benchmarks/Prolangs-C++/ocean              |  -2.01% |
      | Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
      | Polybench/linear-algebra/kernels/3mm       |  -1.95% |
      | Benchmarks/McCat/09-vor/vor                |  -1.68% |
      
      llvm-svn: 196516
      880e573d
    • Andrew Trick's avatar
      Machine model comments. Explain a ProcessorUnit's BufferSize. · 093bdd17
      Andrew Trick authored
      llvm-svn: 196515
      093bdd17
    • Andrew Trick's avatar
      Fix the A9 machine model. VTRN writes two registers. · ff199a4b
      Andrew Trick authored
      llvm-svn: 196514
      ff199a4b
    • Andrew Trick's avatar
      comment typo and reformat · bb1247b9
      Andrew Trick authored
      llvm-svn: 196513
      bb1247b9
    • Rafael Espindola's avatar
      Add a default constructor to get deterministic behavior. · 4cc2b873
      Rafael Espindola authored
      Should fix the msan and valgrind bots.
      
      llvm-svn: 196509
      4cc2b873
    • Arnold Schwaighofer's avatar
      SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use · 7ee53cac
      Arnold Schwaighofer authored
      We were creating external uses for scalar values in MustGather entries that also
      had a ScalarToTreeEntry (they also are present in a vectorized tuple). This
      meant we would keep a value 'alive' as both a scalar and a vectorized value, causing havoc.
      This is not necessary because when we create a MustGather vector we explicitly
      create external use entries for the insertelement instructions of the
      MustGather vector elements.
      
      Fixes PR18129.
      
      radar://15582184
      
      llvm-svn: 196508
      7ee53cac
    • Kostya Serebryany's avatar
      [tsan] fix PR18146: sometimes a variable written into vptr could have an... · 2460c3fc
      Kostya Serebryany authored
      [tsan] fix PR18146: sometimes a variable written into vptr could have an integer type (after other optimizations)
      
      llvm-svn: 196507
      2460c3fc
    • Justin Holewinski's avatar
      4459717b
    • Alexey Samsonov's avatar
      Add forgotten header guards · 996e099d
      Alexey Samsonov authored
      llvm-svn: 196500
      996e099d
    • Matheus Almeida's avatar
      [mips] Small code generation improvement for conditional operator (select) · a6beac1a
      Matheus Almeida authored
      in case the operands are constants and their difference is |1|.
      It should be possible in those cases to rematerialize the result using
      MIPS's slt and similar instructions.
      
      The small updates to some of the tests in cmov.ll, sel1c.ll and sel2c.ll were needed;
      otherwise the optimization implemented in this patch would have been triggered
      (the difference between the operands was 1), and that would have changed the
      semantics of the tests.
      
      llvm-svn: 196498
      a6beac1a
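The arithmetic behind the select improvement above can be sketched as follows. This shows only the identity that makes the rewrite possible. It is not the instruction sequence the backend emits, which the commit message does not spell out. `slt` here is a plain C++ model of the MIPS set-on-less-than result (0 or 1), and `selectDiffOne` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Models the result of MIPS slt: 1 if A < B (signed comparison), else 0.
inline int32_t slt(int32_t A, int32_t B) { return A < B ? 1 : 0; }

// Branch-free equivalent of (A < B) ? TrueVal : FalseVal, valid only when
// the two constants differ by exactly 1. The comparison result is added to
// or subtracted from FalseVal instead of selecting between the constants.
inline int32_t selectDiffOne(int32_t A, int32_t B, int32_t TrueVal,
                             int32_t FalseVal) {
  int32_t Cond = slt(A, B);
  // Widen before subtracting so the difference itself cannot overflow.
  int64_t Diff = static_cast<int64_t>(TrueVal) - static_cast<int64_t>(FalseVal);
  assert((Diff == 1 || Diff == -1) && "constants must differ by exactly 1");
  return Diff == 1 ? FalseVal + Cond : FalseVal - Cond;
}
```

For example, `(a < b) ? 5 : 4` becomes `4 + slt(a, b)`, and `(a < b) ? 3 : 4` becomes `4 - slt(a, b)`.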
    • Matheus Almeida's avatar
      [mips] Add some comments related to the optimization performed in performSELECTCombine. · a611c0f4
      Matheus Almeida authored
      The structure of the code was slightly modified so that the next patch is easier to read/review.
      
      No functional changes.
      
      llvm-svn: 196496
      a611c0f4
    • Matheus Almeida's avatar
      [mips][msa] Fix issue with immediate fields of LD/ST instructions · 6b59c449
      Matheus Almeida authored
      not being correctly encoded/decoded.
      In more detail, immediate fields of LD/ST instructions should be
      divided/multiplied by the size of the data format before encoding and
      after decoding, respectively.
      
      llvm-svn: 196494
      6b59c449
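The scaling rule described in the commit above can be sketched as a pair of helpers. This is a minimal illustration under stated assumptions: `DataFormat`, `encodeOffset`, and `decodeOffset` are hypothetical names, the element sizes of 1, 2, 4 and 8 bytes are assumed for the byte, halfword, word and doubleword formats, and the width of the immediate field is deliberately not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Assumed element sizes in bytes for each data format (illustrative only).
enum class DataFormat : int32_t { B = 1, H = 2, W = 4, D = 8 };

// Before encoding: divide the byte offset by the element size so the
// immediate field counts elements rather than bytes.
inline int32_t encodeOffset(int32_t ByteOffset, DataFormat Fmt) {
  int32_t Size = static_cast<int32_t>(Fmt);
  assert(ByteOffset % Size == 0 &&
         "offset must be a multiple of the element size");
  return ByteOffset / Size;
}

// After decoding: multiply the immediate field by the element size to
// recover the byte offset.
inline int32_t decodeOffset(int32_t Field, DataFormat Fmt) {
  return Field * static_cast<int32_t>(Fmt);
}
```

Under these assumptions a byte offset of 16 with the word format is stored as 4 in the immediate field, and decoding 4 with the same format yields 16 again.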
    • Tim Northover's avatar
      ARM: fix yet another stack-folding bug · e4def5e2
      Tim Northover authored
      We were trying to fold the stack adjustment into the wrong instruction in the
      situation where the entire basic-block was epilogue code. Really, it can only
      ever be valid to do the folding precisely where the "add sp, ..." would be
      placed so there's no need for a separate iterator to track that.
      
      Should fix PR18136.
      
      llvm-svn: 196493
      e4def5e2
    • David Blaikie's avatar
    • Matt Arsenault's avatar
      Use isIntrinsic() instead of checking for "llvm." · a68c9adc
      Matt Arsenault authored
      llvm-svn: 196473
      a68c9adc
    • Rafael Espindola's avatar
      Remove the isImplicitlyPrivate argument of getNameWithPrefix. · 117b20c4
      Rafael Espindola authored
      getSymbolWithGlobalValueBase is used to create a name for a new symbol based
      on the name of an existing GV. Assert that and then remove the last call
      to pass true to isImplicitlyPrivate.
      
      This gives the mangler API a 1:1 mapping from GV to names, which is what we
      need to drop the mangler dependency on the target (and use an extended
      datalayout instead).
      
      llvm-svn: 196472
      117b20c4
    • Alp Toker's avatar
      Correct word hyphenations · f907b891
      Alp Toker authored
      This patch tries to avoid unrelated changes other than fixing a few
      hyphen-related ambiguities and contractions in nearby lines.
      
      llvm-svn: 196471
      f907b891
    • Rafael Espindola's avatar
      Hide the stub created for MO_ExternalSymbol too. · 01d19d02
      Rafael Espindola authored
      given
      
      declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
      declare void @foo()
      define void @bar() {
        call void @foo()
        call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
        ret void
      }
      
      We used to produce
      
      L_foo$stub:
              .indirect_symbol        _foo
              .ascii  "\364\364\364\364\364"
      
      _memset$stub:
              .indirect_symbol        _memset
              .ascii  "\364\364\364\364\364"
      
      We now produce a private stub for memset too.
      
      Stubs are not needed with recent linkers, but we still produce them for darwin8.
      
      Thanks to David Fang for confirming that gcc used to do this too.
      
      llvm-svn: 196468
      01d19d02
    • Matt Arsenault's avatar
      R600/SI: Add comments for number of used registers. · 89cc49fe
      Matt Arsenault authored
      llvm-svn: 196467
      89cc49fe
    • Rafael Espindola's avatar
      Try harder to get consistent floating point results. · d50dbc78
      Rafael Espindola authored
      This just extends the existing hack. It should be enough to get a reproducible bootstrap
      on 32 bits.
      
      I will open a bug to track getting a real fix for this.
      
      llvm-svn: 196462
      d50dbc78
    • NAKAMURA Takumi's avatar
      Move llvm/test/MC/ELF/thumb-st_other.s to test/MC/ARM. · 57b20a7e
      NAKAMURA Takumi authored
      llvm-svn: 196457
      57b20a7e
    • Jiangning Liu's avatar
    • Cameron McInally's avatar
      Add FileCheck statements for r196435. · 164097a6
      Cameron McInally authored
      llvm-svn: 196449
      164097a6
    • Reid Kleckner's avatar
      Compiler.h: Disable initializer list usage with clang-cl · 51a10fb6
      Reid Kleckner authored
      Most people are using MSVC 2012, which lacks the <initializer_list>
      header.  MSVC 2013 shipped with that header, but it has not yet been
      tested.  If clang works with the 2013 header, then we can enable this by
      checking the value of _MSC_VER.
      
      llvm-svn: 196448
      51a10fb6
    • Will Dietz's avatar
      Export symbols in tools that support loading plugins. · ff1264b5
      Will Dietz authored
      llvm-svn: 196447
      ff1264b5