  1. Jan 28, 2017
  2. Jan 27, 2017
    • Peter Collingbourne's avatar
    • Kostya Serebryany's avatar
      [libFuzzer] make shmem more robust in the presence of signals · 6d58dbb6
      Kostya Serebryany authored
      llvm-svn: 293339
      6d58dbb6
    • Artem Tamazov's avatar
      [AMDGPU][mc] Fix memory corruption uncovered by AddressSanitizer during... · 33b01e9c
      Artem Tamazov authored
      [AMDGPU][mc] Fix memory corruption uncovered by AddressSanitizer during coverage/smoke Gfx7/8 testing.
      
      Coverage/smoke Gfx7/8 tests were committed r292922 but then reverted
      by r292974 due to AddressSanitizer failure, which is fixed by this patch.
      Tests to be re-committed soon.
      
      llvm-svn: 293338
      33b01e9c
    • Tim Northover's avatar
      GlobalISel: set correct regclass for LOAD_STACK_GUARD. · d8b85584
      Tim Northover authored
      Since it's not actually a generic MI, its register operands need a RegClass,
      which is conveniently the target's pointer RegClass.
      
      llvm-svn: 293335
      d8b85584
    • Tim Northover's avatar
      GlobalISel: mark incoming landing-pad registers as live. · c9bc8a55
      Tim Northover authored
      Should fix machine verifier failures.
      
      llvm-svn: 293334
      c9bc8a55
    • Krzysztof Parzyszek's avatar
      [Hexagon] Remove unused variable (and silence a warning) · 35ce5dac
      Krzysztof Parzyszek authored
      llvm-svn: 293331
      35ce5dac
    • Mehdi Amini's avatar
      Fix ASAN failure in cxa_demangle · 453ab352
      Mehdi Amini authored
      Found with ASAN + libFuzzer by Kostya Serebryany <kcc@google.com>
      
      llvm-svn: 293330
      453ab352
    • Mehdi Amini's avatar
      Global DCE performance improvement · 888dee44
      Mehdi Amini authored
      Change the original algorithm so that it scales better on very
      large bitcode where not every instruction implies a global.
      
      The target query is "how do you get all the globals referenced by
      another global?"
      
      Before this patch, this was done by walking the body (or the
      initializer) and collecting the references. This patch instead
      precomputes the answer to this query for the whole module by
      walking the use-list of every global.
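      
      The precomputation described above can be sketched roughly as follows.
      This is a toy model, not LLVM's API: `Global`, `computeDependencies`
      and `computeLive` are hypothetical names, and each use-list entry
      simply names the global that owns the use.
      
      ```cpp
      #include <cassert>
      #include <map>
      #include <set>
      #include <string>
      #include <vector>
      
      // Toy model (not LLVM's API): every global keeps a use-list, and each
      // entry names the global whose body or initializer contains that use.
      struct Global {
        std::string Name;
        std::vector<std::string> UsedBy; // owners of the uses of this global
      };
      
      using DepMap = std::map<std::string, std::set<std::string>>;
      
      // Walk the use-list of every global once and invert it, so that
      // Deps[A] holds all globals referenced by A.
      DepMap computeDependencies(const std::vector<Global> &Module) {
        DepMap Deps;
        for (const Global &G : Module)
          for (const std::string &User : G.UsedBy)
            Deps[User].insert(G.Name);
        return Deps;
      }
      
      // Mark everything reachable from the roots as live using only the
      // precomputed map; no bodies or initializers are scanned here.
      std::set<std::string> computeLive(const DepMap &Deps,
                                        const std::vector<std::string> &Roots) {
        std::set<std::string> Live;
        std::vector<std::string> Worklist(Roots.begin(), Roots.end());
        while (!Worklist.empty()) {
          std::string Cur = Worklist.back();
          Worklist.pop_back();
          if (!Live.insert(Cur).second)
            continue; // already visited
          auto It = Deps.find(Cur);
          if (It == Deps.end())
            continue; // Cur references no other global
          for (const std::string &Dep : It->second)
            Worklist.push_back(Dep);
        }
        return Live;
      }
      ```
      
      Globals that never end up in the live set are the candidates for
      removal.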
      
      Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>
      
      Differential Revision: https://reviews.llvm.org/D28549
      
      llvm-svn: 293328
      888dee44
    • Justin Lebar's avatar
    • Xinliang David Li's avatar
      [PGO] add debug option to view raw count after prof use annotation · d289e454
      Xinliang David Li authored
      Differential Revision: https://reviews.llvm.org/D29045
      
      llvm-svn: 293325
      d289e454
    • Matthias Braun's avatar
      ScheduleDAGInstrs: Do not try to toggle kill flags on debug uses · c91e28af
      Matthias Braun authored
      Preparation for upcoming changes. No testcase as none of the public
      targets bundles early enough and has a post machine scheduler enabled at
      the same time. The error is also easily caught by asserts.
      
      llvm-svn: 293324
      c91e28af
    • Matthias Braun's avatar
      ScheduleDAGInstrs: Cleanup toggleKillFlag(); NFC · 26e8c350
      Matthias Braun authored
      llvm-svn: 293323
      26e8c350
    • Matthias Braun's avatar
      ScheduleDAGInstrs: Cleanup; NFC · bd7d9183
      Matthias Braun authored
      Comment, doxygen and a bit of whitespace cleanup.
      
      llvm-svn: 293322
      bd7d9183
    • Tom Stellard's avatar
      AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel · 08efb7eb
      Tom Stellard authored
      Reviewers: arsenm
      
      Reviewed By: arsenm
      
      Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
      
      Differential Revision: https://reviews.llvm.org/D29068
      
      llvm-svn: 293321
      08efb7eb
    • Konstantin Zhuravlyov's avatar
      a304c836
    • Chris Ray's avatar
      [X86] Adding FFREEP instruction. · 535e7d15
      Chris Ray authored
      Summary: Small change to get the FFREEP instruction to decode properly.
      
      Reviewers: craig.topper
      
      Reviewed By: craig.topper
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29193
      
      llvm-svn: 293314
      535e7d15
    • Anna Thomas's avatar
      NFC: Add debug tracing for more cases where loop unrolling fails. · e7d865e3
      Anna Thomas authored
      llvm-svn: 293313
      e7d865e3
    • Matt Arsenault's avatar
      AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands · d8f7ea38
      Matt Arsenault authored
      Accomplishes what r292982 was supposed to, which ended up
      only really making the necessary test changes.
      
      This should be applied to the 4.0 branch.
      
      Patch by Vedran Miletić <vedran@miletic.net>
      
      llvm-svn: 293310
      d8f7ea38
    • Matthew Simpson's avatar
      [ARM/AArch64] Relocate and update InterleavedAccessPass tests (NFC) · 3650df13
      Matthew Simpson authored
      The interleaved access pass is an IR-to-IR transformation that runs before code
      generation. It matches interleaved memory operations to target-specific
      intrinsics (that are later lowered to load and store multiple instructions on
      ARM/AArch64). We place tests for similar passes (e.g., GlobalMergePass) under
      test/Transforms. This patch moves the InterleavedAccessPass tests out of
      test/CodeGen and into target-specific directories under
      test/Transforms/InterleavedAccess.
      
      Although the pass is an IR pass, many of the existing tests were llc tests
      rather than opt tests. For example, the tests would check for ldN/stN instructions
      generated by llc rather than the intrinsic calls the pass actually inserts.
      Thus, this patch updates all tests to be opt tests that check for the inserted
      intrinsics. We already have separate CodeGen tests that ensure we lower the
      interleaved access intrinsics to their corresponding ldN/stN instructions. In
      addition to migrating the tests to opt, this patch also performs some minor
      clean-up (to ensure consistent naming, etc.).
      
      Differential Revision: https://reviews.llvm.org/D29184
      
      llvm-svn: 293309
      3650df13
    • Matt Arsenault's avatar
      NVPTX: Make NVPTXInferAddressSpaces preserve CFG · 32b9600a
      Matt Arsenault authored
      llvm-svn: 293308
      32b9600a
    • Jun Bum Lim's avatar
      [CodeGenPrep]No negative cost in the ExtLd promotion · b99a06b7
      Jun Bum Lim authored
      Summary: This change prevents the signed cost value from going negative, since the value is passed as an unsigned argument.
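      
      A minimal sketch of why a negative signed cost is a problem once it is
      passed as unsigned. The helper names are hypothetical, and clamping at
      zero is an assumption about the shape of the fix, not a copy of the
      patch.
      
      ```cpp
      #include <algorithm>
      #include <cassert>
      
      // Hypothetical stand-in for an interface that takes costs as unsigned.
      bool isProfitable(unsigned OldCost, unsigned NewCost) {
        return NewCost < OldCost;
      }
      
      // Clamp a signed running cost at zero before converting it.
      // Without the clamp, a value such as -3 converts to a very large
      // unsigned number and the comparison above gives the wrong answer.
      unsigned toUnsignedCost(int SignedCost) {
        return static_cast<unsigned>(std::max(0, SignedCost));
      }
      ```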
      
      Reviewers: mcrosier, jmolloy, qcolombet, javed.absar
      
      Reviewed By: mcrosier, qcolombet
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28871
      
      llvm-svn: 293307
      b99a06b7
    • Stanislav Mekhanoshin's avatar
      [AMDGPU] Turn AMDGPUUnifyMetadata back into module pass · f6c1feb8
      Stanislav Mekhanoshin authored
      With the adjustPassManager interface, it is now possible to use
      custom early module passes.
      
      Differential Revision: https://reviews.llvm.org/D29189
      
      llvm-svn: 293300
      f6c1feb8
    • Mehdi Amini's avatar
      Fix BasicAA incorrect assumption on GEP · 1726fc69
      Mehdi Amini authored
      This is fixing pr31761: BasicAA is deducing NoAlias
      on the result of the GEP if the base pointer is itself NoAlias.
      
      This is possible only if the NoAlias on the base pointer is
      deduced with a non-sized query: this should guarantee that
      the pointers belong to different memory allocations
      and that the GEP can't legally jump from one to another.
      
      Differential Revision: https://reviews.llvm.org/D29216
      
      llvm-svn: 293293
      1726fc69
    • Ivan Krasin's avatar
      Avoid using unspecified ordering in MetadataLoader::MetadataLoaderImpl::parseOneMetadata. · c05c9db3
      Ivan Krasin authored
      Summary:
      MetadataLoader::MetadataLoaderImpl::parseOneMetadata uses
      the following construct in a number of places:
      
      ```
      MetadataList.assignValue(<...>, NextMetadataNo++);
      ```
      
      There, NextMetadataNo gets incremented, and since the order
      of argument evaluation is unspecified, that can happen
      before or after the other arguments are evaluated.
      
      In a few cases the other arguments indirectly use NextMetadataNo.
      For instance:
      
      ```
      MetadataList.assignValue(
          GET_OR_DISTINCT(DIModule,
                          (Context, getMDOrNull(Record[1]),
                           getMDString(Record[2]), getMDString(Record[3]),
                           getMDString(Record[4]), getMDString(Record[5]))),
          NextMetadataNo++);
      ```
      
      getMDOrNull calls getMD that uses NextMetadataNo:
      
      ```
      MetadataList.getMetadataFwdRef(NextMetadataNo);
      ```
      
      Therefore, the order of evaluation becomes important. That caused
      a very subtle LLD crash that only happens if LLD is compiled with
      GCC or built with LTO. When LLD is compiled with Clang in regular
      linking mode, everything works as intended.
      
      This change moves the increment of NextMetadataNo out of
      the argument list to guarantee the correct order of evaluation.
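      
      The hazard and the fix can be illustrated with a small standalone
      sketch. `ToyList`, `readsCounter` and `assignNext` are hypothetical
      stand-ins and not the MetadataLoader code itself.
      
      ```cpp
      #include <cassert>
      #include <vector>
      
      // Toy stand-in for the metadata list (not LLVM's class).
      struct ToyList {
        std::vector<int> Slots = std::vector<int>(8, -1);
        void assignValue(int Value, unsigned Idx) { Slots[Idx] = Value; }
      };
      
      // Stand-in for an argument expression that indirectly reads the counter.
      int readsCounter(const unsigned &NextNo) {
        return static_cast<int>(NextNo) * 10;
      }
      
      // Hazardous shape: whether readsCounter runs before or after the
      // increment is unspecified, so with NextNo == 0 the stored value may
      // be 0 or 10 depending on the compiler.
      //   List.assignValue(readsCounter(NextNo), NextNo++);
      
      // Fixed shape: the increment is its own statement, after the call.
      void assignNext(ToyList &List, unsigned &NextNo) {
        List.assignValue(readsCounter(NextNo), NextNo);
        ++NextNo;
      }
      ```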
      
      For the record, this took 3 days to track down to the origin. It all
      started with a ThinLTO bot in Chrome not being able to link a target
      if debug info is enabled.
      
      Reviewers: pcc, mehdi_amini
      
      Reviewed By: mehdi_amini
      
      Subscribers: aprantl, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29204
      
      llvm-svn: 293291
      c05c9db3
    • Simon Dardis's avatar
      [mips] Recommit: "N64 static relocation model support" · ca74dd79
      Simon Dardis authored
      This patch makes one change to GOT handling and two changes to N64's
      relocation model handling. Furthermore, the jumptable encodings have
      been corrected for static N64.
      
      Big GOT handling is now done via a new SDNode MipsGotHi - this node is
      unconditionally lowered to an lui instruction.
      
      The first change to N64's relocation handling is the lifting of the
      restriction that N64 always uses PIC. Now it is possible to target static
      environments.
      
      The second change adds support for 64 bit symbols and enables them by
      default. Previously N64 had patterns for sym32 mode only. In this mode all
      symbols are assumed to have 32 bit addresses. sym32 mode support
      is selectable with attribute 'sym32'. A follow-on patch for clang will
      add the necessary frontend parameter.
      
      This partially resolves PR/23485.
      
      Thanks to Brooks Davis for reporting the issue!
      
      This version corrects a "Conditional jump or move depends on uninitialised
      value(s)" error detected by valgrind present in the original commit.
      
      Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris
      
      Differential Revision: https://reviews.llvm.org/D23652
      
      llvm-svn: 293279
      ca74dd79
    • Alexey Bataev's avatar
      [SLP] Refactoring of horizontal reduction analysis, NFC. · 4015bf83
      Alexey Bataev authored
      Some checks in SLP horizontal reduction analysis function are performed
      several times, though it is enough to perform these checks only once
      during an initial attempt at adding a candidate for the reduction
      instruction/reduced value.
      
      Differential Revision: https://reviews.llvm.org/D29175
      
      llvm-svn: 293274
      4015bf83
    • Chandler Carruth's avatar
      [LICM] When we are recomputing the alias sets for a subloop, we cannot · fd2d7c72
      Chandler Carruth authored
      skip sub-subloops.
      
      The logic to skip subloops dated from when this code was shared with the
      cached case. Once it was factored out to only run in the case of
      recomputed subloops it became a dangerous bug. If a subsubloop contained
      an interfering instruction it would be silently skipped from the alias
      sets for LICM.
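      
      A toy illustration of the point above, assuming a simplified loop
      nest. `ToyLoop` and `collectAll` are hypothetical and unrelated to
      LLVM's Loop or AliasSetTracker classes.
      
      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>
      
      // Toy loop nest: each loop owns some instructions directly and may
      // contain nested subloops.
      struct ToyLoop {
        std::vector<std::string> OwnInsts;
        std::vector<ToyLoop> SubLoops;
      };
      
      // Collect every instruction in L, descending through all nesting
      // levels. Stopping after one level would silently drop instructions
      // that live in sub-subloops, which is the kind of omission the commit
      // describes.
      void collectAll(const ToyLoop &L, std::vector<std::string> &Out) {
        Out.insert(Out.end(), L.OwnInsts.begin(), L.OwnInsts.end());
        for (const ToyLoop &Sub : L.SubLoops)
          collectAll(Sub, Out);
      }
      ```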
      
      With the old pass manager this was extremely hard to trigger as it would
      require failing to visit these subloops with the LICM pass but then
      visiting the outer loop somehow. I've not yet contrived any test case
      that actually manages to trigger this.
      
      But with the new pass manager we don't do the cross-loop caching hack
      that the old PM does and so we recompute alias set information from
      first principles. While this seems much cleaner and simpler it exposed
      this bug and would subtly miscompile code due to failing to correctly
      model the aliasing constraints of deeply nested loops.
      
      llvm-svn: 293273
      fd2d7c72
    • Jonas Paulsson's avatar
      [DAGTypeLegalizer] Handle SIGN/ZERO_EXTEND in WidenVecRes_Convert(). · bb0ed3e7
      Jonas Paulsson authored
      In case of a SIGN/ZERO_EXTEND of an incomplete vector type (using only a
      partial number of available vector elements), WidenVecRes_Convert() used to
      resort to scalarization.
      
      This patch adds handling for the (common) case where an input vector of
      the same width as the widened result vector can be found, by converting
      the node to SIGN/ZERO_EXTEND_VECTOR_INREG.
      
      Review: Eli Friedman
      llvm-svn: 293268
      bb0ed3e7
    • Adam Nemet's avatar
      [opt-viewer] Introduce global context · 572fca71
      Adam Nemet authored
      This is necessary since globals (max_hotness, caller_loc) need to be
      explicitly passed to the subprocesses.
      
      llvm-svn: 293266
      572fca71
    • Adam Nemet's avatar
      [opt-viewer] Remove message from the key · 07f1264b
      Adam Nemet authored
      This is causing problems because the rendering of the text will depend on
      varying global state to show relative hotness or a link in the inlining
      context.
      
      llvm-svn: 293265
      07f1264b
    • Adam Nemet's avatar
      [opt-viewer] Unique across the different jobs as well · 41cf9b27
      Adam Nemet authored
      llvm-svn: 293264
      41cf9b27
    • Adam Nemet's avatar
      [opt-viewer] Make sorting for the index page deterministic · 4f075e3c
      Adam Nemet authored
      Break the tie between entries with identical hotness deterministically.
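      
      The idea of a deterministic tie-break can be sketched as below. The
      `Remark` fields and the choice of file and line as secondary keys are
      assumptions for illustration; the commit message does not say which
      key opt-viewer actually uses.
      
      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <string>
      #include <tuple>
      #include <vector>
      
      // Hypothetical remark record; the field names are illustrative only.
      struct Remark {
        int Hotness;
        std::string File;
        int Line;
      };
      
      // Sort hottest first. Entries with equal hotness fall back to file and
      // line, so the result no longer depends on the input order.
      void sortForIndex(std::vector<Remark> &Remarks) {
        std::sort(Remarks.begin(), Remarks.end(),
                  [](const Remark &A, const Remark &B) {
                    if (A.Hotness != B.Hotness)
                      return A.Hotness > B.Hotness;
                    return std::tie(A.File, A.Line) < std::tie(B.File, B.Line);
                  });
      }
      ```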
      
      llvm-svn: 293263
      4f075e3c
    • Adam Nemet's avatar
      [opt-viewer] Include the function in the remark key · 742615e5
      Adam Nemet authored
      Avoid uniquing remarks with a different inlining context (Function).
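      
      A rough sketch of what including the function in the key means. The
      `Remark` fields, `makeKey` and `countUnique` are hypothetical; the
      real opt-viewer key may contain different fields.
      
      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <set>
      #include <string>
      #include <tuple>
      #include <vector>
      
      // Hypothetical remark record; the field names are illustrative only.
      struct Remark {
        std::string Pass;
        std::string Name;
        std::string File;
        int Line;
        std::string Function; // inlining context
      };
      
      using Key =
          std::tuple<std::string, std::string, std::string, int, std::string>;
      
      // The function is part of the key, so two remarks at the same source
      // location but in different functions stay distinct.
      Key makeKey(const Remark &R) {
        return Key(R.Pass, R.Name, R.File, R.Line, R.Function);
      }
      
      // Count remarks after uniquing by key.
      std::size_t countUnique(const std::vector<Remark> &Remarks) {
        std::set<Key> Seen;
        for (const Remark &R : Remarks)
          Seen.insert(makeKey(R));
        return Seen.size();
      }
      ```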
      
      llvm-svn: 293262
      742615e5
    • Adam Nemet's avatar
      [opt-viewer] Put critical items in parallel · 55bfb497
      Adam Nemet authored
      Summary:
      Put opt-viewer critical items in parallel
      
      Patch by Brian Cain!
      
      Requires features from Python 2.7
      
      **Performance**
      Below are performance results across various configurations. These were taken on an i5-5200U (dual core + HT). They were taken with a small subset of the YAML output of building Python 3.6.0b3 with LTO+PGO. 60 YAML files.
      
      "multiprocessing" is the current submission contents. "baseline" is as of 544f14c6b2a07a94168df31833dba9dc35fd8289 (I think this is aka r287505).
      
      "ImportError" vs "class<...CLoader>" below are just confirming the expected configuration (with/without CLoader).
      
      The below was measured on AMD A8-5500B (4 cores) with 224 input YAML files, showing a ~1.75x speed increase over the baseline with libYAML.  I suspect it would scale well on high-end servers.
      
      ```
      **************************************** MULTIPROCESSING ****************************************
      PyYAML:
              Traceback (most recent call last):
                File "<string>", line 1, in <module>
              ImportError: cannot import name CLoader
              Python 2.7.10
      489.42user 5.53system 2:38.03elapsed 313%CPU (0avgtext+0avgdata 400308maxresident)k
      0inputs+31392outputs (0major+473540minor)pagefaults 0swaps
      
      PyYAML+libYAML:
              <class 'yaml.cyaml.CLoader'>
              Python 2.7.10
      78.69user 5.45system 0:32.63elapsed 257%CPU (0avgtext+0avgdata 398560maxresident)k
      0inputs+31392outputs (0major+542022minor)pagefaults 0swaps
      
      PyPy/PyYAML:
              Traceback (most recent call last):
                File "<builtin>/app_main.py", line 75, in run_toplevel
                File "<builtin>/app_main.py", line 601, in run_it
                File "<string>", line 1, in <module>
              ImportError: cannot import name 'CLoader'
              Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
              [PyPy 2.6.0 with GCC 4.9.3]
      154.27user 8.12system 0:53.83elapsed 301%CPU (0avgtext+0avgdata 627960maxresident)k
      808inputs+30376outputs (0major+727994minor)pagefaults 0swaps
      **************************************** BASELINE        ****************************************
      PyYAML:
              Traceback (most recent call last):
                File "<string>", line 1, in <module>
              ImportError: cannot import name CLoader
              Python 2.7.10
      358.08user 4.05system 6:08.37elapsed 98%CPU (0avgtext+0avgdata 315004maxresident)k
      0inputs+31392outputs (0major+85252minor)pagefaults 0swaps
      
      PyYAML+libYAML:
              <class 'yaml.cyaml.CLoader'>
              Python 2.7.10
      50.32user 3.30system 0:56.59elapsed 94%CPU (0avgtext+0avgdata 307296maxresident)k
      0inputs+31392outputs (0major+79335minor)pagefaults 0swaps
      
      PyPy/PyYAML:
              Traceback (most recent call last):
                File "<builtin>/app_main.py", line 75, in run_toplevel
                File "<builtin>/app_main.py", line 601, in run_it
                File "<string>", line 1, in <module>
              ImportError: cannot import name 'CLoader'
              Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
              [PyPy 2.6.0 with GCC 4.9.3]
      72.94user 5.18system 1:23.41elapsed 93%CPU (0avgtext+0avgdata 455312maxresident)k
      0inputs+30392outputs (0major+110280minor)pagefaults 0swaps
      
      ```
      
      Reviewers: fhahn, anemet
      
      Reviewed By: anemet
      
      Subscribers: llvm-commits, mehdi_amini
      
      Differential Revision: https://reviews.llvm.org/D26967
      
      llvm-svn: 293261
      55bfb497
    • Richard Trieu's avatar
      Fix unused variable warning. · 0b79aa33
      Richard Trieu authored
      llvm-svn: 293260
      0b79aa33
    • Saleem Abdulrasool's avatar
      ARM: fix vectorized division on WoA · 26c00e37
      Saleem Abdulrasool authored
      The Windows on ARM target uses custom division for normal division as
      the backend needs to insert division-by-zero checks.  However, it is
      designed to only handle non-vectorized division.  ARM has custom
      lowering for vectorized division as that can avoid loading registers
      with the values and invoking a division routine for each one, preferring
      to lower using NEON instructions.  Fall back to the custom lowering for
      the NEON instructions if we encounter a vectorized division.
      
      Resolves PR31778!
      
      llvm-svn: 293259
      26c00e37