  1. Jun 07, 2019
    • Jinsong Ji's avatar
      [MachineScheduler] checkResourceLimit boundary condition update · 7aafdef6
      Jinsong Ji authored
      When we call checkResourceLimit in bumpCycle or bumpNode and we
      know the resource count has just reached the limit (the equations
      are equal), we should return true to mark that we are resource
      limited for the next schedule. Otherwise we might continue to schedule
      in favor of latency for one more schedule and create a schedule that
      actually overbooks the resource.
      
      When we call checkResourceLimit to estimate the resource limit before
      scheduling, we don't need to return true even if the equations are
      equal, as that shouldn't limit the schedule.
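
The two cases described above differ only in whether equality counts as hitting the limit. A minimal sketch of that boundary condition follows; the parameter names and the exact arithmetic are assumptions for illustration, not a copy of the scheduler's code.

```cpp
#include <cassert>

// Hypothetical stand-in for the scheduler's check.
//   Count:   scaled resource usage so far
//   Latency: cycles consumed so far
//   LFactor: factor that converts cycles into resource units
bool checkResourceLimit(unsigned LFactor, unsigned Count, unsigned Latency,
                        bool AfterSchedNode) {
  assert(LFactor > 0 && "latency factor must be positive");
  int Excess = static_cast<int>(Count) - static_cast<int>(Latency * LFactor);
  // Called from bumpCycle/bumpNode: just reaching the limit already counts.
  if (AfterSchedNode)
    return Excess >= static_cast<int>(LFactor);
  // Estimating before scheduling: only strictly exceeding the limit counts.
  return Excess > static_cast<int>(LFactor);
}
```

With `LFactor = 2`, `Count = 10`, and `Latency = 4` the two sides are equal, so the call reports a limit only when `AfterSchedNode` is true.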
      
      Differential Revision: https://reviews.llvm.org/D62345
      
      llvm-svn: 362805
      7aafdef6
    • Stefan Stipanovic's avatar
      test-commit · 128e8e8f
      Stefan Stipanovic authored
      llvm-svn: 362802
      128e8e8f
    • David Bolvansky's avatar
      [NFC] Added tests for D63004 · 43f8ce44
      David Bolvansky authored
      llvm-svn: 362801
      43f8ce44
    • Matt Arsenault's avatar
      TailDuplicator: Remove no-op analyzeBranch call · 94a609e3
      Matt Arsenault authored
      This could fail, which looked concerning. However nothing was actually
      using the results of this. I assume this was intended to use the
      anti-feature of analyzeBranch of removing instructions, but wasn't
      actually calling it with AllowModify = true.
      
      Fixes bug 42162.
      
      llvm-svn: 362800
      94a609e3
    • Joerg Sonnenberger's avatar
      [NFC] Don't export helpers of ConstantFoldCall · b2e96169
      Joerg Sonnenberger authored
      llvm-svn: 362799
      b2e96169
    • Nico Weber's avatar
      llvm-lib: Disallow mixing object files with different machine types · d546b505
      Nico Weber authored
      lib.exe doesn't allow creating .lib files with object files that have
      differing machine types. Update llvm-lib to match.
      
      The motivation is to make it possible to infer the machine type of a
      .lib file in lld, so that it can warn when e.g. a 32-bit .lib file is
      passed to a 64-bit link (PR38965).
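
The rule can be sketched as a scan over the archive members. The `Machine` enum, the `Member` struct, and `firstMismatch` below are hypothetical simplifications, not llvm-lib's actual data structures.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for a COFF machine type and an
// archive member.
enum class Machine { X86, X64, ARM64 };

struct Member {
  std::string Name;
  Machine Type;
};

// Returns the name of the first member whose machine type differs from the
// first member's, or std::nullopt if the list is empty or all members agree.
std::optional<std::string> firstMismatch(const std::vector<Member> &Members) {
  for (const Member &M : Members)
    if (M.Type != Members.front().Type)
      return M.Name;
  return std::nullopt;
}
```

Once every member is known to share one type, that type can be treated as the machine type of the whole .lib file, which is what makes the lld warning possible.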
      
      Fixes PR38782.
      
      Differential Revision: https://reviews.llvm.org/D62913
      
      llvm-svn: 362798
      d546b505
    • Sanjay Patel's avatar
      [x86] narrow extract subvector of vector select · 6880bced
      Sanjay Patel authored
      This is a potentially large perf win for AVX1 targets because of the way we
      auto-vectorize to 256-bit but then expect the backend to legalize/optimize
      for the half-implemented AVX1 ISA.
      
      On the motivating example from PR37428 (even though this patch doesn't solve
      the vector shift issue):
      https://bugs.llvm.org/show_bug.cgi?id=37428
      ...there's a 16% speedup when compiling with "-mavx" (perf tested on Haswell)
      because we eliminate the remaining 256-bit vblendv ops.
      
      I added comments on a couple of tests that require further work. If we have
      256-bit logic ops separating the vselect and extract, we should probably narrow
      everything to 128-bit, but that requires a larger pattern match.
      
      Differential Revision: https://reviews.llvm.org/D62969
      
      llvm-svn: 362797
      6880bced
    • Nico Weber's avatar
      gn build: Merge r362766 · 0723c659
      Nico Weber authored
      llvm-svn: 362796
      0723c659
    • Nico Weber's avatar
      gn build: Merge r362774 · 9cf96046
      Nico Weber authored
      llvm-svn: 362795
      9cf96046
    • Nico Weber's avatar
      95dd67ac
    • Simon Tatham's avatar
      [ARM] Fix bugs introduced by the fp64/d32 rework. · 5d66f2b0
      Simon Tatham authored
      Change D60691 caused some knock-on failures that weren't caught by the
      existing tests. Firstly, selecting a CPU that should have had a
      restricted FPU (e.g. `-mcpu=cortex-m4`, which should have 16 d-regs
      and no double precision) could give the unrestricted version, because
      `ARM::getFPUFeatures` returned a list of features including subtracted
      ones (here `-fp64`,`-d32`), but `ARMTargetInfo::initFeatureMap` threw
      away all the ones that didn't start with `+`. Secondly, the
      preprocessor macros didn't reliably match the actual compilation
      settings: for example, `-mfpu=softvfp` could still set `__ARM_FP` as
      if hardware FP was available, because the list of features on the cc1
      command line would include things like `+vfp4`,`-vfp4d16` and clang
      didn't realise that one of those cancelled out the other.
      
      I've fixed both of these issues by rewriting `ARM::getFPUFeatures` so
      that it returns a list that enables every FP-related feature
      compatible with the selected FPU and disables every feature not
      compatible, which is more verbose but means clang doesn't have to
      understand the dependency relationships between the backend features.
      Meanwhile, `ARMTargetInfo::handleTargetFeatures` is testing for all
      the various forms of the FP feature names, so that it won't miss cases
      where it should have set `HW_FP` to feed into feature test macros.
      
      That in turn caused an ordering problem when handling `-mcpu=foo+bar`
      together with `-mfpu=something_that_turns_off_bar`. To fix that, I've
      arranged that the `+bar` suffixes on the end of `-mcpu` and `-march`
      cause feature names to be put into a separate vector which is
      concatenated after the output of `getFPUFeatures`.
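
The ordering described above can be sketched as follows. The feature list, the function names, and the "last mention wins" resolution are assumptions made for illustration; the real `ARM::getFPUFeatures` handles many more features.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, shortened universe of FP-related backend features.
const std::vector<std::string> AllFPFeatures = {"vfp2", "vfp3", "vfp4",
                                                "fp64", "d32"};

// Emits an explicit "+name" or "-name" for every FP feature, so no
// dependency knowledge is needed to interpret the list.
std::vector<std::string> fpuFeatures(const std::set<std::string> &Enabled) {
  std::vector<std::string> Out;
  for (const std::string &F : AllFPFeatures)
    Out.push_back((Enabled.count(F) ? "+" : "-") + F);
  return Out;
}

// Appends the "+bar" style extensions from -mcpu/-march after the FPU list.
std::vector<std::string>
buildFeatureList(const std::set<std::string> &FPUEnabled,
                 const std::vector<std::string> &CPUExtensions) {
  std::vector<std::string> Out = fpuFeatures(FPUEnabled);
  Out.insert(Out.end(), CPUExtensions.begin(), CPUExtensions.end());
  return Out;
}

// Resolves the list left to right; the last mention of a feature wins
// (an assumption of this sketch).
bool isEnabled(const std::vector<std::string> &Features,
               const std::string &Name) {
  bool On = false;
  for (const std::string &F : Features) {
    assert(F.size() > 1 && (F[0] == '+' || F[0] == '-'));
    if (F.compare(1, std::string::npos, Name) == 0)
      On = F[0] == '+';
  }
  return On;
}
```

Because the extension vector comes last, a `+fp64` suffix in this sketch overrides the `-fp64` that the FPU list emitted earlier.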
      
      Another side effect of all this is to fix a bug where `clang -target
      armv8-eabi` by itself would fail to set `__ARM_FEATURE_FMA`, even
      though `armv8` (aka Arm v8-A) implies FP-Armv8 which has FMA. That was
      because `HW_FP` was being set to a value including only the `FPARMV8`
      bit, but that feature test macro was testing only the `VFP4FPU` bit.
      Now `HW_FP` ends up with all the bits set, so it gives the right
      answer.
      
      Changes to tests included in this patch:
      
      * `arm-target-features.c`: I had to change basically all the expected
        results. (The Cortex-M4 test in there should function as a
        regression test for the accidental double-precision bug.)
      * `arm-mfpu.c`, `armv8.1m.main.c`: switched to using `CHECK-DAG`
        everywhere so that those tests are no longer sensitive to the order
        of cc1 feature options on the command line.
      * `arm-acle-6.5.c`: updated to expect the right answer to that
        FMA test.
      * `Preprocessor/arm-target-features.c`: added a regression test for
        the `mfpu=softvfp` issue.
      
      Reviewers: SjoerdMeijer, dmgreen, ostannard, samparker, JamesNagurne
      
      Reviewed By: ostannard
      
      Subscribers: srhines, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits
      
      Tags: #clang, #llvm
      
      Differential Revision: https://reviews.llvm.org/D62998
      
      llvm-svn: 362791
      5d66f2b0
    • Sam Elliott's avatar
      [RISCV] Support Bit-Preserving FP in F/D Extensions · f720647d
      Sam Elliott authored
      Summary:
      This allows some integer bitwise operations to instead be performed by
      hardware fp instructions. This is correct because the RISC-V spec
      requires the F and D extensions to use the IEEE-754 standard
      representation, and fp register loads and stores to be bit-preserving.
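
The equivalence this relies on can be spot-checked on a host machine. The sketch below assumes the host `float` is IEEE-754 binary32; the helper names are hypothetical, and the code only illustrates why a sign-bit XOR or AND on the integer representation matches a floating-point negate or absolute value.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Bit-preserving moves between the float and integer views.
std::uint32_t bitsOf(float F) {
  std::uint32_t B;
  std::memcpy(&B, &F, sizeof(B));
  return B;
}

float floatOf(std::uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof(F));
  return F;
}

// Integer forms: flip or clear the sign bit.
float negViaXor(float F) { return floatOf(bitsOf(F) ^ 0x80000000u); }
float absViaAnd(float F) { return floatOf(bitsOf(F) & 0x7fffffffu); }

// Spot-check that the integer forms agree bit for bit with the FP operations.
void checkBitwiseForms(float F) {
  assert(bitsOf(negViaXor(F)) == bitsOf(-F));
  assert(bitsOf(absViaAnd(F)) == bitsOf(std::fabs(F)));
}
```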
      
      This is tested against the soft-float ABI, but with hardware float
      extensions enabled, so that the tests also ensure the optimisation
      fires in this case.
      
      Reviewers: asb, luismarques
      
      Reviewed By: asb
      
      Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D62900
      
      llvm-svn: 362790
      f720647d
    • Valery Pykhtin's avatar
      [AMDGPU] Constrain the AMDGPU inliner on maximum number of basic blocks in a... · cb8de55f
      Valery Pykhtin authored
      [AMDGPU] Constrain the AMDGPU inliner on maximum number of basic blocks in a caller function (compile time performance)
      
      Differential revision: https://reviews.llvm.org/D62917
      
      llvm-svn: 362789
      cb8de55f
    • Dmitri Gribenko's avatar
      Work around a circular dependency between IR and MC introduced in r362735 · 5b3c9880
      Dmitri Gribenko authored
      I replaced the circular library dependency with a forward declaration,
      but it is only a workaround, not a real fix.
      
      llvm-svn: 362782
      5b3c9880
    • Cullen Rhodes's avatar
      [AArch64][AsmParser] error on unexpected SVE predicate type suffix · 1f0d2512
      Cullen Rhodes authored
      Summary:
      This patch fixes a bug in the assembler that permitted a type suffix on
      predicate registers when not expected. For instance, the following was
      previously valid:
      
          faddv h0, p0.q, z1.h
      
      This bug was present in all SVE instructions containing predicates with
      no type suffix and no predication form qualifier, i.e. /z or /m. The
      latter instructions are already caught with an appropriate error message
      by the assembler, e.g.:
      
                  .text
          <stdin>:1:13: error: not expecting size suffix
          cmpne p1.s, p0.b/z, z2.s, 0
                      ^
      
      A similar issue for SVE vector registers was fixed in:
      
        https://reviews.llvm.org/D59636
      
      Reviewed By: SjoerdMeijer
      
      Differential Revision: https://reviews.llvm.org/D62942
      
      llvm-svn: 362780
      1f0d2512
    • Cullen Rhodes's avatar
      [AArch64][AsmParser] Provide better diagnostics for SVE predicates · f7305484
      Cullen Rhodes authored
      Patch by Sander de Smalen (sdesmalen)
      
      Reviewed By: SjoerdMeijer
      
      Differential Revision: https://reviews.llvm.org/D62941
      
      llvm-svn: 362779
      f7305484
    • George Rimar's avatar
      [llvm-objcopy] - Emit error and don't crash if program header reaches past end of file. · 33044a7a
      George Rimar authored
      This is https://bugs.llvm.org/show_bug.cgi?id=42122.
      
      If an object file has a size less than a program header's file
      [offset + size] (i.e. if we have overflow), llvm-objcopy crashes instead
      of reporting an error.
      
      The patch fixes this issue.
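
A minimal sketch of such a bounds check is shown below. `segmentFitsInFile` is a hypothetical helper, not the function used in llvm-objcopy; it is written so that `Offset + Size` is never computed and therefore cannot wrap around.

```cpp
#include <cstdint>

// Returns true if the byte range [Offset, Offset + Size) lies entirely
// within a file of FileSize bytes.
bool segmentFitsInFile(std::uint64_t Offset, std::uint64_t Size,
                       std::uint64_t FileSize) {
  return Offset <= FileSize && Size <= FileSize - Offset;
}
```

A caller would report an error instead of reading the segment when this returns false.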
      
      Differential revision: https://reviews.llvm.org/D62898
      
      llvm-svn: 362778
      33044a7a
    • George Rimar's avatar
      [yaml2elf] - Refactoring followup for D62809 · eb394e93
      George Rimar authored
      This is a refactoring follow-up for D62809
      "Change how we handle implicit sections.".
      It allows the code to be simplified.
      
      Differential revision: https://reviews.llvm.org/D62912
      
      llvm-svn: 362777
      eb394e93
    • Pengfei Wang's avatar
      [X86] -march=cooperlake (llvm) · f8b28931
      Pengfei Wang authored
      Support Intel -march=cooperlake in LLVM
      
      Patch by Shengchen Kan (skan)
      
      Differential Revision: https://reviews.llvm.org/D62836
      
      llvm-svn: 362776
      f8b28931
    • Sam Parker's avatar
      Fix for lld buildbot · 67f9dc60
      Sam Parker authored
      Removed unused (in non-debug builds) variable.
      
      llvm-svn: 362775
      67f9dc60
    • Sam Parker's avatar
      [CodeGen] Generic Hardware Loop Support · c5ef502e
      Sam Parker authored
          
      Patch which introduces a target-independent framework for generating
      hardware loops at the IR level. Most of the code has been taken from
      PowerPC CTRLoops and PowerPC has been ported over to use this generic
      pass. The target dependent parts have been moved into
      TargetTransformInfo, via isHardwareLoopProfitable, with
      HardwareLoopInfo introduced to transfer information from the backend.
          
      Three generic intrinsics have been introduced:
      - void @llvm.set_loop_iterations
        Takes as a single operand, the number of iterations to be executed.
      - i1 @llvm.loop_decrement(anyint)
        Takes the maximum number of elements processed in an iteration of
        the loop body and subtracts this from the total count. Returns
        false when the loop should exit.
      - anyint @llvm.loop_decrement_reg(anyint, anyint)
        Takes the number of elements remaining to be processed as well as
        the maximum number of elements processed in an iteration of the loop
        body. Returns the updated number of elements remaining.
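
The behaviour of the three intrinsics can be modelled in plain C++. This is a software model under stated assumptions, not the pass itself: the `LoopCounter` type, the "exit when the count reaches zero" rule, and the restriction that a step never exceeds the remaining count are all simplifications of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of a hardware loop counter.
struct LoopCounter {
  std::uint32_t Remaining = 0;

  // Models llvm.set_loop_iterations: record the number of iterations.
  void setIterations(std::uint32_t N) { Remaining = N; }

  // Models llvm.loop_decrement: subtract the amount processed in this
  // iteration and return false when the loop should exit.
  bool decrement(std::uint32_t Step) {
    assert(Step <= Remaining && "sketch does not model overshoot");
    Remaining -= Step;
    return Remaining != 0;
  }
};

// Models llvm.loop_decrement_reg: return the updated remaining count.
std::uint32_t decrementReg(std::uint32_t Remaining, std::uint32_t Step) {
  assert(Step <= Remaining && "sketch does not model overshoot");
  return Remaining - Step;
}

// Runs an empty loop body N times in the do-while shape a hardware loop
// has, and reports how many times the body executed.
std::uint32_t countIterations(std::uint32_t N) {
  assert(N >= 1 && "a hardware loop body runs at least once");
  LoopCounter C;
  C.setIterations(N);
  std::uint32_t Executed = 0;
  do {
    ++Executed;
  } while (C.decrement(1));
  return Executed;
}
```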
      
      llvm-svn: 362774
      c5ef502e
    • Dylan McKay's avatar
      [AVR] Expand 16-bit rotations during the legalization stage · 04b418f2
      Dylan McKay authored
      In r356860, the legalization logic for BSWAP was modified to use
      ISD::ROTL, rather than the old ISD::{SHL, SRL, OR} nodes.
      
      This works fine on AVR for 8-bit rotations, but 16-bit rotations are
      currently unimplemented - they always trigger an assertion error in the
      AVRExpandPseudoInsts pass ("RORW unimplemented").
      
      This patch instructs the legalizer to expand 16-bit rotations into
      the SHL, SRL, OR pattern it used previously.
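
In source-level terms, the expanded form is a rotate built from two shifts and an OR, and a 16-bit byte swap is that rotate by 8. The sketch below illustrates the arithmetic only; the function names are hypothetical and it is not the legalizer code.

```cpp
#include <cassert>
#include <cstdint>

// 16-bit rotate left expressed with SHL, SRL, and OR only.
std::uint16_t rotl16(std::uint16_t X, unsigned Amt) {
  assert(Amt < 16 && "rotate amount must be below the bit width");
  if (Amt == 0)
    return X;
  // X is promoted to int, so the shifted value is truncated back to 16 bits.
  return static_cast<std::uint16_t>((X << Amt) | (X >> (16 - Amt)));
}

// A 16-bit byte swap is a rotate by 8 bits.
std::uint16_t bswap16(std::uint16_t X) { return rotl16(X, 8); }
```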
      
      This fixes the 'issue-cannot-select-bswap.ll' test. Interestingly, this
      test failure seems flaky - it passes successfully on the avr-build-01
      buildbot, but fails locally on my Arch Linux install.
      
      llvm-svn: 362773
      04b418f2
    • Michael Pozulp's avatar
      [NFC] Delete trailing whitespace character. · 65d1ff8e
      Michael Pozulp authored
      llvm-svn: 362772
      65d1ff8e
    • Michael Pozulp's avatar
      [llvm-objdump] Print source when subsequent lines in the translation unit come... · 767bdd55
      Michael Pozulp authored
      [llvm-objdump] Print source when subsequent lines in the translation unit come from the same line in two different headers.
      
      Reviewers: grimar, rupprecht, jhenderson
      
      Reviewed By: grimar, jhenderson
      
      Subscribers: llvm-commits, jhenderson
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D62461
      
      llvm-svn: 362771
      767bdd55
    • Michael Pozulp's avatar
      [llvm-objdump] Add warning if --disassemble-functions specifies an unknown symbol · 50f61af3
      Michael Pozulp authored
      Summary: Fixes Bug 41904 https://bugs.llvm.org/show_bug.cgi?id=41904
      
      Reviewers: jhenderson, rupprecht, grimar, MaskRay
      
      Reviewed By: jhenderson, rupprecht, MaskRay
      
      Subscribers: dexonsmith, rupprecht, kristina, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D62275
      
      llvm-svn: 362768
      50f61af3
    • Fangrui Song's avatar
      [MC][ELF] Don't create relocations with section symbols for STB_LOCAL ifunc · c841b9ab
      Fangrui Song authored
      We should keep the symbol type (STT_GNU_IFUNC) for a local ifunc because
      it may result in an IRELATIVE reloc that the dynamic loader will use to
      resolve the address at startup time.
      
      There is another problem that is not fixed by this patch: a PC relative
      relocation should also create a relocation with the ifunc symbol.
      
      llvm-svn: 362767
      c841b9ab
    • Michael Pozulp's avatar
      [ADT] Enable set_difference() to be used on StringSet · 0bddef79
      Michael Pozulp authored
      Subscribers: mgorny, mgrang, dexonsmith, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D62992
      
      llvm-svn: 362766
      0bddef79
    • Michael Pozulp's avatar
      [NFC] Test commit. · c7029e4e
      Michael Pozulp authored
      llvm-svn: 362763
      c7029e4e
    • Fangrui Song's avatar
      [LV] Fix -Wunused-function after r362736 · 19189993
      Fangrui Song authored
      llvm-svn: 362762
      19189993
    • Matt Arsenault's avatar
      AMDGPU: Don't count mask branch pseudo towards skip threshold · c0edb8f5
      Matt Arsenault authored
      llvm-svn: 362761
      c0edb8f5
    • Matt Arsenault's avatar
      AMDGPU: Insert skips for blocks with FLAT · 99ee81b1
      Matt Arsenault authored
      This already forced a skip for VMEM, so it should also be done for
      flat. I'm somewhat skeptical about the benefit of this though.
      
      llvm-svn: 362760
      99ee81b1
    • Nemanja Ivanovic's avatar
      [PowerPC] Exploit the vector min/max instructions · ef4a3aa5
      Nemanja Ivanovic authored
      Use the PPC vector min/max instructions for computing the corresponding
      operation as these should be faster than the compare/select sequences
      we currently emit.
      
      Differential revision: https://reviews.llvm.org/D47332
      
      llvm-svn: 362759
      ef4a3aa5
    • Matt Arsenault's avatar
      AMDGPU: Insert skip branches over return blocks · b6cfa129
      Matt Arsenault authored
      SIInsertSkips really doesn't understand the control flow, and makes
      very stupid assumptions about the block layout. This was able to get
      away with not skipping return blocks, since usually after
      structurization there is only one placed at the end of the
      function. Tail duplication can break this assumption.
      
      llvm-svn: 362754
      b6cfa129
    • David Tenty's avatar
      [NFC] Test commit, whitespace change · b82ea52b
      David Tenty authored
      As per the Developer Policy, upon obtaining commit access.
      
      llvm-svn: 362753
      b82ea52b
  2. Jun 06, 2019