  1. Mar 15, 2021
    • Markus Böck's avatar
      [test] Add ability to get error messages from CMake for errc substitution · af2796c7
      Markus Böck authored
      Visual Studio's implementation of the C++ Standard Library does not use strerror to produce the message for a std::error_code, unlike other standard libraries such as libstdc++ or libc++ that might be used.
      
      This patch adds a CMake script that runs a C++ program to get the error messages for the POSIX error codes and passes them on to lit through an optional config parameter.
      
      If the config parameter is not set, or getting the messages failed (for example, in a cross-compiling configuration without an emulator), lit falls back to Python's strerror function.
      
      Differential Revision: https://reviews.llvm.org/D98278
      af2796c7
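The approach described above (asking the C++ standard library itself for its message text) can be sketched as follows. This is a minimal illustration, not the program from the patch: the function name `collectErrcMessages` and the choice of error codes are assumptions made for the example.

```cpp
#include <string>
#include <system_error>
#include <utility>
#include <vector>

// Collect the standard library's own message text for a few POSIX error
// codes. The function name and the selection of codes are illustrative
// assumptions, not taken from the patch.
std::vector<std::pair<std::string, std::string>> collectErrcMessages() {
  const std::pair<const char *, std::errc> Codes[] = {
      {"ENOENT", std::errc::no_such_file_or_directory},
      {"EISDIR", std::errc::is_a_directory},
      {"EINVAL", std::errc::invalid_argument},
      {"EACCES", std::errc::permission_denied},
  };
  std::vector<std::pair<std::string, std::string>> Result;
  for (const auto &Entry : Codes) {
    // message() returns whatever text this standard library associates with
    // the code; the wording may differ between implementations.
    Result.emplace_back(Entry.first,
                        std::make_error_code(Entry.second).message());
  }
  return Result;
}
```

A build script could print each pair and forward the text to the test runner, so tests match the wording of the library actually in use.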
    • Jon Chesterfield's avatar
      [libomptarget] Fix devicertl build · bcb3f0f8
      Jon Chesterfield authored
      The target-specific functions in target_interface are extern C, but the
      implementations for nvptx mostly used C++ mangling. That worked out as
      a quirk of the DEVICE macro expanding to nothing, except for shuffle.h, which
      only forward declared the functions with C++ linkage.
      
      Also implements GetWarpSize, as used by shuffle, and includes target_interface
      in nvptx target_impl.cu to help catch future divergence between interface and
      implementation.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D98651
      bcb3f0f8
    • Michael Kruse's avatar
      [Polly] Fix deprecation warning. NFC. · 9c486eb3
      Michael Kruse authored
      IRBuilder::CreateLoad without type parameter was deprecated in 6312c538
      to prepare for opaque pointers.
      9c486eb3
    • Wenlei He's avatar
      [CSSPGO] Load context profile for external functions in PreLink and populate ThinLTO import list · a5d30421
      Wenlei He authored
      For ThinLTO's prelink compilation, we need to put external inline candidates into an import list attached to function's entry count metadata. This enables ThinLink to treat such cross module callee as hot in summary index, and later helps postlink to import them for profile guided cross module inlining.
      
      For AutoFDO, the import list is retrieved by traversing the nested inlinee functions. For CSSPGO, since the profile is flattened, a few things need to happen for it to work:
      
       - When loading an input profile in extended binary format, we need to load all child context profiles whose parent is in the current module, so that the context trie for the current module includes potential cross module inlinees.
       - In order to make the above happen, we need to know whether the input profile is a CSSPGO profile before we start reading function profiles, hence a flag for the profile summary section is added.
       - When searching for cross module inline candidates, we need to walk through the context trie instead of the nested inlinee profile (callsite sample of AutoFDO profile).
       - Now that we have more accurate counts with CSSPGO, we switched to using entry count instead of total count to decide if an external callee is potentially beneficial to inline. This makes it consistent with how we determine whether a call target is a potential inline candidate.
      
      Differential Revision: https://reviews.llvm.org/D98590
      a5d30421
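The context trie walk described above can be sketched roughly as follows. Everything here is a hypothetical simplification: `ContextNode`, `collectImportCandidates`, and the threshold parameter are names invented for the illustration and are not the data structures used by the patch. The sketch also assumes that a cold callee is not descended into, which is a design choice of this example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a context trie node: one node per
// calling context, with the callees seen in that context as children.
struct ContextNode {
  std::string FuncName;
  std::uint64_t EntryCount;
  bool DefinedInModule;
  std::vector<ContextNode> Children;
};

// Recursive helper: visit callees whose entry count reaches the threshold.
static void walk(const ContextNode &Node, std::uint64_t HotThreshold,
                 std::vector<std::string> &Out) {
  for (const ContextNode &Child : Node.Children) {
    if (Child.EntryCount < HotThreshold)
      continue; // Assumption of this sketch: cold callees are not explored.
    if (!Child.DefinedInModule)
      Out.push_back(Child.FuncName); // Hot and external: import candidate.
    walk(Child, HotThreshold, Out);
  }
}

// Return the names of external functions that look worth importing, judged
// by entry count rather than total count.
std::vector<std::string> collectImportCandidates(const ContextNode &Root,
                                                 std::uint64_t HotThreshold) {
  std::vector<std::string> Out;
  walk(Root, HotThreshold, Out);
  return Out;
}
```

The point of the sketch is only the traversal shape: candidates are found by following contexts in the trie, not by walking nested inlinee samples.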
    • Jianzhou Zhao's avatar
      [dfsan] Updated check_custom_wrappers.sh to dedup function names · 9cf5220c
      Jianzhou Zhao authored
      The origin wrappers added by https://reviews.llvm.org/D98359 reuse
      those __dfsw_ functions.
      9cf5220c
    • Fangrui Song's avatar
      Change void getNoop(MCInst &NopInst) to MCInst getNop() · 5d44c92b
      Fangrui Song authored
      Prefer (self-documenting) return values to output parameters (which are
      liable to be misused).
      While here, rename Noop to Nop which is more widely used and improves
      consistency with hasEmitNops/setEmitNops/emitNop/etc.
      5d44c92b
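The difference between the two signatures can be illustrated with a small sketch. `Inst` and `NopOpcode` are hypothetical stand-ins invented for this example; they are not the real MCInst class or a real opcode.

```cpp
// Hypothetical stand-in for an instruction type; not the real MCInst.
struct Inst {
  unsigned Opcode = 0;
};

// Illustrative opcode value only.
constexpr unsigned NopOpcode = 0x90;

// Old style: the caller must create an object first, and nothing at the call
// site shows that the argument is about to be overwritten.
void getNoop(Inst &NopInst) { NopInst.Opcode = NopOpcode; }

// New style: the result is the return value, so the call site documents
// itself and there is no half-initialized object to pass around.
Inst getNop() {
  Inst Nop;
  Nop.Opcode = NopOpcode;
  return Nop;
}
```

With the second form a caller can write `Inst Nop = getNop();` in one step.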
    • Jez Ng's avatar
      [lld-macho] Place LC_FUNCTION_STARTS data at the right position · 29d46760
      Jez Ng authored
      This satisfies codesign, which otherwise complains about "function starts data out of place".
      
      Reviewed By: #lld-macho, smeenai
      
      Differential Revision: https://reviews.llvm.org/D98648
      29d46760
    • Jianzhou Zhao's avatar
      [dfsan] Do not check dfsan_get_origin by check_custom_wrappers.sh · 57a532b3
      Jianzhou Zhao authored
      It is implemented like dfsan_get_label, and does not have any code
      in dfsan_custom.cpp.
      57a532b3
    • Craig Topper's avatar
      [RISCV] Add RISCVISD::BR_CC similar to RISCVISD::SELECT_CC. · 41759c3d
      Craig Topper authored
      This allows me to introduce similar combines for branches as
      we have recently added for SELECT_CC. Some of them are less
      useful for standalone setccs and only help branch instructions.
      By having a BR_CC node it's easier to only affect branches.
      
      I'm using CondCodeSDNode to make isel patterns easier to
      write so we can refer to the codes by name. SELECT_CC uses a
      constant instead.
      
      I've translated the condition code just like SELECT_CC so
      we need fewer patterns for the swapped conditions. This
      includes special cases for X < 1 and X > -1 that get translated
      to blez and bgez by using a 0 constant.
      
      computeKnownBitsForTargetNode support for SELECT_CC is added
      to allow MaskedValueIsZero to work for cases where the true
      and false values of the SELECT_CC are setccs and the
      result of the SELECT_CC is used by a BR_CC. This was needed
      to avoid regressions in some of the overflow tests.
      
      Reviewed By: luismarques
      
      Differential Revision: https://reviews.llvm.org/D98159
      41759c3d
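The idea of translating condition codes so that fewer patterns are needed can be sketched as follows. The `Cond` enum, `Branch` struct, and `normalize` function are hypothetical models made up for this illustration; they are not the SelectionDAG types. The sketch relies on the RISC-V base ISA providing blt and bge as branch instructions, while "greater than" and "less or equal" forms are obtained by swapping operands.

```cpp
#include <utility>

// Hypothetical model of signed integer comparison codes.
enum class Cond { EQ, NE, LT, GE, GT, LE };

// Hypothetical model of a compare-and-branch with two integer operands.
struct Branch {
  Cond CC;
  int LHS;
  int RHS;
};

// Rewrite GT and LE in terms of LT and GE by swapping the operands:
//   a >  b  is the same as  b <  a
//   a <= b  is the same as  b >= a
// After this step only EQ, NE, LT, and GE need patterns.
Branch normalize(Branch B) {
  switch (B.CC) {
  case Cond::GT:
    std::swap(B.LHS, B.RHS);
    B.CC = Cond::LT;
    break;
  case Cond::LE:
    std::swap(B.LHS, B.RHS);
    B.CC = Cond::GE;
    break;
  default:
    break;
  }
  return B;
}

// Reference evaluation, used to check that normalization keeps the meaning.
bool evaluate(const Branch &B) {
  switch (B.CC) {
  case Cond::EQ: return B.LHS == B.RHS;
  case Cond::NE: return B.LHS != B.RHS;
  case Cond::LT: return B.LHS < B.RHS;
  case Cond::GE: return B.LHS >= B.RHS;
  case Cond::GT: return B.LHS > B.RHS;
  case Cond::LE: return B.LHS <= B.RHS;
  }
  return false;
}
```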
    • Jon Chesterfield's avatar
      [libomptarget] Drop assert.h, use freestanding for amdgcn devicertl · f675b3df
      Jon Chesterfield authored
      Promotes the runtime assert to a link time error for the unimplemented
      fallback functions. Enables amdgcn to build with only clang provided
      headers, which makes it less likely to break other builds when enabled.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D98649
      f675b3df
    • Philipp Tomsich's avatar
      [RISCV] Add isel-patterns to optimize (a < 1) into blez (a <= 0) · 018e96f7
      Philipp Tomsich authored
      The following code-sequence showed up in a testcase (isolated from
      SPEC2017) for if-conversion and vectorization when searching for the
      maximum in an array:
              addi    a2, zero, 1
              blt     a1, a2, .LBB0_5
      which can be expressed as `bge zero,a1,.LBB0_5`/`blez a1,.LBB0_5`.
      
      More generally, we want to express (a < 1) as (a <= 0).
      
      This adds the required isel-pattern and updates the testcases.
      
      Reviewed By: craig.topper
      
      Differential Revision: https://reviews.llvm.org/D98449
      018e96f7
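The rewrite rests on a simple integer identity: for integers, a < 1 holds exactly when a <= 0. A small sketch that states both forms side by side (the function names are made up for this example):

```cpp
#include <cstdint>

// Original form of the comparison: needs the constant 1 in a register.
bool lessThanOne(std::int64_t A) { return A < 1; }

// Equivalent form: compares against zero, which RISC-V has as a register.
bool lessOrEqualZero(std::int64_t A) { return A <= 0; }
```

The equivalence only holds for integers. For floating-point values it would be wrong, since 0.5 is less than 1 but not less than or equal to 0.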
    • Michael Kruse's avatar
      [Polly][Optimizer] Apply user-directed unrolling. · 3f170eb1
      Michael Kruse authored
      Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.
      
      While not that useful by itself (there already is an unroll pass), it introduces a mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. ISL's rescheduling would discard the manual transformations, and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.
      
      This does not influence the SCoP detection heuristic. As a consequence, loops that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.
      
      Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the ones supported by clang's LoopHint. See the `unroll_double.ll` test for an example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.
      
      Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.
      
      Reviewed By: sebastiankreutzer
      
      Differential Revision: https://reviews.llvm.org/D97977
      3f170eb1
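For context, the unrolling metadata mentioned above is what clang attaches to a loop when the source carries a loop hint. A minimal source-level sketch, assuming clang as the compiler (other compilers may ignore the pragma, possibly with a warning); the function `sumArray` is made up for this example, and whether Polly picks the loop up still depends on SCoP detection as described in the commit message:

```cpp
#include <cstddef>

// Sum an array. The pragma asks clang to unroll the loop by a factor of 4,
// which clang records as llvm.loop.unroll.count metadata on the loop.
long sumArray(const int *Data, std::size_t N) {
  long Sum = 0;
#pragma clang loop unroll_count(4)
  for (std::size_t I = 0; I < N; ++I)
    Sum += Data[I];
  return Sum;
}
```

The pragma only changes how the loop is compiled; the result of the function is the same with or without it.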
    • Stelios Ioannou's avatar
      [AArch64] Implement __rndr, __rndrrs intrinsics · ab86edbc
      Stelios Ioannou authored
      This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
      number instructions introduced in Armv8.5-A. They are only defined for the AArch64
      execution state and are available when __ARM_FEATURE_RNG is defined.
      
      These intrinsics store the random number in their pointer argument and return a status
      code indicating whether the generation succeeded. The difference between __rndr and __rndrrs is that
      the latter intrinsic reseeds the random number generator.
      
      The instructions write the NZCV flags indicating the success of the operation that we can
      then read with a CSET.
      
      [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
      [2] https://bugs.llvm.org/show_bug.cgi?id=47838
      
      Differential Revision: https://reviews.llvm.org/D98264
      
      Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
      ab86edbc
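A guarded usage sketch of the intrinsic follows. The wrapper `tryHardwareRandom` is a hypothetical helper written for this example. It assumes the ACLE convention that `__rndr` returns 0 on success and a non-zero value on failure; on targets where __ARM_FEATURE_RNG is not defined, the wrapper simply reports failure.

```cpp
#include <cstdint>

#if defined(__ARM_FEATURE_RNG)
#include <arm_acle.h>
#endif

// Hypothetical wrapper: returns true and writes Out only if a hardware
// random number was produced. Out is left untouched on failure.
bool tryHardwareRandom(std::uint64_t &Out) {
#if defined(__ARM_FEATURE_RNG)
  std::uint64_t Value = 0;
  // Assumption: a return value of 0 means the random number was generated.
  if (__rndr(&Value) == 0) {
    Out = Value;
    return true;
  }
  return false;
#else
  (void)Out; // The intrinsic is unavailable on this target.
  return false;
#endif
}
```

A caller would typically fall back to another entropy source when the function returns false.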
    • Alex Zinenko's avatar
      [mlir] fix SPIR-V CPU and Vulkan runners after e2310704 · b868a3ed
      Alex Zinenko authored
      The commit in question changed the syntax but did not update the runner
      tests. This also required registering the MemRef dialect for custom
      parser to work correctly.
      b868a3ed
    • serge-sans-paille's avatar
      Allow __ieee128 as an alias to __float128 on ppc · 4aa510be
      serge-sans-paille authored
      This matches gcc behavior.
      
      Differential Revision: https://reviews.llvm.org/D97846
      4aa510be
    • serge-sans-paille's avatar
      [NFC] Use higher level constructs to check for whitespace/newlines in the lexer · 9628cb1f
      serge-sans-paille authored
      It turns out that according to valgrind and perf, it's also slightly faster.
      
      Differential Revision: https://reviews.llvm.org/D98637
      9628cb1f
    • Luke Drummond's avatar
      [OpenCL] Respect calling convention for builtin · fcfd3fda
      Luke Drummond authored
      `__translate_sampler_initializer` has a calling convention of
      `spir_func`, but clang generated calls to it using the default CC.
      
      Instruction Combining was lowering these mismatching calling conventions
      to `store i1* undef` which itself was subsequently lowered to a trap
      instruction by SimplifyCFG, resulting in a runtime `SIGILL`.
      
      There are arguably two bugs here, but whether there's any wisdom in
      converting an obviously invalid call into a runtime crash over aborting
      with a sensible error message will require further discussion. So for
      now it's enough to set the right calling convention on the runtime
      helper.
      
      Reviewed By: svenh, bader
      
      Differential Revision: https://reviews.llvm.org/D98411
      fcfd3fda
    • Martin Storsjö's avatar
      [libcxx] [test] Fix the temp_directory_path test for windows · b5e228fc
      Martin Storsjö authored
      Check a different set of env vars, and don't check the exact value
      of the fallback path. (GetTempPath falls back to returning the Windows
      folder if nothing better is available in env vars.)
      
      The test still fails one check on windows (due to relying on perms::none),
      which will be addressed separately.
      
      Differential Revision: https://reviews.llvm.org/D98139
      b5e228fc
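For reference, the function under test can be queried without exceptions through its error_code overload. This is a minimal sketch assuming a C++17 standard library with std::filesystem; the helper name `queryTempDir` is made up for the example, and the path returned depends on the platform and its environment variables.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper: returns true and fills Out if the standard library
// could determine a temporary directory, false otherwise.
bool queryTempDir(std::filesystem::path &Out) {
  std::error_code EC;
  std::filesystem::path P = std::filesystem::temp_directory_path(EC);
  if (EC)
    return false; // For example, the candidate path is not a directory.
  Out = P;
  return true;
}
```

Because the fallback location differs between platforms, a portable check can only assert properties of the result (such as it being non-empty), not an exact path.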
  2. Mar 16, 2021
  3. Mar 15, 2021