  1. Feb 27, 2019
  2. Feb 26, 2019
    • [HotColdSplit] Disable splitting for sanitized functions · 73522d16
      Vedant Kumar authored
      Splitting can make sanitizer errors harder to understand, as the
      trapping instruction may not be in the function where the bug was
      detected.
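
The check described above could be sketched as follows. This is a self-contained illustration, not the code from the patch: `Attr`, `FunctionInfo`, and `mayBeSplit` are hypothetical stand-ins for LLVM's function attributes, and the commit message does not say which attributes count as "sanitized", so the four listed here are an assumption.

```cpp
#include <initializer_list>
#include <set>

// Hypothetical stand-in for function attributes; not the llvm::Attribute API.
enum class Attr {
  SanitizeAddress,
  SanitizeHWAddress,
  SanitizeThread,
  SanitizeMemory,
  Cold
};

// Hypothetical, minimal model of a function: just its attribute set.
struct FunctionInfo {
  std::set<Attr> attrs;
  bool has(Attr a) const { return attrs.count(a) != 0; }
};

// Returns false for any function carrying a sanitizer attribute, so that a
// trapping instruction stays in the function where the bug was detected.
// Which attributes to test is an assumption made for this sketch.
bool mayBeSplit(const FunctionInfo &f) {
  for (Attr a : {Attr::SanitizeAddress, Attr::SanitizeHWAddress,
                 Attr::SanitizeThread, Attr::SanitizeMemory}) {
    if (f.has(a))
      return false;
  }
  return true;
}
```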
      
      rdar://48142697
      
      llvm-svn: 354931
    • [PGO] Context sensitive PGO (part 1) · 35d2d513
      Rong Xu authored
      Current PGO profile counts are not context sensitive. The branch probabilities
      for the inlined functions are kept the same for all call-sites, and they might
      be very different from the actual branch probabilities. These suboptimal
      profiles can greatly affect some downstream optimizations, in particular for
      the machine basic block placement optimization.
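
To make the problem concrete, here is a small simulation of what the paragraph above describes. It is only a sketch: `BranchCounts`, `Profile`, `record`, and `simulate` are hypothetical names, and real PGO counters live in instrumented IR rather than in a struct like this. One branch is reached from two call sites with opposite behaviour, and the merged (context-insensitive) counts hide that difference.

```cpp
#include <array>
#include <cstdint>

// Hypothetical counter pair for a single conditional branch.
struct BranchCounts {
  std::uint64_t taken = 0;
  std::uint64_t notTaken = 0;
  double takenProbability() const {
    const std::uint64_t total = taken + notTaken;
    return total == 0 ? 0.0 : static_cast<double>(taken) / total;
  }
};

// Index of the call site the branch was reached from.
enum CallSite { SiteA = 0, SiteB = 1 };

struct Profile {
  BranchCounts merged;                 // one counter set shared by all call sites
  std::array<BranchCounts, 2> perSite; // what a per-context profile would see
};

// Records one execution of the branch `x < 0` reached from `site`.
void record(int x, CallSite site, Profile &p) {
  const bool taken = x < 0;
  BranchCounts &ctx = p.perSite[site];
  if (taken) {
    ++p.merged.taken;
    ++ctx.taken;
  } else {
    ++p.merged.notTaken;
    ++ctx.notTaken;
  }
}

// Site A always passes a positive value, site B always a negative one.
Profile simulate() {
  Profile p;
  for (int i = 0; i < 100; ++i) {
    record(+1, SiteA, p);
    record(-1, SiteB, p);
  }
  return p;
}
```

In this sketch the merged counters report the branch as taken half the time, while the per-site counters show it is never taken from site A and always taken from site B. That per-site view is the kind of information a post-inline profile can provide.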
      
      In this patch, we propose a post-inline PGO instrumentation/use pass,
      which we call Context Sensitive PGO (CSPGO). Users who want the best
      possible performance can perform a second round of PGO instrumentation/use
      on top of the regular PGO. They will then have two sets of profile counts.
      The first-pass profile is mainly for inlining, indirect-call promotion, and
      CGSCC simplification pass optimizations. The second-pass profile is for
      post-inline optimizations and code-gen optimizations.
      
      A typical usage:
      // Regular PGO instrumentation and generate pass1 profile.
      > clang -O2 -fprofile-generate source.c -o gen
      > ./gen
      > llvm-profdata merge default.*profraw -o pass1.profdata
      // CSPGO instrumentation.
      > clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate source.c -o gen2
      > ./gen2
      // Merge two sets of profiles
      > llvm-profdata merge default.*profraw pass1.profdata -o profile.profdata
      // Use the combined profile. Pass manager will invoke two PGO use passes.
      > clang -O2 -fprofile-use=profile.profdata source.c -o use
      
      This change touches many components in the compiler. The reviewed patch
      (D54175) will be committed in phases.
      
      Differential Revision: https://reviews.llvm.org/D54175
      
      llvm-svn: 354930
    • [CUDA][HIP] Check calling convention based on function target · fa49c3a8
      Yaxun Liu authored
      MSVC header files use vectorcall to differentiate overloaded functions, which
      causes failures for the AMDGPU target. This is because clang does not check
      the function calling convention based on the function target.
      
      This patch checks calling convention using the proper target info.
      
      Differential Revision: https://reviews.llvm.org/D57716
      
      llvm-svn: 354929
    • [OPENMP][CUDA]Do not emit warnings for variables in late-reported asm statements. · 305b6b96
      Alexey Bataev authored
      
      If the assembler instruction is not generated and the delayed diagnostic
      is emitted, we may end up with an extra warning message for variables used
      in the asm statement. Since the asm statement is not built, the
      variables may be left unreferenced, which may produce a warning about
      the use of uninitialized variables.
      
      llvm-svn: 354928
    • [X86] Add 'znver2' and 'cascadelake' support to __cpu_indicator_init. · 938d3f46
      Craig Topper authored
      For 'cascadelake' this adds an 'avx512vnni' feature check to the 0x55 skylake-avx512 model check. These CPUs use the same model number and differ only in the stepping number, but checking the feature flag is simpler than collecting all the stepping numbers.
      
      For 'znver2' this is just syncing with LLVM's Host.cpp.
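
The 'cascadelake' decision described above can be sketched in isolation. This is not the compiler-rt code: `subtypeForModel0x55` is a hypothetical helper, and the caller is assumed to have already decoded CPUID and established that the CPU reports model 0x55.

```cpp
#include <string>

// Hypothetical helper: chooses the CPU subtype name for a CPU that reports
// model 0x55. Both CPUs share that model number, so the AVX512VNNI feature
// flag is used to tell them apart instead of a list of stepping numbers.
std::string subtypeForModel0x55(bool hasAVX512VNNI) {
  return hasAVX512VNNI ? "cascadelake" : "skylake-avx512";
}
```

Keying on the feature flag means new steppings need no table update, at the cost of relying on the flag being reported correctly.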
      
      llvm-svn: 354927
    • [AMDGPU] Fixed hang during DAG combine · da1628eb
      Stanislav Mekhanoshin authored
      SITargetLowering::reassociateScalarOps() does not touch constants
      so that DAGCombiner::ReassociateOps() does not revert the combine.
      However, a global address is not a ConstantSDNode.
      
      Switched to the method used by DAGCombiner::ReassociateOps() itself
      to detect constants.
      
      Differential Revision: https://reviews.llvm.org/D58695
      
      llvm-svn: 354926
    • [OPENMP]Delay emission for unsupported va_arg expression. · ddc181d2
      Alexey Bataev authored
      If the OpenMP device is NVPTX and va_arg is used, delay emission of the
      error for va_arg unless it is used in the device code.
      
      llvm-svn: 354925