  1. May 03, 2021
  2. May 02, 2021
  3. May 01, 2021
    • [X32][CET] Fix size and alignment of .note.gnu.property section · f3050063
      Harald van Dijk authored
      X32 uses 32-bit ELF object files with 32-bit alignment, so the
      .note.gnu.property section needs to be emitted as it is for X86.
      
      Reviewed By: MaskRay
      
      Differential Revision: https://reviews.llvm.org/D101689
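
The size and alignment difference described above can be illustrated with a minimal sketch. It is not LLVM's emitter: `build_gnu_property_note` is a hypothetical helper, and the constants are the values commonly given for the GNU property note in the Linux x86 ABI documents. It assumes a little-endian target and a single `GNU_PROPERTY_X86_FEATURE_1_AND` property, padded to 4 bytes in a 32-bit ELF object (X86 and X32) and to 8 bytes in a 64-bit one.

```python
import struct

# Assumed constant values, as commonly documented for GNU property notes.
NT_GNU_PROPERTY_TYPE_0 = 5
GNU_PROPERTY_X86_FEATURE_1_AND = 0xC0000002
GNU_PROPERTY_X86_FEATURE_1_IBT = 0x1
GNU_PROPERTY_X86_FEATURE_1_SHSTK = 0x2


def build_gnu_property_note(features: int, is_64bit_elf: bool) -> bytes:
    """Build the bytes of a .note.gnu.property section (hypothetical helper).

    The property entry is padded to 8 bytes in 64-bit ELF objects and to
    4 bytes in 32-bit ELF objects, which is the case for both X86 and X32.
    """
    align = 8 if is_64bit_elf else 4
    # Property entry: pr_type, pr_datasz, then 4 bytes of feature flags.
    prop = struct.pack("<III", GNU_PROPERTY_X86_FEATURE_1_AND, 4, features)
    prop += b"\0" * (-len(prop) % align)  # pad the entry to the alignment
    name = b"GNU\0"
    # Note header: namesz, descsz, type.
    header = struct.pack("<III", len(name), len(prop), NT_GNU_PROPERTY_TYPE_0)
    return header + name + prop


cet = GNU_PROPERTY_X86_FEATURE_1_IBT | GNU_PROPERTY_X86_FEATURE_1_SHSTK
note_32 = build_gnu_property_note(cet, is_64bit_elf=False)  # X86 and X32
note_64 = build_gnu_property_note(cet, is_64bit_elf=True)   # x86-64
```

Under these assumptions the 32-bit note carries a 12-byte descriptor and the 64-bit note a 16-byte one, so an X32 object that used the 64-bit layout would have the wrong section size.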
    • [LVI] Handle mask not equal zero conditions · db9d00c5
      Nikita Popov authored
      If V & Mask != 0, we know that at least one of the bits in Mask
      must be set, so the value must be >= the lowest bit in Mask.
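
The bound above can be checked exhaustively for small bit widths. This is a minimal sketch of the reasoning only, not the LVI implementation: the function names are hypothetical, and values are treated as unsigned integers.

```python
def lowest_set_bit(mask: int) -> int:
    """Return the value of the lowest set bit in mask (0 if mask is 0)."""
    return mask & -mask


def bound_holds_for_mask(mask: int, bits: int = 8) -> bool:
    """Check that (v & mask) != 0 implies v >= lowest_set_bit(mask).

    Every unsigned value v representable in `bits` bits is tested.
    """
    low = lowest_set_bit(mask)
    return all(v >= low for v in range(1 << bits) if (v & mask) != 0)


# The bound holds for every non-zero 8-bit mask.
all_masks_ok = all(bound_holds_for_mask(m) for m in range(1, 256))
```

For example, with `Mask = 0b1100` any value that passes the condition has bit 2 or bit 3 set, so it is at least 4.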
    • Nikita Popov's avatar
      7aafd104
    • [X86] AMD Zen 3 Scheduler Model · 2b93c9c1
      Roman Lebedev authored
      Introduce a basic schedule model for AMD Zen 3 CPUs, a.k.a. `znver3`.
      
      This is fully built from scratch, from llvm-mca measurements
      and documented reference materials.
      Nothing was copied from `znver2`/`znver1`.
      
      I believe this is in a reasonable state of completion for inclusion,
      probably better than D52779 `bdver2` was :)
      
      Namely:
      * uops are pretty spot-on (at least what llvm-mca can measure)
        {F16422596}
      * latency is also pretty spot-on (at least what llvm-mca can measure)
        {F16422601}
      * throughput is within reason
        {F16422607}
      
      I haven't run many benchmarks with this,
      however RawSpeed benchmarks say this is beneficial:
      {F16603978}
      {F16604029}
      
      I'll call out the obvious problems there:
      * I didn't really bother with X87 instructions
      * I didn't really bother with obviously-microcoded/system instructions
      * There is a large discrepancy in throughput for `mr` and `rm` instructions.
        I'm not really sure if it's a modelling defect that needs to be fixed,
        or a defect of the measurements.
      * Pipe distributions are probably bad :)
        I can't do much here until AMD allows that to be fixed
        by documenting the appropriate counters and updating libpfm
      
      That being said, as @RKSimon notes:
      >>! In D94395#2647381, @RKSimon wrote:
      > I'll mention again that all the znver* models appear to be very inaccurate wrt SIMD/FPU instructions <...>
      so how much worse could this possibly be?!
      
      Things that aren't there:
      * Various tunings: zero idioms, etc. Those are follow-ups.
      
      Differential Revision: https://reviews.llvm.org/D94395
    • [SCEV] Simplify backedge count clearing (NFC) · cc58e891
      Nikita Popov authored
      This seems to be a leftover from when the BackedgeTakenInfo
      stored multiple exit counts with manual memory management. At
      some point this was switched to a simple vector, and there should
      be no need to micro-manage the clearing anymore. We can simply
      drop the loop from the map and let the destructor do its job.