  1. Jun 12, 2020
    • Florian Hahn's avatar
      [VPlan] Reject loops without computable backedge taken counts · 3a846d4d
      Florian Hahn authored
      getOrCreateTripCount is used to generate code for the outer loop, but it
      requires a computable backedge-taken count. Check that in the VPlan
      native path.
      
      Reviewers: Ayal, gilr, rengolin, sguggill
      
      Reviewed By: sguggill
      
      Differential Revision: https://reviews.llvm.org/D81088
      3a846d4d
    • Sebastian Neubauer's avatar
      [AMDGPU] Add G16 support to image instructions · 29a6ad94
      Sebastian Neubauer authored
      Add G16 feature for GFX10 and support A16 and G16 in GlobalISel.
      
      Differential Revision: https://reviews.llvm.org/D76836
      29a6ad94
    • Georgii Rymar's avatar
      [yaml2obj][MachO] - Fix PubName/PubType handling. · d95f8e7a
      Georgii Rymar authored
      `PubName` and `PubType` are optional fields since D80722.
      
      They are defined as:
        Optional<PubSection> PubNames;
        Optional<PubSection> PubTypes;
      
      And initialized in the following way:
        IO.mapOptional("debug_pubnames", DWARF.PubNames);
        IO.mapOptional("debug_pubtypes", DWARF.PubTypes);
      
      The problem is that, because of an issue in `YAMLTraits.cpp`,
      when there are no `debug_pubnames`/`debug_pubtypes` keys in a YAML description,
      they are not initialized to `Optional::None` as the code expects, but
      to default `PubSection()` instances.
      
      Because of this, the `if` condition in the following code is always true:
      
      if (Obj.DWARF.PubNames)
        Err = DWARFYAML::emitPubSection(OS, *Obj.DWARF.PubNames,
                                        Obj.IsLittleEndian);
      
      This means `emitPubSection` is always called and writes a few values.
      
      This patch fixes the issue. I've reduced `sizeofcmds` by the size of the data
      previously written because of this bug.
      
      Differential revision: https://reviews.llvm.org/D81686
      d95f8e7a
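The difference between an absent optional and a default-constructed value can be sketched as follows. This is only an illustration: it uses `std::optional` as a stand-in for LLVM's `Optional`, and the `PubSection` struct and `emitIfPresent` helper are hypothetical simplifications, not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for DWARFYAML::PubSection.
struct PubSection {
  std::uint32_t Length = 0;
};

// Mirrors the `if (Obj.DWARF.PubNames)` guard: emit only when a value is
// present. Returns the number of bytes appended to Out.
std::size_t emitIfPresent(const std::optional<PubSection> &Sec,
                          std::vector<std::uint8_t> &Out) {
  if (!Sec)
    return 0; // Key absent: nothing is written.
  const std::size_t Before = Out.size();
  // Write the 4-byte length field in little-endian order.
  for (unsigned I = 0; I < 4; ++I)
    Out.push_back(static_cast<std::uint8_t>((Sec->Length >> (8 * I)) & 0xFF));
  assert(Out.size() - Before == 4 && "a present section writes its length");
  return Out.size() - Before;
}
```

Passing `std::nullopt` writes nothing, while passing a default `PubSection()` still writes four bytes. That matches the symptom described above: an optional that is wrongly engaged makes the guard always true.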
    • Chen Zheng's avatar
      [PowerPC] refactor convertToImmediateForm - NFC · 9b6e86a1
      Chen Zheng authored
      This is an NFC patch to make convertToImmediateForm a light wrapper
      for converting xform and imm form instructions on PowerPC.
      
      Reviewed By: Steven.zhang
      
      Differential Revision: https://reviews.llvm.org/D80907
      9b6e86a1
    • EgorBo's avatar
      [InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0" · 012909dc
      EgorBo authored
      Summary:
      "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power of two).
      However, "X % Y" can also be written as "X - (X / Y) * Y", and the equivalent expression
      "X - (X / C) * C == 0" is not currently optimized to "X & C-1 == 0"; see godbolt: https://godbolt.org/z/KzuXUj
      
      This is my first contribution to LLVM so I hope I didn't mess things up
      
      Reviewers: lebedev.ri, spatel
      
      Reviewed By: lebedev.ri
      
      Subscribers: hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D79369
      012909dc
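The equivalence behind this fold can be checked with a small sketch. It assumes unsigned 32-bit operands and a non-zero power-of-two C; the function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Original form: the remainder spelled out as X - (X / C) * C.
bool isMultipleViaDivMul(std::uint32_t X, std::uint32_t C) {
  assert(C != 0 && (C & (C - 1)) == 0 && "C must be a power of two");
  return X - (X / C) * C == 0;
}

// Folded form: for a power-of-two C the remainder is the low bits of X.
// Note that C++ needs the parentheses around X & (C - 1).
bool isMultipleViaMask(std::uint32_t X, std::uint32_t C) {
  assert(C != 0 && (C & (C - 1)) == 0 && "C must be a power of two");
  return (X & (C - 1)) == 0;
}

// Compares both forms for every X in [0, Limit).
bool formsAgree(std::uint32_t C, std::uint32_t Limit) {
  for (std::uint32_t X = 0; X < Limit; ++X)
    if (isMultipleViaDivMul(X, C) != isMultipleViaMask(X, C))
      return false;
  return true;
}
```

For example, `formsAgree(8, 1000)` compares the two forms for every X below 1000 with C = 8.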
    • Jonas Devlieghere's avatar
      [llvm/Object] Reimplement basic_symbol_iterator in TapiFile · 425c6f07
      Jonas Devlieghere authored
      Use indices into the Symbols vector instead of casting the objects in
      the vector and dereferencing std::vector::end().
      
      This change is NFC modulo the Windows failure reported by
      llvm-clang-x86_64-expensive-checks-win.
      
      Differential revision: https://reviews.llvm.org/D81717
      425c6f07
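The idea of iterating by index rather than by pointer into the vector can be sketched as below. `SymbolTable` is a hypothetical, simplified class used only for illustration; it is not the real TapiFile interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified symbol container.
class SymbolTable {
  std::vector<std::string> Symbols;

public:
  explicit SymbolTable(std::vector<std::string> S) : Symbols(std::move(S)) {}

  // The iteration state is a plain index, so the end position is just
  // Symbols.size() and is never dereferenced.
  std::size_t beginIndex() const { return 0; }
  std::size_t endIndex() const { return Symbols.size(); }

  std::size_t next(std::size_t I) const {
    assert(I < Symbols.size() && "cannot advance past the end");
    return I + 1;
  }

  const std::string &nameAt(std::size_t I) const {
    assert(I < Symbols.size() && "index out of range");
    return Symbols[I];
  }
};

// Walks the table from begin to end using indices only.
std::size_t countSymbols(const SymbolTable &T) {
  std::size_t N = 0;
  for (std::size_t I = T.beginIndex(); I != T.endIndex(); I = T.next(I))
    ++N;
  return N;
}
```

With this shape an empty table is handled naturally: `beginIndex()` equals `endIndex()` and no element is ever touched.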
    • Kristof Beyls's avatar
      [AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions. · c35ed40f
      Kristof Beyls authored
      To make sure that no barrier gets placed on the architectural execution
      path, each
        BLR x<N>
      instruction gets transformed to a
        BL __llvm_slsblr_thunk_x<N>
      instruction, with __llvm_slsblr_thunk_x<N> a thunk that contains
      __llvm_slsblr_thunk_x<N>:
        BR x<N>
        <speculation barrier>
      
      Therefore, the BLR instruction gets split into 2; one BL and one BR.
      This transformation results in not inserting a speculation barrier on
      the architectural execution path.
      
      The mitigation is off by default and can be enabled by the
      harden-sls-blr subtarget feature.
      
      As a linker is allowed to clobber X16 and X17 on function calls, the
      above code transformation would not be correct in case a linker does so
      when N=16 or N=17. Therefore, when the mitigation is enabled, generation
      of BLR x16 or BLR x17 is avoided.
      
      As BLRA* indirect calls are not produced by LLVM currently, this does
      not aim to implement support for those.
      
      Differential Revision:  https://reviews.llvm.org/D81402
      c35ed40f
    • Yevgeny Rouban's avatar
      [JumpThreading] Handle zero !prof branch_weights · 707836ed
      Yevgeny Rouban authored
      Avoid division by zero in updatePredecessorProfileMetadata().
      
      Reviewers: yamauchi
      Tags: #llvm
      Differential Revision: https://reviews.llvm.org/D81499
      707836ed
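A guard of this kind might look like the following sketch. The helper name and the use of `double` are assumptions made for illustration; the actual updatePredecessorProfileMetadata() code is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: returns the probability of the "true" edge, or
// std::nullopt when both weights are zero and no ratio can be formed.
std::optional<double> takenProbability(std::uint32_t TrueWeight,
                                       std::uint32_t FalseWeight) {
  // Widen before adding so the sum cannot wrap.
  const std::uint64_t Total =
      static_cast<std::uint64_t>(TrueWeight) + FalseWeight;
  if (Total == 0)
    return std::nullopt; // Avoid dividing by zero.
  const double P =
      static_cast<double>(TrueWeight) / static_cast<double>(Total);
  assert(P >= 0.0 && P <= 1.0 && "probability out of range");
  return P;
}
```

A caller would skip the metadata update when the result is empty, instead of computing a ratio from all-zero weights.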
    • Craig Topper's avatar
      [X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature... · 0ce9bf6e
      Craig Topper authored
      [X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature bits from the correct 32-bit feature variable.
      
      We have three 32-bit variables containing feature bits, but our
      enum is a flat 96-bit space. So we need to pick which of the
      variables to use based on the bit value. We used to do this
      manually by mentioning the correct variable and subtracting an
      offset from the enum, but this is error-prone.
      0ce9bf6e
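A helper lambda of the kind described could be sketched like this. The enum values, parameter names, and the `hasFeatureBit` wrapper are hypothetical and chosen only to illustrate selecting the right 32-bit word from a flat bit index.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flat feature enum: each value is a bit position in a
// 96-bit space spread over three 32-bit words.
enum ProcessorFeature : unsigned {
  FEATURE_LOW = 3,       // lives in the first word
  FEATURE_MID = 32 + 5,  // lives in the second word
  FEATURE_HIGH = 64 + 1  // lives in the third word
};

bool hasFeatureBit(std::uint32_t Features, std::uint32_t Features2,
                   std::uint32_t Features3, unsigned Bit) {
  assert(Bit < 96 && "bit outside the 96-bit feature space");
  // The lambda picks the word and subtracts the offset in one place, so
  // callers never name a word or compute an offset by hand.
  auto testFeature = [&](unsigned F) {
    if (F < 32)
      return (Features & (1U << F)) != 0;
    if (F < 64)
      return (Features2 & (1U << (F - 32))) != 0;
    return (Features3 & (1U << (F - 64))) != 0;
  };
  return testFeature(Bit);
}
```

Centralizing the word selection means that a mismatch between the variable and the offset can no longer be introduced at individual call sites.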
    • Vitaly Buka's avatar
      [StackSafety] Fix byval handling · 99930732
      Vitaly Buka authored
      We don't need to process parameters that are marked as
      byval, as we are not going to pass the allocas of interest
      without copying.
      
      If we pass a value into a byval argument, we just handle that
      as a Load of the corresponding type and stop that branch of analysis.
      99930732
    • Yonghong Song's avatar
      [BPF] fix incorrect type in BPFISelDAGToDAG readonly load optimization · 4db18781
      Yonghong Song authored
      In the BPF Instruction Selection DAGToDAG transformation phase, the
      BPF backend had an optimization to turn a load from the readonly data
      section into a direct load of the values. This phase was implemented
      before libbpf had readonly section support and before alu32
      was supported.
      
      This phase, however, may generate an incorrect type when alu32 is
      enabled. The following is an example:
        -bash-4.4$ cat ~/tmp2/t.c
        struct t {
          unsigned char a;
          unsigned char b;
          unsigned char c;
        };
        extern void foo(void *);
        int test() {
          struct t v = {
            .b = 2,
          };
          foo(&v);
          return 0;
        }
      
      The compiler will turn local variable "v" into a readonly section.
      During the instruction selection phase, the compiler generates two
      loads from the readonly section, one 2-byte load and one 1-byte load, e.g., for the 2-byte load,
        t8: i32,ch = load<(dereferenceable load 2 from `i8* getelementptr inbounds
             (%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1),
             anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64
        t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8,
          FrameIndex:i64<0>, undef:i64
      
      The BPF backend changed t8 to i64 = Constant<2>, and eventually the generated machine IR was:
        t10: i64 = MOV_ri TargetConstant:i64<2>
        t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8>
        t41: i32 = OR_ri_32 t40, TargetConstant:i64<0>
        t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>,
            TargetConstant:i64<0>, t3
      
      Note that t10 in the above is not correct. The type should be i32 and the instruction
      should be MOV_ri_32. The reason for the incorrect insn selection is that BPF insn selection
      generated an i64 constant instead of an i32 constant as specified in the original
      load instruction. This incorrect insn sequence eventually caused the following
      fatal error when a COPY insn tried to copy a 64-bit register to a 32-bit subregister.
        Impossible reg-to-reg copy
        UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42!
      
      This patch fixed the issue by using the load result type instead of always i64
      when doing readonly load optimization.
      
      Differential Revision: https://reviews.llvm.org/D81630
      4db18781
    • Cyndy Ishida's avatar
      [llvm][llvm-nm] add TextAPI/MachO support · 28fefcc8
      Cyndy Ishida authored
      Summary:
      This completes the glue needed to support reading tbd files from nm.
      This includes slice filtering with `--arch` and a new
      option specifically for tbd files, `--add-inlinedinfo`, which shows
      the reexported libraries that are appended in the tbd file.
      
      Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson
      
      Reviewed By: JDevlieghere
      
      Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D81614
      28fefcc8
    • Alina Sbirlea's avatar
      Verify MemorySSA after all updates. · 519b019a
      Alina Sbirlea authored
      Verify after completing all updates.
      Resolves PR46275.
      519b019a
    • Eric Christopher's avatar
      Tidy up unsigned -> Register fixups. · 3ff8f619
      Eric Christopher authored
      3ff8f619
    • Eric Christopher's avatar
      Add a diagnostic string to an assert. · cb21b168
      Eric Christopher authored
      cb21b168
    • Matt Arsenault's avatar
    • Sanjay Patel's avatar
      039ff29e
    • Vitaly Buka's avatar
      [StackSafety,NFC] Fix use of CallBase API · a10fc165
      Vitaly Buka authored
      The code does not need to iterate over arguments; it can get ArgNo from
      CallBase::getArgOperandNo.
      a10fc165
    • Matt Arsenault's avatar
      AMDGPU/GlobalISel: Set insert point when emitting control flow pseudos · 2247072b
      Matt Arsenault authored
      This was implicitly assuming the branch instruction was the next after
      the pseudo. It's possible for another non-terminator instruction to be
      inserted between the intrinsic and the branch, so adjust the insertion
      point. Fixes a non-terminator after terminator verifier error (which
      without the verifier, manifested itself as an infinite loop in
      analyzeBranch much later on).
      2247072b
    • Kirill Naumov's avatar
      [InlineCost] Preparational patch for creation of Printer pass. · 1022b5eb
      Kirill Naumov authored
      - Renaming the printer class, flag
      - Refactoring
      - Changing some tests
      
      This patch is a preparatory step for introducing a new printing pass and new
      functionality to the existing Annotation Writer. I plan to extend
      this functionality so that the tool is more useful when looking at the inlining
      process.
      1022b5eb
    • Fangrui Song's avatar
      [Support] Don't tie errs() to outs() by default · 03089752
      Fangrui Song authored
      This reverts part of D81156.
      
      Accessing errs() concurrently was safe before and racy after D81156.
      (`errs() << 'a'` is always racy)
      
      Accessing outs() and errs() concurrently was safe before and racy after D81156.
      
      Don't tie errs() to outs() by default to fix the fallout.
      llvm-dwarfdump is single-threaded, so opting into the tie behavior is safe.
      03089752
    • Stanislav Mekhanoshin's avatar
      Fixed assertion in SROA if block has no successors · a98d618f
      Stanislav Mekhanoshin authored
      BasicBlock::isLegalToHoistInto() asserts if the block does not
      have successors. The case is degenerate, but the assertion still
      needs to be avoided.
      
      https://bugs.llvm.org/show_bug.cgi?id=46280
      
      Differential Revision: https://reviews.llvm.org/D81674
      a98d618f
    • Craig Topper's avatar
      [X86] Remove unnecessary #if around call to isCpuIdSupported in getHostCPUName. · c5251681
      Craig Topper authored
      The exact same #if is already inside isCpuIdSupported and causes
      it to return true. The definition of isCpuIdSupported isn't
      conditional, so we should be able to just rely on its body doing
      the right thing.
      c5251681
    • Thomas Lively's avatar
      [WebAssembly] Make BR_TABLE non-duplicable · c5d01234
      Thomas Lively authored
      Summary:
      After their range checks were removed in 7f50c15b, br_tables
      started being duplicated into their predecessors by tail
      folding. Unfortunately, when the br_tables were in loops this
      transformation introduced bad irreducible control flow which was later
      expanded into even more br_tables. This commit abuses the
      `isNotDuplicable` property to prevent this irreducible control flow
      from being introduced. This change saves a few dozen bytes of code
      size and has a negligible effect on performance for most of the large
      Emscripten benchmarks, but can improve performance significantly on
      microbenchmarks of switches in loops.
      
      Reviewers: aheejin, dschuff
      
      Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D81628
      c5d01234
  2. Jun 11, 2020