[BPF] Add a few new insns under cpu=v4
In [1], a few new insns are proposed to expand the BPF ISA by

  . fixing the limitation of an existing insn (e.g., the 16-bit jmp offset)
  . adding new insns which may improve code quality (sign_ext_ld, sign_ext_mov, st)
  . making the ISA feature complete (sdiv, smod)
  . improving the user experience (bswap)

This patch implements insn encodings for

  . sign-extended load
  . sign-extended mov
  . sdiv/smod
  . bswap insns
  . unconditional jump with a 32-bit offset

The new bswap insns are generated under cpu=v4 for __builtin_bswap. For cpu=v3 or earlier, __builtin_bswap generates be or le insns, which is not intuitive for the user (see the C sketch below).

To support 32-bit branch offsets, a 32-bit ja (JMPL) insn is implemented. For a conditional branch whose target is beyond the 16-bit offset range, llvm transforms 'cond_jmp' into 'cond_jmp + jmpl' to simulate a 32-bit conditional jmp (a sketch of the rewrite follows the C example below). See BPFMIPeephole.cpp for details; the algorithm is heuristic based.

I have tested bpf selftest pyperf600 with unroll count 600, which can indeed generate a 32-bit jump insn, e.g.,

  13: 06 00 00 00 9b cd 00 00  gotol +0xcd9b <LBB0_6619>

Eduard is working on adding the 'st' insn to cpu=v4.

A list of llc flags

  disable-ldsx, disable-movsx, disable-bswap, disable-sdiv-smod, disable-gotol

can be used to disable a particular insn for cpu=v4. For example, a user can do

  llc -march=bpf -mcpu=v4 -disable-movsx t.ll

to enable cpu v4 without movsx insns.
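As an illustration (not part of the patch itself), here is a minimal C sketch that should exercise several of the new encodings under cpu=v4; the mnemonics in the comments are assumptions based on the insn list above, and the actual generated code may differ:

  /* t.c -- hypothetical example; compile with something like:
   *   clang -O2 --target=bpf -mcpu=v4 -c t.c -o t.o
   */
  long ldsx_example(int *p) {
      return *p;                    /* 32- to 64-bit sign-extended load */
  }

  long movsx_example(int x) {
      return (long)x;               /* sign-extended register mov */
  }

  long sdiv_example(long a, long b) {
      return a / b;                 /* signed division (sdiv) */
  }

  unsigned long long bswap_example(unsigned long long x) {
      return __builtin_bswap64(x);  /* bswap under cpu=v4; be/le under v3 or earlier */
  }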
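And a rough sketch of the 'cond_jmp' -> 'cond_jmp + jmpl' rewrite mentioned above, in pseudo-assembly; the condition inversion and offsets are illustrative, and the authoritative logic lives in BPFMIPeephole.cpp:

  ; before: target is out of the signed 16-bit offset range
  if r1 > r2 goto +70000     ; not encodable in a 16-bit offset field

  ; after the peephole rewrite: invert the condition so a short
  ; branch hops over a 32-bit-offset unconditional jump
  if r1 <= r2 goto +1        ; short, encodable conditional branch
  gotol +70000               ; JMPL insn with a 32-bit offset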
References:
  [1] https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/

Differential Revision: https://reviews.llvm.org/D144829