[clangd] Define a compact binary serialization fomat for symbol slab/index.
Summary: This is intended to replace the current YAML format for general use. It's ~10x more compact than YAML, and ~40% more compact than gzipped YAML: llvmidx.riff = 20M, llvmidx.yaml = 272M, llvmidx.yaml.gz = 32M It's also simpler/faster to read and write. The format is a RIFF container (chunks of (type, size, data)) with: - a compressed string table - simple binary encoding of symbols (with varints for compactness) It can be extended to include occurrences, Dex posting lists, etc. There's no rich backwards-compatibility scheme, but a version number is included so we can detect incompatible files and do ad-hoc back-compat. Alternatives considered: - compressed YAML or JSON: bulky and slow to load - llvm bitstream: confusing model and libraries are hard to use. My attempt produced slightly larger files, and the code was longer and slower. - protobuf or similar: would be really nice (esp for back-compat) but the dependency is a big hassle - ad-hoc binary format without a container: it seems clear we're going to add posting lists and occurrences here, and that they will benefit from sharing a string table. The container makes it easy to debug these pieces in isolation, and make them optional. Reviewers: ioeric Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D51585 llvm-svn: 341375
Loading
Please register or sign in to comment