  1. Sep 11, 2017
    • [scudo] Fix improper TSD init after TLS destructors are called · 040c211b
      Kostya Kortchinsky authored
      Summary:
      Some of glibc's own thread local data is destroyed after a user's thread local
      destructors are called, via __libc_thread_freeres. This might involve calling
      free, as is the case for strerror_thread_freeres.
      If there is no prior heap operation in the thread, this free would end up
      initializing some thread-specific data that would never be destroyed properly
      (as the user's pthread destructors have already been called), while still being
      deallocated when the TLS goes away. As a result, a program could SEGV, usually
      in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked
      list links would refer to a now unmapped memory area.
      
      To prevent this from happening, we no longer do a full initialization from the
      deallocation path. This means that the fallback cache & quarantine will be used
      if no other heap operation has been called, and we effectively prevent the TSD
      from being initialized and then never destroyed. The TSD is still fully
      initialized on all other paths.
      
      In the event of a thread doing only frees and nothing else, a TSD would never
      be initialized for that thread, but this situation is unlikely and we can live
      with that.
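
      A minimal sketch of the idea described above, not the actual patch: the
      names `ThreadData`, `initThreadMaybe`, `onAllocate` and `onDeallocate` are
      hypothetical, and the thread state is passed explicitly instead of living
      in TLS so that the behavior is easy to follow.

```cpp
#include <cassert>

// Hypothetical model of the per-thread allocator state.
enum ThreadState { ThreadNotInitialized, ThreadInitialized };

struct ThreadData {
  ThreadState State = ThreadNotInitialized;
  int FullInits = 0;      // number of full TSD initializations
  int LocalFrees = 0;     // frees served by the thread's own cache
  int FallbackFrees = 0;  // frees served by the shared fallback cache
};

// Initializes the thread data unless the caller asked for a minimal init.
void initThreadMaybe(ThreadData &T, bool MinimalInit) {
  if (T.State != ThreadNotInitialized)
    return;
  // Deallocation path: do not create TSD that nothing would destroy later.
  if (MinimalInit)
    return;
  // A real allocator would register its pthread destructor here (assumption).
  T.State = ThreadInitialized;
  ++T.FullInits;
}

// Allocation path: always performs the full initialization.
void onAllocate(ThreadData &T) { initThreadMaybe(T, /*MinimalInit=*/false); }

// Deallocation path: uses the fallback cache if the thread was never set up.
void onDeallocate(ThreadData &T) {
  initThreadMaybe(T, /*MinimalInit=*/true);
  if (T.State == ThreadInitialized)
    ++T.LocalFrees;
  else
    ++T.FallbackFrees;
}
```

      With this shape, a free issued from `__libc_thread_freeres` in a thread
      that never allocated leaves the state untouched and goes through the
      fallback path.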
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D37697
      
      llvm-svn: 312939
  2. Jul 12, 2017
    • [scudo] PRNG makeover · 00582563
      Kostya Kortchinsky authored
      Summary:
      This follows the addition of `GetRandom` with D34412. We remove our
      `/dev/urandom` code and use the new function. Additionally, we switch the PRNG
      to a slightly faster one. One of the issues with the old code is that each
      "next" yields 64 full bits of randomness, of which we used only 8 for the Salt,
      discarding the rest. So we add a cached u64 to the PRNG that can serve up to
      8 u8 before having to call the "next" function again.
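
      The byte cache described above can be sketched as follows. This is an
      illustration under assumptions, not the code from the patch: the class
      name `CachedPrng`, its method names and the `nextCalls` counter are
      hypothetical, and the generator uses the rotation constants of the 2016
      reference xoroshiro128+ (55, 14, 36), which may differ from what the
      allocator ships.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical PRNG wrapper: xoroshiro128+ plus a cached 64-bit word that is
// handed out one byte at a time.
class CachedPrng {
 public:
  CachedPrng(uint64_t Seed0, uint64_t Seed1) : State{Seed0, Seed1} {
    // An all-zero state would make the generator return zeros forever.
    assert((Seed0 | Seed1) != 0);
  }

  // Returns a full 64-bit value, bypassing the byte cache.
  uint64_t getU64() { return next(); }

  // Returns one byte; only calls next() once every 8 bytes.
  uint8_t getU8() {
    if (CachedBytesLeft == 0) {
      CachedBytes = next();
      CachedBytesLeft = sizeof(CachedBytes);
    }
    const uint8_t Result = static_cast<uint8_t>(CachedBytes & 0xff);
    CachedBytes >>= 8;
    --CachedBytesLeft;
    return Result;
  }

  // Number of times the underlying generator ran (for illustration only).
  unsigned nextCalls() const { return NextCalls; }

 private:
  static uint64_t rotl(uint64_t X, int K) { return (X << K) | (X >> (64 - K)); }

  // One xoroshiro128+ step.
  uint64_t next() {
    const uint64_t S0 = State[0];
    uint64_t S1 = State[1];
    const uint64_t Result = S0 + S1;
    S1 ^= S0;
    State[0] = rotl(S0, 55) ^ S1 ^ (S1 << 14);
    State[1] = rotl(S1, 36);
    ++NextCalls;
    return Result;
  }

  uint64_t State[2];
  uint64_t CachedBytes = 0;
  unsigned CachedBytesLeft = 0;
  unsigned NextCalls = 0;
};
```

      Eight consecutive `getU8` calls consume exactly one generator step, so a
      caller that only needs an 8-bit Salt pays for "next" once every eight
      requests instead of every time.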
      
      During some integration work, I also realized that some very early processes
      (like `init`) cannot use `/dev/urandom` yet. So if the `getrandom` syscall is
      unavailable as well, we have to fall back to some other initialization of the
      PRNG.
      
      Now a few words on why XoRoShiRo and not something else. I experimented for a
      while with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG
      32 & 64 are usually faster but produce respectively 15 & 31 bits of entropy,
      meaning that to get a full 64 bits, you would need to call them several times.
      The simple XorShift is fast and produces 32 bits, but is mediocre with regard to
      PRNG test suites. PCG is slower overall, and XoRoShiRo is faster than
      XorShift128+ and produces a full 64 bits.
      
      %%%
      root@tulip-chiphd:/data # ./randtest.arm
      [+] starting xs32...
      [?] xs32 duration: 22431833053ns
      [+] starting lcg32...
      [?] lcg32 duration: 14941402090ns
      [+] starting pcg32...
      [?] pcg32 duration: 44941973771ns
      [+] starting xs128p...
      [?] xs128p duration: 48889786981ns
      [+] starting lcg64...
      [?] lcg64 duration: 33831042391ns
      [+] starting xos128p...
      [?] xos128p duration: 44850878605ns
      
      root@tulip-chiphd:/data # ./randtest.aarch64
      [+] starting xs32...
      [?] xs32 duration: 22425151678ns
      [+] starting lcg32...
      [?] lcg32 duration: 14954255257ns
      [+] starting pcg32...
      [?] pcg32 duration: 37346265726ns
      [+] starting xs128p...
      [?] xs128p duration: 22523807219ns
      [+] starting lcg64...
      [?] lcg64 duration: 26141304679ns
      [+] starting xos128p...
      [?] xos128p duration: 14937033215ns
      %%%
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: aemerson, kristof.beyls, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35221
      
      llvm-svn: 307798
  3. May 05, 2017
    • [scudo] Add Android support · ee069576
      Kostya Kortchinsky authored
      Summary:
      This change adds Android support to the allocator (but doesn't yet enable it in
      the cmake config), and should be the last fragment of the rewritten change
      D31947.
      
      Android has tighter memory constraints than other platforms, so the idea of a
      unique context per thread would not have worked. The alternative chosen is to
      allocate a set of contexts based on the number of cores on the machine, and
      share those contexts among the threads. Contexts can be dynamically reassigned
      to threads to prevent contention, based on a scheme suggested by @dvyukov in
      the initial review.
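
      As a rough illustration of sharing a fixed set of contexts among threads,
      here is a simplified sketch. It is not the scheme from the review: the
      names `Context`, `ContextPool`, `acquire` and `release` are hypothetical,
      the pool size is a template parameter rather than the core count, and
      `acquire` returns a null pointer when every context is busy, where a real
      allocator would wait instead.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the shared per-context allocator state.
struct Context {
  std::atomic<bool> Busy{false};
  unsigned long Uses = 0;  // only modified while the context is held

  bool tryLock() { return !Busy.exchange(true, std::memory_order_acquire); }
  void unlock() { Busy.store(false, std::memory_order_release); }
};

// Fixed pool of contexts shared by any number of threads.
template <std::size_t N>
class ContextPool {
 public:
  // Tries the context the thread used last (Hint) first, then the others.
  // On success the thread is reassigned: Hint is updated to the new index.
  // Returns nullptr if all contexts are currently held.
  Context *acquire(std::size_t &Hint) {
    assert(Hint < N);
    for (std::size_t I = 0; I < N; ++I) {
      const std::size_t Index = (Hint + I) % N;
      if (Contexts[Index].tryLock()) {
        Hint = Index;
        ++Contexts[Index].Uses;
        return &Contexts[Index];
      }
    }
    return nullptr;
  }

  void release(Context *C) { C->unlock(); }

 private:
  Context Contexts[N];
};
```

      In real use each thread would keep its own `Hint` (for instance in a TLS
      slot), so an uncontended thread keeps returning to the same context and
      only moves to another one when its usual context is busy.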
      
      Additionally, given that Android doesn't support ELF TLS (only emutls for now),
      we use the TSan TLS slot to make things faster: Scudo is mutually exclusive
      with other sanitizers so this shouldn't cause any problem.
      
      An additional change made here is replacing `thread_local` with `THREADLOCAL`
      and using the initial-exec TLS model in the non-Android version, to avoid
      extraneous weak definitions and checks on the relevant variables.
      
      Reviewers: kcc, dvyukov, alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: srhines, mgorny, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32649
      
      llvm-svn: 302300
  4. Apr 27, 2017
    • [scudo] Move thread local variables into their own files · 36b34341
      Kostya Kortchinsky authored
      Summary:
      This change introduces scudo_tls.h & scudo_tls_linux.cpp, where we move the
      thread local variables used by the allocator, namely the cache, quarantine
      cache & PRNG. `ScudoThreadContext` will hold those. This patch doesn't
      introduce any new platform support yet; that will be the subject of a later
      patch. This also changes the PRNG so that the structure can be POD.
      
      Reviewers: kcc, dvyukov, alekseyshl
      
      Reviewed By: dvyukov, alekseyshl
      
      Subscribers: llvm-commits, mgorny
      
      Differential Revision: https://reviews.llvm.org/D32440
      
      llvm-svn: 301584