- Sep 11, 2017
Kostya Kortchinsky authored
Summary: Some of glibc's own thread local data is destroyed after a user's thread local destructors are called, via __libc_thread_freeres. This might involve calling free, as is the case for strerror_thread_freeres. If there is no prior heap operation in the thread, this free would end up initializing some thread specific data that would never be destroyed properly (as the user's pthread destructors have already been called), while still being deallocated when the TLS goes away.

As a result, a program could SEGV, usually in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked list links would refer to a now unmapped memory area.

To prevent this from happening, we will not do a full initialization from the deallocation path. This means that the fallback cache & quarantine will be used if no other heap operation has been called, and we effectively prevent the TSD from being initialized and never destroyed. The TSD will be fully initialized for all other paths.

In the event of a thread doing only frees and nothing else, a TSD would never be initialized for that thread, but this situation is unlikely and we can live with it.

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37697

llvm-svn: 312939
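For illustration, a minimal C++ sketch of the idea described in this summary, with hypothetical names (ThreadCache, FallbackCache, scudoAllocatePath, ...) rather than the actual Scudo internals: the allocation path performs the full lazy TSD initialization, while the deallocation path never does, falling back to a shared, lock-protected cache instead.

```cpp
// Sketch only, not the actual Scudo implementation.
#include <cstdlib>
#include <mutex>

struct ThreadCache {
  // A real per-thread cache / quarantine would live here.
  void deallocate(void *Ptr) { free(Ptr); }
};

static thread_local bool TSDInitialized = false;
static thread_local ThreadCache LocalCache;

static ThreadCache FallbackCache;  // shared by threads with no TSD
static std::mutex FallbackMutex;

void initThreadMaybe() {
  if (TSDInitialized)
    return;
  // Registering pthread destructors, per-thread stats, etc. would happen here.
  TSDInitialized = true;
}

void *scudoAllocatePath(size_t Size) {
  initThreadMaybe();            // full TSD init is fine on allocation
  return malloc(Size);
}

void scudoDeallocatePath(void *Ptr) {
  if (TSDInitialized) {
    LocalCache.deallocate(Ptr); // fast path: TSD already set up
    return;
  }
  // Do NOT initialize the TSD here: a free() issued from
  // __libc_thread_freeres after the pthread destructors have run would
  // create a TSD that is never torn down. Use the fallback cache instead.
  std::lock_guard<std::mutex> Guard(FallbackMutex);
  FallbackCache.deallocate(Ptr);
}
```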
- Jul 12, 2017
Kostya Kortchinsky authored
Summary: This follows the addition of `GetRandom` with D34412. We remove our `/dev/urandom` code and use the new function. Additionally, change the PRNG for a slightly faster version.

One of the issues with the old code is that we have 64 full bits of randomness per "next", using only 8 of those for the Salt and discarding the rest. So we add a cached u64 in the PRNG that can serve up to 8 u8 before having to call the "next" function again.

During some integration work, I also realized that some very early processes (like `init`) do not benefit from `/dev/urandom` yet. So if the `getrandom` syscall is not available either, we have to fall back to some other form of initialization of the PRNG.

Now a few words on why XoRoShiRo and not something else. I have played a while with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64 are usually faster but produce respectively 15 & 31 bits of entropy, meaning that to get a full 64-bit value, you would need to call them several times. The simple XorShift is fast and produces 32 bits but is mediocre with regard to PRNG test suites. PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and produces a full 64 bits.

%%%
root@tulip-chiphd:/data # ./randtest.arm
[+] starting xs32...
[?] xs32 duration: 22431833053ns
[+] starting lcg32...
[?] lcg32 duration: 14941402090ns
[+] starting pcg32...
[?] pcg32 duration: 44941973771ns
[+] starting xs128p...
[?] xs128p duration: 48889786981ns
[+] starting lcg64...
[?] lcg64 duration: 33831042391ns
[+] starting xos128p...
[?] xos128p duration: 44850878605ns
root@tulip-chiphd:/data # ./randtest.aarch64
[+] starting xs32...
[?] xs32 duration: 22425151678ns
[+] starting lcg32...
[?] lcg32 duration: 14954255257ns
[+] starting pcg32...
[?] pcg32 duration: 37346265726ns
[+] starting xs128p...
[?] xs128p duration: 22523807219ns
[+] starting lcg64...
[?] lcg64 duration: 26141304679ns
[+] starting xos128p...
[?] xos128p duration: 14937033215ns
%%%

Reviewers: alekseyshl

Reviewed By: alekseyshl

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35221

llvm-svn: 307798
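For reference, a minimal sketch of the scheme described above: a xoroshiro128+ style generator whose full 64-bit output is cached so that eight 8-bit salts can be served per call to "next". The constants and the explicit seeding here are illustrative and may not match the in-tree Scudo PRNG.

```cpp
// Sketch of a xoroshiro128+ generator with a cached u64 serving 8 x u8.
#include <cstdint>

struct Xoroshiro128Plus {
  uint64_t State[2];
  uint64_t Cached;       // last 64-bit output, consumed 8 bits at a time
  uint8_t  CachedBytes;  // how many u8 values remain in Cached

  static uint64_t rotl(uint64_t X, int K) {
    return (X << K) | (X >> (64 - K));
  }

  void init(uint64_t Seed0, uint64_t Seed1) {
    State[0] = Seed0;
    State[1] = Seed1;
    Cached = 0;
    CachedBytes = 0;
  }

  // Produce 64 bits of randomness (xoroshiro128+).
  uint64_t next() {
    const uint64_t S0 = State[0];
    uint64_t S1 = State[1];
    const uint64_t Result = S0 + S1;
    S1 ^= S0;
    State[0] = rotl(S0, 55) ^ S1 ^ (S1 << 14);
    State[1] = rotl(S1, 36);
    return Result;
  }

  // Serve an 8-bit salt, refilling the 64-bit cache only every 8 calls.
  uint8_t getU8() {
    if (CachedBytes == 0) {
      Cached = next();
      CachedBytes = 8;
    }
    const uint8_t Salt = static_cast<uint8_t>(Cached & 0xff);
    Cached >>= 8;
    CachedBytes--;
    return Salt;
  }
};
```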
- May 05, 2017
Kostya Kortchinsky authored
Summary: This change adds Android support to the allocator (but doesn't yet enable it in the cmake config), and should be the last fragment of the rewritten change D31947.

Android has more memory constraints than other platforms, so the idea of a unique context per thread would not have worked. The alternative chosen is to allocate a set of contexts based on the number of cores on the machine, and share those contexts among the threads. Contexts can be dynamically reassigned to threads to prevent contention, based on a scheme suggested by @dvyukov in the initial review.

Additionally, given that Android doesn't support ELF TLS (only emutls for now), we use the TSan TLS slot to make things faster: Scudo is mutually exclusive with other sanitizers, so this shouldn't cause any problem.

An additional change made here is replacing `thread_local` with `THREADLOCAL` and using the initial-exec thread model in the non-Android version to prevent extraneous weak definitions and checks on the relevant variables.

Reviewers: kcc, dvyukov, alekseyshl

Reviewed By: alekseyshl

Subscribers: srhines, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D32649

llvm-svn: 302300
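A simplified sketch of the shared-context scheme described above, under the assumption of one context per core and opportunistic reassignment on contention. The names are stand-ins, and a plain `thread_local` pointer is used here instead of the TSan TLS slot, so this is not the actual Android code path.

```cpp
// Sketch: N contexts for N cores, shared and reassigned among threads.
#include <algorithm>
#include <mutex>
#include <thread>
#include <vector>

struct ScudoThreadContext {
  std::mutex Mutex;
  // Allocator cache, quarantine cache and PRNG would live here.
  bool tryLock() { return Mutex.try_lock(); }
  void lock() { Mutex.lock(); }
  void unlock() { Mutex.unlock(); }
};

static std::vector<ScudoThreadContext> *Contexts;
static unsigned NumberOfContexts;

// The real Android code stashes this pointer in the TSan TLS slot.
static thread_local ScudoThreadContext *CurrentContext = nullptr;

void initContexts() {
  NumberOfContexts = std::max(1u, std::thread::hardware_concurrency());
  Contexts = new std::vector<ScudoThreadContext>(NumberOfContexts);
}

// Returns a locked context; the caller unlocks it when done.
ScudoThreadContext *getAndLockContext() {
  ScudoThreadContext *Ctx = CurrentContext;
  if (!Ctx)
    Ctx = CurrentContext = &(*Contexts)[0];  // real code assigns round-robin
  if (Ctx->tryLock())
    return Ctx;
  // Contention: try to steal a free context and reassign this thread to it.
  for (unsigned I = 0; I < NumberOfContexts; I++) {
    ScudoThreadContext *Other = &(*Contexts)[I];
    if (Other != Ctx && Other->tryLock()) {
      CurrentContext = Other;
      return Other;
    }
  }
  Ctx->lock();  // all contexts busy: wait on the originally assigned one
  return Ctx;
}
```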
- Apr 27, 2017
Kostya Kortchinsky authored
Summary: This change introduces scudo_tls.h & scudo_tls_linux.cpp, where we move the thread local variables used by the allocator, namely the cache, quarantine cache & PRNG. `ScudoThreadContext` will hold those.

This patch doesn't introduce any new platform support yet; that will be the object of a later patch. This also changes the PRNG so that the structure can be POD.

Reviewers: kcc, dvyukov, alekseyshl

Reviewed By: dvyukov, alekseyshl

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D32440

llvm-svn: 301584
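A rough sketch of the resulting layout, with placeholder member types standing in for the real allocator cache, quarantine cache and PRNG. The point illustrated is that both the context and the PRNG are POD and are initialized explicitly rather than through constructors, so they can live directly in thread local storage.

```cpp
// Sketch only; member types are placeholders, not the real Scudo classes.
#include <cstdint>

struct AllocatorCachePlaceholder { uint8_t Storage[256]; };
struct QuarantineCachePlaceholder { uint8_t Storage[64]; };

struct ScudoPrng {            // POD: no constructor, seeded explicitly
  uint64_t State[2];
  void init(uint64_t S0, uint64_t S1) { State[0] = S0; State[1] = S1; }
};

struct ScudoThreadContext {   // also POD: no constructors run at TLS setup
  AllocatorCachePlaceholder Cache;
  QuarantineCachePlaceholder QuarantineCache;
  ScudoPrng Prng;
  void init() { Prng.init(/*S0=*/0x12345678u, /*S1=*/0x9abcdef0u); }
};

// scudo_tls_linux.cpp-style definition: one context per thread.
static thread_local ScudoThreadContext ThreadLocalContext;

ScudoThreadContext *getThreadContext() { return &ThreadLocalContext; }
```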