Skip to content
  1. Sep 11, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Fix improper TSD init after TLS destructors are called · 040c211b
      Kostya Kortchinsky authored
      Summary:
      Some of glibc's own thread local data is destroyed after a user's thread local
      destructors are called, via __libc_thread_freeres. This might involve calling
      free, as is the case for strerror_thread_freeres.
      If there is no prior heap operation in the thread, this free would end up
      initializing some thread specific data that would never be destroyed properly
      (as user's pthread destructors have already been called), while still being
      deallocated when the TLS goes away. As a result, a program could SEGV, usually
      in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked
      list links would refer to a now unmapped memory area.
      
      To prevent this from happening, we will not do a full initialization from the
      deallocation path. This means that the fallback cache & quarantine will be used
      if no other heap operation has been called, and we effectively prevent the TSD
      being initialized and never destroyed. The TSD will be fully initialized for all
      other paths.
      
      In the event of a thread doing only frees and nothing else, a TSD would never
      be initialized for that thread, but this situation is unlikely and we can live
      with that.
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D37697
      
      llvm-svn: 312939
      040c211b
  2. Aug 16, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Application & platform compatibility changes · 43917720
      Kostya Kortchinsky authored
      Summary:
      This patch changes a few (small) things around for compatibility purposes for
      the current Android & Fuchsia work:
      - `realloc`'ing some memory that was not allocated with `malloc`, `calloc` or
        `realloc`, while UB according to http://pubs.opengroup.org/onlinepubs/009695399/functions/realloc.html
        is more common that one would think. We now only check this if
        `DeallocationTypeMismatch` is set; change the "mismatch" error
        messages to be more homogeneous;
      - some sketchily written but widely used libraries expect a call to `realloc`
        to copy the usable size of the old chunk to the new one instead of the
        requested size. We have to begrundingly abide by this de-facto standard.
        This doesn't seem to impact security either way, unless someone comes up with
        something we didn't think about;
      - the CRC32 intrinsics for 64-bit take a 64-bit first argument. This is
        misleading as the upper 32 bits end up being ignored. This was also raising
        `-Wconversion` errors. Change things to take a `u32` as first argument.
        This also means we were (and are) only using 32 bits of the Cookie - not a
        big thing, but worth mentioning.
      - Includes-wise: prefer `stddef.h` to `cstddef`, move `scudo_flags.h` where it
        is actually needed.
      - Add tests for the memalign-realloc case, and the realloc-usable-size one.
      
      (Edited typos)
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D36754
      
      llvm-svn: 311018
      43917720
  3. Jul 25, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Check for pvalloc overflow · 65fdf677
      Kostya Kortchinsky authored
      Summary:
      Previously we were rounding up the size passed to `pvalloc` to the next
      multiple of page size no matter what. There is an overflow possibility that
      wasn't accounted for. So now, return null in the event of an overflow. The man
      page doesn't seem to indicate the errno to set in this particular situation,
      but the glibc unit tests go for ENOMEM (https://code.woboq.org/userspace/glibc/malloc/tst-pvalloc.c.html#54)
      so we'll do the same.
      Update the aligned allocation funtions tests to check for properly aligned
      returned pointers, and the `pvalloc` corner cases.
      
      @alekseyshl: do you want me to do the same in the other Sanitizers?
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: kubamracek, alekseyshl, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35818
      
      llvm-svn: 309033
      65fdf677
  4. Jul 24, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Quarantine overhaul · 2d94405a
      Kostya Kortchinsky authored
      Summary:
      First, some context.
      
      The main feedback we get about the quarantine is that it's too memory hungry.
      A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and
      things quickly get out of hand in terms of memory usage, and the quarantine
      ends up disabled.
      
      The main objective of the quarantine is to protect from use-after-free
      exploitation by making it harder for an attacker to reallocate a controlled
      chunk in place of the targeted freed chunk. This is achieved by not making it
      available to the backend right away for reuse, but holding it a little while.
      
      Historically, what has usually been the target of such attacks was objects,
      where vtable pointers or other function pointers could constitute a valuable
      targeti to replace. Those are usually on the smaller side. There is barely any
      advantage in putting the quarantine several megabytes of RGB data or the like.
      
      Now for the patch.
      
      This patch introduces a new way the Quarantine behaves in Scudo. First of all,
      the size of the Quarantine will be defined in KB instead of MB, then we
      introduce a new option: the size up to which (lower than or equal to) a chunk
      will be quarantined. This way, we only quarantine smaller chunks, and the size
      of the quarantine remains manageable. It also prevents someone from triggering
      a recycle by allocating something huge. We default to 512 bytes on 32-bit and
      2048 bytes on 64-bit platforms.
      
      In details, the patches includes the following:
      - introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall
        back to the old behavior (meaning no threshold in that case);
        `QuarantineSizeMb` is described as deprecated in the options descriptios;
        documentation update will follow;
      - introduce `QuarantineChunksUpToSize`, the new threshold value;
      - update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
      - remove `AllocatorOptions::copyTo`, it wasn't used;
      - slightly change the logic around `quarantineOrDeallocateChunk` to accomodate
        for the new logic; rename a couple of variables there as well;
      
      Rewriting the tests, I found a somewhat annoying bug where non-default aligned
      chunks would account for more than needed when placed in the quarantine due to
      `<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for
      now.
      
      Reviewers: alekseyshl, kcc
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35694
      
      llvm-svn: 308884
      2d94405a
  5. Jul 18, 2017
    • Alex Shlyapnikov's avatar
      [Sanitizers] ASan/MSan/LSan allocators set errno on failure. · 42bea018
      Alex Shlyapnikov authored
      Summary:
      ASan/MSan/LSan allocators set errno on allocation failures according to
      malloc/calloc/etc. expected behavior.
      
      MSan allocator was refactored a bit to make its structure more similar
      with other allocators.
      
      Also switch Scudo allocator to the internal errno definitions.
      
      TSan allocator changes will follow.
      
      Reviewers: eugenis
      
      Subscribers: llvm-commits, kubamracek
      
      Differential Revision: https://reviews.llvm.org/D35275
      
      llvm-svn: 308344
      42bea018
  6. Jul 14, 2017
  7. Jul 13, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Do not grab a cache for secondary allocation & per related changes · b44364dd
      Kostya Kortchinsky authored
      Summary:
      Secondary backed allocations do not require a cache. While it's not necessary
      an issue when each thread has its cache, it becomes one with a shared pool of
      caches (Android), as a Secondary backed allocation or deallocation holds a
      cache that could be useful to another thread doing a Primary backed allocation.
      
      We introduce an additional PRNG and its mutex (to avoid contention with the
      Fallback one for Primary allocations) that will provide the `Salt` needed for
      Secondary backed allocations.
      
      I changed some of the code in a way that feels more readable to me (eg: using
      some values directly rather than going  through ternary assigned variables,
      using directly `true`/`false` rather than `FromPrimary`). I will let reviewers
      decide if it actually is.
      
      An additional change is to mark `CheckForCallocOverflow` as `UNLIKELY`.
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35358
      
      llvm-svn: 307958
      b44364dd
  8. Jul 12, 2017
    • Kostya Kortchinsky's avatar
      [scudo] PRNG makeover · 00582563
      Kostya Kortchinsky authored
      Summary:
      This follows the addition of `GetRandom` with D34412. We remove our
      `/dev/urandom` code and use the new function. Additionally, change the PRNG for
      a slightly faster version. One of the issues with the old code is that we have
      64 full bits of randomness per "next", using only 8 of those for the Salt and
      discarding the rest. So we add a cached u64 in the PRNG that can serve up to
      8 u8 before having to call the "next" function again.
      
      During some integration work, I also realized that some very early processes
      (like `init`) do not benefit from `/dev/urandom` yet. So if there is no
      `getrandom` syscall as well, we have to fallback to some sort of initialization
      of the PRNG.
      
      Now a few words on why XoRoShiRo and not something else. I have played a while
      with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64
      are usually faster but produce respectively 15 & 31 bits of entropy, meaning
      that to get a full 64-bit, you would need to call them several times. The simple
      XorShift is fast, produces 32 bits but is mediocre with regard to PRNG test
      suites, PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and
      produces full 64 bits.
      
      %%%
      root@tulip-chiphd:/data # ./randtest.arm
      [+] starting xs32...
      [?] xs32 duration: 22431833053ns
      [+] starting lcg32...
      [?] lcg32 duration: 14941402090ns
      [+] starting pcg32...
      [?] pcg32 duration: 44941973771ns
      [+] starting xs128p...
      [?] xs128p duration: 48889786981ns
      [+] starting lcg64...
      [?] lcg64 duration: 33831042391ns
      [+] starting xos128p...
      [?] xos128p duration: 44850878605ns
      
      root@tulip-chiphd:/data # ./randtest.aarch64
      [+] starting xs32...
      [?] xs32 duration: 22425151678ns
      [+] starting lcg32...
      [?] lcg32 duration: 14954255257ns
      [+] starting pcg32...
      [?] pcg32 duration: 37346265726ns
      [+] starting xs128p...
      [?] xs128p duration: 22523807219ns
      [+] starting lcg64...
      [?] lcg64 duration: 26141304679ns
      [+] starting xos128p...
      [?] xos128p duration: 14937033215ns
      %%%
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: aemerson, kristof.beyls, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D35221
      
      llvm-svn: 307798
      00582563
  9. Jun 29, 2017
    • Alex Shlyapnikov's avatar
      Merge · 346988bf
      Alex Shlyapnikov authored
      llvm-svn: 306748
      346988bf
    • Kostya Kortchinsky's avatar
      [scudo] Change aligned alloc functions to be more compliant & perf changes · 0ce49990
      Kostya Kortchinsky authored
      Summary:
      We were not following the `man` documented behaviors for invalid arguments to
      `memalign` and associated functions. Using `CHECK` for those was a bit extreme,
      so we relax the behavior to return null pointers as expected when this happens.
      Adapt the associated test.
      
      I am using this change also to change a few more minor performance improvements:
      - mark as `UNLIKELY` a bunch of unlikely conditions;
      - the current `CHECK` in `__sanitizer::RoundUpTo` is redundant for us in *all*
        calls. So I am introducing our own version without said `CHECK`.
      - change our combined allocator `GetActuallyAllocatedSize`. We already know if
        the pointer is from the Primary or Secondary, so the `PointerIsMine` check is
        redundant as well, and costly for the 32-bit Primary. So we get the size by
        directly using the available Primary functions.
      
      Finally, change a `int` to `uptr` to avoid a warning/error when compiling on
      Android.
      
      Reviewers: alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D34782
      
      llvm-svn: 306698
      0ce49990
  10. Jun 20, 2017
    • Alex Shlyapnikov's avatar
      [Sanitizers] Move cached allocator_may_return_null flag to sanitizer_allocator · ccab11b0
      Alex Shlyapnikov authored
      Summary:
      Move cached allocator_may_return_null flag to sanitizer_allocator.cc and
      provide API to consolidate and unify the behavior of all specific allocators.
      
      Make all sanitizers using CombinedAllocator to follow
      AllocatorReturnNullOrDieOnOOM() rules to behave the same way when OOM
      happens.
      
      When OOM happens, turn allocator_out_of_memory flag on regardless of
      allocator_may_return_null flag value (it used to not to be set when
      allocator_may_return_null == true).
      
      release_to_os_interval_ms and rss_limit_exceeded will likely be moved to
      sanitizer_allocator.cc too (later).
      
      Reviewers: eugenis
      
      Subscribers: srhines, kubamracek, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D34310
      
      llvm-svn: 305858
      ccab11b0
  11. May 11, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Use our own combined allocator · 01a66fc9
      Kostya Kortchinsky authored
      Summary:
      The reasoning behind this change is twofold:
      - the current combined allocator (sanitizer_allocator_combined.h) implements
        features that are not relevant for Scudo, making some code redundant, and
        some restrictions not pertinent (alignments for example). This forced us to
        do some weird things between the frontend and our secondary to make things
        work;
      - we have enough information to be able to know if a chunk will be serviced by
        the Primary or Secondary, allowing us to avoid extraneous calls to functions
        such as `PointerIsMine` or `CanAllocate`.
      
      As a result, the new scudo-specific combined allocator is very straightforward,
      and allows us to remove some now unnecessary code both in the frontend and the
      secondary. Unused functions have been left in as unimplemented for now.
      
      It turns out to also be a sizeable performance gain (3% faster in some Android
      memory_replay benchmarks, doing some more on other platforms).
      
      Reviewers: alekseyshl, kcc, dvyukov
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D33007
      
      llvm-svn: 302830
      01a66fc9
  12. May 09, 2017
    • Kostya Kortchinsky's avatar
      [scudo] CRC32 optimizations · b0e96eb2
      Kostya Kortchinsky authored
      Summary:
      This change optimizes several aspects of the checksum used for chunk headers.
      
      First, there is no point in checking the weak symbol `computeHardwareCRC32`
      everytime, it will either be there or not when we start, so check it once
      during initialization and set the checksum type accordingly.
      
      Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was
      not optimized out, while not necessary. So I reshuffled that part of the code,
      which duplicates a tiny bit of code, but ends up in a much cleaner assembly
      (and faster as we avoid an extraneous load and some calls).
      
      The following code is the checksum at the end of `scudoMalloc` for x86_64 with
      full SSE 4.2, before:
      ```
      mov     rax, 0FFFFFFFFFFFFFFh
      shl     r10, 38h
      mov     edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
      and     r14, rax
      lea     rsi, [r13-10h]
      movzx   eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
      or      r14, r10
      mov     rbx, r14
      xor     bx, bx
      call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
      mov     rsi, rbx
      mov     edi, eax
      call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
      mov     r14w, ax
      mov     rax, r13
      mov     [r13-10h], r14
      ```
      After:
      ```
      mov     rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
      lea     rcx, [rbx-10h]
      mov     rdx, 0FFFFFFFFFFFFFFh
      and     r14, rdx
      shl     r9, 38h
      or      r14, r9
      crc32   eax, rcx
      mov     rdx, r14
      xor     dx, dx
      mov     eax, eax
      crc32   eax, rdx
      mov     r14w, ax
      mov     rax, rbx
      mov     [rbx-10h], r14
      ```
      
      Reviewers: dvyukov, alekseyshl, kcc
      
      Reviewed By: alekseyshl
      
      Subscribers: aemerson, rengolin, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32971
      
      llvm-svn: 302538
      b0e96eb2
  13. May 05, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Add Android support · ee069576
      Kostya Kortchinsky authored
      Summary:
      This change adds Android support to the allocator (but doesn't yet enable it in
      the cmake config), and should be the last fragment of the rewritten change
      D31947.
      
      Android has more memory constraints than other platforms, so the idea of a
      unique context per thread would not have worked. The alternative chosen is to
      allocate a set of contexts based on the number of cores on the machine, and
      share those contexts within the threads. Contexts can be dynamically reassigned
      to threads to prevent contention, based on a scheme suggested by @dvyuokv in
      the initial review.
      
      Additionally, given that Android doesn't support ELF TLS (only emutls for now),
      we use the TSan TLS slot to make things faster: Scudo is mutually exclusive
      with other sanitizers so this shouldn't cause any problem.
      
      An additional change made here, is replacing `thread_local` by `THREADLOCAL`
      and using the initial-exec thread model in the non-Android version to prevent
      extraneous weak definition and checks on the relevant variables.
      
      Reviewers: kcc, dvyukov, alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: srhines, mgorny, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32649
      
      llvm-svn: 302300
      ee069576
  14. Apr 27, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Move thread local variables into their own files · 36b34341
      Kostya Kortchinsky authored
      Summary:
      This change introduces scudo_tls.h & scudo_tls_linux.cpp, where we move the
      thread local variables used by the allocator, namely the cache, quarantine
      cache & prng. `ScudoThreadContext` will hold those. This patch doesn't
      introduce any new platform support yet, this will be the object of a later
      patch. This also changes the PRNG so that the structure can be POD.
      
      Reviewers: kcc, dvyukov, alekseyshl
      
      Reviewed By: dvyukov, alekseyshl
      
      Subscribers: llvm-commits, mgorny
      
      Differential Revision: https://reviews.llvm.org/D32440
      
      llvm-svn: 301584
      36b34341
  15. Apr 21, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Bypass Quarantine if its size is set to 0 · f1a54fdf
      Kostya Kortchinsky authored
      Summary:
      In the current state of things, the deallocation path puts a chunk in the
      Quarantine whether it's enabled or not (size of 0). When the Quarantine is
      disabled, this results in the header being loaded (and checked) twice, and
      stored (and checksummed) once, in `deallocate` and `Recycle`.
      
      This change introduces a `quarantineOrDeallocateChunk` function that has a
      fast path to deallocation if the Quarantine is disabled. Even though this is
      not the preferred configuration security-wise, this change saves a sizeable
      amount of processing for that particular situation (which could be adopted by
      low memory devices). Additionally this simplifies a bit `deallocate` and
      `reallocate`.
      
      Reviewers: dvyukov, kcc, alekseyshl
      
      Reviewed By: dvyukov
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32310
      
      llvm-svn: 301015
      f1a54fdf
  16. Apr 20, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Remove GetActuallyAllocatedSize calls from the fast path · fff8e062
      Kostya Kortchinsky authored
      Summary:
      GetActuallyAllocatedSize is actually expensive. In order to avoid calling this
      function in the malloc/free fast path, we change the Scudo chunk header to
      store the size of the chunk, if from the Primary, or the amount of unused
      bytes if from the Secondary. This way, we only have to call the culprit
      function for Secondary backed allocations (and still in realloc).
      
      The performance gain on a singly threaded pure malloc/free benchmark exercising
      the Primary allocator is above 5%.
      
      Reviewers: alekseyshl, kcc, dvyukov
      
      Reviewed By: dvyukov
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32299
      
      llvm-svn: 300861
      fff8e062
    • Kostya Kortchinsky's avatar
      [scudo] Minor changes and refactoring · 006805d1
      Kostya Kortchinsky authored
      Summary:
      This is part of D31947 that is being split into several smaller changes.
      
      This one deals with all the minor changes, more specifically:
      - Rename some variables and functions to make their purpose clearer;
      - Reorder some code;
      - Mark the hot termination incurring checks as `UNLIKELY`; if they happen, the
        program will die anyway;
      - Add a `getScudoChunk` method;
      - Add an `eraseHeader` method to ScudoChunk that will clear a header with 0s;
      - Add a parameter to `allocate` to know if the allocated chunk should be filled
        with zeros. This allows `calloc` to not have to call
        `GetActuallyAllocatedSize`; more changes to get rid of this function on the
        hot paths will follow;
      - reallocate was missing a check to verify that the pointer is properly
        aligned on `MinAlignment`;
      - The `Stats` in the secondary have to be protected by a mutex as the `Add`
        and `Sub` methods are actually not atomic;
      - The software CRC32 function was moved to the header to allow for inlining.
      
      Reviewers: dvyukov, alekseyshl, kcc
      
      Reviewed By: dvyukov
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D32242
      
      llvm-svn: 300846
      006805d1
  17. Feb 03, 2017
    • Kostya Kortchinsky's avatar
      [scudo] 32-bit quarantine sizes adjustments and bug fixes · 8d6257b4
      Kostya Kortchinsky authored
      Summary:
      The local and global quarantine sizes were not offering a distinction for
      32-bit and 64-bit platforms. This is addressed with lower values for 32-bit.
      
      When writing additional tests for the quarantine, it was discovered that when
      calling some of the allocator interface function prior to any allocation
      operation having occured, the test would crash due to the allocator not being
      initialized. This was addressed by making sure the allocator is initialized
      for those scenarios.
      
      Relevant tests were added in interface.cpp and quarantine.cpp.
      
      Last change being the removal of the extraneous link dependencies for the
      tests thanks to rL293220, anf the addition of the gc-sections linker flag.
      
      Reviewers: kcc, alekseyshl
      
      Reviewed By: alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D29341
      
      llvm-svn: 294037
      8d6257b4
  18. Jan 20, 2017
  19. Jan 18, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Refactor of CRC32 and ARM runtime CRC32 detection · b39dff45
      Kostya Kortchinsky authored
      Summary:
      ARM & AArch64 runtime detection for hardware support of CRC32 has been added
      via check of the AT_HWVAL auxiliary vector.
      
      Following Michal's suggestions in D28417, the CRC32 code has been further
      changed and looks better now. When compiled with full relro (which is strongly
      suggested to benefit from additional hardening), the weak symbol for
      computeHardwareCRC32 is read-only and the assembly generated is fairly clean
      and straight forward. As suggested, an additional optimization is to skip
      the runtime check if SSE 4.2 has been enabled globally, as opposed to only
      for scudo_crc32.cpp.
      
      scudo_crc32.h has no purpose anymore and was removed.
      
      Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek
      
      Reviewed By: rengolin, mgorny
      
      Subscribers: aemerson, rengolin, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28574
      
      llvm-svn: 292409
      b39dff45
  20. Jan 10, 2017
    • Kostya Kortchinsky's avatar
      [scudo] Separate hardware CRC32 routines · c4d6c938
      Kostya Kortchinsky authored
      Summary:
      As raised in D28304, enabling SSE 4.2 for the whole Scudo tree leads to the
      emission of SSE 4.2 instructions everywhere, while the runtime checks only
      applied to the CRC32 computing function.
      
      This patch separates the CRC32 function taking advantage of the hardware into
      its own file, and only enabled -msse4.2 for that file, if detected to be
      supported by the compiler.
      
      Another consequence of removing SSE4.2 globally is realizing that memcpy were
      not being optimized, which turned out to be due to the -fno-builtin in
      SANITIZER_COMMON_CFLAGS. So we now explicitely enable builtins for Scudo.
      
      The resulting assembly looks good, with some CALLs are introduced instead of
      the CRC32 code being inlined.
      
      Reviewers: kcc, mgorny, alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28417
      
      llvm-svn: 291570
      c4d6c938
  21. Dec 15, 2016
    • Kostya Kortchinsky's avatar
      [scudo] Use DefaultSizeClassMap for 32-bit · 47be0edf
      Kostya Kortchinsky authored
      Summary:
      With the recent changes to the Secondary, we use less bits for UnusedBytes,
      which allows us in return to increase the bits used for Offset. That means
      that we can use a Primary SizeClassMap allowing for a larger maximum size.
      
      Reviewers: kcc, alekseyshl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27816
      
      llvm-svn: 289838
      47be0edf
  22. Dec 13, 2016
    • Kostya Kortchinsky's avatar
      Corrected D27428: Do not use the alignment-rounded-up size with secondary · c74da7ce
      Kostya Kortchinsky authored
      Summary:
      I atually had an integer overflow on 32-bit with D27428 that didn't reproduce
      locally, as the test servers would manage allocate addresses in the 0xffffxxxx
      range, which led to some issues when rounding addresses.
      
      At this point, I feel that Scudo could benefit from having its own combined
      allocator, as we don't get any benefit from the current one, but have to work
      around some hurdles (alignment checks, rounding up that is no longer needed,
      extraneous code).
      
      Reviewers: kcc, alekseyshl
      
      Subscribers: llvm-commits, kubabrecka
      
      Differential Revision: https://reviews.llvm.org/D27681
      
      llvm-svn: 289572
      c74da7ce
  23. Dec 02, 2016
  24. Nov 30, 2016
    • Kostya Kortchinsky's avatar
      [scudo] 32-bit and hardware agnostic support · 1148dc52
      Kostya Kortchinsky authored
      Summary:
      This update introduces i386 support for the Scudo Hardened Allocator, and
      offers software alternatives for functions that used to require hardware
      specific instruction sets. This should make porting to new architectures
      easier.
      
      Among the changes:
      - The chunk header has been changed to accomodate the size limitations
        encountered on 32-bit architectures. We now fit everything in 64-bit. This
        was achieved by storing the amount of unused bytes in an allocation rather
        than the size itself, as one can be deduced from the other with the help
        of the GetActuallyAllocatedSize function. As it turns out, this header can
        be used for both 64 and 32 bit, and as such we dropped the requirement for
        the 128-bit compare and exchange instruction support (cmpxchg16b).
      - Add 32-bit support for the checksum and the PRNG functions: if the SSE 4.2
        instruction set is supported, use the 32-bit CRC32 instruction, and in the
        XorShift128, use a 32-bit based state instead of 64-bit.
      - Add software support for CRC32: if SSE 4.2 is not supported, fallback on a
        software implementation.
      - Modify tests that were not 32-bit compliant, and expand them to cover more
        allocation and alignment sizes. The random shuffle test has been deactivated
        for linux-i386 & linux-i686 as the 32-bit sanitizer allocator doesn't
        currently randomize chunks.
      
      Reviewers: alekseyshl, kcc
      
      Subscribers: filcab, llvm-commits, tberghammer, danalbert, srhines, mgorny, modocache
      
      Differential Revision: https://reviews.llvm.org/D26358
      
      llvm-svn: 288255
      1148dc52
  25. Nov 29, 2016
    • Evgeniy Stepanov's avatar
      Return memory to OS right after free (not in the async thread). · d3305afc
      Evgeniy Stepanov authored
      Summary:
      In order to avoid starting a separate thread to return unused memory to
      the system (the thread interferes with process startup on Android,
      Zygota waits for all threads to exit before fork, but this thread never
      exits), try to return it right after free.
      
      Reviewers: eugenis
      
      Subscribers: cryptoad, filcab, danalbert, kubabrecka, llvm-commits
      
      Patch by Aleksey Shlyapnikov.
      
      Differential Revision: https://reviews.llvm.org/D27003
      
      llvm-svn: 288091
      d3305afc
  26. Oct 26, 2016
    • Kostya Kortchinsky's avatar
      [scudo] Lay the foundation for 32-bit support · 71dcc33c
      Kostya Kortchinsky authored
      Summary:
      In order to support 32-bit platforms, we have to make some adjustments in
      multiple locations, one of them being the Scudo chunk header. For it to fit on
      64 bits (as a reminder, on x64 it's 128 bits), I had to crunch the space taken
      by some of the fields. In order to keep the offset field small, the secondary
      allocator was changed to accomodate aligned allocations for larger alignments,
      hence making the offset constant for chunks serviced by it.
      
      The resulting header candidate has been added, and further modifications to
      allow 32-bit support will follow.
      
      Another notable change is the addition of MaybeStartBackgroudThread() to allow
      release of the memory to the OS.
      
      Reviewers: kcc
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D25688
      
      llvm-svn: 285209
      71dcc33c
  27. Sep 30, 2016
  28. Sep 19, 2016
    • Kostya Kortchinsky's avatar
      [scudo] Modify Scudo to use its own Secondary Allocator · 3beafffc
      Kostya Kortchinsky authored
      Summary:
      The Sanitizer Secondary Allocator was not entirely ideal was Scudo for several
      reasons: decent amount of unneeded code, redundant checks already performed by
      the front end, unneeded data structures, difficulty to properly protect the
      secondary chunks header.
      
      Given that the second allocator is pretty straight forward, Scudo will use its
      own, trimming all the unneeded code off of the Sanitizer one. A significant
      difference in terms of security is that now each secondary chunk is preceded
      and followed by a guard page, thus mitigating overflows into and from the
      chunk.
      
      A test was added as well to illustrate the overflow & underflow situations
      into the guard pages.
      
      Reviewers: kcc
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24737
      
      llvm-svn: 281938
      3beafffc
  29. Aug 26, 2016
  30. Aug 25, 2016
  31. Aug 03, 2016
    • Kostya Serebryany's avatar
      [sanitizer] Implement a __asan_default_options() equivalent for Scudo · 707894b0
      Kostya Serebryany authored
      Summary:
      Currently, the Scudo Hardened Allocator only gets its flags via the SCUDO_OPTIONS environment variable.
      With this patch, we offer the opportunity for programs to define their own options via __scudo_default_options() which behaves like __asan_default_options() (weak symbol).
      A relevant test has been added as well, and the documentation updated accordingly.
      I also used this patch as an opportunity to rename a few variables to comply with the LLVM naming scheme, and replaced a use of Report with dieWithMessage for consistency (and to avoid a callback).
      
      Reviewers: llvm-commits, kcc
      
      Differential Revision: https://reviews.llvm.org/D23018
      
      llvm-svn: 277536
      707894b0
  32. Jun 07, 2016
    • Kostya Serebryany's avatar
      [sanitizer] Initial implementation of a Hardened Allocator · 712fc980
      Kostya Serebryany authored
      Summary:
      This is an initial implementation of a Hardened Allocator based on Sanitizer Common's CombinedAllocator.
      It aims at mitigating heap based vulnerabilities by adding several features to the base allocator, while staying relatively fast.
      The following were implemented:
      - additional consistency checks on the allocation function parameters and on the heap chunks;
      - use of checksum protected chunk header, to detect corruption;
      - randomness to the allocator base;
      - delayed freelist (quarantine), to mitigate use after free and overall determinism.
      Additional mitigations are in the works.
      
      Reviewers: eugenis, aizatsky, pcc, krasin, vitalybuka, glider, dvyukov, kcc
      
      Subscribers: kubabrecka, filcab, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D20084
      
      llvm-svn: 271968
      712fc980
Loading