The LLVM Thread Sanitizer has been ported to NetBSD

During the past month I’ve finished my work on TSan for NetBSD/amd64.
There are still a few minor issues, but the sanitizer is already stable and suitable for real applications.
I was able to build real applications such as LLDB with TSan and use it to find real threading problems.

Stabilizing and fixing TSan was challenging because several different kinds of issues were intermixed, producing one large, seemingly random breakage that was difficult to analyze.
Software debuggers still need more work on threaded programs, so this was a chicken-and-egg problem: debugging the debugging utilities.

Corrections

Most of the corrections were in TSan-specific and Common Sanitizer code. There was also one fix in LSan.

TSan: on_exit()/atexit(3)/__cxa_atexit()

There are different function types for the same purpose: to execute a callback function on thread or process termination.
The existing code in TSan wasn’t compatible with the NetBSD Operating System:

  • on_exit() – This function is Linux-specific, so I’ve disabled it for NetBSD.
  • atexit(3) – It was reimplemented by TSan using __cxa_atexit(), but in a way that is incompatible with NetBSD.
    TSan was attempting to register a wrapper callback through __cxa_atexit() with the second argument being a function pointer and the third argument (the Dynamic Shared Object pointer) equal to NULL.
    This approach is not portable and it broke on NetBSD, so I had to add a new implementation based on a stack (a LIFO container).
    Every function registered with atexit(3) is intercepted by TSan; the sanitizer pushes it onto a local LIFO container and passes its own wrapper function to the system.
    When the OS executes a callback, it calls the wrapper, which pops the originally saved function pointer from the stack and executes it.
  • __cxa_atexit() – This callback shares TSan internals with atexit(3) and is functional on NetBSD.

To verify the changes, I’ve added a new test named atexit3, which checks that the atexit(3) callbacks are executed in the correct order.
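
The stack-based wrapper scheme described above can be sketched roughly as follows. This is a minimal illustration and not the actual TSan code: the names push_callback, atexit_wrapper and intercepted_atexit, the fixed capacity, and the two demo callbacks are all hypothetical, and locking is omitted.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*exit_fn)(void);

#define MAX_CALLBACKS 32               /* hypothetical fixed capacity */

static exit_fn callback_stack[MAX_CALLBACKS];
static size_t callback_count;

/* Save the user's callback on the local LIFO container. */
int push_callback(exit_fn fn)
{
    if (fn == NULL || callback_count == MAX_CALLBACKS)
        return -1;
    callback_stack[callback_count++] = fn;
    return 0;
}

/* The wrapper handed to the system: pop the most recently saved
 * callback and run it. Each registration adds one wrapper call, so
 * the user's callbacks still run in reverse order of registration. */
void atexit_wrapper(void)
{
    if (callback_count == 0)
        return;
    exit_fn fn = callback_stack[--callback_count];
    fn();
}

/* Stand-in for the interceptor: remember the real callback and
 * register only the wrapper with the system. */
int intercepted_atexit(exit_fn fn)
{
    if (push_callback(fn) != 0)
        return -1;
    if (atexit(atexit_wrapper) != 0) {
        callback_count--;              /* undo the push on failure */
        return -1;
    }
    return 0;
}

/* Demo callbacks that record the order in which they ran. */
char order_log[8];
size_t order_len;
void first_cb(void)  { order_log[order_len++] = '1'; }
void second_cb(void) { order_log[order_len++] = '2'; }
```

Registering first_cb and then second_cb and letting the wrapper run twice executes second_cb before first_cb, which is the ordering that a test like atexit3 is described as checking.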

TSan: _lwp_exit()

To detect a thread’s termination, the TSan interceptors registered a callback function as a pthread(3) destructor.
The destructor callback was registered with pthread_key_create(3), and this approach was broken on NetBSD for two reasons:

  1. We cannot register it during the early libc and libpthread(3) bootstrap, as the system functions need to be initialized first.
  2. The execution of destructor callbacks is not the last event during the termination of a POSIX thread.

I looked for a mechanism to defer the destructor callback registration to later libc initialization stages, similar to constructor sections.
I came to understand that this approach was suboptimal, because it resulted in further breakage.
The NetBSD implementation of POSIX thread termination notifies the parent thread (the one waiting to join) and afterwards still attempts to acquire a mutex.
TSan assumed that no thread-specific function, such as a mutex acquisition, would be called any more, and it had already destroyed part of the thread-specific data needed to trace such events.
I’ve switched the detection of POSIX thread termination to intercepting the _lwp_exit(2) call,
as it’s truly the last interceptable function on NetBSD; it detaches the low-level
thread entity (LWP) that is the kernel context for a POSIX thread.

TSan: Thread Joined vs Thread Exited

Correcting the detection of termination of a thread caused new problems,
with a race between two event notifications that happen at the same time:

  • Thread A sleeps waiting for joining of thread B.
  • Thread B wakes thread A notifying it as joinable.
  • Thread B terminates calling _lwp_exit().

Both events, joining and exiting, are traced by TSan, and they must be intercepted in the order of exiting followed by joining
(unless a thread is marked to be detached without joining).

This problem has been analyzed and fixed by introducing atomic-function waiters in the low-level parts (not exposed to TSan or other sanitizers), which make ThreadRegistry::JoinThread busy-wait
until ThreadRegistry::FinishThread has signalled the end of its execution.
This approach has proven stable, and no failures have been observed so far.
There was a tiny breakage on ppc64-linux, where this change introduced an infinite freeze, but it was
caused by an unrelated problem, and the faulty test was switched from failing to unsupported.
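
The ordering guarantee can be illustrated with a small C11 sketch. It is not the ThreadRegistry code: the structure, the function names and the use of sched_yield() in the wait loop are assumptions made only for illustration.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-thread record kept by a registry. */
struct thread_record {
    atomic_bool finished;   /* set once the exit event has been processed */
    bool joined;            /* set by the joiner afterwards */
};

void thread_record_init(struct thread_record *t)
{
    atomic_init(&t->finished, false);
    t->joined = false;
}

/* Called on behalf of the exiting thread (thread B). */
void finish_thread(struct thread_record *t)
{
    /* ... tear down per-thread tracing state here ... */
    atomic_store_explicit(&t->finished, true, memory_order_release);
}

/* Called by the joining thread (thread A). Even if A was woken before
 * B reached its exit hook, A spins here until the exit event has been
 * recorded, so "exited" is always processed before "joined". */
void join_thread(struct thread_record *t)
{
    while (!atomic_load_explicit(&t->finished, memory_order_acquire))
        sched_yield();
    t->joined = true;
}
```

The release/acquire pair ensures that whatever finish_thread() wrote before setting the flag is visible to the joiner once the loop ends.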

Sanitizers: GetTls

I’ve implemented initial support for determining whether a memory buffer is allocated as Thread-Local Storage.
The current approach uses the FreeBSD code, but it is subject to future improvement to make it more generic and aware of dynamically allocated
TLS vectors (for example after dlopen(3)).

Sanitizers: Handling NetBSD specific indirection of libpthread functions

I’ve corrected handling of three libpthread(3) functions on NetBSD:

  • pthread_mutex_lock(3),
  • pthread_mutex_unlock(3),
  • pthread_setcancelstate(3).

Code outside the libpthread(3) context uses the libc symbols:

  • __libc_mutex_lock,
  • __libc_mutex_unlock,
  • __libc_thr_setcancelstate.

The threading library (libpthread(3)) defines strong aliases:

  • __strong_alias(__libc_mutex_lock,pthread_mutex_lock)
  • __strong_alias(__libc_mutex_unlock,pthread_mutex_unlock)
  • __strong_alias(__libc_thr_setcancelstate,pthread_setcancelstate)

As a result, these functions were invisible to sanitizers on NetBSD.
I’ve introduced interception of the libc-specific functions and added them as NetBSD-specific aliases
for the common pthread(3) functions.

On NetBSD both names need to be intercepted, as the regularly named ones are used internally in libpthread(3).

Sanitizers: Adding DemangleFunctionName for backtracing on NetBSD

NetBSD uses indirection for old threading functions for historical reasons.
The mangled names are an internal implementation detail and should not be
exposed even in backtraces.

  • __libc_mutex_init -> pthread_mutex_init
  • __libc_mutex_lock -> pthread_mutex_lock
  • __libc_mutex_trylock -> pthread_mutex_trylock
  • __libc_mutex_unlock -> pthread_mutex_unlock
  • __libc_mutex_destroy -> pthread_mutex_destroy
  • __libc_mutexattr_init -> pthread_mutexattr_init
  • __libc_mutexattr_settype -> pthread_mutexattr_settype
  • __libc_mutexattr_destroy -> pthread_mutexattr_destroy
  • __libc_cond_init -> pthread_cond_init
  • __libc_cond_signal -> pthread_cond_signal
  • __libc_cond_broadcast -> pthread_cond_broadcast
  • __libc_cond_wait -> pthread_cond_wait
  • __libc_cond_timedwait -> pthread_cond_timedwait
  • __libc_cond_destroy -> pthread_cond_destroy
  • __libc_rwlock_init -> pthread_rwlock_init
  • __libc_rwlock_rdlock -> pthread_rwlock_rdlock
  • __libc_rwlock_wrlock -> pthread_rwlock_wrlock
  • __libc_rwlock_tryrdlock -> pthread_rwlock_tryrdlock
  • __libc_rwlock_trywrlock -> pthread_rwlock_trywrlock
  • __libc_rwlock_unlock -> pthread_rwlock_unlock
  • __libc_rwlock_destroy -> pthread_rwlock_destroy
  • __libc_thr_keycreate -> pthread_key_create
  • __libc_thr_setspecific -> pthread_setspecific
  • __libc_thr_getspecific -> pthread_getspecific
  • __libc_thr_keydelete -> pthread_key_delete
  • __libc_thr_once -> pthread_once
  • __libc_thr_self -> pthread_self
  • __libc_thr_exit -> pthread_exit
  • __libc_thr_setcancelstate -> pthread_setcancelstate
  • __libc_thr_equal -> pthread_equal
  • __libc_thr_curcpu -> pthread_curcpu_np

This demangling also fixes several tests that expect the regular pthread(3) function names.
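
A table-driven lookup is one straightforward way to express such a mapping. The sketch below is only an illustration with a handful of the entries listed above; the function name demangle_function_name and the table layout are hypothetical and do not reproduce the sanitizer's own DemangleFunctionName.

```c
#include <stddef.h>
#include <string.h>

struct name_pair {
    const char *libc_name;      /* internal NetBSD symbol */
    const char *pthread_name;   /* name shown to the user */
};

/* A subset of the mapping listed above. */
static const struct name_pair name_table[] = {
    { "__libc_mutex_lock",    "pthread_mutex_lock" },
    { "__libc_mutex_unlock",  "pthread_mutex_unlock" },
    { "__libc_cond_wait",     "pthread_cond_wait" },
    { "__libc_thr_keycreate", "pthread_key_create" },
    { "__libc_thr_curcpu",    "pthread_curcpu_np" },
};

/* Return the public pthread(3) name for an internal libc symbol,
 * or the input unchanged when no mapping is known. */
const char *demangle_function_name(const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof(name_table) / sizeof(name_table[0]); i++) {
        if (strcmp(name, name_table[i].libc_name) == 0)
            return name_table[i].pthread_name;
    }
    return name;
}
```

Note that the mapping is not always a plain prefix swap (for example __libc_thr_keycreate and __libc_thr_curcpu), which is why an explicit table is used here.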

TSan: Handling NetBSD specific indirection of libpthread functions

I’ve corrected handling of libpthread(3) functions in TSan/NetBSD:

  • pthread_cond_init(3),
  • pthread_cond_signal(3),
  • pthread_cond_broadcast(3),
  • pthread_cond_wait(3),
  • pthread_cond_destroy(3),
  • pthread_mutex_init(3),
  • pthread_mutex_destroy(3),
  • pthread_mutex_trylock(3),
  • pthread_rwlock_init(3),
  • pthread_rwlock_destroy(3),
  • pthread_rwlock_rdlock(3),
  • pthread_rwlock_tryrdlock(3),
  • pthread_rwlock_wrlock(3),
  • pthread_rwlock_trywrlock(3),
  • pthread_rwlock_unlock(3),
  • pthread_once(3).

Code outside the libpthread(3) context uses the libc symbols
that are prefixed with __libc_, for example __libc_cond_init.

As a result, these functions were invisible to sanitizers on NetBSD.
I’ve added interception of the libc-specific symbols, registering them as NetBSD-specific aliases
for the common pthread(3) functions.

On NetBSD both names need to be intercepted, as the regularly named ones
are used internally in libpthread(3).

TSan: Correcting NetBSD support in pthread_once(3)

On NetBSD, the pthread_once(3) type is defined by the following structure:

struct __pthread_once_st {
    pthread_mutex_t pto_mutex;
    int pto_done;
};

I’ve set the position of pto_done to an offset of __sanitizer::pthread_mutex_t_sz
from the beginning of the pthread_once struct.

This corrects deadlocks when the pthread_once(3) function
is used.
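
The offset calculation can be illustrated with a stand-in structure. In this sketch the struct name sketch_pthread_once and the helper once_done_field are hypothetical, and it assumes that no padding is inserted between the mutex and the int, which holds whenever the mutex type is at least as strictly aligned as int.

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-in mirroring the layout quoted above; renamed so that it
 * cannot clash with the definition in the system headers. */
struct sketch_pthread_once {
    pthread_mutex_t pto_mutex;
    int pto_done;
};

/* Locate the "done" flag from an opaque pointer to the once object,
 * knowing only the size of the mutex that precedes it. */
int *once_done_field(void *once, size_t mutex_size)
{
    return (int *)((char *)once + mutex_size);
}
```

A runtime that cannot include the system's pthread headers only needs the mutex size as a constant to find the flag this way.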

Sanitizers: Plug dlerror() leak for swift_demangle

InitializeSwiftDemangler() attempts to resolve the
swift_demangle symbol. If the symbol is not available, the
dlerror() message is leaked; this change plugs that leak.

LSan: Detecting thread’s termination

I’ve fixed the same problem that was analyzed in TSan by switching to the _lwp_exit(2) approach.

Sanitizers: Handling symbol renaming of sigaction on NetBSD

For historical and compatibility reasons, NetBSD uses the __sigaction14
symbol name for the sigaction(2) function.

I’ve renamed the interceptors and users of sigaction to sigaction_symname
and reused that name across the code base.
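
A preprocessor sketch of the idea follows. The sigaction_symname macro name comes from the text above; the stringification helpers and the sigaction_symbol_name() function are hypothetical additions for illustration.

```c
/* Select the real symbol name once, per platform. */
#if defined(__NetBSD__)
#define sigaction_symname __sigaction14
#else
#define sigaction_symname sigaction
#endif

/* Two-level stringification so that the macro is expanded first. */
#define SYMNAME_STR_(x) #x
#define SYMNAME_STR(x) SYMNAME_STR_(x)

/* The name that a symbol lookup (for example with dlsym(3)) would use. */
const char *sigaction_symbol_name(void)
{
    return SYMNAME_STR(sigaction_symname);
}
```

Code that refers to sigaction_symname instead of sigaction then picks up the platform-specific symbol without further conditionals.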

TSan: Correcting mangled_sp on NetBSD/amd64

I’ve fixed the LongJmp(3) function on NetBSD and pointed it at the correct location of the RSP (stack pointer) register on NetBSD/amd64.

TSan: Supporting the setjmp(3) family of functions on NetBSD/amd64

I’ve added support for handling the setjmp(3)/longjmp(3) family of functions on NetBSD/amd64.

There are three types of them on NetBSD:

  • setjmp(3) / longjmp(3)
  • sigsetjmp(3) / siglongjmp(3)
  • _setjmp(3) / _longjmp(3)

Due to historical and compat reasons the symbol
names are mangled:

  • setjmp -> __setjmp14
  • longjmp -> __longjmp14
  • sigsetjmp -> __sigsetjmp14
  • siglongjmp -> __siglongjmp14
  • _setjmp -> _setjmp
  • _longjmp -> _longjmp

This leads to symbol renaming in the existing codebase.

There is no such symbol as __sigsetjmp/__longsetjmp
on NetBSD so it has been disabled.

Additionally, I’ve added a comment that the GNU-style executable stack
note is not needed on NetBSD. The stack is not
executable without it.

TSan: Deferring StartBackgroundThread() and StopBackgroundThread()

NetBSD cannot spawn new POSIX thread entities during the early
libc and libpthread initialization stages. I’ve deferred this to the point
where the first pthread_create(3) call is intercepted.

This is the last change that makes Thread Sanitizer functional
on NetBSD/amd64 without downstream patches.
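
The deferral can be sketched as a one-shot guard placed in the pthread_create(3) interceptor. The function names below are hypothetical, and the counter exists only so that the behaviour can be observed; the real background thread is not started here.

```c
#include <stdatomic.h>

static atomic_flag background_started = ATOMIC_FLAG_INIT;
static int background_starts;          /* demo counter, not in real code */

/* Stand-in for starting the sanitizer's background thread. */
static void start_background_thread(void)
{
    background_starts++;
}

/* Stand-in for the pthread_create(3) interceptor: by the time user
 * code creates its first thread, libc and libpthread are initialized,
 * so the background thread can be started safely, exactly once. */
void on_pthread_create_intercepted(void)
{
    if (!atomic_flag_test_and_set(&background_started))
        start_background_thread();
    /* ... forward to the real pthread_create here ... */
}

int background_start_count(void)
{
    return background_starts;
}
```

The atomic flag makes the guard safe even when several threads create their first child thread concurrently.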

Final TSan results

Results for the check-tsan test-target.

********************
Testing Time: 64.91s
********************
Failing Tests (5):
    ThreadSanitizer-x86_64 :: dtls.c
    ThreadSanitizer-x86_64 :: ignore_lib5.cc
    ThreadSanitizer-x86_64 :: ignored-interceptors-mmap.cc
    ThreadSanitizer-x86_64 :: mutex_lock_destroyed.cc
    ThreadSanitizer-x86_64 :: vfork.cc

  Expected Passes    : 290
  Expected Failures  : 1
  Unsupported Tests  : 83
  Unexpected Failures: 5

These results show that all crucial issues are now fixed and that this sanitizer can be used to trace real software.
The remaining problems are minor and are scheduled to be fixed in the future:

  • signal_block.cc – there is a race; the test passes only
    intermittently.
  • dtls.c – it looks like dynamically allocated TLS vectors are missing on the
    NetBSD side.
  • vfork.cc – it tests UB; it looks like NetBSD behaves the same way as
    Linux does, yet the test is failing.
  • mutex_lock_destroyed.cc – it is based on UB that behaves in the Linux style.
  • The other tests fail in similarly rare scenarios, such as massive
    mmap(2) calls that seem to overflow the shadow.

LLVM JIT

As noted in the previous reports, there is an ongoing effort to improve NetBSD compatibility with the existing Just-In-Time frameworks in LLVM.
Over the past month the existing code has been adjusted to the point where it passes all existing LLVM tests of JIT code on NetBSD under PaX MPROTECT.

Scudo hardened allocator

I’ve added initial support for NetBSD in the Scudo hardened allocator.
I keep this code locally in pkgsrc-wip/compiler-rt-netbsd.

More work is needed in order to correct the known failures in tests.
These are largely caused by the fact that Scudo was a Linux-only feature and the existing tests depend on glibc-specific internals.
They need to be adapted for the default NetBSD allocator (jemalloc(3)).

********************
Testing Time: 5.40s
********************
Failing Tests (32):
    Scudo-i386 :: double-free.cpp
    Scudo-i386 :: interface.cpp
    Scudo-i386 :: memalign.c
    Scudo-i386 :: mismatch.cpp
    Scudo-i386 :: options.cpp
    Scudo-i386 :: overflow.c
    Scudo-i386 :: preload.cpp
    Scudo-i386 :: quarantine.c
    Scudo-i386 :: realloc.cpp
    Scudo-i386 :: rss.c
    Scudo-i386 :: secondary.c
    Scudo-i386 :: sizes.cpp
    Scudo-i386 :: valloc.c
    Scudo-x86_64 :: alignment.c
    Scudo-x86_64 :: double-free.cpp
    Scudo-x86_64 :: interface.cpp
    Scudo-x86_64 :: malloc.cpp
    Scudo-x86_64 :: memalign.c
    Scudo-x86_64 :: mismatch.cpp
    Scudo-x86_64 :: options.cpp
    Scudo-x86_64 :: overflow.c
    Scudo-x86_64 :: preload.cpp
    Scudo-x86_64 :: quarantine.c
    Scudo-x86_64 :: random_shuffle.cpp
    Scudo-x86_64 :: realloc.cpp
    Scudo-x86_64 :: rss.c
    Scudo-x86_64 :: secondary.c
    Scudo-x86_64 :: sized-delete.cpp
    Scudo-x86_64 :: sizes.cpp
    Scudo-x86_64 :: threads.c
    Scudo-x86_64 :: valloc.c

  Expected Passes    : 8
  Unexpected Failures: 32

Plans for the next milestone

The next goal is to finish MSan and switch back to LLDB restoration for tracing single threaded programs.

The TSan corrections indirectly increased the number of passing MSan tests.
I’m going to solve the detected problems, and thanks to the experience gained with the other sanitizers, the MSan issues no longer seem as challenging as they did before TSan was finished.

********************
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Testing Time: 30.91s
********************
Failing Tests (69):
    MemorySanitizer-x86_64 :: allocator_returns_null.cc
    MemorySanitizer-x86_64 :: backtrace.cc
    MemorySanitizer-x86_64 :: c-strdup.c
    MemorySanitizer-x86_64 :: chained_origin.cc
    MemorySanitizer-x86_64 :: chained_origin_empty_stack.cc
    MemorySanitizer-x86_64 :: chained_origin_limits.cc
    MemorySanitizer-x86_64 :: chained_origin_memcpy.cc
    MemorySanitizer-x86_64 :: chained_origin_with_signals.cc
    MemorySanitizer-x86_64 :: check_mem_is_initialized.cc
    MemorySanitizer-x86_64 :: death-callback.cc
    MemorySanitizer-x86_64 :: dlopen_executable.cc
    MemorySanitizer-x86_64 :: dso-origin.cc
    MemorySanitizer-x86_64 :: dtls_test.c
    MemorySanitizer-x86_64 :: dtor-base-access.cc
    MemorySanitizer-x86_64 :: dtor-bit-fields.cc
    MemorySanitizer-x86_64 :: dtor-derived-class.cc
    MemorySanitizer-x86_64 :: dtor-multiple-inheritance-nontrivial-class-members.cc
    MemorySanitizer-x86_64 :: dtor-multiple-inheritance.cc
    MemorySanitizer-x86_64 :: dtor-trivial-class-members.cc
    MemorySanitizer-x86_64 :: dtor-vtable-multiple-inheritance.cc
    MemorySanitizer-x86_64 :: dtor-vtable.cc
    MemorySanitizer-x86_64 :: fork.cc
    MemorySanitizer-x86_64 :: ftime.cc
    MemorySanitizer-x86_64 :: getaddrinfo-positive.cc
    MemorySanitizer-x86_64 :: getaddrinfo.cc
    MemorySanitizer-x86_64 :: getc_unlocked.c
    MemorySanitizer-x86_64 :: heap-origin.cc
    MemorySanitizer-x86_64 :: icmp_slt_allones.cc
    MemorySanitizer-x86_64 :: iconv.cc
    MemorySanitizer-x86_64 :: ifaddrs.cc
    MemorySanitizer-x86_64 :: insertvalue_origin.cc
    MemorySanitizer-x86_64 :: mktime.cc
    MemorySanitizer-x86_64 :: mmap.cc
    MemorySanitizer-x86_64 :: msan_copy_shadow.cc
    MemorySanitizer-x86_64 :: msan_dump_shadow.cc
    MemorySanitizer-x86_64 :: msan_print_shadow.cc
    MemorySanitizer-x86_64 :: msan_print_shadow2.cc
    MemorySanitizer-x86_64 :: origin-store-long.cc
    MemorySanitizer-x86_64 :: param_tls_limit.cc
    MemorySanitizer-x86_64 :: print_stats.cc
    MemorySanitizer-x86_64 :: pthread_getattr_np_deadlock.cc
    MemorySanitizer-x86_64 :: pvalloc.cc
    MemorySanitizer-x86_64 :: readdir64.cc
    MemorySanitizer-x86_64 :: realloc-large-origin.cc
    MemorySanitizer-x86_64 :: realloc-origin.cc
    MemorySanitizer-x86_64 :: report-demangling.cc
    MemorySanitizer-x86_64 :: scandir.cc
    MemorySanitizer-x86_64 :: scandir_null.cc
    MemorySanitizer-x86_64 :: select_float_origin.cc
    MemorySanitizer-x86_64 :: select_origin.cc
    MemorySanitizer-x86_64 :: sem_getvalue.cc
    MemorySanitizer-x86_64 :: signal_stress_test.cc
    MemorySanitizer-x86_64 :: sigwait.cc
    MemorySanitizer-x86_64 :: stack-origin.cc
    MemorySanitizer-x86_64 :: stack-origin2.cc
    MemorySanitizer-x86_64 :: strerror_r-non-gnu.c
    MemorySanitizer-x86_64 :: strlen_of_shadow.cc
    MemorySanitizer-x86_64 :: strndup.cc
    MemorySanitizer-x86_64 :: textdomain.cc
    MemorySanitizer-x86_64 :: times.cc
    MemorySanitizer-x86_64 :: tls_reuse.cc
    MemorySanitizer-x86_64 :: tsearch.cc
    MemorySanitizer-x86_64 :: tzset.cc
    MemorySanitizer-x86_64 :: unaligned_read_origin.cc
    MemorySanitizer-x86_64 :: unpoison_string.cc
    MemorySanitizer-x86_64 :: use-after-dtor.cc
    MemorySanitizer-x86_64 :: use-after-free.cc
    MemorySanitizer-x86_64 :: wcsncpy.cc

  Expected Passes    : 38
  Expected Failures  : 1
  Unsupported Tests  : 24
  Unexpected Failures: 69

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate

Author: Kamil Rytarowski
