Commit 641fbf1

[TySan] Add initial Type Sanitizer runtime (#76261)
This patch introduces the runtime components for type sanitizer: a sanitizer for type-based aliasing violations.

It is based on Hal Finkel's https://reviews.llvm.org/D32197.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help.

For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does).

The instrumentation handles the "fast path" (where the types match exactly and no partial overlaps are detected), and defers to the runtime to handle all of the more complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected.

The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte).

The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime.

If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes).

If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime.

The instrumentation pass inserts calls to the memset intrinsic that reset the shadow for memory updated by memset, memcpy, and memmove, as well as for allocas/byval (and for lifetime.start/end), to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls.

The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated.

As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work.

This includes build fixes for Linux from Mingjie Xu.

Depends on #76260 (Clang support), #76259 (LLVM support)

PR: #76261
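
The fast-path checks described in the commit message can be sketched as a small standalone model. This is only an illustration, not the code the instrumentation pass emits: the shadow is modelled as a vector with one pointer-sized word per application byte, and the names `ShadowWord`, `FastPath`, `check_access`, and `reset_shadow` are hypothetical. Type descriptors are represented by arbitrary non-zero integers rather than real descriptor-table addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: one pointer-sized shadow word per application byte.
using ShadowWord = std::uintptr_t;
constexpr ShadowWord kUnknown = 0;                // type not known
constexpr ShadowWord kInterior = ~ShadowWord{0};  // -1: not the first byte

enum class FastPath { Match, Claimed, CallRuntime };

// Models the inline check for an access of `size` bytes at `addr` whose
// type descriptor is `type_desc` (assumed to be neither 0 nor -1).
FastPath check_access(std::vector<ShadowWord> &shadow, std::size_t addr,
                      std::size_t size, ShadowWord type_desc) {
  assert(size > 0 && addr + size <= shadow.size());
  assert(type_desc != kUnknown && type_desc != kInterior);
  const ShadowWord first = shadow[addr];
  if (first == type_desc) {
    // Exact match: every remaining byte must be marked interior.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kInterior) return FastPath::CallRuntime;
    return FastPath::Match;
  }
  if (first == kUnknown) {
    // Unknown: every remaining byte must be unknown too.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kUnknown) return FastPath::CallRuntime;
    // Record the type at the first byte and mark the rest as interior.
    shadow[addr] = type_desc;
    for (std::size_t i = 1; i < size; ++i) shadow[addr + i] = kInterior;
    return FastPath::Claimed;
  }
  // Neither an exact match nor unknown: the runtime has to decide.
  return FastPath::CallRuntime;
}

// Models the shadow reset performed for memset/memcpy/memmove and allocas.
void reset_shadow(std::vector<ShadowWord> &shadow, std::size_t addr,
                  std::size_t size) {
  assert(addr + size <= shadow.size());
  for (std::size_t i = 0; i < size; ++i) shadow[addr + i] = kUnknown;
}
```

In this model, a first 4-byte access claims the bytes, a repeated access with the same descriptor matches, and an access with a different descriptor or a partially overlapping range falls through to `CallRuntime`, which is where the full TBAA comparison would take place.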
1 parent 5d4e4b3 commit 641fbf1

35 files changed, +1787 -2 lines changed

clang/runtime/CMakeLists.txt

+1-1
@@ -122,7 +122,7 @@ if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS ${COMPILER_RT_SRC_ROOT}/)
 COMPONENT compiler-rt)

 # Add top-level targets that build specific compiler-rt runtimes.
-set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan ubsan ubsan-minimal)
+set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan tysan ubsan ubsan-minimal)
 foreach(runtime ${COMPILER_RT_RUNTIMES})
   get_ext_project_build_command(build_runtime_cmd ${runtime})
   add_custom_target(${runtime}

compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake

+1
@@ -85,6 +85,7 @@ else()
 set(ALL_TSAN_SUPPORTED_ARCH ${X86_64} ${MIPS64} ${ARM64} ${PPC64} ${S390X}
     ${LOONGARCH64} ${RISCV64})
 endif()
+set(ALL_TYSAN_SUPPORTED_ARCH ${X86_64} ${ARM64})
 set(ALL_UBSAN_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64} ${RISCV64}
     ${MIPS32} ${MIPS64} ${PPC64} ${S390X} ${SPARC} ${SPARCV9} ${HEXAGON}
     ${LOONGARCH64})

compiler-rt/cmake/config-ix.cmake

+14-1
@@ -458,6 +458,7 @@ if(APPLE)
 set(SANITIZER_COMMON_SUPPORTED_OS osx)
 set(PROFILE_SUPPORTED_OS osx)
 set(TSAN_SUPPORTED_OS osx)
+set(TYSAN_SUPPORTED_OS osx)
 set(XRAY_SUPPORTED_OS osx)
 set(FUZZER_SUPPORTED_OS osx)
 set(ORC_SUPPORTED_OS)
@@ -593,6 +594,7 @@ if(APPLE)
 list(APPEND FUZZER_SUPPORTED_OS ${platform})
 list(APPEND ORC_SUPPORTED_OS ${platform})
 list(APPEND UBSAN_SUPPORTED_OS ${platform})
+list(APPEND TYSAN_SUPPORTED_OS ${platform})
 list(APPEND LSAN_SUPPORTED_OS ${platform})
 list(APPEND STATS_SUPPORTED_OS ${platform})
 endif()
@@ -651,6 +653,9 @@ if(APPLE)
 list_intersect(CTX_PROFILE_SUPPORTED_ARCH
   ALL_CTX_PROFILE_SUPPORTED_ARCH
   SANITIZER_COMMON_SUPPORTED_ARCH)
+list_intersect(TYSAN_SUPPORTED_ARCH
+  ALL_TYSAN_SUPPORTED_ARCH
+  SANITIZER_COMMON_SUPPORTED_ARCH)
 list_intersect(TSAN_SUPPORTED_ARCH
   ALL_TSAN_SUPPORTED_ARCH
   SANITIZER_COMMON_SUPPORTED_ARCH)
@@ -703,6 +708,7 @@ else()
 filter_available_targets(PROFILE_SUPPORTED_ARCH ${ALL_PROFILE_SUPPORTED_ARCH})
 filter_available_targets(CTX_PROFILE_SUPPORTED_ARCH ${ALL_CTX_PROFILE_SUPPORTED_ARCH})
 filter_available_targets(TSAN_SUPPORTED_ARCH ${ALL_TSAN_SUPPORTED_ARCH})
+filter_available_targets(TYSAN_SUPPORTED_ARCH ${ALL_TYSAN_SUPPORTED_ARCH})
 filter_available_targets(UBSAN_SUPPORTED_ARCH ${ALL_UBSAN_SUPPORTED_ARCH})
 filter_available_targets(SAFESTACK_SUPPORTED_ARCH
   ${ALL_SAFESTACK_SUPPORTED_ARCH})
@@ -748,7 +754,7 @@ if(COMPILER_RT_SUPPORTED_ARCH)
 endif()
 message(STATUS "Compiler-RT supported architectures: ${COMPILER_RT_SUPPORTED_ARCH}")

-set(ALL_SANITIZERS asan;rtsan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;nsan;asan_abi)
+set(ALL_SANITIZERS asan;rtsan;dfsan;msan;hwasan;tsan;tysan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;nsan;asan_abi)
 set(COMPILER_RT_SANITIZERS_TO_BUILD all CACHE STRING
   "sanitizers to build if supported on the target (all;${ALL_SANITIZERS})")
 list_replace(COMPILER_RT_SANITIZERS_TO_BUILD all "${ALL_SANITIZERS}")
@@ -843,6 +849,13 @@ else()
 set(COMPILER_RT_HAS_CTX_PROFILE FALSE)
 endif()

+if (COMPILER_RT_HAS_SANITIZER_COMMON AND TYSAN_SUPPORTED_ARCH AND
+    OS_NAME MATCHES "Linux|Darwin")
+  set(COMPILER_RT_HAS_TYSAN TRUE)
+else()
+  set(COMPILER_RT_HAS_TYSAN FALSE)
+endif()
+
 if (COMPILER_RT_HAS_SANITIZER_COMMON AND TSAN_SUPPORTED_ARCH)
   if (OS_NAME MATCHES "Linux|Darwin|FreeBSD|NetBSD")
     set(COMPILER_RT_HAS_TSAN TRUE)

compiler-rt/lib/tysan/CMakeLists.txt

+64
@@ -0,0 +1,64 @@
+include_directories(..)
+
+# Runtime library sources and build flags.
+set(TYSAN_SOURCES
+  tysan.cpp
+  tysan_interceptors.cpp)
+set(TYSAN_COMMON_CFLAGS ${SANITIZER_COMMON_CFLAGS})
+append_rtti_flag(OFF TYSAN_COMMON_CFLAGS)
+# Prevent clang from generating libc calls.
+append_list_if(COMPILER_RT_HAS_FFREESTANDING_FLAG -ffreestanding TYSAN_COMMON_CFLAGS)
+
+add_compiler_rt_object_libraries(RTTysan_dynamic
+  OS ${SANITIZER_COMMON_SUPPORTED_OS}
+  ARCHS ${TYSAN_SUPPORTED_ARCH}
+  SOURCES ${TYSAN_SOURCES}
+  ADDITIONAL_HEADERS ${TYSAN_HEADERS}
+  CFLAGS ${TYSAN_DYNAMIC_CFLAGS}
+  DEFS ${TYSAN_DYNAMIC_DEFINITIONS})
+
+
+# Static runtime library.
+add_compiler_rt_component(tysan)
+
+
+if(APPLE)
+  add_weak_symbols("sanitizer_common" WEAK_SYMBOL_LINK_FLAGS)
+
+  add_compiler_rt_runtime(clang_rt.tysan
+    SHARED
+    OS ${SANITIZER_COMMON_SUPPORTED_OS}
+    ARCHS ${TYSAN_SUPPORTED_ARCH}
+    OBJECT_LIBS RTTysan_dynamic
+                RTInterception
+                RTSanitizerCommon
+                RTSanitizerCommonLibc
+                RTSanitizerCommonSymbolizer
+    CFLAGS ${TYSAN_DYNAMIC_CFLAGS}
+    LINK_FLAGS ${WEAK_SYMBOL_LINK_FLAGS}
+    DEFS ${TYSAN_DYNAMIC_DEFINITIONS}
+    PARENT_TARGET tysan)
+
+  add_compiler_rt_runtime(clang_rt.tysan_static
+    STATIC
+    ARCHS ${TYSAN_SUPPORTED_ARCH}
+    OBJECT_LIBS RTTysan_static
+    CFLAGS ${TYSAN_CFLAGS}
+    DEFS ${TYSAN_COMMON_DEFINITIONS}
+    PARENT_TARGET tysan)
+else()
+  foreach(arch ${TYSAN_SUPPORTED_ARCH})
+    set(TYSAN_CFLAGS ${TYSAN_COMMON_CFLAGS})
+    append_list_if(COMPILER_RT_HAS_FPIE_FLAG -fPIE TYSAN_CFLAGS)
+    add_compiler_rt_runtime(clang_rt.tysan
+      STATIC
+      ARCHS ${arch}
+      SOURCES ${TYSAN_SOURCES}
+              $<TARGET_OBJECTS:RTInterception.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommon.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonSymbolizer.${arch}>
+      CFLAGS ${TYSAN_CFLAGS}
+      PARENT_TARGET tysan)
+  endforeach()
+endif()

compiler-rt/lib/tysan/lit.cfg

+35
@@ -0,0 +1,35 @@
+# -*- Python -*-
+
+import os
+
+# Setup config name.
+config.name = 'TypeSanitizer' + getattr(config, 'name_suffix', 'default')
+
+# Setup source root.
+config.test_source_root = os.path.dirname(__file__)
+
+# Setup default compiler flags used with -fsanitize=type option.
+clang_tysan_cflags = (["-fsanitize=type",
+                       "-mno-omit-leaf-frame-pointer",
+                       "-fno-omit-frame-pointer",
+                       "-fno-optimize-sibling-calls"] +
+                      config.target_cflags +
+                      config.debug_info_flags)
+clang_tysan_cxxflags = config.cxx_mode_flags + clang_tysan_cflags
+
+def build_invocation(compile_flags):
+  return " " + " ".join([config.clang] + compile_flags) + " "
+
+config.substitutions.append( ("%clang_tysan ", build_invocation(clang_tysan_cflags)) )
+config.substitutions.append( ("%clangxx_tysan ", build_invocation(clang_tysan_cxxflags)) )
+
+# Default test suffixes.
+config.suffixes = ['.c', '.cc', '.cpp']
+
+# TypeSanitizer tests are currently supported on Linux only.
+if config.host_os not in ['Linux']:
+  config.unsupported = True
+
+if config.target_arch != 'aarch64':
+  config.available_features.add('stable-runtime')
+
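
As an illustration of what a test file picked up by this configuration might contain, the sketch below shows the kind of type punning the commit message refers to, next to a `memcpy`-based alternative that stays within the aliasing rules. It is not a test from this commit. The function names and the `TYSAN_EXAMPLE_MAIN` macro are hypothetical, and the RUN line assumes the usual lit substitutions `%s`, `%t`, and `%run` are available alongside `%clangxx_tysan`. No expected diagnostic is asserted, since the report format is not shown in this part of the diff.

```cpp
// Hypothetical lit test sketch; the RUN line is an assumption, not part of
// the commit.
// RUN: %clangxx_tysan -O0 -DTYSAN_EXAMPLE_MAIN %s -o %t && %run %t

#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "example assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "example assumes IEEE 754 floats");

// Type punning through a pointer cast: the float object is read through an
// lvalue of type uint32_t, which violates the type-based aliasing rules.
std::uint32_t punned_bits(float value) {
  return *reinterpret_cast<std::uint32_t *>(&value);
}

// Well-defined alternative: copy the object representation with memcpy.
std::uint32_t copied_bits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return bits;
}

#ifdef TYSAN_EXAMPLE_MAIN
int main() {
  // The memcpy version is well defined and can be checked directly.
  assert(copied_bits(1.0f) == 0x3f800000u);
  // The punned read is the access a type sanitizer is meant to flag.
  volatile std::uint32_t sink = punned_bits(1.0f);
  (void)sink;
  return 0;
}
#endif
```

The `memcpy` variant is the form that remains valid without -fno-strict-aliasing, which is why it makes a useful baseline beside the punned access.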

compiler-rt/lib/tysan/lit.site.cfg.in

+12
@@ -0,0 +1,12 @@
+@LIT_SITE_CFG_IN_HEADER@
+
+# Tool-specific config options.
+config.name_suffix = "@TYSAN_TEST_CONFIG_SUFFIX@"
+config.target_cflags = "@TYSAN_TEST_TARGET_CFLAGS@"
+config.target_arch = "@TYSAN_TEST_TARGET_ARCH@"
+
+# Load common config for all compiler-rt lit tests.
+lit_config.load_config(config, "@COMPILER_RT_BINARY_DIR@/test/lit.common.configured")
+
+# Load tool-specific config that would do the real work.
+lit_config.load_config(config, "@TYSAN_LIT_SOURCE_DIR@/lit.cfg")
