Description
Required prerequisites
- Make sure you've read the documentation. Your issue may be addressed there.
- Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
- Consider asking first in the Gitter chat room or in a Discussion.
What version (or hash if on master) of pybind11 are you using?
Problem description
Hi, there is a potential bug in pythonbuf reachable by providing truncated UTF-8 data to an undersized buffer.
This bug was reproduced on e6984c8.
Description
The pythonbuf class streams output through an internal buffer whose size (buffer_size) is configurable. It includes logic to avoid sending incomplete UTF-8 sequences between flushes. However, when the buffer is undersized, this write logic can write beyond the bounds of the buffer.
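To make the "incomplete UTF-8" idea concrete, here is a minimal sketch of how the number of held-back trailing bytes could be computed. The helper name incomplete_utf8_tail is hypothetical, and this is not pybind11's actual implementation; it only illustrates the kind of remainder that may stay in the buffer after a flush.

```cpp
#include <cstddef>

// Hypothetical helper (an assumption for illustration, not pybind11 code):
// returns how many bytes at the end of [data, data + size) belong to a UTF-8
// sequence that is still incomplete and would need to be held back.
inline std::size_t incomplete_utf8_tail(const char *data, std::size_t size) {
    std::size_t back = 0;
    // A UTF-8 sequence is at most 4 bytes, so look at no more than 4 bytes.
    while (back < size && back < 4) {
        unsigned char c = static_cast<unsigned char>(data[size - 1 - back]);
        ++back;
        if ((c & 0xC0) != 0x80) {
            // Not a continuation byte: either ASCII or a lead byte.
            std::size_t expected = 1;
            if ((c & 0xE0) == 0xC0) {
                expected = 2;
            } else if ((c & 0xF0) == 0xE0) {
                expected = 3;
            } else if ((c & 0xF8) == 0xF0) {
                expected = 4;
            }
            // 'back' bytes are present, counted from the lead byte to the end.
            return back < expected ? back : 0;
        }
    }
    // Empty input or only continuation bytes: hold nothing back.
    return 0;
}
```

Under this sketch, a buffer ending in the single byte 0xE2 (the lead byte of a 3-byte sequence) has a remainder of 1, which is the situation the testcase in this report creates with buffer_size=1.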
I believe the core issue is in the overflow logic:
pybind11/include/pybind11/iostream.h
Lines 47 to 53 in e6984c8
int overflow(int c) override {
    if (!traits_type::eq_int_type(c, traits_type::eof())) {
        *pptr() = traits_type::to_char_type(c);
        pbump(1);
    }
    return sync() == 0 ? traits_type::not_eof(c) : traits_type::eof();
}
In particular, this function may be invoked when the streambuf is full, so unconditionally writing to *pptr() is a dangerous pattern that could write out of bounds.
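As an illustration of one possible guard (not necessarily the fix the maintainers would choose), the sketch below uses a standalone streambuf that reserves one slot for overflow and refuses to write once that slot is already occupied. The class guarded_buf, its written() accessor, and the toy "hold back a trailing lead byte" rule in sync() are all assumptions made for this example; none of them are pybind11 code.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for a buffered sink; NOT pybind11's pythonbuf.
class guarded_buf : public std::streambuf {
public:
    explicit guarded_buf(std::size_t size)
        : size_(size == 0 ? 1 : size), buf_(new char[size_]) {
        // Reserve the last slot so overflow() normally has room for one char.
        setp(buf_.get(), buf_.get() + size_ - 1);
    }

    // Everything that has been "sent" downstream so far.
    const std::string &written() const { return out_; }

protected:
    int sync() override {
        std::size_t n = static_cast<std::size_t>(pptr() - pbase());
        // Toy rule (assumption): hold back a trailing UTF-8 lead byte (11xxxxxx).
        std::size_t keep =
            (n > 0 && (static_cast<unsigned char>(pptr()[-1]) & 0xC0) == 0xC0) ? 1 : 0;
        out_.append(pbase(), n - keep);
        if (keep > 0) {
            std::memmove(pbase(), pptr() - keep, keep);
        }
        setp(pbase(), epptr());
        pbump(static_cast<int>(keep)); // held-back bytes stay in the buffer
        return 0;
    }

    int_type overflow(int_type c) override {
        if (!traits_type::eq_int_type(c, traits_type::eof())) {
            char *const end = buf_.get() + size_; // one past the allocation
            if (pptr() >= end) {
                // The reserved slot is already taken by held-back bytes:
                // report failure instead of writing out of bounds.
                return traits_type::eof();
            }
            *pptr() = traits_type::to_char_type(c);
            pbump(1);
        }
        return sync() == 0 ? traits_type::not_eof(c) : traits_type::eof();
    }

private:
    std::size_t size_;
    std::unique_ptr<char[]> buf_;
    std::string out_;
};
```

With guarded_buf(1), writing 0xE2, flushing, and then writing 0x80 makes the second put fail and set badbit on the std::ostream rather than touching memory past the 1-byte allocation. Whether failing the write, growing the buffer, or rejecting small sizes in the constructor is the right behavior is a design decision this sketch does not settle.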
POC
The following testcase demonstrates the bug:
testcase.cpp
#include <pybind11/pybind11.h>
#include <pybind11/iostream.h>
int main() {
    if (!Py_IsInitialized()) Py_Initialize();
    PyGILState_STATE g = PyGILState_Ensure();
    pybind11::object pyostream = pybind11::module_::import("sys").attr("stdout");
    // buffer_size=1 is accepted by the constructor but triggers an overflow later
    pybind11::detail::pythonbuf pb(pyostream, 1);
    PyGILState_Release(g);
    std::ostream os(&pb);
    // Emit an incomplete UTF-8 sequence split across writes to exercise utf8 remainder logic
    // This sequence crashes
    if (1) {
        os.put('\xE2');
        os.flush();
        os.put('\x80'); // ASan: heap-buffer-overflow in pythonbuf::overflow
        os.flush();
    }
    // This sequence does not crash
    if (0) {
        os.put('a');
        os.flush();
        os.put('b');
        os.flush();
    }
    return 0;
}
stdout
=================================================================
==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020000003b1 at pc 0x555dd5c7e9b5 bp 0x7ffce660f1b0 sp 0x7ffce660f1a8
WRITE of size 1 at 0x5020000003b1 thread T0
#0 0x555dd5c7e9b4 in pybind11::detail::pythonbuf::overflow(int) /fuzz/install/include/pybind11/iostream.h:49:21
#1 0x7f7d89ba9269 in std::ostream::put(char) (/lib/x86_64-linux-gnu/libstdc++.so.6+0x13c269) (BuildId: e72c155b714bc42a767ec9c0dd94589110e5b42f)
#2 0x555dd5c031a6 in main /fuzz/testcase.cpp:19:12
#3 0x7f7d89750d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#4 0x7f7d89750e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#5 0x555dd5b27de4 in _start (/fuzz/test+0x35de4) (BuildId: e9cb09f22c5440d5b232e234a3bc1b5edf7930bb)
0x5020000003b1 is located 0 bytes after 1-byte region [0x5020000003b0,0x5020000003b1)
allocated by thread T0 here:
#0 0x555dd5c00d0d in operator new[](unsigned long) (/fuzz/test+0x10ed0d) (BuildId: e9cb09f22c5440d5b232e234a3bc1b5edf7930bb)
#1 0x555dd5c055be in pybind11::detail::pythonbuf::pythonbuf(pybind11::object const&, unsigned long) /fuzz/install/include/pybind11/iostream.h:121:43
#2 0x555dd5c03126 in main /fuzz/testcase.cpp:9:33
#3 0x7f7d89750d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
SUMMARY: AddressSanitizer: heap-buffer-overflow /fuzz/install/include/pybind11/iostream.h:49:21 in pybind11::detail::pythonbuf::overflow(int)
Shadow bytes around the buggy address:
0x502000000100: fa fa 01 fa fa fa 01 fa fa fa fd fa fa fa fd fd
0x502000000180: fa fa fd fd fa fa fd fa fa fa fd fd fa fa 01 fa
0x502000000200: fa fa 01 fa fa fa 00 07 fa fa fd fd fa fa fd fd
0x502000000280: fa fa fd fd fa fa fd fd fa fa 01 fa fa fa 06 fa
0x502000000300: fa fa 00 00 fa fa 06 fa fa fa 00 00 fa fa fd fd
=>0x502000000380: fa fa 01 fa fa fa[01]fa fa fa 00 fa fa fa 00 00
0x502000000400: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000000480: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000000500: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000000580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000000600: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==1==ABORTING
stderr
Steps to Reproduce
The crash was triaged with the following Dockerfile:
Dockerfile
# Ubuntu 22.04 with some packages pre-installed
FROM hgarrereyn/stitch_repro_base@sha256:3ae94cdb7bf2660f4941dc523fe48cd2555049f6fb7d17577f5efd32a40fdd2c
RUN git clone https://github.com/pybind/pybind11.git /fuzz/src && \
cd /fuzz/src && \
git checkout e6984c805ec09c0e5f826e3081a32f322a6bfe63 && \
git submodule update --init --remote --recursive
ENV LD_LIBRARY_PATH=/fuzz/install/lib
ENV ASAN_OPTIONS=hard_rss_limit_mb=1024:detect_leaks=0
RUN echo '#!/bin/bash\nexec clang-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper && \
chmod +x /usr/local/bin/clang_wrapper && \
echo '#!/bin/bash\nexec clang++-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper++ && \
chmod +x /usr/local/bin/clang_wrapper++
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-dev python3-minimal cmake ninja-build libeigen3-dev \
&& rm -rf /var/lib/apt/lists/*
ENV CC=clang_wrapper \
CXX=clang_wrapper++
WORKDIR /fuzz/src
RUN cmake -S . -B build -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/fuzz/install \
-DPYBIND11_TEST=OFF
RUN cmake --build build --target install
Build Command
clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -I/usr/include/python3.10 -I/usr/include/eigen3 -L/usr/lib/x86_64-linux-gnu -lpython3.10 && /fuzz/test
Reproduce
- Copy Dockerfile and testcase.cpp into a local folder.
- Build the repro image:
docker build . -t repro --platform=linux/amd64
- Compile and run the testcase in the image:
docker run \
-it --rm \
--platform linux/amd64 \
--mount type=bind,source="$(pwd)/testcase.cpp",target=/fuzz/testcase.cpp \
repro \
bash -c "clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -I/usr/include/python3.10 -I/usr/include/eigen3 -L/usr/lib/x86_64-linux-gnu -lpython3.10 && /fuzz/test"
Additional Info
This testcase was discovered by STITCH, an autonomous fuzzing system. All reports are reviewed manually (by a human) before submission.
Reproducible example code
Is this a regression? Put the last known working version here if it is.
Not a regression