Use-after-free in _Unpickler_ReadIntoFromFile: temporary memoryview passed to readinto() can outlive its buffer #151046

@tonghuaroot

Bug report

Bug description

The C implementation of pickle.Unpickler, when reading from a file-like
object that provides readinto(), hands that method a temporary memoryview
created over an internal buffer:

PyObject *buf_obj = PyMemoryView_FromMemory(buf, n, PyBUF_WRITE);
...
PyObject *read_size_obj = _Pickle_FastCall(self->readinto, buf_obj);

(Modules/_pickle.c, _Unpickler_ReadIntoFromFile.)

buf points into a short-lived buffer (e.g. the bytes object allocated in
load_counted_binbytes, which may also be reallocated by _PyBytes_Resize or
freed when unpickling ends). The memoryview is never released or invalidated
after readinto() returns. A readinto() implementation that keeps a
reference to the view can therefore use it to read or write the buffer after
it has been freed, which is a use-after-free at the C level.

This only requires a pure-Python file-like object -- no ctypes. A pure-Python
program should not be able to make the interpreter read or write freed memory.

Reproducer

import gc
import pickle
import struct

stashed = []  # views captured by readinto(); they outlive the call

class EvilFile:
    def __init__(self):
        # PROTO 5, then BINBYTES8 announcing a 200,000-byte payload.
        self._h = b"\x80\x05" + b"\x8e" + struct.pack("<Q", 200_000)
        self._p = 0

    def read(self, n=-1):
        if n is None or n < 0:
            d = self._h[self._p:]
        else:
            d = self._h[self._p:self._p + n]
        self._p += len(d)
        return d

    def readline(self):
        return self.read(-1)

    def readinto(self, view):
        stashed.append(view)             # keep the view past readinto()
        view[:] = b"A" * len(view)
        return len(view)

up = pickle.Unpickler(EvilFile())
try:
    up.load()                            # stream ends after the payload
except EOFError:
    pass
del up
gc.collect()                             # free the backing buffer
_ = [bytes(200_000) for _ in range(8)]   # churn the allocator
stashed[0][0]                            # <-- use-after-free read

On a --with-address-sanitizer --with-pydebug build this reports a clean
heap-use-after-free (READ in unpack_single, the buffer freed via
Pdata_dealloc and originally allocated in load_counted_binbytes).
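For readers less familiar with the pickle wire format, the header that EvilFile.__init__ builds can be spelled out with the opcode constants the pickle module exports. This is only an illustrative sketch: the names header, announced and small are made up here, and the 4-byte stream is a harmless stand-in that shows what a complete BINBYTES8 record looks like.

```python
import pickle
import struct

PAYLOAD_LEN = 200_000

# PROTO opcode + protocol number 5, then BINBYTES8 followed by an
# 8-byte little-endian length. No payload bytes are included; in the
# reproducer the unpickler requests them through readinto().
header = pickle.PROTO + bytes([5]) + pickle.BINBYTES8 + struct.pack("<Q", PAYLOAD_LEN)

# Decode the length field back out of the header.
(announced,) = struct.unpack("<Q", header[3:])

# A complete, well-formed stream of the same shape: header, payload, STOP.
small = (
    pickle.PROTO + bytes([5])
    + pickle.BINBYTES8 + struct.pack("<Q", 4)
    + b"AAAA"
    + pickle.STOP
)
value = pickle.loads(small)
```

The reproducer omits the STOP opcode, which is why load() ends with EOFError once the payload has been read.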

Root cause and fix direction

The view is a non-owning window over a raw pointer that the unpickler does not
keep alive. Other CPython sites that drive a user readinto() (e.g.
_io.RawIOBase.read()) hand out an owning object (a bytearray) whose buffer
protocol prevents it from being freed while still exported. The pickle path
should release the temporary memoryview as soon as readinto() returns, so a
surviving reference raises "ValueError: operation forbidden on released
memoryview object" instead of dereferencing freed memory.
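The two behaviours contrasted above can be observed from pure Python without touching freed memory. The sketch below is a model only, not the actual patch: read_into_with_released_view and stashing_readinto are hypothetical names, and a bytearray stands in for the unpickler's internal buffer.

```python
stashed = []

def stashing_readinto(view):
    """Hypothetical hostile readinto(): fills the view and keeps a reference."""
    stashed.append(view)
    view[:] = b"A" * len(view)
    return len(view)

def read_into_with_released_view(readinto, buf):
    """Model of the fix direction: expose buf only for the duration of the call."""
    view = memoryview(buf)
    try:
        return readinto(view)
    finally:
        view.release()  # a surviving reference can no longer reach buf

buf = bytearray(8)
n = read_into_with_released_view(stashing_readinto, buf)

# The stashed view still exists, but using it now fails cleanly.
try:
    stashed[0][0]
except ValueError as exc:
    outcome = str(exc)
else:
    outcome = "no error"

# Contrast: an owning exporter refuses to resize while a view is exported.
owner = bytearray(8)
export = memoryview(owner)
try:
    owner.extend(b"x")
except BufferError:
    resize_blocked = True
else:
    resize_blocked = False
export.release()
```

After the call, buf holds the eight bytes written by readinto(), while the stashed view raises ValueError on access. The real change would have to make the equivalent release on the C side, which this model does not attempt to show.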

I have a patch with a regression test and will open a PR.

(For context: this was originally raised privately with the Python Security
Response Team, who advised opening a public issue.)

CPython versions tested on

3.16.0a0 (main, commit 5755d0f).

Operating systems tested on

macOS (arm64), --with-address-sanitizer --with-pydebug build.

Labels: extension-modules (C modules in the Modules dir), type-crash (a hard crash of the interpreter, possibly with a core dump)