Commit 339cee2

serhiy-storchaka authored and Glyphack committed
pythongh-113626: Add allow_code parameter in marshal functions (pythonGH-113648)
Passing allow_code=False prevents serialization and de-serialization of code objects, whose format is incompatible between Python versions.
1 parent 135a4d7 commit 339cee2

10 files changed, +356 -53 lines changed

Doc/library/marshal.rst

+31 -10
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,11 @@ transfer of Python objects through RPC calls, see the modules :mod:`pickle` and
 :mod:`shelve`. The :mod:`marshal` module exists mainly to support reading and
 writing the "pseudo-compiled" code for Python modules of :file:`.pyc` files.
 Therefore, the Python maintainers reserve the right to modify the marshal format
-in backward incompatible ways should the need arise. If you're serializing and
+in backward incompatible ways should the need arise.
+The format of code objects is not compatible between Python versions,
+even if the version of the format is the same.
+De-serializing a code object in the incorrect Python version has undefined behavior.
+If you're serializing and
 de-serializing Python objects, use the :mod:`pickle` module instead -- the
 performance is comparable, version independence is guaranteed, and pickle
 supports a substantially wider range of objects than marshal.
@@ -40,7 +44,8 @@ Not all Python object types are supported; in general, only objects whose value
 is independent from a particular invocation of Python can be written and read by
 this module. The following types are supported: booleans, integers, floating
 point numbers, complex numbers, strings, bytes, bytearrays, tuples, lists, sets,
-frozensets, dictionaries, and code objects, where it should be understood that
+frozensets, dictionaries, and code objects (if *allow_code* is true),
+where it should be understood that
 tuples, lists, sets, frozensets and dictionaries are only supported as long as
 the values contained therein are themselves supported. The
 singletons :const:`None`, :const:`Ellipsis` and :exc:`StopIteration` can also be
@@ -54,27 +59,32 @@ bytes-like objects.
 The module defines these functions:


-.. function:: dump(value, file[, version])
+.. function:: dump(value, file, version=version, /, *, allow_code=True)

    Write the value on the open file. The value must be a supported type. The
    file must be a writeable :term:`binary file`.

    If the value has (or contains an object that has) an unsupported type, a
    :exc:`ValueError` exception is raised --- but garbage data will also be written
    to the file. The object will not be properly read back by :func:`load`.
+   :ref:`Code objects <code-objects>` are only supported if *allow_code* is true.

    The *version* argument indicates the data format that ``dump`` should use
    (see below).

    .. audit-event:: marshal.dumps value,version marshal.dump

+   .. versionchanged:: 3.13
+      Added the *allow_code* parameter.

-.. function:: load(file)
+
+.. function:: load(file, /, *, allow_code=True)

    Read one value from the open file and return it. If no valid value is read
    (e.g. because the data has a different Python version's incompatible marshal
-   format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. The
-   file must be a readable :term:`binary file`.
+   format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`.
+   :ref:`Code objects <code-objects>` are only supported if *allow_code* is true.
+   The file must be a readable :term:`binary file`.

    .. audit-event:: marshal.load "" marshal.load

@@ -88,24 +98,32 @@ The module defines these functions:
       This call used to raise a ``code.__new__`` audit event for each code object. Now
       it raises a single ``marshal.load`` event for the entire load operation.

+   .. versionchanged:: 3.13
+      Added the *allow_code* parameter.
+

-.. function:: dumps(value[, version])
+.. function:: dumps(value, version=version, /, *, allow_code=True)

    Return the bytes object that would be written to a file by ``dump(value, file)``. The
    value must be a supported type. Raise a :exc:`ValueError` exception if value
    has (or contains an object that has) an unsupported type.
+   :ref:`Code objects <code-objects>` are only supported if *allow_code* is true.

    The *version* argument indicates the data format that ``dumps`` should use
    (see below).

    .. audit-event:: marshal.dumps value,version marshal.dump

+   .. versionchanged:: 3.13
+      Added the *allow_code* parameter.

-.. function:: loads(bytes)
+
+.. function:: loads(bytes, /, *, allow_code=True)

    Convert the :term:`bytes-like object` to a value. If no valid value is found, raise
-   :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. Extra bytes in the
-   input are ignored.
+   :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`.
+   :ref:`Code objects <code-objects>` are only supported if *allow_code* is true.
+   Extra bytes in the input are ignored.

    .. audit-event:: marshal.loads bytes marshal.load

@@ -114,6 +132,9 @@ The module defines these functions:
       This call used to raise a ``code.__new__`` audit event for each code object. Now
       it raises a single ``marshal.loads`` event for the entire load operation.

+   .. versionchanged:: 3.13
+      Added the *allow_code* parameter.
+

 In addition, the following constants are defined:

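The call shapes documented in this file's diff can be sketched as follows. This is a minimal illustration, not part of the commit: the sample value and variable names are made up, and the `allow_code` branch is guarded because the keyword only exists on Python 3.13 and later.

```python
import io
import marshal
import sys

# Illustrative sample value (not from the commit).
value = {"name": "spam", "sizes": [1, 2, 3]}

# version is positional-only: pass it by position, not as version=...
blob = marshal.dumps(value, marshal.version)

# loads() ignores any bytes that follow the first complete value.
assert marshal.loads(blob + b"trailing junk") == value

# allow_code is keyword-only and is rejected by interpreters older than 3.13.
if sys.version_info >= (3, 13):
    buf = io.BytesIO()
    marshal.dump(value, buf, marshal.version, allow_code=False)
    buf.seek(0)
    assert marshal.load(buf, allow_code=False) == value
```

Plain data round-trips identically with or without `allow_code=False`; the flag only changes what happens when a code object is encountered.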
Doc/whatsnew/3.13.rst

+8
@@ -247,6 +247,14 @@ ipaddress
 * Add the :attr:`ipaddress.IPv4Address.ipv6_mapped` property, which returns the IPv4-mapped IPv6 address.
   (Contributed by Charles Machalow in :gh:`109466`.)

+marshal
+-------
+
+* Add the *allow_code* parameter in module functions.
+  Passing ``allow_code=False`` prevents serialization and de-serialization of
+  code objects which are incompatible between Python versions.
+  (Contributed by Serhiy Storchaka in :gh:`113626`.)
+
 mmap
 ----

Include/internal/pycore_global_objects_fini_generated.h

+1
Some generated files are not rendered by default.

Include/internal/pycore_global_strings.h

+1
@@ -276,6 +276,7 @@ struct _Py_global_strings {
         STRUCT_FOR_ID(after_in_child)
         STRUCT_FOR_ID(after_in_parent)
         STRUCT_FOR_ID(aggregate_class)
+        STRUCT_FOR_ID(allow_code)
         STRUCT_FOR_ID(append)
         STRUCT_FOR_ID(argdefs)
         STRUCT_FOR_ID(arguments)

Include/internal/pycore_runtime_init_generated.h

+1
Some generated files are not rendered by default.

Include/internal/pycore_unicodeobject_generated.h

+3
Some generated files are not rendered by default.

Lib/test/test_marshal.py

+26
@@ -129,6 +129,32 @@ def test_different_filenames(self):
         self.assertEqual(co1.co_filename, "f1")
         self.assertEqual(co2.co_filename, "f2")

+    def test_no_allow_code(self):
+        data = {'a': [({0},)]}
+        dump = marshal.dumps(data, allow_code=False)
+        self.assertEqual(marshal.loads(dump, allow_code=False), data)
+
+        f = io.BytesIO()
+        marshal.dump(data, f, allow_code=False)
+        f.seek(0)
+        self.assertEqual(marshal.load(f, allow_code=False), data)
+
+        co = ExceptionTestCase.test_exceptions.__code__
+        data = {'a': [({co, 0},)]}
+        dump = marshal.dumps(data, allow_code=True)
+        self.assertEqual(marshal.loads(dump, allow_code=True), data)
+        with self.assertRaises(ValueError):
+            marshal.dumps(data, allow_code=False)
+        with self.assertRaises(ValueError):
+            marshal.loads(dump, allow_code=False)
+
+        marshal.dump(data, io.BytesIO(), allow_code=True)
+        self.assertEqual(marshal.load(io.BytesIO(dump), allow_code=True), data)
+        with self.assertRaises(ValueError):
+            marshal.dump(data, io.BytesIO(), allow_code=False)
+        with self.assertRaises(ValueError):
+            marshal.load(io.BytesIO(dump), allow_code=False)
+
     @requires_debug_ranges()
     def test_minimal_linetable_with_no_debug_ranges(self):
         # Make sure when demarshalling objects with `-X no_debug_ranges`
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
+Add support for the *allow_code* argument in the :mod:`marshal` module.
+Passing ``allow_code=False`` prevents serialization and de-serialization of
+code objects which is incompatible between Python versions.
