erik895 opened this issue on Oct 20, 2024 · 2 comments
Labels
3.12: bugs and security fixes
3.13: bugs and security fixes
3.14: new features, bugs and security fixes
docs: Documentation in the Doc dir
stdlib: Python modules in the Lib dir
type-bug: An unexpected behavior, bug, or error
When pickling a container object, the Pickler class stores child objects in its self.memo even if the container could not be fully pickled (and so was not written to the output).
This can leave later dumps emitting references to memo entries that are not in the output stream.
import sys
import pickle
import io

buf = io.BytesIO()
pickler = pickle.Pickler(buf)

i_will_be_memorized = 'qqqqqqqqqqqq'
i_cant_be_pickled = sys

try:
    pickler.dump([i_will_be_memorized, i_cant_be_pickled])
except TypeError:
    # i_will_be_memorized is now in pickler memory despite the exception
    pass

# this dump will use the memo
pickler.dump(i_will_be_memorized)

data = buf.getbuffer().tobytes()

# so this will fail with:
# _pickle.UnpicklingError: Memo value not found at index 1
pickle.loads(data)
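The leftover state can also be observed directly by inspecting the memo after the failed dump. The following is a minimal sketch, not part of the original report: `memo_after_failed_dump` is a hypothetical helper name, and the code assumes `pickler.memo.copy()` returns a plain dict snapshot of the memo.

```python
import io
import pickle
import sys


def memo_after_failed_dump():
    """Return (memo entry count, bytes written) after a dump that raises."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf)
    try:
        # The list and the string are memoized before pickling `sys` fails.
        pickler.dump(["qqqqqqqqqqqq", sys])
    except TypeError:
        pass
    # copy() gives a dict snapshot of whatever the memo still holds.
    return len(pickler.memo.copy()), buf.getvalue()


entries, written = memo_after_failed_dump()
print(entries, written)
```

A non-zero entry count here means the memo still holds objects from the dump that raised.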
def clear_memo(self):
    """Clears the pickler's "memo".

    The memo is the data structure that remembers which objects the
    pickler has already seen, so that shared or recursive objects
    are pickled by reference and not by value.  This method is
    useful when re-using picklers.
    """
    self.memo.clear()
In your case, you could probably do something like this:
try:
    pickler.dump([i_will_be_memorized, i_cant_be_pickled])
except TypeError:
    # i_will_be_memorized is now in pickler memory despite the exception
    pickler.clear_memo()
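If earlier dumps on the same pickler succeeded, clearing the whole memo discards their entries too. A possible alternative is to snapshot the memo before the attempt and restore it on failure. This is only a sketch under assumptions: `dump_or_rollback` is a hypothetical helper, and it relies on `pickler.memo.copy()` returning a dict that can be assigned back to `pickler.memo`.

```python
import io
import pickle
import sys


def dump_or_rollback(pickler, obj):
    """Dump obj; if pickling fails, restore the memo to its prior state."""
    saved = pickler.memo.copy()  # dict snapshot taken before the attempt
    try:
        pickler.dump(obj)
    except Exception:
        pickler.memo = saved  # drop entries added by the failed dump
        raise


buf = io.BytesIO()
pickler = pickle.Pickler(buf)
text = "qqqqqqqqqqqq"
try:
    dump_or_rollback(pickler, [text, sys])
except TypeError:
    pass

# The memo no longer knows `text`, so it is written out in full.
pickler.dump(text)
result = pickle.loads(buf.getvalue())
print(result)
```

The design choice is to undo only what the failed dump added, so references to objects from earlier successful dumps stay valid.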
Thanks for the quick reply! I thought clear_memo was undocumented because it is dangerous to use. When you clear the memo, I don't think that action gets written to the output, so the loader won't clear its own memo (I haven't checked any code; this is just my observation).
As a result, loading a stream where clear_memo was used could lead to loading the wrong objects. If clear_memo is used publicly, I think it deserves its own issue.
import pickle
import io

buf = io.BytesIO()
pickler = pickle.Pickler(buf)
pickler.dump("foo")
pickler.clear_memo()
pickler.dump("bar")
pickler.dump("bar")
buf.seek(0)

loader = pickle.Unpickler(buf)
print(loader.load())  # prints foo
print(loader.load())  # prints bar
print(loader.load())  # prints foo (I wanted bar)
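One way to keep writer and reader in sync, sketched here as an assumption and not as a recommendation from the thread, is to clear the memo before every dump and read each pickle with a fresh unpickler. `pickle.load()` creates a new Unpickler on each call, so the reader starts every pickle with an empty memo as well.

```python
import io
import pickle

buf = io.BytesIO()
pickler = pickle.Pickler(buf)
for value in ("foo", "bar", "bar"):
    pickler.clear_memo()  # each dump starts with an empty writer memo
    pickler.dump(value)
buf.seek(0)

# pickle.load() builds a new Unpickler per call, so the reader memo is
# empty at the start of every pickle, matching the writer side.
loaded = [pickle.load(buf) for _ in range(3)]
print(loaded)
```

The trade-off is that objects are no longer shared by reference across dumps, so repeated values are written out in full each time.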
CPython versions tested on:
3.12
Operating systems tested on:
Windows
Linked PRs
Pickler.clear_memo
#125762