Why pickle is not a good way to save your data
April 20, 2009
On the surface, the (c)Pickle module looks like a good, simple way for your Python program to save and load its data; much like XML, it means you don't have to write a parser or even save and load routines as such, just some file and object manipulation code. However, through my experience in writing DWiki I've come to understand that giving in to this temptation would be a mistake (one that I've actually half-made; DWiki's caching layer uses pickling).
Fundamentally, the problems with pickle for saving data are inherent in what it exists to do: it exists to persist and recover Python objects, not to save and restore data. These sound similar enough at first glance, but in the longer term I think you run into some significant issues:
This is not to say that pickle is pointless. It's just that if you're using it, you need to be sure that you really do want objects, not just data.
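To make the objects-versus-data distinction concrete, here is a minimal sketch, assuming Python 3's pickle module. The `Page` class is a hypothetical stand-in and not anything from DWiki. It shows that a pickled instance records the name of its class, which must be found again at load time, while a pickled dictionary carries only its contents.

```python
import pickle


class Page:
    """Hypothetical application object, used only for illustration."""

    def __init__(self, title, body):
        self.title = title
        self.body = body


# Pickle the same information twice: once as an object, once as plain data.
obj_bytes = pickle.dumps(Page("Front", "hello"))
data_bytes = pickle.dumps({"title": "Front", "body": "hello"})

# The object pickle embeds the class name (and its module) so that the
# class can be looked up again on load; the plain-data pickle does not.
print(b"Page" in obj_bytes)
print(b"Page" in data_bytes)
```

The first pickle can only be loaded by a program where `Page` still exists under the same module and name; the second needs nothing beyond the built-in types.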
If you still want to use pickle as your save format because it's easy, I've come around to the idea that you should not attempt to pickle your objects directly. Instead, I think that you should treat pickle as you would JSON: first convert your actual objects into simple data structures (dictionaries, lists, etc.) and then pickle only those data structures.
(Admittedly, this is easy for me to say because my use of pickle to date has been for objects that are relatively easily represented this way.)
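The approach of pickling only simple data structures might look something like the following sketch, again assuming Python 3. The `Page` class, its `to_data` and `from_data` methods, and the two helper functions are all hypothetical names chosen for illustration.

```python
import pickle


class Page:
    """Hypothetical application object, used only for illustration."""

    def __init__(self, title, tags):
        self.title = title
        self.tags = list(tags)

    def to_data(self):
        # Reduce the object to plain dicts, lists, and strings.
        return {"title": self.title, "tags": list(self.tags)}

    @classmethod
    def from_data(cls, data):
        # Rebuild the object from plain data. Which class to create is
        # decided by this code, not by anything stored in the pickle.
        return cls(data["title"], data["tags"])


def page_to_bytes(page):
    # Only the plain data structure is ever handed to pickle.
    return pickle.dumps(page.to_data())


def page_from_bytes(blob):
    return Page.from_data(pickle.loads(blob))


saved = page_to_bytes(Page("Front", ["wiki", "python"]))
restored = page_from_bytes(saved)
print(restored.title, restored.tags)
```

Because the saved bytes hold only a dictionary, `Page` can be renamed, moved to another module, or restructured without invalidating old save files, as long as `from_data` still understands the dictionary layout.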
Written on 20 April 2009.