I'm writing a web app that stores user input in an object. This object will be pickled.
Is it possible for a user to craft malicious input that could do something egregious when the object is unpickled?
Here's a really basic code example that ignores wonderful principles such as encapsulation but epitomizes what I'm looking at:
import pickle

class X(object):
    some_attribute = None

x = X()
x.some_attribute = 'insert some user input that could possibly be bad'
p = pickle.dumps(x)
# Can bad things happen here if the object, before being pickled, contained
# potentially bad data in some_attribute?
x = pickle.loads(p)
Yes and no...
No - unless there's a bug in the interpreter or the pickle module, a user can't run arbitrary code through text that your own code stored in the object and then pickled. The text is serialized as plain data. That changes if the unpickled text is eval'ed later, or if you do things like creating a new object whose type is named in this data.
Yes - depending on what you plan to do with the information in the object later, a user can attempt all sorts of things: SQL injection, changing credentials, brute-force password cracking, or anything else you should consider when validating user input. But you are probably checking for all this already.
Edit:
The Python documentation states this:
Warning: The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
However, this is not your case: you accept the input, put it through the regular validation, and then pickle it yourself. The pickled bytes never come from the user.
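To make that concrete, here is a minimal sketch showing that text which looks like a pickle exploit is just stored and returned as a string when your own code does the pickling. The helper round_trip and the dict standing in for the X instance are my own illustrative choices, not something from the question.

```python
import pickle

def round_trip(user_text):
    """Pickle a container holding user_text, unpickle it, and return the text."""
    # A plain dict stands in for the X instance (an assumption made so the
    # sketch runs anywhere without needing an importable class).
    record = {"some_attribute": user_text}
    return pickle.loads(pickle.dumps(record))["some_attribute"]

# Text that would be dangerous if it were fed to pickle.loads() directly.
hostile_text = "cos\nsystem\n(S'echo pwned'\ntR."

restored = round_trip(hostile_text)

# The text is treated as data, so it comes back as an ordinary, unchanged str.
assert isinstance(restored, str)
assert restored == hostile_text
```

The danger only appears when the bytes passed to pickle.loads() are themselves under the user's control.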
Well, according to the documentation:
Warning: The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
This implies that the functionality can be attacked simply by invoking it: if the data is structured in the right way, the unpickling algorithm can be driven into a state where the program's behavior is no longer guaranteed.
According to this site
import pickle
pickle.loads(b"cos\nsystem\n(S'ls ~'\ntR.") # This will run: ls ~ (bytes literal, as Python 3 requires)
is all that is required to execute arbitrary code. There are other examples there as well as an "improvement" to pickling for security purposes.
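One way such an "improvement" can look is to override find_class on pickle.Unpickler so that only allow-listed globals can be loaded. This is a rough sketch of that idea, not necessarily what the linked site proposes; SAFE_GLOBALS, RestrictedUnpickler and restricted_loads are names I made up, and the allow-list is only an example.

```python
import io
import pickle

# Hypothetical allow-list of (module, name) pairs the unpickler may import.
SAFE_GLOBALS = {("builtins", "set"), ("builtins", "frozenset")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global such as os.system.
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )

def restricted_loads(data):
    """Like pickle.loads(), but refuses globals that are not allow-listed."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers need no global lookup, so they still load.
print(restricted_loads(pickle.dumps({"a": [1, 2, 3]})))

# The os.system payload is rejected before anything is executed.
try:
    restricted_loads(b"cos\nsystem\n(S'ls ~'\ntR.")
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

This narrows what a hostile pickle can reach, but it does not make untrusted pickles safe in general, so the warning from the documentation still applies.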
I found this in the documentation of the multiprocessing module, which I think answers the question:
Warning
The Connection.recv() method automatically unpickles the data it
receives, which can be a security risk unless you can trust the
process which sent the message.
Therefore, unless the connection object was produced using Pipe() you
should only use the recv() and send() methods after performing some
sort of authentication. See Authentication keys.
(emphasis mine)
The conclusion is that if the connection object is produced using a trusted Pipe, i.e. the pickle comes from a trusted source, then it can be safely unpickled.
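If the pickle has to pass through somewhere you don't control, one way to get the "some sort of authentication" the warning asks for is to sign the bytes with an HMAC and check the signature before unpickling. This is only a sketch of the general idea, not how multiprocessing implements its authentication keys; SECRET_KEY, signed_dumps and verified_loads are hypothetical names.

```python
import hashlib
import hmac
import pickle

# Hypothetical secret; in practice it must be random and kept private.
SECRET_KEY = b"replace-with-a-real-secret"
TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def signed_dumps(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickled bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload

def verified_loads(blob, key=SECRET_KEY):
    """Unpickle blob only if its tag matches; otherwise raise ValueError."""
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)

blob = signed_dumps({"some_attribute": "user input"})
print(verified_loads(blob))
```

The check runs before pickle.loads() ever sees the payload, so bytes that were altered or were produced without the key are never unpickled.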