What is the security impact of deserializing untru

Is it safe to deserialize untrusted data, provided my code makes no assumptions about the state or class of the deserialized object, or can the mere act of deserializing cause undesired operation?

(Threat model: The attacker may freely modify the serialized data, but that's all he can do)

Deserialization itself can already be unsafe. A serializable class may define a readObject method (see also the specification), which is called when an object of this class is going to be deserialized from the stream. The attacker cannot provide this code, but using a crafted input she can invoke any such readObject method that is on your classpath, with any input.
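To make this concrete, here is a minimal sketch showing that a class's readObject method runs during deserialization even when the caller discards the result. The class Noisy and the helper names are hypothetical, chosen only for illustration; Noisy stands in for any serializable class that happens to be on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ReadObjectDemo {
    // Hypothetical class: a stand-in for any Serializable class on the classpath.
    static class Noisy implements Serializable {
        private static final long serialVersionUID = 1L;
        static int readObjectCalls = 0;

        // Called by the serialization machinery, not by application code.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            readObjectCalls++; // the side effect happens inside deserialization
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Deserializes the data, ignores the resulting object, and reports
    // how often Noisy.readObject has run so far.
    static int deserializeAndIgnore(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject(); // result is discarded, no cast, no assumptions
        }
        return Noisy.readObjectCalls;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Noisy());
        System.out.println("readObject calls: " + deserializeAndIgnore(data));
    }
}
```

The caller never touches the deserialized object, yet the code in readObject has already executed by the time ObjectInputStream.readObject returns.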

Code injection

It is possible to write a readObject implementation that opens the door to arbitrary bytecode injection: simply read a byte array from the stream and pass it to ClassLoader.defineClass and ClassLoader.resolveClass() (see the javadoc for the former and the latter). I don't know what the use of such an implementation would be, but it is possible.

Memory exhaustion

Writing secure readObject methods is hard. Up until somewhat recently the readObject method of HashMap contained the following lines.

int numBuckets = s.readInt();
table = new Entry[numBuckets];

This makes it very easy for an attacker to allocate several gigabytes of memory with just a few dozen bytes of serialized data, which will have your system down with an OutOfMemoryError in no time.
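As a rough illustration of the kind of guard a readObject method needs, the sketch below uses a hypothetical class Table that reads a slot count from the stream and rejects unreasonable values before allocating. The limit MAX_SLOTS and all names are assumptions made for this example; this is not the actual HashMap fix. A real attacker would edit the bytes directly; here writeObject is used to emulate that.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class BoundedAllocationDemo {
    // Assumed, application-specific upper bound on the table size.
    static final int MAX_SLOTS = 1 << 16;

    // Hypothetical class with a hand-written serialized form.
    static class Table implements Serializable {
        private static final long serialVersionUID = 1L;
        transient int claimedSlots; // the count that ends up in the stream
        transient Object[] slots;

        Table(int claimedSlots) {
            this.claimedSlots = claimedSlots;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeInt(claimedSlots);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            int n = in.readInt(); // attacker-controlled value
            // Validate before allocating anything based on stream contents.
            if (n < 0 || n > MAX_SLOTS) {
                throw new InvalidObjectException("unreasonable slot count: " + n);
            }
            slots = new Object[n];
            claimedSlots = n;
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Returns true if deserializing a Table claiming this many slots is rejected.
    static boolean rejects(int claimedSlots)
            throws IOException, ClassNotFoundException {
        byte[] data = serialize(new Table(claimedSlots));
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidObjectException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("huge count rejected: " + rejects(Integer.MAX_VALUE));
        System.out.println("small count rejected: " + rejects(16));
    }
}
```

Without the range check, the single int in the stream would directly determine the size of the allocation.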

The current implementation of Hashtable seems to still be vulnerable to a similar attack; it computes the size of the allocated array based on the number of elements and the load factor, but there is no guard in place against unreasonable values in loadFactor, so we can easily request a billion slots be allocated for each element in the table.

Excessive CPU load

Fixing the vulnerability in HashMap was done as part of changes to address another security issue related to hash-based maps. CVE-2012-2739 describes a denial-of-service attack based on CPU consumption by creating a HashMap with very many colliding keys (i.e. distinct keys with the same hash value). The documented attacks are based on query parameters in URLs or keys in HTTP POST data, but deserialization of a HashMap is also vulnerable to this attack.

The safeguards that were put into HashMap to prevent this type of attack are focussed on maps with String keys. This is adequate to prevent the HTTP-based attacks, but is easily circumvented with deserialization, e.g. by wrapping each String in an ArrayList (whose hashCode is also predictable). Java 8 includes a proposal (JEP-180) to further improve the behaviour of HashMap in the face of many collisions, which extends the protection to all key types that implement Comparable, but that still allows an attack based on ArrayList keys, since ArrayList does not implement Comparable.

The upshot of this is that it is possible for the attacker to engineer a byte stream such that the CPU effort it takes to deserialize an object from this stream grows quadratically with the size of the stream.
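The predictability of ArrayList.hashCode can be illustrated with a small sketch. It relies on the well-known fact that "Aa" and "BB" have the same String hash, so any two equal-length strings built from those blocks collide, and a single-element list's hash depends only on its element's hash. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CollidingKeysDemo {
    // Builds 2^blocks distinct strings out of "Aa"/"BB" blocks.
    // All of them have the same String.hashCode().
    static List<String> collidingStrings(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    // Wraps each colliding string in its own single-element ArrayList.
    static List<List<String>> collidingListKeys(int blocks) {
        List<List<String>> keys = new ArrayList<>();
        for (String s : collidingStrings(blocks)) {
            List<String> key = new ArrayList<>();
            key.add(s);
            keys.add(key);
        }
        return keys;
    }

    // Counts how many different hash codes occur among the keys.
    static int distinctHashCodes(List<?> keys) {
        Set<Integer> hashes = new HashSet<>();
        for (Object key : keys) {
            hashes.add(key.hashCode());
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        List<List<String>> keys = collidingListKeys(10);
        System.out.println("keys: " + keys.size());
        System.out.println("distinct hash codes: " + distinctHashCodes(keys));
    }
}
```

Every key is a distinct object under equals, but they all land in the same bucket, and because ArrayList is not Comparable the map has no ordering to fall back on when searching that bucket.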

Summary

By controlling the input to the deserialization process an attacker can trigger the invocation of any readObject deserialization-method. It is theoretically possible for such a method to allow bytecode injection. In practice it is certainly possible to easily exhaust memory or CPU resources this way, resulting in denial-of-service attacks. Auditing your system against such vulnerabilities is very difficult: you have to check every implementation of readObject, including those in third-party libraries and the runtime library.