I want to have complete control of the binary file format generated by ObjectOutputStream, so I implemented Externalizable in my objects and extended ObjectOutputStream and ObjectInputStream so that I could override the writeStreamHeader() and readStreamHeader() methods.

Tim Rohaly

A serialized object is written out by an ObjectOutputStream and read in by an ObjectInputStream. It is these streams that are responsible for writing and recognizing the binary serialization format. Objects that implement Serializable or Externalizable can only tell the underlying stream what data they want to write out - they don't have control over the binary format.

These object streams do allow minimal customization of the protocol: subclasses can override writeStreamHeader() in ObjectOutputStream and readStreamHeader() in ObjectInputStream. This affects only the magic number and version number written at the beginning of the serialized data, not the internal structure of the data itself. Any further customization requires you to write your own stream classes.

This is true even for primitive types written out through the ObjectOutputStream, because ObjectOutputStream implements its own write() methods to conform to the serialization protocol.

As an example, here is a snippet of code which writes out one byte to an ObjectOutputStream:

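FileOutputStream   file = new FileOutputStream("test.ser");
ObjectOutputStream out  = new ObjectOutputStream(file);
out.writeByte(65);   // the single data byte, 0x41
out.close();         // flushes the buffered block data to the file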
The resulting file, "test.ser", will have a total length of 7 bytes, as follows:
0xACED    // ObjectStreamConstants.STREAM_MAGIC
0x0005    // ObjectStreamConstants.STREAM_VERSION
0x77      // ObjectStreamConstants.TC_BLOCKDATA
0x01      // length of data
0x41      // 65 decimal
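
If you want to check this layout yourself, the following is a minimal sketch that writes the same single byte to an in-memory buffer instead of a file and prints the result in hex. The class and method names (OneByteDump, serializeOneByte, toHex) are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class OneByteDump {
    // Serializes a single byte to memory and returns the raw stream contents.
    static byte[] serializeOneByte() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the stream flushes the buffered block data into the buffer.
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeByte(65);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated, two-digit uppercase hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // Prints: AC ED 00 05 77 01 41
        System.out.println(toHex(serializeOneByte()));
    }
}
```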

Subclassing ObjectOutputStream and overriding writeStreamHeader() will allow you to change, delete, or add to the first 4 bytes shown above, but that is the limit of customization allowed by overriding. In particular, the block header TC_BLOCKDATA is written by a private method within ObjectOutputStream, so it can't be overridden.
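
As a rough sketch of what header overriding looks like in practice, the subclasses below replace the standard magic and version numbers with a single 4-byte value. The class names and the MAGIC value are invented for this example. Note that both header methods are called from the superclass constructor, so they should not rely on instance fields of the subclass.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.StreamCorruptedException;

class CustomHeaderOutputStream extends ObjectOutputStream {
    static final int MAGIC = 0xCAFED00D; // hypothetical header value

    CustomHeaderOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void writeStreamHeader() throws IOException {
        // Replaces STREAM_MAGIC and STREAM_VERSION with our own 4 bytes.
        writeInt(MAGIC);
    }
}

class CustomHeaderInputStream extends ObjectInputStream {
    CustomHeaderInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected void readStreamHeader() throws IOException {
        // Must consume exactly what the matching writeStreamHeader() wrote.
        if (readInt() != CustomHeaderOutputStream.MAGIC) {
            throw new StreamCorruptedException("unexpected stream header");
        }
    }
}

class CustomHeaderDemo {
    // Writes one byte through the custom output stream.
    static byte[] write(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (CustomHeaderOutputStream out = new CustomHeaderOutputStream(buffer)) {
            out.writeByte(value);
        }
        return buffer.toByteArray();
    }

    // Reads that byte back through the matching input stream.
    static byte read(byte[] data) throws IOException {
        try (CustomHeaderInputStream in =
                 new CustomHeaderInputStream(new ByteArrayInputStream(data))) {
            return in.readByte();
        }
    }
}
```

The header changes, but the data byte is still wrapped in a TC_BLOCKDATA marker and a length byte, which is exactly the part that overriding cannot reach.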

There is a lot of logic behind the serialization protocol. From the above example, you see that the protocol writes out an indicator that a block of data follows, then writes out the length of the data, and finally the data itself. This allows efficient reading of the data by a stream, and accommodates variable-length data structures. If you really want to define your own protocol, you will have to write your own input and output stream classes that use your own binary format. You will then need to use these in place of ObjectInputStream and ObjectOutputStream. You will probably want to implement Externalizable in all your classes as well, so you can control how each object writes its own state.
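
One possible starting point, offered only as a sketch: since Externalizable objects talk to the ObjectOutput and ObjectInput interfaces rather than to the concrete object streams, you can implement those interfaces on top of DataOutputStream and DataInputStream. The names RawObjectOutput, RawObjectInput, readInto, and RawPoint are hypothetical, and this toy format writes no header, block markers, or type information, so the reader has to know which class to expect.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InputStream;
import java.io.NotSerializableException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.OutputStream;

// Hypothetical output stream: emits only the bytes each object writes.
class RawObjectOutput extends DataOutputStream implements ObjectOutput {
    RawObjectOutput(OutputStream out) {
        super(out);
    }

    @Override
    public void writeObject(Object obj) throws IOException {
        if (!(obj instanceof Externalizable)) {
            throw new NotSerializableException(String.valueOf(obj));
        }
        ((Externalizable) obj).writeExternal(this);
    }
}

// Hypothetical input stream matching RawObjectOutput.
class RawObjectInput extends DataInputStream implements ObjectInput {
    RawObjectInput(InputStream in) {
        super(in);
    }

    // The format carries no type tag, so the caller supplies the instance to fill.
    <T extends Externalizable> T readInto(T target)
            throws IOException, ClassNotFoundException {
        target.readExternal(this);
        return target;
    }

    @Override
    public Object readObject() throws IOException {
        throw new IOException("no type information in this format; use readInto()");
    }
}

// Example class that controls how its own state is written.
class RawPoint implements Externalizable {
    int x;
    int y;

    public RawPoint() {
    }

    RawPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}
```

With these classes, writing a RawPoint produces exactly 8 bytes (two ints). A real format would likely need at least a type tag per object, plus some way to handle object references and nulls, which is much of what the standard protocol's extra bytes are for.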