Wednesday, April 05, 2006

Monday, April 03, 2006

System.Runtime.Serialization.SerializationInfo -- .KeyExists()?

The Problem:

There are a number of collection-like objects in the .NET Framework that allow you to 'key' the data you enter into them. Dictionary<TKey,TValue> comes to mind almost immediately. It provides a Dictionary<TKey,TValue>.ContainsKey(TKey key) method that makes determining whether a key exists trivial.

But what about the SerializationInfo object?

When an object implements ISerializable, the ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) method adds the object's data to the serialization stream in a key/value style:

   info.AddValue("MyObjectsValueKey", this._MyValue);

Then, during deserialization, the constructor that meets the ISerializable implied-constructor signature:

   MyObject(SerializationInfo info, StreamingContext context)

is called, and the object is supposed to bootstrap itself from the data contained in the SerializationInfo instance.

How? By retrieving values based on the previously used string-keys, of course.

So, why isn't there a SerializationInfo.KeyExists(string key) method?

I hear some of you saying "Well, it's the serialization of an object, you had better know what was serialized in the first place!" True, very true. But consider a versioning issue that might arise:


Example (version 1):

Assume you have a business object that implements ISerializable. For this example, let's call that business object MyObject. In your first version of MyObject, you have some code that looks like this (large portions of code left out to save space):

[Serializable]
public sealed class MyObject : ISerializable {
    private List<OtherObject> _OtherObjects;
    private Int32 _CurrentOtherObjectIndex;

    public OtherObject CurrentOtherObject {
        get { return this._OtherObjects[_CurrentOtherObjectIndex]; }
        set { this._CurrentOtherObjectIndex = this._OtherObjects.IndexOf(value); }
    }


    private const string OTHEROBJECTS_KEY = "___otherobjetskey___";
    private const string INDEX_KEY = "___indexkey___";
    private MyObject(SerializationInfo info, StreamingContext context) {
        this._OtherObjects = (List<OtherObject>)info.GetValue(OTHEROBJECTS_KEY, typeof(List<OtherObject>));
        this._CurrentOtherObjectIndex = info.GetInt32(INDEX_KEY);
    }

    // explicit interface implementation; a plain private method would not satisfy ISerializable
    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue(OTHEROBJECTS_KEY, this._OtherObjects);
        info.AddValue(INDEX_KEY, this._CurrentOtherObjectIndex);
    }
}

As you can see, MyObject has an internal List<T> of OtherObject instances. It exposes a property that returns whatever the 'current' OtherObject is, based upon a private index into the _OtherObjects List<T>.

The serialization code is fairly straightforward.

Now, let's say that somewhere in the development of version 2 we find a bug: _CurrentOtherObjectIndex is not updated properly when OtherObject instances are inserted into the _OtherObjects List<T> at various locations, so the index may end up pointing at the wrong OtherObject.

Instead of updating the various code locations that would need to update _CurrentOtherObjectIndex, we decide it would be better to just replace _CurrentOtherObjectIndex with an OtherObject instance, i.e.:

[Serializable]
public sealed class MyObject : ISerializable {
    private List<OtherObject> _OtherObjects;
    private OtherObject _CurrentOtherObject;

    public OtherObject CurrentOtherObject {
        get { return this._CurrentOtherObject; }
        set { this._CurrentOtherObject = value; }
    }
}

Ahhhh, but making this change would break the compatibility between version 1 and version 2 of MyObject. Oh, but wait, we're handling the serialization ourselves, so we should be able to code around it.
[Let's ignore any possible versioning issues involved with strong-naming; let's just assume the version 2 code tries to deserialize a version 1 MyObject.]

So, ideally, we could change the ISerializable implementation to look like this:

private const string OTHEROBJECTS_KEY = "___otherobjetskey___";
private const string INDEX_KEY = "___indexkey___"; // we need the v1 key
private const string CURRENTOTHEROBJECT_KEY = "___currentotherobjectkey___";
private MyObject(SerializationInfo info, StreamingContext context) {
    this._OtherObjects = (List<OtherObject>)info.GetValue(OTHEROBJECTS_KEY, typeof(List<OtherObject>));

    if (info.ContainsKey(INDEX_KEY)) {
        // we're deserializing a version-1 MyObject. Update it to version-2
        Int32 tmpIndex = info.GetInt32(INDEX_KEY);
        this._CurrentOtherObject = this._OtherObjects[tmpIndex];
    } else {
        // we're deserializing a version-2 MyObject. Just deserialize the CurrentOtherObject
        this._CurrentOtherObject = (OtherObject)info.GetValue(CURRENTOTHEROBJECT_KEY, typeof(OtherObject));
    }
}

// explicit interface implementation; a plain private method would not satisfy ISerializable
void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
    info.AddValue(OTHEROBJECTS_KEY, this._OtherObjects);
    info.AddValue(CURRENTOTHEROBJECT_KEY, this._CurrentOtherObject);
}

Unfortunately, we can't do this because there is no SerializationInfo.ContainsKey(string key) method! What we have to do is this:

private MyObject(SerializationInfo info, StreamingContext context) {
    this._OtherObjects = (List<OtherObject>)info.GetValue(OTHEROBJECTS_KEY, typeof(List<OtherObject>));

    try {
        this._CurrentOtherObject = (OtherObject)info.GetValue(CURRENTOTHEROBJECT_KEY, typeof(OtherObject));
    } catch (SerializationException) {
        // we're deserializing a version-1 MyObject. Update it to version-2
        Int32 tmpIndex = info.GetInt32(INDEX_KEY);
        this._CurrentOtherObject = this._OtherObjects[tmpIndex];
    }
}

We have to rely on an exception as part of our 'normal' code path. This feels 'icky' to me. ("Icky" is a highly technical term that means many things at different times; in this case it means "goes against my 'best-practices' sense".) Exceptions should be exactly that: unexpected error conditions. We have decided that deserializing v1 MyObject instances into v2 MyObject instances is a completely normal operation, so we should not have to rely on an exception to perform normal work. And if v2 MyObject code ends up deserializing a large number of v1 MyObjects, we can expect it to be noticeably slower as well, because throwing and catching exceptions is expensive.
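
For what it's worth, SerializationInfo does expose GetEnumerator(), which returns a SerializationInfoEnumerator over the stored name/value entries, so a key check can be approximated without catching anything. The following is only a minimal sketch: SerializationInfoHelper and its ContainsKey method are hypothetical names of my own, not part of the Framework, and the linear scan assumes key names should match exactly (ordinal, case-sensitive).

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical helper, not part of the .NET Framework.
public static class SerializationInfoHelper
{
    // Returns true if 'info' holds an entry stored under exactly 'key'.
    public static bool ContainsKey(SerializationInfo info, string key)
    {
        if (info == null) throw new ArgumentNullException("info");
        if (key == null) throw new ArgumentNullException("key");

        // Walk every stored entry and compare its name to the requested key.
        SerializationInfoEnumerator entries = info.GetEnumerator();
        while (entries.MoveNext())
        {
            // Assumption: exact, case-sensitive match on the entry name.
            if (string.Equals(entries.Name, key, StringComparison.Ordinal))
            {
                return true;
            }
        }
        return false;
    }
}
```

With a helper like this, the info.ContainsKey(INDEX_KEY) test in the 'ideal' constructor would become SerializationInfoHelper.ContainsKey(info, INDEX_KEY). The scan is linear in the number of entries, which seems a fair price compared to a thrown exception.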


One Possible Solution

One solution is to have any object that implements ISerializable also serialize some form of version information. Whether this version information is culled from the assembly versioning info or is a private field of the class is completely up to your implementation. In the following modification to MyObject, MyObject will store its own versioning information.

So, again from the top, version 1 of MyObject:

[Serializable]
public sealed class MyObject : ISerializable {
    private static Int32 Version = 1;

    private List<OtherObject> _OtherObjects;
    private Int32 _CurrentOtherObjectIndex;

    public OtherObject CurrentOtherObject {
        get { return this._OtherObjects[_CurrentOtherObjectIndex]; }
        set { this._CurrentOtherObjectIndex = this._OtherObjects.IndexOf(value); }
    }


    private const string VERSION_KEY = "___version___";
    private const string OTHEROBJECTS_KEY = "___otherobjetskey___";
    private const string INDEX_KEY = "___indexkey___";
    private MyObject(SerializationInfo info, StreamingContext context) {
        this._OtherObjects = (List<OtherObject>)info.GetValue(OTHEROBJECTS_KEY, typeof(List<OtherObject>));

        // version 1 doesn't care about the VERSION_KEY value because it is the first version
        this._CurrentOtherObjectIndex = info.GetInt32(INDEX_KEY);
    }

    // explicit interface implementation; a plain private method would not satisfy ISerializable
    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue(VERSION_KEY, MyObject.Version);
        info.AddValue(OTHEROBJECTS_KEY, this._OtherObjects);
        info.AddValue(INDEX_KEY, this._CurrentOtherObjectIndex);
    }
}

And the updated version 2 of MyObject:

[Serializable]
public sealed class MyObject : ISerializable {
    private static Int32 Version = 2;

    private List<OtherObject> _OtherObjects;
    private OtherObject _CurrentOtherObject;

    public OtherObject CurrentOtherObject {
        get { return this._CurrentOtherObject; }
        set { this._CurrentOtherObject = value; }
    }


    private const string VERSION_KEY = "___version___";
    private const string OTHEROBJECTS_KEY = "___otherobjetskey___";
    private const string INDEX_KEY = "___indexkey___"; // still needed to read version-1 data
    private const string CURRENTOTHEROBJECT_KEY = "___currentotherobjectkey___";
    private MyObject(SerializationInfo info, StreamingContext context) {
        Int32 streamVersion = info.GetInt32(VERSION_KEY);

        this._OtherObjects = (List<OtherObject>)info.GetValue(OTHEROBJECTS_KEY, typeof(List<OtherObject>));

        if (streamVersion == 2) {
            this._CurrentOtherObject = (OtherObject)info.GetValue(CURRENTOTHEROBJECT_KEY, typeof(OtherObject));
        } else {
            // version-1 data: translate the stored index into the object it pointed to
            Int32 tmpIndex = info.GetInt32(INDEX_KEY);
            this._CurrentOtherObject = this._OtherObjects[tmpIndex];
        }
    }

    // explicit interface implementation; a plain private method would not satisfy ISerializable
    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue(VERSION_KEY, MyObject.Version);
        info.AddValue(OTHEROBJECTS_KEY, this._OtherObjects);
        info.AddValue(CURRENTOTHEROBJECT_KEY, this._CurrentOtherObject);
    }
}

Tada. Clean, deterministic deserialization of version 1 and 2 MyObject instances.
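
If you want to convince yourself that the version switch behaves, the read path can be exercised without a formatter by filling a SerializationInfo by hand. This is a rough sketch under assumptions: VersionedReadDemo, MakeV1, MakeV2 and ReadCurrent are made-up names, plain strings stand in for OtherObject, and the list and current-item keys are simplified stand-ins for the ones above. On recent .NET versions these formatter-era types may trigger obsolescence warnings.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical demo class; strings stand in for OtherObject.
public static class VersionedReadDemo
{
    private const string VERSION_KEY = "___version___";
    private const string ITEMS_KEY = "___items___";
    private const string INDEX_KEY = "___indexkey___";
    private const string CURRENT_KEY = "___current___";

    // Builds the entries a version-1 object would have written: list plus index.
    public static SerializationInfo MakeV1(List<string> items, int index)
    {
        SerializationInfo info = new SerializationInfo(typeof(object), new FormatterConverter());
        info.AddValue(VERSION_KEY, 1);
        info.AddValue(ITEMS_KEY, items);
        info.AddValue(INDEX_KEY, index);
        return info;
    }

    // Builds the entries a version-2 object would have written: list plus current item.
    public static SerializationInfo MakeV2(List<string> items, string current)
    {
        SerializationInfo info = new SerializationInfo(typeof(object), new FormatterConverter());
        info.AddValue(VERSION_KEY, 2);
        info.AddValue(ITEMS_KEY, items);
        info.AddValue(CURRENT_KEY, current);
        return info;
    }

    // Mirrors the version-2 constructor: read the version first, then branch on it.
    public static string ReadCurrent(SerializationInfo info)
    {
        int streamVersion = info.GetInt32(VERSION_KEY);
        List<string> items = (List<string>)info.GetValue(ITEMS_KEY, typeof(List<string>));

        if (streamVersion == 2)
        {
            return info.GetString(CURRENT_KEY);
        }
        // Version-1 data: translate the stored index into the item it pointed to.
        return items[info.GetInt32(INDEX_KEY)];
    }
}
```

Calling ReadCurrent(MakeV1(items, 1)) takes the index branch and ReadCurrent(MakeV2(items, "c")) takes the direct branch, with no exception involved in either path.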


Wrap-Up

So, does anyone out there know why there isn't a
SerializationInfo.ContainsKey(string key)
method? As with many things that I've learned about the .NET Framework, what at first seems obtuse to me usually has a very good explanation behind it.

I gotta give thanks to Mr. DotNet, who has helped me through those mentally-obtuse times. He really knows his stuff. I was hoping to be able to use his SyntaxHighlighter (*nudge*-*nudge*) to make my code a bit more readable. Maybe in a future update!

