Deserializing from a sequence of bytes

I was working on some file-based persistence today and found myself needing to load a string from an array of bytes that represented that string's characters. There is, of course, more than one way to skin this particular cat, but I took it as an opportunity to play around with extension methods.

The first way I found to do this was to have a method with an iterator block that took each pair of bytes and used the BitConverter.ToChar() method to yield a character.

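public static IEnumerable<char> ToChars(this IEnumerable<byte> sequence)
{
    int counter = 0;

    // Two-byte buffer that is refilled for every character.
    byte[] bytes = new byte[2];

    foreach (var b in sequence)
    {
        bytes[counter++ % 2] = b;

        // Every second byte completes a pair, so convert it and yield the char.
        if (counter % 2 == 0)
        {
            yield return BitConverter.ToChar(bytes, 0);
        }
    }
}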

Turning the results of this method into an array and using the appropriate string constructor meant job done. I realise that a little more work is needed to check we have a non-null sequence with an even number of bytes, but this is just illustrative. However, what if I had wanted a sequence of integers or doubles or some other type?
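A minimal sketch of that "array plus string constructor" step, for anyone who wants to try it. The ByteSequenceExtensions and Demo class names and the RoundTrip helper are made up for illustration, and it assumes the bytes were written two per character in the machine's native byte order, which is what BitConverter.GetBytes(char) produces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ByteSequenceExtensions
{
    // Same approach as the method above: every two bytes become one char.
    public static IEnumerable<char> ToChars(this IEnumerable<byte> sequence)
    {
        int counter = 0;
        byte[] bytes = new byte[2];

        foreach (var b in sequence)
        {
            bytes[counter++ % 2] = b;
            if (counter % 2 == 0)
            {
                yield return BitConverter.ToChar(bytes, 0);
            }
        }
    }
}

public static class Demo
{
    // Hypothetical helper: writes a string as two bytes per char, then reads it back.
    public static string RoundTrip(string text)
    {
        byte[] persisted = text.SelectMany(c => BitConverter.GetBytes(c)).ToArray();

        // The step described above: materialise the chars and hand them to the string constructor.
        return new string(persisted.ToChars().ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("hello")); // prints "hello"
    }
}
```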

A more elegant solution would be to create a partitioning method that took the sequence of bytes and returned a sequence of smaller sequences.

public static IEnumerable<byte[]> Partition(this IEnumerable<byte> sequence, int partitionSize)
{
    int partitionIndex = 0;
    byte[] partition = new byte[partitionSize];
    foreach (var b in sequence)
    {
        partition[partitionIndex++] = b;

        if (partitionIndex >= partitionSize)
        {
            yield return partition;

            // Start a fresh array so that callers who hold on to a partition
            // (for example via ToArray()) don't see it overwritten later.
            partition = new byte[partitionSize];
            partitionIndex = 0;
        }
    }

    // We have a partial partition to yield; trim it so it carries no unused trailing bytes.
    if (partitionIndex != 0)
    {
        Array.Resize(ref partition, partitionIndex);
        yield return partition;
    }
}

This time, we've got ourselves a sequence of byte arrays. To get the characters we would've got from the previous example, we have to perform a quick Select() call on the sequence.

var sequenceOfChar = sequenceOfBytes
    .Partition(sizeof(char))
    .Select(x => BitConverter.ToChar(x, 0));

Of course, if we wanted a sequence of integers, the call would be a little different.

var sequenceOfInt32 = sequenceOfBytes
    .Partition(sizeof(int))
    .Select(x => BitConverter.ToInt32(x, 0));
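Since the two calls above differ only in the partition size and the conversion, the pattern could be wrapped up once. The following is only a sketch: ReadValues, Demo and RoundTrip are hypothetical names, and the Partition shown here is a compact variant that hands out a fresh array for every partition rather than the exact method from earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ByteSequenceExtensions
{
    // Compact variant of Partition: a fresh array per partition,
    // with a shorter array for any trailing partial partition.
    public static IEnumerable<byte[]> Partition(this IEnumerable<byte> sequence, int partitionSize)
    {
        var buffer = new List<byte>(partitionSize);
        foreach (var b in sequence)
        {
            buffer.Add(b);
            if (buffer.Count == partitionSize)
            {
                yield return buffer.ToArray();
                buffer.Clear();
            }
        }

        if (buffer.Count > 0)
        {
            yield return buffer.ToArray();
        }
    }

    // Hypothetical wrapper: partition by the size of one value, then convert each chunk.
    public static IEnumerable<T> ReadValues<T>(
        this IEnumerable<byte> sequence, int valueSize, Func<byte[], T> convert)
    {
        return sequence.Partition(valueSize).Select(convert);
    }
}

public static class Demo
{
    // Writes doubles with BitConverter.GetBytes and reads them back through ReadValues.
    public static double[] RoundTrip(double[] values)
    {
        byte[] persisted = values.SelectMany(v => BitConverter.GetBytes(v)).ToArray();
        return persisted
            .ReadValues(sizeof(double), x => BitConverter.ToDouble(x, 0))
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RoundTrip(new[] { 1.5, -2.25, 1024.0 })));
    }
}
```

Note that a trailing partial partition is still passed to the conversion here, so BitConverter.ToDouble would throw if the byte count were not a multiple of eight.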

There's a little more polish required to cope with null sequences and there's no guarantee that the last array in the partitioned sequence will have enough bytes for a full partition. Finally, my examples rely on the data having been persisted in the byte order that the BitConverter calls expect, which is the byte order of the machine running the code (BitConverter.IsLittleEndian tells you which that is), but you could manage that yourself depending on your own circumstances.
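One way that polish might look for the integer case is sketched below. It is a sketch under assumptions rather than a finished implementation: ToInt32s, its storedLittleEndian parameter and the Demo class are invented for illustration, and throwing InvalidDataException on leftover bytes is just one possible policy. The public method is split from the iterator so that the null check runs when the method is called, not when the sequence is first enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ByteSequenceExtensions
{
    // Hypothetical method: validates eagerly, then defers to the iterator below.
    public static IEnumerable<int> ToInt32s(this IEnumerable<byte> sequence, bool storedLittleEndian)
    {
        if (sequence == null)
        {
            throw new ArgumentNullException(nameof(sequence));
        }

        return ToInt32sIterator(sequence, storedLittleEndian);
    }

    private static IEnumerable<int> ToInt32sIterator(IEnumerable<byte> sequence, bool storedLittleEndian)
    {
        var chunk = new byte[sizeof(int)];
        int filled = 0;

        foreach (var b in sequence)
        {
            chunk[filled++] = b;
            if (filled == chunk.Length)
            {
                // BitConverter uses the machine's byte order, so flip the
                // chunk when the data was stored the other way round.
                if (storedLittleEndian != BitConverter.IsLittleEndian)
                {
                    Array.Reverse(chunk);
                }

                yield return BitConverter.ToInt32(chunk, 0);
                filled = 0;
            }
        }

        // Assumed policy: leftover bytes mean the data is truncated.
        if (filled != 0)
        {
            throw new InvalidDataException("Byte count is not a multiple of sizeof(int).");
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Two big-endian integers: 1 and 256.
        var bytes = new byte[] { 0, 0, 0, 1, 0, 0, 1, 0 };
        Console.WriteLine(string.Join(", ", bytes.ToInt32s(storedLittleEndian: false))); // prints "1, 256"
    }
}
```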

Is this useful to anyone else? Is there a better way to achieve the same goals?
