How do I speed up my serialization code?

I have the following code that serializes a list to a byte array for transport via web services. The code works relatively quickly on small objects, but this is a list of 60,000 or so items. The formatter.Serialize call takes a few seconds to complete. Is there any way to speed it up?

    public static byte[] ToBinary(Object objToBinary)
    {
        using (MemoryStream memStream = new MemoryStream())
        {
            BinaryFormatter formatter = new BinaryFormatter(null, new StreamingContext(StreamingContextStates.Clone));
            formatter.Serialize(memStream, objToBinary);
            memStream.Seek(0, SeekOrigin.Begin);
            return memStream.ToArray();
        }
    }

      



5 answers


The inefficiency you are experiencing comes from several sources:

  • The standard serialization routine uses reflection to enumerate the fields of an object and retrieve their values.
  • The binary serialization format stores the data in associative lists keyed by the string names of the fields.
  • You've got an extra copy in ToArray (as Danny mentioned).

You can get a pretty big improvement right off the bat by implementing ISerializable on the type of object in your List. That cuts out the default serialization behavior that uses reflection.



You can get a little more speed if you reduce the number of elements in the associative array that stores the serialized data. Make sure the elements you store in this associative array are primitive types.

Finally, you can eliminate the ToArray call, but I doubt you will even notice the bump it gives you.
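A minimal sketch of what implementing ISerializable on the element type could look like. The Item type, its fields, and the short key names are illustrative, not taken from the question:

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical element type. Implementing ISerializable makes the
// formatter call GetObjectData directly instead of walking the
// fields via reflection.
[Serializable]
public class Item : ISerializable
{
    public int Id;
    public string Name;

    public Item(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Called during serialization in place of reflection-based field discovery.
    // Short key names keep the associative list of stored values small.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("i", Id);
        info.AddValue("n", Name);
    }

    // Matching deserialization constructor, required by the pattern.
    protected Item(SerializationInfo info, StreamingContext context)
    {
        Id = info.GetInt32("i");
        Name = info.GetString("n");
    }
}
```

Note the protected deserialization constructor: the formatter looks it up by signature, so it must exist even though nothing calls it explicitly.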



If you want real serialization speed, consider using protobuf-net, the C# implementation of Google's Protocol Buffers. It should be an order of magnitude faster than BinaryFormatter.
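A hedged sketch of what the switch might look like. Item here is a hypothetical DTO; the attributes and the Serializer class are protobuf-net's standard contract API (from the protobuf-net NuGet package):

```csharp
using System.IO;
using ProtoBuf; // protobuf-net NuGet package

// Hypothetical DTO. protobuf-net serializes by numbered member
// tags rather than reflected field names, which is a large part
// of its speed advantage over BinaryFormatter.
[ProtoContract]
public class Item
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

public static class ProtoHelper
{
    // Drop-in replacement for the question's ToBinary, but typed.
    public static byte[] ToBinary<T>(T obj)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, obj);
            return ms.ToArray();
        }
    }
}
```

A List&lt;Item&gt; can be passed to the same method; protobuf-net handles collections of contract types in one call.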





It is probably much faster to serialize the entire array (or collection) of 60,000 elements in one shot into one large byte[] rather than in separate chunks. Does each individual object really need its own byte[] array for other parts of the system you're working on? Also, are the actual types of the objects known? If you used a specific type (perhaps a common base class for all those 60,000 objects), the framework wouldn't need to do so many casts and lookups in its pre-built serialization assemblies. Right now you are only giving it Object.



.ToArray() allocates and fills a new array on every call; it is more efficient to copy the data into an existing array. If you need raw speed, you can do that with unsafe code (for example, pin the stream's buffer with fixed and copy the memory with a P/Invoked memcpy via DllImport).

Also consider writing a faster custom formatter.
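If unsafe code is not an option, a purely managed way to skip the extra copy is MemoryStream.GetBuffer(), which exposes the stream's underlying array without copying. A sketch, assuming the caller owns a destination buffer (CopyInto and its parameters are hypothetical names):

```csharp
using System;
using System.IO;

public static class StreamCopy
{
    // Copies a MemoryStream's contents into an existing array without the
    // intermediate allocation that ToArray() makes. GetBuffer() returns the
    // underlying array directly; only the first stream.Length bytes are valid,
    // since the buffer is usually larger than the data written so far.
    public static void CopyInto(MemoryStream stream, byte[] destination, int offset)
    {
        byte[] raw = stream.GetBuffer(); // no copy, just the backing array
        Buffer.BlockCopy(raw, 0, destination, offset, (int)stream.Length);
    }
}
```

Caveat: GetBuffer() works on streams created with the parameterless constructor; a MemoryStream wrapped around a caller-supplied byte[] must be constructed with publiclyVisible set to true, or GetBuffer() throws.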



I started a code generation project that includes a binary DataContract serializer which is at least 30x faster than Json.NET. All you need is the generator NuGet package and an extra lib that comes with a faster replacement for BitConverter.

Then you create a partial class, decorate it with DataContract, and mark each serialized property with DataMember. The generator will then create a ToBytes method, and together with the additional lib you can serialize collections as well. Check out my example from this post:

var objects = new List<Td>();
for (int i = 0; i < 1000; i++)
{
    var obj = new Td
    {
        Message = "Hello my friend",
        Code = "Some code that can be put here",
        StartDate = DateTime.Now.AddDays(-7),
        EndDate = DateTime.Now.AddDays(2),
        Cts = new List<Ct>(),
        Tes = new List<Te>()
    };
    for (int j = 0; j < 10; j++)
    {
        obj.Cts.Add(new Ct { Foo = i * j });
        obj.Tes.Add(new Te { Bar = i + j });
    }
    objects.Add(obj);
}

      

The generator produces a Size property and a ToBytes() method like this:

public int Size
{
    get 
    { 
        var size = 24;
        // Add size for collections and strings
        size += Cts == null ? 0 : Cts.Count * 4;
        size += Tes == null ? 0 : Tes.Count * 4;
        size += Code == null ? 0 : Code.Length;
        size += Message == null ? 0 : Message.Length;

        return size;              
    }
}

public byte[] ToBytes(byte[] bytes, ref int index)
{
    if (index + Size > bytes.Length)
        throw new ArgumentOutOfRangeException("index", "Object does not fit in array");

    // Convert Cts
    // Two bytes length information for each dimension
    GeneratorByteConverter.Include((ushort)(Cts == null ? 0 : Cts.Count), bytes, ref index);
    if (Cts != null)
    {
        for(var i = 0; i < Cts.Count; i++)
        {
            var value = Cts[i];
            value.ToBytes(bytes, ref index);
        }
    }
    // Convert Tes
    // Two bytes length information for each dimension
    GeneratorByteConverter.Include((ushort)(Tes == null ? 0 : Tes.Count), bytes, ref index);
    if (Tes != null)
    {
        for(var i = 0; i < Tes.Count; i++)
        {
            var value = Tes[i];
            value.ToBytes(bytes, ref index);
        }
    }
    // Convert Code
    GeneratorByteConverter.Include(Code, bytes, ref index);
    // Convert Message
    GeneratorByteConverter.Include(Message, bytes, ref index);
    // Convert StartDate
    GeneratorByteConverter.Include(StartDate.ToBinary(), bytes, ref index);
    // Convert EndDate
    GeneratorByteConverter.Include(EndDate.ToBinary(), bytes, ref index);
    return bytes;
}

      

It serializes each object in ~1.5 microseconds, so 1000 objects take about 1.7 ms.







