Posts Tagged .NET

Fast serialization from .NET to non-.NET applications

If you need to exchange data between .NET and non-.NET applications, the only built-in way I know of is to serialize the data using SOAP serialization. However, creating and sending XML is too slow for time-critical applications, so you need an alternative. No matter where you are sending the data (over the network or anywhere else), you need to create a byte buffer populated with your data, and for best performance you need to do this as fast as possible. I am aware of two ways to create a byte array populated with data:

  • Create a MemoryStream and a BinaryWriter. Write all the data into the stream, and at the end call MemoryStream.ToArray().
  • Use the BitConverter class for each value type you have, and Encoding.GetBytes for strings. Combine the resulting arrays using Buffer.BlockCopy().
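For reference, the two conventional approaches listed above might look roughly like this. This is only a sketch: the class name BaselineSerializers and the two-field record (an int id and a double price) are made up for illustration.

```csharp
using System;
using System.IO;

public static class BaselineSerializers
{
    // Approach 1: MemoryStream + BinaryWriter, then ToArray().
    public static byte[] WithBinaryWriter(int id, double price)
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(id);    // 4 bytes
            writer.Write(price); // 8 bytes
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Approach 2: BitConverter per value, combined with Buffer.BlockCopy().
    public static byte[] WithBitConverter(int id, double price)
    {
        byte[] idBytes = BitConverter.GetBytes(id);
        byte[] priceBytes = BitConverter.GetBytes(price);

        byte[] buffer = new byte[idBytes.Length + priceBytes.Length];
        Buffer.BlockCopy(idBytes, 0, buffer, 0, idBytes.Length);
        Buffer.BlockCopy(priceBytes, 0, buffer, idBytes.Length, priceBytes.Length);
        return buffer;
    }
}
```

Both methods allocate intermediate objects (a stream, or one small array per value), which is the overhead the pointer-based approach tries to avoid.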

Here I present my way, which in my tests is 10-20 times faster than the first solution and 3-5 times faster than the second one. It is very simple: I create a buffer of the required size in advance (I calculate the total size before serialization), take a pointer to it, and use casts to copy values into it. After each copy, I move the pointer forward by the amount of data copied. For example, after copying an int value, I move the pointer 4 positions forward, because an int is 4 bytes. The same goes for every type; consult MSDN (or use sizeof) if you need the size of a type. Note that a C# bool occupies 1 byte, not 4.

Example – serialization of a double value:

I know that the size of a double is 8 bytes, so I create a buffer of the same size:

byte[] buffer = new byte[8];

fixed (byte* pBuffer = buffer) // the fixed keyword is required: it stops the GC from moving the buffer in memory
{
    byte* pBufferWriter = pBuffer; // a second pointer is required, because pBuffer is read-only

    *((double*)pBufferWriter) = yourDoubleValue; // copy the value
    pBufferWriter += 8; // move the pointer forward 8 bytes
}

Example – deserialization of a double value:

…. // obtain a pointer pBufferReader to the byte[] holding the data

double result = *((double*)pBufferReader);
pBufferReader += 8;

The same works for any primitive value type (int, short, bool, etc.).
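Putting the pieces together, a round trip for a small record could be sketched as follows. The record layout (int, double, bool) and the names PointerSerializer, Write and Read are assumptions made for this example, and the code must be compiled with the /unsafe option.

```csharp
using System;

public static class PointerSerializer
{
    // Hypothetical record layout: int id (4) + double price (8) + bool active (1) = 13 bytes.
    public const int RecordSize = sizeof(int) + sizeof(double) + sizeof(bool);

    public static unsafe byte[] Write(int id, double price, bool active)
    {
        byte[] buffer = new byte[RecordSize];
        fixed (byte* pBuffer = buffer)
        {
            byte* p = pBuffer; // writable copy of the fixed pointer

            *((int*)p) = id;
            p += sizeof(int);

            *((double*)p) = price;
            p += sizeof(double);

            *((bool*)p) = active;
            p += sizeof(bool);
        }
        return buffer;
    }

    public static unsafe void Read(byte[] buffer, out int id, out double price, out bool active)
    {
        if (buffer == null || buffer.Length < RecordSize)
        {
            throw new ArgumentException("Buffer is too small for one record.", "buffer");
        }

        fixed (byte* pBuffer = buffer)
        {
            byte* p = pBuffer;

            // Values must be read in exactly the order they were written.
            id = *((int*)p);
            p += sizeof(int);

            price = *((double*)p);
            p += sizeof(double);

            active = *((bool*)p);
            p += sizeof(bool);
        }
    }
}
```

Using sizeof instead of literal sizes removes one source of pointer arithmetic mistakes. The double here is written at an unaligned offset, which is fine on x86/x64 but may deserve attention on other architectures.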

The real challenge is serializing strings. The first step is to take a pointer to the string:

fixed (char* pStr = yourStringValue)

From this point, you just need to copy all the bytes from pStr to your buffer. Remember that .NET strings are UTF-16, so each char takes 2 bytes. Here is the function I use to copy strings; it performs much better than copying byte by byte:

public sealed class SerializationHelper
{
    /// <summary>
    /// String copy optimized for copying chars.
    /// </summary>
    /// <param name="dmem">Pointer to destination.</param>
    /// <param name="smem">Pointer to source.</param>
    /// <param name="charCount">Count of chars to copy.</param>
    public static unsafe void StringCopy(char* dmem, char* smem, int charCount)
    {
        if (charCount > 0)
        {
            // If the destination is not 4-byte aligned, copy one char to align it.
            if ((((int)dmem) & 2) != 0)
            {
                dmem[0] = smem[0];
                dmem++;
                smem++;
                charCount--;
            }
            // Main loop: copy 8 chars (16 bytes) per iteration, one int at a time.
            while (charCount >= 8)
            {
                *((int*)dmem) = *((int*)smem);
                *((int*)(dmem + 2)) = *((int*)(smem + 2));
                *((int*)(dmem + 4)) = *((int*)(smem + 4));
                *((int*)(dmem + 6)) = *((int*)(smem + 6));
                dmem += 8;
                smem += 8;
                charCount -= 8;
            }
            // Copy the remaining 0-7 chars in blocks of 4, 2 and 1.
            if ((charCount & 4) != 0)
            {
                *((int*)dmem) = *((int*)smem);
                *((int*)(dmem + 2)) = *((int*)(smem + 2));
                dmem += 4;
                smem += 4;
            }
            if ((charCount & 2) != 0)
            {
                *((int*)dmem) = *((int*)smem);
                dmem += 2;
                smem += 2;
            }
            if ((charCount & 1) != 0)
            {
                dmem[0] = smem[0];
            }
        }
    }
}

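After you finish copying, don’t forget to advance the buffer pointer.

Deserializing a string is very simple, provided the serialized chars are followed by a null terminator, because this constructor reads up to the first '\0' (if you store the length instead, use the string(char*, int, int) overload):

string myString = new string((char*)pBuffer); // pBuffer is a pointer into your buffer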
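One way to avoid depending on a null terminator is to write the char count in front of the chars. The sketch below assumes that layout (a 4-byte length followed by the UTF-16 chars); the StringSerializer class is hypothetical, and a plain loop stands in for StringCopy so that the block compiles on its own (with /unsafe).

```csharp
using System;

public static class StringSerializer
{
    // Assumed layout for this sketch: int charCount, then charCount UTF-16 chars.
    public static unsafe byte[] Write(string value)
    {
        if (value == null) throw new ArgumentNullException("value");

        byte[] buffer = new byte[sizeof(int) + value.Length * sizeof(char)];
        fixed (byte* pBuffer = buffer)
        fixed (char* pStr = value)
        {
            byte* p = pBuffer;

            *((int*)p) = value.Length; // length prefix
            p += sizeof(int);

            char* dest = (char*)p;
            for (int i = 0; i < value.Length; i++)
            {
                dest[i] = pStr[i]; // simple stand-in for an optimized copy
            }
            p += value.Length * sizeof(char); // advance past the chars
        }
        return buffer;
    }

    public static unsafe string Read(byte[] buffer)
    {
        if (buffer == null || buffer.Length < sizeof(int))
        {
            throw new ArgumentException("Buffer is too small.", "buffer");
        }

        fixed (byte* pBuffer = buffer)
        {
            byte* p = pBuffer;
            int charCount = *((int*)p);
            p += sizeof(int);

            // Reject lengths that would read past the end of the buffer.
            if (charCount < 0 || charCount > (buffer.Length - sizeof(int)) / sizeof(char))
            {
                throw new ArgumentException("Corrupt length prefix.", "buffer");
            }
            return new string((char*)p, 0, charCount);
        }
    }
}
```

The length prefix costs 4 bytes per string but lets the reader know how far to advance its pointer without scanning for a terminator.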

Suppose you need to copy a block of memory into your buffer (for example, an inner object that was serialized earlier). You can use the following method:

public static unsafe void ByteCopy(byte* ps, byte* pd, int count)
{
    // Loop over the count in blocks of 4 bytes, copying an
    // integer (4 bytes) at a time:
    for (int n = 0; n < count / 4; n++)
    {
        *((int*)pd) = *((int*)ps);
        pd += 4;
        ps += 4;
    }

    // Complete the copy by moving any bytes that weren't
    // moved in blocks of 4:
    for (int n = 0; n < count % 4; n++)
    {
        *pd = *ps;
        pd++;
        ps++;
    }
}
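As a usage sketch, an already serialized inner block could be embedded into an outer buffer like this. The length-prefixed layout and the BlockAppender class are assumptions for illustration; CopyBytes repeats the same copy loop (source first, then destination) so that the block compiles on its own with /unsafe.

```csharp
using System;

public static class BlockAppender
{
    // Same idea as ByteCopy: 4 bytes at a time, then the leftover bytes.
    private static unsafe void CopyBytes(byte* ps, byte* pd, int count)
    {
        for (int n = 0; n < count / 4; n++)
        {
            *((int*)pd) = *((int*)ps);
            pd += 4;
            ps += 4;
        }
        for (int n = 0; n < count % 4; n++)
        {
            *pd = *ps;
            pd++;
            ps++;
        }
    }

    // Assumed layout: int byteCount, then the inner block's bytes.
    public static unsafe byte[] Wrap(byte[] inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");

        byte[] outer = new byte[sizeof(int) + inner.Length];
        fixed (byte* pOuter = outer)
        fixed (byte* pInner = inner)
        {
            byte* p = pOuter;

            *((int*)p) = inner.Length;
            p += sizeof(int);

            CopyBytes(pInner, p, inner.Length);
            p += inner.Length; // advance past the copied block
        }
        return outer;
    }
}
```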

As you can verify in your own applications, the performance is awesome. However, performance always comes at a cost. In this case, the cost is less readable and more error-prone code. It is very easy to make a mistake here, for example by advancing a pointer too far or not far enough. You must be very careful when using this technique, and double-check every line of code. You may ask: why not create helper functions to encapsulate the serialization of each type? The answer is the performance penalty. In my tests, calling a function was 2 times slower than running the same code inline. However, I leave you free to choose what is better for your needs. By the way, calling a static function is faster than calling an instance one.

December 24, 2007

Configuring applications using System.Configuration namespace

Every non-trivial application has some kind of configuration. For instance, you might need to store application state, network settings, a database connection string, or switches that alter the application's behavior. One approach is to write your own configuration infrastructure from scratch. The other is to use the configuration classes from the System.Configuration namespace, which were created exactly for this purpose.

The pros of the second approach are obvious:

· Ease of use.

· It lets you manipulate configuration data in an object-oriented way.

· As an official part of the .NET Framework, it is tested, supported Microsoft code, ready to use. Why waste time making your own?

Besides all these reasons, there is one more: a configuration standard. Any Microsoft class that has to be configured uses the standard app.config file. Microsoft Enterprise Library uses app.config as well; I will talk about it in future posts. Therefore, by creating your own configuration files, you risk ending up in “Configuration Hell”. Maintaining a number of configuration files is a proven recipe for errors. Even if you have only 10 desktops across a network to install your application on, configuring them all is very difficult. Creating a configuration application that knows how to handle all your configuration files (each possibly with a different structure) is nearly impossible. I personally know one guy whose mission is to create such a configuration utility. The problem is that he needs to configure something that has around 20 different configuration files, each completely different. He has been trying to solve this for more than a year. Each time he thinks he is done, he receives changes to the structure of the configuration files and starts all over again.

Let's talk now about configuration standards. The standard file used for configuration is “app.config”. When you build your application, the file is copied to the output directory as “yourAppName.exe.config”. You may also have machine.config for the whole machine, and/or user.config for each user. App.config contains a number of predefined sections, and it can contain custom sections as well. Here is a partial list of what the predefined sections let you do:

  • Specify connection strings.
  • Specify which CLR version your app should use.
  • Specify the location of assemblies.
  • Configure Remoting (WCF in 3.0).
  • Configure security and administration settings.

Some of these settings (such as remoting) can be configured using the mscorcfg.msc tool.

A couple of examples of app.config files can be found here: http://blogs.msdn.com/suzcook/archive/2004/05/14/132022.aspx

Let's talk now about code. The System.Configuration namespace provides everything you might need to configure an average application.

The ConfigurationManager class, which is new in framework 2.0, provides all the functionality needed to retrieve, use, modify, validate and store your custom settings, ready to use out of the box. ConfigurationManager retrieves settings in an object-oriented way, which is very cool. You can also use it to retrieve any predefined section, such as connection strings. This means you have only one entry point in code to handle every kind of configuration, both predefined and custom.

Here are examples that demonstrate retrieving and manipulating predefined sections: connection strings and application settings.

Note: Don’t forget to reference System.Configuration.dll

 

Example of retrieving connection strings:
http://msdn2.microsoft.com/en-us/library/system.configuration.configurationmanager.connectionstrings(VS.80).aspx

 

Example of retrieving application settings:
http://msdn2.microsoft.com/en-us/library/system.configuration.configurationmanager.appsettings(VS.80).aspx
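In short, reading the two predefined sections might look like this. It is a minimal sketch: the PredefinedSettings helper class and its fallback behavior are assumptions, while ConfigurationManager.ConnectionStrings and ConfigurationManager.AppSettings are the real entry points.

```csharp
using System;
using System.Configuration;

public static class PredefinedSettings
{
    // Returns the connection string registered under the given name,
    // or null if <connectionStrings> has no such entry.
    public static string GetConnectionString(string name)
    {
        ConnectionStringSettings settings = ConfigurationManager.ConnectionStrings[name];
        return settings == null ? null : settings.ConnectionString;
    }

    // Returns the <appSettings> value for the key, or the fallback if the key is absent.
    public static string GetAppSetting(string key, string fallback)
    {
        string value = ConfigurationManager.AppSettings[key];
        return value ?? fallback;
    }
}
```

Both indexers return null for a missing entry rather than throwing, so the null checks above are what keeps a missing setting from turning into a NullReferenceException later.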

 

But the most interesting part is custom configuration. To illustrate how to do this, I created a demo application. It is heavily commented, so I will explain only the basic idea here:

 

  1. Define the configuration object, just the way you want it to be.
  2. Derive it from the ConfigurationSection class, which is located in the System.Configuration namespace (it can actually derive from ApplicationSettingsBase instead, but I am not covering that here).
  3. For each configuration value, create a property and decorate it with the ConfigurationProperty attribute. This connects the object’s property to a value in the configuration file.
  4. Retrieve it using the ConfigurationManager class.

 

That’s all. Everything else is already handled by the base ConfigurationSection class. Optional step: you can also decorate the object’s property with a validator attribute (derived from ConfigurationValidatorAttribute) that supplies a validator derived from ConfigurationValidatorBase. Override the validator's CanValidate and Validate methods, and you get validation of the configuration value. Performance considerations: Validate is called in 3 cases:

  1. Creation of the configuration object (in case it has default values).
  2. Setting a value.
  3. Saving the object.
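A custom validator could be sketched as follows. The NonEmptyStringValidator, its attribute and the ValidatedNameSection class are hypothetical names invented for this example; the base classes and the overridden members come from System.Configuration.

```csharp
using System;
using System.Configuration;

// Hypothetical validator: rejects null, empty or whitespace-only strings.
public sealed class NonEmptyStringValidator : ConfigurationValidatorBase
{
    public override bool CanValidate(Type type)
    {
        return type == typeof(string);
    }

    public override void Validate(object value)
    {
        string text = value as string;
        if (text == null || text.Trim().Length == 0)
        {
            throw new ArgumentException("Value must not be empty.");
        }
    }
}

// Attribute that attaches the validator to a configuration property.
public sealed class NonEmptyStringValidatorAttribute : ConfigurationValidatorAttribute
{
    public override ConfigurationValidatorBase ValidatorInstance
    {
        get { return new NonEmptyStringValidator(); }
    }
}

public class ValidatedNameSection : ConfigurationSection
{
    // The default value is validated too, so it must itself pass the validator.
    [ConfigurationProperty("FirstName", DefaultValue = "NotGiven")]
    [NonEmptyStringValidator]
    public string FirstName
    {
        get { return (string)base["FirstName"]; }
        set { base["FirstName"] = value; }
    }
}
```

Because the default value goes through Validate as well, a validator that rejects the property's default will fail as soon as the section is created.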

 

In the application I prepared, I have a “Profile” object. It contains a full name, made up of First Name and Last Name. In addition, it has an address, which consists of a street name and a building number.

 

This is how I define the configuration object for the full name section (NameSection):

public class NameSection : ConfigurationSection
{
    [ConfigurationProperty("LastName", IsRequired = false, DefaultValue = "NotGiven")]
    public string LastName
    {
        get
        {
            return (string)base["LastName"];
        }
        set
        {
            base["LastName"] = value;
        }
    }

    [ConfigurationProperty("FirstName", IsRequired = false, DefaultValue = "NotGiven")]
    public string FirstName
    {
        get
        {
            return (string)base["FirstName"];
        }
        set
        {
            base["FirstName"] = value;
        }
    }
}

 

In app.config, this looks as follows:

 

<Profile>
    <FullName LastName="NotGiven" FirstName="NotGiven"/>
</Profile>
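For the runtime to map this XML to NameSection, the section also has to be declared under configSections. The sketch below shows one way this could be wired up and read back. It assumes “Profile” is declared as a section group, writes the XML to a file of your choosing, and loads it with ConfigurationManager.OpenMappedExeConfiguration instead of the application's own app.config. ProfileConfigDemo is a hypothetical helper, and NameSection is repeated in compact form only so that the block compiles on its own.

```csharp
using System;
using System.Configuration;
using System.IO;

// Compact copy of the NameSection class shown above.
public class NameSection : ConfigurationSection
{
    [ConfigurationProperty("LastName", IsRequired = false, DefaultValue = "NotGiven")]
    public string LastName
    {
        get { return (string)base["LastName"]; }
        set { base["LastName"] = value; }
    }

    [ConfigurationProperty("FirstName", IsRequired = false, DefaultValue = "NotGiven")]
    public string FirstName
    {
        get { return (string)base["FirstName"]; }
        set { base["FirstName"] = value; }
    }
}

public static class ProfileConfigDemo
{
    // Builds a complete config file, including the configSections declaration.
    // The inputs are assumed to contain no XML special characters.
    public static string BuildConfigXml(string firstName, string lastName)
    {
        string sectionType = typeof(NameSection).AssemblyQualifiedName;
        return
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
            "<configuration>\n" +
            "  <configSections>\n" +
            "    <sectionGroup name=\"Profile\">\n" +
            "      <section name=\"FullName\" type=\"" + sectionType + "\" />\n" +
            "    </sectionGroup>\n" +
            "  </configSections>\n" +
            "  <Profile>\n" +
            "    <FullName LastName=\"" + lastName + "\" FirstName=\"" + firstName + "\" />\n" +
            "  </Profile>\n" +
            "</configuration>\n";
    }

    // Loads the section from an explicit file path.
    public static NameSection Load(string configPath)
    {
        ExeConfigurationFileMap map = new ExeConfigurationFileMap();
        map.ExeConfigFilename = configPath;
        Configuration config =
            ConfigurationManager.OpenMappedExeConfiguration(map, ConfigurationUserLevel.None);
        return (NameSection)config.GetSection("Profile/FullName");
    }
}
```

When the section lives in the application's own config file, ConfigurationManager.GetSection("Profile/FullName") returns it directly, without the file map.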

 

If you want to change the structure of the XML, just override the DeserializeSection and SerializeSection methods. If you want to create a configuration application to manipulate the configuration file, it is now a fairly trivial task: just use the same classes.

 

Last words – thread safety: the ConfigurationSection class is not thread-safe, so you are responsible for handling this in your multithreaded application.
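One simple way to handle this is to funnel every access to a section through a single lock. The wrapper below is only a sketch of that idea: LockedSection and DemoSection are hypothetical names, and a plain lock is just one of several possible synchronization strategies.

```csharp
using System;
using System.Configuration;

// Tiny section used only to make the sketch self-contained.
public class DemoSection : ConfigurationSection
{
    [ConfigurationProperty("FirstName", DefaultValue = "NotGiven")]
    public string FirstName
    {
        get { return (string)base["FirstName"]; }
        set { base["FirstName"] = value; }
    }
}

// Serializes all reads and writes of the wrapped section through one lock.
public sealed class LockedSection<T> where T : ConfigurationSection
{
    private readonly object gate = new object();
    private readonly T section;

    public LockedSection(T section)
    {
        if (section == null) throw new ArgumentNullException("section");
        this.section = section;
    }

    public TResult Read<TResult>(Func<T, TResult> reader)
    {
        lock (gate)
        {
            return reader(section);
        }
    }

    public void Write(Action<T> writer)
    {
        lock (gate)
        {
            writer(section);
        }
    }
}
```

This only helps if no code keeps a direct reference to the wrapped section. A typical call would be locked.Write(s => s.FirstName = "Ada") followed by locked.Read(s => s.FirstName).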

Additional information and examples can be found here.

December 22, 2007

Is using memcpy worth it in .NET?

Recently, I decided to boost my .NET / MFC serialization. Instead of copying object fields one by one, I wanted to use memcpy to speed up the process. However, using memcpy in .NET is not trivial at all, because the code is managed and you have no control over the layout of managed types. You can work around this with various tricks, such as converting reference types to value types or generating special structures. However, this makes the code very hard to maintain and difficult to debug. So the main question is: is it worth doing all this just to use memcpy instead of copying field by field?
To answer this, I created a very simple application. I copy a struct with 5 fields in 2 ways: using memcpy, and field by field. I repeat the copy in a loop for a large number of iterations (300000000 in my example) and measure the time for each way. Here are the average results I got (Windows XP, AMD Athlon 64 dual core):

Compiled in debug mode: copying using memcpy: 9500ms, copying field by field: 1750ms.

Compiled in release mode (optimizations disabled): same as in debug mode.
Compiled in release mode (optimize: minimize size): copying using memcpy: 7050ms, copying field by field: 0ms.

Compiled in release mode (optimize: maximize speed): copying using memcpy: 0ms, copying field by field: 0ms. I suspect that, because of the optimization, the loop was not executed at all.

Compiled in release mode (optimize: full optimization): same as optimize: maximize speed.

When my friend tried the same on an Intel 32-bit machine, he said memcpy was about 300 ms faster than copying field by field.
Conclusion: as you can see, the difference in performance is very minor (given such a large number of copies). Therefore, unless you know a nice and clean way to use memcpy in .NET, it is NOT worth the overhead.
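A comparable measurement can be sketched in C# as follows. This is not the original test program: the Sample struct, the CopyBenchmark class and the use of Buffer.MemoryCopy (available from .NET Framework 4.6) as a stand-in for memcpy are all assumptions. Each loop feeds a checksum so that the optimizer cannot drop the copy, which is probably what produced the 0ms readings. Compile with /unsafe.

```csharp
using System;
using System.Diagnostics;

// Hypothetical 5-field struct used for the comparison.
public struct Sample
{
    public int A;
    public int B;
    public double C;
    public long D;
    public short E;
}

public static class CopyBenchmark
{
    // Copies the struct field by field; returns elapsed milliseconds.
    public static long FieldByField(int iterations, out long checksum)
    {
        Sample source = new Sample { A = 1, B = 2, C = 3.0, D = 4, E = 5 };
        Sample target = default(Sample);
        long sum = 0;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            source.A = i; // vary the input on every iteration
            target.A = source.A;
            target.B = source.B;
            target.C = source.C;
            target.D = source.D;
            target.E = source.E;
            sum += target.A; // use the result so the copy is not optimized away
        }
        watch.Stop();

        checksum = sum;
        return watch.ElapsedMilliseconds;
    }

    // Copies the struct as one memory block; returns elapsed milliseconds.
    public static unsafe long BlockCopy(int iterations, out long checksum)
    {
        Sample source = new Sample { A = 1, B = 2, C = 3.0, D = 4, E = 5 };
        Sample target = default(Sample);
        long sum = 0;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            source.A = i;
            Buffer.MemoryCopy(&source, &target, sizeof(Sample), sizeof(Sample));
            sum += target.A;
        }
        watch.Stop();

        checksum = sum;
        return watch.ElapsedMilliseconds;
    }
}
```

Both methods should report the same checksum for the same iteration count; if they do not, the two loops are not doing equivalent work and the timings cannot be compared.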

December 21, 2007

