Posts filed under 'General C#'

Improving application performance: Object Pool (C#) with automatic object collection

Introduction:

As you know, creating and destroying objects is a costly operation, in terms of performance. In C++ the allocation of objects is not cheap (first the application looking for available memory, validating it and only then allocating). In .NET the allocation of objects is actually very fast, due to GC memory optimizations in each collection – but this is exactly the reason why destroying objects is expensive: it makes pressure on Garbage Collector, making it work harder and thus affecting application performance.
The trick of object pool is to pay for creation and destruction of the object only once: after the object created (on application startup or on demand) it not destroyed, but cashed for the future use. This way the object is reused multiple times during the whole application lifetime.
The object pool can do very noticeable boost to application performance, especially when the creating of object is time consuming operation, and when the rate of objects creation is high.
The classic examples for object pools are ThreadPool (.NET) and connection pool: both creation of thread or DB connection is time and resource consuming and object pool used here to improve performance. By the way, talking about ThreadPool – on application startup 500 (!!!) threads created to be used during application lifetime.

Design and Fundamental Ideas

The object pool is nothing more than a collection of objects. On application startup we creating objects and storing them inside that collection. If someone requesting object – we removing it from collection and passing it out; when the object is being returned, we just adding it back to the collection. Now, the big question is: who responsible to return object back to collection?
I am not asking about who getting object from collection, because it is easy: the one who creating these objects, having object pool and instead of creating object using ‘new’ keyword, asking pool to get it. However, the one who receiving object doesn’t know (it shouldn’t know!) where this object belongs to. From architectural point of view, the receivers of the object (could be third party code) should not know the origin of the object. Allowing them this knowledge will introduce unnecessary code dependencies. Consider the following example: you generating the same type of object in 20 different places (creators). All created objects are passed to one single place (receiver), maybe for some kind of processing. After the object has been processed, it has to be returned back to object pool. The receiver doesn’t know which object pool (if any!) is an owner of this object. So who does know where this object belongs to? The answer is the object itself.
When the object created first time, he can receive information about the creator, and use this information to return back to home, when necessary. How we can know if it necessary? Very easy – when no one holds reference to that object and we want to ‘destroy’ it – instead of being destroyed it simply returned back to object pool. Here I use the idea of Object Resurrection.

The Object Resurrection

The idea of object resurrection described the best by Jeffrey Richter in his book ‘CLR via C#’. In short, when Garbage Collector thinks it is a time to destroy an object and calling to its Finalize method (in case it has one), the object can ‘escape from death’. Inside the Finalize method, object can return to pool, thus obtaining back reference to itself and preventing garbage collection. As you can see, the receiver of the object doesn’t have to do anything special: it uses object and forgetting about it – the object will return ‘automatically’ back to pool.

Implementation

First, let’s examine the BasePoolObject object – this is object designed to “find its way back to home”.

The Init method:

public void Init(ObjectHandler handler)
{
_handler = handler;
}

ObjectHandler is the delegate to the ReturnObject method of the ObjectPool class. The reason I not storing reference to the ObjectPool object itself, is because I want to keep ReturnObject method private, to avoid direct use of it. This way I am emphasizing the object’s exclusive responsibility to return itself back to pool.

The Finalize method:

if (_handler != null)
{
_handler(this);
GC.ReRegisterForFinalize(this);
}

When this method being called by GC, first object returns back to pool, and second: I re – registering object back to finalize queue, otherwise the Finalize method will never be called again.

However as you know, unlike C++ when you removing any reference to object and going out of variable scope, its Finalize method doesn’t called immediately. Instead, GC starts garbage collection when it thinks it should be done (when generation zero is full). So when the object creation rate is very high, it can introduce kind of memory leak: objects will be created faster than they die, thus making memory constantly grow. By the way, you will experience this problem with any consuming resources object. That’s why in .NET special interface defined: IDisposable. If object has to be returned to pool immediately, calling to its Dispose method will do that. Notice, in Dispose method I am not calling to GC.ReRegisterForFinalize. That’s because not the GC who calling to Dispose method, and object remains registered to Finalize queue after Dispose was called. Also, as you can see there is no virtual protected Dispose method. This is because I want to ensure, the resources is not going to be released while object is still in pool. Instead, release all resources inside the Terminate method.

The Reset method:
One of the pifalls when pooling objects, is their state. If someone before us used the object, and put into some data – probably when I resuing object I expecting to get it ‘clean’ – like if I just created it first time. The Reset method called by pool each time object requested. It is a developer responsibility to reset everithing that should be reseted inside this method.

The Terminate method:
This method called when the object is actually die (not going to be ressurected). Tipically, this will happen when the object pool destroyed.
Inside this method, developer responsible to release all resources holded by this object – close all handles, release unmanaged resources, etc.

Now the ObjectPool object:
As I said in introduction, the object pool is simply collection of objects with small additions:

• Object creation: each created object receiving delegate to ReturnObject method.
• Multithreading: obvious, we have to consider in our implementation thread safety: most of the time will be involved at least two threads – yours and the Finalizer thread. Thread safety can be achieved by using simple queue and synchronizing it by critical sections (just as ThreadPool implemented). However, using locks can potentially introduce dead locks. If this will happen, it will be a disaster – the garbage collection thread will be infinitely blocked – say goodbye to memory reclaiming. Therefore, I using there my ConcurrentLinkedQueue – it is thread safe queue and it has no locks (you can download its source code from here).
• Getting objects: the policy I choose is as the following – if there are no more available objects in the pool, perform garbage collection to get back all not used anymore objects. If even after garbage collection we have no available objects, this means all objects in use, and we have to regrow our pool – I growing it twice each time. Of cause, this is a place for many possible alternatives. For instance, you could try first to perform collection of generation zero, if didn’t helped then generation one and finally generation two. The problem I see there is the collection of each generation also collects all previous. For example, collection of generation one also collects generation zero. I guess (didn’t proved this) in average it faster to perform collection of 3 generations at once (0 + 1 + 2), rather than in worst case perform 6 collections (0 + 0,1 + 0,1,2). However, if your objects lifetime is short – I suggest always try to collect zero generation first.
• Another possible tweak: using weak references instead of strong – it is possible to use weak references from pool to objects, to possibly reduce amount of used memory. Personally, I think this will miss the whole idea of the object pool: if object will be destroyed due to garbage collection, I will have later to create it again – thus loosing all the benefits of the object pool. However, I pointing on this to give you another idea for possible customization to match best your specific needs.

The full source code can be downloaded from here (hosted by RapidShare). As always, I will glad to receive any feedbacks, bugs and everything you like.

6 comments October 8, 2008

Fixed link of the “Fast Drawing” demo!

Hi friends! As you know, after my previous hosting service deleted that demo, I received lots of requests from you to recover it somehow. Lucky for all us, couple days ago I found it on my old memory stick! So here it is – hosted by RapidShare (I updated demo link inside the post also). Have fun!

Add comment October 8, 2008

DateTime and CTime cooperation

Sometimes you might want to pass a current time from managed application to unmanaged and vice versa. The one way to pass time from C# to, lets say, MFC is to calculate total number of seconds from 1.1.70 until current time. Then you will marshal this long value to MFC (probably using COM or network, depends on what you doing) and will create CTime class from it. Here is something you have to consider: DateTime structure by default represents local time, while CTime represents GMT time. Lets say I live in Israel (GMT +2) and my current time is 9:00 AM. Next step I calculating long of total seconds and send it to MFC’s CTime. Now, the CTime thinks it is GMT time. Therefore when I presenting CTIme using ‘Format’ method, I receiving… 11:00 AM. You see, ‘Format’ method counts on fact that CTime is GMT time, and during convertion to string adds GMT delta (in my case + 2 hours). So if you will pass to CTime local time, ‘Format’ method will convert local time to local time again.  The point of this story, is you will always have to pass to CTime class GMT time. if not, ‘Format’ method will represent incorrect values (unless you work in +0 time zone). By the way: in case you want to represent original CTime time, instead of ‘Format’ method use ‘FormatGmt’.

1 comment July 3, 2008

The c# lock free library project – first public preview

     Right now, I finished to port from Java lock free hash table and queue. You can get it from here. The good news, is it seems to work. The bad news, traditional implementation using locks performs faster in tests I did (I tested on 2 and 4 core machines). However, this is contradictive to the research results done on this topic (read more about research done in this field here). Therefore, I believe there are two (or more) possible explanations for that: my code is not optimized or/and, the way I doing my tests is wrong. How I did my tests: I used class I created for this purpose, which is called “Executer” (included in provoded package). Is works same as the BackgroundWorker class, with 2 differences: you can specify number of threads to work, and you can specify time to perform that work.  You can use this class to test many other things in multithreading environment.

      Please send me feedback about this staff, so I could improve it and make it better for all us. The code released under GPL license. Declaimer: I did my best on writing this code and I hope it will be useful, however I not responsible for any damage it could do – so use it on your own risk.

2 comments May 30, 2008

Lots of info about .NET 2.0 configuration

Add comment April 9, 2008

Concurrent Library .NET project started!

Since SUN released OpenJDK – java source under GPL2 licence (including java.util.concurrent package) – it is legal now to be ported to other lenguages :-)   I started project on CodePlex – http://www.codeplex.com/concurrent. The first release is  planning to contain a concurrent hashtable, queue, skiplist, set and arraylist (first 2 are almost done).  Anyone who wish to participate is welcome to contact me.

2 comments March 11, 2008

Volatile and Interlocked.CompareExchange

Recently, when I tried to CAS volatile variable, I got this warning: a reference to a volatile field will not be treated as volatile. So the question is: does CAS succeed or not? I consulted with Jeffrey Richter on this subject, and here is his explanation: when passing volatile variable by ref to method, it is always treated as non volatile. Compiler just don’t distinguish between CompareExchange and all other methods, so it warnings in any case. The CAS operation will succeed, even with volatile variable (or more correct is to say: it will not failure because of volatile variable).  Well, I think that make sense, since the only difference between volatile and non volatile is volatile can’t be optimized and stored in shared memory rather than in CPU cache.  

    In addition he recommended to use Thread.VolatileRead/Write methods instead of declaring variable as volatile for better performance. This is contrasting to conclusion from this article, so I don’t decided which direction is better (despite my personal respect to Jeffrey).

If we talking about multithreading, here are couple bullet points for thinking in spare time:

  • According to .NET specifications, reads and writes to/from variables (even non volatile) are atomic (when memory is properly aligned). However, in contrast with volatile variables, other CPUs not guaranteed to see changes “in real time”. Here is nice article explaining .NET 2.0 memory model and phenomenons such as movement of reads and writes in time.
  • Operations such as increament are not atomic: to perform it CPU has to follow these steps: 1- Load a value from an instance variable into a register, 2- Increment (or decrement) value, and 3 - Store the value in the instance variable. The same is true for volatile variable.
  • It is possible to create concurrent (thread safe) objects and collections without using even single lock (just using volatile and CAS)! Say goodbye to deadlocks, priority inversions, and performance bottlenecks! There are many research on this subject, here are some nice links (I sorted them from basics theoretic to advanced practical):
    • Lock-free and wait-free algorithms - basic idea and definition from wikipedia
    • Software transactional memory - another interesting idea: access and modify shared objects by splitting operations to separate transactions (something like in DB)
    • Here one guy explaining idea of synchronization using CAS operation.
    • Multi Processor Progamming cource from Tel Aviv University - very cool slides with detailed explanations and examples on Java
    • University of Cambridge - including Lock-free library (based on STM idea, implemented on Linux C++). Also I highly recommending to read Concurrent programming without locks paper – extremely interesting!
    • SXM- C# Software Transactional Memory (STM) by Tim Harris. Unfortunately the licence is not permitting commercial use (yet).
    • NSTM - C# STM by Rulf Suderbucher (Open Source, New BSD License)
    • DSTM2 - Java Software Transactional Memory.
    • java.util.concurrent namespace.A Java library developed by Lea Doug, which is including everything you may need in concurrent programming: free lock hash table, lists, sets, java implementation of M.M. Michael queue and many many more. The full JDK code could be downloaded from here. I am not a lawyer, but as much I understand, this open source code could be modified but is not allowed to be ported to any other language but Java :-(
    • Dr. Cliff Click’s wait free lock free hash table – code (Java) is here, pdf with explanations and benchmark results is here, his lecture on this subject is here.
    Is it just me or .NET has a HUGE gap to fill for concurrency support compared to Java? Last point: Microsoft preparing to release in .NET 4.0 library for parallel computations. It will provide API, which will allow to execute separate calculations on different processors. Couple month ago they released preview (not for commercial use yet). The problem is when these calculations performed on shared memory, you still have to synchronize it – and here creating performance bottleneck. Anyone beside me see how to solve that problem? :-)   (Tip – read previous point).
    Update (11.03.08): Here is one additional product: Intel Threading Building Blocks (TBB), which includes concurrent collections and parallelism API (Microsoft doing now something similar). Language – C++. Available in commercial and Open Source versions.

Add comment March 7, 2008

Another thoughts about hash tables.

As you probably understand from my previous posts, I working a lot with hash tables. During my work on the current project, I saw people doing so badly things with them, so I decided to write something on this topic.

In my atricle about hash table keys I discussed how is bad to use value types as keys (this is applied to non generic hash table). Today I saw another stupid thing, which is worth to mention. Consider the following code:

if(!_hashTable.ContainsKey(someKey))
{
  hashTable.Add(someKey, someValue);
}

You see, each time you calling to ContainsKey, it has to call GetHashCode(), find proper bucket in the hash table and then, in case of collisions, go over all entities of that bucket. After this you adding your value, and during this the same is doing again. Much cheaper is just to add value in try/catch block, and catch ArgumentException in case of duplication. Even more, you can use indexer – in this case the duplicate will be overwritten, without exception being thrown.

The same with removing values:

//Bad Code!!
if(_hashTable.ContainsKey(someKey))
{
  _hashTable.Remove(someKey);

In any case, no exception is thrown. Remove method returns true or false according to success of operation. The same with getting value from hash table. Instead of calling to ContainsKey method, much cheaper to do that in try/catch block, and catch KeyNotFoundException. You should use ContainsKey only when you have to do some action based on the result, but this action should not be done on hash table (or else you will pay twice).

Another point: are you sure you need to use hash table? Because if you adding and removing objects in high rate, probably you should consider to use another data structure. For many people I know, hash table looks as best data structure because of O(1) add/search/remove operations. However, this O(1) can take significant time. Each access to hashtable requires hash computation, which is relative slow operation (even if it done in constant number of steps). In case you have collisions, you will have to go over all entries of the bucket. And in case you reaching hash table capacity, you have to rebuild and rehash the whole table. Therefore, when the number of objects is not so big (I didn’t tested for exact limit) is better use other data structure, such as balanced tree (if n is not big, log(n) can be faster than O(1) – think about this). Anyway, if you going to use hashtable, make it BIG (at least twice bigger than you need) to avoid rebuilds and collisions as much as possible. By default, the size of hashtable (if I remember correctly) is 17.

Important remark: from people responses, I did a conclusion that I have to explain better the main point: you can actually improve performance by using try/catch block, only when GetHashCode is expensive operation. For example, GetHashCode from integer (in default implementation) is very fast – the hash code is the integer itself. In this case using try/catch block is bad option, it will perform slower. However, when you calling GetHashCode on reference type key (espessially on string) – it is better to use trycatch than call to ContainsKey. Today, when I tryed to explain that to some guy, I did test with string key (10-12 chars). Implementation with try/catch block performed 25% faster. This leads us to important conclusions: to achieve best performance you have to understand whats going on under the hood, and for each rule the is an exception.

Add comment March 4, 2008

Pitfalls of Equality

Consider the fallowing example:

SomeClass class1 = new SomeClass();
SomeClass class2 = new SomeClass();

bool result = (class1 == class2);

Due to class1 and class2 are exactly the same, you should expect result == true. However, result is false. The reason is operator (==) actually compares references of the objects, rather the objects itself. In case you want to compare objects, override and/or use ‘Equals’ method. Some points of interest:

  • By default, Equals method calls to GetHashCode() method and then compares it.
  • string type is immutable, it overrides == operator and inside it calls to actual string comparison.
  • Interlocked.CompareExchange method comparing objects by references (!!)
  • If anyone cares, in Java operator (==) works in same way

Add comment February 16, 2008

Is Interlocked.Equals really atomic ?!

Recently I had to check for equality of two long variables  in synchronized fasion. Since in c# you can’t declare volatile on long values, I had to consider other high performance options, such as Interlocked class. If you using intellisense to see methods of Interlocked class, you will noticed it has an Equals method. Sound like exactly what I need! However, this method has signature (object arg1, object arg2). This is a bad news, because for each comparison I will had to box my values – hit performance. But this is only beginning! I refered MSDN for this method, and… Interlocked class HAS NO Equals method! Analysing with Reflector and observing Interlocked class metadata confirmed – it has no Equals method. But why Intellisense saying opposite, and why my compilation succeeded? If you will observe Interlocked.Equals method metadata (press F12 on it in VS), you will jump to.. System.Object class! Interlocked class, as any other class in .NET is object, and Equals method actually belongs to System.Object and not to System.Threading.Interlocked class. In the bottom line, what all this mean? This mean that Interlocked.Equals method is not atomic, unless System.Object.Equals already implemented as atomic operation, which is I doubt.

So how you can check for equality of two variables? I prefer to avoid locks where possible, so I recommending to not use lock here . Instead, declare where possible variables as volatile. This will make them synchronized for multithreading. In case you working with long variable (as I do in this example), and in case you using .NET 2.0 or greater, you can use Interlocked.Read method to read your value (atomic operation) to some other local variable, and then check for equality. For example:

long global_variable_I_want_to_check = 125;
long my_local_variable = 0;

my_local_variable = Interlocked.Read(ref global_variable_I_want_to_check);
if(my_local_variable = 435)

Add comment February 16, 2008

Previous Posts


Categories

Top Posts

Tags

.NET addin app.config ArrayList bug CAB Configuration ConfigurationManager ConfigurationSection ContentControl ContextMenu CTime; DateTime custom keys DataBinding DataContext Data templates debugging equals gethashcode GUI Hashtable interlocked Invoke lock lock free memcpy MFC multithreading multithreading; lock free override performance SCSF serialization Smart Client Software Factory Styles System.Configuration unsafe virtual functions Visual Studio wait free WinAPI WinForms WinForms\WPF Integration World of Warcraft World of Warcraft; Addon

Archives