Improving application performance: Object Pool (C#) with automatic object collection
October 8, 2008
Introduction:
As you know, creating and destroying objects is costly in terms of performance. In C++, allocating an object is not cheap: the application first has to find a free block of memory, validate it, and only then allocate. In .NET, allocation is actually very fast, because the garbage collector (GC) compacts the managed heap during collections. That is exactly why destroying objects is expensive: every discarded object adds pressure on the GC, making it work harder and thus affecting application performance.
The trick of an object pool is to pay for the creation and destruction of an object only once: after the object is created (at application startup or on demand), it is not destroyed but cached for future use. This way the same object is reused many times during the application’s lifetime.
An object pool can give a very noticeable boost to application performance, especially when creating an object is time consuming and when the rate of object creation is high.
The classic examples of object pools are the ThreadPool (.NET) and the database connection pool: creating a thread or a DB connection is expensive in both time and resources, so a pool is used to improve performance. By the way, speaking of the ThreadPool: it can hold hundreds of threads (!!!) for use during the application’s lifetime. ThreadPool.GetMaxThreads reports the actual limit, which depends on the runtime version and the machine, and the threads are created on demand rather than all at startup.
Design and Fundamental Ideas
An object pool is nothing more than a collection of objects. At application startup we create the objects and store them in that collection. When someone requests an object, we remove it from the collection and hand it out; when the object is returned, we simply add it back. Now the big question: who is responsible for returning the object to the collection?
I am not asking who gets the object from the collection, because that is easy: whoever creates these objects owns an object pool and, instead of creating an object with the ‘new’ keyword, asks the pool for one. However, the one who receives the object doesn’t know (and shouldn’t know!) where it belongs. From an architectural point of view, the receivers of the object (which could be third-party code) should not know its origin; giving them this knowledge would introduce unnecessary code dependencies. Consider the following example: you generate the same type of object in 20 different places (creators). All created objects are passed to one single place (the receiver), perhaps for some kind of processing. After an object has been processed, it has to be returned to its object pool. The receiver doesn’t know which object pool (if any!) owns the object. So who does know where the object belongs? The answer is the object itself.
When the object is first created, it can receive information about its creator and use that information to return home when necessary. How do we know when it is necessary? Very easy: when no one holds a reference to the object and it is about to be ‘destroyed’, it is returned to the object pool instead. Here I use the idea of object resurrection.
The Object Resurrection
The idea of object resurrection is best described by Jeffrey Richter in his book ‘CLR via C#’. In short, when the garbage collector decides it is time to destroy an object and that object has a Finalize method, the method is called before the memory is reclaimed, and the object can ‘escape from death’. Inside the Finalize method the object can return itself to the pool; the pool then holds a reference to it again, which prevents it from being collected. As you can see, the receiver of the object doesn’t have to do anything special: it uses the object and forgets about it, and the object returns to the pool ‘automatically’.
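To make resurrection concrete, here is a minimal sketch, separate from the pool code itself. The names Phoenix and ResurrectionDemo are hypothetical and exist only for this illustration; ConcurrentQueue is the standard .NET 4+ collection, used here simply as a thread-safe place for the finalizer thread to put the object.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

// Hypothetical demo type: on finalization it stores a reference to itself
// in a static queue, which makes it reachable again ("resurrected").
public sealed class Phoenix
{
    public static readonly ConcurrentQueue<Phoenix> Survivors =
        new ConcurrentQueue<Phoenix>();

    ~Phoenix()
    {
        // Runs on the finalizer thread after the GC found no references to this object.
        Survivors.Enqueue(this);          // new strong reference: the object is reachable again
        GC.ReRegisterForFinalize(this);   // without this the finalizer would not run a second time
    }
}

public static class ResurrectionDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // of the caller keeps the object alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndForget()
    {
        new Phoenix();
    }

    // Returns how many objects came back from the dead.
    public static int Run()
    {
        CreateAndForget();
        GC.Collect();                     // finds the unreachable object and queues its finalizer
        GC.WaitForPendingFinalizers();    // lets the finalizer thread run ~Phoenix()
        return Phoenix.Survivors.Count;
    }
}
```

Calling ResurrectionDemo.Run() forces a collection only to make the effect visible right away; in the pool, the same thing happens whenever the GC decides to run.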
Implementation
First, let’s examine BasePoolObject: this is the object designed to “find its way back home”.
The Init method:
public void Init(ObjectHandler handler)
{
    // Remember how to get back to the pool
    _handler = handler;
}
ObjectHandler is a delegate to the ReturnObject method of the ObjectPool class. The reason I am not storing a reference to the ObjectPool object itself is that I want to keep the ReturnObject method private, to avoid direct use of it. This way I emphasize that returning to the pool is exclusively the object’s own responsibility.
The Finalize method:
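~BasePoolObject()   // in C# the Finalize method is written with destructor syntax
{
    if (_handler != null)
    {
        _handler(this);                  // return to the pool (resurrection)
        GC.ReRegisterForFinalize(this);  // so Finalize is called again next time
    }
}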
When the GC calls this method, the object first returns itself to the pool, and then I re-register it in the finalization queue; otherwise the Finalize method would never be called again.
However, as you know, unlike in C++, when you drop the last reference to an object and the variable goes out of scope, its Finalize method is not called immediately. Instead, the GC starts a collection when it thinks one is due (when generation zero is full). So when the object creation rate is very high, this can look like a memory leak: objects are created faster than they die, and memory keeps growing. By the way, you will see this problem with any object that holds scarce resources; that is why .NET defines a special interface, IDisposable. If an object has to be returned to the pool immediately, calling its Dispose method does that. Notice that in the Dispose method I am not calling GC.ReRegisterForFinalize: it is not the GC that calls Dispose, so the object is still registered for finalization after Dispose returns. Also, as you can see, there is no protected virtual Dispose method. That is because I want to ensure that resources are not released while the object is still in the pool. Instead, release all resources inside the Terminate method.
The Reset method:
One of the pitfalls of pooling objects is their state. If someone used the object before us and stored some data in it, then when I reuse the object I probably expect to get it ‘clean’, as if I had just created it. The Reset method is called by the pool each time an object is requested. It is the developer’s responsibility to reset everything that should be reset inside this method.
The Terminate method:
This method is called when the object actually dies (it is not going to be resurrected). Typically, this happens when the object pool is destroyed.
Inside this method, the developer is responsible for releasing all resources held by the object: closing handles, releasing unmanaged resources, etc.
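Putting the methods described above together, the base class might look roughly like the sketch below. It is reconstructed from the description, not copied from the downloadable source: the visibility of Reset and Terminate is an assumption, and PooledBuffer and BasePoolObjectDemo are hypothetical names used only to exercise the class.

```csharp
using System;
using System.Collections.Generic;

// Delegate through which a pooled object hands itself back.
public delegate void ObjectHandler(BasePoolObject obj);

public abstract class BasePoolObject : IDisposable
{
    private ObjectHandler _handler;

    // Called by the pool right after the object is created.
    public void Init(ObjectHandler handler)
    {
        _handler = handler;
    }

    // Explicit, immediate return. The object is still registered for
    // finalization, so GC.ReRegisterForFinalize is not needed here.
    public void Dispose()
    {
        if (_handler != null) _handler(this);
    }

    // Automatic return, executed on the finalizer thread.
    ~BasePoolObject()
    {
        if (_handler != null)
        {
            _handler(this);
            GC.ReRegisterForFinalize(this);
        }
    }

    public abstract void Reset();      // assumed public: wipe state left by the previous user
    public abstract void Terminate();  // assumed public: release resources when the object really dies
}

// Hypothetical pooled type used only for this demo.
public sealed class PooledBuffer : BasePoolObject
{
    public readonly byte[] Data = new byte[256];
    public int Length;

    public override void Reset() { Length = 0; }
    public override void Terminate() { /* nothing unmanaged to release in this demo */ }
}

public static class BasePoolObjectDemo
{
    // Returns the number of objects handed back through the handler.
    public static int Run()
    {
        var returned = new List<BasePoolObject>();  // stands in for the pool
        var buffer = new PooledBuffer();
        buffer.Init(returned.Add);

        using (buffer)            // Dispose() runs at the end of this block
        {
            buffer.Length = 10;
        }

        buffer.Init(null);        // detach so the demo object can die normally later
        return returned.Count;
    }
}
```

One thing this sketch does not guard against is a caller who disposes the same object twice, or keeps using it after Dispose; either would put an object that is still in use back into circulation.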
Now the ObjectPool object:
As I said in the introduction, the object pool is simply a collection of objects with a few additions:
• Object creation: each created object receives a delegate to the ReturnObject method.
• Multithreading: obviously, the implementation has to be thread safe, since most of the time at least two threads are involved: yours and the finalizer thread. Thread safety could be achieved with a simple queue synchronized by critical sections (just as ThreadPool is implemented). However, locks can potentially introduce deadlocks. If that happened here it would be a disaster: the finalizer thread would be blocked forever, so say goodbye to reclaiming the memory of any finalizable object. Therefore I use my ConcurrentLinkedQueue, a thread-safe queue that has no locks (you can download its source code from here).
• Getting objects: the policy I chose is the following. If there are no available objects in the pool, perform a garbage collection to get back all objects that are no longer used. If there are still no available objects after the collection, all objects are in use and the pool has to grow; I double its size each time. Of course, there is room for many alternatives here. For instance, you could first try collecting generation zero, then generation one if that didn’t help, and finally generation two. The problem I see is that collecting a generation also collects all younger ones; for example, collecting generation one also collects generation zero. My guess (I haven’t measured it) is that on average it is faster to collect all three generations at once (0 + 1 + 2) than, in the worst case, to run three escalating collections that scan six generations in total (0, then 0 + 1, then 0 + 1 + 2). However, if your objects are short-lived, I suggest always trying to collect generation zero first.
• Another possible tweak, weak references instead of strong ones: the pool could hold its objects through weak references, to possibly reduce the amount of memory used. Personally, I think this misses the whole idea of the object pool: if an object is destroyed by a garbage collection, I will have to create it again later, thus losing all the benefits of the pool. However, I mention it to give you another idea for customization that may better match your specific needs.
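The ‘getting objects’ policy from the list above can be sketched as follows. This is an illustration under stated assumptions, not the downloadable implementation: the standard ConcurrentQueue (.NET 4+) stands in for my ConcurrentLinkedQueue, PooledItem is a compact restatement of the self-returning base class so the block compiles on its own, and SimplePool, Message and PoolDemo are hypothetical names. Pool shutdown (calling Terminate) is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public delegate void ReturnHandler(PooledItem item);

// Compact restatement of the self-returning base class.
public abstract class PooledItem
{
    private ReturnHandler _handler;
    public void Init(ReturnHandler handler) { _handler = handler; }
    public virtual void Reset() { }

    ~PooledItem()
    {
        if (_handler != null) { _handler(this); GC.ReRegisterForFinalize(this); }
    }
}

public sealed class SimplePool<T> where T : PooledItem, new()
{
    // Thread-safe queue shared by user threads and the finalizer thread.
    private readonly ConcurrentQueue<T> _free = new ConcurrentQueue<T>();
    private int _capacity;

    public SimplePool(int initialSize)
    {
        if (initialSize < 1) throw new ArgumentOutOfRangeException("initialSize");
        Grow(initialSize);
    }

    public int Capacity { get { return Volatile.Read(ref _capacity); } }

    public T Get()
    {
        T item;
        while (!_free.TryDequeue(out item))
        {
            // Pool looks empty: collect all generations at once and wait for the
            // finalizers, so abandoned objects can put themselves back.
            GC.Collect();
            GC.WaitForPendingFinalizers();

            // Still nothing: every object is really in use, so double the pool.
            if (_free.IsEmpty) Grow(Capacity);
        }
        item.Reset();   // hand out a clean object
        return item;
    }

    private void Grow(int count)
    {
        for (int i = 0; i < count; i++)
        {
            T item = new T();
            item.Init(ReturnObject);   // each object learns its way home
            _free.Enqueue(item);
        }
        Interlocked.Add(ref _capacity, count);
    }

    // Private on purpose: only the objects themselves call it, via the delegate.
    private void ReturnObject(PooledItem item)
    {
        _free.Enqueue((T)item);
    }
}

// Hypothetical pooled type and demo.
public sealed class Message : PooledItem
{
    public string Text;
    public override void Reset() { Text = null; }
}

public static class PoolDemo
{
    public static int Run()
    {
        var pool = new SimplePool<Message>(2);
        Message a = pool.Get();
        Message b = pool.Get();
        Message c = pool.Get();   // a and b are still referenced, so the pool must grow
        GC.KeepAlive(a); GC.KeepAlive(b); GC.KeepAlive(c);
        return pool.Capacity;
    }
}
```

To try the escalating alternative, the single GC.Collect() call could be replaced by a loop that calls GC.Collect(generation) for each generation from 0 up to GC.MaxGeneration and stops as soon as the queue is no longer empty. Note also that two threads hitting an empty pool at the same moment may both grow it; the sketch accepts that in exchange for not taking any lock.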
The full source code can be downloaded from here (hosted by RapidShare). As always, I will be glad to receive any feedback, bug reports, or anything else you would like to share.
Entry Filed under: General C#, General Programming. Tags: performance.
6 Comments
1.
Christian Kullmann | November 4, 2008 at 8:22 am
Hi Evgeni,
Thanks for the code.
It appears to be an elegant solution.
I am a bit concerned about the need to make all my objects inherit from BasePoolObject but I don’t have any alternative suggestions to implement the finalize() stuff.
Regards,
Christian
2.
Anonymous | January 2, 2009 at 12:26 am
please host this on google code or google pages or github or any other place that is source code driven and not rapidshare/$$$ driven
3.
Aizikovich Evgeni | January 14, 2009 at 1:32 pm
Good idea, I will, thank you.
4.
DotNetGuts | January 13, 2009 at 9:21 pm
Good stuff, it was helpful.
DotNetGuts
http://dotnetguts.blogspot.com
5.
Hamza | May 11, 2009 at 8:50 am
Hello Aizikovich,
I’m relatively new to multithreading and object pool. I was trying to use your thread safe implementation of object pool with .net threadpool. I’m building a .net component and have your files correctly added to my solution.
I need an object pool of WebBrowser objects (System.Windows.Forms namespace) like this
bool instantiated = false;
if (!instantiated)
{
    ObjectPool<WebBrowser> objPool = new ObjectPool<WebBrowser>(30);
    instantiated = true;
}
On building my solution I get an error
Error 2 The type ‘System.Windows.Forms.WebBrowser’ cannot be used as type parameter ‘T’ in the generic type or method ‘StoreSnapshot.ObjectPool<T>’. There is no implicit reference conversion from ‘System.Windows.Forms.WebBrowser’ to ‘StoreSnapshot.BasePoolObject’. C:\Documents and Settings\Hamza\My Documents\Visual Studio 2008\Projects\StoreSnapshot\StoreSnapshot\Snapshot.cs 37 65 StoreSnapshot
Even with the value type ‘Int16’ I get an error
Error 1 The type ‘short’ cannot be used as type parameter ‘T’ in the generic type or method ‘StoreSnapshot.ObjectPool<T>’. There is no boxing conversion from ‘short’ to ‘StoreSnapshot.BasePoolObject’. C:\Documents and Settings\Hamza\My Documents\Visual Studio 2008\Projects\StoreSnapshot\StoreSnapshot\Snapshot.cs 37 28 StoreSnapshot
Can you please show me how the correct usage should be?
Thanks in advance
6.
Aizikovich Evgeni | May 11, 2009 at 5:42 pm
As the compiler says, the type of the object you are going to pool has to derive from BasePoolObject.