Posts filed under 'General Programming'
IBM ClearCase and the Visual Studio 2008 integration: the case is not clear
Introduction
When the previous version of Visual Studio (VS 2005) was released, IBM released a ClearCase (CC) integration package for it, whose job was to integrate VS with CC. By default, the integration package DLLs are located in the “Program Files\Common Files\Rational\ClearCase\CCVSI” folder (just for the record: CCVSI stands for ClearCase Visual Studio Integration). Later, when VS 2008 was released, the package failed to integrate because of internal changes in VS. So IBM released a patch that adds the necessary keys to the registry and thereby registers the CC package properly. The patch can be downloaded from www.ibm.com (look for the Rational products). The package adds ClearCase menus to VS, such as check in, check out and so on, making it possible to operate CC directly from VS.
A year ago, when we switched to Visual Studio 2008 and installed it on our development machines alongside ClearCase, Visual Studio refused to open our projects unless the patch was applied. However, we then noticed a very strange phenomenon: on some machines, for reasons unknown at the time (now they are known), VS 2008 managed to open our projects and fully integrate with ClearCase without the patch! For a year that was voodoo to us. One thing I can say with confidence: the lucky guys whose computers worked without the patch were extremely happy, because VS worked way better without it.
Life with the patch and without it
Personally, I don’t like the patch (I don’t like ClearCase either, but that is not relevant here). Probably one of the reasons is my previous experience with Microsoft Source Safe: its integration worked so transparently that the user would almost never notice it was there. By contrast, when you work with the CC integration you have the feeling that it doesn’t belong there. Each time you perform an operation, such as a check in or adding a new file, ugly external dialogs pop up, freezing the whole environment while they initialize. The integration changes project files, adding its own tags to them. As a result, any attempt to open a project on a machine with ClearCase but without the patch will fail. With the patch, some of the features (such as pending check ins) do not work. With the patch, you don’t get an icon for a hijacked or newly added file. And more.
You don’t have to use the patch (and CC integration)!
Because the patch made projects impossible to open on machines without it, we had to make a decision: have all our machines work either with the patch or without it, uniformly. A week ago I discovered why some computers integrated without the patch: they had a different image, and ClearCase on that image was installed without the Visual Studio integration. To make it clear, I will say it again: if you install ClearCase with the VS integration, you need the patch. If you install CC without the VS integration, VS 2008 integrates perfectly without the patch or anything else.
Wow, this is weird! Why did IBM release that patch in the first place? How does it work without the CC integration package? I don’t have answers, and honestly I don’t care, since it works and I am happy.
When you install Visual Studio 2008 on a machine where ClearCase is installed without the VS integration, VS integrates itself with CC. Inside it you get all the same standard menus and icons that you have with, for example, Source Safe. Every feature I tested works perfectly:
Check in – check
Check out – check
Undo check out – check
A standard Visual Studio dialog (not a third-party one) opens for almost every operation – check
Hijack file icon (a flag icon) – check
Added new file icon (a plus icon) – check
View history – check
Compare – check
Pending check ins – check
.. and more
Under the File menu there is now a standard “Source Control” menu item. Nice, you don’t have it with the patch! Under Tools\Options\Source Control you get plug-in selection, so you can now select ClearCase or any other source control provider. Without the patch, the whole of Visual Studio works much faster, with no more freezing. And you can still open your projects on a computer that has the patch (but pay attention: the patch changes project files). We decided to work without the patch and the integration, removed them from our computers, and no one regrets it.
OK, I want to work without the patch and the CC integration. How do I do that?
If you are installing ClearCase for the first time, just make sure you install it without the Visual Studio integration. If you already have ClearCase installed with the integration, you have to perform two simple steps: first, uninstall the CC VS integration (CC itself should remain, of course), and second, remove the patch’s keys from the registry.
Uninstalling the CC VS integration: go to Add or Remove Programs, find “IBM Rational ClearCase” there and choose Change >> Next >> Modify. Make sure “IBM Rational ClearCase Client for VS.Net” is not selected. Press Next to apply the changes.
Removing the registry keys: when you applied the patch, you used two .reg files. Modify these files by adding a “-” character right after the opening bracket of every key; that is the .reg syntax for deleting a key. For example:
[HKEY_LOCAL_MACHINE\Software\Microsoft\VisualStudio\9.0\Packages] turns into:
[-HKEY_LOCAL_MACHINE\Software\Microsoft\VisualStudio\9.0\Packages]
After you have modified all the keys inside these files, run both of them. This will remove the specified keys from the registry. Keep in mind that deleting a key also deletes everything under it, so mark only the keys that the patch itself created.
Now run Visual Studio and enjoy.
Software versions I tested: Visual Studio 2008 with and without SP1, ClearCase versions 2003.06.10+ and 7.0.1.3. Our computers (the ones without the integration installed) have been working this way for about a year with no complaints.
6 comments January 13, 2009
Passing value types by reference (a ref keyword) to the managed C++ (CLI)
Hi, after a little research I did to accomplish this task, I thought it would be nice to share the solution with my readers, so here it is: in the CLI method you should use “%” (a tracking reference), which is the equivalent of “ref” in C#.
Example (passing an integer by ref from C# to CLI):
Declaring the method in CLI:
void MyMethod(Int32% myValue);   // member of ref class MyClass; the void return type is assumed here
Calling that method from C#:
MyClass myClass = new MyClass();
int myValue = 12;
myClass.MyMethod(ref myValue);
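To see the effect from the C# side without building a CLI assembly, here is a minimal sketch in which a C# method stands in for the CLI one. The RefDemo class and the body of MyMethod (it just adds one) are made up for illustration; the only point is that a change made through ref (Int32% on the CLI side) is visible to the caller.

```csharp
using System;

public static class RefDemo
{
    // Hypothetical stand-in for the CLI method: 'ref int' here plays the
    // role of 'Int32%' in the C++/CLI declaration.
    public static void MyMethod(ref int myValue)
    {
        myValue = myValue + 1;   // writes through to the caller's variable
    }

    public static int Run()
    {
        int myValue = 12;
        MyMethod(ref myValue);   // same call shape as in the post
        return myValue;          // 13: the caller sees the change
    }
}
```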
Hope that was helpful,
Evgeni
Add comment November 26, 2008
Debugging mixed code: solving “not loaded symbols”
Today, while debugging mixed code (calling C# code from unmanaged C++ via COM), I noticed some strange behavior: the application was running, but I was unable to hit any breakpoint inside the C# code. Usually this is the result of “DLL hell” or something similar, for example PDB files from a different version. However, I double checked all my files and I was pretty sure they were OK. Finally, I discovered the source of my troubles: the startup project (C++ in my case) has to be configured to support debugging of both managed and unmanaged code. I did it this way: go to the startup project’s Properties >> Debugging and set the debugger type to “Mixed”. Hope that was helpful.
Add comment October 12, 2008
Improving application performance: Object Pool (C#) with automatic object collection
Introduction:
As you know, creating and destroying objects is a costly operation in terms of performance. In C++ the allocation of objects is not cheap (first the application looks for available memory, validates it and only then allocates). In .NET the allocation of objects is actually very fast, thanks to the memory compaction the GC performs on each collection, but this is exactly why destroying objects is expensive: it puts pressure on the Garbage Collector, making it work harder and thus affecting application performance.
The trick of an object pool is to pay for the creation and destruction of an object only once: after the object is created (on application startup or on demand) it is not destroyed but cached for future use. This way the object is reused many times during the application’s lifetime.
An object pool can give a very noticeable boost to application performance, especially when creating the object is time consuming and when the rate of object creation is high.
The classic examples of object pools are the ThreadPool (.NET) and the connection pool: creating either a thread or a DB connection consumes time and resources, and an object pool is used to improve performance. By the way, talking about the ThreadPool: its default limit can reach hundreds of threads (!!!) kept for reuse during the application’s lifetime, although as far as I know they are created on demand rather than all at startup.
Design and Fundamental Ideas
An object pool is nothing more than a collection of objects. On application startup we create objects and store them inside that collection. When someone requests an object, we remove it from the collection and hand it out; when the object is returned, we just add it back to the collection. Now, the big question is: who is responsible for returning the object to the collection?
I am not asking who gets the object from the collection, because that is easy: the code that would otherwise create these objects holds the object pool and, instead of creating an object with the ‘new’ keyword, asks the pool for one. However, the one who receives the object doesn’t know (and shouldn’t know!) where the object belongs. From an architectural point of view, the receivers of the object (which could be third-party code) should not know its origin. Giving them this knowledge would introduce unnecessary code dependencies. Consider the following example: you generate the same type of object in 20 different places (creators). All created objects are passed to one single place (the receiver), maybe for some kind of processing. After an object has been processed, it has to be returned to its object pool. The receiver doesn’t know which object pool (if any!) owns this object. So who does know where the object belongs? The answer is: the object itself.
When the object is first created, it can receive information about its creator and use this information to return home when necessary. How can we know when it is necessary? Very easy: when no one holds a reference to the object and it is about to be ‘destroyed’, instead of being destroyed it is simply returned to the object pool. Here I use the idea of object resurrection.
The Object Resurrection
The idea of object resurrection is best described by Jeffrey Richter in his book ‘CLR via C#’. In short, when the Garbage Collector decides it is time to destroy an object and calls its Finalize method (if it has one), the object can ‘escape from death’. Inside the Finalize method the object can return itself to the pool, so that the pool again holds a reference to it, which prevents it from being collected. As you can see, the receiver of the object doesn’t have to do anything special: it uses the object and forgets about it, and the object returns to the pool ‘automatically’.
Implementation
First, let’s examine the BasePoolObject class: this is an object designed to “find its way back home”.
The Init method:
public void Init(ObjectHandler handler)
{
    _handler = handler;
}
ObjectHandler is a delegate to the ReturnObject method of the ObjectPool class. The reason I am not storing a reference to the ObjectPool object itself is that I want to keep the ReturnObject method private, to avoid direct use of it. This way I emphasize that returning the object to the pool is exclusively the object’s own responsibility.
The Finalize method:
if (_handler != null)
{
    _handler(this);
    GC.ReRegisterForFinalize(this);
}
When this method is called by the GC, the object first returns itself to the pool, and then I re-register it for finalization; otherwise the Finalize method would never be called again.
However, as you know, unlike in C++, when you drop the last reference to an object and the variable goes out of scope, its Finalize method isn’t called immediately. Instead, the GC starts a garbage collection when it thinks one is needed (when generation zero is full). So when the object creation rate is very high, this can introduce a kind of memory leak: objects are created faster than they die, making memory grow constantly. By the way, you will experience this problem with any object that holds resources. That’s why .NET defines a special interface: IDisposable. If the object has to be returned to the pool immediately, calling its Dispose method will do that. Notice that in the Dispose method I am not calling GC.ReRegisterForFinalize. That’s because it is not the GC that calls Dispose, so the object remains registered for finalization after Dispose is called. Also, as you can see, there is no protected virtual Dispose method. This is because I want to ensure that resources are not released while the object is still in the pool. Instead, release all resources inside the Terminate method.
The Reset method:
One of the pitfalls of pooling objects is their state. If someone used the object before us and put some data into it, then when I reuse the object I probably expect to get it ‘clean’, as if I had just created it. The Reset method is called by the pool each time an object is requested. It is the developer’s responsibility to reset everything that should be reset inside this method.
The Terminate method:
This method is called when the object actually dies (it is not going to be resurrected). Typically, this happens when the object pool is destroyed.
Inside this method, the developer is responsible for releasing all resources held by the object: close all handles, release unmanaged resources, etc.
Now the ObjectPool object:
As I said in the introduction, the object pool is simply a collection of objects with a few additions:
• Object creation: each created object receives a delegate to the ReturnObject method.
• Multithreading: obviously, our implementation has to be thread safe, since most of the time at least two threads will be involved: yours and the finalizer thread. Thread safety can be achieved by using a simple queue and synchronizing it with critical sections (just as the ThreadPool is implemented). However, using locks can potentially introduce deadlocks. If that happens, it will be a disaster: the finalizer thread will be blocked forever, so say goodbye to memory reclaiming. Therefore I am using my ConcurrentLinkedQueue there: it is a thread-safe queue and it has no locks (you can download its source code from here).
• Getting objects: the policy I chose is the following: if there are no more available objects in the pool, perform a garbage collection to get back all objects that are no longer in use. If even after the garbage collection there are no available objects, it means all objects are in use and we have to grow the pool; I double its size each time. Of course, this is a place for many possible alternatives. For instance, you could first try to collect generation zero, then, if that didn’t help, generation one, and finally generation two. The problem I see there is that collecting a generation also collects all younger generations. For example, collecting generation one also collects generation zero. I guess (I haven’t proved this) that on average it is faster to collect all 3 generations at once (0 + 1 + 2) than, in the worst case, to collect 6 generations over three passes (0, then 0 + 1, then 0 + 1 + 2). However, if your objects’ lifetime is short, I suggest always trying to collect generation zero first.
• Another possible tweak, using weak references instead of strong ones: it is possible to hold weak references from the pool to the objects, to possibly reduce the amount of used memory. Personally, I think this would defeat the whole idea of the object pool: if an object is destroyed by a garbage collection, I will later have to create it again, thus losing all the benefits of the object pool. However, I am pointing this out to give you another idea for a possible customization to best match your specific needs.
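The pieces described above can be put together in a compact sketch. This is not the downloadable code, just a minimal illustration under a few assumptions of mine: Reset and Terminate are shown as abstract methods, the .NET 4 ConcurrentQueue stands in for my ConcurrentLinkedQueue, the Environment.HasShutdownStarted check is an extra guard I added, and SampleItem is a made-up pooled type. Calling Terminate when the pool itself is destroyed is left out to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Delegate through which an object returns itself to its pool.
public delegate void ObjectHandler(BasePoolObject obj);

public abstract class BasePoolObject : IDisposable
{
    private ObjectHandler _handler;

    public void Init(ObjectHandler handler) { _handler = handler; }

    public abstract void Reset();      // called by the pool on every Get
    public abstract void Terminate();  // release real resources here

    // Immediate return; the object stays registered for finalization.
    public void Dispose()
    {
        if (_handler != null) _handler(this);
    }

    ~BasePoolObject()
    {
        // Assumption: do not resurrect while the process is shutting down.
        if (_handler == null || Environment.HasShutdownStarted) return;
        _handler(this);                  // resurrection: the pool references us again
        GC.ReRegisterForFinalize(this);  // so the finalizer runs next time too
    }
}

public sealed class ObjectPool<T> where T : BasePoolObject, new()
{
    // Stand-in for the lock-free ConcurrentLinkedQueue from the post.
    private readonly ConcurrentQueue<T> _free = new ConcurrentQueue<T>();
    private int _size;

    public ObjectPool(int initialSize) { Grow(initialSize); }

    public int Size { get { return _size; } }

    public T Get()
    {
        T item;
        if (!_free.TryDequeue(out item))
        {
            // Pool is empty: let abandoned objects come home first.
            GC.Collect();
            GC.WaitForPendingFinalizers();
            // Still empty: everything is in use, so double the pool.
            while (!_free.TryDequeue(out item))
                Grow(Math.Max(1, _size));
        }
        item.Reset();
        return item;
    }

    private void Grow(int count)
    {
        for (int i = 0; i < count; i++)
        {
            T item = new T();
            item.Init(ReturnObject);   // the object learns its way home
            _free.Enqueue(item);
        }
        Interlocked.Add(ref _size, count);
    }

    // Private on purpose: only the pooled object itself calls it.
    private void ReturnObject(BasePoolObject obj) { _free.Enqueue((T)obj); }
}

// Hypothetical pooled type, for illustration only.
public sealed class SampleItem : BasePoolObject
{
    public int Value;
    public override void Reset() { Value = 0; }
    public override void Terminate() { }
}
```

Note that this sketch does not protect against calling Dispose twice on the same object (it would be queued twice); a real implementation should guard against that.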
The full source code can be downloaded from here (hosted by RapidShare). As always, I will be glad to receive any feedback, bug reports and anything else you like.
6 comments October 8, 2008
Important: Dead Links
Hi folks. As you have probably noticed, I am using a free host for my blog (WordPress). One of the drawbacks is that I don’t have any space to host my demo applications. As a result, I have to store my stuff on public storage services. The one I used in the past is SendSpace. Unfortunately, as I discovered later, SendSpace deletes files after a while. This is the reason why some of the links don’t work. I have now switched to RapidShare and I hope they will keep my files. For all the guys asking me about the “fast rendering” demo: SendSpace deleted my demo, and I don’t have a backup. If someone downloaded it in the past and still has it, please contact me so I can upload it again. Right now I am not planning to rewrite that demo, simply because I don’t have spare time for it. Basically, all the important stuff is in the article itself. If you need me to clarify specific points, write to me (about anything else as well) and I will try to find time to help.
Yours, Evgeni
Add comment May 30, 2008
Volatile and Interlocked.CompareExchange
Recently, when I tried to CAS a volatile variable, I got this warning: “a reference to a volatile field will not be treated as volatile”. So the question is: does the CAS succeed or not? I consulted Jeffrey Richter on this subject, and here is his explanation: when a volatile variable is passed by ref to a method, it is always treated as non-volatile. The compiler just doesn’t distinguish between CompareExchange and all other methods, so it warns in every case. The CAS operation will succeed even with a volatile variable (or, more correctly: it will not fail because the variable is volatile). Well, I think that makes sense, since the main difference between volatile and non-volatile is that accesses to a volatile variable can’t be optimized away, so its value is always read from and written to shared memory rather than cached.
In addition, he recommended using the Thread.VolatileRead/VolatileWrite methods instead of declaring the variable volatile, for better performance. This contradicts the conclusion of this article, so I haven’t decided which direction is better (despite my personal respect for Jeffrey).
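Here is a minimal sketch of the situation, assuming a made-up CasCounter class: a volatile field is passed by ref to Interlocked.CompareExchange inside a retry loop, and the compiler warning (CS0420) is silenced locally because, as explained above, it is harmless in this case. The loop also shows how a CAS turns a plain load/add/store sequence into an atomic increment.

```csharp
using System;
using System.Threading;

// Hypothetical counter, for illustration only.
public sealed class CasCounter
{
    private volatile int _value;

    public int Value { get { return _value; } }

    public int Increment()
    {
        while (true)
        {
            int seen = _value;       // volatile read of the current value
            int next = seen + 1;
#pragma warning disable 420          // volatile field passed by ref: fine for Interlocked
            int before = Interlocked.CompareExchange(ref _value, next, seen);
#pragma warning restore 420
            if (before == seen)
                return next;         // nobody changed the value in between
            // Another thread won the race: retry with the fresh value.
        }
    }
}
```

In real code Interlocked.Increment does the same job in one call; the explicit loop is only there to show the CAS pattern.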
If we are talking about multithreading, here are a couple of bullet points to think about in your spare time:
- According to the .NET specification, reads and writes of variables no larger than the native word size (even non-volatile ones) are atomic when memory is properly aligned. However, in contrast with volatile variables, other CPUs are not guaranteed to see the changes “in real time”. Here is a nice article explaining the .NET 2.0 memory model and phenomena such as the movement of reads and writes in time.
- Operations such as increment are not atomic: to perform one, the CPU has to follow these steps: 1 - load the value from the instance variable into a register, 2 - increment (or decrement) the value, and 3 - store the value back in the instance variable. The same is true for a volatile variable.
- It is possible to create concurrent (thread-safe) objects and collections without using a single lock (just volatile and CAS)! Say goodbye to deadlocks, priority inversions, and performance bottlenecks! There is a lot of research on this subject; here are some nice links (I sorted them from basic theory to advanced practice):
  - Lock-free and wait-free algorithms - basic idea and definition from Wikipedia
  - Software transactional memory - another interesting idea: access and modify shared objects by splitting operations into separate transactions (something like in a DB)
  - Here one guy explains the idea of synchronization using the CAS operation.
  - Multiprocessor Programming course from Tel Aviv University - very cool slides with detailed explanations and examples in Java
  - University of Cambridge - includes a lock-free library (based on the STM idea, implemented in C++ on Linux). I also highly recommend reading the Concurrent programming without locks paper: extremely interesting!
  - Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms - by Maged M. Michael and Michael L. Scott
  - SXM - C# Software Transactional Memory (STM) by Tim Harris. Unfortunately the licence does not permit commercial use (yet).
  - NSTM - C# STM by Ralf Sudelbücher (Open Source, New BSD License)
  - DSTM2 - Java Software Transactional Memory.
  - The java.util.concurrent package - a Java library developed by Doug Lea, which includes everything you may need in concurrent programming: a lock-free hash table, lists, sets, a Java implementation of the M.M. Michael queue and many, many more. The full JDK code can be downloaded from here. I am not a lawyer, but as far as I understand, this open source code may be modified but may not be ported to any language other than Java.
- Is it just me, or does .NET have a HUGE gap to fill in concurrency support compared to Java?
- Last point: Microsoft is preparing to release a library for parallel computation in .NET 4.0. It will provide an API that allows separate calculations to be executed on different processors. A couple of months ago they released a preview (not for commercial use yet). The problem is that when these calculations operate on shared memory, you still have to synchronize them, and that creates a performance bottleneck. Does anyone besides me see how to solve that problem?
- Update (11.03.08): here is one additional product: Intel Threading Building Blocks (TBB), which includes concurrent collections and a parallelism API (Microsoft is now doing something similar). Language: C++. Available in commercial and open source versions.
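To make the lock-free point from the list above concrete, here is a minimal sketch of a stack built only on CAS (the classic Treiber stack, not taken from any of the linked libraries). The LockFreeStack name and its members are mine. It relies on the garbage collector to keep a node alive while any thread still holds a reference to it, which is what keeps this simple version safe in .NET.

```csharp
using System;
using System.Threading;

// Hypothetical lock-free stack, for illustration only.
public sealed class LockFreeStack<T>
{
    private sealed class Node
    {
        public readonly T Value;
        public Node Next;
        public Node(T value) { Value = value; }
    }

    private Node _head;

    public void Push(T value)
    {
        var node = new Node(value);
        Node seen;
        do
        {
            seen = Volatile.Read(ref _head);
            node.Next = seen;                 // link on top of what we saw
        }
        // Publish only if the head is still the one we saw; otherwise retry.
        while (Interlocked.CompareExchange(ref _head, node, seen) != seen);
    }

    public bool TryPop(out T value)
    {
        Node seen;
        do
        {
            seen = Volatile.Read(ref _head);
            if (seen == null)
            {
                value = default(T);
                return false;                 // stack is empty
            }
        }
        // Unlink the head only if nobody pushed or popped in between.
        while (Interlocked.CompareExchange(ref _head, seen.Next, seen) != seen);
        value = seen.Value;
        return true;
    }
}
```

No thread ever blocks here: a thread that loses the race simply retries with the new head.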
Add comment March 7, 2008
Consumer/Producer application
I think everyone who works (struggles) with multithreading has at some point had to implement a consumer/producer application. By accident I found such an app in MSDN -> here. It is implemented with a circular buffer in C++ (but you can easily port it to any language you want).
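For readers who want the idea without following the link, here is a minimal C# sketch of a circular buffer shared by a producer and a consumer. It is not a port of the MSDN sample; the BoundedBuffer name and its members are my own, and I use a plain lock with Monitor.Wait/PulseAll for blocking.

```csharp
using System;
using System.Threading;

// Hypothetical bounded circular buffer, for illustration only.
public sealed class BoundedBuffer<T>
{
    private readonly T[] _items;
    private readonly object _gate = new object();
    private int _head;    // index of the next item to take
    private int _tail;    // index of the next free slot
    private int _count;   // number of items currently stored

    public BoundedBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        _items = new T[capacity];
    }

    // Producer side: blocks while the buffer is full.
    public void Put(T item)
    {
        lock (_gate)
        {
            while (_count == _items.Length) Monitor.Wait(_gate);
            _items[_tail] = item;
            _tail = (_tail + 1) % _items.Length;   // wrap around
            _count++;
            Monitor.PulseAll(_gate);               // wake waiting consumers
        }
    }

    // Consumer side: blocks while the buffer is empty.
    public T Take()
    {
        lock (_gate)
        {
            while (_count == 0) Monitor.Wait(_gate);
            T item = _items[_head];
            _items[_head] = default(T);            // drop the reference
            _head = (_head + 1) % _items.Length;   // wrap around
            _count--;
            Monitor.PulseAll(_gate);               // wake waiting producers
            return item;
        }
    }
}
```

A producer thread calls Put in a loop and a consumer thread calls Take; with one producer and one consumer, items come out in the order they were put in.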
Add comment March 5, 2008