August 2007 - Posts
Here's another nasty one that tried to bite me today. Let's say I have the following classes:
using System.EnterpriseServices;

public class COMPlusClass : ServicedComponent
{
    public COMPlusClass()
    {
        // Default initialization.
    }

    public COMPlusClass(string data)
    {
        // Parameterized initialization.
    }
}

public class Client
{
    public Client()
    {
        COMPlusClass cpc = new COMPlusClass("I am a client");
    }
}
Which constructor do you think will be called?
Naturally, I wouldn't ask if it was the second one. Much to my surprise, the first constructor was called when I instantiated the COM+ component.
Why is this? Because COM+, due to its COM heritage, requires all components to expose a default, public, parameterless constructor and always initializes through it.
Why didn't I get an error, then? Because I'm writing .NET classes that don't know in advance that they will be instantiated through COM+. If all I knew of COMPlusClass came from a COM type library, the second ctor probably wouldn't be exposed. But since I see it as a .NET class, I can believe I am calling the parameterized constructor. COM+ probably discards my precious string along the way.
Solution? Add some sort of Initialize method to pass the construction data.
By a staggering margin, most of my problems integrating C++/CLI code into my C#-based project have been deployment problems. Without fail, every integration or test deployment is plagued with inexplicable problems. I'll try to list a few, with their causes and (probable) solutions:
1) .NET assemblies are usually very loose in what they'll bind to (at least when they aren't strong-named). If I compiled A.DLL against B.DLL version 1.0.0.0, it will still work if it can only find B.DLL version 1.2.0.0. I don't have to rebuild and reversion everything every time. However, if my C# A.DLL has a reference to my CPPCLI.DLL v1.0.0.0, and I then deploy CPPCLI.DLL v1.1.0.0 in its place, I'll get an exception saying the referenced DLL wasn't found. Annoying.
Solution: If you're changing the version number of a C++/CLI assembly, make sure to recompile any assembly that references it.
2) C++/CLI can be compiled as a pure .NET assembly, just like C# or VB. I don't really see the point in that, unless I'm a C++ coder and refuse to switch syntax. There isn't a great deal of benefit to that over writing in C#, because I can't reference unmanaged code easily. The standard mode, /clr, allows me to mix both managed and unmanaged code. The problem is that a mixed-mode assembly has a dependency on the Visual C++ Runtime libraries. This means that if the machine I am installing on doesn't have them, I'll crash. Furthermore, if I developed my code on a machine running VS2005 SP1, my MSVCRT version will be 8.0.50727.762, but on a machine with vanilla .NET 2.0 and the corresponding runtimes, I'll just find 8.0.50727.42. Result? An inexplicable "FileNotFoundException" that doesn't say what's missing.
Solution: Make sure your deployment scenario includes installing the relevant MSVC++ Runtime. VS2005 comes with a Merge Module (msm) so you can add it to any MSI project.
3) The usual reason to add C++/CLI code to a C# project is to act as a bridge to an unmanaged API. Usually this API is composed of various DLLs. These DLLs can come in many forms - some are statically linked to their respective runtimes, some link dynamically. Some expect other DLLs in the same directory, others expect them in the PATH. Many times we'll get cryptic Module Load Failed errors or FileLoadExceptions.
Two tools we have for this are Filemon and Procmon, Sysinternals' wonderful debugging tools. Open them up, add a filter to weed out the noise and see which files receive a FILE_NOT_FOUND error.
Another indispensable tool is the Dependency Walker (depends.exe) that is shipped with Visual Studio (C:\Program Files\Microsoft Visual Studio 8\Common7\Tools\Bin). This tool will show you all the DLLs that our file has a dependency on. The first thing to look for is any dependencies highlighted in red, underneath the tree. These are our missing buddies. Ignore anything that has a little hourglass next to it - these are delay-load modules and probably not the source of the problem.
One thing to note when opening our C++/CLI DLLs in Depends is the dependencies on the MSVC++ Runtime. Look for a dependency on "MSVCP80.DLL". You might find instead "MSVCP80D.DLL" - this means that we're bound to the Debug build of the runtime, which probably doesn't exist on our servers. This usually means that we've compiled the project in DEBUG build, rather than RELEASE.
One last indispensable tool is trusty old Reflector. Open our managed A.DLL that contains a reference to CPPCLI.DLL, and we can see what version it was referenced against. This can help us find problems like the one I mentioned in section 1.
There, that's all I can think of at the moment. Hope it gets some googlejuice and helps someone stuck as I am.
Today, I had a very tight deadline to achieve a very simple task: pass a managed .NET string to an API function that expects a null-terminated char*. Trivial, you would expect? Unfortunately it wasn't.
My first thought was to do the pinning trick that I mentioned in my last post, but in this case I needed my resulting char* to be null-terminated.
Second thought was to go to the System.Runtime.InteropServices.Marshal class and see what it had for me. I found two contenders:
1) Marshal::StringToBSTR() - this creates a COM-compatible BSTR. I found various tutorials about BSTRs saying that they MIGHT be, under SOME circumstances, compatible with zero-terminated wide-character strings. Didn't seem to be safe enough.
2) Marshal::StringToHGlobalAuto() - this allocates a block of memory, copies the characters in and even null-terminates it for me. (The block has to be released later with Marshal::FreeHGlobal().) This looked like a winner. Stage one was done - we managed to get an unmanaged string. But can we use it now?
The next problem was that StringToHGlobalAuto returns an IntPtr, and casting it to a char* led to a compilation error. The solution to that is either to cast the IntPtr to a (void*) before casting to (char*), or to do the same action by calling the IntPtr's ToPointer() method. The second option seems neater to those of us who like as few direct casts as possible - I'd rather my conversion was done by a method than by a possibly unsafe casting operation. I'm sure those more concerned with method-call overheads will disagree.
The next problem was that the result of this operation was a single-character string - the first character of the expected string. C++ programmers who've struggled with Unicode will quickly spot the problem.
char* strings are null-terminated - the first byte containing 00 is the terminator. For Unicode strings as returned by StringToHGlobalAuto, each character takes 2 bytes. If it's a character from the lower reaches of the Unicode spectrum, the second byte, being the high-order byte, will usually be 00, thus terminating the string. There are two options:
1. Instead of char*, use wchar_t* - a wide, 2-byte character string, terminated by two null bytes.
2. Use StringToHGlobalAnsi, which converts the string to standard 8-bit ANSI encoding. This should be used only if we know we can't receive any Unicode characters, or (as in my case) when the API we call only accepts char*. :(
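The truncation is easy to reproduce outside of .NET. Here is a minimal sketch in standard C++ rather than C++/CLI, assuming a little-endian machine (as on x86 Windows). Both helper names are made up for illustration. The first reinterprets a UTF-16 buffer as char*, the way my cast did, and reports what strlen sees; the second counts 2-byte units up to the wide terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: reports how long a UTF-16 buffer looks to code
// that treats it as a narrow, null-terminated char* string.
std::size_t NarrowLengthOfWideBuffer(const char16_t* wide)
{
    assert(wide != nullptr);
    // Same reinterpretation as casting the unmanaged pointer to char*.
    const char* narrow = reinterpret_cast<const char*>(wide);
    // strlen stops at the first zero byte. For ASCII-range text on a
    // little-endian machine, that is the high byte of the first character.
    return std::strlen(narrow);
}

// Hypothetical helper: counts UTF-16 code units up to the 2-byte terminator.
std::size_t WideLength(const char16_t* wide)
{
    assert(wide != nullptr);
    std::size_t length = 0;
    while (wide[length] != u'\0')
    {
        ++length;
    }
    return length;
}
```

Under the little-endian assumption, NarrowLengthOfWideBuffer(u"Hello") comes out as 1 while WideLength(u"Hello") is 5, which matches the single-character string I got.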
So that's a bit more C++ that haunted and taunted me today. See you next time.
Recently I've found myself stumbling around some C++/CLI code. C++ is a language which I learned years ago and never really worked with seriously, so I've been cursing and moaning as I worked. It's strange for me to go back to a (partially) unmanaged environment now, carrying all sorts of assumptions that have turned out to be false. I'll try to go over some pitfalls and insights I'm having during the visit. This is the first:
The Garbage Collector really spoiled me. I'm not talking about deleting what I instantiated, but all sorts of subtler bugs related to going out of scope. For instance, I had code like this that receives a managed, .NET byte[] and turns it into an unmanaged char*:
void DoSomething(array<Byte>^ data)
{
    // Pin the managed array to an unmanaged char*.
    pin_ptr<Byte> pinnedBuffer = &data[0];
    char* buffer = (char*)pinnedBuffer;
    // Now do something with the unmanaged buffer.
}
Simple enough, but then I found myself needing it in a different method too. My first instinct, of course, was to refactor those two lines into a new method to do the array conversion, especially since I had a couple of lines of parameter validation there too. But my new GetUnmanagedBuffer(array<Byte>^) method wasn't as simple as I hoped. This is because the pinnedBuffer object that I created to prevent the GC from moving the managed array went out of scope when the method exited, so the array was no longer pinned, and by the time I used the buffer in my unmanaged code, the data was no longer guaranteed to be at that address.
In the managed world, we're used to one of two scenarios when returning values from a function: either it's a value type that's copied completely, or a reference type that's passed as a managed, GCed reference. In both these cases, we know that returning any object from a method is a safe operation.
Additionally, we in managed land are used to refactoring being a safe operation, in most cases. If I don't have to worry about scope, I don't worry about extracting any block of code into its own method. In C++, however, I have this constant uncertainty. It will pass with experience, but I still feel much less safe than I would moving to a different, managed environment.
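One way to keep the refactoring safe, sketched here in standard C++ rather than C++/CLI: have the helper return a copy that owns its own memory, instead of a pointer into something that stops being protected once the helper returns. CopyToUnmanagedBuffer is a hypothetical name, and a raw pointer plus length stands in for the managed array; in the real code the copy would have to be made while the pin_ptr is still in scope. It costs an extra copy, which may or may not be acceptable.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: validates the input and returns an owned copy.
// The returned vector keeps its storage alive for as long as the caller
// holds it, so nothing dangles when this function's locals go out of scope.
std::vector<char> CopyToUnmanagedBuffer(const unsigned char* data, std::size_t length)
{
    if (data == nullptr || length == 0)
    {
        throw std::invalid_argument("data must be a non-empty buffer");
    }
    // Copy the bytes into storage owned by the vector.
    return std::vector<char>(data, data + length);
}
```

The caller can then hand the vector's data() pointer to the unmanaged API for as long as the vector itself is alive.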
This was the first installment of Tales from the Unmanaged Side. I'll see if I have anything else to say the more I work on this.
A while ago, I wrote about a simple pattern that allows us to put a timeout limit on a long-running operation. Simply put, it allows us to replace this:
public MyObject GetData()
{
    try
    {
        MyObject data = BigComponent.LongRunningOperation();
        return data;
    }
    catch (Exception ex)
    {
        // Log and rethrow.
        throw;
    }
}
with this:
public MyObject GetData()
{
    try
    {
        // Must be initialized: the compiler can't prove the delegate assigns it.
        MyObject data = null;
        Thread t = new Thread(
            delegate()
            {
                data = BigComponent.LongRunningOperation();
            });
        t.Start();
        // Join returns false if the thread didn't finish within the timeout.
        bool success = t.Join(timeoutForOperation);
        if (success)
        {
            return data;
        }
        else
        {
            throw new TimeoutException();
        }
    }
    catch (Exception ex)
    {
        // Log and rethrow.
        throw;
    }
}
This pattern, while simple at first, introduced a bug into my design that in retrospect should be glaringly obvious. The bug is that although my component previously caught exceptions in LongRunningOperation, this method is now called in a different thread, so its exceptions are no longer handled by my try/catch. This is hard to see when using anonymous delegates, since you get the feeling that the delegate body is still a part of the parent method.
In .NET 2.0, the default behavior for an unhandled exception in a thread is to abort the whole process. In my case, it was an out-of-process COM+ service doing some heavy data crunching for us, and the result was a pop-up(!) on the server complaining that the COM Surrogate crashed. Took me a while to figure out what I did wrong.
One way of solving this is to simply wrap the code inside the delegate with try/catch and swallow the exception. But that way I lose the information about the inner exception, which I want to propagate upwards. What I ended up doing is somewhat uglier, and involves passing the exception backwards the way I did with the data:
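MyObject data = null;
Exception exceptionInProcess = null;
Thread t = new Thread(
    delegate()
    {
        try
        {
            data = BigComponent.LongRunningOperation();
        }
        catch (Exception ex)
        {
            // Capture the failure so the calling thread can rethrow it.
            exceptionInProcess = ex;
        }
    });
...
if (exceptionInProcess != null)
{
    // Note: rethrowing like this resets the stack trace to this point.
    throw exceptionInProcess;
}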