February 2004 - Posts

.NET in Terminal Server (or Citrix) Scenarios

Please note item #6 in Chris Brumme's blog on Finalization in .NET from this weekend:

"6) Since we dedicate a thread to calling finalizers, we inflict an expense on every managed process. This can be significant in Terminal Server scenarios where the high number of processes multiplies the number of finalizer threads."

Posted by PaulWilson | 1 comment(s)

Leak Found in .NET ContextMenu

My colleague Doug Ware of Magenic found another leak in the .NET framework this week.  It seems that every time a MenuItem is added to a ContextMenu in .NET WinForms, a reference to the MenuItem is added to an internal static hashtable.  This may not be “wrong” if you only set up your context menu one time, but if you build menus dynamically it means that all the old menu items are still technically reachable.  To make things worse, the menu items contain a reference to their parent, which is often a form, which might have references to large objects like datasets or images.  Of course, once you find the problem and know what to look for, you can always find other mentions of it already on the net, along with the work-arounds.  It's frustrating, again, that something so potentially common, with such a big impact, which has apparently been reported to Microsoft for quite some time, is still not noted in any official Microsoft list that I can find.  So if you are using ContextMenus -- beware -- and get the .NET Memory Profiler to help you find your own similar leaks.  By the way, Doug is really getting good at this -- I found a leak in our own code today that I probably wouldn't have found before helping him recently -- although he still found the solution before I could.
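A minimal sketch of the dynamic scenario and its work-around, assuming .NET 1.1 WinForms (RebuildContextMenu and OnRefresh are hypothetical names): dispose the old MenuItems before rebuilding the menu, which releases them from the internal static hashtable.

```csharp
// Hypothetical helper on a Form: rebuilds a ContextMenu dynamically
// without leaking the old MenuItems.
private void RebuildContextMenu(ContextMenu menu)
{
    // Copy the items first: disposing a MenuItem removes it from its
    // parent, which would invalidate a foreach over the collection.
    MenuItem[] oldItems = new MenuItem[menu.MenuItems.Count];
    menu.MenuItems.CopyTo(oldItems, 0);
    menu.MenuItems.Clear();
    foreach (MenuItem item in oldItems)
        item.Dispose();  // releases the internal static reference

    menu.MenuItems.Add(new MenuItem("Refresh", new EventHandler(OnRefresh)));
}
```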
Posted by PaulWilson | 4 comment(s)

.NET GC is Excellent -- Better Than Java

I've received a couple of emails lately due to my postings where I've apparently made some people worried about .NET.  I want to make it very clear that I am very, very impressed with the .NET Garbage Collector -- there is no need to be “worried”.  I've built enough systems, both before .NET and with .NET, to have experienced the blessings of the .NET GC -- they are real.  I have yet to build an application in .NET that leaked any memory in the old sense, where it was not even reclaimable after the app was terminated, except in cases where a 3rd party control had leaks due to unmanaged code.  Yes, there are a few cases I've seen where .NET will leak memory, but only in the newer sense that your application's footprint keeps growing while it is still running -- the memory was always freed upon termination, which means the GC did work.  It's also possible to have these types of leaks in your own code -- we found one ourselves today where we had an object chain that included an object held in a static collection.  I've also been very frustrated that the experts, including Microsoft people, have made simplistic statements that are just wrong in some cases -- one person that emailed me told me that the MS consultants at his site assured him that I was wrong about setting objects to null.  I'm also very frustrated that we actually have to keep re-finding the leaks in the framework that have been found before, but which Microsoft still does not list anywhere on their own site -- not even in the partner-level KB that I have access to as an MVP.  But none of this means that the .NET Garbage Collector is flawed -- it's not -- it works very well, although it makes some assumptions that we developers need to understand so that we can work with it, or even help it in some cases.  It's also true that GC in general is lazy, so there are probably some systems that should not be built on garbage collection, although those are probably few and far between (although maybe I have one).

Now I also titled this post “Better Than Java” -- why?  I had a couple of people email me that Java didn't have these problems -- specifically they said they didn't have to worry about Dispose methods, or setting objects to null, or even memory leaks of any kind!  Well, all I can say is that anyone making such claims is living in a fantasy world, and is hopefully only writing small department applications instead of large enterprise systems.  It's actually quite easy to google terms like garbage collect, leak, Dispose, or null, along with Java, and find all sorts of similar issues that Java has.  Neither Java nor .NET developers “need” to set objects to null, but both have situations where it can help.  Java also has leaks in some of its framework, and it's possible for Java developers to create their own, just like .NET -- they also go away when the application is terminated.  As for the Dispose method, I learned that this pattern is commonly used and accepted as necessary in Java, but it was never officially made part of Java itself, which is one place .NET is actually better, since it learned from Java's experience.  Many Java frameworks do have a dispose method, but it's simply not defined in Java itself, so you can imagine there are lots of less experienced developers that don't realize how important it is.  To me this just underscores the importance of calling Dispose when it's defined, whether we think it does anything or not -- it was added for a reason, and Java is severely lacking it as a standard.  Another place where .NET is better than Java, as far as the GC is concerned, is that .NET defines the basics of the GC once and for all, even though various implementations (Rotor, Mono, DotGnu) of the framework might have slightly different internal implementations of the GC.  Java doesn't have any GC standard at all -- it simply says that memory should be garbage collected -- which makes it rather difficult to properly understand or help the GC in memory-critical systems -- so much for cross-platform.

So please understand that I may point out issues and concerns and frustrations with .NET and its GC, but that should not be taken as a condemnation.  It simply reflects that we need to understand more about the GC and the other internals of the framework we use in order to develop anything but the smallest of apps.  And don't even mention my posts as a reason to use Java -- that's ridiculous.

Posted by PaulWilson | 3 comment(s)

.NET GC Best Practice -- ALWAYS Call Dispose

It's very common practice to not call Dispose on a lot of .NET objects, like DataSets or SqlCommands for instance.  There have even been several discussions on these blogs about the best practices in some of these cases.  The problem is that many of us “know” that calling Dispose on some objects actually does nothing underneath -- it simply exists due to an inheritance chain that included the IDisposable interface.  So many experts (myself included) have gotten in the habit of writing the most “efficient” code, which often means the fewest lines necessary.  So why call Dispose if we know it does not do anything?  As my colleague Doug Ware pointed out to me, this assumes internal knowledge that should not be relied upon and which technically could even change in the future.  Instead, we should remember that the IDisposable pattern was implemented for a reason in general, and so we should follow it or risk being in error at some point.  In other words, what's the harm in writing one extra line of code that “might” have significant reasons for existing?
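Since the cheapest way to always call Dispose in C# is the using statement, here is a small sketch (connectionString and the Orders table are assumed for illustration):

```csharp
// Dispose is called on both objects when the block exits, even on
// exception, and even if a particular Dispose happens to be a no-op.
using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("SELECT COUNT(*) FROM Orders", connection))
{
    connection.Open();
    int count = (int) command.ExecuteScalar();
}
```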

.NET GC Myth #2 -- The GC Frees Memory

.NET Garbage Collection Myth #2:

The .NET Garbage Collector frequently removes unreferenced objects and frees their memory for other processes.

Fact:

The .NET Garbage Collector frequently removes unreferenced objects and reserves their memory for the same process.

Details:

Yes, the .NET GC frequently removes your objects that are no longer referenced (actually, unreachable is the term), but that's not the whole story.  The .NET memory manager grabs memory in large chunks and reserves it so that it doesn't have to keep getting lots of small blocks of memory very often.  This is a very good practice, especially when there is plenty of memory available, since getting memory allocated from the OS does take time.  This means that your application may start with 16MB dedicated to it, even though it's only using 2MB -- note that the numbers may be different on your system.  Then when you finally need 17MB, it will go out and grab another chunk, so that now you have 32MB (or some other number again) reserved for your application.  Now, what happens when the GC frees up memory and you are back to only needing 2MB?  The GC may have removed your objects, and reduced your committed memory, but the reserved memory will not get reduced, since it makes the reasonable assumption that you may need it again.  Again, this is a great principle for most systems on today's computers with lots of memory, but there are situations where you need to be aware of this and manage it accordingly.  The situations I'm thinking of are when there are multiple systems all competing for the same total memory, like my Citrix cases recently, or web servers where each web app is configured as a separate appdomain, or your individual computer if you run lots of programs at once.  How can you better manage for this?  By helping the GC remove objects as quickly as possible so it doesn't have to keep reserving more memory.  You can do this by keeping as many objects local to methods as possible, and also by setting objects to null if they are relatively large and you're done with them.  Finally, when does memory actually get freed?  I can't say conclusively that this is the only occasion, although it may be, but the OS sends a low memory resource notification event, which .NET applications pick up, when there is only 32MB of available RAM on the system.
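A small sketch of how to observe the difference between committed managed memory and the process's reserved footprint (numbers will vary by machine; this is illustrative, not definitive):

```csharp
using System;
using System.Diagnostics;

class ReservedMemoryDemo
{
    static void Main()
    {
        byte[] big = new byte[50 * 1024 * 1024];  // ~50MB allocation
        big = null;
        GC.Collect();  // the array is now collected...

        Process proc = Process.GetCurrentProcess();
        Console.WriteLine("Managed heap in use: {0:N0}", GC.GetTotalMemory(true));
        Console.WriteLine("Private bytes:       {0:N0}", proc.PrivateMemorySize);
        // ...but private bytes typically stay much higher, since the
        // memory manager keeps the reserved chunks for this process.
    }
}
```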

Posted by PaulWilson | 5 comment(s)

.NET GC Myth #1 -- Set Object to Null

.NET Garbage Collection Myth #1:

There is never any reason to set your objects to null (or Nothing in VB) since the GC will do this for you automatically.

Fact:

There are times where setting your objects to null (or Nothing in VB) can make huge differences in your memory footprint.

Details:

Yes, you do not “need” to ever set an object to null, since the GC will eventually collect your object after it goes out of scope.  For small objects that have a short life this is great, and it's exactly why GC is a great thing.  But there are exceptions, and the experts aren't helping by ignoring these cases and simply making blanket statements.  I have an example you can download and try for yourself, as well as several charts from the example, that illustrates one of these cases -- see my last post for more details.  This particular case involves a large object that is replaced by another large object, which leaves you vulnerable to having two such large objects in memory if you don't first set the existing object to null.  Why?  Mostly because the first large object never goes out of scope until it is replaced by the second, so the GC doesn't know you're done with it early enough.  I've had some very well informed people tell me that setting an object to null in the middle of a method is meaningless -- and I believed them after hearing it so many times -- that is, until I saw it for myself, thanks to Doug Ware of Magenic.  My sample app also calls the Clear method, and Dispose when it's defined, but my more detailed tests showed they did not make any difference -- the set to null made the huge difference.  Now some people have told me this example is flawed because you could design the app in such a way that the first large object goes out of scope before creating the second large object.  This criticism is partially valid -- my example was concocted to show the worst case, which I also don't think is all that uncommon in many WinForm applications with datagrids.  But I also had an earlier version of the example which did have this other design, and it also showed a significantly smaller average memory footprint when the objects were set to null, although not as drastic as the example noted above.  So what's my point?  Just that while you may not “need” to set your objects to null, there are some cases where you can realize significant gains by being preemptive and setting your large objects to null when you know they are no longer needed.
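The worst-case pattern can be sketched like this (LoadLargeDataSet and BindGrid are hypothetical helpers standing in for the real app):

```csharp
private void RefreshReport()
{
    DataSet data = LoadLargeDataSet();  // first large object (~100MB)
    BindGrid(data);

    // Without this line, the first DataSet is still reachable through
    // "data" while the next call allocates the second one, so both
    // large objects can be in memory at the same time.
    data = null;

    data = LoadLargeDataSet();          // second large object
    BindGrid(data);
}
```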

.NET Memory Management and Garbage Collection

Newbies to .NET often criticize its memory management and its garbage collector,
but the criticisms are typically based on a lack of understanding and nothing more.
For instance, many have observed that trivial .NET applications allocate about 20MB,
but the incorrect assumption that often follows is that the .NET runtime needs it.
Instead, we know that allocating memory is time-consuming, so it's generally better
when there is lots of memory to just go ahead and allocate a big chunk all at once.
This means that our .NET applications can actually perform better in many cases,
since they don't have to be constantly allocating and deallocating needed memory.
Similarly, the garbage collector is often criticized by newbies to .NET unfairly,
because the implication of GC is that memory is not released as quickly as possible.
While this is true, it's generally an acceptable trade-off with plenty of memory,
since garbage collection frees us from having to worry about memory ourselves.
That's right, we know that GC means we easily end up with more reliable systems,
with fewer memory leaks without investing tons of time managing memory manually.

So what's the point of this post?  Sadly, not all systems are such simple cases.
I've been working with a large .NET WinForms application that uses lots of very
large objects (datasets actually), which means that memory is often running low.
To make matters worse, this application is supposed to run in a Citrix environment,
which means there will be multiple instances of this application at the same time.
Our Citrix servers have 4GB of RAM and dual processors, which sounds like a lot,
but that memory and horsepower has to be shared among many concurrent users here.
The existing Access-based applications we are replacing ran in this environment,
and it was common to have up to 40 users working on the same Citrix box.
It's easy to assume, as the business people have, that since .NET is new and better,
then we can surely get 40 users on the same Citrix box with .NET applications too.
Well, it isn't going to happen -- at this point we'll be happy if we can get 20,
and prior to finding the bug in the XmlSerializer we wondered if 10 was possible.
Don't get me wrong, there are issues beyond just .NET here, like the rationale
for working with such large datasets, but then Access made it "seem" easy to do.

So what have I learned so far?  First, .NET global memory performance counters
do NOT work -- they simply report the last sample from a single .NET application.
Next, contrary to what I've often read and been told, by many .NET experts too,
setting objects to null (or Nothing in VB) can make a huge difference after all!
Note, you can download a simple application I've created to see this for yourself
-- it tracks single or multiple processes, with or without the set null cleanup.
My sample creates an approximately 100MB large object (a dataset or an arraylist),
but the process private bytes quickly level at 200MB, and even 300MB at times.
On the other hand, setting my objects to null keeps the private bytes level at
100MB, although certainly there are still the momentary expected spikes to 200MB.
This may not matter for small footprint apps, but it can be quite critical for
large applications, let alone cases where there are multiple such applications.
My sample also calls the Clear method, and Dispose when it's defined, but
separate tests showed they did not actually make a difference -- just the set to null.
That said, a colleague has convinced me that not calling Dispose when it's defined
is just asking for trouble; after all, why is it defined if not for a good reason?
A look at the source code can even prove some Dispose methods do nothing at all,
but that's an internal implementation detail that is not good to assume in use.

OK, so what else have I learned?  .NET relies on a low memory notification event,
which occurs when there is 32MB of RAM left on systems with 4GB of RAM or less.
I need to be very careful here, since I don't fully understand it myself still,
but it seems that memory management and garbage collection are different things.
Garbage collection occurs rather frequently, which reduces the committed bytes,
but that does NOT mean the memory is given back to the OS for other processes.
Instead, the reserved memory associated with the process stays with the process,
which is why my earlier example often found it acceptable to rise even to 300MB.
Apparently, the reserved memory is only given back to the OS to use for other
processes when the overall available system memory drops to this 32MB threshold.
For a single process, especially one that also sets its large objects to null,
this isn't really an issue -- there is lots of memory and no competition for it.
But with multiple processes, each consuming large objects, this can be an issue!
That's right, imagine my Citrix case with just a dozen users, which isn't many.
A couple processes do large operations, and even after their garbage collection,
they continue to tie up large amounts of memory reserved for their future use.
A little later a couple of other processes begin large operations, and reduce
the overall memory to below 32MB, at which point the notification event occurs.
The problem is that the new operations can't wait, so paging quickly increases,
and with multiple processes the paging can begin to overwhelm everything else.
So in my opinion the 32MB threshold is simply too late for systems like this!
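For the curious, the notification itself is a Win32 object that can be queried directly; a hedged P/Invoke sketch (Windows XP/2003 and later) of checking the same low-memory condition:

```csharp
using System;
using System.Runtime.InteropServices;

class LowMemoryCheck
{
    // 0 = LowMemoryResourceNotification, 1 = HighMemoryResourceNotification
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr CreateMemoryResourceNotification(int notificationType);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool QueryMemoryResourceNotification(IntPtr handle, out bool state);

    static void Main()
    {
        IntPtr handle = CreateMemoryResourceNotification(0);
        bool isLow;
        if (QueryMemoryResourceNotification(handle, out isLow))
            Console.WriteLine("System low-memory condition: " + isLow);
    }
}
```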

It can probably be argued that this is an OS problem, and not a .NET problem,
but I think they are related since garbage collection must first free memory.
I ran just three instances of my simple application on my own 512MB computer,
and when the objects were not set to null it became swamped almost immediately.
I made things drastically better when I changed to setting my objects to null,
but adding another process or two still swamps the system very quickly anyhow.
True, I won't be running multiple users on my personal computer anytime soon,
but I should be able to run multiple applications on it at the same time, right?
My conclusion is that .NET makes it very difficult to run multiple applications
that handle large objects, whether on Citrix or just your own 512MB computer.
The garbage collector that is great for smaller and/or single applications is
just too lazy when combined with this 32MB low memory notification threshold.
Again, there's no doubt that my particular scenario should have been designed
differently, with some type of limits to avoid so many large objects in memory.
But there's no doubt that this system would support more users if it were NOT
working against the lazy garbage collector, as our older systems never had to.

Oh well, I've gone on long enough, and probably said some foolish things too.
My frustrations are not so much about .NET, since I think it's great for the proper
scenarios, which are most cases -- my frustrations are that these types of issues
are simply not documented, and there is too much incorrect or misleading data.
This should not stop you from creating most of your applications with .NET still,
but I would very much like some “real” advice on the larger scenarios like this.

Leak Found in the .NET XmlSerializer

We finally found another leak in our Citrix WinForms app.  Early tests show this one to be huge, so we are optimistic that this will clear up most of our problems.  Apparently the XmlSerializer creates dynamic assemblies, but they only get reused if you stick to the simplest constructors.  We are using one of the more complex constructors, so we were essentially leaking dynamic assemblies, since .NET provides no way to remove assemblies from an AppDomain.  I can't take credit for finding the problem, one of my colleagues here did that, but I found the solution -- just create it once and reuse it.  It's also interesting to note that now that I know what to look for, it's easy to find other people that have run into this before, like Joseph Cooney and Scott Hanselman.
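The fix can be sketched like this (the Order type is hypothetical): only the XmlSerializer(Type) and XmlSerializer(Type, defaultNamespace) constructors cache their generated assemblies, so any other overload should be created once and reused.

```csharp
using System.IO;
using System.Xml.Serialization;

public class OrderSerializer
{
    // Created once and reused: the complex constructors generate a new
    // dynamic assembly per call, and assemblies can never be unloaded
    // from an AppDomain, so each new instance is effectively a leak.
    static readonly XmlSerializer serializer = new XmlSerializer(
        typeof(Order), new XmlRootAttribute("Order"));

    public static string Serialize(Order order)
    {
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, order);
        return writer.ToString();
    }
}
```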
Posted by PaulWilson | 6 comment(s)

Global .NET Memory Performance Counters Do NOT Work

I discovered today that the _Global_ .NET memory performance counters simply do NOT work.  They don't tell you anything at all about the sum total of all your .NET processes.  They instead only report the last sample that was collected, regardless of the process.  This is not at all documented as far as I can tell.  In fact, the MSDN docs specifically go out of their way to say this is the behavior of only the 3 counters that track the total number of collections, implying that the others work fine.  I was heavily using the counters for the total number of bytes that .NET had in its various heaps, and this is the correct usage according to the books and articles I've seen.  But they clearly don't work, as you can see by setting counters for each individual process and comparing them to the global counter yourself.  It's very frustrating to find out that all of my performance tests have been for nothing, since they assumed that the performance counters were reliable.  It also raises the question that I'm still having problems actually tracking down in Citrix (or Terminal Server) -- does the .NET garbage collector understand that there are other processes also running at the same time?  Everyone without Citrix experience tries to tell me that of course the .NET GC works right in Citrix, but it seems that the few other people that are trying keep having the same questions.  And now it's apparent that the global .NET memory performance counters are unaware of multiple processes, so . . . ?  By the way, someone from Microsoft noted that the .NET GC listens to the low memory notification event to know when it needs to work.  But guess what -- the default setting for this is that your memory is low when you only have 32MB left on a 4GB server!  There's also nothing I can find anywhere that tells you how to change this default setting to something more reasonable.  That number sounds too low in any setting, but imagine a Citrix server with many users all having processes open -- when there's only 32MB left it will be far too late to do anything without severely impacting performance.
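The mismatch is easy to demonstrate by comparing _Global_ against the sum of the per-process instances; a rough sketch (counter values are sampled at slightly different times, so only large discrepancies are meaningful):

```csharp
using System;
using System.Diagnostics;

class GlobalCounterCheck
{
    static void Main()
    {
        PerformanceCounterCategory category =
            new PerformanceCounterCategory(".NET CLR Memory");

        float sum = 0;
        foreach (string instance in category.GetInstanceNames())
        {
            if (instance == "_Global_") continue;
            PerformanceCounter counter = new PerformanceCounter(
                ".NET CLR Memory", "# Bytes in all Heaps", instance);
            sum += counter.NextValue();
        }

        PerformanceCounter global = new PerformanceCounter(
            ".NET CLR Memory", "# Bytes in all Heaps", "_Global_");

        // If _Global_ were a real total these would roughly agree;
        // instead it just reports the last sample collected.
        Console.WriteLine("_Global_ counter:  {0:N0}", global.NextValue());
        Console.WriteLine("Sum of processes:  {0:N0}", sum);
    }
}
```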

Here's another question for you .NET gurus -- why does a recordset that is only 4MB in Query Analyzer take 24MB in a .NET DataTable?  And why is there another 3MB that is lost in total system available memory that isn't reflected in the .NET memory performance counters for that process?  Similarly, a 7MB recordset takes 40MB in a .NET DataTable, and a 10MB recordset takes 65MB in a .NET DataTable, and each always has another 3MB that seems to go missing according to the counters.  I realize there is more to a DataTable than just data, and that .NET allocates a larger chunk than it needs, but this seems a little excessive.  By the way, this is all from testing on .NET v1.1 on Windows 2003 Enterprise Server.

Posted by PaulWilson | 8 comment(s)

More Forms and Windows Security in ASP.NET

Ryan Dunn and I have been having a dialog on the ASP.NET Forums about my recent article on Mixing Forms and Windows Security in ASP.NET.  He has another technique that attempts to do something similar here on GotDotNet and he very much disagrees that my solution is sufficient.  Basically, my solution only demos how to combine Forms and Windows Authentication to automatically capture an Intranet user's name.  His method instead combines Forms and Windows Authorization by creating a WindowsPrincipal that roles can be checked against.  I apologize if someone thinks I've misled them since my article did not go all the way and illustrate the combined Authorization also, so I'm attaching the small amount of code, based on Ryan's work, that will create the WindowsPrincipal and complete the example.

Please also note that Ryan's technique is not at all sufficient, and is thus very misleading, since it does not actually do any real Windows Authentication!  It makes the assumption that all users within a certain IP address range are valid users -- which is not at all true if your network allows visitors to plug into the network and access the Intranet.  This means that visitors will automatically get access to your applications that use this technique and don't then check for an additional role.  It will also actually prevent such visitors from ever logging in with the alternative custom login form to prove they have the roles in the custom scenario you worked hard to create.  So, here's the necessary code, not very “clean” since it's just a quick example, to complete my technique by creating a real WindowsPrincipal for Windows users:

Change the entirety of WinLogin.aspx's Page_Load method to:

// Get the Windows user token for the authenticated Intranet user
IServiceProvider service = (IServiceProvider) this.Context;
HttpWorkerRequest request = (HttpWorkerRequest) service.GetService(typeof(HttpWorkerRequest));
// Save the token in a cookie so it can be rehydrated on later requests
this.Response.Cookies.Add(new HttpCookie("UserToken", request.GetUserToken().ToString()));
string userName = this.Request.ServerVariables["LOGON_USER"];
FormsAuthentication.RedirectFromLoginPage(userName, false);


Then add the following to the Global.asax's Application_AuthenticateRequest:

else if (this.Request.Cookies["UserToken"] != null) {
    // Rehydrate the Windows token that was captured in WinLogin.aspx
    string token = this.Request.Cookies["UserToken"].Value;
    IntPtr userToken = new IntPtr(int.Parse(token));
    // Create a real WindowsPrincipal so role checks work against Windows groups
    WindowsIdentity identity = new WindowsIdentity(userToken,
        "NTLM", WindowsAccountType.Normal, true);
    HttpContext.Current.User = new WindowsPrincipal(identity);
}

You can make this “cleaner” (i.e. more secure) by including the userToken into the FormsAuthentication cookie's UserData so that it gets encrypted, instead of being a separate cookie as I've done here.
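A hedged sketch of that cleaner variant, assuming the same WinLogin.aspx context as above: carry the token in the encrypted forms ticket's UserData instead of a separate plain-text cookie.

```csharp
// Replace the separate "UserToken" cookie with an encrypted ticket.
string userName = this.Request.ServerVariables["LOGON_USER"];
string token = request.GetUserToken().ToString();

FormsAuthenticationTicket ticket = new FormsAuthenticationTicket(
    1, userName, DateTime.Now, DateTime.Now.AddMinutes(30),
    false, token);  // the token travels in the encrypted UserData

string encrypted = FormsAuthentication.Encrypt(ticket);
this.Response.Cookies.Add(
    new HttpCookie(FormsAuthentication.FormsCookieName, encrypted));
this.Response.Redirect(FormsAuthentication.GetRedirectUrl(userName, false));
```

In Application_AuthenticateRequest the token is then read from ((FormsIdentity) this.User.Identity).Ticket.UserData rather than from a cookie.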

More Posts