.NET Memory Management and Garbage Collection

Newcomers to .NET often criticize its memory management and its garbage collector,
but the criticisms are typically based on a lack of understanding and nothing more.
For instance, many have observed that trivial .NET applications allocate about 20MB,
and the incorrect assumption that often follows is that the .NET runtime needs all of it.
In reality, allocating memory is time-consuming, so when memory is plentiful it's
generally better to go ahead and allocate a big chunk all at once.
This means our .NET applications can actually perform better in many cases,
since they don't have to constantly allocate and deallocate the memory they need.
Similarly, newcomers often criticize the garbage collector unfairly,
because the implication of GC is that memory is not released as quickly as possible.
While that's true, it's generally an acceptable trade-off when memory is plentiful,
since garbage collection frees us from having to manage memory ourselves.
That's right, GC means we end up with more reliable systems and fewer memory leaks,
without investing tons of time managing memory manually.

So what's the point of this post?  Sadly, not all systems are such simple cases.
I've been working with a large .NET WinForms application that uses lots of very
large objects (datasets, actually), which means that memory is often running low.
To make matters worse, this application is supposed to run in a Citrix environment,
which means there will be multiple instances of the application running at the same time.
Our Citrix servers have 4GB of RAM and dual processors, which sounds like a lot,
but that memory and horsepower has to be shared among many concurrent users.
The existing Access-based applications we are replacing ran in this environment,
and it was common to have up to 40 users working on the same Citrix box.
It's easy to assume, as the business people have, that since .NET is newer and better,
we can surely get 40 users on the same Citrix box with .NET applications too.
Well, it isn't going to happen -- at this point we'll be happy if we can get 20,
and prior to finding the bug in the XmlSerializer we wondered if 10 was possible.
Don't get me wrong, there are issues here beyond .NET, like the rationale for
working with such large datasets in the first place, but Access made that "seem" easy.

So what have I learned so far?  First, the .NET global memory performance counters
do NOT work -- they simply report the last sample from a single .NET application.
Next, contrary to what I've often read and been told, even by many .NET experts,
setting objects to null (or Nothing in VB) can make a huge difference after all!
You can download a simple application I've created to see this for yourself
-- it tracks single or multiple processes, with or without the set-to-null cleanup.
My sample creates an approximately 100MB large object (a dataset or an ArrayList),
but the process private bytes quickly level off at 200MB, and even 300MB at times.
On the other hand, setting my objects to null keeps the private bytes level at 100MB,
although there are certainly still the expected momentary spikes to 200MB.
This may not matter for small-footprint apps, but it can be quite critical for
large applications, let alone cases where multiple such applications run at once.
My sample also calls the Clear method, and Dispose when it's defined, but separate
tests showed they did not make a difference -- only setting to null did.
That said, a colleague has convinced me that not calling Dispose when it's defined
is just asking for trouble; after all, why is it defined if not for a good reason?
A look at the source code even shows that some Dispose methods do nothing at all,
but that's an internal implementation detail you should not rely on.
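
To make that concrete, here is a minimal sketch of the cleanup pattern just
described; the type and member names are placeholders for illustration, not the
sample application's actual code:

    using System;
    using System.Data;

    class LargeObjectCleanup
    {
        // Long-lived reference to the large object, as in the scenario above
        // (the field and method names here are assumed for illustration).
        static DataSet data;

        static void Main()
        {
            RunLargeOperation();
            FinishLargeOperation();   // private bytes should settle back down after this
        }

        static void RunLargeOperation()
        {
            data = BuildLargeDataSet();
            // ... work with the data, e.g. bind it to a grid ...
        }

        static void FinishLargeOperation()
        {
            if (data != null)
            {
                data.Clear();     // called in the sample, though it made no measurable difference
                data.Dispose();   // DataSet does define Dispose, so call it anyway
                data = null;      // this assignment is what kept private bytes near 100MB
            }
        }

        static DataSet BuildLargeDataSet()
        {
            // Stand-in for building the ~100MB dataset; details omitted.
            return new DataSet();
        }
    }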

OK, so what else have I learned?  .NET relies on a low memory notification event,
which occurs when there is 32MB of RAM left on systems with 4GB of RAM or less.
I need to be very careful here, since I don't fully understand it all myself yet,
but it seems that memory management and garbage collection are different things.
Garbage collection occurs rather frequently, which reduces the committed bytes,
but that does NOT mean the memory is given back to the OS for other processes.
Instead, the reserved memory associated with the process stays with the process,
which is why my earlier example often found it acceptable to rise even to 300MB.
Apparently, the reserved memory is only given back to the OS for other processes
to use when the overall available system memory drops to this 32MB threshold.
For a single process, especially one that also sets its large objects to null,
this isn't really an issue -- there is lots of memory and no competition for it.
But with multiple processes, each consuming large objects, it can be a real issue!
That's right, imagine my Citrix case with just a dozen users, which isn't many.
A couple of processes do large operations, and even after their garbage collections
they continue to tie up large amounts of memory reserved for their future use.
A little later a couple of other processes begin large operations and push the
overall available memory below 32MB, at which point the notification event occurs.
The problem is that the new operations can't wait, so paging quickly increases,
and with multiple processes the paging can begin to overwhelm everything else.
So in my opinion the 32MB threshold is simply too late for systems like this!
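
As an aside, the low memory notification described here appears to correspond to
the Win32 memory resource notification object (available on Windows XP/Server 2003
and later) -- that correspondence is an assumption on my part, but if you want to
watch the signal yourself, a minimal P/Invoke sketch looks like this:

    using System;
    using System.Runtime.InteropServices;

    class LowMemoryCheck
    {
        // 0 = LowMemoryResourceNotification, 1 = HighMemoryResourceNotification
        [DllImport("kernel32.dll", SetLastError = true)]
        static extern IntPtr CreateMemoryResourceNotification(int notificationType);

        [DllImport("kernel32.dll", SetLastError = true)]
        static extern bool QueryMemoryResourceNotification(IntPtr handle, out bool state);

        static void Main()
        {
            IntPtr handle = CreateMemoryResourceNotification(0); // low-memory notification
            if (handle == IntPtr.Zero)
            {
                Console.WriteLine("CreateMemoryResourceNotification failed.");
                return;
            }

            bool isLow;
            if (QueryMemoryResourceNotification(handle, out isLow))
            {
                // True once the OS considers available memory low -- this code only
                // observes the threshold, it does not control where it is set.
                Console.WriteLine("Low memory signaled: " + isLow);
            }
        }
    }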

It can probably be argued that this is an OS problem, and not a .NET problem,
but I think they are related, since garbage collection must first free the memory.
I ran just three instances of my simple application on my own 512MB computer,
and when the objects were not set to null it became swamped almost immediately.
Setting my objects to null made things drastically better, but adding another
process or two still swamps the system very quickly.
True, I won't be running multiple users on my personal computer anytime soon,
but I should be able to run multiple applications on it at the same time, right?
My conclusion is that .NET makes it very difficult to run multiple applications
that handle large objects, whether on Citrix or just your own 512MB computer.
The garbage collector that is great for smaller and/or single applications is
just too lazy when combined with this 32MB low memory notification threshold.
Again, there's no doubt that my particular scenario should have been designed
differently, with some type of limits to avoid so many large objects in memory.
But there's also no doubt that this system would support more users if it were
not stuck with this lazy garbage collector, as our older systems demonstrated.

Oh well, I've gone on long enough, and probably said some foolish things too.
My frustrations are not so much with .NET, since I think it's great for the proper
scenarios, which are most cases -- my frustrations are that these types of issues
are simply not documented, and there is too much incorrect or misleading information.
None of this should stop you from creating most of your applications with .NET,
but I would very much like some “real” advice on the larger scenarios like this.

18 Comments

  • Some good info there on how .NET works with memory and GC.

    I think you have verified some things I have told folks over on GDN.com at least a few times.

    But I think the question was posed there: why run 40 users through MetaFrame to run a .NET app?

    It seems to me (until I know more details) that a better design could be built. Are the 40 users unable for some reason to run .NET software on the PCs that use MetaFrame?

    Even if that is the case, you might see if you could run a second server and have it do some form of "pooling" and/or "caching" of data and objects, taking a large load off the MetaFrame box.

    3-tier design: handle MetaFrame as if it were a web client, and use a middle server as if it were a web server / app server.

  • Hmm, I personally used to write some data-hungry .NET WinForms applications, and I think it's not such a good idea to have everything in memory as datasets. I used DataReader or save-query-to-disk techniques to handle such things. In the end I found it's much better to use a web-service oriented architecture (like MS did with business products such as MS CRM). It's especially important in Citrix or other terminal-based scenarios, since you can "outsource" data processing activities to some other box(es).

  • Why use Citrix? First and foremost is security -- we don't want our data on individual machines. It's also a TCO issue to limit deployments and upgrades to centralized servers. While I would prefer a non-Citrix solution, it's hard to argue with these valid business concerns.

    Why not a middle tier? In this case it would make things worse! The problem is not getting the data, it's working with it in the rich client application. Our issue is that we designed in no limitations, so paging or limits on the number of records do not exist. Our app depends on everything being loaded into a feature-rich 3rd party grid control, which means all the data has to be in memory in the client process. I'll agree that these design decisions are what is hurting us the most, but given the requirements . . .

    We've looked at save-to-disk techniques, and may yet go that way. The key is that any type of "paging" we do with the data must not slow down the user's experience.

  • "Our app depends on everything being loaded into a feature-rich 3rd party grid control, which means all the data has to be in memory in the client process."



    that sounds like the "Kiss of death" right there.....



    Paul, I have used sql databases with up to 5 million rows of data at a time....

    I NEVER had to give the client all 5 million rows.... the "WOrst case" was doing odd-ball searches for some set of rows ....



    but I never found that giving the client a set of less rows was a problem... as long as the app design worked with the user to give them a logical set that they expected to get based on what they were doing....



    also if 20-40 users are using the same app the question is: how much data is *exacly the same* for all 20-40 users? if they can edit the data then you have updates to deal with but ..... that can be done without re-inventing the wheel.



    I just think that perhaps some form of common data object could be created that would start each user with a reference to one common data object, this would have a "Local" object that looks at the global to read data....



    then an update would cause the local to write the data to the sql database and pass an event to the global to cause it to re-load changed rows, that would then push an even out the the local proxy objects to re-fresh the local views.



    under a model like that I can visualize 40 sessions having one common 100-600 meg data store and 40 local images of 1-10 changes to write



    seems like that could reduce local memory stress a whole lot.



    basicaly (this is .net data sets right??)

    you would create a new data-adapter and data-set

    objects that comply with the normal interfaces and extend them to work with a common memory based store.

  • I fully agree it's not a good technical design -- "kiss of death" is appropriate.

    We currently have a 100,000 row limit in place -- and some users are quite upset.

    As for shared data -- there will be very little, if any, in most cases for this.

    Also, updates are not allowed to the original data, only to new copies made of it.

    I'm talking about multi-GB databases, some of them in the TB-range -- very large.

    Single users look at their "small" chunks, which might be 100MB, or maybe 1GB+!

  • Hi Paul,

    Great data. I found myself in similar positions when researching 'best practices' for .NET-based systems, although not on Citrix, so I very much appreciate you sharing this information with us. Thanks! I also wish that MS would provide some more info on these kinds of things publicly, as I am sure they have done tests on Terminal Server/Citrix...

    One other question: why didn't you use a browser-based approach?

    Best regards,

    Marc

  • It makes perfect sense to nullify global objects that don't need to be lurking out there anymore.

    But this doesn't mean that you need to nullify each and every object; it's all about reachability. Your objects cannot be collected while some code can still use/reach them.

    In other words, objects that are created and only used within a certain scope do not have to be nullified.
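
    A small sketch of this distinction, with hypothetical class and member names:

    using System.Data;

    class ReportForm
    {
        // A long-lived field keeps its dataset reachable for the life of the form,
        // so nulling it when the user is done genuinely frees it for collection.
        private DataSet results;

        public void CloseReport()
        {
            results = null;   // worth doing for a large, long-lived reference
        }

        public void PrintSnapshot()
        {
            // A local is unreachable once the method returns,
            // so nulling it here buys nothing in principle.
            DataSet temp = new DataSet();
            // ... use temp ...
            temp = null;      // redundant for a short-lived local
        }
    }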

  • Marc:

    A browser-based approach would be nice, but it simply does not provide a rich enough GUI, nor fast enough performance with this type of data. Think about it this way: would you drop all of your everyday tools and move them to the web?

    Ryan:

    Sounds like a nice theory; unfortunately, the reality I've looked at shows that even setting local-scope objects to null "can" make a big difference. Again, it may not be "needed", so it may depend on the scenario, and I certainly don't see the need with small apps and less data. But for large memory-intensive apps, you really can help the GC by setting at least the large objects to null. I didn't believe it myself until I saw it.

  • OK, very big data + grid + MetaFrame.

    Yuck!

    Yes, this is a "rock and a hard place" indeed.

    Last comment: smart client?

    It gets users out of MetaFrame, keeps the rich GUI, spreads out the memory footprint, keeps control of code and data central, and the clients' code is updated on the fly from a server-based code share. You then manage "data on the client" with some custom classes.

    The only hard requirement for this is that all clients run Win98 or later and get a one-time install of the .NET Framework.

    As for row limits, I'd tend to think a custom data class could make the data "virtual" to the client -- just a matter of how you present the data and let them scroll through it.

  • I agree that would be my choice too, but security concerns have overridden technical preferences.

  • Why not run the garbage collector after a user has finished working with the data? Running the garbage collector can be a bad idea for a server application, but for a user-driven application it should not affect performance that much.
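
    For what it's worth, forcing a collection at such a point would look something like this (generally discouraged, but it is the idea being suggested here):

    using System;

    class IdleCleanup
    {
        // Call when the user has just finished a large operation and the app is idle.
        public static void CollectNow()
        {
            GC.Collect();                   // collect unreachable objects
            GC.WaitForPendingFinalizers();  // let finalizers release unmanaged resources
            GC.Collect();                   // collect objects freed by those finalizers
        }
    }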

  • Paul, I have changed some of the code in your test app, like this:

    private static void CreateArrayLists()
    {
        int arrCount = 2000;
        int max = 20000 * approxMB / arrCount;
        int count;
        if (random)
        {
            count = new Random().Next(max / 2, max);
        }
        else
        {
            count = max;
        }
        for (int arr = 0; arr < arrCount; arr++)
        {
            ArrayList list = new ArrayList();
            for (int index = 0; index < count; index++)
            {
                list.Add(index.ToString());
            }
        }
    }

    When I change arrCount to 1, I get the same behaviour as your original test program.

    However, when creating more arrays, the GC seems to do a better job.

    I do recall that the GC handles objects with a large footprint differently from their lightweight counterparts. This behaviour was already "discovered" by other people.

    I hope someone with more insight into the GC than us will comment on this issue...

  • Per your memory management issues: it has been my experience that although the GC does a great deal of good work for you, I hope to see future releases be more preemptive in recapturing resources.

    I read an interesting article on the GC and it did open my eyes a bit to how it worked. If I can recall the Microsoft blogger I will forward it to you. What was interesting was how some of it worked. For example, when you create an object, it is put on the heap and your variable is just a pointer to that space (nothing new here), but when you set it to null you are merely informing the CLR that you are no longer interested in that chunk of memory.

    Furthermore, when a variable leaves scope, the GC has to first recapture the variable that holds the address to the chunk of memory. Then once all variables have been recaptured and all counters have been decremented to 0, it can reclaim the memory that the variables point to (this is a bit of conjecture on my part but I think I am right... if I am wrong someone please correct me). If all of this is true, then setting it to null must help the GC in its collection efforts, because it does not need to recapture the memory the object variables are pointing to and can more readily capture the underlying memory the objects were pointing to.

    My 2 cents anyway,

    -Mathew Nolton

  • ...continued

    ...this extra step of reclaiming the variable directly affects the GC's ability to reclaim the memory the variable points to, and can cause the GC to fall behind in its efforts to reclaim memory under a heavy load. That is why setting it to null helps the GC... Again, this contains some conjecture, but we have run across similar issues of the GC falling behind.

    -Mathew Nolton

  • Sounds like a good explanation, short of hearing one from the CLR team. Thanks.

  • Paul, it's interesting to read your comments. My organisation is in a similar situation (Citrix, Server, GC not releasing memory, the works).

    I have a couple of points so far:

    Firstly, I think the reason nulling out your reference to the list shows a benefit is that you are first building up a new list, then assigning it to the reference. Of course there will be GC activity while building up such a big list, and since you're still holding the old reference until the new list is assigned, the GC will not be able to collect that memory.

    I believe you could change your code in one of the following ways and get the same results, which clarifies that it's not some mystical "we're telling the GC to collect it because when we set it to null, the GC knows to collect it" behaviour.

    Instead of making a new local ArrayList, adding to it, then returning and assigning it, you either assign the new ArrayList straight to the list variable and then build it, or clear the existing list without a new assignment and build it (a small sketch of both variants follows this comment).

    In reference to Mathew Nolton's post, I am confused. As far as I'm aware, the .NET GC uses no reference counts, so if that's what you are referring to as "counters", I believe you are mistaken.

    I also disagree with the statement that the GC has to clear some objects out before it can clear out the objects that were referenced by those in the first clearing. The GC presumes all objects are garbage until proved otherwise.

    If an object is garbage, then the GC will never trace references from that object to any other object. Hence, if a garbage object references another garbage object, both are marked as garbage in one sweep and collected at once.

    I would love to hear of any solutions that come up for this issue with the GC.

    Niall
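
    A small sketch of the two variants described above, with the field and method names assumed for illustration:

    using System.Collections;

    class ListRebuild
    {
        static ArrayList list;   // long-lived reference to the big list

        // The pattern said to cause the problem: the old list stays reachable
        // through "list" while the replacement is built, so both are live at once.
        static void RebuildByReplacement(int count)
        {
            ArrayList fresh = new ArrayList();
            for (int i = 0; i < count; i++)
                fresh.Add(i.ToString());
            list = fresh;   // the old list only becomes garbage here
        }

        // The suggested alternative: assign (or clear) first, then build,
        // so the old contents are already unreachable while the new ones are created.
        static void RebuildInPlace(int count)
        {
            list = new ArrayList();   // or list.Clear();
            for (int i = 0; i < count; i++)
                list.Add(i.ToString());
        }
    }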

  • Thanks,

    I will have to check it out. Richter always seems to get it right.

    -Mathew Nolton

  • Paul, great article. We are in a similar situation. We're running a large enterprise .NET app with datasets loaded into a 3rd party control suite (mostly grids) on an 8-box Citrix server cluster. Where our old VB6/Access app could run 20 concurrent users, the new .NET/SQL Server app has trouble with more than 5 or so. As a stopgap workaround we've used TrimWorkingSet to force deallocation of memory at certain key process points until we can put in more permanent and programmatically sound fixes with our next major release.

    Dim Procs As Process() = Process.GetProcesses()
    If Procs.Length > 0 Then
        For Each Proc As Process In Procs
            If Proc.Id = pidTarget Then
                Dim ipMax As New IntPtr(1000000)
                Dim ipMin As New IntPtr(500000)
                Proc.MaxWorkingSet = ipMax
                Proc.MinWorkingSet = ipMin
            End If
        Next
    End If

    The only issue here is that this has to be run under Admin credentials. I certainly hope your users don't have Admin rights on Citrix, so the options are a) shell out a new process with the proper security context from the primary app, or b) run it as a timed service on each Citrix box. I know this is not an optimal solution by far, but as a finger in the dike it seems to be working so far.

