Preferred CLR Development Techniques

Here's a list of some preferred CLR development techniques from my CLR Internals talk last week at the Dallas C# SIG.  Most of these are collected from Jeffrey Richter's book and from webcasts by Microsoft sources such as Rico Mariani and Gregor Noriskin.  Thanks to them, of course, for their work.

Initialization

Consider the size of the assemblies being loaded on startup.  For typical applications, loading a few large assemblies is preferred over loading many small ones, to avoid the overhead of locating, loading, and initializing each assembly.

 

Small assemblies have their place in plug-in architectures and Internet-based applications.  Assemblies downloaded from the Internet should either trickle to the user as needed or come with an installer.

 

Consider on-demand assembly loading and late assembly loading using reflection with an object variable.  JIT will prevent this and possibly counter the prejitted X86 code benefit.
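
As a rough sketch of late loading through reflection with an object variable: the helper below loads an assembly by name only when it is first asked for and hands back the instance as a plain object.  LateLoader and Create are hypothetical names, and the demo reuses the assembly that contains StringBuilder only as a stand-in for a real plug-in assembly.

```csharp
using System;
using System.Reflection;

public static class LateLoader
{
    // Loads the named assembly on first use and returns an instance of the
    // requested type in a plain object variable (no compile-time reference).
    public static object Create(string assemblyName, string typeName)
    {
        Assembly assembly = Assembly.Load(assemblyName); // the load happens here, not at startup
        Type type = assembly.GetType(typeName, true);    // true: throw if the type is missing
        return Activator.CreateInstance(type);
    }

    public static void Main()
    {
        // Stand-in for a plug-in: an assembly that is certain to be present.
        string assemblyName = typeof(System.Text.StringBuilder).Assembly.FullName;
        object builder = Create(assemblyName, "System.Text.StringBuilder");
        Console.WriteLine(builder.GetType().FullName);
    }
}
```

Everything called on the returned object afterwards is late bound, so keep this out of critical paths.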

 

Assemblies without strong names are not version checked, so if an assembly will always be private, don't give it a strong name; that avoids the version checking.

 

Preload forms on another thread to a cache such as a hashtable.

 

Rebase your assemblies after studying your memory footprints.

 

 

Operation

The foreach statement has to obtain an enumerator for the collection; in hot paths use for() instead, which the CLR can optimize.  Don't cache the size in a local and then go into the loop, because the CLR will not be able to optimize it.  A nested foreach obtains a new enumerator on every pass of the outer loop.
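
A small sketch of the loop shape this tip describes, assuming an int array; LoopDemo and Sum are made-up names.  The loop condition tests values.Length directly rather than a cached copy.

```csharp
using System;

public static class LoopDemo
{
    public static long Sum(int[] values)
    {
        long total = 0;
        // The condition reads values.Length directly instead of a cached
        // local, which is the pattern the JIT is able to recognize.
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new int[] { 1, 2, 3, 4 }));
    }
}
```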

 

When possible, rewrite commonly used components to keep PInvoke (roughly 10 instructions of overhead per call), COM calls (roughly 13 instructions), and reflection (metadata and reflection objects) out of critical paths.

 

Do your homework: measure and test with the CLR Profiler, and watch the performance counters, especially % Time in GC.

 

Use late binding and reflection as little as possible.  In VB, enforce early binding with Option Explicit On and Option Strict On.

 

Use chunky, not chatty, calls, especially when performing COM calls, where every call takes a marshalling hit.

 

Make COM calls isomorphic by using primitive types or strongly typed arrays.  Unicode-built COM objects are preferred, because strings then marshal Unicode to Unicode instead of Unicode to ANSI.

 

Use NGen to improve startup time, for example for small utilities.  NGen counteracts some optimizations, such as cross-assembly inlining and delay loading.  NGen code may also run slower, because the structures it creates, such as vtables, may not be as well tuned as JIT-compiled code for newer processors and the L1 / L2 hardware caches.  If the NGen image is found to be incompatible or corrupt, everything falls back to the JIT anyway.

 

In the current version, code that gets JIT compiled is not shared across processes.  If you have a component that will be loaded into many processes, consider pre-jitting it with NGen so that, in theory, the native code is shared as much as possible.

 

Strong-named assemblies that are not in the GAC must have their strong name verified repeatedly.  Place them in the GAC to avoid this repetition.

 

Don't use exceptions for normal flow control.  Use try-finally often.  Only catch exceptions if you can add some value or enhance the user experience.
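
One way to keep an expected failure out of the exception path, sketched with int.TryParse (available from .NET 2.0); ParseDemo and ParseOrDefault are hypothetical names.

```csharp
using System;

public static class ParseDemo
{
    // Returns the parsed number, or fallback when the text is not a number.
    // Bad input is an expected case here, so no exception is thrown for it.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", 0));
        Console.WriteLine(ParseOrDefault("not a number", -1));
    }
}
```

The alternative, calling int.Parse and catching FormatException, gives the same result but pays for a throw on every bad input.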

 

Throw fewer exceptions.

 

Perform local work before transferring across the network.

 

Use asynchronous calls when beneficial, at least for a better user experience.

 

Make use of pluggable channels and formatters, such as binary instead of SOAP, if the application does not need the interoperability features.

 

 

Threading

Use a reader-writer lock (ReaderWriterLock) when many threads read but only one writes.  The reader-writer lock comes at a price, so use the plain lock statement most of the time.
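
A minimal sketch of the read-many, write-by-one case using System.Threading.ReaderWriterLock.  SettingsCache and its members are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class SettingsCache
{
    private readonly ReaderWriterLock rwLock = new ReaderWriterLock();
    private readonly Dictionary<string, string> values = new Dictionary<string, string>();

    // Many threads may hold the reader lock at the same time.
    public string Get(string key)
    {
        rwLock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            string value;
            return values.TryGetValue(key, out value) ? value : null;
        }
        finally
        {
            rwLock.ReleaseReaderLock();
        }
    }

    // Only one thread may hold the writer lock, and no readers run meanwhile.
    public void Set(string key, string value)
    {
        rwLock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            values[key] = value;
        }
        finally
        {
            rwLock.ReleaseWriterLock();
        }
    }
}
```

If reads and writes are about equally frequent, a plain lock around the dictionary is likely the cheaper choice.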

 

Use locks only when necessary, and do as little work as possible while holding the lock.

 

Use the built-in ThreadPool.  In the current version you have 25 threads per process per processor at your disposal.  Queue work with the static ThreadPool.QueueUserWorkItem(callback, objectToWorkOn).
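
Here is roughly what a QueueUserWorkItem call looks like, with the object to work on passed as the state argument.  PoolDemo, WorkItem, and SquareOnPool are hypothetical names; the ManualResetEvent is there only so the caller can wait for the result.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    // Hypothetical state object handed to the callback.
    private sealed class WorkItem
    {
        public int Input;
        public int Output;
        public readonly ManualResetEvent Done = new ManualResetEvent(false);
    }

    // Matches the WaitCallback delegate: a single object parameter.
    private static void Square(object state)
    {
        WorkItem item = (WorkItem)state;
        item.Output = item.Input * item.Input;
        item.Done.Set(); // signal the waiting caller
    }

    public static int SquareOnPool(int n)
    {
        WorkItem item = new WorkItem();
        item.Input = n;
        ThreadPool.QueueUserWorkItem(new WaitCallback(Square), item);
        item.Done.WaitOne(); // block until the pool thread has finished
        item.Done.Close();
        return item.Output;
    }

    public static void Main()
    {
        Console.WriteLine(SquareOnPool(7));
    }
}
```

Blocking right after queuing defeats the purpose in real code; it is done here only to keep the example deterministic.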

 

Don't perform long-running operations on the ThreadPool.  Its threads are meant to be shared with other components in the application.  Instead, create a dedicated thread under your own control, with a flag to cancel it.
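
A sketch of a dedicated thread with a cancel flag, as an alternative to tying up a pool thread.  Worker and its members are made-up names, and the Sleep call stands in for a slice of real work.

```csharp
using System;
using System.Threading;

public class Worker
{
    private volatile bool cancelRequested; // written by the caller, read by the worker
    private readonly Thread thread;

    public Worker()
    {
        thread = new Thread(new ThreadStart(Run));
        thread.IsBackground = true; // do not keep the process alive on exit
    }

    public bool IsAlive
    {
        get { return thread.IsAlive; }
    }

    public void Start()
    {
        thread.Start();
    }

    // Raises the flag and waits for the worker loop to notice it.
    public void Cancel()
    {
        cancelRequested = true;
        thread.Join();
    }

    private void Run()
    {
        while (!cancelRequested)
        {
            Thread.Sleep(10); // placeholder for one slice of the long-running job
        }
    }
}
```

Checking the flag between small slices of work keeps cancellation responsive without aborting the thread.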

 

 

Miscellaneous

Use FxCop for performance hints, such as how to test for empty strings and how to avoid building uncallable code.

Include Return statements in VB functions.  Without them, a few extra variables must be created to support returning a value through the function name, and these variables may prevent JIT optimizations such as inlining.

In VS2005, use but don't overuse generics for strong typing and to avoid boxing.  Type parameters are limited to single-character names, so use them wisely.
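
To make the boxing difference concrete, this sketch sums the same numbers from an ArrayList and from a generic List<int>.  GenericsDemo and its methods are illustrative names only.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class GenericsDemo
{
    // ArrayList stores object references, so every int added was boxed
    // and every read needs a cast that unboxes it.
    public static int SumBoxed(ArrayList items)
    {
        int total = 0;
        for (int i = 0; i < items.Count; i++)
        {
            total += (int)items[i];
        }
        return total;
    }

    // List<int> stores the ints themselves: no boxing, no casts.
    public static int SumGeneric(List<int> items)
    {
        int total = 0;
        for (int i = 0; i < items.Count; i++)
        {
            total += items[i];
        }
        return total;
    }
}
```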

 

In VS2005 there is better GC Pinning to cause less fragmentation.  Think about your pinning strategy early.

 

 

Memory

Don't consider only strings; also consider objects that contain string variables and other reference types.

 

Remember the constructors that take an initial value and a size, such as StringBuilder(string value, int capacity).

 

Avoid repeated string allocations from += concatenation, Split, Trim, ToLower, etc.  Use StringBuilder.  Remember that string constants are already interned, so concatenating constants involves no runtime allocation and is faster than allocating a StringBuilder.
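
A sketch of replacing += in a loop with a presized StringBuilder, using the StringBuilder(string value, int capacity) constructor.  JoinDemo and Join are hypothetical names.

```csharp
using System;
using System.Text;

public static class JoinDemo
{
    // Builds prefix + all parts in one buffer instead of allocating a new
    // string on every += step.
    public static string Join(string prefix, string[] parts)
    {
        // First pass: work out the final length so the buffer never regrows.
        int capacity = prefix.Length;
        for (int i = 0; i < parts.Length; i++)
        {
            capacity += parts[i].Length;
        }

        StringBuilder builder = new StringBuilder(prefix, capacity);
        for (int i = 0; i < parts.Length; i++)
        {
            builder.Append(parts[i]);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Join("items:", new string[] { "a", "b", "c" }));
    }
}
```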

 

String.Format builds its result with a StringBuilder and StringBuilder.AppendFormat() internally, so remember that every call carries that cost.  System.Text is a namespace inside mscorlib rather than a separate assembly, so no extra assembly load is involved.

 

Look out for implicit boxing.
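
A short illustration of how easy boxing is to miss: assigning an int to an object variable compiles without any cast, yet each assignment allocates a new box on the heap.  BoxingDemo is a made-up name.

```csharp
using System;

public static class BoxingDemo
{
    // Returns true when two boxes made from the same int are separate heap objects.
    public static bool BoxesAreDistinct(int n)
    {
        object first = n;  // implicit boxing: allocates a heap object
        object second = n; // boxes again: a second allocation
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreDistinct(42));
    }
}
```

The same hidden allocation happens whenever a value type is passed to a parameter of type object or stored in a non-generic collection.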

 

Use jagged arrays over multi-dimensional arrays.  Easier GC cleanup with fewer roots?
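
For reference, this is the difference in shape between the two kinds of array: a jagged array is an array of ordinary one-dimensional arrays (int[][]), while a multi-dimensional array is a single int[,] object.  GridDemo and its methods are illustrative names.

```csharp
using System;

public static class GridDemo
{
    // Jagged: each row is its own one-dimensional array.
    public static int[][] CreateJagged(int rows, int cols)
    {
        int[][] grid = new int[rows][];
        for (int r = 0; r < grid.Length; r++)
        {
            grid[r] = new int[cols];
        }
        return grid;
    }

    public static int SumJagged(int[][] grid)
    {
        int total = 0;
        for (int r = 0; r < grid.Length; r++)
        {
            int[] row = grid[r]; // plain one-dimensional access from here on
            for (int c = 0; c < row.Length; c++)
            {
                total += row[c];
            }
        }
        return total;
    }

    // Multi-dimensional equivalent, shown for comparison.
    public static int SumRectangular(int[,] grid)
    {
        int total = 0;
        for (int r = 0; r < grid.GetLength(0); r++)
        {
            for (int c = 0; c < grid.GetLength(1); c++)
            {
                total += grid[r, c];
            }
        }
        return total;
    }
}
```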

 

Allocate large objects early.  Clean up as soon as possible by setting references to null or letting them go out of scope.

 

Running += in a tight loop can cause GC churning especially with strings and objects that contain strings.

 

If seriously reusing in an object oriented world, then make the objects smaller and short-lived or in

 

When you are confident of the count, preallocate through the constructor, for example with a Hashtable.  Consider making two passes to learn the count and avoid reallocation.  Ex:  How many images are in an HTML DOM?
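
A sketch of the two-pass idea: count first, then construct the collection with that capacity.  It uses the generic Dictionary<int, string> rather than Hashtable to avoid boxing the keys; Hashtable has a capacity constructor that works the same way.  ImageIndex and the flat array of tag names are assumptions standing in for a real HTML DOM.

```csharp
using System;
using System.Collections.Generic;

public static class ImageIndex
{
    // Maps the position of every "img" tag to its name.
    public static Dictionary<int, string> Build(string[] tagNames)
    {
        // Pass 1: count, so the dictionary can be sized exactly once.
        int count = 0;
        for (int i = 0; i < tagNames.Length; i++)
        {
            if (tagNames[i] == "img") count++;
        }

        // Pass 2: fill a dictionary that never needs to grow.
        Dictionary<int, string> images = new Dictionary<int, string>(count);
        for (int i = 0; i < tagNames.Length; i++)
        {
            if (tagNames[i] == "img") images.Add(i, tagNames[i]);
        }
        return images;
    }
}
```

Whether the extra pass pays off depends on how expensive the walk is compared with the reallocations it saves, so measure both.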

 

Avoid pre-allocating chunks of memory just to ensure that memory is available.  Managed heap allocation is fast.  If pre-allocation is preferred, do it seldom and as early as possible, so that the memory gets promoted sooner and generation 0 gets optimized sooner.

 

Avoid calling GC.Collect.  It resets optimization statistics that will need to be accumulated again over time, and it causes CPU load.  If things aren't closing or cleaning up properly, focus on the actual problem.

 

Consider generation 1 objects that get compacted with memcpy.  Using large short-lived objects causes lots of copies.

 

Setting references to null (Nothing in VB) trims the object reference graph, which makes the object eligible for cleanup on the next GC.

 

Use weak references for medium or large cached objects.  Under memory pressure, weakly referenced objects will be cleaned up.  (See Jeffrey Richter's MSDN article.)
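
A minimal sketch of a weak-reference cache for one large object.  WeakCache, GetData, and the 64 KB byte array are all placeholders for whatever is expensive to rebuild in a real application.

```csharp
using System;

public class WeakCache
{
    private WeakReference cached; // does not keep the data alive by itself
    private int loads;

    public int Loads
    {
        get { return loads; }
    }

    public byte[] GetData()
    {
        // Copy Target into a strong local first; the GC cannot collect the
        // data while this local is in use.
        byte[] data = cached != null ? (byte[])cached.Target : null;
        if (data == null)
        {
            data = Load();                     // collected or never loaded
            cached = new WeakReference(data);
        }
        return data;
    }

    // Placeholder for an expensive load (file, database, network).
    private byte[] Load()
    {
        loads++;
        return new byte[64 * 1024];
    }
}
```

As long as a caller still holds the returned array, later calls reuse it; once nobody does, the GC is free to reclaim it and the next call reloads.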

 

Pin objects only when needed.

 

Use object pools.
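
A bare-bones object pool for fixed-size buffers, as one possible shape for this tip.  BufferPool, Rent, and Return are hypothetical names, not a framework API.

```csharp
using System;
using System.Collections.Generic;

public class BufferPool
{
    private readonly Stack<byte[]> free = new Stack<byte[]>();
    private readonly int size;

    public BufferPool(int size)
    {
        this.size = size;
    }

    // Hands out a pooled buffer when one is available, otherwise allocates.
    public byte[] Rent()
    {
        lock (free)
        {
            if (free.Count > 0) return free.Pop();
        }
        return new byte[size];
    }

    // Takes a buffer back for reuse; buffers of the wrong size are ignored.
    public void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != size) return;
        Array.Clear(buffer, 0, buffer.Length); // do not leak old contents
        lock (free)
        {
            free.Push(buffer);
        }
    }
}
```

Pooled objects are long lived by design, so this suits objects that are costly to create and used constantly, not small throwaway ones.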

 

Use short-lived objects when possible for less memory consumption.  Avoid useless long-lived large objects.  They increase the memory pressure and cause more GC churning.

 

Keep properties simple and small so that they can be inlined by the JIT compiler.

 

Keep custom value types such as structures to about 16 bytes or less, because they will be on the stack and are copied by value.

 

Strongly typed arrays are fast because their size is fixed and they carry no extra functionality such as sort, find, or type reflection.

 

Application domains communicate through the remoting architecture, much like IPC, so creating and using objects across domains may be slower because of the data marshalling.

 

Avoid too many static members and methods.  Static data takes up memory until the assembly is unloaded, and it gets initialized on startup.

 

 

Cleanup

Avoid implementing a Finalize method unless resources need to be cleaned up.  Finalizable objects need extra setup time when they are created, because a pointer to each one has to be registered.  They are also promoted and live longer, until the finalizer has run, and the GC takes an extra step for finalizable types.

 

Consider which objects are roots and which objects are reachable from them.  Under memory pressure, a chain of objects that remains reachable will be promoted by the GC.  Consider a lightweight, short-lived version without a finalizer.

 

Implement Finalize in leaf objects rather than in the root, if possible, to avoid promoting all of the objects in the graph.

 

Remember the 2 second / 40 second rule at shutdown: each finalizer gets about 2 seconds to finish, and all finalizers together get about 40 seconds, or the shutdown proceeds without them.

 

Call GC.SuppressFinalize to avoid the finalization / freachable queue step of the GC for faster cleanup.  Keep finalizers simple, such as closing a handle.

 

A type that implements Finalize should also implement the Dispose pattern, but a type with Dispose does not necessarily need Finalize.  Allow multiple Dispose calls, and don't throw an exception if the object was already disposed.

 

Call Dispose on the base class from the derived class to ensure faster cleanup, in case the base class needs more cleanup of its own or calls SuppressFinalize.

 

Use the Dispose pattern alongside finalization to give the developer control over cleanup.  Create a Close method when possible.
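
The Dispose tips in this section can be put together roughly as follows.  ResourceHolder, DerivedHolder, and CleanupCount are invented for illustration, and the actual resource release is left as comments.

```csharp
using System;

public class ResourceHolder : IDisposable
{
    private bool disposed;
    public int CleanupCount; // how many times cleanup actually ran

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // skip the finalization step in the GC
    }

    // Close is just a friendlier name for Dispose.
    public void Close()
    {
        Dispose();
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return; // repeated calls are allowed and do nothing
        disposed = true;
        // if (disposing) { dispose other managed objects here }
        // release unmanaged resources here, such as closing a handle
        CleanupCount++;
    }

    // Safety net in case Dispose was never called; keep it simple.
    ~ResourceHolder()
    {
        Dispose(false);
    }
}

public class DerivedHolder : ResourceHolder
{
    private bool derivedDisposed;
    public int DerivedCleanupCount;

    protected override void Dispose(bool disposing)
    {
        if (!derivedDisposed)
        {
            derivedDisposed = true;
            DerivedCleanupCount++; // release the derived class's resources here
        }
        base.Dispose(disposing);   // always let the base class clean up too
    }
}
```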

 

In C#, use the using statement: the compiler generates the try/finally in IL for you and ensures that Dispose is called.
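
A small sketch showing that using calls Dispose even when the body throws.  Tracker and UsingDemo are hypothetical names.

```csharp
using System;

public sealed class Tracker : IDisposable
{
    public bool Disposed;

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class UsingDemo
{
    // Returns true if Dispose ran although the using body threw.
    public static bool DisposedEvenOnThrow()
    {
        Tracker tracker = new Tracker();
        try
        {
            using (tracker)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // swallowed only so the demo can inspect the tracker afterwards
        }
        return tracker.Disposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedEvenOnThrow());
    }
}
```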

 
