March 2005 - Posts

Remoting Comparisons Part 2

Here is another unofficial remoting comparison.

 

Three classes involved:

  • RemotingClassA - Returned from the TCP remoting service with the sole purpose of returning a useful instance of MBVClass1.
  • RemotingClassB - Remote object hosted by IIS that retrieves a RemotingClassA instance to finally return an MBVClass1.
  • MBVClass1 - ISerializable class whose data must be marshaled by value for local use, without a proxy back to a server behind a firewall.  The client has the concrete class so it can use this object.

 

The main goal is to retrieve an MBVClass1 from a Singleton or SingleCall TCP remoting service.  I need these baseline numbers to choose the fastest approach for VPN and Internet access, excluding factors such as DB interactions, multiple processors, slow networks, etc.  The design goal is to write the TCP remoting service once and have a web interface retrieve objects from that service (scalability issues aside).

 

The "Client <-> WS <-> TCP" method contacts a stateless web service, which creates and returns an MBVClass1 with the SOAP formatter.  This means that the client application would not use remoting, only a web service interface.  This is simple for the client and limits version changes to the web service and the internal TCP service.  Using the web service also brings the advantage of plugin technologies such as compression, encryption, and authentication as Web services evolve.

 

The “Client <-> IIS <-> TCP” method is a remoting object hosted in IIS, RemotingClassB, which connects to the internal TCP remoting service to create a RemotingClassA that returns the MBVClass1.  This must be a second class, RemotingClassB, because the proxy that is created for RemotingClassA must be available on the IIS host, which acts as the client of the TCP service.  The actual RemotingClassA object lives in the TCP remoting service.  The end client can instead use the proxy for RemotingClassB to get the MBVClass1 that resides in the IIS hosted object.

 

The following are call counts in a 5 second period.  The IIS remoting solution uses the binary formatter after retrieving the object from the TCP remoting service.  Both web solutions, the IIS hosted object and the Web service, incurred a JIT compile delay and also completed fewer calls when the Web server was under load.  Compression was also tested using the latest ICSharpCode.SharpZipLib assembly.
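A measurement like this, counting how many calls complete inside a fixed window, can be sketched roughly as follows.  This is not the harness behind the numbers in this post: CallCounter and CountCalls are hypothetical names, the warm-up call is an assumption about how to keep a first-call JIT delay out of the count, and the Action delegate is newer than the C# available in 2005.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not the harness used for the tables in this post.
public static class CallCounter
{
    // Invokes 'call' repeatedly until 'window' has elapsed and
    // returns how many invocations completed.
    public static long CountCalls(Action call, TimeSpan window)
    {
        call(); // warm-up call so a first-call JIT delay is not counted

        long count = 0;
        Stopwatch timer = Stopwatch.StartNew();
        while (timer.Elapsed < window)
        {
            call();
            count++;
        }
        return count;
    }
}
```

Passing a delegate that fetches an MBVClass1 through each path, with TimeSpan.FromSeconds(5) as the window, would give one cell of a table like the ones below.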

 

Localhost

Method              Return     Return Object  Return Object  Return Object  Return Object w/1MB     Return Object w/1MB
                    Object     w/10K String   w/100K String  w/1MB String   String, Max             String, Max
                                                                            Compression Speed       Compression Size
                                                                            (5844 Chars returned)   (1324 Chars returned)
------------------  ---------  -------------  -------------  -------------  ----------------------  ----------------------
Local               7,300,000  325,000        7,400          1,100          124                     2
Client - TCP        5,550      3,130          670            75             170                     1
Client - IIS - TCP  685        520            205            32             137                     3
Client - WS - TCP   710        560            205            32             138                     3

 

Network Retrieval

Method              Return     Return Object  Return Object  Return Object  Return Object w/1MB     Return Object w/1MB
                    Object     w/10K String   w/100K String  w/1MB String   String, Max             String, Max
                                                                            Compression Speed       Compression Size
                                                                            (5844 Chars returned)   (1324 Chars returned)
------------------  ---------  -------------  -------------  -------------  ----------------------  ----------------------
Client - TCP        2,750      1,465          250            30             54                      2
Client - IIS - TCP  310        275            40             10             48                      2
Client - WS - TCP   390        300            20             7              48                      2

 

Understanding how things work in the CLR such as the JIT compiler and the GC is imperative when evaluating choices like these.

 

Pay for play as we say.

 

Posted by vblasberg with 2 comment(s)

Remoting Comparisons

Here is a remoting comparison with some interesting results.  The numbers represent the call count after 5 seconds, measured after an initial hit to JIT compile the Web service and IIS application.  The simple object that was returned contained only a name and a price property.  The TCP remoting service's well-known type was a Singleton, but SingleCall gave about the same performance for the IIS to TCP and WS to TCP object retrieval.

Since a standard web service is required to use SOAP (without extensions), the average of about 400 calls is understandable.  The Web service count was unstable, often dipping to 10, 75, or 200 instead of the normal 400.  Adding state in the Web service to store the singleton SAO object from the TCP remote service didn’t help any.

It looks like the fastest and most reliable Internet access would be a simple pass-through class in a virtual directory, using the HTTP channel with the binary formatter to get an object from the internal TCP service.  VPN clients should go directly to the TCP service for best performance.

Remoting Method             Method Returns a Double              Method Returns an Object on Localhost  Method Returns an Object Across a Fast Network
--------------------------  -----------------------------------  -------------------------------------  ----------------------------------------------
Local Object (no remoting)  36,800,000                           17,000,000                             -
TCP                         11,000 Singleton / 8,900 SingleCall  6,100                                  3,000
Client to IIS Class to TCP  1,200 SOAP / 1,250 Binary            1,020                                  650
Client to WS to TCP         750                                  680                                    10 to 400 w/ & w/o state

 

Posted by vblasberg with 2 comment(s)

Notes on DVD Media Quality for Videos

After a few days of media trial and error, here are some DVD media notes that may save someone else some video editing and burning time.  The final DVD project was grainy, so I suppose other factors are in play, but a video deadline is when you find out how the DVD media is going to treat you.  Several methods were used to burn the DVD, but I ended up using Pinnacle with Verbatim media and had no failures at all.

Project:          106 minutes compiled with Pinnacle 8.  Usually exceptional quality but this one was a bit MPEG grainy.
Media:           DVD-R
DVD Player:  One unit with Progressive scan and one without.

DVD Brand and Quality Note
________________________
Memorex (4x purchased two months ago - unavailable now) - 100% Reliable, never pauses.  The best you can get.
Memorex (8x - the only Memorex -R available) -  4 out of 5 would not play at all on either progressive scan or non-PS.
Sony - Pauses during play on non-PS.
Fuji - Pauses during play on non-PS.
Great Quality - No quality.  Always pauses, and reads slowly even with simple data backups.
Platinum - Least expensive and 95% reliable.
Verbatim - Least expensive and 100% reliable.

Time is money when we use bad media for training videos or other small DVD based projects.

Your mileage may vary.

 

Posted by vblasberg with 1 comment(s)

CLR Development Resource List

This is the list of resources from my CLR Internals talk last week at the Dallas C# SIG.  There are so many great CLR articles and webcasts, but these are my favorites and may come in handy for someone else.  The oldie but goodie is on the Dr. Dobb's site: Don Box discussing the two main CLR DLLs.  I'm sure that it won't be up there forever.

Don Box Webcast on How the CLR Works
http://technetcast.ddj.com/tnc_play_stream.html?stream_id=605

Profiling

Lutz Roeder's Reflector for .NET
http://www.aisto.com/roeder/dotnet/

ANTS Profiler
http://www.red-gate.com/

CLR Profiler
http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A-94635BEEBDDA&displaylang=en

http://msdn.microsoft.com/msdntv/episode.aspx?xml=episodes/en/20030729CLRGN/manifest.xml

Optimized Development

Writing Faster Managed Code: Know What Things Cost (CLR Profiler)
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/fastmanagedcode.asp

Improving .NET Application Performance and Scalability
http://msdn.microsoft.com/library/en-us/dnpag/html/scalenet.asp

CLR Information And CLR Performance
http://www.gotdotnet.com/team/clr/about_clr.aspx
http://www.gotdotnet.com/team/clr/about_clr_performance.aspx

Community

CLR Newsgroup
http://msdn.microsoft.com/newsgroups/default.aspx?dg=microsoft.public.dotnet.framework.clr

CLR Team Blogs with Tons of Tidbits
http://msdn.microsoft.com/netframework/community/blogs/default.aspx

 

Webcasts and Articles

Don Box - Migrating Native Code to the .NET CLR
http://msdn.microsoft.com/msdnmag/issues/01/05/com/

Common Type System
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconthecommontypesystem.asp

Performance Considerations for Run-Time Technologies in the .NET Framework
http://msdn.microsoft.com/library/en-us/dndotnet/html/dotnetperftechs.asp

Performance Tips and Tricks in .NET Applications
http://msdn.microsoft.com/library/en-us/dndotnet/html/dotnetperftips.asp

Other

Online IL Development Book
http://vijaymukhi.com/documents/books/ilbook/contents.htm

Project imago (link via Lutz Roeder)
http://www.aisto.com/roeder/imago/

Books to Consume Large Amounts of Caffeine With

Applied .NET Framework Programming
by Jeffrey Richter

Essential .NET, Volume I: The Common Language Runtime
by Don Box

Posted by vblasberg with 2 comment(s)

Preferred CLR Development Techniques

Here's a list of some preferred CLR development techniques that come from my CLR Internals talk last week at the Dallas C# SIG.  Most of these are collected from Jeffrey Richter's book and from webcasts by Microsoft sources such as Rico Mariani and Gregor Noriskin.  Thanks to them of course for their work.


 

Initialization

Consider the size of the assemblies being loaded on startup.  For typical applications, loading a few large assemblies is preferred over many small assemblies to avoid the overhead of assembly finding, loading, and initialization.

 

Small assemblies have their place in plugin architectures and Internet-based applications.  Internet downloaded assemblies should trickle to the user as needed, or an installation should be provided.

 

Consider on-demand assembly loading and late assembly loading using reflection with an object variable.  The JIT will prevent this and possibly counter the pre-JITted x86 code benefit.

 

There is no version checking for assemblies without strong names, so to avoid version checking, don’t strong name an assembly that will always be private.

 

Preload forms on another thread to a cache such as a hashtable.

 

Rebase your assemblies after studying your memory footprints.

 

 

Operation

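foreach must obtain an enumerator from the collection; in hot paths, use for instead so the CLR can optimize the loop.  Don't cache the size in a local and then go into the loop, because the CLR will not be able to optimize it; test against the array's Length directly.  A nested foreach creates a new enumerator each time the outer loop runs.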
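As a minimal sketch of the loop shape this tip points at, the following sums an array with a plain for loop that compares the index against values.Length directly.  LoopSketch and Sum are hypothetical names used only for illustration, and whether the bounds check is actually removed depends on the JIT version, so measure before relying on it.

```csharp
// Hypothetical example class, for illustration only.
public static class LoopSketch
{
    public static long Sum(int[] values)
    {
        long total = 0;

        // The index is tested against values.Length itself rather than a
        // cached copy, which is the pattern the JIT can recognise when it
        // decides whether per-element bounds checks are needed.
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }
}
```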

 

Rewrite any commonly used components when possible to avoid PInvoke (10 IL instructions), COM calls (13 IL instructions), or reflection (metadata and reflection objects) in critical paths.

 

Do your homework: measure and test with the CLR Profiler, and watch the performance counters, especially % Time in GC.

 

Use late binding and reflection as little as possible.  In VB, enforce early binding with Option Explicit On and Option Strict On.

 

Use chunky rather than chatty calls, especially when performing COM calls, where every call takes a marshalling hit.

 

Make COM calls isomorphic by using primitive types or strongly typed arrays.  COM objects built for Unicode are preferred so that strings marshal Unicode to Unicode instead of Unicode to ANSI.

 

Use NGen to improve startup time, for example for small utilities.  NGen counteracts some optimizations such as cross-assembly inlining and delay loading.  NGen code may run slower because the tables it creates, such as vtables, may not be as optimal as JIT-compiled ones for better processors and the L1 / L2 hardware cache.  If the NGen image is found to be incompatible or corrupt, the runtime falls back to JIT anyway.

 

In the current version, code that gets JIT compiled is not shared across processes.  If you have a component that will be loaded into many processes, consider pre-JITting with NGen to theoretically share the native code as much as possible.

 

Strong named assemblies that are not in the GAC must have their strong name verified repeatedly.  Place them in the GAC to avoid this repetition.

 

Don't use exceptions for normal flow control.  Use try-finally often.  Only catch exceptions if you can add some value or enhance the user experience.

 

Throw fewer exceptions.
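One common way to throw fewer exceptions is to use an API that reports failure through its return value instead of throwing.  The sketch below uses int.TryParse in place of int.Parse wrapped in a catch block; ParseSketch and ParseOrDefault are hypothetical names for illustration.

```csharp
// Hypothetical example class, for illustration only.
public static class ParseSketch
{
    // Returns the parsed number, or 'fallback' when the text is not a
    // valid integer.  No exception is thrown for bad input, so invalid
    // text stays on the normal control-flow path.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }
}
```

Bad input is an expected case here, not an exceptional one, so handling it with a return value keeps the cost of exception throwing out of the hot path.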

 

Perform local work before transferring across the network.

 

Use asynchronous calls when beneficial, at least for a better user experience.

 

Make use of pluggable channels and formatters, such as the binary formatter instead of SOAP, if the application does not need the interoperability features.

 

 

Threading

Use a ReaderWriterLock when it is beneficial to let many threads read while only one writes.  The ReaderWriterLock comes at a price, so use the plain lock statement most of the time.

 

Use locks only when necessary, and do as little work as possible while holding the lock.

 

Use the built-in ThreadPool.  It puts 25 threads per process per processor at your disposal.  Call the static ThreadPool.QueueUserWorkItem(callback, objectToWorkOn).
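A minimal sketch of queuing a short work item follows.  PoolSketch and SquareOnPool are hypothetical names, and blocking on a ManualResetEvent is done here only so the example can return a result; real code would usually let the caller carry on with other work.

```csharp
using System.Threading;

// Hypothetical example class, for illustration only.
public static class PoolSketch
{
    // Queues a short computation on the shared ThreadPool and waits
    // for it to finish so the result can be returned.
    public static int SquareOnPool(int input)
    {
        int result = 0;
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                int n = (int)state;   // the objectToWorkOn argument
                result = n * n;
                done.Set();           // signal completion to the caller
            }, input);

            done.WaitOne();
        }
        return result;
    }
}
```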

 

Don’t perform long running operations on the ThreadPool.  Its threads are shared with other components in the application.  Instead, create a dedicated thread that you control, with a flag to cancel it.

 

 

Miscellaneous

Use FxCop for performance hints, such as how to test for empty strings and how to avoid building non-callable code.

Include Return statements in VB functions because, without them, a few variables must be created to support returning values.  These variables may prevent JIT optimizations such as inlining.

In VS2005, use but don’t overuse generics for strong typing to avoid boxing.  Type parameters are conventionally given single character names, so use them wisely.

 

In VS2005 there is better GC Pinning to cause less fragmentation.  Think about your pinning strategy early.

 

 

Memory

Don’t just consider strings; also consider objects that contain string variables and other reference types.

 

Remember the initialization constructors and sizes on things such as StringBuilder(string value, int capacity).

 

Avoid repeated string allocations from += concatenation, Split, Trim, ToLower, etc.  Use StringBuilder.  Remember that string constants are already interned and need no runtime allocation, so += on constants is faster than allocating a StringBuilder.
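The two StringBuilder tips above, sizing the builder up front and appending instead of using +=, can be sketched together as follows.  JoinSketch and JoinLines are hypothetical names, and the extra pass to compute the capacity is an assumption that only pays off when the pieces are cheap to measure.

```csharp
using System.Text;

// Hypothetical example class, for illustration only.
public static class JoinSketch
{
    // Joins the lines, each followed by '\n', using one StringBuilder
    // that is sized once instead of growing through repeated += calls.
    public static string JoinLines(string[] lines)
    {
        // First pass: work out the final length so the buffer is
        // allocated once and never has to grow.
        int capacity = 0;
        for (int i = 0; i < lines.Length; i++)
        {
            capacity += lines[i].Length + 1;
        }

        // Second pass: append into the preallocated buffer.
        StringBuilder builder = new StringBuilder(capacity);
        for (int i = 0; i < lines.Length; i++)
        {
            builder.Append(lines[i]).Append('\n');
        }
        return builder.ToString();
    }
}
```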

 

String.Format does its work through System.Text.StringBuilder.AppendFormat(), so remember that each call creates a StringBuilder behind the scenes.

 

Look out for implicit boxing.
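Implicit boxing is easy to miss because nothing in the source says "box".  The sketch below contrasts an ArrayList, where every int is boxed on Add and unboxed on read, with a generic List<int>, which stores the values directly.  BoxingSketch and its methods are hypothetical names for illustration, and List<int> assumes the generics support that arrives with VS2005.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical example class, for illustration only.
public static class BoxingSketch
{
    public static int SumBoxed(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);              // implicit boxing: int becomes object
        }

        int total = 0;
        for (int i = 0; i < list.Count; i++)
        {
            total += (int)list[i];    // explicit unboxing on every read
        }
        return total;
    }

    public static int SumGeneric(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);              // stored as int, no heap allocation per item
        }

        int total = 0;
        for (int i = 0; i < list.Count; i++)
        {
            total += list[i];
        }
        return total;
    }
}
```

Both methods return the same sum; the difference is the per-item heap allocation in the first one, which is the kind of thing that shows up under % Time in GC.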

 

Use jagged arrays over multi-dimensional arrays.  Easier GC cleanup with fewer roots?

 

Allocate large objects early.  Clean up ASAP by setting references to null or letting them go out of scope.

 

Running += in a tight loop can cause GC churning especially with strings and objects that contain strings.

 

If seriously reusing in an object oriented world, then make the objects smaller and short-lived or in

 

Preallocate confident amounts in constructors such as a Hashtable.  Consider making two passes to know the count to avoid reallocation.  Ex:  How many images in an HTML DOM?

 

Avoid pre-allocating chunks of memory just to ensure that memory is available.  Managed heap allocation is fast.  If pre-allocation is preferred, do it seldom and as early as possible so the memory gets promoted sooner and generation 0 gets optimized sooner.

 

Avoid calling GC.Collect.  It resets optimization statistics that will need to be accumulated again over time, and it causes a CPU load.  If things aren't closing or cleaning up properly, focus on the actual problem.

 

Consider generation 1 objects that get compacted with memcpy.  Using large short-lived objects causes lots of copies.

 

Setting object references to null (Nothing in VB) trims the object reference graph, so the objects are cleaned up on the next GC.

 

Use weak references on medium or large cached objects.  Under memory pressure, weakly referenced objects will be cleaned up.  (Jeffrey Richter's MSDN article.)

 

Pin objects only when needed.

 

Use object pools.

 

Use short lived objects when possible for less memory consumption.  Avoid useless long lived large objects, which increase the memory pressure and cause more GC churning.

 

Keep properties simple and small so that they can be inlined by the JIT compiler.

 

Keep custom value types such as structures limited to about 16 bytes or less because they will be on the stack.

 

Strongly typed arrays are fast because their size is fixed and they have no extra functionality such as sort, find, or type reflection.

 

Application domains communicate through the remoting architecture, so creating and using objects across domains may be slower because of the data marshalling.

 

Avoid too many static members and methods.  They take up memory until the assembly is unloaded, and they get initialized on startup.

 

 

Cleanup

Avoid implementing a Finalize method unless resources need to be cleaned up.  Finalizers need pointer setup time at startup, they promote the object to a longer lived generation until the finalizer has run, and finalizable types add an extra GC step.

 

Consider which objects are roots and which have root related objects.  Under memory pressure, if the chain of objects remains, they will be promoted in the GC.  Consider a lightweight, short lived version without a finalizer.

 

Implement finalize in leaf objects and not the root if possible to avoid promoting all of the objects in the graph.

 

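Remember the 2 second / 40 second rule at shutdown: each finalizer gets about 2 seconds, and all finalizers together get about 40 seconds, to finish before the application domain is shut down anyway.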

 

Use GC.SuppressFinalize to skip the finalization / freachable queue step of the GC for faster cleanup.  Keep finalizers simple, such as closing a handle.

 

A type with a Finalize method should also implement the Dispose pattern, but a type with Dispose does not necessarily need Finalize.  Allow multiple Dispose calls, and avoid throwing any exceptions if the object was disposed already.

 

Call Dispose on the base class from the inherited class to ensure faster cleanup, in case the base class needs more cleanup and calls SuppressFinalize.

 

Use the Dispose pattern alongside a finalizer to give the developer control over cleanup.  Provide a Close method when it fits the type.

 

In C#, use the using statement so that the try-finally and the Dispose call are generated in IL for you.
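The cleanup tips above fit together in the usual Dispose pattern, sketched below under a few assumptions: HandleOwner, CleanupRuns, and DisposeSketch are hypothetical names, and the "resource" is only a counter so the behaviour can be observed.  A real type would close a handle where the counter is incremented.

```csharp
using System;

// Hypothetical resource owner, for illustration only.
public class HandleOwner : IDisposable
{
    private bool disposed;
    private int cleanupRuns;

    // How many times the cleanup code actually ran (for observation only).
    public int CleanupRuns
    {
        get { return cleanupRuns; }
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // skip the finalization queue step
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed)
        {
            return;                  // repeated calls are a harmless no-op
        }
        // A real type would release its handle here; keep it simple.
        cleanupRuns++;
        disposed = true;
    }

    // Safety net that only runs if Dispose was never called.
    ~HandleOwner()
    {
        Dispose(false);
    }
}

public static class DisposeSketch
{
    public static int UseAndDispose()
    {
        HandleOwner owner = new HandleOwner();
        using (owner)
        {
            // work with the resource; Dispose runs when the block exits
        }
        owner.Dispose();             // second call does nothing and does not throw
        return owner.CleanupRuns;
    }
}
```

A derived class would override Dispose(bool) and call base.Dispose(disposing) at the end, which is how the base class gets its chance to clean up.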

 

Posted by vblasberg with 3 comment(s)