Performance in .NET – Part 1

Updated: thanks, Paulo Morgado!

Updated: see the second post here and the third here.

Introduction

Over the years I have written a couple of posts about performance in the .NET world. Some were tied to specific frameworks, such as NHibernate or Entity Framework, while others focused on the generic bits. In this series of posts I will summarize my findings on .NET in general, namely:

  • Object creation (this post)
  • Object cloning
  • Value Types versus Reference Types
  • Collections
  • Possibly other stuff

I won’t be talking about object serialization, as there are lots of serializers out there, each with its pros and cons. In general, I’d say serializing to and from JSON or a binary format are the most in-demand options, and each has quite a few implementations, either provided by Microsoft or by third parties. The actual usage also affects what we want: is it a general-purpose serializer, or one for a particular usage that needs classes prepared accordingly? Let’s keep it out of this discussion.

As always, feel free to reach out to me if you want to discuss any of these! So, let’s start with object creation.

Object Creation

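Let’s start by defining our purpose: we want to be able to create object instances of a certain type as fast as possible. We have a couple of strategies:

  • Using the new operator
  • Using reflection
  • Using FormatterServices.GetUninitializedObject
  • Using System.Reflection.Emit code generation
  • Using Activator.CreateInstance
  • Using LINQ expressions
  • Using delegates
  • Using Roslyn

Let’s cover them all one by one.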

Using the new Operator

This is the most obvious (and fastest) approach, but it does not play well with dynamic instantiation: the type to instantiate needs to be hardcoded. I call it direct instantiation, and it goes like this (you know, you know…):

var obj = new Xpto();

This should be the baseline for all performance comparisons, as it should offer the best possible performance.

Using Reflection

Here I’m caching the public parameterless constructor and invoking it, then casting the result to the target type:

var ci = typeof(Xpto).GetConstructor(Type.EmptyTypes);
var obj = ci.Invoke(null) as Xpto;

Just avoid getting the constructor over and over again, do it once for each type then cache it somewhere.
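The caching advice above can be sketched roughly as follows. This is only an illustration, not the code I measured: the ConstructorCache class and its Create method are hypothetical names, and the empty Xpto class is a stand-in for your own type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical helper: looks up the public parameterless constructor
// once per type and reuses the cached ConstructorInfo afterwards.
public static class ConstructorCache
{
    private static readonly ConcurrentDictionary<Type, ConstructorInfo> Cache =
        new ConcurrentDictionary<Type, ConstructorInfo>();

    public static object Create(Type type)
    {
        // GetOrAdd only runs the lookup when the type is not cached yet.
        var ci = Cache.GetOrAdd(type, t =>
            t.GetConstructor(Type.EmptyTypes)
            ?? throw new MissingMethodException(t.FullName, ".ctor"));

        return ci.Invoke(null);
    }
}

// Stand-in for the type used throughout this post.
public class Xpto { }
```

Usage would then be something like var obj = ConstructorCache.Create(typeof(Xpto)) as Xpto; with the reflection lookup paid only on the first call for each type.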

Using FormatterServices.GetUninitializedObject

The GetUninitializedObject method is used internally by some serializers. It merely allocates memory for the target type and zeroes all of its fields, without running any constructor. As a consequence, field and property initializers do not run either, so use it with care. It is available in .NET Core:

var obj = FormatterServices.GetUninitializedObject(typeof(Xpto)) as Xpto;

Note that none of the constructors of your type are executed, and no fields or properties get their initial values; each is left at the default value for its type (null for reference types, zero or the equivalent for value types).
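To make the effect concrete, here is a small sketch. The members of this Xpto class (Number, Name, ConstructorRan) and the UninitializedDemo helper are made up for the illustration. The pragma is there because recent .NET versions flag FormatterServices as obsolete and point to RuntimeHelpers.GetUninitializedObject instead.

```csharp
using System;
using System.Runtime.Serialization;

// Stand-in type; all members are hypothetical and only serve the demo.
public class Xpto
{
    public int Number = 42;                  // field initializer
    public string Name { get; set; } = "x";  // property initializer
    public bool ConstructorRan;

    public Xpto() { ConstructorRan = true; }
}

public static class UninitializedDemo
{
    public static Xpto Create()
    {
        // Allocates and zeroes the object; no constructor or initializer runs.
#pragma warning disable SYSLIB0050 // FormatterServices is marked obsolete on recent .NET versions
        return (Xpto)FormatterServices.GetUninitializedObject(typeof(Xpto));
#pragma warning restore SYSLIB0050
    }
}
```

An instance obtained from UninitializedDemo.Create() has Number equal to 0, Name equal to null and ConstructorRan equal to false, whereas new Xpto() yields 42, "x" and true.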

Using System.Reflection.Emit code generation

This one uses the code generation library that is built into .NET (DynamicMethod is also available in current versions of .NET Core):

var m = new DynamicMethod(string.Empty, typeof(object), null, typeof(Xpto), true);
var ci = typeof(Xpto).GetConstructor(Type.EmptyTypes);
var il = m.GetILGenerator();
il.Emit(OpCodes.Newobj, ci);
il.Emit(OpCodes.Ret);
var creator = m.CreateDelegate(typeof(Func<object>)) as Func<object>;
var obj = creator() as Xpto;

As you can see, we are just generating code for a dynamic method, providing a simple body that does “new Xpto()”, and executing it.

Using Activator.CreateInstance

This is essentially a wrapper around the reflection code I’ve shown earlier, with the drawback that it does not cache each type's public parameterless constructor:

var obj = Activator.CreateInstance(typeof(Xpto)) as Xpto;

Using LINQ expressions

The major drawback of this approach is the time it takes to build the actual code (the first call to Compile). After that, it should be fast:

var ci = typeof(Xpto).GetConstructor(Type.EmptyTypes);
var expr = Expression.New(ci);
var del = Expression.Lambda(expr).Compile();
var obj = del.DynamicInvoke() as Xpto;

Of course, if you are going to call this a number of times for the same type, it is worth caching the constructor, and better yet the compiled delegate, for each type.
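One way to do that caching is to compile a strongly typed Func<object> once per type and keep it in a dictionary, which also avoids DynamicInvoke. Take this as a sketch under my own assumptions rather than the code behind the numbers in this post; ExpressionFactory and its Create method are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

// Hypothetical helper: compiles "() => (object)new T()" once per type
// and reuses the compiled delegate on later calls.
public static class ExpressionFactory
{
    private static readonly ConcurrentDictionary<Type, Func<object>> Cache =
        new ConcurrentDictionary<Type, Func<object>>();

    public static object Create(Type type)
    {
        var factory = Cache.GetOrAdd(type, t =>
        {
            var ci = t.GetConstructor(Type.EmptyTypes)
                ?? throw new MissingMethodException(t.FullName, ".ctor");

            // Convert to object so a single delegate type fits every target type.
            var body = Expression.Convert(Expression.New(ci), typeof(object));

            // Compile is the expensive step; it runs once per type.
            return Expression.Lambda<Func<object>>(body).Compile();
        });

        return factory();
    }
}

// Stand-in for the type used throughout this post.
public class Xpto { }
```

Calling ExpressionFactory.Create(typeof(Xpto)) pays the compilation cost on the first call only; subsequent calls are a dictionary lookup plus a plain delegate invocation.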

Using Delegates

The LINQ expressions approach actually compiles to this one, but this is strongly typed:

Func<Xpto> del = () => new Xpto();
var obj = del();

Using Roslyn

This one is relatively new in .NET. As you may know, Microsoft now uses Roslyn to both parse and generate code dynamically. The scripting capabilities are made available through the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package. The actual code for instantiating a class (or actually executing any code) dynamically goes like this:

var obj = CSharpScript.EvaluateAsync("new Xpto()").GetAwaiter().GetResult() as Xpto;

Do keep in mind that the Roslyn scripting API is asynchronous by nature, so you need to wait for the result. Also, do add the full namespace of your type, which I omitted for brevity. There are other APIs that allow you to compile code and reuse the compilation:

var script = CSharpScript.Create<Xpto>("new Xpto()", ScriptOptions.Default.AddReferences(typeof(Xpto).Assembly));
var runner = script.CreateDelegate();
var obj = runner().GetAwaiter().GetResult();

Conclusion

Feel free to run your own tests, with a few iterations, and look at the results. Always compare with the normal way to create objects, the new operator. Do not forget the trade-offs of each approach, like the need to cache something or any limitations on the instantiated object.
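If you want a starting point for such tests, a minimal Stopwatch-based harness could look like the sketch below. It is not the exact harness behind my numbers: the Measure class, its AverageTicks method and the default iteration counts are assumptions made for the illustration, and a dedicated benchmarking library will give more reliable results for micro-benchmarks like these.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: averages Stopwatch ticks over a number of calls.
public static class Measure
{
    public static double AverageTicks(Func<object> create, int iterations = 1000, int warmup = 100)
    {
        // Warm-up calls so JIT compilation and one-time caching are not measured.
        for (var i = 0; i < warmup; i++) create();

        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++) create();
        sw.Stop();

        // Average elapsed ticks per created instance.
        return (double)sw.ElapsedTicks / iterations;
    }
}

// Stand-in for the type used throughout this post.
public class Xpto { }
```

For example, Measure.AverageTicks(() => new Xpto()) gives the baseline, and passing a delegate that wraps any of the other techniques gives a figure to compare against it.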

On my machine, for 1000 iterations, repeated a couple of times in the same run, I get these average results (elapsed ticks):

Technique                                   Average elapsed ticks
Direct                                      0.148
FormatterServices.GetUninitializedObject    0.324
Activator.CreateInstance                    0.296
Reflection                                  0.6
IL                                          0.557
LINQ Expression                             4.085
Delegate                                    0.109
Roslyn                                      2400.796

Some of these may be surprising to you, as they were to me! It seems that reflection is not as much slower than direct instantiation as one might think… hmmm…

As usual, I’d love to hear your thoughts on this! More to come soon! ;-)

                             

13 Comments

  • It is so surprising that instantiating by delegates is faster than the new operator!!!

  • I don't see a reason the delegate method wouldn't be compiled to direct initialization using new... That would also explain the fact that it takes about the same time as using new. It also has no benefits over using new since you have to import the type explicitly, removing all 'dynamic' advantages mentioned in the beginning of the post.

  • Also for critically intensive operations IMHO it might be worth considering re-using objects from a pre-allocated pool if that's a feasible option. This saves both the allocation time and the garbage collection cost. The downside is the need for a clean "GetFromPool / ReturnToPool" system but this can yield surprisingly large performance gains.

  • VOVA: you're probably right, and the results are here to show it! :-)
    D.C.Elington: Indeed, in fact .NET Core even includes an object pool implementation - maybe something for a future post.

  • Hello, you're not giving any insights into how you measured this. Can you clarify that a bit?

  • Hi, Alex!
    Sure; one method for each test (Roslyn, direct, etc.). For those that benefit from caching (e.g., reflection), I do the caching before running the 1000 iterations, where I just build an instance. Then I run the entire thing 3 times and take the last results. Paulo Morgado rightfully reminded me that the first results may be misleading.
    Have you experienced different results?

  • > Some of these may be surprising to you, as they were to me!

    There's no way delegate invocation is faster than direct creation. I would question the reliability of the benchmark itself; micro-benchmarks are hard to get right. Have you tried using BenchmarkDotNet? It might give you very different results.
    Also, some of the approaches have a "warmup" phase (Roslyn, Linq Expressions, IL), which should be measured separately.

  • Thomas: I did run a warmup. But, by all means, do try them yourself!

  • I took the liberty to run those benchmarks using BenchmarkDotNet and the results are very different.
    https://github.com/lusocoding/dotnet-objectcreation-benchmarks/blob/master/NetCoreBenchObjCreation.Benches-report-github.md
    You'll find everything in there, together with the source code.

  • Nuno: thanks, but you only covered two methods, will the other follow?

  • Ricardo: I was able to do it just now. All test results can be found using the same address. Roslyn strategy however is giving an error. Will look into it when I have a chance

  • also you could add https://github.com/dadhi/FastExpressionCompiler

  • Nuno: I don't think your benchmarks are doing the same thing. For instance, in the reflection setup you are just running both lines above in each benchmark, whereas I'm fairly certain, based on this statement

    "Just avoid getting the constructor over and over again, do it once for each type then cache it somewhere."

    that in Ricardo's tests he was storing the reference to the reflection object during the initialisation and then his benchmarks were just testing the creation of the object.

    EG:


    private ConstructorInfo ci;

    [GlobalSetup]
    public void Setup()
    {
        ci = typeof(Foo).GetConstructor(Type.EmptyTypes);
    }

    [Benchmark(Baseline = false, Description = "Using Reflection")]
    public void Reflection()
    {
        var obj = ci.Invoke(null) as Foo;
    }

    Without having to find the constructor each time, it will be pretty close to using the new operator.
