C# functional programming in-depth (2) Named function and function polymorphism

[LINQ via C# series]

[C# functional programming in-depth series]

Latest version: https://weblogs.asp.net/dixin/functional-csharp-named-function-and-static-instance-extension-method

In C#, the most intuitive functions are method members of class and structure, including static method,  instance method, and extension method, etc. These methods have names at design and are called by name, so they are named functions. Some other method-like members, including static constructor, constructor, finalizer, conversion operator, operator overload, property, indexer, event accessor, are also named functions, with specific name generated by at compiled time. This chapter discusses named functions in C#, how these named functions are defined, and looks into how they work. Method member’s name is available at design time, which some other function members’ name are generated at compile time.

Constructor, static constructor and finalizer

Class and structure can have constructor, static constructor, and finalizer. Constructor can access static and instance members, and is usually used to initialize instance members. Static constructor can only access static members, and is called only once automatically at runtime before the first instance is constructed, or before any static member is accessed. Class can also have finalizer, which usually cleanup unmanaged resources, which is called automatically before the instance is garbage collected at runtime. The following simple type Data is a simple wrapper of a int value:

internal partial class Data
{
    private readonly int value;

    static Data() // Static constructor.
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // .cctor
    }

    internal Data(int value) // Constructor.
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // .ctor
        this.value = value;
    }

    internal int Value
    {
        get { return this.value; }
    }

    ~Data() // Finalizer.
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // Finalize
    }
    // Compiled to:
    // protected override void Finalize()
    // {
    //    try
    //    {
    //        Trace.WriteLine(MethodBase.GetCurrentMethod().Name);
    //    }
    //    finally
    //    {
    //        base.Finalize();
    //    }
    // }
}

Here System.Reflection.MethodBase’s static GetCurrentMethod method returns a System.Reflection.MethodInfo instance to represent the current executing function member. MethodInfo’s Name property returns the actual function name at runtime. The static constructor is compiled to a static method like member, which is parameterless and returns void, and has a special name .cctor (class constructor). The constructor is compiled to an instance method like member, with special name .ctor (constructor). And finalizer is compiled to a protected instance method Finalize, which also calls base type’s Finalize method.

Static method and instance method

Still take above Data type as example. instance method and static method and be defined in the type:

internal partial class Data
{
    internal int InstanceAdd(int value1, int value2)
    {
        return this.value + value1 + value2;
    }

    internal static int StaticAdd(Data @this, int value1, int value2)
    {
        return @this.value + value1 + value2;
    }
}

These 2 method both add a Data instance’s value field with other integers. The difference is, the static method cannot use this keyword to access the Data instance, so a Data instance is passed to the static method as the first parameter. These 2 methods are compiled to different signature, but identical CIL in their bodies:

.method assembly hidebysig instance int32 InstanceAdd (
    int32 value1,
    int32 value2) cil managed 
{
    .maxstack  2
    .locals init ([0] int32 V_0) // Local int variable V_0.
    IL_0000:  nop // Do nothing.
    IL_0001:  ldarg.0 // Load first argument this.
    IL_0002:  ldfld int32 Data::'value' // Load field this.value.
    IL_0007:  ldarg.1 // Load second argument value1.
    IL_0008:  add // Add this.value and value1.
    IL_0009:  ldarg.2 // Load third argument value2.
    IL_000a:  add // Add value2.
    IL_000b:  stloc.0 // Set result to first local variable V_0.
    IL_000c:  br.s IL_000e // Transfer control to IL_000e.
    IL_000e:  ldloc.0 // Load first local variable V_0.
    IL_000f:  ret // Return V_0.
}

.method assembly hidebysig static int32 StaticAdd (
    class Data this,
    int32 value1,
    int32 value2) cil managed 
{
    .maxstack  2
    .locals init ([0] int32 V_0) // Local int variable V_0.
    IL_0000:  nop // Do nothing.
    IL_0001:  ldarg.0 // Load first argument this.
    IL_0002:  ldfld int32 Data::'value' // Load field this.value.
    IL_0007:  ldarg.1 // Load second argument value1.
    IL_0008:  add // Add this.value and value1.
    IL_0009:  ldarg.2 // Load third argument value2.
    IL_000a:  add // Add value2.
    IL_000b:  stloc.0 // Set result to first local variable V_0.
    IL_000c:  br.s IL_000e // Transfer control to IL_000e.
    IL_000e:  ldloc.0 // Load first local variable V_0.
    IL_000f:  ret // Return V_0.
}

So internally, instance method works similarly to static method. The different is, in an instance method, the current instance, which can be referred by this keyword, becomes the first actual argument, the first declared argument from the method signature becomes the second actual argument, the second declared argument becomes the third actual argument, and so on. The similarity of above instance and static methods can be viewed as:

internal int CompiledInstanceAdd(int value1, int value2)
{
    Data arg0 = this;
    int arg1 = value1;
    int arg2 = value2;
    return arg0.value + arg1 + arg2;
}

internal static int CompiledStaticAdd(Data @this, int value1, int value2)
{
    Data arg0 = @this;
    int arg1 = value1;
    int arg2 = value2;
    return arg0.value + arg1 + arg2;
}

Extension method

C# 3.0 introduces extension method syntactic sugar. An extension method is a static method defined in a static non-generic class, with this keyword proceeding the first parameter:

internal static partial class DataExtensions
{
    internal static int ExtensionAdd(this Data @this, int value1, int value2)
    {
        return @this.Value + value1 + value2;
    }
}

The above method is called an extension method for Data type. It can be called like an instance method of Data type:

internal static void CallExtensionMethod(Data data)
{
    int result = data.ExtensionAdd(1, 2L);
}

So extension method’s first declared argument becomes the current instance, the second declared argument becomes the first calling argument, the third declared argument becomes the second calling argument, and so on. This syntax design is easy to understand based on the nature of instance and static methods. Actually, the extension method definition is compiled to a normal static method with System.Runtime.CompilerServices.ExtensionAttribute:

internal static partial class DataExtensions
{
    [Extension]
    internal static int CompiledExtensionAdd(Data @this, int value1, int value2)
    {
        return @this.Value + value1 + value2;
    }
}

And the extension method call is compiled to normal static method call:

internal static void CompiledCallExtensionMethod(Data data)
{
    int result = DataExtensions.ExtensionAdd(data, 1, 2L);
}

If a real instance method and an extension name are both defined for the same type with equivalent signature:

internal partial class Data : IEquatable<Data>
{
    public override bool Equals(object obj)
    {
        return obj is Data && this.Equals((Data)obj);
    }

    public bool Equals(Data other) // Member of IEquatable<T>.
    {
        return this.value == other.value;
    }
}

internal static partial class DataExtensions
{
    internal static bool Equals(Data @this, Data other)
    {
        return @this.Value == other.Value;
    }
}

The instance style method call is compiled to instance method call; In order to call the extension method, use the static method call syntax:

internal static partial class Functions
{
    internal static void CallMethods(Data data1, Data data2)
    {
        bool result1 = data1.Equals(string.Empty); // object.Equals.
        bool result2 = data1.Equals(data2); // Data.Equals.
        bool result3 = DataExtensions.Equals(data1, data2); // DataExtensions.Equals.
    }
}

When compiling instance style method call, C# compiler looks up methods in the following order:

  • instance method defined in the type
  • extension method defined in the current namespace
  • extension method defined in the current namespace’s parent namespaces
  • extension method defined in the other namespaces imported by using directives

Extension method can be viewed as if instance method “added” to the specified type. For example, as fore mentioned, enumeration types cannot have methods. However, extension method can be defined for enumeration type:

internal static class DayOfWeekExtensions
{
    internal static bool IsWeekend(this DayOfWeek dayOfWeek)
    {
        return dayOfWeek == DayOfWeek.Sunday || dayOfWeek == DayOfWeek.Saturday;
    }
}

Now the above extension method can be called as if it is the enumeration type’s instance method:

internal static void CallEnumerationExtensionMethod(DayOfWeek dayOfWeek)
{
    bool result = dayOfWeek.IsWeekend();
}

Most of the LINQ query methods are extension methods, like the Where, OrderBy, Select methods demonstrated previously:

namespace System.Linq
{
    public static class Enumerable
    {
        public static IEnumerable<TSource> Where<TSource>(
            this IEnumerable<TSource> source, Func<TSource, bool> predicate);

        public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
            this IEnumerable<TSource> source, Func<TSource, TKey> keySelector);

        public static IEnumerable<TResult> Select<TSource, TResult>(
            this IEnumerable<TSource> source, Func<TSource, TResult> selector);
    }
}

These methods’ usage and implementation will be discussed in detail in the LINQ to Objects chapter.

This tutorial uses the following extension methods to simplify the tracing of single value and values in sequence:

public static class TraceExtensions
{
    public static T WriteLine<T>(this T value)
    {
        Trace.WriteLine(value);
        return value;
    }

    public static T Write<T>(this T value)
    {
        Trace.Write(value);
        return value;
    }

    public static IEnumerable<T> WriteLines<T>(this IEnumerable<T> values, Func<T, string> messageFactory = null)
    {
        if (messageFactory!=null)
        {
            foreach (T value in values)
            {
                string message = messageFactory(value);
                Trace.WriteLine(message);
            }
        }
        else
        {
            foreach (T value in values)
            {
                Trace.WriteLine(value);
            }
        }
        return values;
    }
}

The WriteLine and Write extension methods are available for any value, and WriteLines is available for any IEnumerable<T> sequence:

internal static void TraceValueAndSequence(Uri value, IEnumerable<Uri> values)
{
    value.WriteLine();
    // Equivalent to: Trace.WriteLine(value);

    values.WriteLines();
    // Equivalent to: 
    // foreach (Uri value in values)
    // {
    //    Trace.WriteLine(value);
    // }
}

More named functions

C# supports operator overload and type conversion operator are defined, they are compiled to static methods. For example:

internal partial class Data
{
    public static Data operator +(Data data1, Data data2)
    // Compiled to: public static Data op_Addition(Data data1, Data data2)
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // op_Addition
        return new Data(data1.value + data2.value);
    }

    public static explicit operator int(Data value)
    // Compiled to: public static int op_Explicit(Data data)
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // op_Explicit
        return value.value;
    }

    public static explicit operator string(Data value)
    // Compiled to: public static string op_Explicit(Data data)
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // op_Explicit
        return value.value.ToString();
    }

    public static implicit operator Data(int value)
    // Compiled to: public static Data op_Implicit(int data)
    {
        Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // op_Implicit
        return new Data(value);
    }
}

The + operator overload is compiled to static method with name op_Addition, the explicit/implicit type conversions are compiled to static methods op_Explicit/op_Implicit method. These operators’ usage are compiled to static method calls:

internal static void Operators(Data data1, Data data2)
{
    Data result = data1 + data2; // Compiled to: Data.op_Addition(data1, data2)
    int int32 = (int)data1; // Compiled to: Data.op_Explicit(data1)
    string @string = (string)data1; // Compiled to: Data.op_Explicit(data1)
    Data data = 1; // Compiled to: Data.op_Implicit(1)
}

Notice the above 2 op_Explicit methods are the special case of ad hoc polymorphism (method overload) in C#.

Property member’s getter and setter are also compiled to named methods. For example:

internal partial class Device
{
    private string description;

    internal string Description
    {
        get // Compiled to: internal string get_Description()
        {
            Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // get_Description
            return this.description;
        }
        set // Compiled to: internal void set_Description(string value)
        {
            Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // set_Description
            this.description = value;
        }
    }
}

The property getter and setter calls are compiled to method calls:

internal static void Property(Device device)
{
    string description = device.Description; // Compiled to: device.get_Description()
    device.Description = string.Empty; // Compiled to: device.set_Description(string.Empty)
}

Indexer member can be viewed as parameterized property. The indexer getter/setter are always compiled to get_Item/set_Item methods:

internal partial class Category
{
    private readonly Subcategory[] subcategories;

    internal Category(Subcategory[] subcategories)
    {
        this.subcategories = subcategories;
    }

    internal Subcategory this[int index]
    {
        get // Compiled to: internal Uri get_Item(int index)
        {
            Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // get_Item
            return this.subcategories[index];
        }
        set // Compiled to: internal Uri set_Item(int index, Subcategory subcategory)
        {
            Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // set_Item
            this.subcategories[index] = value;
        }
    }
}

internal static void Indexer(Category category)
{
    Subcategory subcategory = category[0]; // Compiled to: category.get_Item(0)
    category[0] = subcategory; // Compiled to: category.set_Item(0, subcategory)
}

As fore mentioned, a event has an add accessor and a remove accessor, which are either custom defined, or generated by compiler. They are compiled to named methods as well:

internal partial class Data
{
    internal event EventHandler Saved
    {
        add // Compiled to: internal void add_Saved(EventHandler value)
        {
            Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // add_Saved
        }
        remove // Compiled to: internal void remove_Saved(EventHandler value)
        {
            Trace.WriteLine(MethodBase.GetCurrentMethod().Name); // remove_Saved
        }
    }
}

Event is a function group. The +=/-= operators adds remove event handler function to the event, and –= operator removes event handler function from the event. They are compiled to the calls to above named methods:

internal static void DataSaved(object sender, EventArgs args) { }

internal static void EventAccessor(Data data)
{
    data.Saved += DataSaved; // Compiled to: data.add_Saved(DataSaved)
    data.Saved -= DataSaved; // Compiled to: data.remove_Saved(DataSaved)
}

C#’s event is discussed in detail in the delegate chapter.

Function polymorphisms

The word “Polymorphism” comes from Greek, means “many shapes”. In programming, there are several kinds of polymorphisms. In object-oriented programming, a derived type can override base type’s methods to provide. For example, System.IO.FileStream type and System.IO.Memory type derives from System.IO.Stream type:

namespace System.IO
{
    public abstract class Stream : MarshalByRefObject, IDisposable
    {
        public virtual void WriteByte(byte value);
    }

    public class FileStream : Stream
    {
        public override void WriteByte(byte value);
    }

    public class MemoryStream : Stream
    {
        public override void WriteByte(byte value);
    }
}

FileStream.WriteByte overrides Stream.WriteByte to implement writing to file system, and MemoryStream.WriteByte overrides Stream.WriteByte to implement writing to memory. This is called subtype polymorphism or inclusion polymorphism. In object-oriented programming, the term polymorphism usually refers to subtype polymorphism. There are also ad hoc polymorphism and parametric polymorphism. In functional programming, the term polymorphism usually refers to parametric polymorphism.

Ad hoc polymorphism: method overload

Method overloading allows multiple methods to have the same method name, with different parameter numbers and/or types. For example:

namespace System.Diagnostics
{
    public sealed class Trace
    {
        public static void WriteLine(string message);

        public static void WriteLine(object value);
    }
}

Apparently, the WriteLine overload for string writes the string message. If this is the only method provided, then all the non-string values have to be manually converted to string representation:

internal partial class Functions
{
    internal static void TraceString(Uri uri, FileInfo file, int int32)
    {
        Trace.WriteLine(uri?.ToString());
        Trace.WriteLine(file?.ToString());
        Trace.WriteLine(int32.ToString());
    }
}

The WriteLine overload for object provides convenience for values of arbitrary types. The above code can be simplified to:

internal static void TraceObject(Uri uri, FileInfo file, int int32)
{
    Trace.WriteLine(uri);
    Trace.WriteLine(file);
    Trace.WriteLine(int32);
}

With multiple overloads, WriteLine method is polymorphic and can be called with different arguments. This is called ad hoc polymorphism. In the .NET core library, the most ad hoc polymorphic method is System.Convert’s ToString method. It has 36 overloads to convert values of different types to string representation, in different ways:

namespace System
{
    public static class Convert
    {
        public static string ToString(bool value);

        public static string ToString(int value);

        public static string ToString(long value);

        public static string ToString(decimal value);

        public static string ToString(DateTime value);

        public static string ToString(object value);

        public static string ToString(int value, IFormatProvider provider);

        public static string ToString(int value, int toBase);

        // More overloads and other members.
    }
}

In C#/.NET, constructors can have parameters too, so they can also be overloaded. For example:

namespace System
{
    public struct DateTime : IComparable, IFormattable, IConvertible, IComparable<DateTime>, IEquatable<DateTime>
    {
        public DateTime(long ticks);

        public DateTime(int year, int month, int day);

        public DateTime(int year, int month, int day, int hour, int minute, int second);

        public DateTime(int year, int month, int day, int hour, int minute, int second, int millisecond);

        // Other constructor overloads and other members.
    }
}

Indexers are essentially get_Item/set_Item methods with parameters, so they can be overloaded as well. Take System.Data.DataRow as example:

namespace System.Data
{
    public class DataRow
    {
        public object this[DataColumn column] { get; set; }

        public object this[string columnName] { get; set; }

        public object this[int columnIndex] { get; set; }

        // Other indexer overloads and other members.
    }
}

C# does not allow method overload with only different return type. The following example cannot be compiled:

internal static string FromInt64(long value)
{
    return value.ToString();
}

internal static DateTime FromInt64(long value)
{
    return new DateTime(value);
}

There is a exception for this. In the above example, 2 explicit type conversion operators are both compiled to op_Explicit methods with a single Data parameter. One op_Explicit method returns a int, the other op_Explicit method returns string. This is the only case where  C# allows method overload with only different return type.

Parametric polymorphism: generic method

Besides ad hoc polymorphism, C# also supports parametric polymorphism for methods since 2.0. The following is a normal method that swaps 2 int values:

internal static void SwapInt32(ref int value1, ref int value2)
{
    (value1, value2) = (value2, value1);
}

The above syntax is called tuple assignment, which is a new feature of C# 7.0, and is discussed in the tuple chapter. To reuse this code for values of any other type, just define a generic method, by replacing int with a type parameter. Similar to generic types, generic method’s type parameters are also declared in angle brackets following the method name:

internal static void Swap<T>(ref T value1, ref T value2)
{
    (value1, value2) = (value2, value1);
}

Generic type parameter’s constraints syntax also works for generic method. For example:

internal static IStack<T> PushValue<T>(IStack<T> stack) where T : new()
{
    stack.Push(new T());
    return stack;
}

Generic types as well as generic methods are heavily used in C# functional programming. For example, almost every LINQ query API is parametric polymorphic.

Type argument inference

When calling generic method, if C# compiler can infer generic method’s all type arguments, then the type arguments can be omitted at design time. For example,

internal static void TypeArgumentInference(string value1, string value2)
{
    Swap<string>(ref value1, ref value2);
    Swap(ref value1, ref value2);
}

Swap is called with string values, so C# compiler infers type argument string is passed to the method’s type parameter T. C# compiler can only infer type arguments from type of arguments, not from type of return value. Take the following generic methods as example:

internal static T Generic1<T>(T value)
{
    Trace.WriteLine(value);
    return default(T);
}

internal static TResult Generic2<T, TResult>(T value)
{
    Trace.WriteLine(value);
    return default(TResult);
}

When calling them, Generic1’s type argument can be omitted, but Generic2’s type arguments cannot:

internal static void ReturnTypeInference()
{
    int value1 = Generic1(0);
    string value2 = Generic2<int, string>(0); // Generic2<int>(0) cannot be compiled.
}

For Generic1, T is used as return type, but it can be inferred from the argument type. So type argument can be omitted for Generic1. For Generic2, T can be inferred from argument type too, but TResult can only possibly be inferred from type of return value, which is not supported by C# compiler. As a result, type arguments cannot be omitted when calling Generic2. Otherwise, C# compiler gives error CS0411: The type arguments for method 'Functions.Generic2<T, TResult>(T)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

Type cannot be inferred from null because null can be of any reference type or nullable value type. For example, when calling above Generic1 with null:

internal static void NullArgumentType()
{
    Generic1<FileInfo>(null);
    Generic1((FileInfo)null);
    FileInfo file = null;
    Generic1(file);
}

there are some options:

  • Provide the type argument
  • Explicitly convert null to the expected argument type
  • Create a temporary variable of the expected argument type, pass the value to the generic method

Type argument inference is not supported for generic type’s constructor. Take the following generic type as example:

internal class Generic<T>
{
    internal Generic(T input) { } // T cannot be inferred.
}

When calling above constructor, type arguments must be provided:

internal static Generic<IEnumerable<IGrouping<int, string>>> GenericConstructor(
    IEnumerable<IGrouping<int, string>> input)
{
    return new Generic<IEnumerable<IGrouping<int, string>>>(input);
    // Cannot be compiled:
    // return new Generic(input);
}

A solution is to wrap the constructor call in a static factory method , where type parameter can be inferred:

internal class Generic // Not Generic<T>.
{
    internal static Generic<T> Create<T>(T input) => new Generic<T>(input); // T can be inferred.
}

Now the instance can be constructed without type argument:

internal static Generic<IEnumerable<IGrouping<int, string>>> GenericCreate(
    IEnumerable<IGrouping<int, string>> input)
{
    return Generic.Create(input);
}

Static import

C# 6.0 introduces using static directive, a syntactic sugar, to enable accessing static member of the specified type, so that a static method can be called type name as if it is a functions on the fly. Since extension are essentially static method, this syntax can also import extension methods from the specified type. It also enables accessing enumeration member without enumeration type name.

using static System.DayOfWeek;
using static System.Math;
using static System.Diagnostics.Trace;
using static System.Linq.Enumerable;

internal static partial class Functions
{
    internal static void UsingStatic(int value, int[] array)
    {
        int abs = Abs(value); // Compiled to: Math.Abs(value)
        WriteLine(Monday); // Compiled to: Trace.WriteLine(DayOfWeek.Monday)
        List<int> list = array.ToList(); // Compiled to: Enumerable.ToList(array)
    }
}

The using directive imports the specified all types’ extension methods under the specified namespace, while the using static directive only import the specified type’s extension methods.

Partial method

Partial methods can be defined in partial class or partial structure. One part of the type can have the partial method signature, and the partial method can be optionally implemented in another part of the type. This syntactic sugar is useful for code generation. For example, LINQ to SQL can generate entity type in the following pattern:

[Table(Name = "Production.Product")]
public partial class Product : INotifyPropertyChanging, INotifyPropertyChanged
{
    public Product()
    {
        this.OnCreated(); // Call.
    }

    partial void OnCreated(); // Signature.

    // Other members.
}

The constructor calls partial method OnCreate, which is a hook. If needed, developer can provide another part of the entity type to implement OnCreate:

public partial class Product
{
    partial void OnCreated() // Optional implementation.
    {
        Trace.WriteLine($"{nameof(Product)} is created.");
    }
}

If a partial method is implemented, it is compiled to a normal private method. If a partial method is not implemented, the compiler ignores the method signature, and remove all the method calls. For this reason, access modifiers (like public, etc.), attributes, non-void return value are not allowed for partial method.

8 Comments

Add a Comment

As it will appear on the website

Not displayed

Your website