Introduction to MSIL – Part 4 – Defining Type Members
In Part 3 of the MSIL series, I introduced the basic syntax for defining types. Using the .class directive, you can define reference types and value types. Choosing the type attributes correctly, you can exercise complete control over the definition of your type.
.class abstract Kerr.Sample.Object
{
}
In this installment, we are going to explore how to declare members of a type. To simplify the samples, I may omit the class definitions at times and simply focus on defining the members.
Constructors
Let’s begin with initializers, known as constructors in languages like C++ and C#. The CLI supports both type initializers and instance initializers. Type initializers are very handy and address many of the multithreading challenges you encounter when implementing a singleton in native C++. A type initializer runs exactly once for a given type, and the runtime guarantees that no methods of the type will be accessible before the type initializer completes.
.method static void .cctor()
{
.maxstack 1
ldstr ".cctor"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
.method public void .ctor()
{
.maxstack 1
ldarg.0
call instance void [mscorlib]System.Object::.ctor()
ldstr ".ctor"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
.cctor and .ctor are known as special method names. A type initializer, which is optional, is defined as a static method named .cctor with no return value. Type initializers are called static constructors in the C++/CLI and C# languages.
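To make the singleton point concrete, here is a minimal sketch of how a type initializer might be used to create a single shared instance. The names Singleton, s_instance and GetInstance are hypothetical and chosen only for illustration, and the sample follows the abbreviated style used above (no specialname or rtspecialname attributes), which I am assuming your version of the IL Assembler accepts.

```il
// Sketch only: Singleton, s_instance and GetInstance are illustrative names.
.assembly extern mscorlib {}
.assembly Sample {}

.class Singleton
{
    // Holds the one shared instance; assigned only by the type initializer.
    .field private static class Singleton s_instance

    // Type initializer: the runtime runs this once, before the type is used.
    .method static void .cctor()
    {
        .maxstack 1
        newobj instance void Singleton::.ctor()
        stsfld class Singleton Singleton::s_instance
        ret
    }

    // Private so that only the type itself can create instances.
    .method private void .ctor()
    {
        .maxstack 1
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    .method public static class Singleton GetInstance()
    {
        .maxstack 1
        ldsfld class Singleton Singleton::s_instance
        ret
    }
}

.class Program
{
    .method static void Main()
    {
        .entrypoint
        .maxstack 2
        // Both calls return the same reference, so this writes True.
        call class Singleton Singleton::GetInstance()
        call class Singleton Singleton::GetInstance()
        call bool [mscorlib]System.Object::ReferenceEquals(object, object)
        call void [mscorlib]System.Console::WriteLine(bool)
        ret
    }
}
```

No locking is needed in GetInstance because the field is only ever written by the type initializer.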
An instance initializer, more commonly referred to simply as a constructor, initializes an instance of a type. A constructor is defined as an instance method named .ctor, also with no return value. A constructor is called when an instance is created using the newobj instruction. Here is an example.
.locals (class TypeName obj)
newobj instance void TypeName::.ctor()
stloc obj
When executed, it will result in the following written to the console, assuming the constructors defined above.
.cctor
.ctor
The newobj instruction allocates a new instance of the type and initializes all its fields to the type-equivalent of zero. Then it calls the particular constructor, disambiguated by the signature, ensuring that the first (zero-based) argument refers to the newly created instance. Once the constructor completes, an object reference is pushed onto the stack for access by the caller.
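As a rough illustration of how the signature picks the constructor, the following sketch defines two constructors on a hypothetical Greeter type (Greeter, m_text, Print and the parameter name are all made up for this example). It assumes the same abbreviated declaration style as the samples above.

```il
// Sketch only: Greeter, m_text and Print are illustrative names.
.assembly extern mscorlib {}
.assembly Sample {}

.class Greeter
{
    .field private string m_text

    // Parameterless constructor: forwards to the string overload.
    .method public void .ctor()
    {
        .maxstack 2
        ldarg.0                 // argument 0 is the new instance
        ldstr "default"
        call instance void Greeter::.ctor(string)
        ret
    }

    // Constructor taking a string: argument 1 is the declared parameter.
    .method public void .ctor(string initialText)
    {
        .maxstack 2
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ldarg.0
        ldarg.1
        stfld string Greeter::m_text
        ret
    }

    .method public void Print()
    {
        .maxstack 1
        ldarg.0
        ldfld string Greeter::m_text
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class Program
{
    .method static void Main()
    {
        .entrypoint
        .maxstack 1
        // The signature after .ctor selects the overload.
        newobj instance void Greeter::.ctor()
        call instance void Greeter::Print()      // writes "default"
        ldstr "custom"
        newobj instance void Greeter::.ctor(string)
        call instance void Greeter::Print()      // writes "custom"
        ret
    }
}
```

Note that newobj leaves the object reference on the stack, which is why Print can be called immediately without storing the reference in a local.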
Methods
Although the CLI defines constructors and properties in terms of methods (with some extra metadata for properties), in this section we are going to look at methods in the sense of C++ member functions or C# methods. Of course, most of what is said of methods also applies to instance constructors and property getters and setters. Virtually all of the interesting things that you can do with methods are controlled by method attributes. There are some common scenarios programmers expect, so let’s cover a few of them.
Static methods are defined using the static attribute. Static methods, as you would expect, are associated with a type but not an instance.
.method static void StaticMethod() { /* impl */ }
Instance methods simply use the instance attribute in place of the static attribute. The IL Assembler assumes instance as the default so you rarely need to specify it explicitly for method declarations.
.method void InstanceMethod() { /* impl */ }
The opposite is true when calling methods. The call instruction assumes a static method unless you specify otherwise. Here is an example of calling both methods.
call void TypeName::StaticMethod()
ldloc obj
call instance void TypeName::InstanceMethod()
Remember to push the object reference pointing to your instance onto the stack before calling the instance method.
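When a method takes arguments, the object reference still goes onto the stack first, followed by the arguments in order. The sketch below shows this for a hypothetical Counter type (Counter, m_total, Increase and Sum are names I invented for the example), written in the same abbreviated style as the samples above.

```il
// Sketch only: Counter, m_total, Increase and Sum are illustrative names.
.assembly extern mscorlib {}
.assembly Sample {}

.class Counter
{
    .field private int32 m_total

    .method public void .ctor()
    {
        .maxstack 1
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    // Instance method: argument 0 is the object reference,
    // so the declared parameter is argument 1.
    .method public int32 Increase(int32 amount)
    {
        .maxstack 3
        ldarg.0                 // object reference for stfld
        ldarg.0                 // object reference for ldfld
        ldfld int32 Counter::m_total
        ldarg.1
        add
        stfld int32 Counter::m_total
        ldarg.0
        ldfld int32 Counter::m_total
        ret
    }

    // Static method: no object reference, so parameters start at argument 0.
    .method public static int32 Sum(int32 first, int32 second)
    {
        .maxstack 2
        ldarg.0
        ldarg.1
        add
        ret
    }
}

.class Program
{
    .method static void Main()
    {
        .entrypoint
        .maxstack 2
        .locals (class Counter counter)
        newobj instance void Counter::.ctor()
        stloc counter

        // Static call: push only the arguments.
        ldc.i4.2
        ldc.i4.3
        call int32 Counter::Sum(int32, int32)
        call void [mscorlib]System.Console::WriteLine(int32)   // writes 5

        // Instance call: push the object reference, then the argument.
        ldloc counter
        ldc.i4.7
        call instance int32 Counter::Increase(int32)
        call void [mscorlib]System.Console::WriteLine(int32)   // writes 7
        ret
    }
}
```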
Virtual function calls are an important part of object-oriented design and the CLI provides a great deal of flexibility in controlling whether the static or dynamic type of the object will be used to service the call, as well as how this behavior can be overridden in subclasses. When I refer to the static and dynamic type in this context I am referring to it in the C++ sense of the static type known at compile time and the dynamic type determined at runtime. This is generally referred to as polymorphism. There are two aspects to the virtual function support that you need to keep in mind when programming in MSIL. The first is how you declare your instance methods to support virtual function invocation and the second is how you call the method. It should also go without saying that static methods are by definition not virtual.
A method is marked virtual by adding the virtual attribute to the method header. Consider the following example.
.class House
{
.method public virtual void Buy()
{
.maxstack 1
ldstr "House::Buy"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
/* etc */
}
.class TownHouse
extends House
{
.method public virtual void Buy()
{
.maxstack 1
ldstr "TownHouse::Buy"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
/* etc */
}
The House type has a virtual method called Buy. The TownHouse type extends House and also has a virtual method with the same name. Because of this, TownHouse::Buy is said to override House::Buy. So how do we tell the runtime which method to pick? Obviously if I have a House instance I would like House::Buy to be called, but if I have a TownHouse instance I would like TownHouse::Buy to be called and, being a virtual method, I want this decision to be made at runtime when the actual type is known. So far I have used the call instruction in a number of examples in this series. The call instruction invokes the specified method and will always call the same method regardless of the dynamic type of the object. The callvirt instruction, on the other hand, allows the runtime to determine the specific virtual method implementation to invoke based on the actual type of the object. Consider the following example.
.locals (class House house, class TownHouse townHouse)
newobj instance void House::.ctor()
stloc house
newobj instance void TownHouse::.ctor()
stloc townHouse
ldloc house
call instance void House::Buy()
ldloc townHouse
call instance void TownHouse::Buy()
ldloc townHouse
call instance void House::Buy()
ldloc townHouse
callvirt instance void House::Buy()
When executed, it will result in the following written to the console.
House::Buy
TownHouse::Buy
House::Buy
TownHouse::Buy
The first call to the Buy method with the house reference invokes the House::Buy implementation since call is only interested in the static, or compile-time, type. The second call to Buy with the townHouse reference invokes the TownHouse::Buy implementation for the same reason. The third call will once again invoke House::Buy despite the fact that the object reference points to a TownHouse. It should now be clear that using the call instruction implies making a compile-time decision on which method to execute. The final method call uses the callvirt instruction to invoke the virtual method House::Buy and since the object reference actually points to a TownHouse, the TownHouse::Buy method will be executed. To be clear, the runtime is not looking at the type of the local variable you declared but rather the type of the object being referenced. We could have stored a reference to a TownHouse in a House local variable and the TownHouse::Buy method would still have been called.
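That last point can be sketched as a small complete program. The House and TownHouse types are the ones from the example above, with constructors filled in; the Program type and the local variable name home are my own additions for illustration.

```il
// Sketch only: Program and the local named home are illustrative additions.
.assembly extern mscorlib {}
.assembly Sample {}

.class House
{
    .method public void .ctor()
    {
        .maxstack 1
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    .method public virtual void Buy()
    {
        .maxstack 1
        ldstr "House::Buy"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class TownHouse
    extends House
{
    .method public void .ctor()
    {
        .maxstack 1
        ldarg.0
        call instance void House::.ctor()
        ret
    }

    // Same name and signature as House::Buy, so this overrides it.
    .method public virtual void Buy()
    {
        .maxstack 1
        ldstr "TownHouse::Buy"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class Program
{
    .method static void Main()
    {
        .entrypoint
        .maxstack 1
        // The local is declared as House but refers to a TownHouse.
        .locals (class House home)
        newobj instance void TownHouse::.ctor()
        stloc home
        ldloc home
        callvirt instance void House::Buy()   // writes "TownHouse::Buy"
        ret
    }
}
```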
If you want to declare a virtual method but do not want to override an inherited virtual method with the same name, you can use the newslot attribute on the new virtual method in the subclass. This is possible because the runtime dispatches virtual calls through slots in the type's vtbl rather than by method name. Just think of newslot as adding a new virtual function pointer to the vtbl for the given type, leaving the inherited one untouched.
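Here is a minimal sketch of newslot in action. It reuses the House type from earlier and adds a hypothetical Cottage subclass (Cottage and Program are names I made up for this example) whose Buy method takes a new slot instead of overriding the inherited one.

```il
// Sketch only: Cottage and Program are illustrative names.
.assembly extern mscorlib {}
.assembly Sample {}

.class House
{
    .method public void .ctor()
    {
        .maxstack 1
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    .method public virtual void Buy()
    {
        .maxstack 1
        ldstr "House::Buy"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class Cottage
    extends House
{
    .method public void .ctor()
    {
        .maxstack 1
        ldarg.0
        call instance void House::.ctor()
        ret
    }

    // newslot: a new virtual method that happens to share the name Buy.
    // The slot inherited from House still points at House::Buy.
    .method public newslot virtual void Buy()
    {
        .maxstack 1
        ldstr "Cottage::Buy"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class Program
{
    .method static void Main()
    {
        .entrypoint
        .maxstack 1
        .locals (class Cottage cottage)
        newobj instance void Cottage::.ctor()
        stloc cottage

        ldloc cottage
        callvirt instance void House::Buy()     // writes "House::Buy"
        ldloc cottage
        callvirt instance void Cottage::Buy()   // writes "Cottage::Buy"
        ret
    }
}
```

Even though both calls use callvirt on the same Cottage instance, they go through different slots, so the result depends on which method you name in the instruction.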
CLI virtual methods are very interesting, especially when you consider how and to what extent they are exposed by C++/CLI and C#. This entry is getting long enough so I’ll save that discussion for another day.
Read part 5 now: Exception Handling
© 2004 Kenny Kerr