How to write a Standalone VM

In case you ever find yourself having to implement (or simulate, or emulate) a stack-based Virtual Machine, a-la Forth or PostScript or even the CLR itself, here's a handy programming pattern for your consideration. The rough idea is to give every instruction some state variables for remembering its arguments, with virtual methods for "execute" and "ToString" (for “explain” mode) and what not. An instance of the VM will have an instruction stream, an instruction pointer, a stack, and methods for adding instructions to the stream, executing the stream, and explaining itself.

The sample below is intentionally very bare-bones. Think of it as the “hello, world” for a Forth-oid VM. It’s a pattern rather than a solution. I just have a handful of instructions here, just to offer the idea. To make this a real, complete, VM, you would add a lot more instructions, following the pattern. This sample doesn’t use the underlying CLR VM; it could, through Reflection. It doesn’t have a console; it could, through Regex or pick-your-favorite parsing solution. It doesn’t have a debugger; it could, through integration with Visual Studio. It doesn’t have a type system; it could -- it just uses VB late binding through object. In fact, this is so bare-bones that I don’t actually have a method for adding instructions, it just exposes the instruction stream as a public member (oh, the HORROR!) … you’re getting the idea.

So, first, here’s the outline for the VM itself, showing detail, for now, just for the self “Test” routine.

Public Class VM

    Public Class Exception REM gotta have our own, don’tya know

        Inherits ApplicationException REM ... impl omitted

    Public stk As New Stack REM this is a .NET collection class

    Private ix As New List(Of Instruction) REM Instr. is ours

    Private ip As New Integer REM instruction pointer

    Public Sub Execute() REM ... impl omitted

    Public Sub Explain() REM ... impl omitted

    Public Sub Test()

        ix.Clear()

        ix.Add(New Push("A"))

        ix.Add(New Push("B"))

        ix.Add(New Push(42))

        ix.Add(New Push("C"))

        ix.Add(New DisplayStack)

        ix.Add(New MakeList(3))

        ix.Add(New DisplayStack)

        ix.Add(New Reverse)

        ix.Add(New DisplayStack)

        ix.Add(New UnList)

        ix.Add(New DisplayStack)

        ix.Add(New Pop(2))

        ix.Add(New DisplayStack)

        Explain()

        Execute()

    End Sub

End Class

 

Obviously, the interesting thing above is the instruction stream in the “Test” routine. The phasing of operations is “create the instructions, add them to the stream, make the stream show itself, then execute the entire stream.” “Push” is an instruction that, when executed, pushes its argument on the stack. “DisplayStack” does some more explaining at run time. “MakeList,” when executed, pops n arguments off the stack and makes a list out of them, pushing the list back on the stack; n is its argument, 3 in this case.  “Reverse,” when executed, pops a list off the stack, reverses the elements, and pushes the resulting list back on the stack. “UnList,” when executed, pops a list off the stack, then serially pushes the elements of the list back on the stack. “Pop,” when executed, pops n elements off the stack, where n is its argument, 2 in this case.

 

What is the expected output of the above? Something like the following:

 

13 Instructions

   0: Push             A

   1: Push             B

   2: Push             42

   3: Push             C

   4: DisplayStack

   5: MakeList         3

   6: DisplayStack

   7: Reverse

   8: DisplayStack

   9: UnList

  10: DisplayStack

  11: Pop              2

  12: DisplayStack

 

4 elements in the stack

   0: C

   1: 42

   2: B

   3: A

 

2 elements in the stack

   0: ["C", 42, "B"]

   1: A

 

2 elements in the stack

   0: ["B", 42, "C"]

   1: A

 

4 elements in the stack

   0: C

   1: 42

   2: B

   3: A

 

2 elements in the stack

   0: B

   1: A

 

Neat. Ok, now the implementations of VM’s methods. Here’s Execute:

 

    Public Sub Execute()

        ip = 0

        While ip < ix.Count

            ip += ix(ip).execute(Me)

        End While

    End Sub

 

Yup, just calls the overridable “execute” method on each instruction, passing in the VM itself so that each instruction can access the stack, as needed. Notice that the ip (instruction pointer) is advanced by an amount returned by “execute.” That’s so we can write branch instructions that move the instruction pointer somewhere far away. Most instructions just return 1 from “execute.”

 

“Explain” is equally easy. The only thing remotely noteworthy is that Console.WriteLine automatically calls the “ToString” method on class “Instruction,” which will, by override (virtual dispatch) call the various formatting routines in the particular instruction subclasses:

 

    Public Sub Explain()

        Console.WriteLine("{0} Instructions", ix.Count)

        ip = 0

        While ip < ix.Count

            Console.WriteLine("{0,4}: {1}", ip, ix(ip))

            ip += 1

        End While

        Console.WriteLine()

    End Sub

 

Now it’s time to see the base class for instruction, again with some detail omitted for now:

 

Public MustInherit Class Instruction

    Public MustOverride Function execute(ByVal d As VM) _

      As Integer

 

    Public Overrides Function ToString() As String

        Return String.Format("{0,-16} {1}", formatOpcode, _

        formatArgs)

    End Function

 

    Public Overridable Function formatArgs() As String

        formatArgs = ""

    End Function

 

    Public Function formatOpcode() As String

        Const opcodePat = "\w+\.(?<opcode>\w+)$"

        Dim s = Me.GetType.ToString

        formatOpcode = Regex.Match _

        (s, opcodePat).Result("${opcode}")

    End Function

End Class

 

“execute” is a pure virtual method – each instruction must supply an implementation. That’s pretty obvious. “ToString,” called by the VM’s “Explain” procedure for each instruction, calls an overridable “formatArgs,” with the empty string being the default return value for those instructions that don’t override it because they don’t have arguments they want to print out. Interestingly, the “formatOpcode” isn’t overridable, because, via Reflection, we can get the name of each instruction subclass without having that subclass participate. Neat.

 

Nothing much left, now. Just show a few instructions, and you’ll get the idea:

 

Public Class Push

    Inherits Instruction

    Public a1 As Object

 

    Private Sub New() REM Not allowed to call this

    End Sub

 

    Public Sub New(ByVal a1_ As Object) REM must pass an arg

        Me.a1 = a1_

    End Sub

 

    Public Overrides Function execute(ByVal v As VM) As Integer

        execute = 1

        If a1 Is Nothing Then

            Throw New VM.Exception _

            ("Push must have one, non-nothing argument")

        End If

        v.stk.Push(a1)

    End Function

 

    Public Overrides Function formatArgs() As String

        formatArgs = Me.a1.ToString

    End Function

End Class

 

Gosh, this is almost too easy. Nothing to say. How about this:

 

Public Class MakeList

    Inherits Instruction

    Public a1 As Object

 

    Private Sub New()

    End Sub

 

    Public Sub New(ByVal a1_ As Object)

        Me.a1 = a1_

    End Sub

 

    Public Overrides Function execute(ByVal v As VM) As Integer

        execute = 1

        Dim kamah As Integer = a1

        If Not kamah >= 0 Then

            Throw New VM.Exception("MakeList must have p1 >= 0")

        End If

        Dim h = New List(Of Object)

        For i = 1 To kamah

            Dim s = v.stk.Pop()

            h.Add(s)

        Next i

        v.stk.Push(h)

    End Function

 

    Public Overrides Function formatArgs() As String

        formatArgs = Me.a1.ToString

    End Function

End Class

 

Ok, you can fill in the rest. Just so you can see how I personally filled in some of the boring details, I'll post the whole thing in the "Excutables" topic (links on the left). You'll need the CTP version of VB (Customer Technology Preview) from http://msdn.microsoft.com/VBasic/Future/, so you can have type inference and other VB9 glories, which I freely use. 

 

 

 

 

 

 

 

Published Friday, March 10, 2006 12:51 PM by brianbec

Comments

# re: How to write a Standalone VM@ Saturday, March 14, 2009 5:39 PM

Sehr wertvolle Informationen! Empfehlen!

...