
Parsing into Nodes

In my last blog entry I created a parser which loops through text and parses stuff:

Sub Parse()
    While Not EOF()
        ParseThisStuff()
        ParseThatStuff()
    End While
End Sub

Sub ParseThisStuff()
    While Not EOF() AndAlso NextCharIsThisStuff()
        Write( NextChar() )
    End While
End Sub

Sub ParseThatStuff()
    While Not EOF() AndAlso NextCharIsThatStuff()
        Write( NextChar() )
    End While
End Sub

That's a pretty simple pattern, and it is repeatable for all kinds of "stuff". This is useful, but whenever I parse chunks of stuff they are written straight out to "wherever" by the Write method in the ParseStuff routine. It would be better if I could channel the "stuff" into a node for later use and put some smarts into the node as to how its content is emitted. For example, I might want to emit the node contents as XML, HTML, native IL or whatever. This Node class can store "stuff" and render its contents wrapped in a tag named after its type:

Class Node
    Public Text As String
    Public Type As Integer

    Public Sub New(ByVal text As String, ByVal type As Integer)
        Me.Text = text
        Me.Type = type
    End Sub

    Public Function Render() As String
        ' Pass Text as an argument so braces in it are not treated as format items.
        Return String.Format("<{0}>{1}</{0}>", Me.Type, Me.Text)
    End Function
End Class
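
One way to give the node more "smarts" about how it is emitted is to let the caller pick an output format. Here is a minimal sketch of that idea. The OutputFormat enum, the FormatAwareNode class and the tag names are all hypothetical and only there for illustration; the sketch also escapes the text with WebUtility.HtmlEncode, which the Node class above does not do.

```vbnet
Imports System
Imports System.Net

' Hypothetical: the formats a node knows how to emit.
Enum OutputFormat
    Xml
    Html
End Enum

Class FormatAwareNode
    Public Text As String
    Public Type As Integer

    Public Sub New(ByVal text As String, ByVal type As Integer)
        Me.Text = text
        Me.Type = type
    End Sub

    ' Picks the markup based on the requested format.
    Public Function Render(ByVal fmt As OutputFormat) As String
        ' Escape characters such as <, > and & so the text cannot break the markup.
        Dim safeText As String = WebUtility.HtmlEncode(Me.Text)

        Select Case fmt
            Case OutputFormat.Html
                Return String.Format("<span class=""type{0}"">{1}</span>", Me.Type, safeText)
            Case Else
                ' XML element names cannot start with a digit, hence the "node" prefix.
                Return String.Format("<node{0}>{1}</node{0}>", Me.Type, safeText)
        End Select
    End Function
End Class

Module NodeDemo
    Sub Main()
        Dim n As New FormatAwareNode("a < b", 2)
        Console.WriteLine(n.Render(OutputFormat.Xml))   ' <node2>a &lt; b</node2>
        Console.WriteLine(n.Render(OutputFormat.Html))  ' <span class="type2">a &lt; b</span>
    End Sub
End Module
```

The parser only has to decide what goes into a node; how it is emitted stays inside the node.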

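Now, whenever I parse "stuff", I can shove it into a node and add it to a collection of nodes. This requires me to alter my general parsing algorithm slightly: instead of writing out each char as it is read, I note the start position, consume the whole run of chars, and create one Node from the substring, so that I don't need to create a new Node for each individual char.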

'OLD PARSING PATTERN:
Sub ParseWords()
    While Not EOF() AndAlso Not IsWhitespaceChar( ReadNextChar() )
        Write( NextChar() )
    End While
End Sub

'NEW PARSING PATTERN ALLOWING FOR NODES:
Private Sub ParseWords()
    ' Assumes ReadNextChar() peeks without advancing and GetNextChar() advances.
    If Not EOF() AndAlso Not IsWhitespaceChar( ReadNextChar() ) Then
        Dim startPos As Integer = Textpos()

        While Not EOF() AndAlso Not IsWhitespaceChar( ReadNextChar() )
            GetNextChar()
        End While

        Dim endPos As Integer = Textpos()
        If startPos < endPos Then
            ' One node per word (type 2) rather than one per char.
            listOfNodes.Add( New Node( str.Substring( startPos, endPos - startPos ), 2 ) )
        End If
    End If
End Sub
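
The helpers used in ParseWords (EOF, ReadNextChar, GetNextChar, Textpos, str and listOfNodes) are not shown in this post. Here is a minimal runnable sketch with assumed implementations for them. The names follow the post, but the bodies are my guesses. SkipWhitespace is a hypothetical extra routine that makes sure the outer loop always advances.

```vbnet
Imports System
Imports System.Collections.Generic

Class Node
    Public Text As String
    Public Type As Integer

    Public Sub New(ByVal text As String, ByVal type As Integer)
        Me.Text = text
        Me.Type = type
    End Sub

    Public Function Render() As String
        Return String.Format("<{0}>{1}</{0}>", Me.Type, Me.Text)
    End Function
End Class

Module WordParserSketch
    ' Assumed parser state: the input text, the current position and the node list.
    Private str As String = "parse these words"
    Private pos As Integer = 0
    Private listOfNodes As New List(Of Node)()

    Private Function EOF() As Boolean
        Return pos >= str.Length
    End Function

    Private Function Textpos() As Integer
        Return pos
    End Function

    ' Peeks at the next char without consuming it.
    Private Function ReadNextChar() As Char
        Return str(pos)
    End Function

    ' Consumes and returns the next char.
    Private Function GetNextChar() As Char
        Dim c As Char = str(pos)
        pos += 1
        Return c
    End Function

    Private Function IsWhitespaceChar(ByVal c As Char) As Boolean
        Return Char.IsWhiteSpace(c)
    End Function

    ' Hypothetical helper: consumes the whitespace between words.
    Private Sub SkipWhitespace()
        While Not EOF() AndAlso IsWhitespaceChar(ReadNextChar())
            GetNextChar()
        End While
    End Sub

    ' Consumes one run of non-whitespace chars and stores it as a single node.
    Private Sub ParseWords()
        Dim startPos As Integer = Textpos()
        While Not EOF() AndAlso Not IsWhitespaceChar(ReadNextChar())
            GetNextChar()
        End While
        Dim endPos As Integer = Textpos()
        If startPos < endPos Then
            listOfNodes.Add(New Node(str.Substring(startPos, endPos - startPos), 2))
        End If
    End Sub

    Sub Main()
        ' Every pass consumes either a word or whitespace, so the loop terminates.
        While Not EOF()
            ParseWords()
            SkipWhitespace()
        End While

        For Each n As Node In listOfNodes
            Console.WriteLine(n.Render())
        Next
        ' Prints:
        ' <2>parse</2>
        ' <2>these</2>
        ' <2>words</2>
    End Sub
End Module
```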

Now, when I need to output the contents I can call upon the nodes in my list like so...

Private Sub Render()
    For Each n As Node In listOfNodes
        lbl.Text &= n.Render()
    Next
End Sub
