February 2009 - Posts

Using Lightweight Automation Framework to parse a web page
Friday, February 13, 2009 12:35 AM

 This post is meant to illustrate some capabilities of the Lightweight Test Automation Framework.

Suppose I want to create a small application that displays the latest posts that were made to our forum: http://forums.asp.net/1193.aspx. I would like to issue a WebRequest to the forum, parse the HTML, and find the titles of all the posts on the main page. There are probably lots of libraries for parsing HTML content, but I'll show how you can use our framework to accomplish this.

1. The first thing to do is to become familiar with the HTML page that you want to parse. In this case, I navigated to the forum and, using a DOM inspector, saw that all the links to the posts are inside table rows whose class attribute is set to "CommonListRow".

[Image: Forum Page]

2. Next, create a new Console application and reference Microsoft.Web.Testing.Light.dll.

3. Make a request to the server (in my example, I use the System.Net.WebRequest class).

4. Use the static HtmlElement.Create(string html) to parse the response into an HtmlElement.

5. Use the common API to find the elements that you need.

Here is the source code:

using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;
using Microsoft.Web.Testing.Light; // namespace assumed to match the referenced assembly

class Program
{
    static void Main(string[] args)
    {
        // Create a request for the URL.
        WebRequest request = WebRequest.Create("http://forums.asp.net/1193.aspx");

        string responseFromServer;

        // Get the response, the stream containing the content returned by the
        // server, and a StreamReader for easy access. The using blocks close
        // all three, even if reading throws.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream dataStream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(dataStream))
        {
            // Read the content.
            responseFromServer = reader.ReadToEnd();
        }

        // Remove the DOCTYPE
        responseFromServer = Regex.Replace(responseFromServer, @"\<\!DOCTYPE.*?\>", String.Empty);

        // Load the response into an HtmlElement
        HtmlElement rootElement = HtmlElement.Create(responseFromServer);

        // Find all the post rows
        HtmlElementFindParams findParams = new HtmlElementFindParams();
        findParams.TagName = "tr";
        findParams.Attributes.Add("class", "CommonListRow");

        foreach (HtmlElement tableRow in rootElement.ChildElements.FindAll(findParams))
        {
            // Find the first link within the row
            HtmlAnchorElement link = (HtmlAnchorElement)tableRow.ChildElements.Find("a", 0);

            // Display the title
            Console.WriteLine("\"{0}\"", link.CachedInnerText);

            // Display the link
            Console.WriteLine("\thttp://forums.asp.net{0}\n", link.CachedAttributes.HRef);
        }
    }
}
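
The program above builds each post URL by prefixing the href with the forum host, which works as long as every href is root-relative. As a hedged alternative, here is a minimal sketch that resolves an href against the page URL with System.Uri instead. The LinkResolver class, its Resolve method, and the sample hrefs are made up for illustration and are not part of the framework.

```csharp
using System;

static class LinkResolver
{
    // Hypothetical helper: resolves an href found in a page against the
    // URL of the page it came from. Absolute hrefs are returned unchanged.
    public static string Resolve(string pageUrl, string href)
    {
        return new Uri(new Uri(pageUrl), href).AbsoluteUri;
    }

    static void Main()
    {
        const string page = "http://forums.asp.net/1193.aspx";

        // Illustrative hrefs only, not real forum posts.
        Console.WriteLine(Resolve(page, "/t/1234.aspx"));         // http://forums.asp.net/t/1234.aspx
        Console.WriteLine(Resolve(page, "http://example.com/a")); // http://example.com/a
    }
}
```

In the loop, the second Console.WriteLine could then print the result of Resolve for the request URL and link.CachedAttributes.HRef rather than a hard-coded prefix.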

A couple of things to notice:

  • The original response from the server contains a <!DOCTYPE> directive before the main <html> tag. The markup passed to HtmlElement.Create must have a single root tag. In this case the parser sees two top-level tags (the DOCTYPE and the HTML tag) and would fail if we didn't remove the DOCTYPE.
  • Notice the use of HtmlElementFindParams to locate all table rows that have a specific class.
  • Notice the use of the strongly typed HtmlAnchorElement to quickly access its HRef property.
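
The DOCTYPE removal can be tried on its own, without the framework. The sketch below is a slightly more defensive variant of the same Regex.Replace call, and it rests on an assumption: that some pages may write the declaration in lowercase or wrap it across lines, which the pattern in the program would not match because "." stops at a line break by default. The DoctypeStripper class and StripDoctype method are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class DoctypeStripper
{
    // Hypothetical helper (not part of the framework): removes a DOCTYPE
    // declaration so that <html> is the only top-level tag left.
    public static string StripDoctype(string html)
    {
        // IgnoreCase also matches "<!doctype ...>"; Singleline lets '.'
        // match line breaks when the declaration is wrapped.
        return Regex.Replace(html, @"<!DOCTYPE.*?>", String.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline).TrimStart();
    }

    static void Main()
    {
        string page =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n" +
            "  \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n" +
            "<html><body></body></html>";

        Console.WriteLine(StripDoctype(page)); // <html><body></body></html>
    }
}
```

The cleaned string is what would be handed to HtmlElement.Create in the program above.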

 

Here is the console output when I run the program:

[Image: Output]

Hopefully this post has shown you some of the not-so-obvious things that you can do with the Lightweight Test Automation Framework.

Federico Silva Armas
ASP.NET QA Team

 

Lightweight Test Automation Framework Source Released!
Wednesday, February 11, 2009 6:00 PM

The ASP.NET QA team has been working to make what we do and our processes more visible to our customers. Two major things we have done toward this are releasing the Lightweight Test Automation Framework binaries with a sample on Codeplex.com and starting our ASP.NET QA Blog. Well, over the last week we reached another milestone by releasing the source to the Lightweight Test Automation Framework on Codeplex.com. This is the first time we have tried to do something like this on the ASP.NET QA team, and we are very excited. Visit Codeplex.com and check it out for yourself!

Items Updated in the February Update of the Lightweight Test Automation Framework include

Thanks go out to all those who have helped to get this release out the door, with a special thanks to the lead developer Federico Silva Armas.
