ASP.NET Weblogs

Brian Desmond's Blog

Inherits Network.Admin
Implements IOneManBand

Comics RSS Feed & One Cool Component

The folks over at www.webzinc.net sent me an email a few weeks ago with a key for their WebZinc.Net product. They told me to try it and, if I wanted to, write something about it over at www.aspalliance.com/bdesmond. At the time, I did not have the faintest clue what I could do with a bona fide screen scraper. Fortunately for me, Scott Adams (www.dilbert.com) and Chris Sells (www.sellsbrothers.com) took care of that.

Scott Adams has decided that he wants money in order to email me a Dilbert strip every day. Chris Sells blogged about RSS feeds for comics.com. Well, I like reading Dilbert when I open Outlook every morning, and I thought Chris Sells' idea was cool. WebZinc is really easy to use - it comes with a utility that opens a web page and shows you the index of every element on the page in WebZincSpeak. I opened up www.dilbert.com and found that the daily Dilbert strip is, in fact, Image #63. So, after some HttpModule magic, I got WebZinc set up to grab the full URL of Image #63 every day and check whether it is already in my database of comics (to prevent duplicates). If not, the system adds the strip to the database, thereby making it available to the HttpHandler that generates the RSS.
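To make the "grab image N, skip it if we already have it" step concrete, here is a rough sketch in Python rather than WebZinc, whose API is not shown in this post. Everything named here (`ImageCollector`, `nth_image_url`, `store_if_new`, the `strips` table, and SQLite standing in for the real database) is made up for illustration. I am also assuming a 0-based index and that images without a `src` are not counted, which may well differ from how WebZinc numbers things.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Images without a src attribute are skipped (an assumption of this sketch).
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def nth_image_url(html, base_url, index):
    """Return the absolute URL of the image at a 0-based index, or None."""
    collector = ImageCollector()
    collector.feed(html)
    if index < 0 or index >= len(collector.sources):
        return None
    # Resolve relative paths against the page URL to get the full image URL.
    return urljoin(base_url, collector.sources[index])


def store_if_new(conn, feed_id, url):
    """Insert the strip unless it is already stored; return True if it was added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS strips ("
        "feed_id TEXT, url TEXT, UNIQUE(feed_id, url))"
    )
    # The UNIQUE constraint plus OR IGNORE makes the duplicate check atomic.
    cur = conn.execute(
        "INSERT OR IGNORE INTO strips (feed_id, url) VALUES (?, ?)",
        (feed_id, url),
    )
    conn.commit()
    return cur.rowcount == 1
```

Relying on a fixed image index is fragile: if the page layout changes, the index points at a different image, so a real version would probably want a sanity check on the URL it gets back.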

In a couple of hours this evening, I put a solution together using WebZinc, an HttpModule, an HttpHandler, SQL Server 2000, threading, and an RSS library I found on www.gotdotnet.com. The HttpModule handles the BeginRequest event and starts a new thread. This thread goes into the database and checks whether the current time is later than the last update plus the number of hours between new comics. If it is, the thread queues a work item into the thread pool, which instantiates WebZinc, grabs the comic strip URL, and logs everything in the database. The HttpHandler takes a FeedID query string parameter and, based on that, goes to the database (or cache), grabs the top 10 comic strips, and generates an RSS feed using the RSS library I found on GDN.
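The flow described above can be sketched roughly as follows. This is Python, not the ASP.NET code I actually wrote, and it collapses the "start a thread, then queue a work item" step into a single hand-off to a thread pool. The names `update_due`, `maybe_queue_scrape`, and `build_rss`, the shape of the `feed` dictionary, and the use of the standard library's XML module instead of the GotDotNet RSS library are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta


def update_due(last_update, hours_between, now):
    """True when `now` is past the last update plus the refresh interval."""
    return now > last_update + timedelta(hours=hours_between)


def maybe_queue_scrape(pool, feed, now, scrape):
    """Queue `scrape(feed)` on the pool if the feed is stale; otherwise return None."""
    if update_due(feed["last_update"], feed["hours_between"], now):
        return pool.submit(scrape, feed)
    return None


def build_rss(title, link, strips, limit=10):
    """Render the newest `limit` strips as an RSS 2.0 document string.

    `strips` is a list of (published datetime, image URL) tuples.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Latest strips for {title}"
    # Newest first, capped at `limit`, mirroring the "top 10" query.
    newest_first = sorted(strips, key=lambda s: s[0], reverse=True)[:limit]
    for published, url in newest_first:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{title} for {published:%Y-%m-%d}"
        ET.SubElement(item, "link").text = url
        ET.SubElement(item, "guid").text = url
    return ET.tostring(rss, encoding="unicode")
```

A caller would create one `ThreadPoolExecutor` for the lifetime of the application and call `maybe_queue_scrape` on each incoming request, so the request itself never waits on the scrape.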

Once I get the system debugged and beautified, I'll put it up on my server and provide a way to subscribe to a comic strip in your RSS aggregator. If you'd like to test the system in its current state, leave a comment below which includes: your email address, your favorite comic strip at www.comics.com (along with the link to it, and how often a new strip is posted), and one or two sentences describing why you'd like to test it. I will, at my discretion, dole out the current URL.

If you're looking for a high-powered screen scraper that does a lot of other stuff (FTP client, server, HTTP browser control...), visit http://www.webzinc.net. They've got a free trial there. Since they gave me a key and I found a use for it, I'll probably write something up at http://www.aspalliance.com/bdesmond.

Published Aug 04 2003, 01:02 AM by bdesmond

Comments

 

Chris Pirillo said:

Definitely keep me posted on this one...
August 9, 2003 2:21 AM
 

dwlt said:

Hi there,

I already have a variety of feeds over at http://dwlt.net/tapestry/, including Dilbert.

August 11, 2003 10:40 AM
 

Ravis said:

Try Comic Alert ( http://www.comicalert.com/ ) for a ton of comics. Plus they allow customized feeds. Funky.
June 17, 2004 5:15 AM
