Wimdows.NET

Wim's .NET blog

Publicly available data - FREE screen scraping or pay for the API

Been working on a semi-commercial pet project of mine, for which I need a data feed.

A decent enough subset of this data feed is publicly available from the content provider's main website. However, the full dataset (though I won't need all of it) is available through an HTTP GET XML API, for a flat fee of over 500 dollars per year.

What would you do? 1) Roll it yourself in about 20 lines of .NET code (using HttpWebRequest and Regex) and scrape it; 2) Pay for the API...?

Needless to say, I went for 1)... if only for the fun of it.
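
For a rough idea of what option 1) involves, here is a minimal sketch. It is written in Python rather than .NET to keep it short; the .NET equivalents would be HttpWebRequest and Regex. The markup, the `name`/`price` fields and the function names are all made up for illustration, since the real provider and page layout aren't named here.

```python
import re
import urllib.request

# Hypothetical markup, assumed for illustration only:
# <tr><td class="name">Widget</td><td class="price">12.50</td></tr>
ROW_PATTERN = re.compile(
    r'<td class="name">(?P<name>[^<]+)</td>\s*'
    r'<td class="price">(?P<price>\d+(?:\.\d+)?)</td>'
)


def fetch_html(url, timeout=10):
    """Download the page and decode it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def parse_rows(html):
    """Extract (name, price) pairs from the page markup."""
    return [
        (match.group("name").strip(), float(match.group("price")))
        for match in ROW_PATTERN.finditer(html)
    ]
```

Keeping the download and the parsing in separate functions means the regex can be checked against a saved copy of the page without hitting the site each time.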
Posted: Jan 23 2006, 08:42 PM by Wim | with 2 comment(s)

Comments

Mischa Kroon said:

9 times out of 10 it's just easier to do the screen scraping thing.

Anything that costs money you have to ask your boss for a make / buy decision.

It takes time etc.

Or you can just do it yourself, not convince the boss and get it done in about 1/2 of the time.

When the regexes break because of a redesign, you're screwed though :P

So I would probably go with:

if nonessential data then
regex :P
end if
# January 24, 2006 4:04 AM

Wim Hollebrandse said:

Mischa,

I'd say the same. And yes, when there's a redesign, your regexes break, but that can easily be detected.

Also - the mechanism I'm using stores this data in an XML file, and when it cannot retrieve the HTTP update properly (i.e. an exception occurred), it uses the last successful feed data and notifies me of a problem.
# January 24, 2006 5:04 AM
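
The fallback described in the comment above could look roughly like the following Python sketch. It is not Wim's actual .NET code: the XML layout, the function names and the `notify` hook are assumptions. Treating an empty scrape result as a failure is also an assumption, added here as one simple way to detect a redesign that silently breaks the regexes.

```python
import logging
import xml.etree.ElementTree as ET


def save_feed(rows, path):
    """Write (name, price) rows to an XML cache file (layout is assumed)."""
    root = ET.Element("feed")
    for name, price in rows:
        item = ET.SubElement(root, "item", name=name)
        item.text = str(price)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_feed(path):
    """Read the rows back from the XML cache file."""
    root = ET.parse(path).getroot()
    return [(item.get("name"), float(item.text)) for item in root.iter("item")]


def refresh_feed(scrape, path, notify=logging.warning):
    """Try a fresh scrape; on any failure fall back to the cached data.

    `scrape` is any callable returning a list of (name, price) rows.
    If no cache file exists yet, the fallback itself raises.
    """
    try:
        rows = scrape()
        if not rows:
            # Assumption: zero rows means the page layout changed.
            raise ValueError("scrape returned no rows")
    except Exception as exc:
        notify("Feed update failed, using cached data: %s" % exc)
        return load_feed(path)
    save_feed(rows, path)
    return rows
```

Passing the scraper in as a callable keeps the fallback logic independent of how the page is fetched, and `notify` can be swapped for an email or any other alert.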