Chris Hollander in his weblog post RSS emergence wrote:
This may just be one of those history making posts....
Scott Guthrie:
I feel somewhat guilty that the www.asp.net website doesn't publish its content yet via RSS (examples: article of the day, new control gallery submissions, frequent posts on the forums, etc). We need to get that changed (I'm putting it on my long list of things todo).
First, a few dozen of us are blogging. Then, an interesting site or two (or,dare I say, three?) start producing RSS. Sooner or later, other sites might catch on.
Scott Guthrie in his weblog post New Blog Aggregator... wrote:
Frankly I wasn't aware that all these "traditional" web sites provided RSS feeds of their articles and content. This really opened my eyes to the power of RSS (I feel like something of a luddite saying that -- but oh well). Unfortunately some of the sites I frequent still aren't RSS enabled -- but hopefully this will change soon.
Mark Pilgrim wrote in the first installment of his Dive into XML column on XML.com:
"RSS is a format for syndicating news and the content of news-like sites, including major news sites like Wired,
news-oriented community sites like Slashdot, and personal weblogs. But it's not just for news.
Pretty much anything that can be broken down into discrete items can be syndicated via RSS:
the "recent changes" page of a wiki, a changelog of CVS checkins, even the revision history of a book.
Once information about each item is in RSS format, an RSS-aware program can check the feed for changes
and react to the changes in an appropriate way."
Well, Mark is right. In my opinion, the main strength of RSS lies in the fact that it is an XML-based format. Consequently, RSS can be manipulated programmatically, and this makes the difference. Mining content from general web sites programmatically is hard at best, although even that can eventually be done (if you have a C# Today subscription, take a look at this interesting case study: .NET Web Data Toolkit).
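To illustrate what "manipulated programmatically" can mean, here is a minimal sketch in Python that reads the items out of an RSS 2.0 document using only the standard library. The feed text, the example.com URLs, and the function name read_items are made up for illustration; a real feed may use a different RSS version or extra namespaces that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, used only for illustration.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>A made-up feed used only for illustration</description>
    <item>
      <title>First post</title>
      <link>http://example.com/posts/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/posts/2</link>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return a list of (title, link) pairs, one per item in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    # In RSS 2.0 the items live under rss/channel/item.
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in read_items(SAMPLE_RSS):
        print(title, link)
```

Because the structure is plain XML, a few lines are enough to get at the titles and links. Doing the same against arbitrary HTML usually takes site-specific guesswork.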
Chris notes (see above) that even the main MSDN web site may have an RSS feed in the future. Well, MSDN has already had an RSS newsfeed for some time now. The MSDN download RSS newsfeed is very popular as well. Maybe you know Keith Ballinger. For some reason he does not have an RSS newsfeed on his weblog. But no problem, Sam Ruby kindly created one for him. And there are many other similar examples too. For example, if you would like to have an RSS newsfeed for some CNN or BBC news (by category), you may look here.
It seems that creating an RSS newsfeed for something that has some internal structure is not a big problem. The RSS format is published, and manually discovering the elementary structure of the HTML source for a given site should be easy too. Then just get the web page source programmatically (with a normal HTTP GET, say using C# and the WebRequest class), do some elementary parsing (say using regular expressions; there is no need to write a full-fledged parser), collect the needed info, and produce the XML output of the new feed. Finally, set this code to run repeatedly on some web site and thereby provide the RSS feed.
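The steps above can be sketched roughly as follows. This sketch uses Python rather than C#, and the markup pattern is purely an assumption: it pretends each headline on the page looks like <a class="headline" href="...">...</a>. A real site would need its own regular expression, found by reading its HTML source. The names HEADLINE_RE and html_to_rss are hypothetical too.

```python
import html
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: assumes every headline on the page looks like
# <a class="headline" href="URL">TITLE</a>. A real site needs its own pattern.
HEADLINE_RE = re.compile(r'<a class="headline" href="([^"]+)">([^<]+)</a>')


def html_to_rss(page_source, site_title, site_url):
    """Build an RSS 2.0 document (as a string) from headline links in page_source."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Unofficial feed for " + site_title
    for href, title in HEADLINE_RE.findall(page_source):
        item = ET.SubElement(channel, "item")
        # Decode HTML entities such as &amp; so they are not escaped twice.
        ET.SubElement(item, "title").text = html.unescape(title.strip())
        ET.SubElement(item, "link").text = html.unescape(href)
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    page = '<a class="headline" href="http://example.com/a">Alpha</a>'
    print(html_to_rss(page, "Example Site", "http://example.com/"))
```

The page source itself could be fetched with the standard library's urllib.request.urlopen, and the script run on a schedule to keep the feed fresh. Both are left out here so the sketch runs without a network connection.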
The only question that comes to mind here is what happens if the original news producer does not like someone else creating a newsfeed for them. Maybe some HTML source obfuscators will appear shortly :o)).
Another idea that quickly comes to mind is how to process all these RSS newsfeeds at large. We probably all have some aggregator app to manage our RSS subscriptions, but how many RSS newsfeeds are you able to subscribe to, manage, and read? News aggregators are definitely relatively new tools and there is plenty of room for further development, but there will still be a limit to how many RSS newsfeeds you can manage yourself. There are already thousands of web sites with an RSS newsfeed (either from the original source or from elsewhere), and this number is increasing rather quickly.
Naturally, some engines have popped up that can collect RSS feeds for you. Just look at News4Sites, NewsIsFree, or Syndic8.
Collecting content from web sites on a regular basis is definitely not a new concept. Google does the same for web sites after all, doesn't it? Google even tries to do the same for news. Check the Google News web site.
Automatically collecting all available RSS newsfeeds and providing some sophisticated services on top of the collected information looks very interesting. One such service may be some kind of intelligent search (maybe including discovery of related links too). Another is dynamically indicating the popularity of currently discussed topics, gathering data about these topics from weblog sources, and serving it. These are just two ideas out of many. Look at projects like DayPops or Organica. By the way, a few days ago Google bought Pyra Labs, the creator of the popular Blogger.com weblogging system.
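As a rough illustration of the popularity idea, one simple approach is to count how many different weblogs link to the same URL. The sketch below assumes the post bodies have already been collected from the feeds. It is only my guess at a naive technique, not a description of how DayPops or Organica actually work, and the names LINK_RE and rank_links are hypothetical.

```python
import re
from collections import Counter

# Naive pattern for absolute links inside post bodies (an assumption; real
# HTML may quote or format attributes differently).
LINK_RE = re.compile(r'href="(https?://[^"]+)"')


def rank_links(posts_by_feed):
    """Return (url, feed_count) pairs, most widely linked first.

    posts_by_feed maps a feed name to a list of post bodies (HTML strings).
    Each feed counts at most once per URL, so one weblog linking to the same
    page many times does not inflate its popularity.
    """
    counts = Counter()
    for feed_name, posts in posts_by_feed.items():
        linked = set()
        for body in posts:
            linked.update(LINK_RE.findall(body))
        counts.update(linked)
    return counts.most_common()


if __name__ == "__main__":
    sample = {
        "weblog-a": ['<a href="http://example.com/story">a story</a>'],
        "weblog-b": ['Also see <a href="http://example.com/story">this</a>.'],
    }
    for url, n in rank_links(sample):
        print(n, url)
```

Counting distinct feeds rather than raw links is a deliberate choice here: it measures how widely something is discussed, not how loudly one site discusses it.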
And all this is probably only the start. There will be changes. The Web is changing. The way we process information is changing. People are changing. The world is changing.