New Search Pet Project - NNTP Newsgroup Spider and Search

So, I started thinking about other things that I can search besides the web. This time, I actually thought of something useful. Another pet project of mine has been newsgroups. While my code and database for the Web Search were stuck in Seattle, I started working on a similar project to search NNTP newsgroups. I went looking for NNTP communications libraries and found one called Smilla, along with its accompanying code. Currently, all my code does is connect to an NNTP server and retrieve the newsgroups themselves. I just started working today on retrieving the posted articles.

This will be very interesting. In some ways, it will be similar to the Web Search. In other ways, it will be very different. Using some of the tricks I learned in the Web Search code, I am planning on spawning a thread for each newsgroup, dumping articles into an MSMQ queue, and then picking them up to be inserted into the database. Hopefully, this will be easier than the Web Search, because the data should not grow continually the way it does with the Web Search. It will be quite fun to see how things go. More coming soon...
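To make the plan a little more concrete, here is a rough sketch of the thread-per-newsgroup idea. It is written in Python purely for illustration and is not the project's actual code. Everything in it is an assumption: `fetch_articles` is a hypothetical stub instead of a real NNTP call, an in-process `queue.Queue` stands in for the MSMQ queue, and an in-memory SQLite table stands in for the real database.

```python
import queue
import sqlite3
import threading

# Sentinel a worker puts on the queue when its newsgroup is finished.
DONE = object()


def fetch_articles(group):
    """Hypothetical stub: a real spider would ask the NNTP server for these."""
    return [(group, f"<{i}@{group}>", f"Subject {i} in {group}") for i in range(3)]


def spider_group(group, out_queue):
    """Worker: one thread per newsgroup, pushing each article onto the queue."""
    try:
        for article in fetch_articles(group):
            out_queue.put(article)
    finally:
        # Always signal completion, even if the fetch fails part way through.
        out_queue.put(DONE)


def run_pipeline(groups):
    articles = queue.Queue()          # stands in for the MSMQ queue
    db = sqlite3.connect(":memory:")  # stands in for the real database
    db.execute(
        "CREATE TABLE articles ("
        "newsgroup TEXT, message_id TEXT PRIMARY KEY, subject TEXT)"
    )

    workers = [
        threading.Thread(target=spider_group, args=(group, articles))
        for group in groups
    ]
    for worker in workers:
        worker.start()

    # Consumer: pick articles off the queue and insert them until every
    # worker has reported that it is done.
    remaining = len(workers)
    while remaining:
        item = articles.get()
        if item is DONE:
            remaining -= 1
            continue
        db.execute("INSERT OR IGNORE INTO articles VALUES (?, ?, ?)", item)

    for worker in workers:
        worker.join()
    db.commit()
    return db
```

Calling `run_pipeline(["comp.lang.python", "alt.test"])` returns a connection whose `articles` table holds the stub articles for both groups. The point of the queue is that the spider threads never touch the database directly; only the single consumer does the inserts.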

Wally
