Contents tagged with Web Search with .NET
-
HashCode
Several years ago, I wrote my own little hash code for my web search spider. I checked its performance on my 64-bit system, and to put it mildly, it was horrible. I was talking to Dave Wanta this week and he reminded me of the hashing support in .NET. I had completely forgotten about it. I yanked out my hash code and used the hashing in .NET instead, and I got better performance when inserting into the database. I believe that my code did not provide a good enough spread of values to make effective use of the database indexes that I had set up. By going to the .NET hash support, I think I got a better spread for my indexing system and got rid of a database hotspot. Right now, I am trying to see how many records it will support.
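To make the idea concrete, here is a minimal sketch of hashing a URL into a 64-bit key with the framework's built-in hashing. The post does not say which .NET hashing API was used, so the choice of MD5 from System.Security.Cryptography, the `UrlHash` class, and the `ToInt64` method are all assumptions for illustration only.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: not the spider's actual code.
public static class UrlHash
{
    // Returns a 64-bit key taken from the first 8 bytes of the URL's MD5 digest.
    // MD5 is used here only to spread values evenly, not for security.
    public static long ToInt64(string url)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(url));
            return BitConverter.ToInt64(digest, 0);
        }
    }
}
```

A digest-based key like this is stable across processes and machines, which matters when the value is stored in a database. `String.GetHashCode`, by contrast, is documented as not guaranteed to be stable across .NET versions or between 32-bit and 64-bit runtimes, so it is a poor fit for a persisted key.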
-
64 bits in .NET 2.0
I put my Web Spider code onto my x64 system. Everything is managed code, so I was able to just start running it. The lesson seems to be that if you write 100% managed code, it has a pretty good chance of running in either 32 bits or 64 bits. I always hate to say something will run 100% of the time, but my guess is that the odds are good. Perhaps someone from MS can post a comment about the pitfalls of 32 vs. 64 bits in .NET 2.0.
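If you want to confirm which mode the same managed binary is actually running in, one simple check in .NET 2.0 is the pointer size. This is only a sketch; the `Bitness` class and `Describe` method are made-up names.

```csharp
using System;

// Hypothetical helper for reporting the bitness of the current process.
public static class Bitness
{
    // IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit process.
    public static string Describe()
    {
        return IntPtr.Size == 8 ? "64-bit" : "32-bit";
    }
}
```

Calling `Bitness.Describe()` at startup and logging the result is an easy way to see whether the code picked up the 64-bit runtime. Code that assumes a pointer fits in an `int`, or that calls into 32-bit native DLLs, is where I would expect trouble to start.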
-
Specific Sites
I made some changes to my spider to only search specific sites, as opposed to just going out there and searching the Web Graph. It was fairly simple. All I had to do was change a couple of stored procs.
-
Back to playing with my Web Search Spider
Appropriate that this is my first post in 2006. I spent several hours this evening getting my web search spider back up and going. I have it running on .NET 2.0 with Sql Server 2005. I need to go back and make a stored procedure change or two, and then it will be ready for me to scale it back up. I have it running on my 32-bit laptop right now. I am going to get it going in Win64 for x64. Things to do:
-
Yahoo Developer Network
Yahoo has created a Developer Network for their search services. It is located at http://developer.yahoo.net/.
-
Full-Text Search Chapter Outline is now completed
I think I have just about completed my outline for the Full-Text Search chapter, and I have written some content. I'll probably have a few changes to the outline, but I like where I am at now. I'm really excited about this. I can honestly see the finish line of this book now.
-
EXISTS vs. COUNT in TSql
Sometimes, programmers will perform something like the below in TSql:
-
MSMQ Processing of multiple pieces of data
You want to process a bunch of data associated with a single MSMQ message. First, create a class with the Serializable attribute.
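As a rough sketch of that first step and one way to use it, the class below bundles several fields into a single message body and sends it with a binary formatter. The `PageWork` class, its fields, and the queue path are hypothetical, and the queue is assumed to already exist. System.Messaging ships with the .NET Framework and needs a reference to System.Messaging.dll.

```csharp
using System;
using System.Messaging; // .NET Framework only; reference System.Messaging.dll

// Hypothetical payload: several pieces of data carried in one MSMQ message.
[Serializable]
public class PageWork
{
    public string Url;
    public int Depth;
    public DateTime QueuedAt;
}

public static class PageWorkQueue
{
    // Assumed private queue path; the queue must already exist.
    const string QueuePath = @".\Private$\PageWork";

    public static void Send(PageWork work)
    {
        using (MessageQueue queue = new MessageQueue(QueuePath))
        {
            // BinaryMessageFormatter serializes any [Serializable] object graph.
            queue.Formatter = new BinaryMessageFormatter();
            queue.Send(work);
        }
    }

    public static PageWork Receive()
    {
        using (MessageQueue queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new BinaryMessageFormatter();
            // Throws MessageQueueException if nothing arrives within 5 seconds.
            using (Message message = queue.Receive(TimeSpan.FromSeconds(5)))
            {
                return (PageWork)message.Body;
            }
        }
    }
}
```

The sender and the receiver must use the same formatter, and both need access to the `PageWork` type so the body can be deserialized.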
-
MSMQ processing of a string
Here is some sample code that processes messages that are in an MSMQ queue (isn't that redundant?). It pulls a string from a defined queue and does something with it. Behind the scenes, it uses the threadpool to process messages in the queue.
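A minimal sketch of that pattern, assuming the asynchronous `BeginReceive` / `ReceiveCompleted` API of System.Messaging, might look like the following. The `StringQueueListener` class and the queue path passed in are hypothetical, and the "does something" step is just a console write.

```csharp
using System;
using System.Messaging; // .NET Framework only; reference System.Messaging.dll

// Hypothetical listener that pulls string messages off a queue.
public static class StringQueueListener
{
    // Starts an asynchronous receive loop on an existing queue.
    public static MessageQueue Start(string queuePath)
    {
        MessageQueue queue = new MessageQueue(queuePath);
        // Tell the formatter that message bodies are strings.
        queue.Formatter = new XmlMessageFormatter(new Type[] { typeof(string) });
        queue.ReceiveCompleted += OnReceiveCompleted;
        queue.BeginReceive(); // the completion handler runs on a thread pool thread
        return queue;
    }

    static void OnReceiveCompleted(object sender, ReceiveCompletedEventArgs e)
    {
        MessageQueue queue = (MessageQueue)sender;
        try
        {
            using (Message message = queue.EndReceive(e.AsyncResult))
            {
                string text = (string)message.Body;
                Console.WriteLine("Processing: " + text); // do the real work here
            }
        }
        catch (MessageQueueException ex)
        {
            Console.Error.WriteLine("Receive failed: " + ex.Message);
        }
        queue.BeginReceive(); // wait for the next message
    }
}
```

Calling `StringQueueListener.Start(@".\Private$\SpiderUrls")` (an assumed queue name) returns immediately; keep the returned `MessageQueue` alive for as long as you want messages processed, and close it to stop.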
-
Optimizing Indexes with Sql Server 2005 (Yukon) Beta 2
If you wanted to optimize your indexes in Sql Server 2000, you probably used the DBCC DBREINDEX or DBCC INDEXDEFRAG commands. I like DBCC INDEXDEFRAG because I don't have to take the table offline from the application. I've been looking for a similar mechanism in Sql Server 2005 (Yukon), and I have found the ALTER INDEX command. Its advantage over DBCC INDEXDEFRAG is that with DBCC INDEXDEFRAG you must manually specify each index that you want to defrag, while ALTER INDEX can work on all of the indexes on a table in one statement. Why am I looking for this? According to the documentation, DBCC INDEXDEFRAG will be removed from a future version of Sql Server.
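As an illustration, the statement for defragmenting every index on a table can be built like this. The `IndexMaintenance` class, its method names, and the `dbo.Pages` table in the usage note are hypothetical; `ALTER INDEX ALL ON ... REORGANIZE` is the Sql Server 2005 syntax.

```csharp
using System;

// Hypothetical helper that builds the T-SQL for reorganizing all indexes on a table.
public static class IndexMaintenance
{
    // REORGANIZE is the online operation that replaces DBCC INDEXDEFRAG.
    public static string BuildReorganizeAll(string schema, string table)
    {
        return "ALTER INDEX ALL ON " + Quote(schema) + "." + Quote(table) + " REORGANIZE;";
    }

    // Bracket-quotes an identifier, doubling any closing bracket inside it.
    static string Quote(string identifier)
    {
        return "[" + identifier.Replace("]", "]]") + "]";
    }
}
```

`IndexMaintenance.BuildReorganizeAll("dbo", "Pages")` produces `ALTER INDEX ALL ON [dbo].[Pages] REORGANIZE;`, which can be run with `SqlCommand.ExecuteNonQuery` or pasted into a query window. Swapping `REORGANIZE` for `REBUILD` gives the counterpart of DBCC DBREINDEX.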