September 2007 - Posts
i have switched from my previous blog , http://geekswithblogs.net/burncsharp . Thanks to Joe Stagner for setting up the blog for me. i am still importing some of the posts from my previous blog, also from now and on, this wll be my blogging home.
The things that i have already noticed in weblogs (community server) that it is neat , have more options for adds and layout management. The best thing, which i really missed in my previous blog is on-the-fly preview option for theme switching , now i dont have to change the theme go back to my page and see the effect, it is there right away. Also, i have the full control over fonts and layout and even i can import custom layout. Finally, the ajax - y experince makes it out of the box.
I will post here mostly about asp.net, LINQ and bit sql server and all the current things i am working on. See ya :-)
With the advent of web , search is everywhere. To make every contentful application really useful for users, we need to have a good search capablity inour app to get the right content when users ask for it. My focus here , is to create a pretty simple search solution using the "Full-Text" capability of Sql Server 2005.
Some key features that i belive to watch out.
- When searching with multiple words, it should not get useless inofrmation. For ex . when we search "asp.net ajax", it should not return content having "php ajax".
- Low on CPU usage.
- Chances of index hit is pretty high.
- Does have a word index methods for which searching with same keyword is faster.
Some sqls like we always do
Select *
from Person where [Firsname] like '%mehfuz%' or [LastName] like '%hossain%' First of all , this search is high on IO and almost neve hits an Index , moreover , it high on cpu usage and really slow.
FullText search on the other hand, ensures that your query hits index, low on logical reads and also faster for searching same keyword. Bascially , To enable fulltext , we need to create a Categlog, that maintains a dictionary for searched keywords, thats why , similar search is always faster.
Now, lets dig into full-text. I have divided this up into some steps, step 1 , 2, and 3, can be done manullay DB->Stroage - > FullTextCatelogs - > Create New.
Step 1 : Full-Text Initialization
sp_fulltext_database 'enable' - intializes the database for full text search.
Step 2 : Catelog Creation
CREATE
FULLTEXT CATALOG <catelogName> IN
PATH N'C:\Database\' WITH
ACCENT_SENSITIVITY = ON AUTHORIZATION
[dbo] This will create the catelog file under - > DB - > storage -> FullTextCatelog, Accent_sensitivity means Café(French) and Cafe(English) will be treated differently.
Step 3 : Index Creation
Add the table and the key(Primay Key, that will be used to track down results) , on which the the search will be made.
CREATE FULLTEXT INDEX ON [dbo].[<TableName>] KEY INDEX [<ClusterIndex Name>] ON [<CatelogName>] WITH CHANGE_TRACKING AUTO
Add the column on which the search to be made.
ALTER
FULLTEXT INDEX ON [dbo].[<tableName>] ADD ([<Column on which the search will be made>]) Enable the Index, that will do a re-build operation on sql server for full text.
ALTER
FULLTEXT INDEX ON [dbo].[<tablename>] ENABLE Step 4 : Query and Result
Select
* from <TableName> where Contains([FullTextColumn], 'TV NEAR star') Contains is a full text routine, That accept keyword LIke 'AND', 'OR' , 'NEAR', you can put columns names in Contains, or put an '*' that will do search on all mapped full text columns.
"NEAR' is useful when used with containsTable and we want to sort by Rank
Here is the piece of code that i found from MSDN - bit modified
SELECT
* FROM
tabletobeSearchOn AS FT_TBL INNER JOIN CONTAINSTABLE(tabletobeSearchOn , *, 'tv NEAR star') AS KEY_TBL ON FT_TBL.[ID] = KEY_TBL.[KEY] ORDER
BY KEY_TBL.[Rank] asc Note :The more closer the search keyword will be ,the more higher wil be the rank(asc).
Now, to make the search more better, i would suggest that if you have FirstName, lastName, username, sex, columns to be search on , then combine data of all theses columns and dump them to a single Column, Lets say 'FullTextCOlumn' , whenever a change will occur in the table, use trigger to udpate that column with changed data. In this way ur search will get more accurate if you want to search on discreate columns as a whole, or lets say , if you want to simulate a box model.
What is Box model ? Lets say, if you have 'Box1' with balls marked 'Mehfuz', 'Hossain', 'Male' , 'mehfuz@gmail.com' and 'Box2' with balls marked 'Mehfuz', 'Carlos' , 'Male' , calors@newworld.com'. Now if you search for 'Mehfuz Hossain male' , it should get you Box1, not box 2 although there is 'Mehfuz' in FirstName and 'Male in Sex , column of Box2. To get this done the best solution is to combine columns, in a way that i have just said.
Although, there are plenty of ways to make a search solution , i found the Full-text one more compelling and easy to understand and implement.
Most of the cases we use cursor, for loop like senarios, like calutating total order for an item and put the sum in a specific column, doing some work on particular table on basis of values of some other tables.
Now, cursor is fine for smaller amount of data items, but it becomes really lousy , as the number of items start to grow.
Here, i will discuss about a totally alternative to Cursor which is more faster and effective for working with huge number of data items. In my example , i will work with tables from Sample Database called "AdventureWorks" which you can easily download from MSDN site for Sql server 2005.
Now , My objective is to Calulate the Total amount for sales item from SalesDetail Table and store it in the Sales Table (for each row).
To get this sample work , i have created a sample Column Called TotalAmt in SalesHeader Table, there is already a Column called TotalDue, that stores the agrregatated value of the same calculated sum that i am going to store in this TotalAmt Column, only i didnt want to mess things up for test purpose and as i want to compare my result with the main result, i decided to go for creating a separate column.
Therefore first thing is to create the sample Column, which goes something like this.
alter table Sales.SalesOrderHeader add TotalAmt real not null default 0
Next, i have created the query the using the old cursor method that does the same work which my optimized sql will do.
declare @orderId int
declare @taxAmt real
declare @subTotal real
declare @freight real
declare adventure_cursor cursor for
Select SalesOrderID, SubTotal, TaxAmt, Freight From Sales.SalesOrderHeader
Open adventure_cursor
Fetch next From adventure_cursor
into @orderId , @subTotal, @taxAmt , @freight
while @@fetch_status = 0
begin
update Sales.SalesOrderHeader
Set
TotalAmt = @subTotal + @taxAmt + @freight
where SalesOrderID = @orderId
Fetch next From adventure_cursor
into @orderId , @subTotal, @taxAmt , @freight
end
close adventure_cursor
deallocate adventure_cursor
The query shows that , it sums the total TaxAmt, Feight and SubTotal (Total sales for quantity) and stores it in TotalAmt of Sales.SalesOrderHeader table for each sales order.
In the AdventureWorks Db there are around 31465 rows . It took around 19 seconds to sum and update value for each item.
Now, lets change query to my more optimized update statement solution
update Sales.SalesOrderHeader
Set TotalAmt = list.SubTotal + list.TaxAmt + list.Freight
From Sales.SalesOrderHeader as SalesOrder
inner join (
Select SalesOrderID , SubTotal, TaxAmt, Freight from
(
Select
SalesOrderID,
SubTotal,
TaxAmt,
Freight
From Sales.SalesOrderHeader
)
as T ) as list on list.SalesOrderID = SalesOrder.SalesOrderID
First of all the query does not use intermidiate varible to store data , therefore it does everything on the fly, secondly i have used T table of sql server 2005 , which does the processing in memory but way to faster than any Cursor query.
And, yes the query completes the opeartion in 3 seconds.Amazing!!! You must be thinking 19 seconds was not a big deal at all , but think about 3 milion rows intead of 30,000 rows , in that case surely Cursor is of no use.
More Posts