I have seen people implement website search using Google APIs or other third-party DLLs. This is a simple site search engine that searches all the images, files, and so on in a website without using any third-party tools such as Lucene or Google. I previously worked with Lucene, where I implemented both desktop and database search. But I felt that depending on a third party doesn't teach you much, so I've implemented a simple search using regular expressions.
Architecture of the search engine:
I'm using five classes here. Here is the class diagram:
- CleanHtml.cs: Used to clean the file of HTML tags.
- Page.cs: Page class that stores the data of individual files on the website.
- PageData.cs: Defines shared methods to create and add records to the dataset.
- Site.cs: The properties of this class store the configuration and data of the entire site.
- Usersearch.cs: Contains all the search methods.
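To give a rough idea of the regular-expression approach, here is a minimal sketch of how tag cleaning and keyword matching might look. It is not the code from the attached project: the names HtmlSearchSketch, StripTags, and CountMatches are hypothetical, and the patterns are simple ones chosen for illustration (regular expressions are not a full HTML parser).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only (not part of the attached project).
public static class HtmlSearchSketch
{
    // Removes script/style blocks, then all remaining tags, then collapses whitespace.
    public static string StripTags(string html)
    {
        string text = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1>", " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        text = Regex.Replace(text, @"<[^>]+>", " ");
        return Regex.Replace(text, @"\s+", " ").Trim();
    }

    // Counts whole-word, case-insensitive occurrences of a keyword.
    // Regex.Escape keeps characters such as "." or "+" in the keyword literal.
    public static int CountMatches(string text, string keyword)
    {
        string pattern = @"\b" + Regex.Escape(keyword) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

A page would first go through StripTags, and the cleaned text would then be passed to CountMatches together with the search keyword. Matching on whole words is only one possible choice; phrase or sentence search would need a different pattern.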
Design and Implementation:
The design of the interface is inspired by Google: I'm using a simple textbox in which I enter my search keyword. There are three options, which let me search by phrases, sentences, or words.
The input is passed to the method below, which creates an instance of UserSearch.
private Searchs.UserSearch SearchSite(string strSearch)
{
    Searchs.UserSearch srchSite = new Searchs.UserSearch();
    srchSite.SearchWords = strSearch;
    ..........
}
The search then loops through all the folders and files in the project and returns the search results, which I display in a list with the URL of each file.
Web.config:
In the Web.config I keep the restrictions for the search, such as which files to search and which files to skip. See below:
<!-- Place the file types you want searched in the following line, separated by commas -->
<add key="FilesTypesToSearch" value=".htm,.html,.asp,.shtml,.aspx,.xml,.jpg"/>
<!-- Place the dynamic file types you want searched in the following line, separated by commas -->
<add key="DynamicFilesTypesToSearch" value=".asp,.shtml,.aspx,.xml,.jpg"/>
<!-- Place the names of the folders you don't want searched in the following line, separated by commas -->
<add key="BarredFolders" value="support files,cgi_bin,_bin,bin,_vti_cnf,_notes,images,scripts"/>
<!-- Place the names of the files you don't want searched in the following line, separated by commas; include the file extension -->
<add key="BarredFiles" value="adminstation.htm,no_allowed.asp,AssemblyInfo.vb,Global.asax,Global.asax.vb,SiteSearch.aspx"/>
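To show how settings like these could drive the search, here is a minimal sketch of a filter that decides whether a file should be searched. It is an assumption about how the values might be used, not the code from the attached project: the class SearchFilterSketch and its ShouldSearch method are hypothetical names, and the comma-separated values are passed in as plain strings so the example does not depend on a Web.config.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, for illustration only (not part of the attached project).
public class SearchFilterSketch
{
    private readonly HashSet<string> fileTypes;
    private readonly HashSet<string> barredFolders;
    private readonly HashSet<string> barredFiles;

    // Each argument is the raw comma-separated value of one appSettings key.
    public SearchFilterSketch(string fileTypesCsv, string barredFoldersCsv, string barredFilesCsv)
    {
        fileTypes = ToSet(fileTypesCsv);
        barredFolders = ToSet(barredFoldersCsv);
        barredFiles = ToSet(barredFilesCsv);
    }

    // Splits a comma-separated value into a case-insensitive set of trimmed entries.
    private static HashSet<string> ToSet(string csv)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string item in csv.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
            set.Add(item.Trim());
        return set;
    }

    // relativePath uses "/" separators, for example "docs/help/index.htm".
    public bool ShouldSearch(string relativePath)
    {
        string[] segments = relativePath.Split('/');
        string fileName = segments[segments.Length - 1];

        // Skip the file if any folder on its path is barred.
        for (int i = 0; i < segments.Length - 1; i++)
        {
            if (barredFolders.Contains(segments[i]))
                return false;
        }

        if (barredFiles.Contains(fileName))
            return false;

        // Only search files whose extension is in the allowed list.
        return fileTypes.Contains(Path.GetExtension(fileName));
    }
}
```

In a real site the three strings would typically be read from the appSettings section, for example with ConfigurationManager.AppSettings["BarredFolders"], and ShouldSearch would be called for each file found while walking the site folders.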
This is the basic functionality of my search. I'm posting the complete project with this article. Comments are welcome.
Regards,
Surya.
Security is one of the main concerns when building an ASP.NET website. Keeping this in mind, I have written an encryption class with which we can encrypt a particular URL and hide the parameter value.
For example: abc.aspx?id=2 will be encrypted to abc.aspx?id=[encrypted value].
Note:
To protect your site from errors and SQL injection, it is better to run your queries through stored procedures with parameters than to build SQL strings from query-string values.
Here is the class for encryption:
using System;
using System.Collections;
using System.Collections.Specialized;
using System.Web;

/// <summary>
/// A name/value collection that parses, edits, and rebuilds a URL query string.
/// </summary>
public class QueryStringEncDecryption : NameValueCollection
{
    // The part of the URL before the "?".
    private string document;

    public string Document
    {
        get { return document; }
    }

    public QueryStringEncDecryption()
    {
    }

    public QueryStringEncDecryption(NameValueCollection clone)
        : base(clone)
    {
    }

    // Builds a collection from the URL of the current request.
    public static QueryStringEncDecryption FromCurrent()
    {
        return FromUrl(HttpContext.Current.Request.Url.AbsoluteUri);
    }

    // Splits a URL into its document part and its key/value pairs.
    public static QueryStringEncDecryption FromUrl(string url)
    {
        string[] parts = url.Split('?');
        QueryStringEncDecryption qs = new QueryStringEncDecryption();
        qs.document = parts[0];
        if (parts.Length == 1)
            return qs;
        string[] keys = parts[1].Split('&');
        foreach (string key in keys)
        {
            string[] part = key.Split('=');
            // A key without "=" gets an empty value; the else avoids reading part[1] in that case.
            if (part.Length == 1)
                qs.Add(part[0], "");
            else
                qs.Add(part[0], part[1]);
        }
        return qs;
    }

    public void ClearAllExcept(string except)
    {
        ClearAllExcept(new string[] { except });
    }

    // Removes every key that is not listed in "except" (case-insensitive).
    public void ClearAllExcept(string[] except)
    {
        ArrayList toRemove = new ArrayList();
        foreach (string s in this.AllKeys)
        {
            bool keep = false;
            foreach (string e in except)
            {
                if (String.Equals(s, e, StringComparison.OrdinalIgnoreCase))
                    keep = true;
            }
            if (!keep)
                toRemove.Add(s);
        }
        foreach (string s in toRemove)
            this.Remove(s);
    }

    // Replaces the value if the key already exists, instead of appending a second value.
    public override void Add(string name, string value)
    {
        if (this[name] != null)
            this[name] = value;
        else
            base.Add(name, value);
    }

    public override string ToString()
    {
        return ToString(false);
    }

    // Rebuilds the query string, optionally prefixed with the document part of the URL.
    public string ToString(bool includeUrl)
    {
        string[] parts = new string[this.Count];
        string[] keys = this.AllKeys;
        for (int i = 0; i < keys.Length; i++)
            parts[i] = keys[i] + "=" + HttpContext.Current.Server.UrlEncode(this[keys[i]]);
        string url = String.Join("&", parts);
        // Only add "?" when there is at least one parameter.
        if (!String.IsNullOrEmpty(url) && !url.StartsWith("?"))
            url = "?" + url;
        if (includeUrl)
            url = this.document + url;
        return url;
    }
}
This is just a preview of the encryption class. There is also another class for encryption, which I'm posting as an attachment. Using these two classes you can encrypt your URL.
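The attached encryption class is not shown here, so the following is only a minimal sketch of what such a helper might look like, assuming AES with a random IV and URL-safe Base64 output. The class name QueryValueCipherSketch and its Encrypt and Decrypt methods are hypothetical and are not taken from the attachment.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only (not the attached encryption class).
public static class QueryValueCipherSketch
{
    // Encrypts a value with AES (CBC, the default) and returns IV + ciphertext
    // as URL-safe Base64. The key must be 16, 24, or 32 bytes long.
    public static string Encrypt(string plainText, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            byte[] iv = aes.IV;
            using (ICryptoTransform enc = aes.CreateEncryptor())
            {
                byte[] data = Encoding.UTF8.GetBytes(plainText);
                byte[] cipher = enc.TransformFinalBlock(data, 0, data.Length);

                // Prepend the IV so Decrypt can recover it.
                byte[] output = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, output, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, output, iv.Length, cipher.Length);

                // Make the Base64 text safe to place in a query string.
                return Convert.ToBase64String(output)
                    .Replace('+', '-').Replace('/', '_').TrimEnd('=');
            }
        }
    }

    // Reverses Encrypt: restores the Base64 padding, splits off the 16-byte IV, decrypts.
    public static string Decrypt(string token, byte[] key)
    {
        string b64 = token.Replace('-', '+').Replace('_', '/');
        b64 = b64.PadRight(b64.Length + (4 - b64.Length % 4) % 4, '=');
        byte[] input = Convert.FromBase64String(b64);

        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            byte[] iv = new byte[16];
            Buffer.BlockCopy(input, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform dec = aes.CreateDecryptor())
            {
                byte[] plain = dec.TransformFinalBlock(input, iv.Length, input.Length - iv.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }
}
```

With a helper like this, the value of id would be passed through Encrypt before being added to the QueryStringEncDecryption collection, and through Decrypt after being read from the request. Note that this sketch only hides the value; it does not detect tampering, so the decrypted value should still be validated on the server.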
I would like to thank the entire ASP.NET developer team for giving me this opportunity to be a part of their blogging community. It's an amazing opportunity to express our ideas and share them with all the viewers. It took me a while to start this blog because of my busy schedule. I assure you that I will give my best.