Faraz Shah Khan

MCP, MCAD.Net, MCSD.Net, MCTS-Win, MCTS-Web, MCPD-Web

Regex to find URLs within text and turn them into links

Some time back on the forum, somebody was looking for help with finding URLs within text and turning those URLs into links. He and I tried various regexes, and I thought I would put the one that worked on the blog so that it can help me and others later. The regex itself is:

-------- In VB.Net ---------
Dim regx As New Regex("http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?", RegexOptions.IgnoreCase)

-------- In C#.Net ---------
Regex regx = new Regex("http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?", RegexOptions.IgnoreCase);
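
To see what the pattern actually picks up, here is a minimal sketch that lists every match in a piece of text. The class name UrlFinder, the method name FindUrls and the sample sentence are made up for illustration. The pattern is the same one as above, written as a verbatim string (@"...") so the backslashes do not have to be doubled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlFinder
{
    // Same pattern as in the post, as a verbatim string.
    private static readonly Regex UrlRegex = new Regex(
        @"http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns every URL the pattern matches, in order.
    public static List<string> FindUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in UrlRegex.Matches(text))
        {
            urls.Add(m.Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Sample text is an assumption, chosen only to show two matches.
        string sample = "Visit http://www.example.com/page?id=1 and http://test.org today.";
        foreach (string url in FindUrls(sample))
        {
            Console.WriteLine(url);
        }
        // prints:
        // http://www.example.com/page?id=1
        // http://test.org
    }
}
```

Note that the pattern allows '.' and ',' inside a URL, so punctuation that directly follows a URL in a sentence will most likely be picked up as part of the match.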

And I used the following method to convert the URLs within the text into links.

-------- In VB.Net ---------
Protected Function MakeLink(ByVal txt As String) As String
        Dim regx As New Regex("http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?", RegexOptions.IgnoreCase)

        Dim matches As MatchCollection = regx.Matches(txt)

        For Each match As Match In matches
            txt = txt.Replace(match.Value, "<a href='" & match.Value & "'>" & match.Value & "</a>")
        Next

        Return txt
End Function

------- In C#.Net --------
protected string MakeLink(string txt)
{
    Regex regx = new Regex("http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?", RegexOptions.IgnoreCase);

    MatchCollection matches = regx.Matches(txt);

    foreach (Match match in matches) {
        txt = txt.Replace(match.Value, "<a href='" + match.Value + "'>" + match.Value + "</a>");
    }
   
    return txt;
}
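
One thing to watch with the loop above: String.Replace substitutes every occurrence of match.Value in the text, so if the same URL appears twice it gets processed twice and the second pass also rewrites the URLs inside the anchors that were just created. A minimal sketch of one way around this, assuming you are free to swap the loop for Regex.Replace with a MatchEvaluator, which rewrites each match exactly once at its own position. The class name LinkMaker and the sample sentence are made up for illustration, and the HtmlEncode call is my own addition so that characters like & or ' in a URL do not break the attribute.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class LinkMaker
{
    // Same pattern as in the post, as a verbatim string.
    private static readonly Regex UrlRegex = new Regex(
        @"http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?",
        RegexOptions.IgnoreCase);

    public static string MakeLink(string txt)
    {
        // The evaluator runs once per match, so repeated URLs are each
        // wrapped once and already generated anchors are never rescanned.
        return UrlRegex.Replace(txt, m =>
        {
            // Assumption: encode the URL before placing it in the markup.
            string url = WebUtility.HtmlEncode(m.Value);
            return "<a href=\"" + url + "\">" + url + "</a>";
        });
    }

    public static void Main()
    {
        // Sample text is an assumption: the same URL appears twice.
        Console.WriteLine(MakeLink("See http://example.com and again http://example.com here."));
        // prints:
        // See <a href="http://example.com">http://example.com</a> and again <a href="http://example.com">http://example.com</a> here.
    }
}
```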

Comments

Sandy said:

What about https?

# August 26, 2008 1:31 PM

farazsk11 said:

@Sandy:

Thanks for pointing it out. Yes, the above regex will not work for https; however, you can make a small change to it as below and it will work for https as well. What I added is (s)? after http.

"http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?"

# August 27, 2008 1:32 AM

iyeru said:

What about PHP? =/

# September 1, 2008 11:03 AM

farazsk11 said:

@iyeru

well dude I have no idea about PHP as I have never worked on it.

# September 2, 2008 12:23 PM

Manish Mishra said:

Hey dude, try this regular expression string.

"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?"

# September 25, 2008 7:36 AM

jspass2 said:

om

# October 3, 2008 7:03 AM

imperialx said:

How about on a URLRewrite like <a href="/category/books/ISBN-123456">?

# October 21, 2008 7:44 AM

r-dog said:

try this, it will get rid of any link that starts with http:// or https://

"(http|https)://([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?"

# January 16, 2009 1:01 PM

Dominika said:

How about extracting the link name associated with that found link as well?

# March 3, 2009 11:30 AM

kotresha said:

Excellent, this saved me a day's work

# July 14, 2009 7:31 AM

CM said:

Thanx a bunch!

# August 4, 2009 7:59 PM

Confesso que... said:

I use this function here http://www.confessoque.com...works fine! Thanks!

# August 20, 2009 1:12 PM

Baba said:

hi all http://www.google.com/ is a website

# September 2, 2009 1:28 AM

Ian Peake said:

Does anyone have a complete piece of code showing how I can read in a text file using StreamReader and extract all the lines of text that contain a URL? I have tried with Regex but seem to be getting nowhere. Any help is much appreciated. You can mail me at ianpeake@warwickshire.gov.uk

# September 11, 2009 4:30 AM

Kristian Ask said:

Doing a text replace will not work. If there are two links that look the same, the replace goes wrong. You should use Regex.Replace instead.

data = Regex.Replace(
               data,
               @"(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])",
               delegate(Match match)
               {
                   return string.Format("<a href=\"{0}\">{0}</a>", match.ToString());
               });

# October 7, 2009 2:13 AM

Vladimir said:

Very nice example. Thanks for saving me some time!

# October 26, 2009 5:40 AM

C#Beginer said:

Hi,

Can anyone help with a regular expression?

I need an expression to extract URLs only from <a href tags and ignore other links on the page.

Many thanks,

# November 3, 2009 11:15 AM