ASP.NET Weblogs

Dave Burke - A freelance .NET Developer specializing in Online Communities

Substring(start, length) -- d.o.t.d.

I have the good fortune to be working with Microsoft SharePoint, extracting Web Storage System folder and document information into what I call an Intelligent Middle XML Layer. The goal is to free us from the digital dashboard (we're not big fans), combine our SharePoint data with SQL and Exchange, improve searching, etc.  But that's another post.  Amid the cool aspects of working on this app, I encountered a stupid parsing issue that frustrated me for more than a few minutes.  While this post is in the D.O.T.D. category, it was more like a Stupidity Of The Day (S.O.T.D.) activity.

I needed to extract the project number from folder URLs, in this example "20036969".  I pass the year, assign the URL to s, and parse.

  

Substring needs a starting position and a length.  Pretty simple.  Calculating the length was my problem.  The " - " string always follows the project number, so I marked its position and subtracted the position of what I thought was the "start" of the number: s.IndexOf("/" + year + "/") + 6.  Wrong.  This subtraction yielded 16.  I needed to reduce the length by the length of the "/" + year + "/" string, so it was - 6.  (I welcome a smarter calc for this length--maybe a single overloaded IndexOf()?)
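The start-and-length arithmetic described above can be sketched roughly as follows. This is a minimal illustration, not the code from the original app: the class name ProjectNumberParser, the method name Extract, and the sample URL in the usage note are all hypothetical. It uses the IndexOf(value, startIndex) overload so that the search for " - " begins where the number starts, and it computes the start position once so the marker's length cannot be applied with the wrong sign.

```csharp
using System;

// Hypothetical helper for illustration only; names are not from the original app.
public static class ProjectNumberParser
{
    // Returns the text between "/<year>/" and the next " - ",
    // or null when either marker is missing.
    public static string Extract(string url, string year)
    {
        string search = "/" + year + "/";
        int marker = url.IndexOf(search);
        if (marker < 0)
        {
            return null;
        }

        // The number begins right after the marker, so add its length once, here.
        int start = marker + search.Length;

        // IndexOf(value, startIndex) starts looking at 'start', so a " - "
        // earlier in the URL cannot be matched by mistake.
        int end = url.IndexOf(" - ", start);
        if (end < 0)
        {
            return null;
        }

        // Substring takes (startIndex, length), not (startIndex, endIndex).
        return url.Substring(start, end - start);
    }
}
```

Assuming a URL shaped like "http://server/workspace/Documents/Projects/2003/20036969 - Project Description" (a made-up example), ProjectNumberParser.Extract(url, "2003") returns "20036969". Keeping start in its own variable means the length is simply end - start, with no separate "+ 6" or "- 6" to get wrong.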

I should apologize for the boring subject matter of this post: Substrings.  But hey, at least I didn't write about a garbage disposal incident.  A couple of morals:  1) the (parens) mark the spot.  It's easy to get confused by the overloads of the IndexOf() method.  Keep calculations outside the parens, or hard-to-decipher runtime errors will waste CPU and brain cycles; 2) starting point, length.  Starting point, length; and 3) it's always the most mundane issues that kick your butt.

Comments

 

Daniel Turini said:

Regular Expressions are not subject to this kind of error...
June 27, 2003 7:45 AM
 

Dave Burke said:

Thanks, Daniel. Will definitely spend some quality time with the regex factor.
June 27, 2003 8:43 AM
 

Kit George said:

Sorry to see the confusion here. I do personally believe the API is intuitive, but I want to see if writing this slightly differently can help. I am a bit confused that subtracting the position of the start of the number wouldn't work, since it should.

Here's a rewrite:
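string year = "2003";
string s = <your http string>;
string search = "/" + year + "/";
int found = s.IndexOf(search);

// Test the raw IndexOf result: once search.Length is added, -1 is no longer negative.
if (found >= 0) {
    int i = found + search.Length;
    // Look for the "-" from i onward so an earlier hyphen in the URL is ignored.
    string pruid = s.Substring(i, s.IndexOf("-", i) - i);
    Console.WriteLine(pruid);
}

This should work just fine.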
June 27, 2003 9:22 PM
 

Dave Burke said:

Kit, you made that substring method a beautiful thing with the additional two lines of assigning the /year/ string to a variable and determining its length. My "less lines are best" philosophy sometimes keeps me from a simpler approach. Thanks very much for the lesson.
June 28, 2003 8:05 AM
 

Kjell-Åke Andersson said:

Have you tried .LastIndexOf ?
June 30, 2003 10:15 AM
 

Don McNamara said:

I agree with Daniel.. (I am not a regex expert, so there may be a better way...) Regex isn't always the most readable, but I still like it.

private string ExtractPruid(string url)
{
    const string pattern = @"http://[a-zA-Z0-9\/]+\/(?<pruid>[0-9]*)\s*-\s*.*";
    return Regex.Match(url, pattern).Groups["pruid"].Value;
}

string s = @"http://server/workspace/Documents/Projects/2003/20026969 - Project Description";
MessageBox.Show("(" + this.ExtractPruid(s) + ")");
June 30, 2003 10:31 AM
