Cleaning invalid characters from SharePoint
I stumbled onto one of those "gotchas" you get with SharePoint. We were creating new document libraries based on user names in a domain. A change came in and we had to support multiple domains, so a document library name would need a domain identifier (since the same user name could exist in two different domains). During acceptance testing we found that document libraries created with dashes in the names (we were using a [domain]-[username] pattern) had the dashes stripped out (without telling you, of course). This caused a bit of a headache with the email we send out containing a link to the library, since the URL in it was invalid.
I remembered this from a million years ago (I've been replacing a few SharePoint brain cells with Ruby ones lately), so after a bit of Googling I found a great article by Eric Legault on the matter.
Here's a small method with a unit test class to handle this cleansing of names.
public static class SharePointHelper
{
    public static string CleanInvalidCharacters(string name)
    {
        string cleanName = name;

        // remove invalid characters
        cleanName = cleanName.Replace(@"#", string.Empty);
        cleanName = cleanName.Replace(@"%", string.Empty);
        cleanName = cleanName.Replace(@"&", string.Empty);
        cleanName = cleanName.Replace(@"*", string.Empty);
        cleanName = cleanName.Replace(@":", string.Empty);
        cleanName = cleanName.Replace(@"<", string.Empty);
        cleanName = cleanName.Replace(@">", string.Empty);
        cleanName = cleanName.Replace(@"?", string.Empty);
        cleanName = cleanName.Replace(@"\", string.Empty);
        cleanName = cleanName.Replace(@"/", string.Empty);
        cleanName = cleanName.Replace(@"{", string.Empty);
        cleanName = cleanName.Replace(@"}", string.Empty);
        cleanName = cleanName.Replace(@"|", string.Empty);
        cleanName = cleanName.Replace(@"~", string.Empty);
        cleanName = cleanName.Replace(@"+", string.Empty);
        cleanName = cleanName.Replace(@"-", string.Empty);
        cleanName = cleanName.Replace(@",", string.Empty);
        cleanName = cleanName.Replace(@"(", string.Empty);
        cleanName = cleanName.Replace(@")", string.Empty);

        // remove periods (Replace handles any number of them in one pass)
        cleanName = cleanName.Replace(".", string.Empty);

        // remove invalid start character
        if (cleanName.StartsWith("_"))
        {
            cleanName = cleanName.Substring(1);
        }

        // trim length to 50 characters; start at index 0 so the first character is kept
        if (cleanName.Length > 50)
        {
            cleanName = cleanName.Substring(0, 50);
        }

        // remove leading and trailing spaces
        cleanName = cleanName.Trim();

        // replace spaces with %20
        cleanName = cleanName.Replace(" ", "%20");

        return cleanName;
    }
}
[TestFixture]
public class When_composing_a_document_library_name
{
    [Test]
    public void Spaces_should_be_converted_to_a_canonicalized_string()
    {
        string invalidName = "Cookie Monster";
        Assert.AreEqual("Cookie%20Monster", SharePointHelper.CleanInvalidCharacters(invalidName));
    }

    [Test]
    public void Remove_invalid_characters()
    {
        string invalidName = @"#%&*:<>?\/{|}~+-,().";
        Assert.AreEqual(string.Empty, SharePointHelper.CleanInvalidCharacters(invalidName));
    }

    [Test]
    public void Remove_invalid_underscore_start_character()
    {
        string invalidName = "_CookieMonster";
        Assert.AreEqual("CookieMonster", SharePointHelper.CleanInvalidCharacters(invalidName));
    }

    [Test]
    public void Remove_any_number_of_periods()
    {
        string invalidName = ".Co..okie...Mon....st.er.";
        Assert.AreEqual("CookieMonster", SharePointHelper.CleanInvalidCharacters(invalidName));
    }

    [Test]
    public void Names_cannot_be_longer_than_50_characters()
    {
        string invalidName = "CookieMonster".PadRight(51, 'C');
        Assert.AreEqual(50, SharePointHelper.CleanInvalidCharacters(invalidName).Length);
    }

    [Test]
    public void Leading_and_trailing_spaces_should_be_removed()
    {
        string invalidName = " CookieMonster ";
        Assert.AreEqual("CookieMonster", SharePointHelper.CleanInvalidCharacters(invalidName));
    }
}
BTW, this would make for a nice 3.5 string extension method (string.ToSharePointName), but alas I'm stuck in 2.0 land for this project.
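For the curious, here's a rough sketch of what that extension method might look like, assuming a project that can target C# 3.0 / .NET 3.5. The SharePointStringExtensions class name is made up for illustration, and the SharePointHelper shown here is only a cut-down stand-in (two of the rules) so the snippet compiles on its own; the real extension would simply delegate to the full CleanInvalidCharacters method.

```csharp
// Stand-in for the real helper, reduced to two rules so this sketch is self-contained.
// Assumption: in a real project this would be the full CleanInvalidCharacters method.
public static class SharePointHelper
{
    public static string CleanInvalidCharacters(string name)
    {
        return name.Replace("-", string.Empty).Trim().Replace(" ", "%20");
    }
}

// Extension methods have to live in a non-generic, non-nested static class.
// The class name is hypothetical.
public static class SharePointStringExtensions
{
    // The "this" modifier on the first parameter is what lets callers
    // write someString.ToSharePointName().
    public static string ToSharePointName(this string name)
    {
        return SharePointHelper.CleanInvalidCharacters(name);
    }
}
```

With the stand-in above, "CORP-cookie monster".ToSharePointName() comes back as CORPcookie%20monster. The call site reads a little nicer, but the behavior is exactly that of the static helper.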
Enjoy!