June 2008 - Posts
Hey hey hey! It's time for another Extension Methods Roundup! Here are some of the extension methods I've written since the last one:
Dictionary's Missing Remove Methods
public static void Remove<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TValue value)
{
    // Check to see that dictionary is not null
    if (dictionary == null)
        throw new ArgumentNullException("dictionary");
    foreach (var key in (from pair in dictionary
                         where EqualityComparer<TValue>.Default.Equals(value, pair.Value)
                         select pair.Key).ToArray())
    {
        dictionary.Remove(key);
    }
}
public static void RemoveRange<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, IEnumerable<TValue> values)
{
    // Check to see that dictionary is not null
    if (dictionary == null)
        throw new ArgumentNullException("dictionary");
    // Check to see that values is not null
    if (values == null)
        throw new ArgumentNullException("values");
    foreach (var value in values.ToArray())
    {
        ExtensionMethods.Remove(dictionary, value);
    }
}
public static void RemoveRange<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, IEnumerable<TKey> keys)
{
    // Check to see that dictionary is not null
    if (dictionary == null)
        throw new ArgumentNullException("dictionary");
    // Check to see that keys is not null
    if (keys == null)
        throw new ArgumentNullException("keys");
    foreach (var key in keys.ToArray())
    {
        dictionary.Remove(key);
    }
}
String Aggregation
public static string Aggregate(this IEnumerable<string> enumeration, string separator)
{
    return Aggregate(enumeration, str => str, separator);
}
public static string Aggregate<T>(this IEnumerable<T> enumeration, Func<T, string> toString, string separator)
{
    // Check to see that enumeration is not null
    if (enumeration == null)
        throw new ArgumentNullException("enumeration");
    // Check to see that toString is not null
    if (toString == null)
        throw new ArgumentNullException("toString");
    // Check to see that separator is not null or an empty string
    if (separator == null)
        throw new ArgumentNullException("separator");
    if (separator.Length == 0)
        throw new ArgumentException("separator must not be an empty string", "separator");
    return enumeration.Aggregate(string.Empty,
        (accum, item) => string.Format("{0}{1}{2}", accum, separator, toString(item)),
        // Any non-empty result starts with one extra separator, so strip it.
        // (>= rather than >, so that a single empty item yields "" and not the separator.)
        str => str.Length >= separator.Length ? str.Substring(separator.Length) : str);
}
Those are very good for when you want to create strings such as "a, b, c, d".
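As a quick illustration, here is a minimal, self-contained sketch of both overloads in use. The class names ExtensionMethods and Program are assumptions made for the example, and the generic overload is a condensed copy: its argument checks are dropped, and it uses >= when trimming the leading separator so that a single empty item still yields an empty string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionMethods
{
    public static string Aggregate(this IEnumerable<string> enumeration, string separator)
    {
        return Aggregate(enumeration, str => str, separator);
    }

    // Condensed copy of the generic overload (argument checks omitted for brevity).
    public static string Aggregate<T>(this IEnumerable<T> enumeration, Func<T, string> toString, string separator)
    {
        return enumeration.Aggregate(string.Empty,
            (accum, item) => accum + separator + toString(item),
            str => str.Length >= separator.Length ? str.Substring(separator.Length) : str);
    }
}

public static class Program
{
    public static void Main()
    {
        string[] letters = { "a", "b", "c", "d" };
        int[] numbers = { 1, 2, 3 };

        Console.WriteLine(letters.Aggregate(", "));                     // a, b, c, d
        Console.WriteLine(numbers.Aggregate(n => n.ToString(), " + ")); // 1 + 2 + 3

        // The framework's string.Join builds the same string from an array.
        Console.WriteLine(string.Join(", ", letters));                  // a, b, c, d
    }
}
```

When the items are already in a string array, string.Join does the same job; the extension methods mainly save the projection and the ToArray call.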
LastOrDefault
public static T LastOrDefault<T>(this IList<T> list)
{
    // Check to see that list is not null
    if (list == null)
        throw new ArgumentNullException("list");
    if (list.Count == 0)
        return default(T);
    return list[list.Count - 1];
}
This is a version of LastOrDefault specialized for lists that allow random access: it reads the last element by index instead of walking the whole sequence.
At
public static T At<T>(this IEnumerable<T> enumeration, int index)
{
    // Check to see that enumeration is not null
    if (enumeration == null)
        throw new ArgumentNullException("enumeration");
    return enumeration.Skip(index).First();
}
public static IEnumerable<T> At<T>(this IEnumerable<T> enumeration, params int[] indices)
{
    return At(enumeration, (IEnumerable<int>)indices);
}
public static IEnumerable<T> At<T>(this IEnumerable<T> enumeration, IEnumerable<int> indices)
{
    // Check to see that enumeration is not null
    if (enumeration == null)
        throw new ArgumentNullException("enumeration");
    // Check to see that indices is not null
    if (indices == null)
        throw new ArgumentNullException("indices");
    int currentIndex = 0;
    // The indices are visited in ascending order
    foreach (int index in indices.OrderBy(i => i))
    {
        // A negative index can never be reached by skipping forward
        if (index < 0)
            throw new ArgumentOutOfRangeException("indices");
        while (currentIndex < index)
        {
            enumeration = enumeration.Skip(1);
            currentIndex++;
        }
        yield return enumeration.First();
    }
}
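At provides pseudo-random access to enumerable lists, where needed. I've found use for it in a couple of places which returned indices for non-IList<T> enumerations. Note that the multi-index overloads return the items in ascending index order, not in the order in which the indices were passed.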
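The multi-index overload above re-enumerates the source from the start for every requested index, which can hurt when the source is expensive or can only be read once. The following is a sketch of a single-pass variant, not the code from this post: AtSinglePass, AtIterator and Letters are hypothetical names, and it throws ArgumentOutOfRangeException for an index that is negative or past the end of the sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SinglePassExtensions
{
    // Hypothetical variant of At: validates eagerly, then walks the source once.
    public static IEnumerable<T> AtSinglePass<T>(this IEnumerable<T> enumeration, IEnumerable<int> indices)
    {
        if (enumeration == null) throw new ArgumentNullException("enumeration");
        if (indices == null) throw new ArgumentNullException("indices");
        return AtIterator(enumeration, indices.OrderBy(i => i).ToArray());
    }

    private static IEnumerable<T> AtIterator<T>(IEnumerable<T> enumeration, int[] sortedIndices)
    {
        using (IEnumerator<T> e = enumeration.GetEnumerator())
        {
            int position = -1; // index of e.Current; -1 before the first MoveNext
            foreach (int index in sortedIndices)
            {
                if (index < 0) throw new ArgumentOutOfRangeException("indices");
                while (position < index)
                {
                    if (!e.MoveNext()) throw new ArgumentOutOfRangeException("indices");
                    position++;
                }
                yield return e.Current;
            }
        }
    }
}

public static class Program
{
    // A lazily generated sequence that is not an IList<T>.
    static IEnumerable<string> Letters()
    {
        foreach (char c in "abcdefg") yield return c.ToString();
    }

    public static void Main()
    {
        // Items come back in ascending index order, whatever order the indices are passed in.
        string[] picked = Letters().AtSinglePass(new[] { 5, 1, 3 }).ToArray();
        Console.WriteLine(string.Join(",", picked)); // b,d,f
    }
}
```

Splitting the method into an eager wrapper and a private iterator also means the null checks run at call time rather than when the result is first enumerated.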
SequenceEqual<T1, T2>
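public static bool SequenceEqual<T1, T2>(this IEnumerable<T1> left, IEnumerable<T2> right, Func<T1, T2, bool> comparer)
{
    // Check to see that left is not null
    if (left == null)
        throw new ArgumentNullException("left");
    // Check to see that right is not null
    if (right == null)
        throw new ArgumentNullException("right");
    // Check to see that comparer is not null
    if (comparer == null)
        throw new ArgumentNullException("comparer");
    using (IEnumerator<T1> leftE = left.GetEnumerator())
    {
        using (IEnumerator<T2> rightE = right.GetEnumerator())
        {
            bool leftNext = leftE.MoveNext(), rightNext = rightE.MoveNext();
            while (leftNext && rightNext)
            {
                // If one of the items isn't the same...
                if (!comparer(leftE.Current, rightE.Current))
                    return false;
                leftNext = leftE.MoveNext();
                rightNext = rightE.MoveNext();
            }
            // If left or right is longer
            if (leftNext || rightNext)
                return false;
        }
    }
    return true;
}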
This differs from the original SequenceEqual in that it accepts two sequences with different element types, compared by a delegate you supply.
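For instance, a sequence of numbers can be compared against a sequence of their string representations. The sketch below assumes the class names ExtensionMethods and Program and uses a condensed copy of the method (null checks omitted, loop folded into a single while).

```csharp
using System;
using System.Collections.Generic;

public static class ExtensionMethods
{
    // Condensed copy of SequenceEqual<T1, T2> (argument checks omitted for brevity).
    public static bool SequenceEqual<T1, T2>(this IEnumerable<T1> left, IEnumerable<T2> right, Func<T1, T2, bool> comparer)
    {
        using (IEnumerator<T1> leftE = left.GetEnumerator())
        using (IEnumerator<T2> rightE = right.GetEnumerator())
        {
            while (true)
            {
                bool leftNext = leftE.MoveNext(), rightNext = rightE.MoveNext();
                if (!leftNext || !rightNext)
                    return leftNext == rightNext; // equal only if both ended together
                if (!comparer(leftE.Current, rightE.Current))
                    return false;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        string[] texts = { "1", "2", "3" };
        string[] shorter = { "1", "2" };

        Console.WriteLine(numbers.SequenceEqual(texts, (n, s) => n.ToString() == s));   // True
        Console.WriteLine(numbers.SequenceEqual(shorter, (n, s) => n.ToString() == s)); // False
    }
}
```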
AsIndexed
public static IEnumerable<KeyValuePair<int, T>> AsIndexed<T>(this IEnumerable<T> enumeration)
{
    // Check to see that enumeration is not null
    if (enumeration == null)
        throw new ArgumentNullException("enumeration");
    int i = 0;
    foreach (var item in enumeration)
    {
        yield return new KeyValuePair<int, T>(i++, item);
    }
}
This is for when you need indices, but don't want the overhead of copying the sequence into an array (T[]).
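A short usage sketch follows, with the class names ExtensionMethods and Program assumed and the null check left out. It also shows the built-in alternative: LINQ's Select has an overload whose selector receives the item's index as a second argument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionMethods
{
    // Condensed copy of AsIndexed (null check omitted for brevity).
    public static IEnumerable<KeyValuePair<int, T>> AsIndexed<T>(this IEnumerable<T> enumeration)
    {
        int i = 0;
        foreach (var item in enumeration)
            yield return new KeyValuePair<int, T>(i++, item);
    }
}

public static class Program
{
    public static void Main()
    {
        // A filtered, lazily evaluated sequence: not an array and not an IList<T>.
        IEnumerable<string> names = new[] { "ann", "bob", "cy" }.Where(n => n.Length == 3);

        foreach (KeyValuePair<int, string> pair in names.AsIndexed())
            Console.WriteLine(pair.Key + ": " + pair.Value);
        // 0: ann
        // 1: bob

        // Built-in alternative: the Select overload that passes the index to the selector.
        foreach (string line in names.Select((name, index) => index + ": " + name))
            Console.WriteLine(line);
    }
}
```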
The Missing SelectMany
public static IEnumerable<T> SelectMany<T>(this IEnumerable<IEnumerable<T>> source)
{
    // Check to see that source is not null
    if (source == null)
        throw new ArgumentNullException("source");
    foreach (var enumeration in source)
    {
        foreach (var item in enumeration)
        {
            yield return item;
        }
    }
}
Oh, come on! Why wasn't there a parameterless SelectMany in the framework? Oh well, here's one.
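Here is a small sketch of it in use, next to the closest thing the framework offers, which is the built-in SelectMany called with an identity selector. ExtensionMethods and Program are assumed class names, and the null check is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionMethods
{
    // Condensed copy of the parameterless SelectMany (null check omitted for brevity).
    public static IEnumerable<T> SelectMany<T>(this IEnumerable<IEnumerable<T>> source)
    {
        foreach (var enumeration in source)
            foreach (var item in enumeration)
                yield return item;
    }
}

public static class Program
{
    public static void Main()
    {
        IEnumerable<IEnumerable<int>> nested = new IEnumerable<int>[]
        {
            new[] { 1, 2 },
            new int[0],
            new[] { 3 }
        };

        int[] flat = nested.SelectMany().ToArray();                      // the extension above
        int[] viaIdentity = nested.SelectMany(inner => inner).ToArray(); // the framework's overload

        Console.WriteLine(flat.SequenceEqual(viaIdentity)); // True
        Console.WriteLine(flat.Length);                     // 3
    }
}
```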
ToDictionary of IGrouping
public static Dictionary<TKey, IEnumerable<TElement>> ToDictionary<TKey, TElement>(
    this IEnumerable<IGrouping<TKey, TElement>> enumeration)
{
    // Check to see that enumeration is not null
    if (enumeration == null)
        throw new ArgumentNullException("enumeration");
    return enumeration.ToDictionary(item => item.Key, item => item.Cast<TElement>());
}
This is shorthand for when you want to create a dictionary from the result of GroupBy.
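A minimal sketch of the shorthand, grouping words by length and then looking a group up by key. The class names ExtensionMethods and Program and the sample data are assumptions for the example; the null check is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionMethods
{
    // Condensed copy of the IGrouping overload (null check omitted for brevity).
    public static Dictionary<TKey, IEnumerable<TElement>> ToDictionary<TKey, TElement>(
        this IEnumerable<IGrouping<TKey, TElement>> enumeration)
    {
        return enumeration.ToDictionary(item => item.Key, item => item.Cast<TElement>());
    }
}

public static class Program
{
    public static void Main()
    {
        string[] words = { "a", "bb", "cc", "ddd" };

        // Without the shorthand this would read:
        // words.GroupBy(w => w.Length).ToDictionary(g => g.Key, g => g.AsEnumerable())
        Dictionary<int, IEnumerable<string>> byLength = words.GroupBy(w => w.Length).ToDictionary();

        Console.WriteLine(string.Join(",", byLength[2].ToArray())); // bb,cc
        Console.WriteLine(byLength.Count);                          // 3
    }
}
```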
Post's Correspondence Problem (the other PCP) is a computer science problem in which you have (and I simplify matters) a set of tiles, each with a top string and a bottom string made of letters from a preset group. For instance, you may have the tiles:
Tile 1: ab on top, abab on the bottom
Tile 2: b on top, a on the bottom
Tile 3: aba on top, b on the bottom
Tile 4: aa on top, a on the bottom
The idea is to create a sequence of tiles (you may use as many tiles of each kind as you like) so that the letters along the top row spell exactly the same string as the letters along the bottom row. One such string is aaaabab, for which you would use tiles 4, 4, 2 and 1 to create the sequence: [aa][aa][b][ab] at the top and [a][a][a][abab] at the bottom. If you like, a good mental exercise would be to find the next shortest sequences.
PCP is an undecidable decision problem, which means that no program can receive an arbitrary finite set of tiles as its input and always return true or false as to whether a matching sequence exists; any such program risks running indefinitely on some inputs. However, a program that finds a match whenever one exists, at the risk of running indefinitely when none does, can be written: simply check all sequences of length 1, 2, 3, etc. and stop at the first that is a match.
Writing such a program is a bit cumbersome, as for each sequence of length n you will have to either save all sequences of length n-1 or recalculate them. A classic recursion isn't that useful here, since it lends itself to depth-first rather than breadth-first analysis. Luckily, C# 2.0's yield statement offers us a different type of recursion, the breadth recursion:
IEnumerable<Tile> GetTileSequence(IEnumerable<Tile> tiles)
{
    foreach (Tile tile in tiles)
    {
        yield return new Tile { Top = tile.Top, Bottom = tile.Bottom };
    }
    foreach (Tile sequence in GetTileSequence(tiles))
    {
        foreach (Tile tile in tiles)
        {
            yield return new Tile { Top = sequence.Top + tile.Top, Bottom = sequence.Bottom + tile.Bottom };
        }
    }
}
The above code first runs over the set of tiles to yield the single-tile sequences, and then enumerates its own output, appending each tile to every sequence it has already produced to create sequences one tile longer. It's a bit confusing, I admit, but after a couple of minutes of examining it you may start seeing the hidden beauty of it. Using it to solve the problem would look something like this:
Tile[] tiles = new Tile[] {
    new Tile { Top = "ab", Bottom = "abab" },
    new Tile { Top = "b", Bottom = "a" },
    new Tile { Top = "aba", Bottom = "b" },
    new Tile { Top = "aa", Bottom = "a" }
};
foreach (Tile sequence in GetTileSequence(tiles))
{
    if (sequence.Top == sequence.Bottom)
    {
        Debug.WriteLine("Match: " + sequence.Top + ", " + sequence.Bottom);
        return sequence;
    }
}
This whole piece of code was written a few days after a little debate I had with @yosit about whether using yield statements to build recursions was a good idea or not. I still hold the firm belief that it usually isn't a good idea since, as you can see for yourself, it's pretty confusing, and since most people are used to the classic recursion.
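For comparison, the "save all the shorter sequences" alternative mentioned earlier can be sketched with an explicit queue. This is only an illustration under stated assumptions: the Tile class is a guess at the shape implied by the object initializers above, the names Pcp, GetTileSequences and FindMatch are hypothetical, and the maxTiles bound is an addition that guarantees the search stops.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of Tile: two string properties, as the initializers above imply.
public class Tile
{
    public string Top { get; set; }
    public string Bottom { get; set; }
}

public static class Pcp
{
    // Breadth-first enumeration: every sequence of n tiles is produced
    // before any sequence of n + 1 tiles. Each queue entry pairs a
    // sequence with the number of tiles it contains.
    public static IEnumerable<Tile> GetTileSequences(IList<Tile> tiles, int maxTiles)
    {
        var queue = new Queue<KeyValuePair<int, Tile>>();
        foreach (Tile tile in tiles)
            queue.Enqueue(new KeyValuePair<int, Tile>(1, tile));

        while (queue.Count > 0)
        {
            KeyValuePair<int, Tile> entry = queue.Dequeue();
            yield return entry.Value;

            if (entry.Key >= maxTiles)
                continue; // do not grow sequences past the bound

            foreach (Tile tile in tiles)
            {
                queue.Enqueue(new KeyValuePair<int, Tile>(entry.Key + 1, new Tile
                {
                    Top = entry.Value.Top + tile.Top,
                    Bottom = entry.Value.Bottom + tile.Bottom
                }));
            }
        }
    }

    // Returns the first matching sequence of at most maxTiles tiles, or null.
    public static Tile FindMatch(IList<Tile> tiles, int maxTiles)
    {
        foreach (Tile sequence in GetTileSequences(tiles, maxTiles))
            if (sequence.Top == sequence.Bottom)
                return sequence;
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        var tiles = new[]
        {
            new Tile { Top = "ab", Bottom = "abab" },
            new Tile { Top = "b", Bottom = "a" },
            new Tile { Top = "aba", Bottom = "b" },
            new Tile { Top = "aa", Bottom = "a" }
        };

        Tile match = Pcp.FindMatch(tiles, 4);
        Console.WriteLine(match == null ? "No match within the bound" : "Match: " + match.Top);
        // Match: aaaabab
    }
}
```

The queue makes the memory cost visible (it holds every sequence waiting to be extended), which is the bookkeeping the yield-based recursion hides inside its nested iterators.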

Yesterday, in front of the staff and students at my college, I presented my final project for my B.Sc. in Computer Science. Once I complete it and it gets reviewed this September, I will have completed my duties for the degree.
The project is research I'm doing for nuconomy, and I'll release the code once it's complete. It uses the .NET Framework 3.5 (with C# 3.0) and SQL Server 2005 Integration Services' NLP engine.
The following is the abstract and you can also download the short presentation.
Abstract
The advent of Web 2.0, with its introduction of the concept of user generated data, has posed several problems to those developers aiming to make the navigation in such data as simple as possible.
The problem was commonly met by the coupling of meta-data (tags) to the user-generated content itself, which posed another problem, simply because the vast amounts of data were no match for the small number of website operators to cope with. Thus was introduced the concept of the Folksonomy, or social tagging, which took advantage of the content’s users, asking them to explain what the content was about in an engaging way.
Unfortunately, creating a working folksonomy requires a large and cooperative user base, something that can’t be relied upon.
Automation can be introduced into such communities in order to relieve most of the pressure classic folksonomies place on the user base. By automatically analyzing the user-generated content and meta-content and applying to it a base set of tags, such automation saves users the need to come up with those tags in the first place, leaving only the easier process of correction.
Mechsonomy consists of the following building blocks:
- Plain-Text Tagging – user-generated content is taken as-is and processed by a Term Extraction engine to retrieve ‘relevant’ tags.
- Markup Analysis – the placement of terms retrieved in the marked-up source is examined, altering terms’ significance.
- Web Analysis – the relationship between units of content is examined, altering terms’ significance.
- Machine Learning – users interact and rank tags’ relevance to the content, allowing Mechsonomy to learn the impact the site’s markup has on the content’s relevance.
Let's play a little game. To the left of this text are the five central icons from the Test Tools Toolbar in Visual Studio. Their commands are, in an unordered manner: Test List Editor, Test Runs, Test View, Code Coverage Results and Test Results. Connect the icon to the appropriate command, without checking Visual Studio.
The purpose of this game? To show that whenever you don't think about your icons enough, they're meaningless and therefore useless. There may be a certain logic behind this set of icons, but since it has eluded me for the few seconds I tried to understand it, it's as if it wasn't there in the first place.