Code Puzzle #2 - Generate random fake surnames - Recap

Code Puzzle #2 posed the following task: Write a simple function which generates fake but passable surnames (read more here). As I'd hoped, I got several great submissions with a range of interesting approaches. I'd like to say that we're all winners, but the rules were clear: "This is a fixed contest; my solution will win first place."

Finally, here's my solution. It's the ugliest solution by far, but I was pretty happy with the output. I divided the consonants into four groups:

  • common - can appear anywhere, and appear frequently
  • average - can appear anywhere, and appear with... um... average frequency
  • middle - slightly lower frequency, and not allowed to start a name
  • rare - appear rarely, but are allowed to start a name

I seeded the random number generator with Guid.NewGuid().GetHashCode(), per Brendan Tompkins' tip. System.Random derives its output from a seed. If you pass the same seed in every time, you get the same sequence out every time. If you don't seed the random number generator, the .NET Framework seeds it from the system clock (the tick count, to be precise). The problem is that if you create several generators in a tight loop, they can all end up with the same seed, and you'll get the same values out. Seeding from a GUID hash code makes it very likely that each generator gets a different seed, so each call produces a different (though not necessarily evenly distributed) sequence.
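To make the seeding behavior concrete, here is a minimal sketch. The class and method names (SeedDemo, Draw, CreateGuidSeeded) are made up for illustration and are not part of my solution; only the Guid.NewGuid().GetHashCode() seed comes from it.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class SeedDemo
{
    // Returns the first `count` values from a Random created with `seed`.
    // Calling this twice with the same seed yields the same values.
    public static int[] Draw(int seed, int count)
    {
        Random rng = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = rng.Next(1000);
        }
        return values;
    }

    // Creates a Random seeded from a fresh GUID's hash code, so generators
    // created back to back in a tight loop do not depend on the clock.
    public static Random CreateGuidSeeded()
    {
        return new Random(Guid.NewGuid().GetHashCode());
    }
}
```

Calling SeedDemo.Draw(42, 5) twice should return identical arrays, which is exactly the problem when every generator in a loop ends up with the same clock-based seed. Generators from CreateGuidSeeded() should almost always start from different seeds.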

You'll notice that my letter arrays contain duplicates of some values - common letters like A, E, S, and T are repeated multiple times. That's a cheap trick for weighting a random pick: a value that appears more often in the array is proportionally more likely to be chosen.
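Here's the duplicate trick in isolation, as a rough sketch. The WeightedPickDemo class, its tiny letter list, and the Tally helper are assumptions made up for this example; the real arrays are in the listing further down.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class WeightedPickDemo
{
    // "e" appears three times, "a" twice, and "u" once, so a uniform pick
    // over the array favors "e" over "a" over "u".
    public static readonly string[] Letters = "e,e,e,a,a,u".Split(',');

    // Picks one entry uniformly at random; duplicates act as weights.
    public static string Pick(Random rng)
    {
        return Letters[rng.Next(Letters.Length)];
    }

    // Counts how often each distinct letter comes up over `draws` picks.
    public static Dictionary<string, int> Tally(Random rng, int draws)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();
        foreach (string letter in Letters)
        {
            counts[letter] = 0; // duplicates just reset the same key to zero
        }
        for (int i = 0; i < draws; i++)
        {
            counts[Pick(rng)]++;
        }
        return counts;
    }
}
```

Over a few thousand draws, the counts for "e", "a", and "u" should land near a 3:2:1 ratio. The trade-off is that weights can only be changed by editing the array, which is fine for a puzzle but gets clumsy for fine-grained tuning.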

I added common prefixes and suffixes after looking at the common surname list, then tweaked the weightings so they'd show up at the right frequency.

 

// Requires: using System; using System.Text.RegularExpressions;
private static string GenerateSurname()
{
    string name = string.Empty;
    string[] currentConsonant;
    // Repeated entries make a letter more likely to be picked
    string[] vowels = "a,a,a,a,a,e,e,e,e,e,e,e,e,e,e,e,i,i,i,o,o,o,u,y,ee,ee,ea,ea,ey,eau,eigh,oa,oo,ou,ough,ay".Split(',');
    string[] commonConsonants = "s,s,s,s,t,t,t,t,t,n,n,r,l,d,sm,sl,sh,sh,th,th,th".Split(',');
    string[] averageConsonants = "sh,sh,st,st,b,c,f,g,h,k,l,m,p,p,ph,wh".Split(',');
    string[] middleConsonants = "x,ss,ss,ch,ch,ck,ck,dd,kn,rt,gh,mm,nd,nd,nn,pp,ps,tt,ff,rr,rk,mp,ll".Split(','); //Can't start
    string[] rareConsonants = "j,j,j,v,v,w,w,w,z,qu,qu".Split(',');
    Random rng = new Random(Guid.NewGuid().GetHashCode()); //http://codebetter.com/blogs/59496.aspx
    int[] lengthArray = new int[] { 2, 2, 2, 2, 2, 2, 3, 3, 3, 4 }; //Favor shorter names but allow longer ones
    int length = lengthArray[rng.Next(lengthArray.Length)];
    for (int i = 0; i < length; i++)
    {
        // Each pass adds one consonant group followed by one vowel group
        int letterType = rng.Next(1000);
        if (letterType < 775) currentConsonant = commonConsonants;
        else if (letterType < 875 && i > 0) currentConsonant = middleConsonants;
        else if (letterType < 985) currentConsonant = averageConsonants;
        else currentConsonant = rareConsonants;
        name += currentConsonant[rng.Next(currentConsonant.Length)];
        name += vowels[rng.Next(vowels.Length)];
        if (name.Length > 4 && rng.Next(1000) < 800) break; //Getting long, must roll to save
        if (name.Length > 6 && rng.Next(1000) < 950) break; //Really long, roll again to save
        if (name.Length > 7) break; //Probably ridiculous, stop building and add ending
    }
    int endingType = rng.Next(1000);
    if (name.Length > 6)
        endingType -= (name.Length * 25); //Don't add long endings if already long
    else
        endingType += (name.Length * 10); //Favor long endings if short
    if (endingType < 400) { } // Ends with vowel
    else if (endingType < 775) name += commonConsonants[rng.Next(commonConsonants.Length)];
    else if (endingType < 825) name += averageConsonants[rng.Next(averageConsonants.Length)];
    else if (endingType < 840) name += "ski";
    else if (endingType < 860) name += "son";
    else if (Regex.IsMatch(name, "(.+)(ay|e|ee|ea|oo)$") || name.Length < 5)
    {
        //Short names, or names with one of these vowel endings, get a "Mc" prefix instead of a suffix
        name = "Mc" + name.Substring(0, 1).ToUpper() + name.Substring(1);
        return name;
    }
    else name += "ez";
    name = name.Substring(0, 1).ToUpper() + name.Substring(1); //Capitalize first letter
    return name;
}

Please feel free to submit your solution. We've only covered a few ethnicities here; there are plenty more to cover.
