Code Puzzle #2 - Generate random fake surnames - Jon Galloway

I'm talking to you, Mr. Thymmet! Step forward and be counted, Ms. Betusen! It's time for another Code Puzzle! My solution will be posted on Friday, 1/12. Get to work!

UPDATE: Read the recap here

Task: Write a single function which generates fake but passable surnames.

Rules / Notes:

  • Post your solution in the comments along with 25 results.
  • Names should be pronounceable and passable as real surnames. If you saw them on a business card, your reaction should at most be an inward chuckle.
  • Your solution should be capable of generating a large number of random names. (My solution generated 1.7+ million distinct names in a simple test.)
  • My solution creates surnames which would make sense in the US. Feel free to write something that targets different ethnicities, but please let us know which you're targeting.
  • Keep it simple. My solution is 33 lines and 2,700 characters, including some comments. My function takes no parameters and returns a string.
  • Use whatever language you want; please specify if it's not VB.NET or C# using .NET 2.0. Yeah, I know you can probably write this in 8 characters in PowerShell, Ruby, Python, and Boo. So do it!
  • Your code cannot reference any external resources (no internet access, no file or data access, etc.).
  • Your code doesn't need to worry about filtering obscenities (but please keep your obscene results to yourself).
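
The "simple test" for distinct names mentioned in the notes above could look something like the following. This is only a sketch: GenerateSurname here is a hypothetical stand-in built from made-up letter tables, not the solution described in this post, and CountDistinct is an illustrative helper name. It also assumes a newer framework than .NET 2.0, since HashSet<T> and Func<T> arrived in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctNameTest
{
    private static readonly Random Rng = new Random();

    // Hypothetical letter tables, chosen only for illustration.
    private static readonly string[] Onsets  = { "B", "D", "F", "H", "L", "M", "R", "S", "T", "Sh", "Th", "Gr" };
    private static readonly string[] Vowels  = { "a", "e", "i", "o", "ea", "ee", "ou" };
    private static readonly string[] Middles = { "t", "s", "l", "n", "r", "th", "ck", "ss" };
    private static readonly string[] Endings = { "en", "er", "on", "ey", "ett", "son" };

    // Stand-in generator: takes no parameters and returns a string, as the rules ask.
    public static string GenerateSurname()
    {
        return Pick(Onsets) + Pick(Vowels) + Pick(Middles) + Pick(Endings);
    }

    private static string Pick(string[] items)
    {
        // Random.Next(max) returns a value in [0, max), so every index is reachable.
        return items[Rng.Next(items.Length)];
    }

    // Calls the generator 'attempts' times and reports how many different names came back.
    public static int CountDistinct(Func<string> generator, int attempts)
    {
        HashSet<string> seen = new HashSet<string>();
        for (int i = 0; i < attempts; i++)
        {
            seen.Add(generator());
        }
        return seen.Count;
    }
}
```

Calling DistinctNameTest.CountDistinct(DistinctNameTest.GenerateSurname, 100000) gives a rough feel for how quickly a generator starts repeating itself. The stand-in above tops out at a few thousand names, so a generator aiming for millions needs larger tables or more variable structure.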

Judging criteria:

This is a fixed contest; my solution will win first place. However, to achieve glory in this contest you must:

  • Generate passable surnames
  • Use concise yet readable code
  • Try to be elegant, but things like magic numbers and hardcoded strings are fine by me

Sample output:

for (int i = 0; i < 50; i++)
    Console.WriteLine(GenerateSurname());

Greta
Slease
Sloje
Seethoson
Sleeslo
Tethe
Heshoez
Lackesm
Sorommees
McTeame
Louwheth
Sheysset
Sakneesm
Lathet
Tosoatesm
Betusen
McTasea
Fata
Thymmet
Degha
Thateyez
Thethatea
McShatte
Tethes

(Yes, these look silly when listed together. The idea is that they shouldn't stand out on a page in the phonebook. Let's see you do better!)

References:

UPDATE: Is this for spammers?
No. I was working on a sample application and wanted to generate a few hundred names to demonstrate paging through data. I thought about it, and I don't believe this will help spammers. It's very easy for spammers to just use lists of actual names (such as the common surnames list in the references above). Spam filtering algorithms can't block e-mail from common surnames, or they'd be blocking most of their legitimate e-mail traffic. It would be a waste of time for spammers to bother with generating random surnames.
Published Wednesday, January 10, 2007 12:16 AM by Jon Galloway

Comments

# re: Code Puzzle #2 - Generate random fake surnames

Is this something for spammers?

More algorithms, worse for spam filters?

Wednesday, January 10, 2007 6:48 AM by Pavel

# re: Code Puzzle #2 - Generate random fake surnames

My name's not silly, dangit!  Stop writing me into your code puzzles!

Wednesday, January 10, 2007 1:08 PM by Mr. Seethoson

# re: Code Puzzle #2 - Generate random fake surnames

Hmmm - none of the sample names start with a vowel? ;)

Thursday, January 11, 2007 12:18 PM by Wim Hollebrandse

# re: Code Puzzle #2 - Generate random fake surnames

This is great! And funny too... I had a lot of laughs building my solution, but it's longer than your solution, Jon. Hmmm... time to refactor!

Thursday, January 11, 2007 2:05 PM by Keith Rull

# re: Code Puzzle #2 - Generate random fake surnames

@Wim - You're right. I always started with consonants. It might work better if I allow some vowels to start. I'll give it a shot.

@Keith - Great! I was worried I wouldn't get any entries!

Thursday, January 11, 2007 2:50 PM by Jon Galloway

# re: Code Puzzle #2 - Generate random fake surnames

That was easy!

return "McSa" + Regex.Replace(Guid.NewGuid().ToString(),"\\d+|-","e") + "erson";

Thursday, January 11, 2007 2:53 PM by Mr. McSaeceeefaeeeeeedeffeeeerson

# re: Code Puzzle #2 - Generate random fake surnames

@Wim - Interesting! Your solution was very similar to mine. Mine has a few tweaks to give higher weight to more common vowel and consonant combinations, but they're remarkably similar.

Thursday, January 11, 2007 4:41 PM by Jon Galloway
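
Neither Jon's nor Wim's code appears in this thread, so the following is only a guess at what "higher weight to more common combinations" might mean in practice: a weighted random pick, where each letter group carries an integer weight. The class name, method name, and any weights passed in are assumptions for illustration.

```csharp
using System;

public static class WeightedPicker
{
    // Returns one item, chosen with probability proportional to its weight.
    // Assumes all weights are non-negative and at least one is positive.
    public static string Pick(string[] items, int[] weights, Random rng)
    {
        if (items.Length == 0 || items.Length != weights.Length)
        {
            throw new ArgumentException("items and weights must be non-empty and the same length");
        }

        int total = 0;
        foreach (int w in weights)
        {
            if (w < 0) throw new ArgumentException("weights must be non-negative");
            total += w;
        }
        if (total == 0) throw new ArgumentException("at least one weight must be positive");

        // roll lands somewhere in [0, total); walk the weights until it falls inside a bucket.
        int roll = rng.Next(total);
        for (int i = 0; i < items.Length; i++)
        {
            if (roll < weights[i]) return items[i];
            roll -= weights[i];
        }

        throw new InvalidOperationException("unreachable for valid weights");
    }
}
```

For example, WeightedPicker.Pick(new[] { "e", "a", "y" }, new[] { 6, 3, 1 }, random) would return "e" about six times in ten. The cost is a second array to maintain, which is the line-count trade-off Wim mentions in his reply.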

# re: Code Puzzle #2 - Generate random fake surnames

Hi Jon - was thinking of weighting, but that way I wouldn't have got my solution down to 20 odd lines. ;-)

And after all, it was just to generate test surnames. I could probably 'tune' the appropriate arrays a bit more, but hey, it was only a 5 minute job.

Thursday, January 11, 2007 6:13 PM by Wim Hollebrandse

# re: Code Puzzle #2 - Generate random fake surnames

I've done two versions, but each of them is 140+ lines long, so I guess I'm out. Anyway, it's been a lot of fun!! I'll post them tomorrow.

Friday, January 12, 2007 12:37 PM by Carlos M Perez

# re: Code Puzzle #2 - Generate random fake surnames

@Carlos - Looking forward to seeing what you came up with!

Friday, January 12, 2007 1:43 PM by Jon Galloway

# re: Code Puzzle #2 - Generate random fake surnames

Here's my solution using digraphs.

using System;
using System.Text.RegularExpressions;

namespace SurnameGenerator
{
   class Program
   {
       public static void Main(string[] args)
       {
           for (int count = 0; count < 25; count++)
           {
               Console.WriteLine(GenerateName());
           }
           Console.ReadKey();
       }

       // Rejects candidates containing letter patterns that don't look like surnames.
       private static Regex invalidNameRegex = new Regex(@"(?:([aiuy])\1)|(?:(\w\w)\2)|([^aeiouy]{3,})|([aeiouy]{3,})|(\wy\w)|(^nd)|(^nt)|(^rt)|(^rs)|(^ht)|([aiou]$)|(tw$)");

       private static string[] digraphs = new string[]
           { "en", "re", "er", "nt", "th", "on", "in", "te", "an", "or", "st",
             "ed", "ne", "ve", "es", "nd", "to", "se", "at", "ti", "ar", "ee",
             "rt", "as", "co", "io", "ty", "fo", "fi", "ra", "et", "le", "ou",
             "ma", "tw", "ea", "is", "si", "de", "hi", "al", "ce", "da", "ec",
             "rs", "ur", "ni", "ri", "el", "la", "ro", "ta"
           };

       private static Random random = new Random();

       public static string GenerateName()
       {
           string potentialName;
           do
           {
               potentialName = "";
               // Pick the length once (2 to 4 digraphs); calling random.Next in the
               // loop condition would re-roll the limit on every pass.
               int digraphTotal = random.Next(2, 5);
               for (int digraphCount = 0; digraphCount < digraphTotal; digraphCount++)
               {
                   // Next's upper bound is exclusive, so use Length rather than
                   // GetUpperBound(0); otherwise the last digraph is never chosen.
                   potentialName += digraphs[random.Next(0, digraphs.Length)];
               }
           } while (invalidNameRegex.IsMatch(potentialName));
           return potentialName.Substring(0, 1).ToUpper() + potentialName.Substring(1);
       }
   }
}

And the names generated:

Nicolaed
Iont
Esnith
Erelec
Asraed
Raerse
Rost
Vecoas
Ceener
Hisior
Raetelde
Dastur
Toed
Onende
Arfind
Eendel
Fitouren
Alraur
Erenet
Elce
Five
Orisre
Altirais
Eerore
Ator

Friday, January 12, 2007 4:32 PM by Rich McCollister

# re: Code Puzzle #2 - Generate random fake surnames

Hmmm, the blog seems to have cut off the end of my regular expression.

Here's the line that got cut off, in a more readable form:

       private static Regex invalidNameRegex = new Regex(@"(?:([aiuy])\1)|
                                                           (?:(\w\w)\2)|
                                                           ([^aeiouy]{3,})|
                                                           ([aeiouy]{3,})|
                                                           (\wy\w)|
                                                           (^nd)|
                                                           (^nt)|
                                                           (^rt)|
                                                           (^rs)|
                                                           (^ht)|
                                                           ([aiou]$)|
                                                           (tw$)", RegexOptions.IgnorePatternWhitespace);

Friday, January 12, 2007 4:40 PM by Rich McCollister
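
The multi-line form works because RegexOptions.IgnorePatternWhitespace tells .NET to skip unescaped whitespace (including line breaks) in the pattern. A small sketch of that equivalence, using a shortened pattern with only three of the alternatives above; the class and member names are made up for the example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternWhitespaceDemo
{
    // Shortened pattern written on a single line.
    public static readonly Regex OneLine =
        new Regex(@"([^aeiouy]{3,})|([aeiouy]{3,})|(tw$)");

    // The same alternatives split across lines; the option makes the
    // line breaks and indentation inside the verbatim string insignificant.
    public static readonly Regex MultiLine =
        new Regex(@"([^aeiouy]{3,})|
                    ([aeiouy]{3,})|
                    (tw$)", RegexOptions.IgnorePatternWhitespace);

    // True when both forms give the same verdict for a candidate name.
    public static bool Agree(string candidate)
    {
        return OneLine.IsMatch(candidate) == MultiLine.IsMatch(candidate);
    }
}
```

Without the option, the spaces and newlines in the second pattern would be treated as literal characters that a name has to contain, so those alternatives would effectively never match.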

# re: Code Puzzle #2 - Generate random fake surnames

yellow is my favorite color.

14 is my favorite number

Wednesday, December 30, 2009 7:44 PM by brianna dugan
