Command Line Confusion

I kept getting weird errors in a simple console application that takes a regular expression as an argument. The regular expression kept failing with an "Illegal \ at end of pattern" error. The odd thing was that I was properly escaping the \ as \\.

After some testing, I think I've convinced myself that this had nothing to do with regular expressions; there's something wacky about the way the .NET Framework handles command-line arguments which end in \\". Here's a simplified argument test application:

    using System;

    public class Test
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Environment.CommandLine: " + Environment.CommandLine);
            for (int i = 0; i < args.Length; i++)
            {
                OutputArg(i, args[i]);
                Console.WriteLine();
            }
        }

        static void OutputArg(int i, string arg)
        {
            Console.WriteLine("Environment.GetCommandLineArgs()[{0}]: {1}", i + 1, Environment.GetCommandLineArgs()[i + 1]);
            Console.WriteLine("args[{0}]: {1}", i, arg);
            //foreach(char c in arg) Console.WriteLine(c);
        }
    }

First, let's test it to see how it handles quoted and unquoted strings:

    C:\Temp>CommandLineArgumentsTest.exe "monkey" potato
    Environment.CommandLine: CommandLineArgumentsTest.exe "monkey" potato
    Environment.GetCommandLineArgs()[1]: monkey
    args[0]: monkey
    Environment.GetCommandLineArgs()[2]: potato
    args[1]: potato

Everything seems pretty good there - we've passed a quoted string and an unquoted string, and both work just fine.

Now, let's try it with some strings that end with two backslashes (\\):

    C:\Temp>CommandLineArgumentsTest.exe "\\test\\" "\\test\\\"
    Environment.CommandLine: CommandLineArgumentsTest.exe "\\test\\" "\\test\\\"
    Environment.GetCommandLineArgs()[1]: \\test\
    args[0]: \\test\
    Environment.GetCommandLineArgs()[2]: \\test\"
    args[1]: \\test\"

See what I'm talking about? The last \ vanishes like X-Files was all up in this place. Normally when things get strange, I turn to my good friend Reflector for some insight, but mscorlib.System.Environment.GetCommandLineArgsNative() is, rather unsurprisingly, native code and is thus well sheltered from Reflector's prying eyes.

After a bit of thought, it looks like the second backslash is being "used up" by escaping the final quote. Let's try it with an unquoted argument:

    C:\Documents and Settings\Jon\My Documents\My Code Snippets>CommandLineArgumentsTest.exe \\test\\
    Environment.CommandLine: CommandLineArgumentsTest.exe \\test\\
    Environment.GetCommandLineArgs()[1]: \\test\\
    args[0]: \\test\\

Sure enough, everything's great there. That doesn't solve the problem, though - if there's a space in the string, it needs to be quoted so it's handled as one argument (e.g. "\\t e s t\\"). Well, we have enough info at this point to hack together a solution, but I don't like it at all. To pass \\test\\ as a quoted command-line argument, we need to pass in "\\test\\\\":

    C:\Temp>CommandLineArgumentsTest.exe "\\test\\\\"
    Environment.CommandLine: CommandLineArgumentsTest.exe "\\test\\\\"
    Environment.GetCommandLineArgs()[1]: \\test\\
    args[0]: \\test\\

It's still kind of odd to me, though - why doesn't it display that quote in \\test\\\", since it's been escaped? If GetCommandLineArgs() is seeing the last two characters as \", wouldn't it make more sense for args[0] to be \\test\" rather than \\test\ ? As I understand it, escaping in both C# and RegEx is supposed to be handled in a left-to-right fashion, but this seems to be working right to left.
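The doubling workaround can be wrapped in a small helper so it doesn't have to be done by hand. What follows is only a minimal sketch, not anything from the .NET Framework: ArgumentQuoter and Quote are names made up for illustration, and it assumes the receiving program parses its command line the same way the test application above does. It also writes an embedded quote as \", which is how this parsing convention expects a literal quote to be passed.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of the .NET Framework.
public static class ArgumentQuoter
{
    // Returns arg in a form that should survive command-line parsing intact.
    public static string Quote(string arg)
    {
        // Non-empty arguments without whitespace or quotes can be passed as-is.
        if (arg.Length > 0 && arg.IndexOfAny(new[] { ' ', '\t', '"' }) < 0)
            return arg;

        var sb = new StringBuilder();
        sb.Append('"');
        int backslashes = 0; // backslashes seen but not yet written

        foreach (char c in arg)
        {
            if (c == '\\')
            {
                backslashes++;
            }
            else if (c == '"')
            {
                // Double the pending backslashes and add one more to escape the quote.
                sb.Append('\\', backslashes * 2 + 1);
                sb.Append('"');
                backslashes = 0;
            }
            else
            {
                // Backslashes not followed by a quote are written unchanged.
                sb.Append('\\', backslashes);
                sb.Append(c);
                backslashes = 0;
            }
        }

        // Double trailing backslashes so they don't swallow the closing quote.
        sb.Append('\\', backslashes * 2);
        sb.Append('"');
        return sb.ToString();
    }
}
```

Calling ArgumentQuoter.Quote(@"\\t e s t\\") returns "\\t e s t\\\\" (including the surrounding quotes), while an argument with no spaces or quotes, such as \\test\\, comes back unchanged.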

7 Comments

  • I'd put my money on Raymond for this one, too, but I didn't find it on his blog. I searched on "args", "arguments", and "command line".

  • Hmm. Raymond showed me that my example demonstrates that it's not a Windows thing, it's either a .NET Framework thing or a Jon foolishness thing.
    http://blogs.msdn.com/oldnewthing/archive/2006/09/29/776926.aspx#780420

    Who to ask next? Am I looking for a left handed smoke shifter?

  • Thanks so much, Carlos. That explains it perfectly!

    CommandLineToArgvW has a special interpretation of backslash characters when they are followed by a quotation mark character ("), as follows:

    * 2n backslashes followed by a quotation mark produce n backslashes followed by a quotation mark.
    * (2n) + 1 backslashes followed by a quotation mark again produce n backslashes followed by a quotation mark.
    * n backslashes not followed by a quotation mark simply produce n backslashes.
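    As quoted, the first two rules read identically. Judging by the test output in the post, the difference is that after an even run of backslashes the quotation mark acts as a delimiter (it opens or closes the quoted section and is not part of the argument), while after an odd run it is a literal quote. Here is a minimal sketch of the rules under that reading. CommandLineSketch and Split are made-up names, it handles only the arguments after the program name, and it ignores other quirks of the real parser (such as two quotes in a row inside a quoted section), so treat it as an illustration rather than a reimplementation of CommandLineToArgvW.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of the backslash/quote rules; not the real Windows parser.
public static class CommandLineSketch
{
    // Splits the argument portion of a command line (program name excluded).
    public static List<string> Split(string commandLine)
    {
        var args = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool hasArg = false; // true once the current argument has started
        int i = 0;

        while (i < commandLine.Length)
        {
            char c = commandLine[i];
            if (c == '\\')
            {
                // Count the whole run of backslashes.
                int count = 0;
                while (i < commandLine.Length && commandLine[i] == '\\') { count++; i++; }

                if (i < commandLine.Length && commandLine[i] == '"')
                {
                    current.Append('\\', count / 2);   // 2n or 2n+1 backslashes give n
                    if (count % 2 == 1)
                        current.Append('"');           // odd run: literal quote
                    else
                        inQuotes = !inQuotes;          // even run: quote is a delimiter
                    i++;
                }
                else
                {
                    current.Append('\\', count);       // not before a quote: unchanged
                }
                hasArg = true;
            }
            else if (c == '"')
            {
                inQuotes = !inQuotes;
                hasArg = true;
                i++;
            }
            else if ((c == ' ' || c == '\t') && !inQuotes)
            {
                // Unquoted whitespace ends the current argument, if any.
                if (hasArg) { args.Add(current.ToString()); current.Length = 0; hasArg = false; }
                i++;
            }
            else
            {
                current.Append(c);
                hasArg = true;
                i++;
            }
        }

        if (hasArg) args.Add(current.ToString());
        return args;
    }
}
```

    Fed the argument strings from the post, the sketch gives the same results as the test application: "\\test\\" comes out as \\test\, "\\test\\\" as \\test\", and "\\test\\\\" as \\test\\.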

  • This is at least the second post of yours that has saved me hours of work. Thanks Jon!

  • @Alexander - Glad it helped. This one confused the heck out of me, so I'm glad it saved you some time.

  • The error below is purely a Regex error. It has nothing to do (directly) with the way the command line parses the argument. My guess is that the command-line parsing internally uses Regex...

    Error:

    parsing "\14\2415\" - Illegal \ at end of pattern.

  • What crackpot came up with those rules:

    " * 2n backslashes followed by a quotation mark produce n backslashes followed by a quotation mark.

    * (2n) + 1 backslashes followed by a quotation mark again produce n backslashes followed by a quotation mark.

    * n backslashes not followed by a quotation mark simply produce n backslashes."

    Seriously, not even on my craziest of off days could I come up with something as, well... crazy.

    Thanks for investigating.

    I'm passing some xml as a parameter and it makes a right mess of it when it comes out the other side.
