Does this make sense to anyone? Is it a bug in String.Compare()?

Given the following unit test:

<Test()> Public Sub TestSortingAlgorithm()

    Dim test1, test2, test3, test4 As String

    ' The second pair is the first pair with a comma appended to each string.
    test1 = "'"
    test2 = "*"
    test3 = "',"
    test4 = "*,"

    Dim formatString As String = "Compare ""{0}"" to ""{1}"": Result: {2}.{3}"

    ' String.Compare performs a culture-sensitive comparison by default.
    Dim Result1 As Integer = String.Compare(test1, test2)
    Dim Result2 As Integer = String.Compare(test3, test4)

    Debug.WriteLine(String.Format(formatString, test1, test2, Result1, vbCrLf))
    Debug.WriteLine(String.Format(formatString, test3, test4, Result2, vbCrLf))

    Assert.AreEqual(Result1, Result2, "It doesn't make sense that these two sets of strings should be sorted differently.")
End Sub

Here is the output:

Compare "'" to "*": Result: -1.

Compare "'," to "*,": Result: 1.

TestCase 'TestFixture.TestSortingAlgorithm' failed: It doesn't make sense that these two sets of strings should be sorted differently.

expected:<-1>

but was:<1>


UPDATE:

The following SQL (by way of comparison)

DECLARE @Temp TABLE
(
     string varchar(50)  
)

INSERT INTO @Temp (string)
VALUES('''')

INSERT INTO @Temp (string)
VALUES('*')

INSERT INTO @Temp (string)
VALUES(''',')

INSERT INTO @Temp (string)
VALUES('*,')


SELECT *
FROM @Temp
ORDER BY String ASC
GO

yields the following output:

'
',
*
*,

Comments

# re: Does this make sense to anyone?

Monday, May 02, 2005 9:31 PM by John Bates

What locale is the code running in? From the .NET Framework Developer's Guide: "By default, the String.Compare method performs culture-sensitive and case-sensitive comparisons." The four-argument Compare overload lets you specify the culture (maybe try CultureInfo.InvariantCulture).
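
A minimal sketch of the overload mentioned above, assuming the four-argument form String.Compare(strA, strB, ignoreCase, culture) and reusing the strings from the test. The module name is made up for illustration. Whether the invariant culture changes either result is not established in this thread, so no output is claimed here.

```vb
Imports System
Imports System.Globalization

Module CultureCompareSketch ' hypothetical name, for illustration only
    Sub Main()
        ' Same inputs as the unit test in the post.
        Dim test1 As String = "'"
        Dim test2 As String = "*"
        Dim test3 As String = "',"
        Dim test4 As String = "*,"

        ' Name the culture explicitly instead of relying on the current thread's culture.
        ' False = case-sensitive, matching the default behavior of String.Compare.
        Dim culture As CultureInfo = CultureInfo.InvariantCulture
        Dim result1 As Integer = String.Compare(test1, test2, False, culture)
        Dim result2 As Integer = String.Compare(test3, test4, False, culture)

        Console.WriteLine("Current culture: {0}", CultureInfo.CurrentCulture.Name)
        Console.WriteLine("Invariant compare 1: {0}", result1)
        Console.WriteLine("Invariant compare 2: {0}", result2)
    End Sub
End Module
```

Printing the current culture name alongside the results answers the locale question directly and makes it easier to compare runs on different machines.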

# re: Does this make sense to anyone?

Tuesday, May 03, 2005 3:24 AM by nsimeonov

It looks to me like what this guy is testing is whether both pairs, ( ' and * ) and ( ', and *, ), give the same comparison result, because the second pair starts exactly the same as the first. No matter the culture or anything else, I think these pairs should compare the same way.

It's like trying to sort

(a, b)
and
(ac, bc)

For both sets, String.Compare should return the same result.

# re: Does this make sense to anyone? Is it a bug in String.Compare()?

Tuesday, May 03, 2005 10:09 AM by Chris McKenzie

That's right. It shouldn't matter what the CultureInfo is: either "'" < "*" or "'" > "*". The rest doesn't matter.

I ran this same test in C last night. The strcmp() function in C works as I expected it to.
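
For comparison, the closest .NET counterpart to strcmp() is probably an ordinal comparison, which looks only at character code values and ignores culture rules. Below is a minimal sketch, assuming String.CompareOrdinal and the strings from the test (the module name is made up). Because the apostrophe (U+0027) has a lower code value than the asterisk (U+002A), both calls return a negative number.

```vb
Imports System

Module OrdinalCompareSketch ' hypothetical name, for illustration only
    Sub Main()
        ' Ordinal comparison looks only at UTF-16 code values,
        ' much like strcmp() looks only at byte values.
        Dim result1 As Integer = String.CompareOrdinal("'", "*")
        Dim result2 As Integer = String.CompareOrdinal("',", "*,")

        ' Only the sign is meaningful, so normalize it before printing.
        Console.WriteLine(Math.Sign(result1)) ' -1
        Console.WriteLine(Math.Sign(result2)) ' -1
    End Sub
End Module
```

Both pairs are decided by their first character, so appending the same comma to each string cannot change the outcome under ordinal rules.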

# re: Does this make sense to anyone? Is it a bug in String.Compare()?

Tuesday, May 03, 2005 2:09 PM by Philip Rieck

String.Compare uses word sort rules, not string sort rules. This gives different weights to characters so that things like "I'll" sort close to "Ill", even though by string sort rules there's a big difference in the second character.

So it's assigning a different weight to the ' character based on whether or not it's part of a "word".


If you don't want it to act this way (and want to do a string compare instead of a word compare), you need to get the CompareInfo object from the current culture, and pass in CompareOptions.StringSort as the CompareOptions param:

Result1 = CultureInfo.CurrentCulture.CompareInfo.Compare(test1, test2, CompareOptions.StringSort)
Result2 = CultureInfo.CurrentCulture.CompareInfo.Compare(test3, test4, CompareOptions.StringSort)

I don't like this behavior too much either. And I don't like that String.Compare has no overload with a CompareOptions param.
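
To see the two rule sets side by side, here is a self-contained sketch that runs both pairs through CompareInfo.Compare with the default options and with CompareOptions.StringSort. The module name is hypothetical, and the results depend on the culture and framework version, so no output is claimed. As an aside, later framework versions (.NET 2.0 onward) do offer a String.Compare(strA, strB, culture, options) overload, which the last line uses; treat that line as an assumption if you are on 1.x.

```vb
Imports System
Imports System.Globalization

Module StringSortSketch ' hypothetical name, for illustration only
    Sub Main()
        Dim culture As CultureInfo = CultureInfo.CurrentCulture
        Dim ci As CompareInfo = culture.CompareInfo

        ' Default options: word sort, where the apostrophe gets special treatment.
        Console.WriteLine("Word sort 1:   {0}", ci.Compare("'", "*", CompareOptions.None))
        Console.WriteLine("Word sort 2:   {0}", ci.Compare("',", "*,", CompareOptions.None))

        ' StringSort: the apostrophe is treated like any other symbol.
        Console.WriteLine("String sort 1: {0}", ci.Compare("'", "*", CompareOptions.StringSort))
        Console.WriteLine("String sort 2: {0}", ci.Compare("',", "*,", CompareOptions.StringSort))

        ' Assumes .NET 2.0 or later: the same options passed through String.Compare.
        Console.WriteLine("String.Compare: {0}", String.Compare("',", "*,", culture, CompareOptions.StringSort))
    End Sub
End Module
```

If the two "String sort" lines agree in sign while the "Word sort" lines do not, that supports the explanation that word sort weighting of the apostrophe is behind the failing test.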

# re: Does this make sense to anyone? Is it a bug in String.Compare()?

Friday, May 06, 2005 11:19 AM by Chris McKenzie

Sounds like a good idea. How would you suggest doing this in the DataView.Sort method?

# re: Does this make sense to anyone? Is it a bug in String.Compare()?

Thursday, May 12, 2005 4:10 PM by Chris McKenzie

Okay, I tried Philip's suggestion and it doesn't work. Any other ideas?
