Chris Garrett

Just Another Web Developer

Emailing UTF-8 to Hotmail?

It looks suspiciously like Hotmail refuses to display UTF-8 encoded email correctly. Is this correct?

Unfortunately my content is in UTF-8 and I can't just ignore Hotmail users :O(

I am using AspNetEmail and I tried using ISO-8859-1 as the character set. That didn't work, so I tried converting my UTF-8 encoded text into ISO-8859-1 using the following function:

Function UTFISO(ByVal src As String) As String
    Dim iso As Encoding = Encoding.GetEncoding("iso-8859-1")
    Dim utf As Encoding = Encoding.UTF8
    Dim unicodeBytes As Byte() = utf.GetBytes(src)
    Return iso.GetString(unicodeBytes)
End Function

Still no joy.

Any ideas anyone?

Posted: Jun 06 2005, 05:26 PM by chrisg | with 9 comment(s)

Comments

Dean Harding said:

I posted about this very topic just yesterday on my own blog: http://www.codeka.com/blogs/index.php/dean/2005/06/06/hello_hotmail

What I've found is that Hotmail will accept whatever encoding you give it, but it *always* reports to the browser an encoding of iso-8859-1. Which means if you're passing it utf-8, anything outside of us-ascii is displayed as garbage.

I'm not sure how AspNetEmail works, but you need to set the encoding on the actual email object to iso-8859-1. What's the code you use to set the message body?
# June 6, 2005 8:22 PM

Chris said:

I just set the msg.CharSet property, but I think I also need to convert the UTF-8 encoded content I am getting from the db, hence the function above?
# June 7, 2005 4:02 AM

Dean Harding said:

Hmm, just looking at their example page (http://www.aspnetemail.com/samples/advanced.aspx), it looks like just setting the CharSet property should work.

The problem with your method is that it takes a Unicode string (all .NET strings are Unicode), gets a UTF-8 representation of that string with GetBytes, and then interprets those UTF-8 encoded bytes as if they were iso-8859-1 with GetString. That doesn't convert anything; it just turns every non-ASCII character into two wrong ones.
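
To see the effect concretely, here is a minimal Python sketch (not .NET code) of what that GetBytes/GetString pair does. The names utf_iso and to_iso_bytes are made up for illustration: the first mirrors the UTFISO function from the post, the second shows what an actual conversion to iso-8859-1 bytes would look like.

```python
SAMPLE = "Espa\u00f1a"  # "España"


def utf_iso(src: str) -> str:
    """Python stand-in for UTFISO: encode as UTF-8, then misread as Latin-1."""
    utf8_bytes = src.encode("utf-8")        # n-tilde becomes the two bytes C3 B1
    return utf8_bytes.decode("iso-8859-1")  # each byte is read as its own character


def to_iso_bytes(src: str) -> bytes:
    """A real conversion: encode the Unicode string directly as Latin-1."""
    return src.encode("iso-8859-1")         # n-tilde becomes the single byte F1


mangled = utf_iso(SAMPLE)         # "EspaÃ±a": 7 characters instead of 6
converted = to_iso_bytes(SAMPLE)  # b"Espa\xf1a": 6 bytes, one per character
```

The mangled value is the same pattern a UTF-8 message shows when a browser is told the page is iso-8859-1, so the function reproduces the problem rather than fixing it.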

So yeah, just try setting the CharSet property to "iso-8859-1" and don't do anything with the actual body string and see how that goes.
# June 7, 2005 6:25 AM

Chris said:

Hi Dean,

I tried that, but the content in the db was entered using UTF-8, so I am thinking I need to convert it into ISO to send it as ISO?
# June 7, 2005 6:56 AM

Dean Harding said:

Are you using SQL Server? SQL Server doesn't support UTF-8; to use Unicode with SQL Server you need to store the data in an NCHAR, NVARCHAR or NTEXT column (as opposed to a normal CHAR, VARCHAR or TEXT column). This means the data is actually stored in UTF-16. When you then use the .NET data access client (anything under System.Data), it'll automatically convert from whatever encoding is in the database to .NET's native Unicode type.

So when you assign a string to the Body property of the AspNetEmail object, it'll use its CharSet property to encode the MIME content of the actual email.

What do you see when you send a mail to Hotmail? Do you get garbage or a bunch of question marks? Garbage means you're telling Hotmail it's UTF-8 and it's just decoding it and telling the browser it's iso-8859-1. If it's question marks, then something's converting your text from Unicode to a character set that doesn't support whatever the character is (for example, you can't convert an Arabic character to iso-8859-1).
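
The question-mark case can be illustrated with a short Python sketch (again a stand-in, not .NET code). The helpers fits_charset and lossy are hypothetical names; they only show how a lossy conversion to a charset that lacks a character produces "?" and how one could check up front whether a given body fits a given charset.

```python
SAMPLE = "Espa\u00f1a"                      # Spanish, representable in iso-8859-1
ARABIC = "\u0645\u0631\u062d\u0628\u0627"   # five Arabic letters, not in iso-8859-1


def fits_charset(text: str, charset: str) -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def lossy(text: str, charset: str) -> str:
    """Convert with replacement: unencodable characters come back as '?'."""
    return text.encode(charset, errors="replace").decode(charset)


spanish_ok = fits_charset(SAMPLE, "iso-8859-1")   # True
arabic_ok = fits_charset(ARABIC, "iso-8859-1")    # False
arabic_seen = lossy(ARABIC, "iso-8859-1")         # "?????"
```

A check like this, run per message, would tell you whether a given country's text fits the charset you plan to declare before anything is sent.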
# June 7, 2005 7:14 AM

Chris said:

Yes, I am using NText in SQL Server.

In Hotmail the foreign characters are coming out like EspaÃ±a rather than España.

What I was hoping to do was to use UTF-8 or, failing that, whatever character set suits the country I am sending to (there are 60+ geographies, each with their own email).

Right now, if I can get Spanish working it would be a big step forward!

Cheers

Chris
# June 7, 2005 8:03 AM

Dean Harding said:

Yeah, it looks like it's still being sent as UTF-8 (I assume if you click on View->Encoding and select UTF-8 it looks OK?)

From what I can gather of the AspNetEmail library, the following should work:

EmailMessage msg = new EmailMessage( "mailserver" );
msg.FromAddress = "from@example.com";
msg.To = "to@example.com";
msg.CharSet = "ISO-8859-1";
msg.Subject = "Message Subject, blah, blah.";
msg.Body = "Some string";
# June 7, 2005 9:36 AM

Chris said:

Hi Dean,

Yeah, that is what I am doing. The latest development is that I have had some success by HTML-encoding the content and then turning the markup characters back:

HtmlBody = Server.HtmlEncode(HtmlBody)
HtmlBody = Replace(HtmlBody, "&lt;", "<", 1, -1, CompareMethod.Text)
HtmlBody = Replace(HtmlBody, "&gt;", ">", 1, -1, CompareMethod.Text)
HtmlBody = Replace(HtmlBody, "&quot;", """", 1, -1, CompareMethod.Text)
HtmlBody = Replace(HtmlBody, "&apos;", "'", 1, -1, CompareMethod.Text)
HtmlBody = Replace(HtmlBody, "&amp;", "&", 1, -1, CompareMethod.Text)
msg.HtmlBodyPart = HtmlBody

Dave Wanta says there is a utility as part of AspNetEmail which will do this more elegantly; I just need to work that out :O)

I also need to figure out whether it works when using the other character sets, such as Cyrillic and Japanese.
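
The general idea behind the workaround, replacing every non-ASCII character with a numeric character reference so the HTML body is pure ASCII, can be sketched in Python as below. This is an illustration of the technique only: to_char_refs is a made-up name, and the sketch makes no claim about which characters Server.HtmlEncode or the AspNetEmail utility actually encode.

```python
def to_char_refs(html_body: str) -> str:
    """Replace each non-ASCII character with a decimal reference such as &#241;.

    ASCII characters, including the markup characters < > & and quotes,
    are left untouched, so existing tags in the body keep working.
    """
    return html_body.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


spanish = to_char_refs("<p>Espa\u00f1a</p>")   # "<p>Espa&#241;a</p>"
cyrillic = to_char_refs("\u041f")              # "&#1055;"
japanese = to_char_refs("\u65e5")              # "&#26085;"
```

Because the result contains only ASCII, it reads the same whichever charset the mail client reports to the browser, which is why this approach does not depend on picking a per-country character set.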

Chris
# June 7, 2005 10:07 AM