Notepad bug? Encoding issue?

Posted Friday, February 27, 2004 5:09 PM by CumpsD

Someone showed me a weird text file today. It was a bat file containing 'copy MeYou.bak MeYou.txt'. When you ran it, it worked. But when you opened it in Notepad, nothing was visible.

So we decided to look into this, and here is what we came up with to 'create' invisible text:

Open Notepad and enter:
' abc.bak abc.txt'

(That is: space abc dot bak space abc dot txt, no line break, without the quotes)

It doesn't work with every string, so follow this example and use exactly that one.

Save your file. Notepad picks ANSI as the default encoding.

Open your file. Notepad seems to open it as Unicode by default.

Your text is now invisible.



Does anyone know why the saving default is different from the opening default?

And why does it happen with that particular piece of text? It doesn't happen with ' b.b b.b' or ' .bak .txt'.

The file looks the same when viewed in a hex editor, but apparently it has something to do with encoding.

Can anyone explain?

Update: When you paste it in IE or Trillian you get '????????', as some people found in the comments ;)

Update2: In my Notepad screenshot the font was Terminal; when I choose Verdana, they are indeed squares. Not that invisible anymore, but still wrong :)

Update3: You can find an explanation on why this is happening at The Old New Thing.
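
For what it's worth, here is a small Python sketch of what seems to be going on. It is not Notepad's code. It only assumes that the 16 ANSI bytes get read back as UTF-16 little-endian, two bytes per character, and checks where the resulting characters land. The variable names and the list of code pages are my own choices for illustration.

```python
# The exact 16 bytes written when the example line is saved as ANSI.
text = " abc.bak abc.txt"
raw = text.encode("ascii")

# Assumption: the reader guesses UTF-16LE, so each pair of bytes
# becomes one 16-bit code unit (low byte first).
misread = raw.decode("utf-16-le")

print(len(raw), len(misread))
print(" ".join(f"U+{ord(ch):04X}" for ch in misread))

# Check whether every misread character lies in the CJK Unified
# Ideographs block (U+4E00..U+9FFF).
all_cjk = all(0x4E00 <= ord(ch) <= 0x9FFF for ch in misread)
print("all CJK ideographs:", all_cjk)

# The bytes are plain ASCII, so common single- and double-byte code pages
# (and UTF-8) all decode them back to the original text.
same_everywhere = all(
    raw.decode(codec) == text
    for codec in ("cp1252", "cp437", "cp850", "cp932", "utf-8")
)
print("ANSI/UTF-8 round trip:", same_everywhere)
```

The 16 bytes turn into 8 characters, which matches the eight '?' marks people saw, and a font without glyphs for those characters would show squares or nothing at all.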

Comments

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 11:11 AM by Dhoore

omg, this is really strange!!
when i do this test, i see lots of squares like this: ????????
(dunno if that worked)
so it's not really invisible but still very strange

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 11:15 AM by NibbleR

In Notepad they are squares; when I copied it here they were strange Chinese characters. After posting it became ????????

hehe :)

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 1:32 PM by Bobfox

Another strange one: if I create the file with UltraEdit and then load it in Notepad, I get the same squares. If I load it in UltraEdit, it's OK.

???

Robert

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 1:53 PM by Bertg

Maybe an easter egg :p

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:15 PM by bas westerbaan

Well.. it is just an encoding issue. Nothing more :)
I think it is silly that Notepad even has encodings.
It is time for a new 'notepad' with only plain DOS text,
in C++ for speed of course.
Interested?

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:38 PM by bas westerbaan

BTW, it works with that little file

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:56 PM by bas westerbaan

Huh? Damn.. please give me your email and I'll send it manually.. strange.. I'll take a look at the source

# re: Notepad bug? Encoding issue?

Saturday, February 28, 2004 5:28 PM by Bart De Smet

Weird problem indeed. When you open the file using UTF-8 encoding, nothing goes wrong, but when you open it using the default ASCII, it goes wrong. :-(

# re: Notepad bug? Encoding issue?

Sunday, February 29, 2004 6:18 PM by dotnetjunkie

this is why:
Because of the encoding issue, Notepad sees the character set as Chinese, which needs the font 'SimSun'. If that's not present on your system, or you're using another font (as in Notepad), you get empty space or squares...

Try copying and pasting the text into WordPad, and you'll see that the font dropdown box shows the font SimSun.

# re: Notepad bug? Encoding issue?

Wednesday, March 24, 2004 7:36 PM by Norman Diamond

I don't have SimSun or any other Chinese font, so Notepad must be substituting a Japanese font. But no matter what encoding Notepad guessed, it isn't a Japanese encoding. The display consists of:
a full-width Kanji character
a half-width black rectangle
three full-width Kanji characters
a half-width black rectangle
a full-width Kanji character
a half-width black rectangle

The half-width black rectangles are the same as Notepad normally displays for a single byte value which is neither a valid single-byte character nor the first byte of a valid double-byte character.

# re: Notepad bug? Encoding issue?

Thursday, March 25, 2004 8:11 PM by Norman Diamond

3/25/2004 8:52 PM Sikko2go:

> It's not really an 'issue'. It's just that Notepad is not capable of
> displaying all kinds of Unicode variants

It is an issue. Notepad isn't displaying a file's contents that are perfectly well encoded in Windows's main, system, default code page. In Japanese Windows systems this is code page 932 (Shift-JIS), set by default at the beginning of the install process, and rarely changed (I didn't change it). In US Windows systems I thought it would be either code page 437 or 850, but someone told me it's something different (without saying which one), but still, surely it gets set by default at the beginning of the install process and most US users don't change it. When Notepad can't even display those files correctly, Unicode variants are not to blame.

# re: Notepad bug? Encoding issue?

Sunday, June 13, 2004 3:25 AM by priyanka

I want to save the text file with encoding "Unicode"... can anyone help
me as soon as possible?
Thanks

# re: Notepad bug? Encoding issue?

Tuesday, October 9, 2007 12:10 PM by ALARCON

NEED TO ENCODE NOTEPAD.  ALL I SEE ARE SQUARE BOXES AND SOME LETTERS...PLS HELP