Notepad bug? Encoding issue?

Posted Friday, February 27, 2004 5:09 PM by CumpsD

Someone showed me a weird text file today. It was a batch file containing 'copy MeYou.bak MeYou.txt'. When you ran it, it worked. But when you opened it in Notepad, nothing was visible.

So we decided to look a bit into this and here is something we came up with to 'create' invisible text:

Open notepad and enter:
' abc.bak abc.txt'

(That is: space abc dot bak space abc dot txt, no line break, without the quotes)

It doesn't work with every string, so follow along and use this exact example.

Save your file. Notepad picks ANSI as the default encoding.

Open your file again. Notepad seems to open it in Unicode encoding by default.

Your text is now invisible.



Does anyone know why the saving default is different from the opening default?

And why does it happen with that particular piece of text? It doesn't happen with ' b.b b.b' or ' .bak .txt'.

It looks the same when viewing it through a hex editor. But apparently it has something to do with encoding.
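To make the hex editor observation concrete, here is a minimal Python sketch (not something we ran at the time) that takes the bytes of the example line and reads them two ways. The choice of cp1252 as the "ANSI" code page is an assumption about a Western Windows setup; the variable names are made up for illustration.

```python
# The 16 bytes written when the example line is saved as ANSI.
raw = " abc.bak abc.txt".encode("ascii")

# Hex view: identical no matter how the bytes are interpreted later.
hex_view = raw.hex(" ")

# Read as single-byte text (cp1252 assumed): the original line comes back.
as_ansi = raw.decode("cp1252")

# Read as UTF-16 little-endian: every pair of bytes becomes one code point.
as_utf16 = raw.decode("utf-16-le")
code_points = [f"U+{ord(ch):04X}" for ch in as_utf16]

print(hex_view)
print(repr(as_ansi))
print(code_points)
```

The 16 bytes turn into 8 code points, all of which fall in the CJK ideograph range. That matches the eight '?' characters or squares people report when a font without those glyphs is used.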

Anyone who can explain?

Update: When you paste it in IE or Trillian you get '????????', as some people found in the comments ;)

Update2: In my Notepad screenshot the font was Terminal. When I choose Verdana, the characters are indeed squares. Not that invisible anymore, but still wrong :)

Update3: You can find an explanation on why this is happening at The Old New Thing.

Comments

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 11:11 AM by Dhoore

omg, this is really strange!!
when i do this test, i see lots of squares like this: ????????
(dunno if that worked)
so it's not really invisible but still very strange

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 11:15 AM by NibbleR

In Notepad they are squares; when I copied it here they were strange Chinese signs. After posting they became ????????

hehe :)

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 1:32 PM by Bobfox

another strange one, if I create the file with UltraEdit, and then load it in Notepad, I get the same squares. If I load it in UltraEdit, it's OK.

???

Robert

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 1:53 PM by Bertg

Maybe an Easter egg :p

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 2:40 PM by Raj


I believe it is an encoding bug of some kind. If you go to a command prompt and type

> EDIT myscrewedupfile.txt

you will be able to see the text.

Here is another funny thing.
Close the command prompt. Close the text file if you still have it opened.

1) Double click on the myscrewedupfile.txt
It opens in notepad with the invisible text
Close notepad

2) Use a hex editor (I used this one: http://www.hhdsoftware.com/)
and open this same text file

3) Leaving the file open in the hex editor, open it in Notepad - voilà - you see the text!

weird stuff.

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:15 PM by bas westerbaan

Well.. it is just an encoding issue. Nothing more :)
I think it is stupid of Notepad that it even deals with encodings.
It is time for a new 'notepad' with only plain DOS text,
in C++ for speed, of course.
Interested?

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:38 PM by bas westerbaan

OK,
here is my little notepad replacement.. made in mfc.. Got one prob.. I dont know how to change the font to courier new.. anyone knows.. please mail me @ wnz[at]w-nz.com

simplepad
http://files.w-nz.com/show.php?id=15&skin=default

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:38 PM by bas westerbaan

BTW, it works with that little file

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:46 PM by David Cumps

Download link is broken ;)

Notepad can have encoding, but I'm wondering if it was a mistake to make the default saving encoding different from the default loading one.

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 6:56 PM by bas westerbaan

hu? Damn.. please give me ya email and i`ll send it manually.. strange.. i`ll take a look to the source

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 7:01 PM by bas westerbaan

http://files.w-nz.com/show.php?id=16&skin=default

this one should work

(server wouldnt let you download an .exe)

# re: Notepad bug? Encoding issue?

Friday, February 27, 2004 8:15 PM by David Cumps

That works :) not that I really need a notepad replacement, got it in my shellextension ;)

Just hope someone will reply with the "why" notepad does this :)

# re: Notepad bug? Encoding issue?

Saturday, February 28, 2004 5:28 PM by Bart De Smet

Weird problem indeed. When you open the file using UTF-8 encoding, nothing goes wrong, but when you open it using the default ANSI, it goes wrong. :-(

# re: Notepad bug? Encoding issue?

Sunday, February 29, 2004 6:18 PM by dotnetjunkie

this is why:
Because of the encoding issue, Notepad sees the character set as Chinese, which needs the font 'SimSun'. If that font is not present on your system, or you're using another font (as in Notepad), you get the empty space or squares...

Try copying and pasting the text in Wordpad, and you'll see that the font dropdown box shows the font SimSun.

# re: Notepad bug? Encoding issue?

Sunday, March 14, 2004 11:11 AM by David Cumps

Not that I know of, how to make them aware? or how to know if they are :)

# Some files come up strange in Notepad

Wednesday, March 24, 2004 4:01 AM by TrackBack

Notepad has to guess the encoding and can be tricked into guessing wrong.
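Why any guessing is needed can be sketched in a few lines of Python. This is only an illustration, not Notepad's actual detection logic: the helper name `sniff_bom` is hypothetical, and it checks only for a UTF-8 or UTF-16 byte order mark. When no mark is present, as with the example file, the bytes alone do not say which encoding was used, so a reader has to fall back on heuristics.

```python
import codecs

# Byte order marks checked in this sketch; UTF-32 is left out for brevity.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None  # No marker: the caller would have to guess.


print(sniff_bom(b" abc.bak abc.txt"))
print(sniff_bom(codecs.BOM_UTF16_LE + "abc".encode("utf-16-le")))
```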

# re: Notepad bug? Encoding issue?

Wednesday, March 24, 2004 7:36 PM by Norman Diamond

I don't have SimSun or any other Chinese font, so Notepad must be substituting a Japanese font. But no matter what encoding Notepad guessed, it isn't a Japanese encoding. The display consists of:
a full-width Kanji character
a half-width black rectangle
three full-width Kanji characters
a half-width black rectangle
a full-width Kanji character
a half-width black rectangle

The half-width black rectangles are the same as Notepad normally displays for a single byte value which is neither a valid single-byte character nor the first byte of a valid double-byte character.

# re: Notepad bug? Encoding issue?

Thursday, March 25, 2004 2:52 PM by Sikko2go

It's not really an 'issue'. It's just that Notepad is not capable of displaying all kinds of Unicode variants.
For more info, see http://blogs.msdn.com/oldnewthing/archive/2004/03/24/95235.aspx

# re: Notepad bug? Encoding issue?

Thursday, March 25, 2004 3:21 PM by David Cumps

Checked that, already replied there ;)
Tried adding it as a trackback, didn't work for some reason when I tried. Will update my post with the url :)

# re: Notepad bug? Encoding issue?

Thursday, March 25, 2004 8:11 PM by Norman Diamond

3/25/2004 8:52 PM Sikko2go:

> It's not really an 'issue'. It's just that Notepad is not capable of
> displaying all kinds of Unicode variants

It is an issue. Notepad isn't displaying a file's contents that are perfectly well encoded in Windows's main, system, default code page. In Japanese Windows systems this is code page 932 (Shift-JIS), set by default at the beginning of the install process, and rarely changed (I didn't change it). In US Windows systems I thought it would be either code page 437 or 850, but someone told me it's something different (without saying which one), but still, surely it gets set by default at the beginning of the install process and most US users don't change it. When Notepad can't even display those files correctly, Unicode variants are not to blame.
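The point that the file is valid in any of these code pages can be checked with a small Python sketch. The code pages 932, 437 and 850 are the ones named above; cp1252 is added here as an assumption about what a Western Windows system typically uses. The variable names are for illustration only.

```python
text = " abc.bak abc.txt"

# Code pages mentioned in the comment, plus cp1252 (an assumption).
code_pages = ["cp932", "cp437", "cp850", "cp1252"]

# Encode the same line under each code page.
encoded = {cp: text.encode(cp) for cp in code_pages}

# Plain ASCII characters map to the same bytes in all of them.
all_same = len(set(encoded.values())) == 1

# Decoding with any of them gives the original line back.
round_trips = all(encoded[cp].decode(cp) == text for cp in code_pages)

print(all_same, round_trips)
```

Since the bytes are the same under every one of these code pages, the wrong display cannot come from a code page mismatch.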

# re: Notepad bug? Encoding issue?

Sunday, June 13, 2004 3:25 AM by priyanka

i want to save the text file as encoding"UNICODE"...ANYONE HELP
ME...AS SOON AS POSSIBLE..
THANX
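For anyone with the same question: in Notepad, the Save As dialog has an Encoding box where "Unicode" can be picked. Outside Notepad, a file of that kind is UTF-16 little-endian text that starts with a byte order mark. Below is a minimal Python sketch of writing one; the function name `save_as_unicode` and the file name are made up for the example.

```python
import codecs
import os
import tempfile


def save_as_unicode(path: str, text: str) -> None:
    """Write text as UTF-16 little-endian, preceded by a byte order mark."""
    with open(path, "wb") as handle:
        handle.write(codecs.BOM_UTF16_LE)
        handle.write(text.encode("utf-16-le"))


# Write the example line to a temporary file and read the raw bytes back.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
save_as_unicode(path, " abc.bak abc.txt")
with open(path, "rb") as handle:
    saved = handle.read()

print(saved[:2].hex(), len(saved))
```

With the byte order mark in front, a reader does not need to guess the encoding.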

# re: Notepad bug? Encoding issue?

Sunday, July 25, 2004 3:00 PM by strangers

Another famous joke in China for notepad

If you input legend, which was once the name of the largest PC company in China, and then save and reopen it, you will find two rectangles instead of two Chinese characters. Then people said you should not purchase a Legend PC since MS hates Legend.

Notepad really has some problems in coding/decoding.

# re: Notepad bug? Encoding issue?

Monday, August 27, 2007 3:44 AM by Liang bo yi

No, it's not "legend", it's Unicom (that is, "联通" in Chinese), which is the name of the 2nd largest telecom company in China.

BTW: I'm Chinese.

mailto me: 2.81[at]163.com
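A commonly repeated explanation for the "联通" case, which nobody in this thread confirms, is that its bytes in the GBK code page happen to match the bit layout of two-byte UTF-8 sequences. The Python sketch below only checks that bit layout. Whether Notepad's guess really works this way is an assumption, and the helper name is hypothetical.

```python
# The two characters as they would be stored under the GBK code page.
gbk_bytes = "联通".encode("gbk")


def looks_like_utf8_pair(lead: int, trail: int) -> bool:
    """True if two bytes match the UTF-8 two-byte layout 110xxxxx 10xxxxxx."""
    return (lead & 0xE0) == 0xC0 and (trail & 0xC0) == 0x80


# Split the four bytes into two (lead, trail) pairs and test each pair.
pairs = [(gbk_bytes[i], gbk_bytes[i + 1]) for i in range(0, len(gbk_bytes), 2)]
matches = [looks_like_utf8_pair(lead, trail) for lead, trail in pairs]

print(gbk_bytes.hex(" "), matches)
```

Both pairs fit the pattern, so a detector that only looks at bit patterns could plausibly take the file for UTF-8.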

# re: Notepad bug? Encoding issue?

Tuesday, October 09, 2007 12:10 PM by ALARCON

NEED TO ENCODE NOTEPAD.  ALL I SEE ARE SQUARE BOXES AND SOME LETTERS...PLS HELP