Notepad bug? Encoding issue?

Someone showed me a weird text file today. It was a bat file containing 'copy MeYou.bak MeYou.txt'. When you ran it, it worked. But when you opened it in Notepad, there was nothing.

So we decided to look into this a bit, and here is what we came up with to 'create' invisible text:

Open Notepad and enter:
' abc.bak abc.txt'

(That is: space abc dot bak space abc dot txt, no line break, without the quotes)

It doesn't work with every string, so just follow us on this example and use that one.

Save your file. Notepad picks ANSI as the default encoding.

Open your file again. Notepad seems to open it in Unicode encoding by default.

Your text is now invisible.



Does anyone know why the saving default is different from the opening default?

And why does it happen to that particular piece of text? It doesn't happen to ' b.b b.b' or ' .bak .txt'.

It looks the same when viewing it through a hex editor. But apparently it has something to do with encoding.

Anyone who can explain?
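
To make the hex editor observation concrete, here is a minimal Python sketch. It assumes that the ANSI save uses an ASCII-compatible code page (cp1252 is picked here purely for illustration) and that the "Unicode" Notepad guesses on opening is UTF-16 little-endian. The variable names are made up for this example.

```python
# The exact 16 characters from the example: space abc.bak space abc.txt
text = " abc.bak abc.txt"

# Saving as ANSI: every character here is plain ASCII, so one byte each.
# cp1252 is an assumption; any ASCII-compatible code page gives the same bytes.
raw = text.encode("cp1252")
hex_dump = raw.hex(" ")

# Read the very same bytes back under two different assumptions.
as_ansi = raw.decode("cp1252")      # what we typed
as_utf16 = raw.decode("utf-16-le")  # assumed to be what Notepad guesses

print(hex_dump)
print(repr(as_ansi))
print([f"U+{ord(ch):04X}" for ch in as_utf16])
```

The bytes on disk never change, which is why the hex editor shows nothing odd. Read two bytes at a time, though, the 16 bytes turn into 8 code points that all fall in the CJK ideograph range, and a font without those glyphs shows blanks or squares.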

Update: When you paste it in IE or Trillian you get '????????', like some people tried in the comments ;)
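
The eight question marks are probably what you get when the misread characters are pushed back into an ANSI code page that has no slot for them. A small Python sketch of that round trip, assuming UTF-16 little-endian for the misreading and cp1252 for the target code page (both are assumptions, not something we verified in IE or Trillian):

```python
# The 16 bytes Notepad saved for ' abc.bak abc.txt'
raw = b" abc.bak abc.txt"

# Misread them two bytes at a time, as UTF-16 little-endian.
misread = raw.decode("utf-16-le")

# Convert to a Western code page; unmappable characters become '?'.
pasted = misread.encode("cp1252", errors="replace")

print(len(misread))
print(pasted)
```

Sixteen bytes make eight two-byte characters, and none of them exists in cp1252, so each one is replaced by a single '?'.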

Update2: In my Notepad screenshot the font was Terminal; when I choose Verdana, they are indeed squares. Not that invisible anymore, but still wrong :)

Update3: You can find an explanation on why this is happening at The Old New Thing.

12 Comments

  • In Notepad they are squares; when I copied it here they were strange Chinese characters. After posting it became ????????



    hehe :)

  • Another strange one: if I create the file with UltraEdit and then load it in Notepad, I get the same squares. If I load it in UltraEdit, it's OK.



    ???



    Robert


  • Maybe an easter egg :p

  • Well.. it is just an encoding issue. Nothing more :)

    I think it is stupid of Notepad that it even has encodings.

    It is time for a new 'notepad' with only plain dos text

    in C++ for speed, of course.

    Interested?

  • BTW, it works with that little file

  • hu? Damn.. please give me ya email and i`ll send it manually.. strange.. i`ll take a look to the source

  • Weird problem indeed. When you open the file using UTF-8 encoding, nothing goes wrong, but when you open it using the default ASCII, it goes wrong. :-(

  • this is why:

    Because of the encoding issue, Notepad sees the character set as Chinese, which needs the font 'SimSun', and if that's not present on your system, or you're using another font (as in Notepad), you get the empty space or squares...



    Try copying and pasting the text in Wordpad, and you'll see that the font dropdown box shows the font SimSun.

  • I don't have SimSun or any other Chinese font, so Notepad must be substituting a Japanese font. But no matter what encoding Notepad guessed, it isn't a Japanese encoding. The display consists of:

    a full-width Kanji character

    a half-width black rectangle

    three full-width Kanji characters

    a half-width black rectangle

    a full-width Kanji character

    a half-width black rectangle



    The half-width black rectangles are the same as Notepad normally displays for a single byte value which is neither a valid single-byte character nor the first byte of a valid double-byte character.

  • 3/25/2004 8:52 PM Sikko2go:



    > It's not really an 'issue'. It's just that Notepad is not capable of

    > displaying all kinds of Unicode variants



    It is an issue. Notepad isn't displaying a file's contents that are perfectly well encoded in Windows's main, system, default code page. In Japanese Windows systems this is code page 932 (Shift-JIS), set by default at the beginning of the install process, and rarely changed (I didn't change it). In US Windows systems I thought it would be either code page 437 or 850, but someone told me it's something different (without saying which one), but still, surely it gets set by default at the beginning of the install process and most US users don't change it. When Notepad can't even display those files correctly, Unicode variants are not to blame.

  • i want to save the text file as encoding"UNICODE"...ANYONE HELP

    ME...AS SOON AS POSSIBLE..

    THANX

  • NEED TO ENCODE NOTEPAD. ALL I SEE ARE SQUARE BOXES AND SOME LETTERS...PLS HELP
