HTML 4.0 special characters - test files

See http://www.w3.org/TR/REC-html40/sgml/entities.html

Web browsers and web authoring tools are terribly behind in supporting the HTML special characters in the above specification.

There are only two browser versions that do even a decent job.

We can use the numeric character references for the left and right single and double quotation marks, but not the named entities. We need these characters so we can author high-quality documents that can be published both as web pages and as printed or PDF output. ASCII's old "quote" characters “don’t cut it” for real typography.
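As a workaround for the quote characters, the named forms can be rewritten as numeric references before publishing. The following is only a minimal sketch in Python, not a tool used for this page: the function names and the `QUOTE_ENTITIES` list are made up for illustration, and it assumes the markup uses the four curly-quote entity names from the HTML 4.0 entity list (`lsquo`, `rsquo`, `ldquo`, `rdquo`).

```python
from html.entities import name2codepoint

# The four curly-quote entity names from the HTML 4.0 entity list.
# (Hypothetical constant name, chosen for this illustration.)
QUOTE_ENTITIES = ["lsquo", "rsquo", "ldquo", "rdquo"]


def to_numeric_reference(name: str) -> str:
    """Return the decimal numeric character reference for a named entity."""
    return f"&#{name2codepoint[name]};"


def replace_quote_entities(markup: str) -> str:
    """Swap each named curly-quote entity for its numeric form."""
    for name in QUOTE_ENTITIES:
        markup = markup.replace(f"&{name};", to_numeric_reference(name))
    return markup


if __name__ == "__main__":
    print(replace_quote_entities("&ldquo;don&rsquo;t cut it&rdquo;"))
    # prints: &#8220;don&#8217;t cut it&#8221;
```

The sketch deliberately touches only the four quote entities, since those are the ones discussed here; the same lookup table covers the rest of the HTML 4.0 names if a wider conversion is wanted.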

I know of three authoring tools that are truly evil with these special characters:

I hope vendors will fix these problems and release updates as soon as possible. So far there has been one update, GoLive 4.0 to 4.0.1, and it ignores this problem even though the vendor knows about it.

Perhaps the problem has been that there was not a good test suite. I made some proper test files:

HTML 4.0 characters as HTML web pages

HTML 4.0 characters as ASCII text files

Using Resorcerer, I made a Unicode text file of all of the relevant characters so I could try it with apps that claim to deal with Unicode. I have yet to see one that can read this file properly. The file starts with the byte-order mark, hex "FEFF", as per the article HTML Document Representation.
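For anyone who wants to build a similar file without a resource editor, here is a rough sketch in Python. It is not how the file above was made. It assumes big-endian UTF-16 (so the first two bytes are FE FF), and `SAMPLE`, `encode_with_bom`, and `decode_with_bom` are hypothetical names; the sample holds only the four curly quotes rather than the full character set.

```python
import codecs

# Hypothetical sample: just the four curly quotes, not the full test set.
SAMPLE = "\u2018\u2019\u201c\u201d"


def encode_with_bom(text: str) -> bytes:
    """Encode text as big-endian UTF-16, prefixed with the FE FF byte-order mark."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


def decode_with_bom(data: bytes) -> str:
    """Decode UTF-16 data; the 'utf-16' codec uses the BOM to pick the byte order."""
    return data.decode("utf-16")


if __name__ == "__main__":
    data = encode_with_bom(SAMPLE)
    print(data[:2].hex())                   # prints: feff
    print(decode_with_bom(data) == SAMPLE)  # prints: True
```

Writing the mark explicitly matters because Python's plain "utf-16" codec emits the BOM in the machine's native byte order when encoding, which on many machines would put FF FE first instead.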

Mac applications don't do well with Unicode and these HTML special characters.

BTW, does anyone know of a Mac app that can author ISO 8859-1 text?


http://Yost.com/Computers/htmlchars/ - this page
1999-02-22 Created
1999-10-10 Modified
1999-10-10 Modified cosmetically