]> Crazy SGML Stuff, etc.

Crazy SGML Stuff (and conditional comments in IE)

This file contains some truly degenerate SGML constructs. They are valid SGML, and as such even pass the W3C validating engine, but many, if not most, browsers have various problems with them. One other thing sometimes seen in HTML files (and demonstrated first here) is the "conditional comment", which works in Microsoft Internet Explorer only, starting with version 5, and which may be dropped at some future point. There are also two bits of SGML that routinely belong in any HTML document and are accepted by HTML user agents of every kind: the <!DOCTYPE> declaration and the <!-- comment -->. Some user agents use the DOCTYPE declaration to decide whether to display a document in a "quirks" mode, in a "standards" mode, or in something between the two. A comment simply means that whatever lies between the comment brackets is ignored rather than processed.


In HTML files, conditional comments provide a way to offer alternative content depending on the browser or user agent: whether it is Microsoft Internet Explorer ("IE") or not and, if so, which version of IE it is. This feature is not SGML. It functions as a kind of macro processing, in which selection logic is applied to portions of a file and only the portions applicable to the reader's browser version are passed on. Nevertheless, this first example also showcases one feature of SGML, namely the ability to use the <!DOCTYPE> declaration to extend the definition of the HTML version it names. Note that the first entry in this example uses the TARGET attribute, which does not belong in HTML 4.01 Strict (to which this file otherwise conforms) but is made available by a modified document type declaration, thus:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd" [ <!ATTLIST A TARGET CDATA #IMPLIED> ]>
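As a rough illustration of how a parser that is not a full SGML parser may read this declaration, the sketch below feeds it to Python's standard html.parser module and records what comes back. The Recorder class and the trailing <P>Hello</P> are made up for the illustration, and Python's parser is only a stand-in for a browser here, not a model of any particular one. The point to look for is whether the declaration ends at the first ">" (the one that closes the ATTLIST), leaving the rest behind as ordinary text.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Hypothetical helper: records declarations and character data."""

    def __init__(self):
        super().__init__()
        self.declarations = []
        self.text = []

    def handle_decl(self, decl):
        # Called with the text between "<!" and the ">" the parser stops at.
        self.declarations.append(decl)

    def handle_data(self, data):
        # Called for anything the parser treats as ordinary text.
        self.text.append(data)


DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
           '"http://www.w3.org/TR/html4/strict.dtd" '
           '[ <!ATTLIST A TARGET CDATA #IMPLIED> ]>')

recorder = Recorder()
recorder.feed(DOCTYPE + "<P>Hello</P>")
recorder.close()

leftover = "".join(recorder.text)
print(repr(recorder.declarations[0]))
print(repr(leftover))
```

If the parser stops at the first ">", the recorded declaration lacks the closing "]" and the leftover text begins with the stray "]>".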

Notice that on nearly all major browsers this declaration causes a spurious ]> to be displayed at the top of the page. For this first example, the raw text looks like this:

<UL>
<LI>This is a browser, <A HREF="/" TARGET="_blank">My root</A></LI>
<!--[if IE]><LI>This is Internet Explorer</LI><![endif]-->
<!--[if IE 5]><LI>This is IE 5</LI><![endif]-->
<!--[if IE 5.01]><LI>This is IE 5.01</LI><![endif]-->
<!--[if gt IE 5]><LI>This is IE, greater than 5</LI><![endif]-->
<!--[if IE 5.5000]><LI>This is IE 5.5</LI><![endif]-->
<!--[if gt IE 5.5000]><LI>This is IE, greater than 5.5</LI><![endif]-->
<!--[if gte IE 5]><LI>This is IE 5, or greater</LI><![endif]-->
<!--[if lte IE 5.5]><LI>This is IE 5.5, or lower</LI><![endif]-->
<!--[if lt IE 6]><LI>This is IE, lower than 6</LI><![endif]-->
<!--[if IE 6]><LI>This is IE 6</LI><![endif]-->
<!--[if IE 7]><LI>This is IE 7</LI><![endif]-->
<!--[if lt IE 7]><LI>This is IE, lower than 7</LI><![endif]-->
<!--[if lte IE 7]><LI>This is IE 7, or lower</LI><![endif]-->
</UL>
<!--[if !IE]><!-->
<P STYLE="color: green;">This is hidden from Internet Explorer.</P>
<!--<![endif]-->

This results in:

This is hidden from Internet Explorer.
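The selection logic can be pictured as a small preprocessor. The following is a minimal sketch, not a reproduction of what IE actually does: the function names are invented, only the <!--[if ...]> ... <![endif]--> form is handled (the "!IE" form is left untouched), and version matching is simplified so that a whole number such as "5" is compared against the major version only, while "5.5000" is compared as a decimal number.

```python
import re

# Matches only the form <!--[if EXPR]> ... <![endif]-->
CONDITIONAL = re.compile(r"<!--\[if ([^\]]+)\]>(.*?)<!\[endif\]-->", re.S)


def condition_holds(expression, ie_version):
    """Return True/False, or None if the expression is not modelled."""
    words = expression.split()
    if not words:
        return None
    operator = "eq"
    if words[0] in ("lt", "lte", "gt", "gte"):
        operator = words.pop(0)
    if not words or words[0] != "IE":
        return None  # for example "!IE": outside this sketch
    if len(words) == 1:
        return True  # plain [if IE] matches every version
    wanted = words[1]
    if "." in wanted:
        have, want = float(ie_version), float(wanted)
    else:
        # Simplifying assumption: a whole number matches the major version.
        have, want = int(ie_version), int(wanted)
    return {"eq": have == want, "lt": have < want, "lte": have <= want,
            "gt": have > want, "gte": have >= want}[operator]


def preprocess(markup, ie_version=None):
    """Keep or drop each conditional block; ie_version=None means 'not IE'."""
    def select(match):
        if ie_version is None:
            return ""  # to any other browser the block is just a comment
        verdict = condition_holds(match.group(1), ie_version)
        if verdict is None:
            return match.group(0)  # leave unmodelled forms untouched
        return match.group(2) if verdict else ""
    return CONDITIONAL.sub(select, markup)
```

Calling preprocess(markup) with no version drops every conditional block, which corresponds to the result shown above for a browser other than IE.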


SGML itself also provides ways to cause a section to be selectively ignored or displayed, depending on declarations made earlier in the file. For example, the following should display its contents. The code looks like this:

Start: <![INCLUDE[ This text should show. ]]> :enD

So now let's see what it looks like:

Start: :enD

However, the following should hide its contents. The code looks like this:

Start: <![IGNORE[ This text should not show. ]]> :enD

So now let's see what it looks like:

Start: :enD
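What a conforming SGML parser is meant to do with these two sections can be sketched in a few lines. This is only an approximation under stated assumptions: the function name is invented, nested marked sections are not handled, and the "declarations set previously in the file" are modelled as a plain dictionary of parameter entities (so that a section written as <![%draft;[ ... ]]> takes its keyword from the entry for "draft").

```python
import re

# <![KEYWORD[ body ]]>  or  <![%name;[ body ]]>   (no nesting in this sketch)
MARKED = re.compile(r"<!\[\s*(%?)(\w+);?\s*\[(.*?)\]\]>", re.S)


def resolve_marked_sections(text, parameter_entities=None):
    """Keep INCLUDE sections, drop IGNORE sections, leave others alone."""
    entities = parameter_entities or {}

    def repl(match):
        is_reference, word, body = match.groups()
        # A %name; reference is looked up; a bare word is the keyword itself.
        keyword = entities.get(word, "") if is_reference else word
        keyword = keyword.upper()
        if keyword == "INCLUDE":
            return body
        if keyword == "IGNORE":
            return ""
        return match.group(0)  # other keywords are outside this sketch

    return MARKED.sub(repl, text)


print(resolve_marked_sections(
    "Start: <![INCLUDE[ This text should show. ]]> :enD"))
print(resolve_marked_sections(
    "Start: <![IGNORE[ This text should not show. ]]> :enD"))
print(resolve_marked_sections(
    "Start: <![%draft;[ Draft note. ]]> :enD", {"draft": "IGNORE"}))
```

Switching the hypothetical "draft" entry between "INCLUDE" and "IGNORE" turns every section that refers to it on or off at once, which is the usual reason for writing the keyword as an entity reference.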

And even though <XMP> and <LISTING> have been done away with, their function is nevertheless approximated by the SGML CDATA marked section. The code looks like this:

Start: <![CDATA[
This one should close out most quietly,
since it does not use the closing HTML bracket character,
but it does do some other things. - - <
Is &amp: one character or five?
]]>
<![CDATA[
This one will show some closing marks due to the closing
HTML bracket <character> - what a mess!
]]> :enD

So now let's see what it looks like:

Start: - what a mess! ]]> :enD
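A result like this is consistent with a parser that treats anything beginning with "<!" (other than a proper "<!--" comment) as a comment-like construct that ends at the very next ">". The sketch below imitates just that one rule with a regular expression. It is an approximation for illustration only: the function name is invented, a DOCTYPE would be swallowed by the same rule, and real browsers do considerably more.

```python
import re

# "<!" not followed by "--", then everything up to the first ">"
BOGUS = re.compile(r"<!(?!--)[^>]*>")


def strip_bogus_comments(markup):
    """Remove each '<!...' run up to and including the next '>'."""
    return BOGUS.sub("", markup)


SOURCE = """Start: <![CDATA[
This one should close out most quietly,
since it does not use the closing HTML bracket character,
but it does do some other things. - - <
Is &amp: one character or five?
]]>
<![CDATA[
This one will show some closing marks due to the closing
HTML bracket <character> - what a mess!
]]> :enD"""

# Collapse whitespace the way a browser would when rendering the text.
rendered = " ".join(strip_bogus_comments(SOURCE).split())
print(rendered)
```

Under this rule the first section disappears whole, because its first ">" is the one in "]]>", while the second is cut short at the ">" of "<character>".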

In addition to INCLUDE, IGNORE, and CDATA, there are the RCDATA and TEMP keywords. Here is what happens with some trivial examples. The code looks like this:

Start: <![TEMP[ Should be normal &amp;.]]> :enD
Start: <![RCDATA[ Should be normal &amp;.]]> :enD
Start: <![CDATA[ Should be normal &amp;.]]> :enD

So now let's see what this looks like:

Start: :enD

Start: :enD

Start: :enD
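For comparison, here is a sketch of how the keywords are meant to differ in SGML: CDATA content is taken literally, RCDATA content still has its entity references recognized but not its tags, and TEMP merely flags a section as temporary and is otherwise processed like INCLUDE. The function below rewrites each section as ordinary HTML text that would display accordingly. The function name is invented and nested sections are not handled.

```python
import re

MARKED = re.compile(r"<!\[\s*(\w+)\s*\[(.*?)\]\]>", re.S)


def marked_sections_to_html(text):
    """Rewrite marked sections as plain HTML with the same intended display."""
    def repl(match):
        keyword, body = match.group(1).upper(), match.group(2)
        if keyword == "CDATA":
            # Everything is literal: escape "&" as well as the brackets.
            return (body.replace("&", "&amp;")
                        .replace("<", "&lt;")
                        .replace(">", "&gt;"))
        if keyword == "RCDATA":
            # Entity references stay live; only tags are neutralized.
            return body.replace("<", "&lt;").replace(">", "&gt;")
        if keyword in ("TEMP", "INCLUDE"):
            return body
        if keyword == "IGNORE":
            return ""
        return match.group(0)  # unknown keyword: leave untouched

    return MARKED.sub(repl, text)


for keyword in ("TEMP", "RCDATA", "CDATA"):
    line = "Start: <![%s[ Should be normal &amp;.]]> :enD" % keyword
    print(marked_sections_to_html(line))
```

Only the CDATA line should come out with "&amp;amp;", that is, with the five characters "&amp;" shown literally rather than as a single ampersand.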


For the next example, we have a processing instruction which, in at least the DocBook application, would have produced a different color. The code looks like this:

<P>Is there anything here: X<?dbhtml bgcolor="#FFFF00" ?>X?</P>

and the result, if it worked, could look something like this:

Is there anything here: XX?

So now let's see what it looks like:

Is there anything here: XX?
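A processing instruction is addressed to a particular application, and anything else is expected to pass over it, which is why nothing appears between the two X characters. The sketch below uses Python's standard html.parser module to show the instruction being reported separately from the text. The PIRecorder class is made up for the illustration. Note that in HTML (SGML) syntax the instruction ends at ">", so the "?" before it is reported as part of the instruction's data.

```python
from html.parser import HTMLParser


class PIRecorder(HTMLParser):
    """Hypothetical helper: separates processing instructions from text."""

    def __init__(self):
        super().__init__()
        self.instructions = []
        self.text = []

    def handle_pi(self, data):
        # Receives whatever lies between "<?" and the closing ">".
        self.instructions.append(data)

    def handle_data(self, data):
        self.text.append(data)


recorder = PIRecorder()
recorder.feed('<P>Is there anything here: X<?dbhtml bgcolor="#FFFF00" ?>X?</P>')
recorder.close()

visible = "".join(recorder.text)
print(recorder.instructions)
print(visible)
```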


For the next example, SGML lets you omit the closing bracket of a tag when another tag follows it immediately. The code looks like this:

<P>nested? <I<B>italic and bold?</B</I> normal?</P>

and the result, if it worked, would look like this:

nested? italic and bold? normal?

So now let's see what it looks like:

nested? italic and bold? normal?
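To make the shorthand concrete, the following sketch rewrites such unclosed tags into their full form by supplying the missing ">". It assumes the simplest case only, a bare tag name with no attributes followed directly by "<", and the function name is invented for the illustration.

```python
import re

# A start or end tag name that runs straight into the next "<"
UNCLOSED = re.compile(r"(</?[A-Za-z][A-Za-z0-9]*)(?=<)")


def close_unclosed_tags(markup):
    """Insert the '>' that the unclosed-tag shorthand leaves out."""
    return UNCLOSED.sub(r"\1>", markup)


print(close_unclosed_tags(
    "<P>nested? <I<B>italic and bold?</B</I> normal?</P>"))
```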


For the next example, SGML lets you use plain slashes ("/") to delimit the start and end of the span covered by the tag (the "null end tag" shorthand). The code looks like this:

<P>One more: <I/hows this?/ and is <I>this</I> normal?</P>

and the result, if it worked, would look like this:

One more: hows this? and is this normal?

So now let's see what it looks like:

One more: this normal?
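The intended reading of the slash form can be shown by expanding it into ordinary start and end tags. This is a minimal sketch with an invented function name. It assumes a tag name without attributes and content that contains no further markup or slashes.

```python
import re

# <NAME/content/  ->  <NAME>content</NAME>
NET = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/([^/<>]*)/")


def expand_null_end_tags(markup):
    """Rewrite the slash-delimited shorthand as a full element."""
    return NET.sub(r"<\1>\2</\1>", markup)


print(expand_null_end_tags(
    "<P>One more: <I/hows this?/ and is <I>this</I> normal?</P>"))
```

Ordinary end tags such as </I> are left alone, because the pattern requires a tag name directly after the "<".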


When an attribute takes its value from an enumerated list, and no other attribute of the same element can take any of the values in that list, it is possible to omit the attribute name and give just the value by itself. The code looks like this:

<P>forwards <BDO RTL>backwards</BDO> forwards</P>

and the result, if it worked, would look like this:

forwards backwards forwards

So now let's see what it looks like:

forwards backwards forwards
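Resolving a bare value back to its attribute needs a table of which values belong to which attribute, something an SGML parser derives from the DTD. In the sketch below that table is written out by hand for BDO only and should be read as a hypothetical excerpt. The function name is likewise invented, and quoted attribute values containing spaces are not handled.

```python
import re

# Hypothetical excerpt: for each element, which attribute owns which value.
ENUMERATED_VALUES = {
    "BDO": {"LTR": "DIR", "RTL": "DIR"},
}

START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)((?:\s+[^\s<>]+)*)\s*>")


def expand_minimized_attributes(markup):
    """Turn a bare enumerated value into a full NAME="value" attribute."""
    def repl(match):
        element, tokens = match.group(1), match.group(2).split()
        table = ENUMERATED_VALUES.get(element.upper(), {})
        parts = [element]
        for token in tokens:
            owner = None if "=" in token else table.get(token.upper())
            parts.append('%s="%s"' % (owner, token) if owner else token)
        return "<" + " ".join(parts) + ">"

    return START_TAG.sub(repl, markup)


print(expand_minimized_attributes(
    "<P>forwards <BDO RTL>backwards</BDO> forwards</P>"))
```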


In SGML, an empty comment <!> is permissible, though it should be invisible and have no discernible effect.

The code looks like this:

See here: <!>junk.

So let's see what it looks like:

See here: junk.


In SGML, an empty tag <> is permissible, and it should simply reopen the most recently closed element. Note: the W3C SGML-based parsing/validating engine does not recognize <> as reopening the previously closed element, so one cannot validate a file in which an occurrence of an element uses both the empty opening and the empty closing tag, even though that would be valid SGML. For example, one cannot have:

<P>And <I>now</I> for <>something</> else.</P>

The parser is rigged not to recognize "<>" as anything more than ordinary text. When the closing tag is then encountered (even if it is only a "</>"), the next outer element (a <P> in the above example) is exited prematurely, so the text after it is not accepted, and the closing tag for that element, if present, causes a further problem ("Closing tag for element that is not open").

But an individual open tag is permitted. The code looks like this:

<P>See <I>here</I> and <>here and beyond italicized.</P>

and the result, if it worked, would look like this:

See here and here and beyond italicized.

So let's see what it looks like:

See here and <>here and beyond italicized.


Finally, SGML also allows the innermost open element (whatever it happens to be) to be closed with a simple </> closing "tag"! This would typically be used with the empty tag, and some user agents can do strange things when the two are mixed. The code looks like this:

<P>And now for something else: <I>italic</> normal?</P>

and the result, if it worked, would look like this:

And now for something else: italic normal?

So now let's see what it looks like:

And now for something else: italic normal?
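Both empty forms can be resolved by keeping track of which elements are open and which one was closed last. The following is a sketch under simplifying assumptions: the function name is invented, every start tag is taken to open an element that stays open until an end tag closes it (so elements with omitted end tags are not modelled), and an empty start tag is read as reopening the most recently closed element, as described above.

```python
import re

# A start or end tag, possibly with an empty name: <P>, </I>, <>, </>
TAG = re.compile(r"<(/?)(\w*)(?:\s[^<>]*)?>")


def expand_empty_tags(markup):
    """Replace <> and </> with the full tags they stand for."""
    open_elements = []   # names of currently open elements, innermost last
    last_closed = None   # name of the most recently closed element
    out, pos = [], 0
    for match in TAG.finditer(markup):
        out.append(markup[pos:match.start()])
        pos = match.end()
        is_end, name = bool(match.group(1)), match.group(2)
        if is_end:
            if not name and open_elements:
                name = open_elements[-1]      # </> closes the innermost one
            if name in open_elements:
                while open_elements.pop() != name:
                    pass                      # also drop anything nested inside
                last_closed = name
            out.append("</%s>" % name if name else match.group(0))
        else:
            if not name and last_closed:
                name = last_closed            # <> reopens the last closed one
                out.append("<%s>" % name)
            else:
                out.append(match.group(0))
            if name:
                open_elements.append(name)
    out.append(markup[pos:])
    return "".join(out)


print(expand_empty_tags("<P>And now for something else: <I>italic</> normal?</P>"))
print(expand_empty_tags("<P>And <I>now</I> for <>something</> else.</P>"))
```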