The nuts and BOLTS of technical markup
August 12, 2002
In recent months, MacEdition has been increasing the frequency of fresh content. There’s now a team of people responsible for markup and production on different days, so this has been quite manageable. As those of you who are responsible for Web sites with multiple authors and features will know, more content generally doesn’t just mean more of the same. It means a greater variety of content types and features. One of the new features we’ve added recently is the BOLTS section (incidentally, it was me who came up with the contrived acronym).
BOLTS articles are aimed at Mac professionals interested in OS X’s Unix underpinnings and as such, they tend to contain a lot of code of various types. We needed to decide how to mark up the different kinds of code, so that distinctions between them would be meaningful. Two principles informed my decisions on how to classify and style these code snippets. First, I stuck to the principle of marking up content semantically – as I put it in an earlier column, I labelled things for what they were. Second, the labelling scheme I used for classifying the different types of code was similar to those used in books on programming languages. This fitted in with what the author wanted, and is an effective scheme that people are familiar with.
The most common way of specifying computer input or output in HTML documents is by using a CODE element. This is the way most people would do it, perhaps with a bunch of class attributes to distinguish the different types of code and formatting. But there are semantic elements in HTML itself that allow you to make distinctions between types of code even where CSS stylesheets aren’t appropriate or applicable. They’re not often used, and sometimes I think that people have forgotten they exist – if they ever knew about them in the first place. You can distinguish input from output by using CODE for the former, and SAMP for the latter. If you look at the HTML specifications, that’s exactly what the standards mavens suggest as the purposes of these elements. There are also the KBD element to denote user input, and the VAR element to denote variables, as in “sqrt(n), where n is a real number”. These are exactly the sorts of distinctions you see in computer books.
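As a rough illustration of those four elements, here is a minimal sketch in HTML 4.01. The sentences and the `ls` example are invented for this illustration and are not taken from any BOLTS article.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Code-oriented elements (illustrative sketch)</title>
</head>
<body>
<!-- CODE marks a fragment of computer code; VAR marks the variable inside it -->
<p>Call <code>sqrt(<var>n</var>)</code>, where <var>n</var> is a real number.</p>

<!-- KBD marks what the reader types; SAMP marks sample output from a program -->
<p>Type <kbd>ls</kbd> and the shell replies with something like
<samp>Desktop Documents Library</samp>.</p>
</body>
</html>
```

Even with no stylesheet at all, most browsers render these elements in a monospaced or italic face by default, and the distinctions remain available to anything that reads the markup.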
For the purposes of the first few BOLTS articles, I used the CODE element for various types of user input and command names, SAMP for output, VAR for names of APIs and paths, and KBD for things you’re meant to type in at the command line. I placed these elements inside either paragraph (P) or preformatted (PRE) elements, depending on whether indenting of the code was important for readers’ understanding.
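The choice between P and PRE might look something like the sketch below. The script name `hello.sh` and the shell session are made up for illustration; they are assumptions, not extracts from a BOLTS article.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>P versus PRE for code (illustrative sketch)</title>
</head>
<body>
<!-- Inline: indentation is irrelevant, so the element sits inside P -->
<p>Run <kbd>chmod +x hello.sh</kbd> to make the script executable.</p>

<!-- Block: line breaks and indentation matter, so CODE sits inside PRE -->
<pre><code>#!/bin/sh
if [ -n "$1" ]; then
    echo "Hello, $1"
fi</code></pre>

<!-- A short session: what the reader types (KBD), then what comes back (SAMP) -->
<pre><kbd>./hello.sh world</kbd>
<samp>Hello, world</samp></pre>
</body>
</html>
```

PRE only preserves the whitespace; the element nested inside it still says what kind of text it is.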
To further distinguish command names and comments from other types of code, and distinguish APIs from path names, I also defined some simple classes. The following CSS code is an extract from our main stylesheet.
.api {
  font-style: normal;
  font-variant: normal;
  font-weight: normal;
  font-family: Monaco, 'Andale Mono', 'Lucida Console', monospace;
}
.path {
  font-style: italic;
  font-variant: normal;
  font-weight: lighter;
  font-family: Monaco, 'Andale Mono', 'Lucida Console', monospace;
}
.cmd {
  font-style: normal;
  font-variant: normal;
  font-weight: bold;
  font-family: Courier, 'Courier New', monospace;
}
.comment {
  color: #f30;
}
As you can see from these styles, combinations of color, weight, size and italicization can be used to distinguish these different kinds of code visually, as well as semantically. These styles also show that there are two basic options for font styling that can be used to distinguish between different types of text that ought to be monospaced. In addition to the usual Courier-like font families, all versions of Mac OS, as well as Windows, include a monospaced font with a taller x-height as part of their default installation. For Mac OS, this is Monaco, while for Windows, Andale Mono is the closest equivalent. (Some Windows versions include Lucida Console, a monospaced font in the Lucida family, but frankly I don’t think it looks different enough from the regular text font to be easily distinguishable in small, inline amounts. I’ve no idea what the appropriate font name would be on Linux or other platforms, but if you let me know in the Feedback Farm, I’ll add it to the stylesheet.) This font distinction allows readers to pick up semantic distinctions between different kinds of text visually. Because of their larger x-heights, Monaco and Andale also happen to match MacEdition’s default font (Lucida Sans) better than Courier does.
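To show how the classes and the semantic elements might combine, here is a hedged sketch. The class rules are abbreviated copies of the stylesheet extract; which element carries which class, and the `defaults` example itself, are assumptions made for this illustration rather than MacEdition's actual markup.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Applying the code classes (illustrative sketch)</title>
<style type="text/css">
/* Abbreviated copies of the rules in the stylesheet extract */
.api     { font-style: normal; font-weight: normal;
           font-family: Monaco, 'Andale Mono', 'Lucida Console', monospace; }
.path    { font-style: italic; font-weight: lighter;
           font-family: Monaco, 'Andale Mono', 'Lucida Console', monospace; }
.cmd     { font-style: normal; font-weight: bold;
           font-family: Courier, 'Courier New', monospace; }
.comment { color: #f30; }
</style>
</head>
<body>
<!-- Assumed pairing: CODE carries .cmd and .comment, VAR carries .api and .path -->
<p>The <code class="cmd">defaults</code> command reads preference files stored
under <var class="path">~/Library/Preferences</var>; applications reach the same
data through <var class="api">NSUserDefaults</var>.</p>

<pre><code class="comment"># list the preference domains</code>
<kbd>defaults domains</kbd></pre>
</body>
</html>
```

The element says what the text is, and the class only refines how it looks, so the markup still makes sense if the stylesheet is never loaded.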
There’s more to HTML than DIV and SPAN
These code-oriented elements aren’t the only ones that are ignored too often by Web authors. As I mentioned more than eighteen months ago, Web authors just aren’t using useful structural tags like CITE, ABBR and ACRONYM, but they use DIV and SPAN like they’re going out of fashion. I can understand a reluctance to use ABBR and ACRONYM because of the problems they cause in Netscape 4, but I wonder why the others are so unpopular.
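For comparison, here is a small sketch of the same sentence marked up first with the structural elements and then with generic SPANs. The sentence and the class names in the second paragraph are hypothetical, chosen only to show the contrast.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>CITE, ABBR and ACRONYM (illustrative sketch)</title>
</head>
<body>
<!-- Structural elements: each one says what its content is -->
<p>The <cite>HTML 4.01 Specification</cite> from the
<abbr title="World Wide Web Consortium">W3C</abbr> says nothing about whether
your images should be
<acronym title="Graphics Interchange Format">GIF</acronym> files.</p>

<!-- Generic equivalent: the classes carry meaning only for this site's stylesheet -->
<p>The <span class="title">HTML 4.01 Specification</span> from the
<span class="short">W3C</span> says nothing about whether your images should be
<span class="short">GIF</span> files.</p>
</body>
</html>
```

The first paragraph also carries the expansions in its title attributes, which the SPAN version has no standard place for.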
Similarly, when was the last time you used the Q element for short, inline quotations? More and more browsers are doing the right thing with this element, by putting quote marks around its content. (If you want to check whether your browser does this, you can look at the test at the bottom of this simple old test page I put together.) Mark Pilgrim’s blog includes a useful discussion on styling these quote marks using CSS. Unfortunately, this may be moot if the latest ideas emanating from the W3C are adopted as the future XHTML 2.0 recommendation: the Q element is proposed to be replaced by a QUOTE element that doesn’t have automatic quote marks. It seems that if IE/Windows doesn’t implement part of a standard, the W3C will eventually give in and adopt Microsoft’s failures as “the standard” (just like they did with ID attributes starting with a number).
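As a rough sketch of the Q element with styled quote marks, the following uses the CSS2 quotes property and generated content. This is one plausible approach under those assumptions, not a reproduction of the discussion mentioned above, and the example sentences are invented. Browsers that don’t support these CSS2 features will simply ignore the rules.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Q with styled quote marks (illustrative sketch)</title>
<style type="text/css">
/* Outer quotations get curly double marks, nested ones curly single marks */
q        { quotes: "\201C" "\201D" "\2018" "\2019"; }
q:before { content: open-quote; }
q:after  { content: close-quote; }
</style>
</head>
<body>
<!-- No literal quote marks in the source: the browser supplies them -->
<p>She said <q>the manual calls this <q>a minor inconvenience</q>, which is
generous</q>.</p>
</body>
</html>
```

Because the marks come from the browser or the stylesheet, the author should not type quotation marks around the content of Q as well.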
I’ll have more to say about some of the proposed changes to XHTML and CSS standards in future columns. Until then, I’ll simply say that the way to stop the Web standards infrastructure from shifting under your feet is to use those standards. If a standard is widely adopted, it will be retained, but if it isn’t implemented by browser developers or adopted by Web authors, the W3C might try something different. So mark up your content semantically, and reap the maintenance benefits. Unless, of course, you really want the Web to remain a low-productivity mish-mash of tag soup, proprietary bumf, and tag soup hacks.