MacEdition : CodeBitch : August 12, 2002

The nuts and BOLTS of technical markup

August 12, 2002

In recent months, MacEdition has been increasing the frequency of fresh content. There’s now a team of people responsible for markup and production on different days, so this has been quite manageable. As all of you who are responsible for Web sites with multiple authors and features will know, more content generally doesn’t just mean more of the same. It means a greater variety of content types and features. One of the new features we’ve added recently is the BOLTS section (incidentally, it was me who came up with the contrived acronym).

BOLTS articles are aimed at Mac professionals interested in OS X’s Unix underpinnings and as such, they tend to contain a lot of code of various types. We needed to decide how to mark up the different kinds of code, so that distinctions between them would be meaningful. Two principles informed my decisions on how to classify and style these code snippets. First, I stuck to the principle of marking up content semantically – as I put it in an earlier column, I labelled things for what they were. Second, the labelling scheme I used for classifying the different types of code was similar to those used in books on programming languages. This fitted in with what the author wanted, and is an effective scheme that people are familiar with.

The most common way of specifying computer input or output in HTML documents is by using a CODE element. This is how most people would do it, perhaps adding a bunch of class attributes to distinguish the different types of code and formatting. But there are semantic elements in HTML itself that allow you to make distinctions between types of code even where CSS stylesheets aren’t appropriate or applicable. They’re not often used, and sometimes I think that people have forgotten they exist, if they ever knew about them in the first place. You can distinguish input from output by using CODE for the former and SAMP for the latter. If you look at the HTML specifications, that’s exactly what the standards mavens suggest as the purposes of these elements. There is also the KBD element to denote text typed by the user, and the VAR element to denote variables, as in “sqrt(n), where n is a real number”. These are exactly the sorts of distinctions you see in computer books.
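To make the four elements concrete, here is a minimal sketch of how they might sit inside ordinary paragraphs. The sentences and the `echo` command are made up for illustration and are not taken from any BOLTS article.

```html
<!-- VAR marks the variable, CODE marks the fragment of code around it -->
<p><code>sqrt(<var>n</var>)</code>, where <var>n</var> is a real number.</p>

<!-- KBD marks what the reader types, SAMP marks what the computer prints back -->
<p>Type <kbd>echo hello</kbd> and press Return.
   The shell answers with <samp>hello</samp>.</p>
```

Even with no stylesheet at all, most browsers render these elements distinctly (typically monospaced for CODE, SAMP and KBD, and italic for VAR), and the markup still says what each piece of text is.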

For the purposes of the first few BOLTS articles, I used the CODE element for various types of user input and command names, SAMP for output, VAR for names of APIs and paths, and KBD for things you’re meant to type in at the command line. I placed these elements inside either paragraph (P) or preformatted (PRE) elements, depending on whether indenting of the code was important for readers’ understanding.
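The scheme described above could look something like the following. This is only a sketch under assumptions: the `%` prompt, the `uname` command and the path are illustrative choices, not markup copied from the BOLTS articles.

```html
<!-- Inline use inside a paragraph: CODE for a command name, VAR for a path -->
<p>The <code>uname</code> command lives in <var>/usr/bin</var>.</p>

<!-- PRE keeps line breaks and indenting intact for a terminal session.
     KBD is what you type at the command line, SAMP is the output. -->
<pre><samp>% </samp><kbd>uname -s</kbd>
<samp>Darwin</samp></pre>
```

PRE is the better container whenever line breaks or indenting carry meaning. For a command name mentioned in passing, a plain P is enough.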

To further distinguish command names and comments from other types of code, and distinguish APIs from path names, I also defined some simple classes. The following CSS code is an extract from our main stylesheet.

.api { font-style: normal; 
       font-variant: normal; 
       font-weight: normal;
       font-family: Monaco, 'Andale Mono', 
        'Lucida Console', monospace;
}

.path { font-style: italic; 
        font-variant: normal; 
        font-weight: lighter;
        font-family: Monaco, 'Andale Mono', 
         'Lucida Console', monospace;
}

.cmd { font-style: normal; 
       font-variant: normal; 
       font-weight: bold;
       font-family: Courier, 'Courier New', monospace;
}

.comment { color: #f30; }

As you can see from these styles, combinations of color, weight and italicization can be used to distinguish these different kinds of code visually, as well as semantically. These styles also show that there are two basic options for font styling that can be used to distinguish between different types of text that ought to be monospaced. In addition to the usual Courier-like font families, all versions of Mac OS, as well as Windows, include a monospaced font with a taller x-height as part of their default installation. For Mac OS, this is Monaco, while for Windows, Andale Mono is the closest equivalent. (Some Windows versions include Lucida Console, a monospaced font in the Lucida family, but frankly I don’t think it looks different enough from the regular text font to be easily distinguishable in small, inline amounts. I’ve no idea what the appropriate font name would be on Linux or other platforms, but if you let me know in the Feedback Farm, I’ll add it to the stylesheet.) This font distinction allows readers to pick up semantic distinctions between different kinds of text visually. Because of their larger x-heights, Monaco and Andale Mono also happen to match MacEdition’s default font (Lucida Sans) better than Courier does.
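For illustration, the four classes from the stylesheet extract might be attached to the elements like this. Which class goes on which element is my reading of the description above (`cmd` and `comment` on CODE, `api` and `path` on VAR), and the sample names are assumptions chosen for the example.

```html
<!-- .cmd: bold Courier for a command name -->
<p>Use <code class="cmd">ls</code> to list the directory.</p>

<!-- .api and .path both sit on VAR: upright Monaco for an API name,
     lighter italic Monaco for a path -->
<p>Call <var class="api">NSLog</var> to write a message, and check
   <var class="path">/etc/hostconfig</var> for the startup settings.</p>

<!-- .comment: colored text for a comment inside a code listing -->
<pre><code class="cmd">ls -a</code>  <code class="comment"># include dotfiles</code></pre>
```

Because the classes only refine elements that already carry meaning, a browser that ignores the stylesheet still shows code as code and variables as variables.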

There’s more to HTML than DIV and SPAN

These code-oriented elements aren’t the only ones that are ignored too often by Web authors. As I mentioned more than eighteen months ago, Web authors just aren’t using useful structural tags like CITE, ABBR and ACRONYM, but they use DIV and SPAN like they’re going out of fashion. I can understand reluctance to use ABBR and ACRONYM because of the problems they cause in Netscape 4, but I wonder why the others are so unpopular. Similarly, when was the last time you used the Q element for short, inline quotations? More and more browsers are doing the right thing with this element, by putting quote marks around its content. (If you want to check whether your browser does this, you can check the test at the bottom of this simple old test page I put together.) Mark Pilgrim’s blog includes a useful discussion on styling these quote marks using CSS.
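As a small sketch of these neglected elements in use (the sentences are invented for the example, not taken from the test page mentioned above):

```html
<!-- CITE marks the source; Q marks a short inline quotation.
     No quote marks are typed inside Q: a conforming browser adds them. -->
<p>As <cite>The Elements of Style</cite> puts it,
   <q>Omit needless words.</q></p>

<!-- ABBR and ACRONYM carry the expansion in the title attribute -->
<p>The <abbr title="World Wide Web">WWW</abbr> owes as little to
   <acronym title="radio detection and ranging">radar</acronym>
   as you would expect.</p>
```

Typing literal quote marks around a Q element would produce doubled marks in browsers that support it, which is one reason the element's uneven support matters.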

Unfortunately, this may be moot if the latest ideas emanating from the W3C are adopted as the future XHTML 2.0 recommendation: the Q element is proposed to be replaced by a QUOTE element that doesn’t have automatic quote marks. It seems that if IE/Windows doesn’t implement part of a standard, the W3C will eventually give in and adopt Microsoft’s failures as “the standard” (just like they did with ID attributes starting with a number).

I’ll have more to say about some of the proposed changes to XHTML and CSS standards in future columns. Until then, I’ll simply say that the way to stop the Web standards infrastructure from shifting under your feet is to use those standards. If a standard is widely adopted, it will be retained, but if it isn’t implemented by browser developers or adopted by Web authors, the W3C might try something different. So mark up your content semantically, and reap the maintenance benefits. Unless, of course, you really want the Web to remain a low-productivity mish-mash of tag soup, proprietary bumf, and tag soup hacks.

CodeBitch (codebitch@macedition.com) is the grumpy cow who does the HTML production for MacEdition.
