It’s the World Wide Web
August 11, 2003
Whenever I release a new CSS resource, like the Guide to CSS2 Support in Mac-only browsers, its abridged version, the Guide to Support for CSS3 Selectors, or even the highly incomplete Guide to CSS Support for PDAs, lots of people mention it in their blogs. I see them in the referer logs, and I thank them for the attention. I hope that the guides I construct are useful to all these people, but I don’t always know.
You see, many of these sites aren’t in English. They’re in French, German, Spanish, Dutch, Czech, Russian, Japanese, Chinese, Icelandic and many other languages besides. And while I can count to ten or so in the first three and say please, excuse me and “where’s the toilet?” in Czech, I’m completely lost in the rest. The only way I could work out whether these authors think I’m doing the right thing would be to find native speakers of these languages and ask them (Babelfish doesn’t do Icelandic).
There are both positive and negative aspects to this state of affairs. There’s a very positive technical aspect. In recent years, a large number of companies and other agencies, including Apple, Adobe and Microsoft, have formed the Unicode Consortium to define a standard character encoding. The International Organisation for Standardisation (ISO – the short name comes from the Greek isos, “equal”, not from any one language) had already defined a large number of appropriate encodings, including Latin-1, the standard encoding used for Western European languages like English. Unicode brings these encodings together, making it possible to view text in many languages and scripts on the same computer and even the same Web page – something that simply wasn’t easy, or even possible, a few years ago. Unicode is supported in Mac OS since version 9.2, in Windows NT, 2000 and XP, and in many other systems besides.
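To make that concrete (my own illustration, not part of the original column): a single Unicode string can mix scripts that no one legacy encoding, Latin-1 included, could hold together. This short Python sketch shows the difference.

```python
# A single Unicode string mixing Western European, Cyrillic and
# Japanese text - no single legacy encoding covers all three.
mixed = "café Привет 日本語"

# UTF-8, one of the Unicode encodings, represents the whole string.
utf8_bytes = mixed.encode("utf-8")

# Latin-1 covers only Western European characters, so encoding
# the full string fails.
try:
    mixed.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(len(mixed), len(utf8_bytes), latin1_ok)  # 15 28 False
```

The 15 characters take 28 bytes in UTF-8 because the Cyrillic and Japanese characters need two and three bytes each – but they all live happily in one string, which is the whole point.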
Unicode is a reminder that not all standards relevant to the Web come from the W3C. It’s a standard with support done right: like HTTP or TCP/IP, it either works or it doesn’t – there’s no such thing as “buggy” support. And while not all fonts contain all Unicode characters – I mean, how often do you need to read or write Georgian and Coptic in Arial, Verdana, Times and Trebuchet? – there are at least some fonts that cover any given part of the Unicode space. The fonts that come with Mac OS X are particularly good for the Asian-script parts of that space. And most software now makes intelligent choices when it encounters a character that’s not in the currently selected font, switching instead to a compatible font that contains that character (or “glyph”, if you want to be technical).
The relative success of the implementation of Unicode makes the frequently shoddy implementations of important Web standards like CSS, DOM and XML even more pathetic. A large part of the reason for this difference, I suspect, is that character encoding is largely a system-level technology that needs to be enabled in the operating system before it’s used in browsers or elsewhere. Even though at least two browser developers are also OS vendors, the development teams for the operating systems were probably looking at bigger issues than some silly browser war between Microsoft and Netscape, which is what got us into this Web standards mess in the first place. To be fair, it’s probably conceptually easier to implement Unicode capabilities (a mapping from code to glyph, and the inclusion of enough fonts to cover the space of glyphs) than the glorious power and complexity of the combinatorial system that is CSS.
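That “mapping from code to glyph” starts with Unicode’s code points: every character gets a fixed number and a standard name, independent of whatever font eventually draws it. A quick Python sketch (my illustration, using the standard unicodedata module) shows this for three scripts:

```python
import unicodedata

# Each character has a fixed code point and standard name,
# whatever font eventually renders it.
for ch in "aЖあ":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# U+0061 LATIN SMALL LETTER A
# U+0416 CYRILLIC CAPITAL LETTER ZHE
# U+3042 HIRAGANA LETTER A
```

The font-fallback behaviour the previous paragraphs describe is just the renderer’s half of this bargain: given a code point, find some installed font that has a glyph for it.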
So that’s the good thing – my English-oriented browser can display the non-Roman text in comments made by my readers. This is something that wasn’t possible or easy a few short years ago.
And yet there’s something about this state of affairs that bothers me. Maybe it’s not actually a bad thing, but it does give me pause. You see, I can’t read and understand the non-English blog postings referring to MacEdition’s resources, but the people who posted those links clearly can read those resources, which are all in English. (Some of my columns have been translated into Spanish, courtesy of faq-mac, but not the resources.)
For all the efforts going into internationalization (see, spelt with a “z” even though as an Australian I should spell it with an “s”), the brutal truth is that no Web site is going to be produced in more than a couple of dozen languages at once, and those only the sites of governments, international organisations or large companies. English has become the new lingua franca of international relations, with even French losing ground in diplomatic circles.
This state of affairs gives me pause for a number of reasons. First, I can write my columns, resources, academic research papers and anything else in English, with the presumption that people can read it without translation. This gives native speakers an advantage in the realm of ideas that we haven’t necessarily earned on our abilities alone.
Moreover, it accelerates the process of language loss that we’ve seen over the past couple of centuries, and will continue to see. (For although I don’t subscribe to the whole argument of Andrew Dalby in his book Language in Danger, I do wonder if we aren’t losing something important as minority languages disappear.)
Finally, it skews the information flow. The process of opening and globalization has had many positive effects – just look at the rise in living standards in South Korea or the fall in poverty rates in China and India. The Internet is part of that: information now flows more freely and cheaply than ever before. But it’s almost all one-way. Sure, now people all over the world can read the New York Times, Washington Post or the BBC News site. But Americans (and Australians) don’t generally read Le Monde, Frankfurter Allgemeine Zeitung (except the weekly English version) or Asahi Shimbun (ditto).
And if you don’t think that matters, something really has gotten lost in the translation.