Nobody likes a fibber
January 14, 2002
Writing followups to previous articles suggests that something was left out from the originals. This is always regrettable. Still, some of the issues raised in discussion on browser tag spoofing are too important to ignore.
Who is using OS X?
One of the things that I showed in the previous column was the usage shares of browsers among users of Mac OS X. This was a simple matter of counting up IE 5.1x (which used to be an exclusively OS X browser), OmniWeb and the Netscape/Mozilla browser tags with “Mac OS X” somewhere in the string. Because both Opera and iCab identify themselves as running on Macs without distinguishing between Mac OS 9 and Mac OS X, I can’t add them into the OS X total. However, since neither of them makes up more than about 3 percent of total traffic, and some of that Opera total is Windows and Linux, it seems unlikely that these two browsers would add much to the total of MacEdition readers using Mac OS X.
The picture gets cloudier from late December 2001, when Microsoft released IE 5.1 for Mac OS 9. This doesn’t affect the numbers I reported in the article, which predated the release of this browser, but it could well muddy the picture going forward. The Mac OS 9 version seems to report itself as 5.13, but apparently so do some builds of the Mac OS X version. Versions 5.1 and 5.12 are definitely Mac OS X.
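For anyone wanting to repeat the count on their own logs, here is a minimal sketch of the rules described above. It assumes that IE for the Mac reports itself in the form "MSIE 5.1x; Mac_PowerPC"; the function names and the sample Netscape tag are illustrative only and do not come from MacEdition's actual Analog setup.

```python
import re
from collections import Counter

# Assumed shape of an IE-for-Mac tag, e.g. "MSIE 5.12; Mac_PowerPC".
IE_51X_MAC = re.compile(r"MSIE (5\.1\d*); Mac_PowerPC")


def classify_platform(tag):
    """Return 'osx', 'ambiguous' or 'other' for one browser tag."""
    # OmniWeb and anything that says "Mac OS X" count towards the OS X total.
    if "OmniWeb" in tag or "Mac OS X" in tag:
        return "osx"
    match = IE_51X_MAC.search(tag)
    if match:
        # 5.13 is reported by the Mac OS 9 release and by some OS X builds,
        # so it cannot be assigned to either system with confidence.
        return "ambiguous" if match.group(1) == "5.13" else "osx"
    # Opera and iCab land here: their tags do not say which Mac OS is in use.
    return "other"


def count_platforms(tags):
    """Tally the classifications for an iterable of browser tags."""
    return Counter(classify_platform(tag) for tag in tags)


if __name__ == "__main__":
    sample = [
        "Mozilla/4.0 (compatible; MSIE 5.12; Mac_PowerPC)",
        "Mozilla/4.0 (compatible; MSIE 5.13; Mac_PowerPC)",
        "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US) Netscape6/6.2",  # illustrative
        "Opera/5.0 (Macintosh;US;PPC) TP [en]",
    ]
    print(count_platforms(sample))
```

Keeping 5.13 in a bucket of its own makes the uncertainty visible in the totals instead of quietly assigning those hits to one system.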
In any case, Carbon browsers could well make this moot. While Carbonised apps can run on both Mac OS 9 and Mac OS X, existing browsers have so far been released as separate versions.
If it’s the same application, it won’t be possible to tell who is Mac OS 9 and who is Mac OS X. Still, we were able to learn a lot about the uptake of Mac OS X through 2001 from our Web site logs, even if this will no longer be possible in 2002.
Who are you really?
It’s important to distinguish between semi-spoofing and true
spoofing. Semi-spoofing is what Opera does. By default, Opera for Mac has
a browser tag like this:
Opera/5.0 (Macintosh;US;PPC) TP [en].
The “TP” in the beta versions means “technical preview,”
and “[en]” is the language code. You can also choose
to have Opera identify itself as some other browser. Then the browser tag
becomes one of these:
Mozilla/5.0 (Macintosh;US;PPC) Opera 5.0 [en]
Mozilla/4.76 (Macintosh;US;PPC) Opera 5.0 [en]
Mozilla/3.0 (Macintosh;US;PPC) Opera 5.0 [en]
Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC) Opera 5.0 [en]
These semi-spoofed tags are generally considered sufficient to fool most browser sniffer scripts and server-side technologies. What’s crucial here, however, is that the word “Opera” and the true version number are still in the tag. This means it is possible for the better log analysis packages, including Analog, and more discerning Web authors such as myself, to identify these browsers correctly as Opera.
The same is true of other minor browsers such as Galeon, which is
Mozilla-derived, and Konqueror, as well as
the BeOS browser NetPositive, which has a tag like this:
(compatible; NetPositive/2.2.2; BeOS). None of these browsers seem
to have problems doing most things on the Web – you certainly
don’t hear of people complaining that Konqueror has been blocked from
a particular Web site.
Strangely, these examples have not been picked up by the two leading minor
browsers on the Mac platform, iCab and OmniWeb. The default for iCab is to
identify itself as iCab at the front of the tag like this:
(Macintosh; I; PPC). There are also options that identify iCab as
Mozilla/4.5 (compatible; iCab 2.6; Macintosh; I; PPC) or a
similar version with a front label of Lynx. Similarly, the default for
OmniWeb is something along these lines:
OmniWeb/4.0.6; Mac_PowerPC). This is all fine.
Although most log analysis packages don’t pick up iCab or OmniWeb by default, it’s easy to catch these browser tags – what I call “semi-spoofed” tags – through some options in Analog. So even if other sources of browser stats aren’t doing it right, you can be assured that I am doing my best to compensate.
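The idea of catching semi-spoofed tags can be sketched in a few lines. This is not Analog's configuration syntax, just an illustration of the principle, and the `identify` function and its pattern list are hypothetical: look for the real browser's name first, and fall back to the borrowed “Mozilla” or “MSIE” label only when no such name is present.

```python
import re

VERSION = r"(\d+(?:\.\d+)*)"

# Checked in order: a real browser name, wherever it sits in the tag,
# wins over the "Mozilla" or "MSIE" label that a semi-spoof borrows.
MINOR_BROWSERS = [
    ("Opera", re.compile(r"Opera[ /]" + VERSION)),
    ("iCab", re.compile(r"iCab[ /]" + VERSION)),
    ("OmniWeb", re.compile(r"OmniWeb/" + VERSION)),
    ("NetPositive", re.compile(r"NetPositive/" + VERSION)),
]


def identify(tag):
    """Return (browser, version) for a tag, seeing through semi-spoofs."""
    for name, pattern in MINOR_BROWSERS:
        match = pattern.search(tag)
        if match:
            return name, match.group(1)
    match = re.search(r"MSIE " + VERSION, tag)
    if match:
        return "MSIE", match.group(1)
    match = re.match(r"Mozilla/" + VERSION, tag)
    if match:
        return "Netscape/Mozilla", match.group(1)
    return "unknown", None


if __name__ == "__main__":
    for tag in [
        "Opera/5.0 (Macintosh;US;PPC) TP [en]",
        "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC) Opera 5.0 [en]",
        "Mozilla/4.5 (compatible; iCab 2.6; Macintosh; I; PPC)",
        "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)",
    ]:
        print(identify(tag), "<-", tag)
```

The last sample tag is the catch: it is what a true spoof of IE 5.0 for the Mac looks like, and nothing in it lets this or any other script tell it apart from the real thing.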
Giving false tags is lying
These browsers go further than the semi-spoof, however. iCab also offers the option of identifying itself exactly the same way as IE 5.0 for the Mac, or of writing one’s own tag (some of these can be quite amusing). All I ask is that these comedians leave the word “iCab” in the browser tag so I know my audience.
Similarly, OmniWeb has options to report itself using exactly the same tag as IE5 – a true spoof, not a semi-spoof. As reported by an Omni Group employee, the company did this so OmniWeb users wouldn’t get blocked out of some sites. What this says to me is that either some browser sniffers are broken – so much so that plenty of users of IE and Netscape are getting chucked out of those sites too, which I don’t believe – or Web authors are deliberately blocking OmniWeb because it has such limited support for important Web standards, including security. In which case it should be getting blocked and the Omni Group should fix the shortcoming, not spoof around it.
Let’s be completely clear about this: Browser sniffers do not sniff specific full strings – there are far too many different browser tags out there for this to work. Last month I saw nearly 2000 different tags in MacEdition’s logs, with multitudes of variations on the big two, as well as dozens of search engines and minor browsers. I couldn’t build up a database of them all. Even the database-driven commercial sniffers rely on some pattern-matching. The following tags don’t get blocked by sites, so why would OmniWeb’s default “semi-spoof” tag?
Mozilla/4.78 [en]C-C-UDP; georgetownU-campus-4.7-08.06.2001 (Windows NT
Mozilla/4.08C-SYMPA (Macintosh; U; PPC)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90; PeoplePC 2.3.2; ISP)
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; DT)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; UUNET)
Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; Wanadoo BE; KITV4 Wanadoo)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSOCD; AtHome020)
If the Omni Group is genuinely finding that standard browser sniffers are
blocking OmniWeb, it could try moving the
“OmniWeb/4.x.x” identifier to the very end of the
string, after the parentheses. If that doesn’t work, maybe it needs
to investigate why those sites are blocking its product.
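To illustrate the pattern-matching point, here is a deliberately crude sniffer of the “big two only” kind. The function and its thresholds are hypothetical and not taken from any real product. Because it matches fragments, not whole strings, the cluttered tags listed above pass, and so would a semi-spoofed tag with the OmniWeb identifier moved to the end (the rearranged tag in the sample is my own invention, for illustration).

```python
import re


def sniffer_allows(tag):
    """Admit anything that looks like IE 5+ or Netscape 4+ (hypothetical rule)."""
    msie = re.search(r"MSIE (\d+)\.", tag)
    if msie:
        return int(msie.group(1)) >= 5
    netscape = re.match(r"Mozilla/(\d+)\.", tag)
    return bool(netscape) and int(netscape.group(1)) >= 4


if __name__ == "__main__":
    tags = [
        "Mozilla/4.08C-SYMPA (Macintosh; U; PPC)",
        "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; UUNET)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSOCD; AtHome020)",
        # Hypothetical rearrangement with the identifier after the parentheses.
        "Mozilla/4.5 (compatible; Mac_PowerPC) OmniWeb/4.0.6",
        # Opera's unspoofed default tag, for contrast.
        "Opera/5.0 (Macintosh;US;PPC) TP [en]",
    ]
    for tag in tags:
        print("allowed" if sniffer_allows(tag) else "blocked", "<-", tag)
```

A sniffer this blunt keeps out only browsers that drop the borrowed “Mozilla” label altogether, which is exactly why the semi-spoof is enough.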
True-spoofing your browser tag is unnecessary, counterproductive and wrong. It’s unnecessary because semi-spoofing is sufficient for all the browser sniffers I have ever encountered. It’s counterproductive because if dumb Web authors block anything but the big two browsers through brain-dead browser sniffers (letting only the most generic tags of those browsers through), then letting them think that you are using one of those browsers means they have no incentive to change their ways. Few things will bring about the one-browser Web faster than a bunch of minor browsers that report themselves as that one browser. It’s wrong because if there is a good reason to sniff out your browser – say, because the Web author wants to serve content appropriate to that browser – it becomes impossible with true spoofing. I’d love to be able to serve OmniWeb users with a stylesheet that works around its myriad CSS bugs and support gaps. If I could reliably sniff for iCab users too, I could get around a particular Netscape 4 bug without making things unsightly for those iCab users.
But I can’t do these things if iCab and OmniWeb users can’t be identified as such. Most browser sniffing is ignorant and exclusionary (“You’re not IE5+ so you can’t see my Web site”); such behaviour can and will be mocked by me at every opportunity. But it can be used for good as well as evil, if only users would allow it.
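As a sketch of the benign kind of sniffing, a server-side script might choose stylesheets along these lines. The file names and the `pick_stylesheets` function are made up for illustration, and the rule for recognising Netscape 4 (a tag starting with “Mozilla/4.” that lacks the word “compatible”) is an assumption, not a tested recipe.

```python
def pick_stylesheets(tag):
    """Return the stylesheets to serve for a browser tag (hypothetical names)."""
    sheets = ["base.css"]
    if "OmniWeb" in tag:
        # Work around OmniWeb's CSS bugs and support gaps.
        sheets.append("omniweb-workarounds.css")
    elif "iCab" in tag:
        # iCab is spared the Netscape 4 workaround, which looks unsightly in it.
        sheets.append("icab.css")
    elif tag.startswith("Mozilla/4.") and "compatible" not in tag:
        # Assumed signature of a genuine Netscape 4 tag.
        sheets.append("netscape4-workaround.css")
    return sheets


if __name__ == "__main__":
    # A semi-spoofed iCab tag can be served its own stylesheet...
    print(pick_stylesheets("Mozilla/4.5 (compatible; iCab 2.6; Macintosh; I; PPC)"))
    # ...but a true spoof of IE 5.0 for the Mac gets only the base stylesheet.
    print(pick_stylesheets("Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"))
```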
Fibbers don’t get counted
The whole issue of browser tag spoofing makes me uneasy. As I acknowledged
in my earlier article in the context of
robots, there will always be some user agents that are deliberately faked
to look like something else. There is no sense deliberately adding to this
problem, and it’s beyond me why any maker or user of a browser would
want to behave in the same way as those rogue robots that fake their tags
for nefarious reasons. Surely there is another way around the banks’
blocking tactics, similar to the myriad of personalised tags that IE and
Netscape permit. After all, custom tags like
(compatible; MSIE 6.0; Windows NT 5.1; Q312461; MSN 6.1; MSNbMSFT;
MSNmen-us; MSNc00) presumably don’t lock you out of banking
sites, and they are much more complex than what I am suggesting OmniWeb and
iCab use.
Further, by making it theoretically impossible to determine true usage of these minor browsers, all claims about market share become unreliable. You can hold up this myth of hordes of users of your product who are really out there but can’t be seen (sort of like the way some Linux users sometimes carry on). Frankly, I think that most people leave their browser tag on the default setting, so the figures I produced in the earlier column will capture the vast majority of iCab and OmniWeb users, even if nobody else’s stats do. I very much doubt that these hidden users account for much more than half of one percentage point of our human readership. Even with the possibility of full spoofing, my Analog file and careful examination of our logs for anomalies will generate a sufficiently accurate analysis of MacEdition readers’ browser usage, and you can do the same for your sites. The few people using the tag-spoofing options just won’t distort the numbers in a detectable way.
If you set your browser to have a tag that creates a true spoof, as opposed to a semi-spoof, then you have no right to expect that Web authors will accommodate the bugs in your browser. And if you are a browser manufacturer that makes it possible for users to true-spoof their browser tag through simple preference options, you have no right to claim that your product’s market share is larger than it seems. So it’s just as well that, thus far, the makers of iCab and OmniWeb haven’t made any claims about their market share. That’s the way it should stay.