Trawling the logs
March 26, 2001
As a firm believer in knowing your audience, I’ve long been tracking the usage of different browsers by MacEdition’s readers. This tells me whether I can use certain technologies on the site without leaving some users behind. When Opera’s technology preview for the Mac came out, I wanted to see how quickly it was adopted. Sure enough, over the following weeks Opera for the Mac garnered a bit over two percent of pageviews at MacEdition. This isn’t trivial: it’s about the same as IE4 on all platforms. What made me really sit up and take notice was the recent sharp increase in usage of Netscape 6.
Take a look at this graph. You can see the fortunes of Netscape 6, including its Mozilla sibling, unfold week-to-week. When 6.0 first came out in November, usage spiked: people downloaded it and tried it out on their favorite sites to see how they looked. At least some of that traffic would have been me and the other MacEdition staff checking the site in the new version, just in case our commitment to standards had run up against some horrible browser glitch. After that initial flurry, usage of Netscape 6.0 fell back to a tiny fraction of die-hards and people using the latest Mozilla builds, many of them on Linux. The initial release, Version 6.0, had been such a disappointment in speed and stability that very few people stuck with it – barely one percent of our pageviews.
When Version 6.01 came out, people naturally tried it out again. This time, they were more impressed. Many of them seem to have stuck with it, if only for a few weeks. In the latest week, its share seems to have fallen back, but not as far as it did after the initial release. I’m hopeful that it will stay fairly high.
The share of all versions of Netscape put together has declined, but the share of Netscape 4 has declined even more: from around 30 percent in November to just over 20 percent now. The biggest fall-off in Version 4’s share coincided with the increase in share for Version 6, implying that at least some Netscape users are upgrading. This is a good sign, both for the health of alternative (that is, non-IE) browsers and for Web designers hoping to shift to standards-based design principles.
Meanwhile, the share of Internet Explorer 5 continues to rise: it is up by between eight and ten percentage points since November, depending on which week you start from. In MacEdition’s case, most of this has been on the Mac side, which is the line shown in the graph. Although much of this gain has been at the expense of Netscape, clearly earlier versions of IE and minor browsers other than the ones shown here must also have declined.
A quick note on definitions
The “IE 5/Mac” line on this graph is the sum of pageviews accounted for by the browser tag Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC), and the betas of IE 5.5 for the Mac. Recently, the designation of the next version of IE for the Mac changed to 5.1; I’ve included that in the latest week. Some IE 5/Mac browsers might be missing from these figures if the browser tag has been altered, for example, with the name of the providing ISP. As far as I can tell from a quick check of recent log reports, this is pretty uncommon for IE 5/Mac, even though it is common on the Windows side.
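If you would rather script this sum than add it up by hand, a minimal sketch might look like the following. It assumes you already have pageview counts keyed by browser tag; the function names, the counts and every tag other than the IE 5.0 one quoted above are made up for illustration. Matching on substrings rather than the exact tag means an ISP-altered tag would still be counted.

```python
def is_ie5_mac(user_agent):
    """True for tags that announce MSIE 5.x on a Mac platform."""
    return "MSIE 5." in user_agent and "Mac" in user_agent


def ie5_mac_pageviews(pageviews_by_tag):
    """Sum the pageviews of every tag that looks like IE 5/Mac."""
    return sum(n for tag, n in pageviews_by_tag.items() if is_ie5_mac(tag))


# Illustrative counts only; these are not MacEdition's figures.
sample = {
    "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)": 400,
    # Hypothetical ISP-altered tag
    "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC; ExampleISP)": 10,
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)": 300,
    "Mozilla/4.0 (compatible; MSIE 4.5; Mac_PowerPC)": 50,
}

print(ie5_mac_pageviews(sample))  # 410
```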
And Opera? Well, for a technology preview, it’s done very nicely. As I mentioned above, it initially captured a pageview share of around two percent, and it has been rising marginally but steadily since then. Remember, this is for a technology preview that is meant to expire after a month.
The right measuring rod
Exercises like these – and their surprising findings – emphasise how important it is to have access to your server’s logs. I constructed this graph by running Analog on raw logfiles. I configured Analog to show nested lists of browsers by minor version by adding the option SUBBROW */*.* to the analog.cfg file, so I could tell the difference between, say, IE 5.0 and IE 5.5. Most canned Web site statistics packages don’t do this.
It’s also best to set Analog to rank traffic by pageviews, not by hits or requests. If some of your pages have more graphics than others, hits can be a misleading indicator of the number of readers. I also ran the analysis over weekly blocks instead of the daily logfiles, because weekend users have different characteristics from weekday users, and I wanted to average out that variation.
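To make the pageviews-versus-hits and weekly-block ideas concrete, here is a rough sketch of the same bookkeeping outside Analog. It is not how Analog works internally. It assumes a pageview is a successful request for a path ending in .html, .htm or /, and that each log record has already been reduced to a (date, path, status, browser label) tuple. The function names and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date

# Assumption: what counts as a "page" rather than a graphic or other file.
PAGE_SUFFIXES = (".html", ".htm", "/")


def is_pageview(path, status):
    """A successful request for a page, ignoring any query string."""
    return 200 <= status < 300 and path.split("?", 1)[0].endswith(PAGE_SUFFIXES)


def weekly_share(records):
    """Return {(iso_year, iso_week): {browser: percent of pageviews}}."""
    weekly = defaultdict(Counter)
    for day, path, status, browser in records:
        if is_pageview(path, status):
            iso = day.isocalendar()
            weekly[(iso[0], iso[1])][browser] += 1
    shares = {}
    for week, counts in weekly.items():
        total = sum(counts.values())
        shares[week] = {b: 100.0 * n / total for b, n in counts.items()}
    return shares


# Made-up records: weekday and weekend requests in one week, plus the next Monday.
records = [
    (date(2001, 3, 19), "/index.html", 200, "IE5"),
    (date(2001, 3, 19), "/logo.gif", 200, "IE5"),   # a hit, but not a pageview
    (date(2001, 3, 24), "/", 200, "NS4"),
    (date(2001, 3, 24), "/a.html", 200, "IE5"),
    (date(2001, 3, 24), "/b.html", 404, "NS4"),     # failed request, ignored
    (date(2001, 3, 26), "/a.html", 200, "NS4"),
]

shares = weekly_share(records)
```

Grouping by ISO week puts Saturday and Sunday in the same block as the preceding weekdays, so each weekly figure blends weekday and weekend readers.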
Counting the fibbers
This isn’t the end of the story. There are some tricks to be aware of in order to get accurate browser share data. First, Inktomi’s Slurp robot needs to be taken out of the Netscape statistics. Earlier versions of Slurp reported themselves with a user-agent tag beginning with “Slurp”, so statistics packages picked them up as such. Last December, Inktomi introduced Version 3 of their spidering robot, with browser tags like Mozilla/3.0 (Slurp/si; slurp@inktomi.com; http://www.inktomi.com/slurp.html). Most log analyzers, including Analog, interpret this as Netscape 3.0. If you have a reasonably large site, Slurp’s spidering can account for as much as half a percent of pageviews in a week. If you don’t take Slurp out of your estimate of Netscape 3.0’s share of pageviews, you may end up thinking Netscape 3.0 accounts for double its true share of usage: on MacEdition, at least, Netscape 3.0 only accounts for about half a percent of pageviews. This wouldn’t have been a problem if Inktomi had set the tag to Mozilla/3.0 (compatible; Slurp/si; slurp@inktomi.com; http://www.inktomi.com/slurp.html). I’ve already asked them to change this, without success; I encourage you to contact them about this annoyance, too.
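The doubling is easy to reproduce. The sketch below compares a naive rule (anything starting with Mozilla/3. that does not say “compatible” is Netscape 3) with one that checks for Slurp first. The classification rules, the function names, the Netscape 3 and IE sample tags and the counts are all assumptions for illustration; only the Slurp tag is the one quoted above.

```python
SLURP_TAG = ("Mozilla/3.0 (Slurp/si; slurp@inktomi.com; "
             "http://www.inktomi.com/slurp.html)")


def looks_like_netscape3(tag):
    """Naive rule: a Mozilla/3.x tag that does not claim to be 'compatible'."""
    return tag.startswith("Mozilla/3.") and "compatible" not in tag


def is_slurp(tag):
    """Catches the robot wherever 'Slurp' appears in the tag."""
    return "slurp" in tag.lower()


def netscape3_share(pageviews_by_tag, exclude_robots):
    """Percentage of all pageviews attributed to Netscape 3."""
    total = sum(pageviews_by_tag.values())
    ns3 = sum(
        n for tag, n in pageviews_by_tag.items()
        if looks_like_netscape3(tag) and not (exclude_robots and is_slurp(tag))
    )
    return 100.0 * ns3 / total


# Made-up counts: Slurp and real Netscape 3 at half a percent each.
sample = {
    SLURP_TAG: 50,
    "Mozilla/3.01 (Macintosh; I; PPC)": 50,                      # illustrative tag
    "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)": 9900,
}

print(netscape3_share(sample, exclude_robots=False))  # 1.0
print(netscape3_share(sample, exclude_robots=True))   # 0.5
```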
Other user agents that don’t always report themselves accurately include iCab, Opera and OmniWeb. Fortunately these browsers only have a limited number of browser tags, so you can do a search over the Analog report and add up their total share fairly quickly. Yes, it means a bit of mental arithmetic or messing with a calculator or spreadsheet. It’s the price we must currently pay to know our audience accurately.
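If the mental arithmetic gets tedious, the adding-up can be scripted. This is only a sketch: it assumes you have copied the browser tags and their pageview percentages out of the report into a dictionary, and it matches on the browser name appearing anywhere in the tag. The function name, the sample tags and the percentages are invented for illustration and are not real report output.

```python
# Browsers that may announce themselves under more than one tag.
DISGUISED = ("iCab", "Opera", "OmniWeb")


def combined_shares(report):
    """report maps browser tag -> percent of pageviews; returns totals per browser."""
    totals = {name: 0.0 for name in DISGUISED}
    for tag, pct in report.items():
        for name in DISGUISED:
            if name.lower() in tag.lower():
                totals[name] += pct
                break  # count each tag once
    return totals


# Invented tags and shares, for illustration only.
report = {
    "iCab/Pre2.4 (Macintosh; I; PPC)": 1.5,
    "Mozilla/4.5 (compatible; iCab Pre2.4; Macintosh; I; PPC)": 0.5,
    "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC) Opera 5.0 [en]": 2.0,
    "Opera/5.0 (Macintosh; U) [en]": 0.25,
    "Mozilla/4.5 (compatible; OmniWeb/4.0.1; Mac_PowerPC)": 0.75,
    "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)": 30.0,
}

totals = combined_shares(report)
```

With these invented numbers, the totals come to 2.0 for iCab, 2.25 for Opera and 0.75 for OmniWeb; the plain IE 5 tag is left alone.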
Finally, there are the jokers. No Web designer can accommodate users if they can’t tell what browser they are using. So tags like Nutscrape/1.0 (CP/M; 8-bit), Mozilla/4.0 (0000000000; 0000 000; 00000000000), Internet Ninja 4.0, Babbage Differential Engine/Pre0.3, or SpaceBison/0.01 [fu] (Win67; X; ShonenKnife) are always going to end up in the too-hard basket (although I can work out iCab/Pre2.4 (Macintosh; I; PPC; foofly-tailed red panda)).
Still, the share of users whose browsers support current Web standards reasonably well is about two-thirds and growing (IE5 plus Opera plus Netscape 6). Maybe the story is different for your site. But you won’t know unless you look really carefully at the logs.