The first of the year is a time for looking back at what was accomplished in the past year and for setting goals for the coming one. If you run a web site, one way to do that is to look at your site’s traffic and see what happened. I run my own server, so I have access to my complete web server logs. Historically I’ve always used Webalizer to run my stats.
But is Webalizer still the tool I should be using? There are other tools out there, so I looked at six applications that generate reports from my server logs. The tools I looked at are: Analog, AWStats, PWebStats, Visitors, Webalizer, and W3Perl.
| Tool | My Comments | Final Grade |
|------|-------------|-------------|
- I was testing offline processing. Some of these tools can be installed to run directly from a web site’s cgi-bin (W3Perl apparently prefers that setup).
- I ran these reports on the log of one of my rarely used domains, which is also used for exploring software. The log file had 7,746 records in it for 2007.
- I edited the sample reports by hand to remove spam referral links, as I don’t want to link to “bad places.”
- Processing speed can be important, but reporting tools like these are generally scripted to run overnight, so I did not track processing time. When I ran these programs on my larger sites’ logs (PlanetMike.com, ChristmasMusic247.com and ShowBizRadio.net) they all finished in an acceptable amount of time.
- The sample reports show default options. Read the tool’s docs for details on customizing your reports.
- Final Grade is entirely subjective, my own opinion.
Keep in mind that reading server logs is a black art; each of these tools makes many assumptions. A key assumption is how a visit is defined. To one tool, if the same IP address and user agent return within 30 minutes, that’s one visit; another tool may count the same data as two visits. Look at this example: a user from the IP address 257.258.259.260 visits a site at 10:00pm, reads that page, and then follows links on the site. The pages are accessed at 10:10pm, 10:50pm, 11:00pm, 11:10pm, 11:45pm, 12:05am and 12:10am. Webalizer would report one site and three visits. Visitors would report two unique visitors.
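To make the 30-minute assumption concrete, here is a minimal Python sketch of the two counting methods, using the timestamps from the example above. The helper names (`count_visits` and so on) are my own illustration, not code from any of these tools; real log analyzers also key sessions on IP address and user agent, which I skip here.

```python
from datetime import datetime, timedelta

# The hit times from the example: one browser session crossing midnight.
hits = [
    datetime(2007, 1, 1, 22, 0),   # 10:00pm
    datetime(2007, 1, 1, 22, 10),
    datetime(2007, 1, 1, 22, 50),
    datetime(2007, 1, 1, 23, 0),
    datetime(2007, 1, 1, 23, 10),
    datetime(2007, 1, 1, 23, 45),
    datetime(2007, 1, 2, 0, 5),    # 12:05am, the next calendar day
    datetime(2007, 1, 2, 0, 10),
]

def count_visits(timestamps, timeout=timedelta(minutes=30)):
    """Inactivity-timeout method (Webalizer-style): any gap longer
    than `timeout` between consecutive hits starts a new visit."""
    visits = 1
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > timeout:
            visits += 1
    return visits

def count_daily_visitors(timestamps):
    """Calendar-day method (Visitors-style): the same browser seen on
    two different days counts as two unique visitors."""
    return len({t.date() for t in timestamps})

print(count_visits(hits))          # gaps of 40 and 35 minutes -> 3 visits
print(count_daily_visitors(hits))  # hits span two dates -> 2 visitors
```

Same eight log records, two defensible answers: three visits or two visitors, depending on the assumption.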
And that raises the issue of defining terms. Someone (or something) that accesses a site may be called a visitor, a host, or a site. Hits may be called hits, requests, or accesses. Some tools count only HTML pages as hits; others count anything served from the server. So you can’t meaningfully compare stats from one tracking package to another until you understand what each one is reporting.
Another major issue is removing robots, spiders, and crawlers from your reports. Most webmasters aren’t interested in how many automated critters are devouring their site; they only want to know how many people are reading their articles. That’s where third-party embedded tools come into play, and I will discuss those tools next week. The one time you do want to know about automated traffic is when you’re trying to identify Bad Things. Spammers, thieves, crackers, and other abusers are out there leaving fingerprints in your log files, and third-party tools can’t help you when you’re looking for bots.
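Those fingerprints are usually in the user-agent field of each log line. Here is a rough sketch of how a log analyzer might flag crawlers from an Apache Combined Log Format line; the sample line and the short list of substrings are illustrative only, and real bot lists run to hundreds of entries.

```python
import re

# A sample Combined Log Format record; the user agent is the last
# double-quoted field.
LOG_LINE = (
    '66.249.66.1 - - [01/Jan/2007:10:00:00 -0500] '
    '"GET /index.html HTTP/1.1" 200 1234 "-" '
    '"Googlebot/2.1 (+http://www.google.com/bot.html)"'
)

# A few common crawler fingerprints (illustrative, not exhaustive).
BOT_SIGNS = ("bot", "crawler", "spider", "slurp")

def user_agent(line):
    """Pull the final double-quoted field (the user agent) from a
    Combined Log Format line; return "" if none is present."""
    fields = re.findall(r'"([^"]*)"', line)
    return fields[-1] if fields else ""

def looks_like_bot(agent):
    """Crude substring match against known crawler names."""
    agent = agent.lower()
    return any(sign in agent for sign in BOT_SIGNS)

print(looks_like_bot(user_agent(LOG_LINE)))  # True: "Googlebot"
```

This only catches bots that identify themselves honestly; the abusers worth hunting often forge a browser user agent, which is why digging through the raw logs by hand still matters.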
Each of these tools has a place in a webmaster’s toolbox. I will continue to use Webalizer for my public traffic report for PlanetMike.com. And for my own knowledge of how my sites are doing, I’ll probably use Webalizer, Visitors, and AWStats.
Update: September 8, 2009: I’ve removed the sample reports I generated. They were attracting a huge amount of attention from referer log spammers.