Archive for January, 2008
Friday, January 25th, 2008 9:59 am
One of my blogs just caught a spammer from 69.31.80.66 trying to submit trackbacks to the blog, with extra fields injected into the “Name” field:

Gen Drebery','deber@gmail.com','','63.2.12.45','2008-01-25 13:43:30','2008-01-25 13:43:30','','0','Internet Explorer','comment','0','0'),('0', '', '', '', '', '2008-01-26 13:43:30', '2008-01-26 13:43:30', '', 'spam', '', 'comment', '0','0' ) /*
The web server logs showed he was trying to hit a specific post, then tried to hit the first post. Could this be an attempt to fingerprint my blog?

69.31.80.66 - - [25/Jan/2008:08:43:28 -0500] "POST /2006/10/30/post-slug-here/wp-trackback.php HTTP/1.0" 404 19104 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:28 -0500] "POST /2006/10/30/wp-trackback.php HTTP/1.0" 404 19123 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:28 -0500] "POST /2006/10/wp-trackback.php HTTP/1.0" 404 19104 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:29 -0500] "POST /2006/wp-trackback.php HTTP/1.0" 404 19104 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:29 -0500] "POST /wp-trackback.php HTTP/1.0" 200 135 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:29 -0500] "GET /wp-trackback.php?p=1 HTTP/1.0" 302 - "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:30 -0500] "GET /wp-login.php?action=logout HTTP/1.0" 302 - "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:30 -0500] "POST /wp-trackback.php?p=1 HTTP/1.0" 200 78 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:31 -0500] "POST /wp-trackback.php?p=1 HTTP/1.0" 500 600 "-" "Python-urllib/1.17"
69.31.80.66 - - [25/Jan/2008:08:43:31 -0500] "POST /wp-trackback.php?p=1 HTTP/1.0" 500 600 "-" "Python-urllib/1.17"
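A probe like this is easier to follow once the log lines are broken into fields. Here is a minimal sketch, assuming the log is in Apache's combined format with plain straight double quotes around the request, referrer and user agent. The names parse_line and summarize are hypothetical helpers made up for this illustration, not part of any existing tool.

```python
import re
from collections import defaultdict

# Regular expression for one line of an Apache "combined" format log.
# Assumption: fields are laid out exactly as in the excerpt above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Group requests by (IP, user agent), keeping method, path and status in order."""
    by_client = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in the expected format
        key = (entry["ip"], entry["agent"])
        by_client[key].append((entry["method"], entry["path"], int(entry["status"])))
    return dict(by_client)


if __name__ == "__main__":
    sample = [
        '69.31.80.66 - - [25/Jan/2008:08:43:29 -0500] '
        '"POST /2006/wp-trackback.php HTTP/1.0" 404 19104 "-" "Python-urllib/1.17"',
        '69.31.80.66 - - [25/Jan/2008:08:43:29 -0500] '
        '"POST /wp-trackback.php HTTP/1.0" 200 135 "-" "Python-urllib/1.17"',
    ]
    for (ip, agent), requests in summarize(sample).items():
        print(ip, agent)
        for method, path, status in requests:
            print("   ", status, method, path)
```

Grouping by IP address and user agent puts the whole sequence of requests from one client together, which makes the walk up the directory tree easy to spot.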
Posted in Spam | 3 Comments »
Saturday, January 19th, 2008 1:23 pm
I wrote an article about my experience with renting a movie via the Apple iTunes store. It’s published on TheAppleBlog.com.
Posted in Buy-a-Mac, Technology | No Comments »
Wednesday, January 16th, 2008 1:55 pm
For one of my other sites, I’ll be doing some postal mailings in which I’ll need to include the URLs of some of the posts I’ve made. I don’t want to force people to retype those horribly long URLs. I could use a service like TinyUrl.com, but I’m not happy giving a third party control of portions of my web site. So I’ve made it easier by combining Apache’s rewrite rules with WordPress’ post ID numbers. Instead of having to mail out a URL like:
http://www.showbizradio.net/2008/01/10/community-theater-schedule-wallpaper/
I can include this one, which will redirect to the same post, and is much easier to type, or read over the phone:
http://www.showbizradio.net/goto/2133
To do this, create a new folder under your WordPress directory. You can call it anything you like, but shorter is better. I’ve called the directory goto, although go would also work well. Inside that directory, create a file called .htaccess. The leading dot is important! Put these lines in the .htaccess file:

RewriteEngine On
RewriteRule ^([0-9]+)/?$ /index.php?p=$1 [R=301,L]
RewriteRule (.*) / [R=301,L]

The first line turns on the rewrite engine so the web server will process the rules that follow. The second line says that if a page request in your “goto” directory is only digits (the ^ and $ anchors make sure nothing else is in the request), those digits are passed to index.php as the post ID. The R=301 tells web browsers and search engines to permanently redirect to the new URL, and the L means this is the last rule to execute. The third line catches any other request (such as http://www.showbizradio.net/goto/heck) by redirecting it to your site’s home page.

And that’s all there is to it. Let me know if you have any problems with this. I’ve tested it only on WP 2.3.2, running under Apache. It should work fine even if you have customized your site’s permalink structure.
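If it helps to see the logic outside of Apache, here is a small sketch of what the two rules are meant to do. It assumes the behavior described above: a request made up only of digits goes to that post, and anything else goes to the home page. The function name resolve_goto is hypothetical, and this only imitates the rules in Python. It is not how Apache itself evaluates them.

```python
import re

# Matches a request made up only of digits, with an optional trailing slash.
# This mirrors the intent of the first rewrite rule, not Apache's internals.
DIGITS_ONLY = re.compile(r"^([0-9]+)/?$")


def resolve_goto(request_path):
    """Return the redirect target for a path requested under the goto directory.

    request_path is the part of the URL after "/goto/", for example "2133".
    """
    match = DIGITS_ONLY.match(request_path)
    if match:
        # Digits only: send the visitor to the post with that ID.
        return "/index.php?p=" + match.group(1)
    # Anything else falls through to the home page, like the catch-all rule.
    return "/"


if __name__ == "__main__":
    for path in ("2133", "2133/", "heck", "heck2"):
        print(path, "->", resolve_goto(path))
```

Trying a few paths this way is a quick check that a request like goto/heck2, which contains digits but is not only digits, ends up at the home page rather than at post 2.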
Posted in Plugins, Web-design | 16 Comments »
Thursday, January 10th, 2008 7:48 am
I’ve made a good start on splitting out personal stuff from PlanetMike.com, and moving it to MichaelClark.name. Jokes, photo galleries, and a handful of categories from WordPress have been moved. I’m moving a few more categories today, then I’ll share some of the lessons I’ve learned. Hopefully you won’t see too many problems on any of the pages during this transition. I’m using linked directories and Apache’s rewrite rules to gracefully move content around.
Posted in Site-details, Web-design | No Comments »
Monday, January 7th, 2008 11:35 am
I’ve just released a WordPress plugin for disabling smart quotes in text that is inside a <code> block. Smart quotes, also known as curly quotes or fancy quotes, don’t mix well with code: if someone copies and pastes your code with smart quotes in it, they have to tweak the code before they can use it, which I think everyone will agree is a waste of time. More information and the download are available on the CodeQuote page. Please send me your feedback; I definitely need to know about situations where other characters inside code are being “fixed” by WordPress.

Update: 4:42pm I’ve already made several bug fixes to CodeQuote. Things like less-than symbols apparently are now working correctly. And it doesn’t matter if you have a blank line in front of the opening code tag. Let me know if you see any other weirdness. I’ll need to see the code you’re entering into a post so I can experiment on it here.
Posted in CodeQuote, Plugins | No Comments »
Saturday, January 5th, 2008 2:12 pm
The first of the year is a time for looking back at what was accomplished in the past year, and for setting goals for the coming year. If you run a web site, one way to do that is to look at your site’s traffic and see what happened. I run my own server, so I have access to my complete web server logs. Historically I’ve always used Webalizer to run my stats, and published a subset of those stats here. But is Webalizer still the tool I should be using? There are other tools out there, so I looked at six different applications that will generate different reports based on my server logs. The tools I looked at are: Analog, AWStats, PWebStats, Visitors, Webalizer, and W3Perl.

| Tool | My Comments | Final Grade |
|---|---|---|
| Analog | Slightly quirky command line syntax. Had to manually copy the images to the output directory. Output is a single html file. Sample Report. | B |
| AWStats | Command line configuration is intimidating. Have to process each month separately in a two step process (one to read logs, other to generate reports). Command line installation is quirky, with numerous path headaches. Gives the best, most detailed reports. Easily scriptable with shell scripts. 22 html files created (per month of data) at highest level of detail. Sample Report. | A- |
| PWebStats | Ugh, lots of configuration work. Tedious to work on. Doesn’t seem to be able to easily do historical reports. Many files output. Sample Report. | D |
| Visitors | Simple configuration, works immediately after installation. Hardcoded Google traffic report, no other search engines included. Lots of command line options, no configuration file, so a good candidate for scripting once you figure out which options you want to use. Report is one single file, no graphic files needed; images in the report are actually shaded table cells. Apparently handles data from multiple years. Sample Report. | B+ |
| Webalizer | Requires a config file; command line options available. Creates easy to read charts and reports. Outputs three files per month (one html page and two images) plus a 12 month summary (one html page and one image). Does not keep annual reports, so to have a 2007 report once any part of 2008 is analyzed, you need to manually rename and tweak the annual report once all of December has been processed. Sample Report. | B |
| W3Perl | Difficult to install and set up for offline usage. I never was able to get it to find the language files, so it never would run. | I |
Notes:
- I was testing offline processing. Some of these tools can be installed to run directly from a web site’s cgi-bin (W3Perl apparently prefers that setup).
- I ran these reports on the log of one of my rarely used domains, which is also used for exploring software. The log file had 7,746 records in it for 2007.
- I edited the sample reports by hand to remove spam referral links, as I don’t want to link to “bad places.”
- Processing speed can be important, but generally reporting tools like these are scripted and run overnight, so I did not track processing time. When I ran these programs on my larger sites’ logs (PlanetMike.com, ChristmasMusic247.com and ShowBizRadio.net) they all finished in an acceptable amount of time.
- The sample reports show default options. Read the tool’s docs for details on customizing your reports.
- Final Grade is entirely subjective, my own opinion.
Keep in mind that reading server logs is a black art. Many assumptions are made by each of these tools. A key assumption is how to define a visit. One tool may decide that requests from the same IP address and user agent within 30 minutes of each other are one visit. To another tool with the same data, that may be two visits. Look at this example: A user from the IP address 257.258.259.260 visits a site at 10:00pm, reads that page, and then follows links on the site. The pages are accessed at 10:10pm, 10:50pm, 11:00pm, 11:10pm, 11:45pm, 12:05am and 12:10am. Webalizer would report one site and three visits. Visitors would report two unique visitors.

And that raises the issue of defining terms. Someone (or something) that accesses a site may be called a visitor, a host, or a site. Hits may be called hits, requests, or accesses. Some tools define hits as only html pages that are accessed; others think a hit is anything that is accessed from the server. So you shouldn’t compare stats from one tracking package to another. You can’t easily compare stats until you understand what they are reporting.

Another major issue is removing robots, spiders and crawlers from your reports. Most webmasters aren’t interested in how many automated critters are devouring their site; they only want to know how many people are reading their articles. That’s where third party embedded tools come into play. I will discuss those tools next week.

One time when you do want to know about automated traffic is when you want to identify Bad Things. Spammers, thieves, crackers and other abusers are out there leaving fingerprints in your log files. Third party tools can’t help you when you’re looking for bots.

Conclusion

Each of these tools has a place in a webmaster’s toolbox. I will continue to use Webalizer for my public traffic report for PlanetMike.com. And for my own knowledge of how my sites are doing, I’ll probably use Webalizer, Visitors, and AWStats.
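The visit-counting example above (the user from 257.258.259.260) can be worked through in a few lines of code. This is only a sketch under two assumptions of mine: that a visit ends after 30 minutes of inactivity, and that unique visitors are counted once per calendar day. Those two rules happen to reproduce the figures quoted above, but they are not taken from either tool’s source code. The function names and the date are made up for the illustration.

```python
from datetime import datetime, timedelta

# Assumption: a visit ends after 30 minutes without a request.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(timestamps, timeout=VISIT_TIMEOUT):
    """Count visits for one client; a gap longer than the timeout starts a new visit."""
    visits = 0
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > timeout:
            visits += 1
        previous = ts
    return visits


def count_daily_visitors(timestamps):
    """Count the client once for each calendar day on which it made a request."""
    return len({ts.date() for ts in timestamps})


def example_timestamps():
    """Request times from the example; the date itself is arbitrary."""
    evening = datetime(2007, 12, 30)
    next_day = evening + timedelta(days=1)
    times = [(22, 0), (22, 10), (22, 50), (23, 0), (23, 10), (23, 45)]
    after_midnight = [(0, 5), (0, 10)]
    return (
        [evening.replace(hour=h, minute=m) for h, m in times]
        + [next_day.replace(hour=h, minute=m) for h, m in after_midnight]
    )


if __name__ == "__main__":
    stamps = example_timestamps()
    print("visits:", count_visits(stamps))
    print("daily visitors:", count_daily_visitors(stamps))
```

Under these assumptions the 40 minute gap before 10:50pm and the 35 minute gap before 11:45pm each start a new visit, which gives three visits, and the session crossing midnight counts the same person on two different days.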
Posted in Web-design | No Comments »