Entries from April 2004 ↓
April 26th, 2004 — Tech
I went to the XPDay conference last December where Martin Fowler gave one of the keynotes. I enjoyed the conference as a whole very much. While there were some XP zealots, I also found plenty of more rounded views which were of great interest (to me).
Martin’s talk was interesting and I took some notes which I have been meaning to write up.
There was nary a PowerPoint slide in sight as he strode around the room talking to us – rather like your archetypal absent-minded professor. He did refer back to previous conference talks he has given – “that was when I used to plan my talks as opposed to just give a generic title and talk about my current preoccupations!”.
He started his talk by saying that he was getting a bit tired of XP (and yes he was being a bit provocative!).
He compared software development with book writing, and concluded that the process of development varies legitimately because of who is doing it rather than what they are doing (his style vs. Steve McConnell’s, both of whom are successful authors). Martin thinks about something and then writes a complete section on that topic. He repeats this process, and then starts refactoring the topics, out of which comes his book. Steve apparently refines an outline view down to a very detailed level for the whole book. Then he fleshes out each detailed point with a few paragraphs and is done. Two very different ways of going about it. When asked if he would co-author a book with Steve he was doubtful because their styles are so different (whereas he has successfully co-authored with Kent Beck).
How do you measure the value of software? This is frequently rather difficult since reducing some business process down from 2 days to 45 minutes is down to a number of factors, not just the software that helped enable it. Software methodologies tend to be very difficult to compare and are not really measurable. Thus relative arguments are frequently founded on sand (there is a lot of heat in online discussions to very little benefit).
So, what makes good software design? He mentioned Eric Raymond’s The Art of Unix Programming as discussing what people have done to succeed as opposed to laying out what they should do to succeed.
In his view, the XP community consider design and programming to overlap to a huge degree (growing out of Smalltalk/Lisp/Unix world and experience). Others have the experience that programming without design leads to huge problems, and therefore conclude that the XP way can’t work.
Can evolutionary design work? Absolutely, and Unix is a prime example.
Martin discussed this topic in “Is Design Dead?”. He now considers that article fine up to a point, but he realised that he had missed a major issue to do with people. This was highlighted for him by Enrico Zaninotto during XP2002, who pointed out that you need something to make the process converge. Various XP practices influence this: test driven development, refactoring and continuous integration make evolutionary design possible. Things like YAGNI seem counterintuitive, but anticipatory design is seldom correct, adds complexity, and frequently has a large “cost of carry”. Unix worked by releasing early and often, building simply and evolving.
In Martin’s current view, the most important factor in getting evolutionary design to work is to have people on the team with the will and the ability to “make things converge”. Ability means that people recognise imperfections in a system (this is more important than knowing how to solve them – you can ask for help with that). Will means looking at what’s going on and having an active desire to fix it, persuade others etc. Thus motivation of a project team is a huge factor. You get problems if you don’t have people poking around, looking into things and working out how to make them converge.
If design and programming happen together, how can a manager tell that any design is actually being done? The manager needs to get a sense of “do people care?” and “is action happening?”. An important clue is that code being thrown away indicates design is happening. The next clue is the sense of the team dynamics or motivation.
And that was it. I certainly found it very interesting.
April 20th, 2004 — Tech
Having a spam filter in place is vital these days, and the one I use is SpamBayes. This has an excellent Outlook plugin which seems to work very well. The accuracy is very good (I have 2.5k spams and 10k hams for training), though I do get a reasonable number of SpamSuspects which are mostly spam.
The problem I had was that it is a client only solution – thus I had to download all that email to Outlook before it was correctly identified as spam. This is a particular problem when I am on site with clients, and my only email access is typically via webmail. Scanning real email amongst the multitudes of spam is not something I enjoy.
So, I wondered how I could get the spam filtering to happen on my ISP’s server without me having to download it. Like many ISPs they do have a SpamAssassin option, but I really don’t find it very accurate, and the fact that it was changing my emails meant I would have to retrain SpamBayes.
Fortunately, my ISP (Beyond Perception) is accommodating about the software I can install there, so I worked out how to install SpamBayes filtering directly on the server using the spam/ham training database that I had accumulated locally in Outlook.
.forward
This file sits in my home directory and just forwards emails to be processed by Procmail which is used by the ISP. Some trial and error and Googling showed the following magic spell does what I need.
~$ cat .forward
|/usr/bin/procmail -f-
.procmailrc
This controls how Procmail works.
~$ cat .procmailrc
# .procmailrc for using Python SpamBayes filter
MAILDIR=$HOME/mail
DEFAULT=$MAILDIR/inbox
LOGFILE=$MAILDIR/log
# Run the spam filter program on all emails which will then have X-Spam headers in for
# processing by later rules below. This just executes the Python SpamBayes program specifying the
# appropriate database.
:0fw:hamlock
| $HOME/utils/spamfilter
# Anything classified as spam is put directly in the spam box (might eventually
# just delete it).
:0
* ^X-Spambayes-Classification: spam
${MAILDIR}/spam/inbox
# The above just puts the mail directly into a file. If I used the following action it would forward
# the email (depends on your mail handling setup)
# ! spam@vaccaperna.co.uk
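To make the delivery rule above a little more concrete, here is a minimal Python sketch of the same decision: look at the X-Spambayes-Classification header that the filter step added and pick a mailbox. It is only an illustration of the logic, not what Procmail or SpamBayes actually run; the function name choose_mailbox and the default "mail" directory are my own assumptions.

```python
import email

# Header added by the SpamBayes filter step earlier in .procmailrc.
SPAM_HEADER = "X-Spambayes-Classification"


def choose_mailbox(raw_message: str, maildir: str = "mail") -> str:
    """Return the mailbox a message would be delivered to.

    Mirrors the Procmail recipe: a header value starting with "spam"
    goes to the spam box, everything else falls through to the inbox.
    (choose_mailbox and the maildir default are illustrative names only.)
    """
    msg = email.message_from_string(raw_message)
    classification = (msg.get(SPAM_HEADER) or "").strip().lower()
    if classification.startswith("spam"):
        return f"{maildir}/spam/inbox"
    return f"{maildir}/inbox"
```

Note that anything not classified as spam (including messages with no header at all) ends up in the normal inbox, which is the same fall-through behaviour as the DEFAULT setting in the recipe.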
Transferring spam/ham classification DB
The above works fine but needed "training". I realised I could shortcut this using the spam/ham classification database used by my local Outlook installation. All I needed to do was export the database to a text file (using dbExpImp.py which is a script which is part of SpamBayes), ftp that file up to the server and recreate the local database there. This is a slightly manual process, but the whole thing is working well enough that I did it initially and only transfer updates every month or so.
The database is sitting in:
C:\Documents and Settings\<username>\Application Data\SpamBayes
The command to export it is:
dbExpImp.py -e -D default_bayes_database.db -f db.txt
This required a slight tweak to the SpamBayes files to ensure the correct database format was used (Berkeley DB).
The reverse procedure on the server converts the text file back again.
April 15th, 2004 — SCM
It is very easy to miss the fact that Perforce has a couple of very useful branch history viewing utilities.
These are both available via the Perforce Public Depot and were contributed by individuals. In both cases they show the history of a file and how it has been branched (all ancestors etc).
It is very likely that Perforce will produce an official version of such a product, but in the meantime use one of these.
P4QTree
By Sam Stafford (who works for Perforce, but this was done in his own time and is not an official product).
It is written using Qt (the cross-platform GUI toolkit), so it could run on other platforms. Note that the source is also available.
Installation
From the following link in the Public Depot (you can just sync it down using a normal Perforce client if you want but I find this easier)
ftp://public.perforce.com/guest/sam_stafford/p4qtree/bin.ntx86/
Download the two files: qt-mt306.dll and p4qtree.exe and put them in the same directory in which you installed P4Win (the Perforce Windows GUI).
If you do this and stop and restart P4Win, you will notice that the right click menu within P4Win in the Depot pane now has an extra option called “Revision Timeline..”. This makes it very easy to get a quick picture.


Example of P4QTree dialog. Note that clicking and dragging from one version to another shows diffs between versions. Double-clicking a version gives you p4win history. The meaning of colours etc is shown by Help>Legend.
BranchView
This is written in Java by Andrei Loskutov.
Installation
Follow instructions at: http://public.perforce.com/guest/andrei_loskutov/readme.txt
Note that it has to be added to the p4win Tools menu and run from that menu.

BranchView example. If you right-click on trees you can reposition them. Also, if you right click in the canvas you can save to HTML + Images as well as other options.
Which One?
I use both myself. Most of the time I use P4QTree because it is faster (it uses the Perforce C++ API rather than spawning p4 command processes to get the info). However, it does have the odd bug, such as long pathnames being truncated, in which case I might use BranchView, whose ability to reorganise pictures can also be very useful.
April 9th, 2004 — Uncategorized
It’s taken a little fiddling, but I have tweaked Rublog to do the basics for me in running this blog.
The main things I was looking for in blogging software were:
- Local editing and uploading of entries as files (to allow me to version control all entries locally before upload)
- WYSIWYG editing of html files
Links with SCM or version control tools seem very limited with most tools which I find very surprising. I don’t want all my content just sitting on someone else’s server with all the attendant backup problems. Also, I don’t fancy a database for blog entries as they are very difficult to version control.
I did wonder if I could tweak Rublog to generate static HTML which I could then upload, but realised that things like the calendar worked against that. I also found an ISP (beyondperception.net) who has Ruby installed and who has allowed me lots of freedom in installing other packages and configuring things.
It would be nice to have a commenting capability in Rublog, and who knows, I may get around to a simple form of that.
I was testing with Apache on Windows and then uploading to a Unix server. As a result, I wanted to create a directory structure which allowed easy copying of files around. I also discovered my ISP had only barebones Ruby installed with no libraries, so I had to install those appropriately. Adding to $LOAD_PATH and using relative pathnames for my data files fixed that.
One extra change was to add a function to read the mtime (mod time) of the blog entries from an RCS keyword ($Date: 2004/04/09 $) which is in an HTML comment header. This is auto updated by my version control software (Perforce) when I submit the file, rather than when it might get copied up to the server, which could be some time later.
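The keyword-based mod time can be sketched roughly as follows. My actual change was inside Rublog (which is Ruby); this is just a Python illustration of the idea, and the name entry_mtime and the optional time-of-day handling are assumptions rather than the real code.

```python
import re
from datetime import datetime
from typing import Optional

# Matches an expanded keyword such as "$Date: 2004/04/09 $".
# The optional hh:mm:ss part is an assumption, in case a time is present.
RCS_DATE = re.compile(
    r"\$Date:\s*(\d{4})/(\d{2})/(\d{2})"
    r"(?:\s+(\d{2}):(\d{2}):(\d{2}))?\s*\$"
)


def entry_mtime(html: str) -> Optional[datetime]:
    """Return the date recorded in the entry's keyword header, if any."""
    match = RCS_DATE.search(html)
    if match is None:
        return None  # caller can fall back to the file's real mtime
    year, month, day = (int(g) for g in match.group(1, 2, 3))
    hour, minute, second = (int(g) if g else 0 for g in match.group(4, 5, 6))
    return datetime(year, month, day, hour, minute, second)
```

Returning None when no keyword is found leaves room to fall back to the file system mod time for entries that have not been submitted yet.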
Mod_rewrite fun
My installation was made a little more interesting by the fact that I needed to use Apache mod_rewrite on the domain name for this blog (pointing it at a sub-directory of the main site), and I wanted to have nice clean browser URLs. Thus URLs of the form http://mainhost.com/subsite/blog/index.cgi should be converted to http://subhost.com/blog, and such changes should also be reflected in the URLs used within Rublog (e.g. for links and print styles etc).
PragDave kindly pointed out that he achieved the clean URL effect using a single ScriptAlias in his Apache configuration with no changes to Rublog. Unfortunately this wouldn’t work for me since I use shared hosting and only have .htaccess files to use to configure this sort of thing (in which ScriptAlias is not allowed). Thus I got to mess a little with mod_rewrite, and I also needed to change the Rublog code a little to make URL creation happen in a single place and tweak that function (instead of just reading ENV['SCRIPT_NAME']).
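As a rough picture of what “URL creation in a single place” means, here is a small Python sketch (Rublog itself is Ruby, and this is not its code). The function name blog_url and the public_base parameter are hypothetical; the point is only that every link goes through one function, which can use a fixed public prefix instead of whatever SCRIPT_NAME happens to contain after rewriting.

```python
import os
from typing import Mapping, Optional


def blog_url(path: str = "",
             environ: Optional[Mapping[str, str]] = None,
             public_base: Optional[str] = None) -> str:
    """Build a link to a blog resource from a single base prefix.

    If public_base is given (e.g. "/blog") it wins; otherwise fall back
    to the CGI SCRIPT_NAME, which is what the unmodified approach uses.
    """
    environ = os.environ if environ is None else environ
    base = public_base if public_base is not None else environ.get("SCRIPT_NAME", "")
    base = base.rstrip("/")
    path = path.lstrip("/")
    if not path:
        return base or "/"
    return f"{base}/{path}"
```

With SCRIPT_NAME set to /robertcowham/blog/index.cgi, calling blog_url("2004/04/entry.html") exposes the internal path, whereas passing public_base="/blog" yields the clean /blog/2004/04/entry.html form.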
What I ended up with in my .htaccess seems pretty simple (just took some trial and error and local testing so that I could get at the mod_rewrite log to see why things weren’t working along the way):
# Redirect the robertcowham.com to sub-directory and blog references to appropriate .cgi
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_HOST} robertcowham.com$
RewriteRule ^blog/$ /robertcowham/blog/index.cgi [L]
RewriteCond %{HTTP_HOST} robertcowham.com$
RewriteRule ^blog(.*) /robertcowham/blog/index.cgi$1 [L]
# Anything not already caught by the above rules will be caught by this next one.
RewriteCond %{HTTP_HOST} robertcowham.com$
RewriteRule ^(.*) /robertcowham/$1
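There is also a single .htaccess in the robertcowham subdirectory which sets “RewriteEngine off”, so that paths which have already been rewritten into that subdirectory are not rewritten again.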
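For anyone trying to follow what the three rules do, this Python sketch mimics them with ordinary regular expressions. It is a simplification and not how Apache works internally: it makes a single pass, assumes the leading slash has already been stripped (as happens in a top-level .htaccess), and ignores the re-processing that the subdirectory .htaccess guards against. The function name rewrite is my own.

```python
import re

# Same host test as the RewriteCond lines (the dot is unescaped there too).
HOST_PATTERN = re.compile(r"robertcowham.com$")

# (pattern, substitution) pairs in the same order as the RewriteRule lines.
RULES = [
    (re.compile(r"^blog/$"), r"/robertcowham/blog/index.cgi"),
    (re.compile(r"^blog(.*)"), r"/robertcowham/blog/index.cgi\1"),
    (re.compile(r"^(.*)"), r"/robertcowham/\1"),
]


def rewrite(host: str, path: str) -> str:
    """Return the internal path a request would be mapped to."""
    if not HOST_PATTERN.search(host):
        return path  # other hosts are left alone
    relative = path[1:] if path.startswith("/") else path
    for pattern, target in RULES:
        match = pattern.search(relative)
        if match:
            # First matching rule wins, like the [L] flag.
            return match.expand(target)
    return path
```

So a request for /blog/2004/04 on robertcowham.com maps to /robertcowham/blog/index.cgi/2004/04, while /about.html maps to /robertcowham/about.html.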