Entries Tagged 'SCM' ↓

Environmental Time Wasting

The situation I found at a recent client engagement was not unusual – large amounts of time being wasted due to poor processes around managing their environments – particularly test environments.

This post is about how we have set about improving this type of situation.

Background

As at many companies, processes and procedures had grown by accretion – a gradual build-up of tribal lore about what to do to get a working environment. As is very common, this was not well understood, and it was poorly and inaccurately documented. New starters “cloned” an existing environment and then diverged!

Their processes were quite formal in some respects – multi-page forms requiring paper sign-off, and deployment instructions consisting of Word documents with literally hundreds of manual steps listed in some cases!

Production Problems are Rare, but…

The systems being developed by this client are “only” used internally, but with tens or hundreds of thousands of client transactions being processed daily, it was not surprising that they needed to avoid problems.

The processes in place did mean that relatively few problems were being introduced into production – the key problem was that they realised they were not very efficient in how they achieved this – it required a lot of effort every time and there were far too many manual steps. A few key people were overworked and stressed, and productivity was low.

There were a number of issues causing problems, but the biggest one was the lack of control of the environment. They realised that their testers were regularly spending hours, and in some cases days of wasted effort to make sure that test failures were due to the actual code under test and not just due to the environment being incorrectly set up.

Thus an hour or two of testing could take multiple days to perform – no wonder testing was perceived as a bottleneck.

Virtual Machines

There were some good practices – the use of VMs for testing meant people could to some extent separate their working tools from the environment used for tests.

Theoretically this meant that a VM could be replaced by a fresh image, but in practice that was seldom done. Thus the various VMs had gradually drifted apart in terms of operating systems and patches and other application installations.

Configuration Identification

A key configuration management process is that of being able to accurately identify the particular versions of the programs and files in use. This was reasonably accurately being done (even if somewhat inefficiently as the build process was largely manual and they were using multiple Visual SourceSafe repositories!).

The problem was that with hundreds of .exes and .dlls as well as other files, it was effectively impossible to manually control what was going on, and also to remain up-to-date with a constant stream of updates and enhancements.

Audit

The first action to improve things was to write a small audit tool. This took as input a single master spreadsheet of executable names and specified version numbers. It then scanned the local machine and detected which versions of which .exes and .dlls were actually present, and also a list of other files.

We then started by running this on the production machines. With a little bit of batch file scripting we were able to quickly include things like registry contents and services installed.

It produced a report containing 4 basic sections:

  • files that were expected to be on the machine where the version numbers matched
  • files with mis-matching version numbers
  • files that were expected to be on the machine but were not in their expected location
  • unexpected files found locally
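The four report sections can be sketched as a simple comparison of two mappings. This is only a minimal illustration in Python, not the client's tool: the names `AuditReport` and `audit` are hypothetical, and it assumes the master spreadsheet and the local scan have each already been reduced to a `{path: version}` dictionary (reading version resources from .exes and .dlls is left out).

```python
from dataclasses import dataclass, field


@dataclass
class AuditReport:
    """The four report sections described above (hypothetical structure)."""
    matched: list = field(default_factory=list)
    mismatched: list = field(default_factory=list)
    missing: list = field(default_factory=list)
    unexpected: list = field(default_factory=list)


def audit(expected, found):
    """Compare expected {path: version} against found {path: version}."""
    report = AuditReport()
    for path, want in sorted(expected.items()):
        if path not in found:
            # Expected on the machine, but not in its expected location.
            report.missing.append(path)
        elif found[path] == want:
            report.matched.append(path)
        else:
            # Record both versions so the report shows what to fix.
            report.mismatched.append((path, want, found[path]))
    # Anything found locally that the master list does not mention.
    report.unexpected = sorted(set(found) - set(expected))
    return report


if __name__ == "__main__":
    # Illustrative data only; the file names are made up.
    master = {"billing.exe": "2.0.0.0", "core.dll": "1.4.0.0", "report.dll": "3.1.0.0"}
    local = {"billing.exe": "2.0.0.0", "core.dll": "1.3.0.0", "old_tool.exe": "0.9.0.0"}
    print(audit(master, local))
```

Keeping the comparison separate from the scanning means the same function can be pointed at production, disaster recovery and test machines alike.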

The results were fairly typical:

  • the spreadsheet was not up-to-date itself
  • version number mismatches, e.g. 2.00 instead of 2.0.0.0
  • file location differences – the wrong directory path specified
  • there were unexpected differences between the production and the disaster recovery sites

Master Control List

It takes detailed time and effort to go through a spreadsheet with a thousand or more entries and ensure that everything is accurate – but this is a vital step and needs to be done. Once the data is accurate it is usually fairly easy to maintain (including of course saved versions as you go to track changes).

Later on, you can look at how this is done and try and reduce the manual steps required to keep it up-to-date – but to start with it just needs to be made accurate!

The existence of the audit tool made it very easy to check an environment, find out which versions were where, and make a judgement as to what needed to be fixed. This in itself is a (possibly surprisingly) big win!

Automatic Deployment

Once you have such an audit tool, it is usually pretty easy to create an automated deployment tool – in this case it shared much of the same code, and the extra requirements were:

  • check the version found locally
  • if the file doesn’t exist, or the version is not correct, then extract the specified file from the repository (and check its version again!)
  • register DLLs if required
  • install services etc.
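The first two steps above might look something like the following Python sketch. It is an illustration under stated assumptions rather than the actual tool: `deploy` is a hypothetical name, and the two callables passed in stand for site-specific code (reading a local file's version, and extracting a given version from the repository). Registering DLLs and installing services are deliberately left out.

```python
def deploy(expected, read_local_version, fetch_from_repository):
    """Bring local files in line with expected {path: version}.

    read_local_version(path) returns a version string, or None if absent.
    fetch_from_repository(path, version) puts that version in place.
    Both callables are placeholders for site-specific code.
    """
    actions = []
    for path, want in sorted(expected.items()):
        have = read_local_version(path)
        if have == want:
            actions.append((path, "ok"))
            continue
        fetch_from_repository(path, want)
        # Check the version again after extraction, as recommended above.
        if read_local_version(path) != want:
            raise RuntimeError(f"{path}: expected version {want} after deploy")
        actions.append((path, "installed" if have is None else "updated"))
    return actions


if __name__ == "__main__":
    # An in-memory dictionary stands in for the local machine here.
    machine = {"billing.exe": "1.9.0.0"}
    wanted = {"billing.exe": "2.0.0.0", "core.dll": "1.4.0.0"}
    print(deploy(wanted, machine.get, machine.__setitem__))
```

Because the function only touches files that are missing or wrong, running it a second time on a correct machine should change nothing, which is what makes it safe to run routinely.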

The details of this will vary depending on the technology in use (in this case Windows with programs written in VB6 and various versions of VB.Net as well as C++)

There can be some extra complications for things like configuration data, e.g. there needs to be a registry entry with a particular key name, but the value of that entry will be specific to the name of the machine currently in use. But these are not difficult to solve.

Just getting the basic automatic deployment working is a huge win. Even if several steps are still required, focus on making this as easy as possible.

Testing a new change still required some manual steps to install the right versions of the files to be tested. But if you are doing this on a known clean baseline, this is not difficult.

Summary

I haven’t gone into all the details here, but hopefully the principles are clear:

  • you need to keep on top of your environments
  • manual processes typically don’t work
  • configuration identification is absolutely vital
  • don’t forget the “extras”: registry settings, services, database configurations and contents
  • look to automate as much as you can – saves vast amounts of time
  • you don’t need to automate everything in one go – pick the “quick wins” and go from there – but keep looking to improve!

Subversion to Perforce Migration Issues and Approaches

I would like to reflect on the lessons learnt from having recently completed the migration of 60 SVN repositories to Perforce for a client.

Migrations – Tarpits of Effort

The first thing to say is that migrations can take a lot of time and effort if you aren’t doing them regularly – like we do at VIZIM :) – perhaps not surprising! There are invariably differences in tools and techniques, and it takes time to plan for the issues.

If there is an off-the-shelf tool which does the migration for you, then it is well worth a try, but in most cases various trade-offs will have been made within the tool – and you may not be pleased with the results. This in itself will take time and effort to test and evaluate.

If you choose to “roll your own” then be very careful – it may seem easy to get started, but there are usually a lot of edge cases and issues along the way that will suck up your time and effort. How much is it worth becoming an expert in this through hard won experience, for a task that you are typically only going to do once?

Obviously as a consultant, I would say this, but consider bringing in help :)

Summary

For many situations SVN and P4 are very similar to each other, but there are a couple of key differences.

  • The easy bits were the basic adds/edits/deletes
  • SVN Revisions correspond (mostly!) straightforwardly with P4 Changelists
  • branching needs some intelligence applied due to the different underlying implementations – the naive approach has considerable problems (unfortunately it’s the approach used by the official Perforce migrator) – our approach/tool was dramatically faster than the official tool.
  • some SVN history (“kitchen sink revisions”) requires extra care and attention due to its complexity (e.g. in the same revision the user deletes a file and then replaces it with a branched copy – or vice versa, or deletes the parent directory, or – lots of other edge cases to consider…)

SVN vs P4 Branching

SVN tags and branches are the same thing – it is just by convention they live in /tags and /branches respectively.

svndumptool -v log <path to repo>

will produce a nice history showing the type of information such as:

------------------------------------------------------------------------
r157 | autobuild | 2007-06-05T13:11:44.145570Z | 1 line
Changed paths:
   A /tags/1.1-RC1 (from /trunk:156)

Milestone tag created: 1.1-RC1

An SVN tag or branch is just a reference to a from-path/from-revision pair. In Perforce, the exact analogy is a “dynamic” label – e.g.

Label:  1.1-RC1

Revision:   @156

View:
    //depot/trunk/...

where the Revision: field identifies the changelist.

This leads on to a major problem…

SVN O(1) branching vs P4 O(N) branching

SVN branches (or tags) are just references and can be created in constant time. In Perforce, if you branch 1,000 files, you end up with 1,000 records in db.integed table (and indeed a similar 1,000 records in db.rev as they are new revisions). There are two issues with this:

  • This metadata starts to mount up over time (large db.* files can degrade performance and increase backup time)
  • The time taken to create a branch is proportional to the number of files being branched – O(N) – and this can become noticeable for larger repositories and lots of users (two tables need to be locked while the branch is being created).

Imagine doing this for 10k files per branch, or 30k files, or … (servers can be locked for minutes).
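To get a feel for how this mounts up, here is a rough back-of-the-envelope model in Python. It simply takes the figures above at face value (one db.integed record and one db.rev record per branched file); the function name is hypothetical and real servers will differ in detail.

```python
def naive_branch_records(files_per_branch, branches=1):
    """Rough count of new metadata records if every branch copies every file.

    Assumes one db.integed row plus one db.rev row per branched file,
    following the figures quoted in the text.
    """
    return 2 * files_per_branch * branches


if __name__ == "__main__":
    for files in (1_000, 10_000, 30_000):
        one = naive_branch_records(files)
        many = naive_branch_records(files, branches=1_000)
        print(f"{files} files: {one} records per branch, {many} for 1,000 branches")
```

The growth is linear in both the number of files and the number of branches, which is why repositories with many tags suffer most.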

Poor Performance of the Naive Approach

The simple approach is to create a full Perforce branch for every SVN tag or branch. Depending on the number of branches, this can take quite a long time.

For example, migrating a small test repository (288 revisions) with the official Perforce Subversion migration tool took 5 minutes 6 seconds.

Our tool took 15 seconds…! (5% of the time)

In addition, the db.* files for the naive approach were significantly larger – 4MB vs 2.5MB.

The more tags you have, the worse the problem. We had several repositories with thousands of tags – for a test migration using the naive approach, we killed it after 24 hours. The db.* files were nearly 20GB at that point and the server was effectively thrashing and getting slower and slower. Our actual migration using the approach below took just over 3 hours for that repository.

Our Approach

There are a couple of reasons for the dramatic speed difference for our tool:

  • we read and parse SVN dump files – they contain the revision information and the contents of the files
  • we perform intelligent handling of tags automatically.

The intelligent handling of tags means that we defaulted to use Perforce “dynamic” labels as the equivalent of SVN tags (or branches) in the first instance. The tool only creates Perforce branches if a file is actually modified on the tagged branch (which can happen quite often in real life even if it is “supposed not to”!).
So the previous SVN history might also contain:

------------------------------------------------------------------------
r158 | autobuild | 2007-06-05T13:12:02.223695Z | 1 line
Changed paths:
   M /tags/1.1-RC1/ivy.xml

Recorded ivy.xml for 1.1-RC1

The problem, as you can see from the SVN log, is that a file is then modified on the “tag” branch. You can’t do this to a Perforce label. If you wish to replicate this type of history you must create a full Perforce branch for that tag and check the modified file in on that branch.

It is also frequently the case that tags in SVN are created and then later deleted. No problem if they correspond to the creation and deletion of labels in Perforce, but potentially very expensive with thousands of branched files. If you have a “spec” depot in Perforce then you have a full record of the label being created and deleted.
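The label-or-branch decision can be sketched as a single pass over the SVN history before anything is written to Perforce. The Python below is a simplified illustration and not the VIZIM tool: `plan_tags` and the tuple layout are hypothetical, it only understands the one-level /tags/<tagname> layout, and it ignores the edge cases discussed in this post.

```python
import re

# Only the simple one-level layout /tags/<name> is handled in this sketch.
TAG_PATH = re.compile(r"^/tags/([^/]+)(/.*)?$")


def plan_tags(changes):
    """Decide, per SVN tag, whether a Perforce label is enough.

    changes is an iterable of (revision, action, path, copied_from) tuples,
    where copied_from is (source_path, source_revision) or None.
    Returns {tag_name: {"kind": "label" or "branch", ...}}.
    """
    plan = {}
    for revision, action, path, copied_from in changes:
        match = TAG_PATH.match(path)
        if match is None:
            continue  # not under /tags, nothing to decide
        name, inside = match.group(1), match.group(2)
        if inside is None and action == "A" and copied_from is not None:
            # Tag created by copy: a dynamic label at the source revision.
            source_path, source_revision = copied_from
            plan[name] = {"kind": "label", "source": source_path,
                          "revision": source_revision}
        elif inside is None and action == "D" and name in plan:
            # Tag deleted later: remember when, whatever kind it became.
            plan[name]["deleted_at"] = revision
        elif inside is not None and name in plan:
            # A file changed underneath the tag: a label cannot record
            # that, so this tag needs a full Perforce branch.
            plan[name]["kind"] = "branch"
    return plan


if __name__ == "__main__":
    history = [
        (157, "A", "/tags/1.1-RC1", ("/trunk", 156)),
        (158, "M", "/tags/1.1-RC1/ivy.xml", None),
    ]
    print(plan_tags(history))
```

Reading the whole history first matters here: whether a tag can stay a cheap label depends on revisions that come after the one that created it.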

This approach has quite a few edge cases that need to be considered, including:

  • branching from tags – and checking files in
  • having multiple levels (not just /tags/<tagname>, but also /tags/sublevel/<tagname> etc) – how do you decide what to do?

Wrapping Up

Migration from SVN into Perforce is beneficial for many companies as their repositories grow in size and in number of users. The tools are sufficiently similar to make user acceptance and training very straightforward.

However, you do need to perform full-history migrations with some care – feel free to contact me for more details. VIZIM has full history migration tools for Subversion, ClearCase and CM Synergy to Perforce.

Keeping Track of your DVCS Development

One of the blogs I keep tabs on is Eric Sink’s, and his latest article, Obstacles to an enterprise DVCS, struck a chord. I was also interested in Martin Fowler’s somewhat related take: http://www.martinfowler.com/bliki/VersionControlTools.html

I personally think that the rise of DVCS (Distributed Version Control Systems just in case you’re wondering) such as Git and Mercurial is a really good thing. As Martin mentions, GitHub and BitBucket show the value of their project collaboration features.

I am quite happy for a bit of controversy to arise (Linus on “Subversion is Brain Dead!”) as it proves that the subject is important, and worth learning about (Oscar Wilde’s dictum “The only thing worse than being talked about is not being talked about”).

One of the points made by Martin is about the importance of process and making sure you are using your tool sensibly – he mentions ThoughtWorks teams using a local VCS on a daily basis and checking in to the corporate standard one once a week or so at times.

The advantages of a DVCS:

  • the ability to work easily remotely (“on a plane” etc)
  • lightweight branches and excellent merging capabilities – often rather better than centralised systems
  • day-to-day performance is typically very good because it’s local
  • distributed repositories make backup less of an issue

However, I think there are challenges to using DVCSs in corporate environments:

  • centralised systems are easier to control – not always a bad word! You may want only certain people to be able to view or update certain files (as an example I have done some consultancy for Camelot Group who run the UK National Lottery – as you can imagine, the audit requirements are fairly high. I have seen the names of modules called things like NotifyWinner.java – starts certain creative processes going in the mind!).
  • the visibility and communication of status – if checkins or branches are made on a central system then they are typically visible to other people, and can be tracked and reviewed, and if necessary chased up. Where is the “master copy” of code, or a particular release?
  • security of your assets – what happens if your whole code base on a cloned repository on someone’s laptop goes walkabout? Encryption can address some of this, but it is still a danger.

Some of these can be addressed by working practices, but the nature of DVCSs means you have to be careful. For example, there is typically a “master repository” for open source projects, updating of which is restricted to a few people.

The fit of DVCSs to open source development is easy to see – anyone can clone a repository and make changes, and those changes can be fed back and committed in a controlled manner (much better than sending patches).

However, I was interested to see the lament by James Bennett, Django’s release manager, mentioning the difficulties of tracking and coordinating development across larger numbers of developers, repositories and workflows: http://www.b-list.org/weblog/2010/feb/02/branching/ (see his lightning talk at PyCon). As he mentions, in Django 1.1 there was a feature scheduled for the release, but he lost track of it, and they slipped by a month.

Team Foundation Server Evolution and Principles

I was interested to browse information about Microsoft’s Team Foundation Server (TFS) 2010 and its newer features.

It made me think a little about how TFS has evolved and what is the comparison to Perforce. This is interesting because TFS shares a lot of principles with Perforce as regards the underlying model (integrations, branches etc) – not too surprising as Microsoft still uses their own re-badged version of Perforce called SourceDepot internally for some of their major projects (Windows and Office development). I doubt that TFS shares any code with Perforce/SourceDepot as Microsoft totally re-architected it on top of SQL Server and with different comms architecture etc.

It has been interesting to see how TFS has changed some of the functionality/terminology from the Perforce equivalents, and yet with similar underlying principles.

Perforce Branching and Merging

  • Perforce has the integrate command which will either create a new codeline from an existing codeline, or propagate changes between codelines.
  • Integrate has many options for source and target specifications, including wildcards and branch specifications (with complete set of view mappings), and options to selectively integrate specific changes
  • The -v option does not copy files to the workspace (makes creation of new codelines faster)
  • You need to run the resolve command after the integrate to decide how to merge changes, or whether to discard the source changes (but remember them as merged), or overwrite the target files, etc.

TFS Branching and Merging

  • TFS has the branch command which creates new codelines from an existing codeline, and the merge command which propagates changes.
  • TFS can only branch files or folders – it has no equivalent of branch specs
  • The branch command has a /noget which is the equivalent of integrate -v.
  • The merge command has a /discard option which is the equivalent of resolve -ay
  • The resolve command does merges

Thoughts

Based on my experience of training people (for Perforce in particular), Perforce has more power/more options, but these take longer to learn. TFS has considered Perforce’s options and simplified and moved some functionality to different commands.

For example, I quite like the way TFS splits the branch and merge commands – users tend to think of these as distinct operations. Under the covers, it is understandable why Perforce treats them as similar, since for each individual file, creating a copy on a new codeline is the same whether you are creating the whole codeline or have added a new file and are propagating that single add sometime after the codelines have been created.

TFS First Class Branches

The TFS simplification of only allowing files or folders to be branched (folders recursively) is typically what most Perforce users do anyway – otherwise it starts to get very complicated to understand your repository structure.

In particular, this has allowed them to implement First Class Branches (from Matt Mitrik’s blog):

we’re not reinventing the way that branching works in TFS.  Branches are still created in much the same way, our path spaced, early branching model hasn’t changed, and merging is fundamentally the same (see my post on slot mode).  What has changed is the presentation to the user, the representation of logical relationships, and the introduction of structured metadata.

Branches in the UI

Now that branches are treated differently from other folders in the system, we have a new icon to visually distinguish them in the UI.  In Source Control Explorer, branches will continue behave just like folders, meaning that they are still containers of files and folders, and all of the same actions are present – they can be branched, merged, deleted, etc…


Call for Papers for BCS CMSG and itSMF Conference 2010 Now Live!

Just a reminder to all that the Call for Papers is now live for the conference which will take place on 8 June 2010.

Title is:

Foundations for Success – Change, Configuration & Release Management
Optimising your Service Assets

We had a successful event last year in spite of the economic downturn. This year is about making sure that appropriate foundations are in place to support coming out of recession.

Look forward to those papers being proposed!

Hansel and Gretel’s Lessons for Safe Exploration (the Breadcrumbs of Version Control)

One of the things that I do quite frequently is download new packages and bits and pieces of code to play with on my local machine. I frequently start making local edits to perform local configuration changes, or perhaps try out the odd idea. Quite often these packages are downloaded, played with, and then discarded if they don’t meet my needs. But of course some of them end up with a permanent place in my tool chest, or installation.

Exploring with Bread Crumbs

I start getting twitchy if I start making changes to anything without checking them in somewhere as I go. It is all too easy otherwise to lose track of the changes you have made and quickly get yourself into a mess, losing lots of time and effort.

So, like Hansel and Gretel, I prefer to leave a trail of bread crumbs along the way so that I can explore safely (avoiding if possible my breadcrumbs being eaten by the birds!).

I view this as being able to explore while having the safety net of my saved versions stored away. Advantages:

  • I know what I have changed at any point (can show what’s changed and do diffs)
  • I have a bread crumb trail leading me back home to my initial clean configuration
  • I can revert to known working states if the current experiment goes wrong (e.g. accidentally delete chunks of text in the editor, or change several settings at once without appreciating their interactions)

Bread Crumb Trail Tools

My tool of choice for SCM is Perforce which is fast, effective and has lots of client tools. Having used it for many years as a consultant and trainer it is second nature to me and quick and easy to use, and rock solid.

And yet, for this particular type of work, I find it not always  ideal. Why?

  • I need to setup a client workspace – not difficult, but enough steps to still be annoying and to often result in me not doing it (requires making decisions on naming, default paths in the repository etc – seems like it could be easily automated, but just slightly too variable in requirements to make this easy)
  • The changes are “permanent” in my repository – even if the whole experiment turns out to have been a red herring (and yes of course I can obliterate stuff, but that’s another extra step)

Please note that if I really want to keep all my work – for example the package turns out to be of long term use – then I make the small extra effort and import into Perforce (together with third party codeline pattern etc to be able to track future changes as new releases are made etc.). At this point, it turns out that saving it “centrally” is very beneficial.

Bread Crumb Tool of Choice

My favourite current tool for this “temporary” bread crumb saving is Mercurial (hg).

The advantages for me:

  • It is very quick and easy (but also a full featured system should I ever need it)
  • The “repository” is saved in the .hg directory at the root
  • If I remove the whole tree then the repository goes too

I use the command line client to save an initial snapshot (from the root of the tree where I have extracted the package):

hg init .
hg add .
hg commit -m "Initial version of XXX as downloaded"

Subsequently I typically use the following subset of commands (from “hg help”):

  • add          add the specified files on the next commit
  • commit       commit the specified files or all outstanding changes
  • copy         mark files as copied for the next commit
  • diff         diff repository (or selected files)
  • help         show help for a given topic or a help overview
  • log          show revision history of entire repository or files
  • remove       remove the specified files on the next commit
  • revert       restore individual files or directories to an earlier state
  • status       show changed files in the working directory

TortoiseHg is also useful.

Conclusion

Perforce remains my tool of choice for most SCM related activity, but Mercurial is a very useful addition to my personal tool chest, and in particular in this type of scenario.

The main thing I would always encourage people to do: whatever you do, get into a habit of using a version control tool as often as you can! You will seldom regret it.

Flickr’s Flipping Flags

I was interested to see the post about Flickr’s developing new code on the mainline and using flags and flippers to control it.

Obviously an alternative to this approach is to use feature branches – developing longer-lived features off the mainline and only merging them back in when you want to “turn the feature on”.

As Ross Harmes says though, “This style of development isn’t all rainbows and sunshine.”

So what are some of the tradeoffs requiring consideration before adopting this approach?

Pros

  • Avoids multiple branches and thus merging – makes developer’s lives easier in that they know they are only in a single branch (mainline) and they need to keep that working
  • Single codebase – force people to “fix forwards” rather than “roll back” changes.
  • According to Ross, “Deploys become smaller and more frequent; this leads to bugs that are easier to fix, since we can catch them earlier and the amount of changed code is minimized.”

Cons

  • Lots of “if … else …” constructs complicate the code base. As he mentions, “after launching a feature, we have to go back in the code base and remove the old version (maintaining separate versions of all features on Flickr would be a nightmare)”
  • Makes testing more complex with all these configurations needing to be tested
  • If you are not careful, can lead to a certain amount of “copy and paste” coding as opposed to DRY style. This can require significant discipline to avoid.
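To make the trade-off concrete, here is a minimal Python sketch of a flag-guarded code path, together with a helper that enumerates the flag combinations a thorough test run would need to cover. This is not Flickr's code: the flag name, `render_photo_page` and `all_flag_combinations` are all hypothetical.

```python
from itertools import product

# Hypothetical flag; in practice this would come from configuration.
FLAGS = {"new_photo_page": False}


def render_photo_page(photo_id, flags=FLAGS):
    """Both versions live on the mainline; the flag picks one at run time."""
    if flags.get("new_photo_page", False):
        return f"new layout for photo {photo_id}"
    else:
        # The old version must be deleted once the feature has launched.
        return f"old layout for photo {photo_id}"


def all_flag_combinations(flag_names):
    """Yield every on/off combination: 2**n of them for n flags."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


if __name__ == "__main__":
    print(render_photo_page(42))
    print(render_photo_page(42, {"new_photo_page": True}))
    print(len(list(all_flag_combinations(["a", "b", "c", "d"]))))
```

Each extra flag doubles the number of configurations, which is why removing flags promptly after launch matters as much as adding them.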

As regards the “Deploys become smaller and more frequent”, this is possible to achieve with other methods, including task branches or the equivalent. Indeed I would say that making deploys smaller and more frequent is a desirable thing to do in any case.

The phrase “remove the old version” is very important – and implies a suitable amount of refactoring post release (and the discipline to make sure that happens). One of the things that I have learnt to appreciate more and more is the ability and the willingness to delete unneeded code (with the safety net of old versions being stored in my repository). You may comment it out line by line (a quick /* … */ around a block, while easy to do is very easy to miss when reading the code), but it’s amazing how quickly redundant code starts to slow you down and consume energy and brain cycles – especially for new people to that code. Keeping a clean code base through good refactoring has proven to be a very effective agile development practice for many people.

I have seen this flag approach used successfully in the past, but also seen it start to become a morass of spaghetti code when there are too many options, or when new features interact with each other (which is where “copy and paste” starts to become attractive, in spite of all its myriad failings and creation of maintenance problems). This is particularly true of “products” where there are lots of customisable options which need to be kept available to customers in the code base for years. Addressing this particular problem in a maintainable way requires a lot of thought and effort. Codebases in C/C++ with multiple #ifdefs can become a big problem if not very carefully managed. I remember seeing an excellent internal paper within Symbian on the half dozen potential approaches to introducing variability within their codebase for mobile phone systems (a harder problem than for many due to needing to provide source to manufacturers – so code changes are visible to others). The options ranged from #ifdef (very seldom considered as valid) to modular components to ordinary conditional statements.

A similar danger story was a company producing financial software for 15 different customers, with multiple different options, all from the same “codebase”, and indeed from the same workspace/development environment  (it had a database and 4GL involved which made it hard to create separate workspaces). If a customer wanted a new feature, they got it, but they were also forced to take 20 other features or bug fixes that had been checked in on the mainline and were requested by other customers. It often took weeks to stabilise things. Luckily things have improved substantially since then for that particular company.

This whole area has resulted in approaches and tools to address the problem – search for “software product lines” and “variant management” for some links. A friend of mine, Mark Dalgarno, runs Software Acumen, which consults and sells products in this arena.

In the Flickr case, it would be interesting to get some metrics as to how many “features in progress” they have on the go at any one time, and how their development team tends to be split amongst features  (how many people per feature).

In the absence of such metrics to help people really understand the issues fully, I hope that the Flickr post won’t be a siren call which lures unsuspecting software developers on to a new set of rocks. As always, learn from others but make sure they are solving the same problem that you have to solve.

Review of BCS Agile SCM Event on 24 November 2008

This is a brief review of the BCS CMSG Agile SCM event on 24th November 2008.

Please note that the presentations are on line from the link above. Below are some extra notes to go with the presentations.

All in all a very good and informative day!

Michael Azoff (The Butler Group) – The Agile Difference

Presentation here

Michael did a good job of setting the scene and introducing various terms around agile etc.

He talked about the wider context – and how people need to stay engaged in projects. Quoted an example of a US Healthcare provider where management did not get sufficiently involved in the new billing system because it was not “exciting enough”. The company had sufficient problems that it lost 2/3 of its value, and ended up having to pay out $200m to clients etc.

Such things promote the need for a dashboard – simple summary of project status with the ability to drill down. Data is always available.

Regarding SCM he noted nothing in the indexes of various books on Agile methods that relates to SCM – an oversight!

However, came up with a couple of useful links:

Richard Erwin (Microsoft) – SCM and Agile Practices within Microsoft Developer Division

Presentation here

Another excellent talk and the slides well worth reading. Interesting to see Microsoft’s “dog fooding” of their own product and the benefits and improvements of their process between Visual Studio 2005, 2008 and 2010 (already out in previews).

Some points:

  • Most companies are using ad hoc processes, and having problems getting a handle on all the information that is coming in. Showed off some of the dashboards and similar feedback options within VSTS.
  • Within Microsoft there are 4 major divisions:
    • Windows
    • Office
    • Developer Tools
    • SQL Server
  • Their largest instance is 17m source files, 500k items, 3 TB data
  • VSTS 2010 has been “dog fooding” since 2007. Problems don’t go unnoticed as a result – he mentioned the great summer ‘08 outage!
  • There is a slide showing usage of VSTS within the various divisions. Developer Tools is completely VSTS. However Windows and Office are still using SourceDepot for version control (although VSTS is being used for planning and bug tracking). Richard responded to one question that the internal tools (SourceDepot) were “not available” outside Microsoft. On further prompting, he admitted that SourceDepot was a rebadged version of Perforce – based on the source code that Microsoft bought in 1999 – one of those “industry secrets” that has leaked out over the years!
  • A successful import from the Office division is the concept of “Feature Crews”
  • Concept of Bug Debt – carry no debt in feature model. One of the slides shows the burn down chart from 25,000(!) bugs for VS 2005 – they did get it close to zero for release, but life was hellish! The comparable chart for VSTS 2008 is much flatter.
  • Not surprisingly the cost of hotfixes or service packs is massive – it’s a major incentive to avoid those – hence major quality gates when teams push code back to the mainline.

Branching Model

It’s a fairly standard Mainline model with large team branches. There’s a huge amount of automated testing. Teams take it in turns to “publish” to the Mainline, and have to pass very significant quality gates to do so.

Andrew Tunnicliffe (London Underground) – Towards Visualizations of Configuration Management

Presentation here

Andrew wanted to leave us with the message – “Configuration Management isn’t just for Christmas!”

Some notes:

  • His presentation was a little different in that it looked at the complexities of managing what according to the Royal College of Engineers is the most complex system in the UK.
  • Software is very much only one element – in addition to hardware, staff/people are vital. This includes training needs etc – you can’t introduce new stuff without it. Takes 2 years to train 100 drivers. How do you manage change in that time?
  • In one morning they went from 4 trains to 5 trains per period – this is a step change of 25% in one go – you can’t introduce 10% of a new train :)
  • S/w control and signalling all changing.
  • He called it safety critical agile.
  • Of course in a public project of this size, there are huge political pressures to deliver.
  • A lot of the value in visualisation is for people at the edges rather than those in the trenches. Train driver just needs to know that something has changed.

The examples he showed us are a tool based around MS Project with various extra attributes, and a Network diagram tool. It can also publish to the web.

Interesting to see how something relatively simple and straightforward could make a significant difference to acceptance by people – the CM information is much more easily visible.

Finbarr Joy (Upco) – All Things to All Men: Keeping it simple?

Presentation here

Finbarr wanted to avoid any perception of giving a “sales pitch”, in spite of Upco’s sponsorship of the event – paid for a nice lunch!

The main focus of his talk was around build and deployment frameworks – making them easy to set up, customize and manage.

He mentioned the experience of Upco in doing development for their clients and wanting to scratch their own itch and improve their processes. The only drawback for me was the lack of specifics in the talk – it would have been useful to see some before and after figures showing how this approach has worked in practice. A couple of useful nuggets:

  • Is your change management really change prevention – or perceived as that?
  • Projects change a standard pattern many times which leads to incompatible processes and much rework.

He finished by pointing at the open source project Kundo which is well worth checking out:

Kundo provides a structured, convention based approach for Java builds. Kundo has a pluggable, extensible architecture; it harnesses the power and flexibility of Groovy and Ant to provide a highly configurable Java build framework.

Kundo achieves this flexibility with a plug-in architecture that attaches behaviours (provided by Kundo plug-ins) to build lifecycle phases. Kundo consists of a kernel and a set of foundation plug-ins. The kernel is responsible for the orchestration of the multiple collaborators within the build system.

Conceptually similar to the approach taken by the Apache Maven project, the Kundo implementation is simpler (the kernel library jar file is ~ 60Kb) and, in our humble opinion, offers greater flexibility; if you want to simply wire an Ant into a buildfile and use it, you can. Build lifecycles are defined within a build ‘recipe’. A recipe declares the plug-ins required to perform a build. Each Kundo plug-in, much like a Maven plug-in, encapsulates a discreet set of build management (or deployment, release management etc etc) logic and has its own versioning/release cycle.
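To make the idea of attaching plug-in behaviours to lifecycle phases more concrete, here is a minimal Python sketch. It is not Kundo's API (Kundo itself is built on Groovy and Ant); the names `Kernel`, `register`, `run`, `build_from_recipe` and the phase names are all invented for illustration, and the "recipe" is simply a list of (phase, behaviour) pairs.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical lifecycle; a real build framework defines its own phases.
PHASES = ["clean", "compile", "test", "package"]

Behaviour = Callable[[dict], None]


class Kernel:
    """Orchestrates plug-in behaviours attached to lifecycle phases."""

    def __init__(self) -> None:
        self._behaviours: Dict[str, List[Behaviour]] = defaultdict(list)

    def register(self, phase: str, behaviour: Behaviour) -> None:
        """Attach a behaviour to a known phase."""
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self._behaviours[phase].append(behaviour)

    def run(self, target: str, context: dict) -> None:
        """Run every phase up to and including `target`, in lifecycle order."""
        if target not in PHASES:
            raise ValueError(f"unknown phase: {target}")
        for phase in PHASES[: PHASES.index(target) + 1]:
            for behaviour in self._behaviours[phase]:
                behaviour(context)


def compile_plugin(context: dict) -> None:
    # Stand-in for real compilation work.
    context.setdefault("log", []).append("compiled sources")


def run_tests_plugin(context: dict) -> None:
    # Stand-in for running a test suite.
    context.setdefault("log", []).append("ran unit tests")


def build_from_recipe(recipe: List[Tuple[str, Behaviour]], target: str) -> List[str]:
    """Wire up the plug-ins a recipe declares, then build up to `target`."""
    kernel = Kernel()
    for phase, behaviour in recipe:
        kernel.register(phase, behaviour)
    context: dict = {}
    kernel.run(target, context)
    return context.get("log", [])
```

The point of the design is that the order of work comes from the lifecycle, not from the order in which the recipe lists its plug-ins, so a project can add or swap a plug-in without rewriting the build sequence.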

Sean Cody (Bank of America) – SCM – the Agile Keystone

Sean’s talk has unfortunately not yet had the slides approved to be released externally. It was an excellent talk and well received.

He talked about some of the background and issues of both SCM (software configuration management) and Agile.

In particular for agile:

  • Not always clear what it means
  • Is it a project management approach or just technical practices?
  • Fragilism?!
  • Image problem:
    • Hard to manage and control
    • Too many cowboys
    • An excuse for no documentation
    • “Decline and Fall of Agile”

He used the metaphor of the keystone which is vital to arches, domes and other architectural designs. A lack of good SCM renders Agile ineffective.

Key components for Bank of America:

  • Story management server
  • SCM server
  • Continuous Integration server
  • Testing server
  • Wiki for publishing results

Story management is the most important function for users. The key thing is to track the evolution of a story from inception to release. This means the following are all tied to stories (with no exceptions):

  • All SCM check-ins
  • Build records
  • Testing records

The business can see the details of the stories and indeed click through to view all aspects of development should they so wish. This visibility allows development to prove its value to the business. It also makes for easy traceability – audit is not a problem.
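As a rough illustration of the "no exceptions" rule, the sketch below rejects any record whose description does not mention a story, and keeps a per-story history that could be shown to the business. This is not Bank of America's tooling: the `STORY-123` ID format and the names `story_for` and `StoryLedger` are assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Assumed story ID format for illustration, e.g. "STORY-123".
STORY_ID = re.compile(r"\bSTORY-\d+\b")


def story_for(description: str) -> Optional[str]:
    """Return the first story ID mentioned in a description, or None."""
    match = STORY_ID.search(description)
    return match.group(0) if match else None


class StoryLedger:
    """Ties check-ins, build records and test records to stories."""

    def __init__(self) -> None:
        self._records: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def record(self, kind: str, description: str) -> str:
        """Store a record under its story; refuse records with no story."""
        story = story_for(description)
        if story is None:
            raise ValueError(f"{kind} rejected: no story ID in {description!r}")
        self._records[story].append((kind, description))
        return story

    def history(self, story: str) -> List[Tuple[str, str]]:
        """Everything recorded against a story, in the order it happened."""
        return list(self._records.get(story, []))
```

In practice a check like `story_for` would sit in a check-in trigger or hook on the SCM server, so the rule is enforced at the point of submission and not audited afterwards.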

Sean is keen on tools that are easy to integrate, even though customisation comes with the cost of maintenance. For the SCM tool the following are mandatory:

  • Atomic check-ins
  • Change history
  • Performance
  • Reliability
  • Simple to use

These aren’t new requirements – but they are even more important for Agile development. They (together with the framework) help address some of the criticisms of Agile:

  • Lack of structure
  • Lack of documentation
  • Management of scope creep
  • Meeting enterprise change requirements
  • Meeting audit and regulatory requirements

He left with a word of warning: cultural change is hard to implement – tools can only do so much – your people are important!

Custom Perforce GUI Installers (for large sites)

For quite a while there has been an option to create your own custom installer for Perforce deployments (including automation). This can be particularly useful for larger sites with lots of new users coming along regularly.

The customizable element of the scripted installer is a configuration file named perforce.cfg. The Perforce Administrator edits this file, and places it on a shared network drive along with the expanded contents of the perforce.zip file. Perforce client programs can then be installed by running setup.exe from each desktop.

One of the annoyances with this approach is that you have to expand the .zip file and put things on a network drive. It is not easy to do this sort of thing within a single executable for example (which makes it hard to put on your intranet and get users to double click).

Well, having played around with various installers and options, I have discovered a fairly simple way of doing this via a self extracting executable (SFX), and the open source 7-Zip program.

Instructions

Create a directory structure:

  • some root dir
    • 7-zip – contains a batch file, a couple of config files and perforce.cfg – see below for contents.
    • p4winst – contains expanded p4winst.zip (without the perforce.cfg)
    • p4vinst – contains expanded p4vinst.zip (without the perforce.cfg)

Currently (as of 2007.2) the perforce.cfg file is the same for both P4Win and P4V.

You need to get a couple of files from 7-zip.org/download.html

Edit perforce.cfg to customize for your installation.

Create a p4winst.conf and p4vinst.conf (which must both be UTF-8 – Notepad can save this format, as can Notepad++ or other editors) in that directory.

Run make_installers.bat from the command line in the appropriate directory and check that the following are created (which you can put on an intranet or whatever):

  • p4winst.exe
  • p4vinst.exe

You can then send your users a single executable which unzips itself and automatically runs the perforce setup.exe with the included perforce.cfg – QED!

make_installers.bat

This contains something like the following:

:: Batch file to package up P4Win and P4V using their .zip files and the freely
:: available 7-Zip program (which can create SFX - Self Extracting Archives).

@echo off

:: May need to customize the following (no trailing backslash)
set "ZIP_DIR=d:\apps\7-zip"

call :make_install p4winst
call :make_install p4vinst
goto :exit

:make_install

:: Quoting the whole assignment keeps stray trailing spaces out of the value
set "INSTALLER=%~1"

:: First create new clean versions of the zip files (including current directory version of perforce.cfg in preference)
if exist %INSTALLER%.7z del %INSTALLER%.7z
if exist %INSTALLER%.exe del %INSTALLER%.exe

"%ZIP_DIR%\7z" a %INSTALLER%.7z ..\%INSTALLER%\* perforce.cfg
:: Copy SFX file plus config file plus zipped installer into a single SFX .exe
copy /b "%ZIP_DIR%\7zsd.sfx" + %INSTALLER%.conf + %INSTALLER%.7z %INSTALLER%.exe
if not errorlevel 1 goto :EOF
if exist %INSTALLER%.exe goto :EOF

echo *** ERROR: failed to create %INSTALLER%.exe
goto :exit

:exit

p4winst.conf (and p4vinst.conf)

These are simple UTF-8 format files with contents similar to:

;!@Install@!UTF-8!
Title="P4Win Custom Installer"
BeginPrompt="Do you want to install P4Win?"
RunProgram="setup.exe"
;!@InstallEnd@!
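The `copy /b` step in make_installers.bat is nothing more than a binary concatenation of three files in a fixed order: the SFX module, the config file, then the .7z archive. For anyone who would rather script that step outside a Windows batch file, here is a minimal Python sketch of the same concatenation. The function name `build_sfx` is made up for this example, and the check for the install markers is an extra sanity check that the batch file does not perform.

```python
from pathlib import Path

# Markers that open and close the config block, as in the .conf files above.
CONFIG_START = ";!@Install@!UTF-8!"
CONFIG_END = ";!@InstallEnd@!"


def build_sfx(sfx_module: Path, config: Path, archive: Path, output: Path) -> int:
    """Concatenate SFX module + config + .7z archive into one executable.

    Returns the number of bytes written to `output`.
    """
    config_bytes = config.read_bytes()
    # Raises UnicodeDecodeError if the config was not saved as UTF-8.
    text = config_bytes.decode("utf-8")
    if CONFIG_START not in text or CONFIG_END not in text:
        raise ValueError(f"{config} is missing the install markers")
    # Order matters: module first, then config, then the archive.
    data = sfx_module.read_bytes() + config_bytes + archive.read_bytes()
    output.write_bytes(data)
    return len(data)
```

A call such as `build_sfx(Path("7zsd.sfx"), Path("p4vinst.conf"), Path("p4vinst.7z"), Path("p4vinst.exe"))` would mirror what the batch file does for P4V, assuming the .7z archive has already been created.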

Subconf 2007 – Subversion and the Enterprise

This article is a result of my visit to Subconf in Munich a couple of weeks ago.

The first thing to say is that Subconf was very enjoyable and interesting and my compliments to the organisers. There were some excellent speakers and it was good to chat to various vendors too. I also had a very good time at the dinner that evening – the only slight fly in the ointment was the S-Bahn strike the following morning which meant a rather earlier taxi to the airport than I might have wished!

The two keynotes by Brian Behlendorf and Karl Fogel respectively were particularly good. Indeed it was good to get them at the conference since both guys seem to be scaling down their Subversion involvement – Brian has stepped down as CTO at Collabnet and Karl is running QuestionCopyright.org (although he remains President of the Subversion Corporation).

Some of the slides are now available online.

Brian’s Keynote

Brian mentioned his involvement in Apache and some of the lessons learnt about creating a community that worked. These lessons were then applied by Collab.net to get Subversion going. If you create a high quality community, you will create high quality software.

  • Need to be nice to people to avoid a “fork” – Development leaders exhibit good communication skills, and can bring different ideas together.
  • Conscious effort to bring new developers along the path: from “consumers”, to bug reporters, to patch submitters, to active contributors.

During his talk Brian was asked about Linus’s somewhat inflammatory video about Git, in which various sideswipes are made at brain-dead CVS and Subversion people. Brian had no particular axe to grind about this and said that the Subversion camp bore no ill will to the distributed crowd and were rather surprised at some of the venom coming back – a very measured and mature response.

Brian mentioned the difference between a centralised model and a distributed model – there is a lot that enterprises like about the control of a centralised model. For example, laptops get stolen (he lost his a few weeks ago), and a laptop might contain a whole distributed repository. Enterprises prefer more control!

Martin Doettling’s Intro to Karl

The key points of this were:

  • Estimated user base now exceeds 2 million
  • 10x growth since 1.4
  • Large numbers of enterprise users

Karl’s Keynote - How Stuff Happens

Karl picked up on some of Brian’s points about the community, and showed various ways in which they have been able to get hackers to create software that the enterprise can use!

  • There is a very comprehensive guide to how to contribute – 43 pages of it! This is where people start.
  • Some principles:
    • Make it easy to do things right
    • Make it rewarding to do things right
    • Influence proportional to effort
      • Moving from contributor to partial-committer to committer
      • Tracking number of patches etc by user – all automated with links (see Contribulyzer)
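The automated tracking of patches per contributor can be pictured with a small Python sketch. This is not the Contribulyzer itself: it assumes that each log message credits a patch author on a line of the form `Patch by: Some Name`, and the function name `count_patches` is invented for the example.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed convention: a log message credits a contributor on its own line,
# e.g. "Patch by: Some Name". Other credit lines are ignored here.
PATCH_BY = re.compile(r"^Patch by:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def count_patches(log_messages: Iterable[str]) -> Counter:
    """Tally how many patches each contributor is credited with."""
    tally: Counter = Counter()
    for message in log_messages:
        for name in PATCH_BY.findall(message):
            tally[name] += 1
    return tally
```

Fed with the log messages of a repository, a tally like this gives the kind of per-person record that makes "influence proportional to effort" something the existing committers can actually check.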

Other Notes

I couldn’t stay for the second day, but some other points of interest:

  • There are increasing numbers of tools based around Subversion – Collabnet, Polarion, CodeBeamer etc
  • Subversion command line is increasingly irrelevant as people use it through Tortoise, Eclipse etc.
  • Merge tracking there at long last with 1.5 just around the corner
  • Really being used in the industry
  • SVK provides an interesting distributed method

I also had some chats with Collabnet around the idea of perhaps the BCS CMSG working with them to put on an event in London next year – definitely something to look into.

Well worth a visit as a conference.