One of the things that comes up just about every time I do some Perforce-related consulting for a client is repository structure – how do you map some particular branching scheme (combination of branching patterns) onto the repository?
This is of course necessary, since Perforce uses the repository structure to represent branches as well as other directory structures. This way of recording branches is a feature one might occasionally wish were implemented otherwise, but it does have its attractions, and it is the same model as used in Subversion and in Microsoft’s new Team Foundation Server product. (As an aside, it is perhaps not surprising that Microsoft chose this model, since it is a not very well kept “secret” that their in-house SCM tool, SourceDepot, introduced during the development of Windows 2000, was a re-badged copy of Perforce. Team Foundation Server is obviously a total rewrite on top of SQL Server, but they seem to be comfortable with the “branch in path space” model.)
Some clients need help working out a good branching scheme, and some have relevant experience with other tools so are comfortable with that part, but many need help with the mapping. I have covered a little of this in my article Introduction to Perforce Branching, but this article expands on that.
Because of their flexibility, I generally recommend that people use branch specs (e.g. created by the “p4 branch” command) for branching, since you can add view mappings to deal with renames that would otherwise not propagate, and to branch extra components. However, if you can keep the branch spec to a single view mapping then that is great! (Note that the P4V GUI allows integration via drag-and-drop – sometimes convenient, but it can be somewhat dangerous…).
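As a rough illustration of what such a branch spec contains, the sketch below renders the text of a spec form from a list of view mappings. The helper name render_branch_spec, the spec name and the depot paths are all made up for this example, and only the Branch, Description and View fields are filled in; treat it as a sketch rather than a complete form.

```python
def render_branch_spec(name, mappings, description="Created by script."):
    """Return branch spec text with one View line per (source, target) pair.

    Hypothetical helper: the field layout follows the form that
    `p4 branch -o` prints, reduced to three fields for brevity.
    """
    lines = [
        f"Branch:\t{name}",
        "",
        "Description:",
        f"\t{description}",
        "",
        "View:",
    ]
    for source, target in mappings:
        lines.append(f"\t{source} {target}")
    return "\n".join(lines) + "\n"


# Ideal case: a single view mapping from the main line to a release branch.
spec = render_branch_spec(
    "main_to_rel_01_00",
    [("//depot/main.br/...", "//depot/rel_01_00.br/...")],
)
print(spec)
```

Text like this could be fed to "p4 branch -i" and then used with "p4 integrate -b main_to_rel_01_00". Extra mappings for renamed files or additional components would simply be further pairs in the list.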
Guidelines
A repository branch structure is about creating an appropriate information hierarchy (designing an information architecture). So here are some guidelines that I have found useful over the years and have seen used successfully in many organisations (and I certainly didn’t invent any of them!). Maybe in the future I can come up with some appropriate pattern-like descriptions, but for now, here they are:
- Standardise naming conventions – make branch directories obvious
- Keep branch spec mappings, client workspace spec mappings and permissions as simple as possible
- Make it as easy as possible for users (new and experienced) to visualise the branching scheme as they click down through the hierarchy
- Find the right balance between broad & shallow vs. narrow & deep hierarchies
- Keep branches of a similar type at similar depths in the structure
- Make branch directories sort sensibly
- Plan for the future: 2, 3, 5 years down the line – include things like the year (and optionally month) somewhere in the path to provide a natural sorting.
- Provide Guidelines and Delegate!
- Remember – there is no one best solution, but there are many good enough solutions!
Some of what comes below is also covered by Laura Wingerd in her recent book “Practical Perforce” (O’Reilly) – you really need to get this book if you are responsible in any way for Perforce usage at your company! Her quote which sums things up nicely is:
“There is no reason that the repository structure should match the release lineage, but when it doesn’t it causes confusion to new users [and cognitive dissonance to all users].”
Now while I agree with the intent of what Laura has written, I find her suggested option is perhaps a little too restrictive, and it doesn’t take into account organisations with many branches at a particular level (and indeed many different types of branches).
Standardise Naming Conventions – Make Branch Directories Obvious
Laura gives a very simple example of a naming convention in her book – just use uppercase for the branch component. Thus she has:
//depot/MAIN/...
//depot/REL1/...
//depot/SOME_PROJECT/...
//depot/ANOTHER_PROJECT/...
This is simple and can work very well (although it rather implies that you don’t use uppercase for any other paths, which might be unnatural in certain companies for certain items). An alternative is to append (or prepend) something to the name indicating that it is a branch directory, e.g. “.branch” or “-branch”, or “.br” for those who don’t like typing! Thus the above becomes:
//depot/main.br/...
//depot/rel1.br/...
//depot/some_project.br/...
I am happy to hear other suggestions!
Sensible Sorting
Directory (and thus branch) names should sort sensibly at any point in the hierarchy, thus
- use leading zeros for numeric components, such as 01 instead of 1 if you are going to have 10 or more, or indeed pad to 4, 5 or 6 digits where necessary
- reverse the order of dates using yyyy-mm rather than mm-yyyy
Examples:
//depot/main.br/...
//depot/rel_01_00.br/...
//depot/rel_02_00.br/...
//depot/rel_02_01.br/...
//depot/rel_02_02.br/...
//depot/dev/2007_06_some_project.br/...
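The two sorting rules above are easy to build into a small naming helper. The sketch below assumes the “.br” suffix and the //depot layout used in the examples; the function names release_branch and dated_branch are made up for illustration. It also shows why the padding matters: sorted as plain text, an unpadded release 10 lands ahead of releases 1 and 2.

```python
def release_branch(major, minor, width=2):
    """Build a release branch path with zero-padded numeric components."""
    return f"//depot/rel_{major:0{width}d}_{minor:0{width}d}.br/..."


def dated_branch(year, month, project):
    """Build a development branch path prefixed with the date as yyyy_mm."""
    return f"//depot/dev/{year:04d}_{month:02d}_{project}.br/..."


# Plain text sorting of unpadded versus padded names.
unpadded = sorted(["rel_10_0", "rel_2_0", "rel_1_0"])
padded = sorted(["rel_10_00", "rel_02_00", "rel_01_00"])

print(release_branch(2, 1))
print(dated_branch(2007, 6, "some_project"))
print(unpadded)
print(padded)
```

The width parameter is there so that a team expecting hundreds of releases can pad to three or more digits from the start, rather than renaming branches later.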
Broad & Shallow vs. Narrow & Deep
At any one level the number of choices (sub-directories/branches) should be appropriate and naturally limiting (note that the structure needs to be as usable in several years’ time as it is now). For example, a structure which ends up with thousands of branches in a single directory will become unwieldy over time. Divide these branches into sub-directories using appropriate naming (and sorting). For example, if you wish to create lots of branches for individual change requests then the obvious structure is very wide and shallow at some point in your repository:
~/rel
    /CR000001/...
    /CR000002/...
    :
    /CR000099/...
    /CR000100/...
    :
    /CR009100/...
    :
Instead of that, try:
~/rel
    /CR0000xx
        /01/...
        /02/...
        :
        /99/...
    /CR0001xx
        /01/...
        :
    /CR0091xx
        /01/...
        :
This naturally splits things up and also sorts at all levels.
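A script that creates change request branches can compute the bucketed path directly from the change request number. This is a minimal sketch assuming six-digit CR numbers split into buckets of 100, as in the layout above; the function name cr_branch_path is hypothetical, and “~/rel” is the same placeholder root used in the example.

```python
def cr_branch_path(cr_number, root="~/rel"):
    """Map a change request number to its bucketed branch path.

    Assumes six-digit CR numbers: the first four digits select the
    CRnnnnxx bucket and the last two digits name the branch inside it.
    """
    if not 0 <= cr_number <= 999999:
        raise ValueError(f"CR number out of range: {cr_number}")
    bucket, leaf = divmod(cr_number, 100)
    return f"{root}/CR{bucket:04d}xx/{leaf:02d}/..."


for number in (1, 2, 199, 9101):
    print(cr_branch_path(number))
```

With this split no directory ever holds more than 100 branches, and a full six-digit range needs at most 10,000 buckets, which could be divided again in the same way if that proved too wide.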
As a corollary of the above you should split things up even if they don’t have a “natural” dividing component. One alternative is to use the date as an element of the path (or even a prefix) as YYYY or YYYY-MM to give some sort of natural sorting by date. Examples being:
~/2006-01/some_project/...
Or without introducing another level but using a prefix (which is often just a matter of personal preference):
~/2006-01-another_project/...
In 2 or 5 years’ time it should be obvious to any user which are the “new” branches, where they are likely to be spending most of their time, and which are the “old” branches, which can typically be ignored.
Hiding or Retiring Old Branches
It is possible to “hide” old branches by removing read permissions on those repository paths (leaving them visible to superusers or perhaps an “archive” user). There are some potential performance implications to doing a lot of this (the more lines you add to the protections table, the more work the server has to do for commands).
Often the easiest way is just to make them easily ignored via an appropriate hierarchical naming convention as shown above.
Provide Guidelines and Delegate Responsibility
Larger organisations may have a number of teams or units each of which is responsible for products with sometimes quite different lifecycles and thus branching schemes. In these cases it makes sense to set the guidelines and principles and delegate to the teams the work of defining the precise conventions to be used.
Automation
If your repository structure is simple enough and regular enough then it is easy for tools or scripts to:
- create branches (perhaps with input from a user), including adding a mapping to the current client workspace view
- validate, via triggers, that branches are being created in the correct place (e.g. don’t allow a branch with the wrong naming convention to be created in certain directories or levels of the repository). This prevents user error.
- work out the relationship between branches automatically and thus be able to produce simple reports for things like changes that need to be propagated etc. While this is often done by means of some configuration file (stored in the repository), and indeed this is sometimes necessary anyway, it is easier if it can be automatically deduced from the structure (so that a new branch “popping up” is automatically included into the reporting mechanism).
- make your protections table easier to maintain
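To give a flavour of the kind of check a trigger or reporting script might make, here is a minimal sketch that classifies a path against a set of naming rules. The rules themselves (BRANCH_RULES) and the function name classify_branch are assumptions based on the example paths earlier in this article, and the wiring into an actual Perforce trigger is deliberately left out.

```python
import re

# Hypothetical naming rules, modelled on the example paths in this article.
BRANCH_RULES = {
    "main": re.compile(r"^//depot/main\.br$"),
    "release": re.compile(r"^//depot/rel_\d{2}_\d{2}\.br$"),
    "dev": re.compile(r"^//depot/dev/\d{4}_\d{2}_[a-z0-9_]+\.br$"),
}


def classify_branch(path):
    """Return the branch type for a branch root path, or None if it breaks the rules."""
    # Accept either "//depot/x.br/..." or "//depot/x.br".
    root = path[:-4] if path.endswith("/...") else path.rstrip("/")
    for branch_type, pattern in BRANCH_RULES.items():
        if pattern.match(root):
            return branch_type
    return None


for candidate in ("//depot/rel_02_01.br/...", "//depot/rel_2_1.br/..."):
    print(candidate, "->", classify_branch(candidate))
```

A validation script would reject anything that comes back as None, while a reporting script could use the returned type to decide which branches feed into which, so that a new branch following the convention is picked up without any configuration change.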
There is No One Best Design!
And remember that this is always true – there are always many ways that are good enough, and the final choice often comes down to personal preference.
However, every type of branch should have a “home”: your CM Plan should have simple guidelines so that a user is never at a loss as to where to create a branch of a given type. This requires identifying the types of branches likely to be used.
You need to do enough thinking and planning to get at least 80% of it right. It is of course possible to re-structure your repository (by branching from the old name to the new name and deleting the old within the same atomic changelist). However, this can cause you quite a few problems if the branch you are re-structuring has relationships with lots of other branches (e.g. release branches being maintained).
People who are aware of the issues and have experience can slot in a new branch type fairly easily (though they can still end up making sub-optimal choices that they regret).
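The branch-and-delete approach to re-structuring can be sketched as a short command plan. The function below only builds the list of commands and does not run anything. Its name and the example paths are hypothetical, and it assumes a client workspace that maps both the old and new locations with the old files synced.

```python
def restructure_commands(old_root, new_root, description):
    """Build the p4 commands that move a branch within one changelist.

    Both the integrate and the delete open files in the same pending
    changelist, so a single submit makes the move atomic.
    """
    old = old_root.rstrip("/") + "/..."
    new = new_root.rstrip("/") + "/..."
    return [
        ["p4", "integrate", old, new],  # branch old name to new name
        ["p4", "delete", old],          # open the old files for delete
        ["p4", "submit", "-d", description],
    ]


plan = restructure_commands(
    "//depot/rel1.br",
    "//depot/rel_01_00.br",
    "Rename rel1.br to rel_01_00.br",
)
for command in plan:
    print(" ".join(command))
```

Printing the plan first makes it easy to review before anything is run. It does nothing about branch specs or other branches that still refer to the old path, which is where the problems mentioned above tend to come from.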



