One of the things that comes up just about every time I do Perforce-related
consulting for a client is repository structure: how do you map a particular
branching scheme (a combination of branching patterns) onto the repository?
This is of course necessary because Perforce uses the repository structure to
represent branches as well as other directory structures. This way of recording
branches is a feature one might occasionally wish were implemented otherwise,
but it does have its attractions, and it is the same model used in Subversion
and in Microsoft's new Team Foundation Server product. (As an aside, it is
perhaps not surprising that Microsoft chose this model, since it is a not very
well kept "secret" that their in-house SCM tool, SourceDepot, introduced during
the development of Windows 2000, was a re-badged copy of Perforce. Team
Foundation Server is obviously a total rewrite on top of SQL Server, but they
seem to be comfortable with the "branch in path space" model.)
Some clients need help working out a good branching scheme, and some have
relevant experience with other tools so are comfortable with that part, but many
need help with the mapping. I have covered a little of this in my article
Introduction to Perforce Branching, but this article expands on that.
For their flexibility, I would generally recommend using branch specs (created
with the "p4 branch" command) for branching, since you can add view mappings to
deal with renames that are not propagated and to branch extra components.
However, if you can keep the branch spec to a single view mapping then that is
great! (Note that the P4V GUI allows integration via drag-and-drop, which is
sometimes convenient but can be somewhat dangerous.)
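To show how small a single-mapping branch spec can be, here is a minimal Python sketch that renders the text of one. The helper name `render_branch_spec`, the spec name and the depot paths are all hypothetical; the `Branch:` and `View:` fields follow the usual spec form, and the result is the kind of text one might pipe into `p4 branch -i`.

```python
def render_branch_spec(name, mappings):
    """Return branch spec text with one view line per (source, target) pair."""
    lines = [f"Branch:\t{name}", "", "View:"]
    for source, target in mappings:
        # Each view line maps a source path to a target path.
        lines.append(f"\t{source} {target}")
    return "\n".join(lines) + "\n"


# Hypothetical spec name and paths, for illustration only.
spec = render_branch_spec(
    "main-to-rel1",
    [("//depot/main/...", "//depot/rel1/...")],
)
print(spec)
```

Extra mappings (for a renamed directory, say) would simply be further pairs in the list, which is where the flexibility of a spec comes from.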
Guidelines
A repository branch structure is about creating an appropriate information
hierarchy (designing an information architecture). So here are some guidelines
that I have found useful over the years and seen successfully used in many
organisations (and I certainly didn't invent any of them!). Maybe in the future
I can come up with some appropriate pattern-like descriptions, but for now, here
they are:
- Standardise naming conventions - make branch directories obvious
- Keep branch spec mappings, client workspace view mappings and permissions
as simple as possible
- Make it as easy as possible for users (new and experienced) to visualise
the branching scheme as they click down through the hierarchy
- Find the right balance between broad & shallow vs. narrow & deep hierarchies
- Keep branches of a similar type at similar depths in the structure
- Make branch directories sort sensibly
- Plan for the future: 2, 3, 5 years down the line - include things like
the year (and optionally month) somewhere in the path to provide a natural
sorting.
- Provide Guidelines and Delegate!
- Remember - there is no one best solution, but there are many good enough
solutions!
Some of what comes below is also covered by Laura Wingerd in her recent book
"Practical Perforce" (O'Reilly) - you really need to get this book if you are
responsible in any way for Perforce usage at your company! Her quote which sums
things up nicely is:
"There is no reason that the repository structure should match the
release lineage, but when it doesn’t it causes confusion to new users [and
cognitive dissonance to all users]."
Now while I agree with the intent of what Laura has written, I find her
suggested option is perhaps a little too restrictive, and it doesn't take into
account organisations with many branches at a particular level (and indeed many
different types of branches).
Standardise Naming Conventions - Make Branch Directories Obvious
Laura gives a very simple example of a naming convention in her book: just use
uppercase for the branch component. Thus she has:
//depot/MAIN/...
//depot/REL1/...
//depot/SOME_PROJECT/...
//depot/ANOTHER_PROJECT/...
This is simple and can work very well (although it rather suggests that you
don't use uppercase for any other paths, which might be unnatural in certain
companies for certain items). An alternative is to append (or prepend)
something to the name indicating that it is a branch directory, e.g. ".branch"
or "-branch", or ".br" for those who don't like typing! Thus the above becomes:
//depot/main.br/...
//depot/rel1.br/...
//depot/some_project.br/...
I am happy to hear other suggestions!
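One benefit of a suffix convention is that a script can find the branch directory in any depot path without a lookup table. The following is a minimal sketch assuming the ".br" suffix from the example above; the function name `branch_component` is hypothetical.

```python
def branch_component(depot_path, suffix=".br"):
    """Return the first path component ending in `suffix`, or None."""
    # Strip the leading "//" and split the depot path into its components.
    parts = depot_path.lstrip("/").split("/")
    for part in parts:
        # Require something before the suffix so a bare ".br" does not match.
        if part.endswith(suffix) and len(part) > len(suffix):
            return part
    return None


print(branch_component("//depot/main.br/src/foo.c"))
print(branch_component("//depot/docs/readme.txt"))
```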
Sensible Sorting
Directory (and thus branch) names should sort sensibly at any point in the
hierarchy, thus:
- use leading zeros for numeric components, such as 01 instead of 1 if you
are going to have 10 or more, or indeed 4, 5 or 6 leading zeros where
necessary
- reverse the order of dates using yyyy-mm rather than mm-yyyy
Examples:
//depot/main.br/...
//depot/rel_01_00.br/...
//depot/rel_02_00.br/...
//depot/rel_02_01.br/...
//depot/rel_02_02.br/...
//depot/dev/2007_06_some_project.br/...
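The effect of zero padding is easy to check with a plain text sort, which is what most directory listings give you. This sketch uses a hypothetical helper, `release_branch`, that produces names in the style of the examples above.

```python
def release_branch(major, minor, width=2):
    """Build a release branch directory name with zero-padded numbers."""
    return f"rel_{major:0{width}d}_{minor:0{width}d}.br"


versions = [(10, 0), (2, 1), (1, 0), (2, 0)]

# Unpadded names sort as text, so release 10 lands before release 1.
unpadded = sorted(f"rel_{major}_{minor}.br" for major, minor in versions)

# Padded names sort in the same order as the version numbers themselves.
padded = sorted(release_branch(major, minor) for major, minor in versions)

for name in padded:
    print(name)
```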
Broad & Shallow vs. Narrow & Deep
At any one level the number of choices (sub-directories/branches) should be
appropriate and naturally limiting (note that the structure needs to be as
usable in several years' time as it is now). For example, a structure which
ends up with thousands of branches in a single directory will become unwieldy
over time. Divide these branches into sub-directories using appropriate naming
(and sorting). For example, if you wish to create lots of branches for
individual change requests, then the obvious structure is very wide and shallow
at some point in your repository:
~/rel /CR000001/...
      /CR000002/...
      :
      /CR000099/...
      /CR000100/...
      :
      /CR009100/...
      :
Instead of that, try:
~/rel /CR0000xx /01/...
                /02/...
                :
                /99/...
      /CR0001xx /01/...
                :
      /CR0091xx /01/...
                :
This naturally splits things up and also sorts at all levels.
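The split can be computed mechanically from the change request number. Here is a minimal sketch, assuming six-digit CR numbers bucketed in hundreds as in the layout above; the function name `cr_branch_path` is hypothetical, and "~/rel" is just the placeholder root used in this article.

```python
def cr_branch_path(cr_number, root="~/rel"):
    """Map a change request number to its bucketed branch path."""
    if not 0 <= cr_number <= 999_999:
        raise ValueError("CR number must fit in six digits")
    # The first four digits name the bucket, the last two name the branch.
    bucket, leaf = divmod(cr_number, 100)
    return f"{root}/CR{bucket:04d}xx/{leaf:02d}/..."


for number in (1, 99, 100, 9100):
    print(number, cr_branch_path(number))
```

With this scheme no directory ever holds more than 100 entries at the leaf level, however many change requests accumulate.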
As a corollary of the above, you should split things up even if they don’t
have a “natural” dividing component. One option is to use the date as an
element of the path (or even a prefix), as YYYY or YYYY-MM, to give some sort of
natural sorting by date. For example:
~/2006-01/some_project/...
Or without introducing another level but using a prefix (which is often just a matter of personal preference):
~/2006-01-another_project/...
In 2 or 5 years' time it should be obvious to any user which are the “new”
branches, where they are likely to be spending most of their time, and which
are the “old” branches, which can typically be ignored.
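Both date-based variants are simple to generate in a branch creation script. This is a sketch only; the function name `dated_branch` and its `nested` flag are hypothetical.

```python
from datetime import date


def dated_branch(created, name, nested=True):
    """Prefix a branch name with its creation month as YYYY-MM."""
    stamp = f"{created:%Y-%m}"
    # nested=True adds a directory level; nested=False uses a name prefix.
    separator = "/" if nested else "-"
    return f"{stamp}{separator}{name}"


print(dated_branch(date(2006, 1, 15), "some_project"))
print(dated_branch(date(2006, 1, 15), "another_project", nested=False))
```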
Hiding or Retiring Old Branches
It is possible to “hide” old branches by removing read permissions on those
repository paths (leaving them visible to superusers or perhaps an "archive"
user). There are some potential performance implications to doing a lot of this
(the more lines you add to the protections table, the more work the server has
to do for commands). Often the easiest way is just to make them easily ignored
via an appropriate hierarchical naming convention as shown above.
Provide Guidelines and Delegate Responsibility
Larger organisations may have a number of teams or units each of which is
responsible for products with sometimes quite different lifecycles and thus
branching schemes. In these cases it makes sense to set the guidelines and
principles and delegate to the teams the work of defining the precise
conventions to be used.
Automation
If your repository structure is simple enough and regular enough then it is
easy for tools or scripts to:
- create branches (perhaps with input from a user), including adding a
mapping to the current client workspace view
- validate, by means of triggers, that branches are being created in the
correct place (e.g. don't allow a branch with the wrong naming convention to
be created in certain directories or levels of the repository). This
prevents user error.
- work out the relationship between branches automatically and thus be
able to produce simple reports for things like changes that need to be
propagated etc. While this is often done by means of some configuration file
(stored in the repository), and indeed this is sometimes necessary anyway,
it is easier if it can be automatically deduced from the structure (so that
a new branch "popping up" is automatically included into the reporting
mechanism).
- make your protections table easier to maintain
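The validation and reporting ideas in the list above both come down to recognising a branch from its path. Here is a minimal sketch, assuming the hypothetical conventions used in this article's examples (main.br, rel_NN_NN.br and dev/YYYY_MM_name.br directly under //depot); the pattern table and function names are illustrative, not part of Perforce.

```python
import re

# Hypothetical conventions, modelled on the example paths in this article.
BRANCH_PATTERNS = {
    "main": re.compile(r"^//depot/main\.br$"),
    "release": re.compile(r"^//depot/rel_\d{2}_\d{2}\.br$"),
    "dev": re.compile(r"^//depot/dev/\d{4}_\d{2}_[a-z0-9_]+\.br$"),
}


def classify_branch(branch_root):
    """Return the branch type implied by the path, or None if it fits none."""
    for kind, pattern in BRANCH_PATTERNS.items():
        if pattern.match(branch_root):
            return kind
    return None


def is_valid_branch(branch_root):
    """True if the path follows one of the known naming conventions."""
    return classify_branch(branch_root) is not None


for root in ("//depot/rel_02_01.br", "//depot/dev/2007_06_some_project.br",
             "//depot/rel2.br"):
    print(root, classify_branch(root))
```

A trigger script could reject a submission when `is_valid_branch` is false, and a reporting script could use `classify_branch` to pick up new branches without a separate configuration file.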
There is No One Best Design!
And remember that this is always true - there are always many ways that are
good enough, and the final choice often comes down to personal preference.
However, every type of branch should have a “home”: your CM Plan should have
simple guidelines so that a user creating a particular type of branch is never
at a loss as to where to create it. This requires identifying the types of
branches likely to be used.
You need to do enough thinking and planning to get at least 80% of it right.
It is of course possible to re-structure your repository (by branching from old
name to new name and deleting the old within the same atomic changelist).
However, this can cause you quite a few problems if the branch you are
re-structuring has relationships with lots of other branches (e.g. release
branches being maintained).
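As a rough illustration of the branch-and-delete approach, this sketch builds the sequence of p4 commands as argument lists rather than running them. It assumes both paths are mapped in the current client workspace, the old files are synced, and everything is opened in the default changelist so that a single submit covers both steps; the function name and example paths are hypothetical.

```python
def restructure_commands(old_root, new_root, description):
    """Return the p4 commands to move a tree in a single changelist."""
    return [
        # Branch every file from the old location to the new one.
        ["p4", "integrate", f"{old_root}/...", f"{new_root}/..."],
        # Open the old files for delete in the same (default) changelist.
        ["p4", "delete", f"{old_root}/..."],
        # One submit makes the branch and the delete atomic.
        ["p4", "submit", "-d", description],
    ]


commands = restructure_commands(
    "//depot/rel1", "//depot/rel_01_00.br", "Rename rel1 to rel_01_00.br"
)
for argv in commands:
    print(" ".join(argv))
```

Printing the commands first makes it easy to review the plan before running anything against a live server.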
People who are aware of the issues and have experience can slot in a new
branch type fairly easily (though they can still end up making sub-optimal
choices that they regret).