Entries Tagged 'Perforce'
October 11th, 2010 — Perforce, SCM, subversion
I would like to reflect on the lessons learnt from having recently completed the migration of 60 SVN repositories to Perforce for a client.
Migrations – Tarpits of Effort
The first thing to say is that migrations can take a lot of time and effort if you aren’t doing them regularly, as we do at VIZIM (perhaps not surprising!). There are invariably differences in tools and techniques, and it takes time to plan for the issues.
If there is an off-the-shelf tool which does the migration for you, then it is well worth a try, but in most cases various trade-offs will have been made within the tool – and you may not be pleased with the results. This in itself will take time and effort to test and evaluate.
If you choose to “roll your own” then be very careful: it may seem easy to get started, but there are usually a lot of edge cases and issues along the way that will suck up your time and effort. How much is it worth becoming an expert in this through hard-won experience, for a task that you are typically only going to do once?
Obviously, as a consultant, I would say this, but consider bringing in help.
Summary
For many situations SVN and P4 are very similar to each other, but there are a couple of key differences.
- The easy bits were the basic adds/edits/deletes
- SVN Revisions correspond (mostly!) straightforwardly with P4 Changelists
- Branching needs some intelligence applied due to the different underlying implementations. The naive approach has considerable problems (unfortunately it’s the approach used by the official Perforce migrator), and our approach/tool was dramatically faster than the official tool.
- Some SVN history (“kitchen sink revisions”) requires extra care and attention due to its complexity (e.g. in the same revision the user deletes a file and then replaces it with a branched copy, or vice versa, or deletes the parent directory, or any of lots of other edge cases…)
SVN vs P4 Branching
SVN tags and branches are the same thing – it is just by convention they live in /tags and /branches respectively.
svndumptool -v log <path to repo>
will produce a nice history showing the type of information such as:
------------------------------------------------------------------------
------------------------------------------------------------------------
r157 | autobuild | 2007-06-05T13:11:44.145570Z | 1 line
Changed paths:
A /tags/1.1-RC1 (from /trunk:156)
Milestone tag created: 1.1-RC1
An SVN tag or branch is just a reference to a from-path/from-revision pair. In Perforce, the exact analogy is a “dynamic” label – e.g.
Label: 1.1-RC1
Revision: @156
View:
//depot/trunk/...
where the Revision: field identifies the changelist.
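To make the mapping concrete, here is a minimal Python sketch (not our actual migration tool) that turns an SVN copy entry from the log into label spec text of the shape shown above. The `//depot` prefix, the function name and the regular expression are assumptions made for illustration, and it assumes the SVN revision number can be used directly as the changelist number, which as noted is only mostly true.

```python
import re

# Matches an SVN "Changed paths" copy entry such as:
#   A /tags/1.1-RC1 (from /trunk:156)
# Paths containing spaces are not handled by this sketch.
COPY_LINE = re.compile(
    r"^\s*A\s+(?P<dest>/\S+)\s+\(from\s+(?P<src>/\S+):(?P<rev>\d+)\)\s*$"
)


def label_spec_from_copy(line, depot="//depot"):
    """Return label spec text for an SVN copy line, or None if it is not a copy."""
    match = COPY_LINE.match(line)
    if match is None:
        return None
    # The tag name is the last component of the destination path.
    name = match.group("dest").rstrip("/").split("/")[-1]
    return (
        f"Label: {name}\n"
        f"Revision: @{match.group('rev')}\n"
        "View:\n"
        f"\t{depot}{match.group('src')}/...\n"
    )
```

Feeding it the r157 entry above yields a spec for label 1.1-RC1 pinned at @156 with a view of //depot/trunk/..., while a plain modification line returns None.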
This leads on to a major problem…
SVN O(1) branching vs P4 O(N) branching
SVN branches (or tags) are just references and can be created in constant time. In Perforce, if you branch 1,000 files, you end up with 1,000 records in the db.integed table (and a similar 1,000 records in db.rev, as they are new revisions). There are two issues with this:
- This metadata starts to mount up over time (large db.* files can degrade performance and increase backup time)
- The time taken to create a branch is proportional to the number of files being branched – O(N) – and this can become noticeable for larger repositories and lots of users (two tables need to be locked while the branch is being created).
Imagine doing this for 10k files per branch, or 30k files, or … (servers can be locked for minutes).
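A back-of-envelope sketch of how this mounts up, taking the figure above (one new record per file in each of db.integed and db.rev) at face value. The function name is hypothetical and the real record counts on a given server version may differ.

```python
def branch_records(files_per_branch, branches):
    """Rough count of new rows per table when every tag is a full branch.

    Assumes one db.integed row and one db.rev row per branched file,
    as described in the text; this is an estimate, not a measurement.
    """
    per_table = files_per_branch * branches
    return {"db.integed": per_table, "db.rev": per_table}
```

For example, `branch_records(30000, 2000)` gives 60 million rows in each table, whereas the label approach adds nothing per file.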
Poor Performance of the Naive Approach
The simple approach is to create a full Perforce branch for every SVN tag or branch. Depending on the number of branches, this can take quite a long time.
For example, using the official Perforce Subversion migration tool for a small test repository (288 revisions) took 5 minutes 6 seconds.
Our tool took 15 seconds…! (5% of the time)
In addition, the db.* files for the naive approach were significantly larger: 4MB vs 2.5MB.
The more tags you have, the worse the problem. We had several repositories with thousands of tags, and we killed a test migration using the naive approach after 24 hours. The db.* files were nearly 20GB at that point and the server was effectively thrashing and getting slower and slower. Our actual migration using the approach below took just over 3 hours for that repository.
Our Approach
There are a couple of reasons for the dramatic speed difference for our tool:
- we read and parse SVN dump files – they contain the revision information and the contents of the files
- we perform intelligent handling of tags automatically.
The intelligent handling of tags means that we defaulted to use Perforce “dynamic” labels as the equivalent of SVN tags (or branches) in the first instance. The tool only creates Perforce branches if a file is actually modified on the tagged branch (which can happen quite often in real life even if it is “supposed not to”!).
So the previous SVN history might also contain:
------------------------------------------------------------------------
r158 | autobuild | 2007-06-05T13:12:02.223695Z | 1 line
Changed paths:
M /tags/1.1-RC1/ivy.xml
Recorded ivy.xml for 1.1-RC1
The problem, as you can see from the SVN log, is that a file is then modified on the “tag” branch. You can’t do this to a Perforce label. If you wish to replicate this type of history you must create a full Perforce branch for that tag and check the modified file in on that branch.
It is also frequently the case that tags in SVN are created and then later deleted. No problem if they correspond to the creation and deletion of labels in Perforce, but potentially very expensive with thousands of branched files. If you have a “spec” depot in Perforce then you have a full record of the label being created and deleted.
This approach has quite a few edge cases that need to be considered, including:
- branching from tags – and checking files in
- having multiple levels (not just /tags/<tagname>, but also /tags/sublevel/<tagname> etc) – how do you decide what to do?
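The label-first decision described in this section can be sketched roughly as follows. This is an illustration under simplifying assumptions, not our tool: the event tuples, the function name and the single-level `/tags/<tagname>` layout are all hypothetical, and tag deletion and the edge cases listed above are deliberately left out.

```python
def classify_tags(events, tag_root="/tags/"):
    """Decide, per tag, whether a label suffices or a full branch is needed.

    `events` is an iterable of (action, path, copied_from) tuples in
    revision order, e.g. ("A", "/tags/1.1-RC1", "/trunk:156").
    Only single-level tags directly under `tag_root` are handled.
    """
    decisions = {}
    for action, path, copied_from in events:
        if not path.startswith(tag_root):
            continue
        name, _, rest = path[len(tag_root):].partition("/")
        if not rest and action == "A" and copied_from:
            # Tag created as a copy: a label is enough for now.
            decisions[name] = "label"
        elif rest and name in decisions:
            # Something changed underneath the tag: promote to a branch.
            decisions[name] = "branch"
    return decisions
```

With the r157 and r158 entries shown earlier, 1.1-RC1 starts out as a label and is promoted to a branch once ivy.xml is modified under it.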
Wrapping Up
Migration from SVN into Perforce is beneficial for many companies as their repositories, and the number of people using them, grow. The tools are sufficiently similar to make user acceptance and training very straightforward.
However, you do need to perform full-history migrations with some care – feel free to contact me for more details. VIZIM has full history migration tools for Subversion, ClearCase and CM Synergy to Perforce.
December 26th, 2009 — Mercurial, Perforce, SCM
One of the things that I do quite frequently is download new packages and bits and pieces of code to play with on my local machine. I frequently start making local edits to perform local configuration changes, or perhaps try out the odd idea. Quite often these packages are downloaded, played with, and then discarded if they don’t meet my needs. But of course some of them end up with a permanent place in my tool chest, or installation.
Exploring with Bread Crumbs
I start getting twitchy if I start making changes to anything without checking them in somewhere as I go. It is all too easy otherwise to lose track of the changes you have made and quickly get yourself into a mess, losing lots of time and effort.
So, like Hansel and Gretel, I prefer to leave a trail of bread crumbs along the way so that I can explore safely (avoiding if possible my breadcrumbs being eaten by the birds!).
I view this as being able to explore while having the safety net of my saved versions stored away. Advantages:
- I know what I have changed at any point (can show what’s changed and do diffs)
- I have a bread crumb trail leading me back home to my initial clean configuration
- I can revert to known working states if the current experiment goes wrong (e.g. accidentally delete chunks of text in the editor, or change several settings at once without appreciating their interactions)
Bread Crumb Trail Tools
My tool of choice for SCM is Perforce which is fast, effective and has lots of client tools. Having used it for many years as a consultant and trainer it is second nature to me and quick and easy to use, and rock solid.
And yet, for this particular type of work, I find it not always ideal. Why?
- I need to set up a client workspace. This is not difficult, but it takes enough steps to be annoying and often results in me not doing it (it requires making decisions on naming, default paths in the repository etc. It seems like it could be easily automated, but the requirements are just slightly too variable to make this easy)
- The changes are “permanent” in my repository – even if the whole experiment turns out to have been a red herring (and yes of course I can obliterate stuff, but that’s another extra step)
Please note that if I really want to keep all my work (for example, the package turns out to be of long term use), then I make the small extra effort and import into Perforce (together with a third party codeline pattern etc. to be able to track future changes as new releases are made). At this point, it turns out that saving it “centrally” is very beneficial.
Bread Crumb Tool of Choice
My favourite current tool for this “temporary” bread crumb saving is Mercurial (hg).
The advantages for me:
- It is very quick and easy (but also a full featured system should I ever need it)
- The “repository” is saved in the .hg directory at the root
- If I remove the whole tree then the repository goes too
I use the command line client to save an initial snapshot (from the root of the tree where I have extracted the package):
hg init .
hg add .
hg commit -m "Initial version of XXX as downloaded"
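If you do this often, the three commands above are easy to wrap. A small Python sketch, assuming `hg` is on the PATH; the function names are mine and purely illustrative.

```python
import subprocess


def snapshot_commands(package_name):
    """The three hg commands used above for an initial snapshot."""
    return [
        ["hg", "init", "."],
        ["hg", "add", "."],
        ["hg", "commit", "-m", f"Initial version of {package_name} as downloaded"],
    ]


def snapshot(root, package_name, run=subprocess.run):
    """Run the snapshot commands in `root`, stopping at the first failure."""
    for command in snapshot_commands(package_name):
        run(command, cwd=root, check=True)
```

The `run` parameter exists only so the wrapper can be exercised without Mercurial installed; in normal use you would call `snapshot("/path/to/package", "XXX")`.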
Subsequently I typically use the following subset of commands (from “hg help”):
- add: add the specified files on the next commit
- commit: commit the specified files or all outstanding changes
- copy: mark files as copied for the next commit
- diff: diff repository (or selected files)
- help: show help for a given topic or a help overview
- log: show revision history of entire repository or files
- remove: remove the specified files on the next commit
- revert: restore individual files or directories to an earlier state
- status: show changed files in the working directory
TortoiseHg is also useful.
Conclusion
Perforce remains my tool of choice for most SCM related activity, but Mercurial is a very useful addition to my personal tool chest, and in particular in this type of scenario.
The main thing I would always encourage people to do: whatever you do, get into the habit of using a version control tool as often as you can! You will seldom regret it.
December 4th, 2007 — Perforce, SCM
For quite a while there has been an option to create your own custom installer for Perforce deployments (including automation). This can be particularly useful for larger sites with lots of new users coming along regularly.
The customizable element of the scripted installer is a configuration file named perforce.cfg. The Perforce Administrator edits this file, and places it on a shared network drive along with the expanded contents of the perforce.zip file. Perforce client programs can then be installed by running setup.exe from each desktop.
One of the annoyances with this approach is that you have to expand the .zip file and put things on a network drive. It is not easy to do this sort of thing within a single executable for example (which makes it hard to put on your intranet and get users to double click).
Well, having played around with various installers and options, I have discovered a fairly simple way of doing this via a self extracting executable (SFX), and the open source 7-Zip program.
Instructions
Create a directory structure:
- some root dir
  - 7-zip: contains a batch file, a couple of config files and perforce.cfg (see below for contents)
  - p4winst: contains expanded p4winst.zip (without the perforce.cfg)
  - p4vinst: contains expanded p4vinst.zip (without the perforce.cfg)
Currently (as of 2007.2) the perforce.cfg file is the same for both P4Win and P4V.
You need to get a couple of files from 7-zip.org/download.html
Edit perforce.cfg to customize for your installation.
Create a p4winst.conf and p4vinst.conf (which must both be UTF-8; Notepad can save this format, as can Notepad++ or other editors) in that directory.
Run make_installers.bat from the command line in the appropriate directory and check that the following are created (which you can put on an intranet or whatever):
You can then send your users a single executable which unzips itself and automatically runs the Perforce setup.exe with the included perforce.cfg. QED!
make_installers.bat
This contains something like the following:
:: Batch file to package up P4Win using .zip file and the 7-Zip freely available
:: Zip program (which can create SFX - Self Extracting Archives).
@echo off
:: May need to customize the following
set ZIP_DIR="d:\apps\7-zip"
call :make_install p4winst
call :make_install p4vinst
goto :exit
:make_install
set INSTALLER=%1
:: First create new clean versions of the zip files (including current directory version of perforce.cfg in preference)
if exist %INSTALLER%.7z del %INSTALLER%.7z
if exist %INSTALLER%.exe del %INSTALLER%.exe
%ZIP_DIR%\7z a %INSTALLER%.7z ..\%INSTALLER%\* perforce.cfg
:: Copy SFX file plus config file plus zipped installer into a single SFX .exe
copy /b %ZIP_DIR%\7zsd.sfx + %INSTALLER%.conf + %INSTALLER%.7z %INSTALLER%.exe
if not errorlevel 1 goto :EOF
if exist %INSTALLER%.exe goto :EOF
echo *** ERROR: failed to create %INSTALLER%.exe
goto :exit
:exit
p4winst.conf (and p4vinst.conf)
These are simple UTF-8 files with contents similar to:
;!@Install@!UTF-8!
Title="P4Win Custom Installer"
BeginPrompt="Do you want to install P4Win?"
RunProgram="setup.exe"
;!@InstallEnd@!
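The packaging step is nothing more than byte concatenation of the SFX module, the config and the archive, so the `copy /b` line can equally be sketched in Python. This is an illustrative equivalent rather than a replacement for the batch file: the function names are hypothetical, and the use of plain newlines in the generated config is an assumption.

```python
from pathlib import Path


def sfx_config(title, prompt, program="setup.exe"):
    """Build the installer config shown above, encoded as UTF-8 bytes."""
    lines = [
        ";!@Install@!UTF-8!",
        f'Title="{title}"',
        f'BeginPrompt="{prompt}"',
        f'RunProgram="{program}"',
        ";!@InstallEnd@!",
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


def sfx_bytes(sfx_module, config, archive):
    """Concatenate the three parts in order, like `copy /b a + b + c out`."""
    return sfx_module + config + archive


def build_sfx(sfx_module_path, config_path, archive_path, output_path):
    """File-based wrapper: read the three inputs and write the single .exe."""
    data = sfx_bytes(
        Path(sfx_module_path).read_bytes(),
        Path(config_path).read_bytes(),
        Path(archive_path).read_bytes(),
    )
    Path(output_path).write_bytes(data)
```

For example, `build_sfx("7zsd.sfx", "p4winst.conf", "p4winst.7z", "p4winst.exe")` mirrors what the batch file does for P4Win.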
October 23rd, 2006 — Perforce, SCM
Have been meaning to do some work on P4Python recently, and the first thing I realised I should do is to update the test harness.
This is based around the unittest module so is fairly standard Python. Does some fairly standard things with test suites etc, and provides a reasonably good example of how to use the code itself, so acting as some level of documentation of P4Python.
The old version assumed a pre-existing Perforce server installed and running with some known content. This was fine for my own personal testing but had a couple of problems:
- it assumed the training repository which is fine for Perforce Consulting Partners but which isn’t available to ordinary mortals (so they couldn’t run the test harness locally)
- it was a snapshot with existing users and client workspaces so required a license to use, which is not good for everyone
- it required too much manual setup before running
Obviously a candidate for tidying. So recent work done:
- Change to use the new Perforce Sample Repository which anyone can download and install (works without a license too)
- Change to automatically create and run a new server instance on a fresh install from the sample repository download (automatically unzip etc)
The end effect is fairly nice and automated, and provides a much better ongoing resource to anyone wishing to do work on P4Python. Please note that it currently assumes a Windows environment, but I will insert a few checks to make it platform independent shortly.
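For a flavour of the shape of such a harness, here is a heavily simplified unittest sketch. It is not the real P4Python test harness: it only creates a throwaway server root per test and builds a `p4d` command line, and does not unzip the sample repository or start a server. The class name, helper name and port 1777 are assumptions.

```python
import os
import shutil
import tempfile
import unittest


def p4d_command(root, port, p4d="p4d"):
    """Command line for a throwaway server rooted at `root` (not started here)."""
    return [p4d, "-r", root, "-p", str(port)]


class FreshServerTestCase(unittest.TestCase):
    """Each test gets its own empty server root, removed afterwards."""

    def setUp(self):
        self.root = tempfile.mkdtemp(prefix="p4test-")
        self.addCleanup(shutil.rmtree, self.root, ignore_errors=True)
        self.server_command = p4d_command(self.root, 1777)

    def test_server_root_is_fresh(self):
        # A real harness would unzip the sample repository here first.
        self.assertEqual(os.listdir(self.root), [])
```

Real tests would subclass this, start the server in `setUp` and connect to it through P4Python.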
To have a look you can either:
Note that credit for various aspects should go to Ravenbrook from their work on P4DTI – I stole various techniques from their implementation of a test harness.
It also serves as a reasonable example of a test harness for Perforce scripting, and am very happy to receive comments and improvements (there are no doubt some Python gurus out there who can suggest some improvements at points in the code). I use something fairly similar as a test harness for the VSStoP4 scripts which are actually written in Perl (too horrible to write such a thing in Perl was my feeling)!
Would like to encourage people to take their own copies and give back some improvements in the general framework at least…
September 22nd, 2006 — Perforce, SCM
Life has been just a touch busy recently having been flat out on various client projects pretty much over the whole summer (managed a week away but only just!). All grist to the mill for future blogging, so hopefully a variety of articles to come!
Meanwhile one of the things I was doing was preparing and then giving a presentation for the (first) Perforce European Conference on 19th September in central London.
I think the papers will be out pretty shortly on the Perforce site, but meanwhile a few highlights and personal notes. There were some big names present and it was good to hear about various practices and principles in operation.
Keynote
Christopher Seiwald did a variation on his slightly “aw shucks” style keynote. Some key points:
- Perforce doing fine: 200,000 users and 4,000 companies
- Company motto: “Aim low and hit!” (do one thing well, remain best of breed and wait for the analyst pendulum to swing back to best of breed rather than suite integration, which it seems to do on a regular basis)
- Working on a variety of things for future world domination, but don’t want to pre-announce as usual
- Very pleased with the way things are happening in Europe, and obviously at the response to this event.
- Next US Conference 9 – 11 May 2007, Las Vegas.
- Sydney office now opened to give global timezone support coverage!
Symbian
Deepak Modgill did a nice presentation on the challenges faced by Symbian for their offshoring. Another in the Symbian series of how their business and various configuration management practices have evolved. Not deeply technical but interesting nevertheless.
SAP
Obviously a flagship site for Perforce. Thomas Kroll and Claudia Loff did a good presentation. Interesting how much process and tools they had wrapped around Perforce. A few key stats:
- 4,800 users
- 80+ Perforce servers (but all on same cluster hardware)
- Fujitsu Siemens clusters with 32GB RAM running SunOS 9
- SAN (mirrored) for main data
They use a very structured process (repository structure and branching scheme) and a parallel (P4SAP) system with its own database to record things like changes and migrations (they call them transports) of releases between different servers. There is also a layer P4MS (Management System) to handle users etc.
Quite impressive.
Process Automation
Obviously my talk was wonderful! I was, though, fairly pleased with how it went down and got some good comments afterwards. For anyone interested, the Ruby triggers framework and a couple of utilities are in my area of the Perforce Public Depot.
I will no doubt be blogging on various related aspects (that I haven’t already touched on).
Bank of America
Good talk by Sean Cody and Kevin Breidenbach about different approaches with the bank. They have been replacing ClearCase with Perforce in various groups, mainly due to the performance for shared development between US, UK and India. Experience of Multisite sometimes taking hours to “sync up”, vs. 10-20 minutes max in Perforce.
Another feature of the talk was the power of continuous integration.
Google
Dan Bloch discussed Google’s use of Perforce and in particular how they manage issues around Perforce database locking and identifying and bumping off rogue commands.
Some more stats:
- 3,000+ users
- Single Perforce Repository
- HP DL585 4-way Opteron with 128GB RAM
- Linux 2.4 and NetApp filer
Sounds like it wins the contest for largest number of users against a single server!
The details of the lock identification were very interesting and Dan said he would be releasing the lock.pl script and some docs on the Public Depot real soon now!
Perforce 2006.1 Update
A very interesting and technical talk by Michael Shields regarding a variety of performance optimisations made between 2005.2 and 2006.1.
Summary: 2006.1 is quite a bit faster!
Read the slides for more details.
Laura Wingerd
Laura did another fairly technical talk on what has happened to the branching/merging algorithm, and more particularly common ancestor detection algorithm used in various releases of 2006.1. In her usual inimitable style she came up with some very useful ways of explaining things like convergence and divergence of branches over time. Things got decidedly more technical with discussions on common ancestors and I was left knowing I have to go through some of this in detail in a quiet moment just to make sure I really do understand it! The changes with 2006.1 look good, but I did get the impression some edge cases could give some slightly surprising results if you don’t know what’s going on behind the covers (and indeed the driving intentions behind the algorithm).
Summary
Venue worked very well for location. Networking with both Perforce people and various other delegates was as ever a highlight.
Unfortunately the room booked was not huge, which meant the event sold out well ahead of time. It was a shame a more flexible venue wasn’t chosen, but that was my only quibble. Organisation well run.
An excellent day!
May 13th, 2006 — Perforce, SCM
Writing good Perforce triggers, and, more importantly, debugging them in live use, turns out to be one of those things that seems simple but has lots of tricky issues that can lead to lots of time being wasted.
In spite of thinking that I understood lots of the issues, I still spent a couple of hours recently debugging a problem that turned out to be a combination of environment and password issues. This was particularly annoying as I had rather thought I knew about this stuff (and indeed have advised people over the years about it!), and yet was blindsided and caught out by some issues I had forgotten about or not thought through deeply enough.
I reserve the right to revisit this subject more than once in the future with further insights and news…
Assume Nothing About The Environment!
The classic approach to triggers is to write a nice script (Python or Ruby for me these days – no Perl, though just occasionally I miss it!) and debug it by running with the appropriate parameters from the command line (e.g. create a pending changelist and pass in the pending changelist number). This does indeed tend to turn up a number of issues, but the good thing is you can usually debug them with the appropriate command (<rant> why does python require you to execute pdb.py which isn’t by default put in the path on Windows machines, and why does Ruby not learn from Perl and for example use -d as a parameter to debug things instead of “-rdebug” – very unobvious!</rant>).
The major problem turns out to be the fact that the trigger is executed by the Perforce server process and may have a very different environment to what you might think as you run a “login” session. One sort of expects this on Unix, but on Windows it can be particularly surprising how little is in the environment due to the username that the Perforce process is running in when it is running as a service (default installation on Windows).
Thus the first rule of trigger writing is “assume nothing about the environment!”.
It is very easy to forget this and assume very simple things, like:
- P4PORT is always defined
- P4USER is always defined
- failures of individual p4 commands within the trigger will be obvious
Thus immediate recommendations are:
- Give full pathnames to executables. For example, “/usr/bin/ruby” or “C:\ruby\bin\ruby.exe” as the initial parameter for the ruby script, rather than assuming that “ruby” or “python” or whatever will always be in the PATH of the user executing the command.
- When in doubt (I’m generally always in doubt) give full pathnames to scripts too.
- Pass in as parameters the p4port and any other parameters to be used rather than expecting them to be already present in the environment.
- Within the script, explicitly add any extra directories to the search path for statements such as “import p4” in Python or “require 'P4'” in Ruby (or any equivalent import-type statement), unless you are absolutely sure that the imported libraries are globally installed on the machine you are working with. Don’t assume that the directory containing the trigger script is itself on the search path unless you can prove it.
- Trap and print to stdout (or stderr which goes to the p4d server log file) any errors/stack traces including exceptions from your p4 interface to aid hunting out problems. This is much easier to say than to do!
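Pulling these recommendations together, a trigger skeleton might look something like the following Python sketch. It is a minimal illustration under assumptions, not a finished framework: the function names, the `--debug` flag and the argument order are all hypothetical, and the actual check is passed in as a callable so that the error handling is the only thing shown.

```python
import os
import sys
import traceback


def dump_environment(stream=sys.stderr):
    """Print every environment variable, sorted, to help debug a trigger."""
    for name in sorted(os.environ):
        print(f"{name}={os.environ[name]}", file=stream)


def run_trigger(argv, check, stream=sys.stderr):
    """Run check(port, changelist); return 0 to accept, 1 to reject.

    The port and changelist come from argv, never from the environment.
    Any exception is trapped and its stack trace written to `stream`.
    """
    try:
        if len(argv) < 3:
            raise ValueError("usage: trigger.py <p4port> <changelist> [--debug]")
        port, changelist = argv[1], argv[2]
        if "--debug" in argv[3:]:
            dump_environment(stream)
        return 0 if check(port, changelist) else 1
    except Exception:
        traceback.print_exc(file=stream)
        return 1
```

A script would end with something like `sys.exit(run_trigger(sys.argv, my_check))`, and the triggers table entry would give full paths to both the interpreter and the script, passing in values such as %serverport% and %changelist% explicitly.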
Passwords Cause Problems
In the good old days, before “p4 login” was even a twinkle in Christopher’s eye, you could write your trigger assuming super user privileges (says in Yorkshire accent “we had it tough – could only dream of admin privileges in those days”) and everything would work.
Life became substantially more complicated with security level 3 and login being required. Commands failed due to not being logged in, and this turned out to be a bit of a bugger (’scuse my French) to work out (why it had failed that is).
Received wisdom is “run your triggers as a special trigger/admin user, put that user in a special group with timeout of some very large number, log them in manually and all will be sweetness and light”.
The interesting thing about this approach is that it often works, but as I discovered recently, can flatter to deceive. The problem I had was that the super user was indeed in a special “long timeout” group, and logged in on the same box (generating a suitable ticket). However, as I discovered only after some hair was torn out, the P4PORT that the user was logged in under was different to that used by the trigger. The P4TICKETS file entry was therefore also different, so the existing “login” had no effect and my trigger was failing silently.
Thus P4PORT=localhost:1666 (where localhost is some_server.some_company.com) will not work if the superuser is logged in using P4PORT=some_server.some_company.com:1666, since the latter is what will be in the P4TICKETS file and the former will not be found, so commands will fail. Be warned and expect/check for this! [Note: this was fixed in p4d 2007.2]
When in doubt print out the environment within your script (via some sort of debug parameter).
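The port mismatch is easier to see in a toy model. The Python sketch below treats tickets as a dictionary keyed on the exact port string and user, which is an assumption made for illustration and not the real ticket file format or lookup code; the user name `trigger_user` is likewise hypothetical.

```python
def find_ticket(tickets, port, user):
    """Look up a ticket by the exact port string, with no hostname resolution."""
    return tickets.get((port, user))


def diagnose(tickets, port, user):
    """Explain why a lookup fails, listing other ports the user is logged in at."""
    if find_ticket(tickets, port, user):
        return "ok"
    others = sorted(p for (p, u) in tickets if u == user and p != port)
    if others:
        return f"no ticket for {user} at {port}; logged in at: {', '.join(others)}"
    return f"no ticket for {user} at {port}"
```

A trigger that logs this kind of message when a command fails would have saved me the torn-out hair: the login existed, just under a different spelling of the same server.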
Belt and Braces
My current intentions on this front are to produce a trigger framework that helps detect the above problems, and helps both avoid them and, when necessary, debug them in a (relatively) painless manner. This, at the moment of writing, is a work in progress, but I hope to be able to share it with the wider Perforce community as it emerges into the glare of publicity. I do reserve the right to hold a few surprises back to add some slight spice to my upcoming presentation at the European Perforce User Conference on the 19th September (in London).
Update: hopefully will be able to share a rework/expansion of Tony Smith’s P4Trigger.rb framework which addresses some of the above issues fairly shortly – seems to be working at a client – time will tell – but fairly quickly.
Future topics will include ideas on test frameworks etc.