(Photo credit by Crystalline Radical)
Nowadays, while most people in our industry know about DVCS tools such as Git and Mercurial, and what they can do, not all are aware that git can be used with SVN. This is quite a shame because, on top of being the best way to learn git, this feature lets one use all the nice tricks of a DVCS (offline commits, local history rewriting, commit search, bisect...) while still being stuck with a remote SVN server...
Some years ago, I wrote a quick overview of this feature on my personal blog, so I decided to update that content and move it here, especially as I have, even recently, run into fellow Red Hatters who did not know about it either!
This HowTo was, and still is, driven by use cases, which is a good thing because most of those use cases match what any developer does on a daily basis with SVN. Hopefully, this will help readers relate to the tool, and will also make this blog entry a nifty reference page for later on. (In this regard, the DZone Refcardz on Git is also excellent reference material.)
Cloning
As with SVN, the very first step with git-svn is to get the sources from the remote server. With SVN, one only retrieves the latest version of the source code, hence the name checkout. With a DVCS, one retrieves the entire project history, so this step is no longer called a checkout but a clone.
It's worth mentioning that while a checkout is a fairly fast operation (depending heavily on the remote server), a clone may take much longer. When I started using git-svn, I relied a lot on Sourceforge for both PMD and XRadar, and cloning those projects could take up to a full hour, but at that time Sourceforge's bandwidth and general performance were rather poor. However, this is not such a drawback. Indeed, one has to keep in mind that once the history has been completely copied over, all the other operations will be extremely fast, as most of them no longer require network access, just local file system access.
So, as described above, the 'clone' operation is likely to be the first one you will want to do. It is more or less like an SVN checkout, except that you copy over the whole project history, including all its revisions, and store it locally in the hidden directory where Git keeps its "stuff", .git:
$ git svn clone http://svn/repo/
Clone standard SVN repository
If the repository you are cloning is following the standard SVN directory layout - meaning the root directory contains the folders 'branches', 'tags', and 'trunk' - you can make Git aware of it by using the '-s' option. This will allow Git to create the appropriate branches and tags inside its own repository:
$ git svn clone -s http://svn/repo repo.git
$ cd repo.git
$ git branch -r # show me the remote branches associated with my repo
beta
development
old_cvs
release
tags/beta
tags/development
tags/old_cvs
tags/release
trunk
Remarks:
- The extension .git is not mandatory. I add it so that I know whether I am in a git clone or a mercurial clone (.hg), which I also use. I even use the same convention for svn (.svn) and cvs (.cvs) checkouts.
- Having a complete copy of the repository is not that greedy in terms of local disk space. Indeed, if you compare a simple SVN checkout with a clone of the same project, you'll see that the resulting git repository is not much bigger, thanks to the fact that Git compresses its history and, once packed, stores similar revisions as deltas.
Working on my branch
The first thing you should always do when you start working on something is to create a local branch for it. Creating a branch with SVN, while easier than with CVS, is not a common reflex: people quite naturally tend to create branches only when they have to. With Git, however, the branch is local, so it is lightweight and a very practical way to keep track of things.
Indeed, if you need to switch to another task on the same code base, your work on the current task will already be isolated in its own branch. Many developers, for instance, simply create a local branch for each task they are assigned, using the id of the task as the name of the branch. More importantly, this allows one to leave the 'master' branch untouched, which will ease updating it from the SVN server later on...
$ git branch
master
$ git checkout -b my-branch
Switched to a new branch 'my-branch'
Branch from a revision
By default, as shown previously, one creates a branch from the 'master' branch, which in the case of git-svn maps to the latest revision fetched from the remote server. However, if you are working on a fix for an earlier revision, you can just as easily use its commit id to create a branch from that revision:
$ git checkout -b bug-branch 52052963178fcc1e65e8bdff35f40b0bd92a34e4
Switched to a new branch 'bug-branch'
It's worth mentioning here that this operation is local - no access to the remote server is needed to create a branch based on a different revision. Therefore one can do such an operation while sitting on a train or at home for instance...
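To make this concrete, here is a minimal sketch that can be run anywhere. It assumes a throwaway repository created on the spot (standing in for a real git-svn clone), and the file name and commit messages are made up for the illustration:

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository (assumption: stands in for a git-svn clone)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is called 'master'
git config user.email "dev@example.com"  # hypothetical identity, local to this repository
git config user.name "Dev"

# three revisions, as if they had been fetched from SVN
for n in 1 2 3; do
    echo "revision $n" > file.txt
    git add file.txt
    git commit -q -m "revision $n"
done

# look up the id of the commit to branch from (here: the second revision)
old_commit=$(git rev-parse HEAD~1)

# create the branch from that commit; no server is contacted
git checkout -q -b bug-branch "$old_commit"
cat file.txt
```

In a real git-svn clone, 'git svn find-rev r<N>' can translate an SVN revision number into the matching commit id.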
Going back to trunk to fix a bug
Let's assume that you were working on a new feature, and your manager asks you to stop immediately and fix a bug that QA found on the trunk. The feature you are working on is also based on the trunk, but you have already done some work, maybe even some local commits (see below).
Hence you cannot switch directly, because your working directory (the local checkout of the source code you are working on) is dirty, meaning there are uncommitted changes:
$ git status
# On branch bug-branch
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: ./server/common/src/main/java/com/foxmobile/cms/service/manager/impl/query/locale/LocaleByNameQuery.java
# new file: ./server/common/src/test/java/com/foxmobile/cms/service/LocaleManagerTest.java
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: ./server/common/src/main/java/com/foxmobile/cms/service/manager/impl/LocaleManagerImpl.java
# modified: ./server/common/src/main/resources/queries.xml
# modified: ./server/common/src/test/java/com/foxmobile/cms/service/DistributionManagerTest.java
# modified: ./server/common/src/test/java/com/foxmobile/cms/service/mockdb/TableCreator.java
# modified: ./server/common/src/test/resources/applicationContext-test.xml
# modified: ./server/common/src/test/resources/ddl.xml
#
Overall, those changes are "work in progress". Some parts could be committed, but most of it is not finished yet. However, you need to go back to trunk to work on this bug, and git will refuse to switch to another branch whenever doing so would overwrite your uncommitted changes!
One easy and obvious way to cope with that would be to clone the repository into a new copy, which would have a clean working directory. However, this would mean handling several repositories, which quickly turns into a hassle, especially as, in the git-svn case, the new repository would not be linked to the remote SVN server.
Another option is to check everything into one dirty commit that you can fix later on using local history rewriting (see below). This works, but is quite impractical. Fortunately, there is a far easier option, called stash:
Stash
$ git add . # stage every local change, including the new files, for the next commit/stash operation
$ git stash save "current work on my branch"
...
$ git stash apply # get my changes back
This command basically takes all the locally added changes and stores them "somewhere". Later on, one can easily ask Git to apply those stored changes to the working directory, even if its content has changed since the stash was made. For instance, if your working directory now contains the fixes you made to solve the issue reported by QA, git will try to merge the stashed changes into the current working directory.
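The whole round trip can be sketched as follows. This is only an illustration under assumptions: the repository is a throwaway one created on the spot, and the file and branch names are invented for the example:

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"  # hypothetical identity, local to this repository
git config user.name "Dev"

printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > app.txt
git add app.txt
git commit -q -m "initial version"

# unfinished feature work on a local branch
git checkout -q -b my-feature
echo "feature in progress" >> app.txt

# put the unfinished work aside; the working directory is clean again
git stash save -q "current work on my feature"

# fix the urgent bug on master and commit it
git checkout -q master
printf 'line 1 (fixed)\nline 2\nline 3\nline 4\nline 5\n' > app.txt
git commit -q -a -m "urgent fix"

# back to the feature: bring in the fix, then restore the stashed work
git checkout -q my-feature
git merge -q master      # fast-forward, my-feature has no commit of its own
git stash pop -q         # like 'apply', but also drops the stash entry
cat app.txt
```

At the end, app.txt contains both the fixed first line and the unfinished feature line, because the two changes touch different parts of the file. Had they touched the same lines, git would have reported a conflict and kept the stash entry.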
Committing changes
Pretty much like with SVN, one can commit changes into the project repository. Of course, the very, very neat thing here is that the commit is local. It does not require connectivity, and can even be altered afterward, provided you have not pushed it to the remote server (see below).
Commit
$ git add my-new-file a-modified-file
$ git commit -m "my cool new feature"
Edit your local history before committing it to SVN
You have worked on this issue and found a fix. You have committed the set of changes associated with it locally. Let's say your fix includes 3 revisions of the source code. This is rather neat, but one of those commits, the second one, is just a fixup: something you forgot to commit in the previous one. Thanks to git, one can rearrange those revisions a little bit before pushing them to the SVN server, and so make the history simpler to read and more consistent.
One can achieve this using the now quite famous rebase feature of git:
$ git rebase -i HEAD~3
This will trigger an "interactive" reconstruction of your history for the last 3 (HEAD~3) revisions. You'll be able to change the commit message, merge them, even reorder them, and so on... So you can at last say goodbye to useless commits such as "sorry, forgot a file in the last commit".
However, as powerful as history rewriting is, do it VERY carefully, especially when you are just starting with git-svn. And always remember that this can only be done before pushing the changes back to SVN!
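The "forgot a file" case can even be handled without editing the todo list by hand. The sketch below is one possible way to do it, not the only one: it assumes a throwaway repository with invented file names, records the forgotten file with 'git commit --fixup', and lets 'git rebase -i --autosquash' fold it into the right commit. Setting GIT_SEQUENCE_EDITOR=true simply accepts the proposed todo list instead of opening an editor:

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"  # hypothetical identity, local to this repository
git config user.name "Dev"

echo "base" > base.txt
git add base.txt
git commit -q -m "base revision"

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"

# the forgotten file, recorded as a fixup of the previous commit
echo "forgotten" > forgotten.txt
git add forgotten.txt
git commit -q --fixup=HEAD

echo "tests" > tests.txt
git add tests.txt
git commit -q -m "add tests"

# replay the last 3 commits; --autosquash arranges the todo list so that the
# fixup is folded into "add feature", and the list is accepted unchanged
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --format=%s
```

The history now holds three commits instead of four, and "add feature" contains both feature.txt and forgotten.txt.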
Push the changes back to SVN
Ok, now that you have fixed the bug, you want to publish it on the SVN server before resuming work on your own feature. For that, you first need to pull the latest changes from SVN, pretty much like you used to do an 'svn update' before doing an 'svn commit'. In the case of git-svn, this is achieved with the 'git svn rebase' command.
The name of the command purposely echoes the 'rebase' command described previously. Essentially, they do the same thing, rebuilding the project history, except that 'git svn rebase' first pulls the changes from the remote SVN server and then replays your local commits on top of them:
$ git svn rebase # fetches the changes from SVN and replays your own commits on top of them
Assuming this happens without any issue, one should then be able to push one's changes to the SVN server using the dcommit command:
$ git svn dcommit # pushes your commits to SVN
Merging stuff into my branch
You're now finished working on this bug, and you want to resume working on your feature. Going back to your branch is pretty straightforward. However, a far more interesting fact is that you can also use git to merge the fix you just made into your branch!
$ git merge branch-with-wanted-changes
$ git svn rebase # get changes from svn
Or, alternatively, if you just need to get one or two changes, maybe a fix from a maintenance branch that you need for your feature, you can even 'cherry-pick' a revision:
$ git cherry-pick 52052963178fcc1e65e8bdff35f40b0bd92a34e4 # the revision id you want to apply to your branch
Remarks: Most of the time, git merges the changes very effectively, resolving many merge issues that would otherwise require manual editing on your part. However, from time to time, git simply cannot resolve a conflict for you and you'll have to help it, pretty much like with SVN. This happens far less often than with SVN, though, and when git cannot resolve the conflict, you can generally do it quite easily...
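As an illustration of the cherry-pick case, here is a small self-contained sketch. It assumes a throwaway repository, and the branch names ('maintenance', 'my-feature') and file names are invented for the example:

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"  # hypothetical identity, local to this repository
git config user.name "Dev"

echo "v1" > app.txt
git add app.txt
git commit -q -m "initial version"

# a maintenance branch with two commits; only the first one is wanted
git checkout -q -b maintenance
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "important fix"
wanted=$(git rev-parse HEAD)
echo "other" > other.txt
git add other.txt
git commit -q -m "unrelated change"

# a feature branch started from master
git checkout -q master
git checkout -q -b my-feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"

# copy just the wanted commit onto the feature branch
git cherry-pick "$wanted" > /dev/null
ls
```

After the cherry-pick, the feature branch contains fix.txt but not other.txt: only the selected revision was applied.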
Do a back up of my work
If you go "solo" for a long time with Git (i.e., not committing to svn), you will probably want to keep some kind of backup. Indeed, if your laptop dies or gets stolen, and you have not pushed to SVN for a while (for instance, because you went away on a trip), you can lose up to several days of work!
However, by its very nature, git makes it easy to copy your local repository elsewhere, which turns backing up into a very simple task. Since 'git clone' can only create its copy on the local machine, the simplest way to send your work to another machine is to push it, over ssh for instance, to a bare repository created there beforehand (with 'git init --bare'):
$ git push --mirror ssh://mybackup/my-project.git
Note that such a copy does not keep the git-svn configuration and metadata. So if you lose the repository you cloned from the SVN server, you may need to clone it again, and then bring the backed-up changes into the new repository.
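One way to sketch such a backup, and to check that it really contains every branch, is shown below. This is an assumption-laden illustration: a local directory stands in for the ssh URL of the backup machine, the working repository is a throwaway one, and the backup target is a bare repository filled with 'git push --mirror':

```shell
set -e
work=$(mktemp -d)                        # throwaway working repository
backup=$(mktemp -d)/my-project.git       # hypothetical path, stands in for the backup host

cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"  # hypothetical identity, local to this repository
git config user.name "Dev"

echo "work" > notes.txt
git add notes.txt
git commit -q -m "local work"
git branch my-branch                     # a second branch, to show it is backed up too

# one-time setup on the backup side: an empty bare repository
git init -q --bare "$backup"

# copy every branch and tag; rerun after each work session
git push -q --mirror "$backup"

git --git-dir="$backup" branch
```

Because --mirror pushes all refs, every local branch ends up in the backup, not just the current one.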
Find the guilty change
Anybody who has been a developer for some time has run into this. A user or QA reports a bug that was literally fixed yesterday, or the week before. Somebody screwed something up, probably erasing the fix with a bad commit, and now one may end up having to implement the fix again. This is both frustrating and a waste of time, especially as the project history holds all the data needed to find the culprit commit.
Fortunately, one can ask git to run a test on a range of revisions to find out which of them was the first to introduce the bug again. This feature is called 'bisect'.
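Here is a minimal sketch of a bisect session. It assumes a throwaway repository in which the bug is planted on purpose at the fourth revision, and it uses a plain 'grep' as the test command, where a real project would run its test suite instead:

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository for the demonstration
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"  # hypothetical identity, local to this repository
git config user.name "Dev"

# six revisions; the bug (a wrong result) sneaks in at revision 4
for n in 1 2 3 4 5 6; do
    if [ "$n" -ge 4 ]; then
        echo "result=wrong" > app.txt
    else
        echo "result=ok" > app.txt
    fi
    echo "revision $n" > version.txt
    git add app.txt version.txt
    git commit -q -m "revision $n"
done

# bad = the current revision, good = the first one (5 commits back)
git bisect start HEAD HEAD~5 > /dev/null

# git runs the command on each candidate: exit code 0 means good, 1 means bad
git bisect run grep -q "result=ok" app.txt > /dev/null

# refs/bisect/bad now points at the first bad commit
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1        # go back to where we started
echo "first bad commit: $culprit"
```

Since bisect halves the range at every step, it only needs to test a couple of revisions here, and the same approach scales to histories with thousands of commits.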
Using Git locally to version your stuff
Now that you are familiar with git, you may want (like me or any DVCS fan) to use it to version pretty much anything, including your own personal projects that are not especially meant to be shared. This will let you do local commits and enjoy all the power of branching and tagging, for your own benefit:
$ mkdir my-personal-project
$ cd my-personal-project
$ git init
Initialized empty Git repository in /tmp/test/.git/
...
# add files, change stuff
$ git add . # new files must be added first: 'commit -a' only picks up files git already tracks
$ git commit -m "first version"
Having a local repository for your own project will also allow you to get used to the commands, and to screw things up, without messing with the company's SVN server ;)
Another awesome side effect of this is that, if at some point you need to share this project with someone else, you can easily do so. The other guy or guys can simply clone your repository and start collaborating with you. And once all your changes are checked and merged into one local repository, you push 'em back to SVN!
Links
Video
- A very funny video of Linus Torvalds explaining Git (and mostly trashing CVS)
- In a more British fashion, there is a video of Bryan O'Sullivan talking about Mercurial, which is a very good (and simpler) alternative to Git.