Candid’s brain

Git disadvantages

I’ve been considering merging from Subversion to Git lately, and have finally managed to understand how Git works. A good introduction is “Understanding Git Conceptually”. Not a good introduction is the Wikipedia article, as it mainly explains what Git is trying to be, and not what it actually is.

The biggest misunderstanding is that Git is called a “distributed” or “decentralised” revision control, as opposed to a “centralised revision control”. In fact, aside from the fact that you normally have a copy of the full repository in your working copy, Git isn’t that at all. When you hear about a “decentralised” revision control system, you suspect it to work a bit like p2p file sharing, commits would be exchanged directly between developers, a central server mediating only. This is not the fact with Git, you will always have a central repository that you commit to. If you try to develop without a central repository, you will end up in a mess.

The fact that Git is trying to be decentralised without being it leads to confusions that could have been avoided by designing it as a centralised system. For example, when you create a public branch in the central repository, you cannot use that branch directly but instead have to create another local branch that “tracks” the remote branch.

The main difference between Git and Subversion is often claimed to be that in Git you have the whole repository in your working copy, whereas in Subversion you only have the newest commit. This is a minor difference in my eyes, the main difference (and advantage) is that Git has native support for branches (whereas you have to emulate branch behaviour by duplicating directories in Subversion), and these branches can optionally only exist locally. It is very easy to create different branches for different minor changes, and you can develop on them even when you don’t have an internet connection.

The only disadvantage of Git compared to Subversion I have come across is a huge one, and you should really consider it before merging to Git. Git comes with very useful functionality to participate on the development of a project. However, one other very important use of Subversion is to get a copy of a project or a part of it (such as a library) and to easily keep up to date with changes without dropping your own local hacks (something like a “read-only working copy”, you never intend to commit any changes). Git is currently not made for this use at all, as you always have to download a lot more than you need. There is in fact an option to avoid downloading old commits that you don’t need (using --depth=1), but there is no way to only download a specified sub-directory. Common practice in Git is to create an own repository for every part of a project that one might want to check out on its own, and then to include these repositories using submodules (something similar to SVN externals). The problem about that is that it creates a lot of work in Git to make a change in one submodule, commit that and then load the changes into the other repositories including that. And for many projects, it is just impractical to split the code into multiple repositories. If I want to include a part of a large project (such as a part of a library) into my project, I have to include that whole project, which can take hours to download given today’s slow internet connections. There might be workarounds around this, but they certainly aren’t as simple as a single “svn up”.

So the major advantage of SVN over Git is that it is very easy and fast to get a complete and up-to-date “read-only” copy of a project by just using “svn co” or “svn up”. In Git, you have to clone the repository, then it might be incomplete because you have to additionally initialise the sub-modules. And downloading those might take hours.

As long as this disadvantage persists in Git, many projects will keep using SVN. And as long as these projects keep using SVN, it will be difficult for other projects to merge to Git, because they reference these projects using svn:externals.

I hope that these possibilities will be included in future versions of Git:

  1. Download a “read-only” copy of a Git repository or one of its sub-directories, with automatic initialisation of its sub-modules. (The copy should of course still be updatable using “git pull”.) Transmitting old commits and other useless bandwidth usage should be avoided.
  2. Something like svn:externals. Something that automatically pulls the newest commit of a repository or its sub-directory into a sub-directory of my working copy.
  3. SVN support for this svn:externals-like thing. It is already well possible to clone a subversion repository, so it shouldn’t be a problem to support importing them.

Filed under bugs

1 Comment

  1. Sam

    another thing that i don’t like about GIT is – it’s not tracking empty directories.

    Now – GIT site claims that it’s not-needed feature and no one cares about it . I’d like to disagree. Many of myweb projects require temp, sessions, logs cache directories which are usually empty to start with. Now – I can add .gitignore file into that directory to make GIT track it – but it feels like such a hack and also inconveniences clearing these directories before commit. Another workaround for this they say – use build scripts. I’m sorry – but in many cases id’be writing build scripts for the whole purpose of commiting an empty directories – this just wrong. after prolonged hesitation I’ve decided to rather go with mercurial which handles empty dirs antively..