Git submodules
Here is another example of why the current Git submodule support is completely shit. I have various PHP applications that use the same Java backend, calling it using exec(). Each of these PHP applications has its own Git repository, and so does the Java backend. The PHP repositories include the Java repository as a submodule, so anyone who clones one of these PHP repositories has to run git submodule update --init after git clone.
If I make an update now in my Java library and commit that to the public repository, the new version won’t be used automatically in the PHP applications. Instead, I have to run these commands in all PHP applications:
cd java
git pull
cd ..
git commit -a
git push
After updating the working copy of the Java submodule (using git pull), the submodule shows up like a modified file in the PHP application repository, so I have to commit the change.
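Since the same steps repeat in every PHP application, they could be wrapped in a small helper. The following is only a sketch: the function name bump_submodule and the directory names in the usage line are made up, and it assumes that each submodule checkout sits on a branch that tracks its upstream (the state that git submodule add leaves behind), so that a plain git pull works there.

```shell
#!/bin/sh
# Sketch: move one submodule to its newest upstream commit and record
# that new commit in the superproject. All names are hypothetical.
bump_submodule() {
    app_dir=$1      # path to a superproject working copy
    submodule=$2    # submodule path inside it, e.g. "java"

    (
        cd "$app_dir" || exit 1

        # Update the submodule checkout; requires a branch with an upstream.
        (cd "$submodule" && git pull) || exit 1

        # If the recorded submodule commit did not change, there is nothing to commit.
        if git diff --quiet --ignore-submodules=dirty -- "$submodule"; then
            exit 0
        fi

        # Stage the new submodule commit, commit it and publish it.
        git add -- "$submodule" &&
        git commit -m "Update $submodule submodule" &&
        git push
    )
}
```

It would then be called once per application, for example: for app in app1 app2; do bump_submodule "$app" java; done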
Users of the PHP applications now cannot just run git pull to get the new version of the application (including the new version of the Java submodule). Instead, they have to run an additional git submodule update afterwards so that the working copy of the submodule gets updated, too. So you have to tell your users that they can’t just git pull changes, but have to run an additional command every time.
Now things get even funnier: the Java library requires an external library to work, so it includes a submodule itself. The thing is, when people download a PHP application and load the Java submodule using git submodule update --init, the submodules of the Java submodule won’t be initialised automatically. So users have to run the following commands after git clone to get a working copy of my PHP application:
git submodule update --init
cd java
git submodule update --init
Now imagine that the external library used by my Java library introduces a new feature that I begin to use in my Java library. I have to update the submodule of the external library in my Java library and commit that change. Then I have to update the Java library submodule in all my PHP applications and commit these changes. Imagine what a user of my PHP application has to run every time he wants to update his working copy to a new working version:
git pull
git submodule update
cd java
git submodule update
My projects are rather small; imagine what you’d have to do to update a working copy of a huge project…
When I use a web application on my server (such as a webmail client or phpMyAdmin), I usually check out the stable branch of its SVN repository and run svn up every now and then to get new updates. I don’t need to know anything about the repository structure of the project to do that. With Git, I would have to know in which directories I have to update the submodules manually. Alternatively, there could be a shell script to update the working copy, which I would have to remember to call instead of git pull. This makes things unbelievably complicated. I hope Git will at least introduce a feature that automatically fetches the submodules on git clone or git pull.
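Such an update script could look roughly like the following sketch for a Git version without built-in recursion. The function name update_submodules_recursively is made up, and the sketch assumes it is started from the top level of the working copy, where .gitmodules lives.

```shell
#!/bin/sh
# Sketch: initialise and update submodules on every nesting level.
# Must be started from the top level of a working copy.
update_submodules_recursively() {
    # Check out the commit recorded for each direct submodule.
    git submodule update --init || return 1

    # Without .gitmodules there are no submodules to descend into.
    [ -f .gitmodules ] || return 0

    # Each matching line looks like "submodule.<name>.path <path>".
    git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
    while read -r _key sub_path; do
        # Repeat the procedure inside the submodule; stdin is redirected
        # so that nothing in there consumes the list of paths.
        (cd "$sub_path" && update_submodules_recursively </dev/null) || exit 1
    done
}
```

Users would then run git pull && update_submodules_recursively instead of a plain git pull, which is exactly the extra step to remember that is being complained about here.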
Update: Man pages on the internet list a --recursive option for both git submodule and git clone that does exactly this. None of the Git versions installed on my computers supports it yet, so it must be a very new feature. I don’t know, though, whether the option is available for git pull or git checkout as well. I hope that it will become, or already is, the default behaviour of Git. I am still missing an option to always include the newest commit of a repository as a submodule instead of a specific one.
Update: Apparently, the --recursive option was added in Git 1.6.5, which is still marked unstable in Gentoo Portage.
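With a Git version that has the --recursive option, the command sequences from above should shrink to something like the following sketch. The two function names are made up for illustration; the git options themselves are the ones listed in the man pages.

```shell
#!/bin/sh
# Sketch, assuming Git 1.6.5 or newer. Function names are hypothetical.

# Clone a repository together with all nested submodules.
clone_with_submodules() {
    url=$1      # repository to clone
    target=$2   # directory to clone into
    git clone --recursive "$url" "$target"
}

# Update an existing working copy and all nested submodules.
# Run from the top level of the working copy.
update_with_submodules() {
    git pull &&
    git submodule update --init --recursive
}
```

This still leaves two commands for every update, but at least users no longer need to know how deeply the submodules are nested.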