Candid’s brain

Git submodules

Next example why the current Git submodule support is completely shit. I have various PHP applications that use the same Java backend, calling it using exec(). Now each of these PHP applications has its own Git repository, the Java backend has one, too. The PHP repositories include the Java repository as submodules, so, if someone clones one of these PHP repositories, they have to run git submodule update --init after git clone.

If I make an update now in my Java library and commit that to the public repository, the new version won’t be used automatically in the PHP applications. Instead, I have to run these commands in all PHP applications:

cd java
git pull
cd ..
git commit -a
git push

After updating the working space of the Java submodule (using git pull), it appears like a modified file in the PHP application repository, so I have to commit the change.

Users of the PHP applications now cannot just run git pull to get the new version of the application (including the new version of the Java submodule), instead they have to run an additional git submodule update after that so that the working space in the submodule gets updated, too. So you have to tell your users that they can’t just git pull changes, but instead they have to run an additional command every time.

Now things get even funnier: The Java library requires an external library to work, so it includes a submodule itself. The thing is, when people download a PHP application and load the Java submodule using git submodule update --init, the submodules of the Java submodule won’t be initialised automatically. So users have to run the following commands to get a working copy of my PHP application after git clone:

git submodule update --init
cd java
git submodule update --init

Now imagine that the external library used by my Java library introduces a new feature that I begin to use in my Java library. I have to update the submodule of the external library in my Java library and commit that change. Then I have to update the Java library submodules in all my PHP applications and commit these. Imagine what a user of my PHP application has to run every time he wants to update his working space to a working new version:

git pull
git submodule update
cd java
git submodule update

My projects are rather small, imagine what you’d have to do to update a working copy of a huge project…

When I use a web application on my server (such as an webmail client or phpmyadmin), I usually check out the stable branch of their SVN repository and run svn up every now and then to get new updates. I don’t need to know anything about the repository structure of the projects to do that. With Git, I would have to know in which directories I would have to update the submodules manually, or, alternatively, there could be a shell script to update the working copy, which I would have to remember to call instead of git pull. This makes things unbelievably complicated. I hope Git will at least introduce a feature that automatically fetches the submodules on git clone or git pull.

Update: Man pages on the internet list a --recursive option for both git submodule as well as git clone that does exactly this. On none of my computers, these are supported by the installed Git version yet, so it must be a very new feature. I don’t know though if the option is available for git pull or git checkout as well. I hope that it will become or already be the default behaviour of Git. Yet I am missing an option to always include the newest commit of a repository as a submodule instead of a specific one.

Update: Oviously, the --recursive option was added in Git 1.6.5, which is still marked unstable in Gentoo Portage.

Filed under bugs

Comments are closed.