Depending on how your brain works, one of the following explanations will make sense to you:
It often happens that while working on one project, you need to use another project from within it. Perhaps it’s a library that a third party developed or that you’re developing separately and using in multiple parent projects. A common issue arises in these scenarios: you want to be able to treat the two projects as separate yet still be able to use one from within the other - Git-scm documentation
Git submodules allow you to keep a git repository as a subdirectory of another git repository. Git submodules are simply a reference to another repository at a particular snapshot in time. Git submodules enable a Git repository to incorporate and track version history of external code - Atlassian documentation
So, you might have a Git repository which requires another Git repository to be downloaded as a sub-folder. The naive method would be to git clone or download it, and commit the entire repository. However, there’s no easy way of syncing changes between your copy and the original repo. Additionally, you need to commit a potentially large number of files.
The solution to this is to use Git Submodules:
- If there’s a change in the source repo of one of your dependencies, you can selectively sync it to stay up to date
- If you make a change to your local copy, you can commit it and propagate it back
For example, I have a Python typeshed-inspired library of my own called typesieve. I add it as a development dependency of sorts to many of my projects, and develop it further while working on the projects. A contribution in one copy of the library is then pulled to others, purposefully and as needed, with all the advantages that Git brings.
The docs do a good job of explaining more complex workflows, but the simple approach I use is as follows:
Add a submodule to a repository
Let’s say you want to add typesieve as a submodule to an existing Git repo – do so by entering the following command from within the repo:
$ git submodule add https://github.com/alknemeyer/typesieve
You can add it under a different name (let’s say sievetype) in a different folder:
$ git submodule add https://github.com/alknemeyer/typesieve my-folder/sievetype
This adds an entry like the one below to a .gitmodules file in the parent repository (by default, the submodule is named after its path):
[submodule "my-folder/sievetype"]
path = my-folder/sievetype
url = https://github.com/alknemeyer/typesieve
git status will tell you something like,
alex@scruffy ~/P/blog (main)> git status
On branch main
Your branch is up to date with 'origin/main'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: .gitmodules
new file: typesieve
Staging and committing typesieve will cause it to be tracked in the parent project’s Git history, as if it were a single file. More on this in a moment!
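To try this without touching a real project, the sketch below runs the same steps in a throwaway sandbox. The directory names (lib, parent), the demo identity, and the use of a local path in place of the GitHub URL are all assumptions made for illustration. The protocol.file.allow=always setting is only there because recent Git versions refuse to clone submodules from local paths by default.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"

# Identity for the demo commits, so no global Git config is needed.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A local stand-in for the typesieve repository, with one commit.
git init -q lib
echo "# typesieve stand-in" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "initial commit"

# The parent project that will contain the submodule.
git init -q parent
cd parent
git commit -q --allow-empty -m "initial commit"

# Same command as in the text, but pointing at the local stand-in.
git -c protocol.file.allow=always submodule add "$sandbox/lib" typesieve

# .gitmodules now maps the submodule name to its path and URL.
cat .gitmodules

# The index holds a single entry for the submodule, with mode 160000.
git ls-files --stage typesieve
```

The 160000 mode in the last command's output is the same one that shows up in the index line of a submodule diff: it marks the entry as a pointer to a commit rather than a regular file.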
Download submodules to disk
By default, git clone won’t actually download submodules – do so by using the --recurse-submodules flag:
$ git clone --recurse-submodules https://github.com/alknemeyer/myproject
If the project is already cloned, download the submodules using,
$ git submodule init
$ git submodule update
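The two commands can also be combined into git submodule update --init, and adding --recursive picks up nested submodules too. The sketch below shows this on a sandbox clone; the directory names (lib, parent, checkout) and the local-path URL are assumptions for illustration, and protocol.file.allow=always is only needed because the stand-in "remote" is a local directory.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in library, and a parent project that records it as a submodule.
git init -q lib
echo "# typesieve stand-in" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "initial commit"

git init -q parent
git -C parent commit -q --allow-empty -m "initial commit"
git -C parent -c protocol.file.allow=always submodule add "$sandbox/lib" typesieve
git -C parent commit -q -m "add typesieve submodule"

# A plain clone creates the submodule directory but leaves it empty.
git clone -q parent checkout
ls -A checkout/typesieve

# Initialize and update in one step, including any nested submodules.
git -C checkout -c protocol.file.allow=always submodule update --init --recursive
ls -A checkout/typesieve
```

The first ls prints nothing, while the second lists the submodule's files once they have been checked out.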
Managing changes
Let’s say that two repositories (A and B) have typesieve as a submodule. While working on project A you make a change to typesieve. The change is committed and pushed using a normal Git workflow (i.e., by running Git commands from the typesieve/ directory). You’ll then need to stage and commit the new version of the submodule to project A’s Git history.
Since the parent project only “sees” the submodule as if it were a single file, a diff will indicate that a different commit is being used:
alex@scruffy ~/P/blog (main)> git diff typesieve/
diff --git a/typesieve b/typesieve
index 15845cf..c289736 160000
--- a/typesieve
+++ b/typesieve
@@ -1 +1 @@
-Subproject commit 15845cff62182149a1600b066b1f9754b7c1d920
+Subproject commit c28973600691c0f2ce16fe1bfc316c013731cc7f
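The project A workflow can be sketched end to end in a sandbox. The repository names and file contents are hypothetical, the library is a local stand-in rather than the real typesieve, and the push step is left out because the stand-in is not a proper remote.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in library and project A, which records it as a submodule.
git init -q lib
echo "v1" > lib/notes.txt
git -C lib add notes.txt
git -C lib commit -q -m "initial commit"

git init -q A
git -C A commit -q --allow-empty -m "initial commit"
git -C A -c protocol.file.allow=always submodule add "$sandbox/lib" typesieve
git -C A commit -q -m "add typesieve submodule"

# Change the library from inside project A's copy and commit it there.
# (A real workflow would also run git push from this directory.)
echo "v2" > A/typesieve/notes.txt
git -C A/typesieve commit -q -am "update notes"

# A leading "+" means the checked-out commit differs from the recorded one.
git -C A submodule status

# Record the new commit in project A's history.
git -C A add typesieve
git -C A commit -q -m "update typesieve version"

# The leading "+" is gone: recorded and checked-out commits match again.
git -C A submodule status
```

git submodule status is a quick alternative to git diff when you only want to know whether the parent project's pointer is out of date.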
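Now, in project B, typesieve was changed on the remote repository. You’re not sure whether you want to update your local copy. As answered here, you can fetch the remote changes without applying them and then diff against the remote, working from the typesieve/ directory:
$ git fetch
$ git diff origin/main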
and if you want to accept the remote changes, enter
$ git merge origin/main
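As in project A, you’ll then need to stage and commit the new submodule version to project B’s Git history. A summary of some commands to see changes before pulling from a remote Git repository is here.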
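Put together, the project B side might look like the sandbox sketch below. The names are hypothetical, the "remote" is a local stand-in repository, and --ff-only is my own cautious choice (it refuses to merge if the local copy has diverged) rather than something the workflow requires.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the remote typesieve repository, on a branch named main.
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo "v1" > lib/notes.txt
git -C lib add notes.txt
git -C lib commit -q -m "initial commit"

# Project B records the library as a submodule.
git init -q B
git -C B commit -q --allow-empty -m "initial commit"
git -C B -c protocol.file.allow=always submodule add "$sandbox/lib" typesieve
git -C B commit -q -m "add typesieve submodule"

# Meanwhile, the library gains a commit upstream.
echo "v2" > lib/notes.txt
git -C lib commit -q -am "update notes"

# From the submodule directory: fetch, inspect, then fast-forward.
cd B/typesieve
git fetch -q origin
git log --oneline HEAD..origin/main   # commits we do not have yet
git diff HEAD origin/main             # what accepting them would change
git merge -q --ff-only origin/main

# Back in project B, record the new submodule commit.
cd ..
git add typesieve
git commit -q -m "update typesieve version"
```

Nothing in the submodule's working tree changes until the merge, so the fetch and the two inspection commands are safe to run at any time.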
Managing changes in VS Code
It’s worth mentioning that VS Code handles Git Submodules quite well. Let’s look at the workflow for my blog, which has hugo-PaperMod as a submodule for the theme. I navigate to the Git tab and see this:
Okay, we’re 38 commits behind! I can scroll through the changes if I’m feeling cautious/curious:
alex@scruffy ~/P/b/t/hugo-PaperMod (master)> git diff origin/master
diff --git a/assets/css/search.css b/assets/css/search.css
index 275cbe1..d788cd2 100644
--- a/assets/css/search.css
+++ b/assets/css/search.css
@@ -1,4 +1,4 @@
-#searchbox input {
+.searchbox input {
padding: 4px 10px;
Looks fine. Pull the changes by clicking the circular arrow, so that they arrive on disk. Next up, we stage and commit the new version to the parent project (my blog), adding a message like “update hugo-PaperMod version”.
Recognizing submodules on GitHub
On GitHub, submodules show up as follows:
Clicking the link takes you to the submodule’s repository, with the tree corresponding to the shown hash
Removing a submodule
Uh oh, you goofed and want to clean up your mess. Remove the submodule using,
$ git rm typesieve
This removes the submodule’s working tree and its entry from .gitmodules, but intentionally leaves behind .git/modules/typesieve so that you can undo the change using, for example, git reset --hard as mentioned by this Stack Overflow user. Delete it using,
$ rm -rf .git/modules/typesieve/
You’ll also need to delete the relevant section in .git/config. There’s official-looking documentation on it here. This thread on Stack Overflow also has more information. Removing submodules seems to be a strangely messy/manual affair!
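One commonly used sequence that avoids editing .git/config by hand is to run git submodule deinit first, since it removes the submodule's section from .git/config for you. The sketch below tries it in a sandbox; the repository names and the local-path URL are assumptions for illustration, and this ordering is a variation on the steps above, not the only way to do it.

```shell
set -eu

# Throwaway sandbox; every name below is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in library and a parent project that records it as a submodule.
git init -q lib
echo "# typesieve stand-in" > lib/README.md
git -C lib add README.md
git -C lib commit -q -m "initial commit"

git init -q parent
cd parent
git commit -q --allow-empty -m "initial commit"
git -c protocol.file.allow=always submodule add "$sandbox/lib" typesieve
git commit -q -m "add typesieve submodule"

# 1. Unregister the submodule: this clears its section from .git/config
#    and empties its working tree.
git submodule deinit -f typesieve

# 2. Remove the submodule entry from the index and from .gitmodules.
git rm -q -f typesieve

# 3. Delete the cached repository that Git keeps around for re-use.
rm -rf .git/modules/typesieve

git commit -q -m "remove typesieve submodule"
```

Step 3 is the point of no return: until then, the submodule's history is still available locally under .git/modules/typesieve.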