Splitting a single Git repository into multiple repositories

It’s a dark abyss to face: One repo, multiple projects.

In my case, it was a sensible-at-the-time but horrible way to store my university assignments.

For my assignments’ code, I didn’t get many benefits from cloud-based version control because I just needed quick backups. Plus, I wanted to save my private repo allowance for other projects. (Note: I think this might be less of a problem with GitHub.com now?)

So, all my source code was added to a mega-repo called “assignments”.

But then one day: my requirements changed!

Nowadays, I need the source code to be available as part of my portfolio. For obvious reasons, this means keeping each project in its own repo*.

So, how best to split the mountain?

  1. Create new directories, one for each project
  2. Clone the mega-repo (just once to save time)
  3. Then, for each project:
     A. Copy the mega-repo into the directory
     B. Cull unrelated projects
     C. Upload the remaining individual project

As far as I could tell from a brief glance, the commands used leave the commit history for each project intact, too.

Let me know if that isn’t clear. Otherwise, please enjoy this technical walk-through:

I created a folder per project, with awesome snake_case names.

Figure 1: Fresh directories created to house each project.

Then, using Git Bash, I downloaded a clean copy of the mega-repo into the ORIGINAL_CLONE directory.

cd ORIGINAL_CLONE
git clone https://github.com/<USERNAME>/<MEGA_REPO_NAME>.git

Figure 2: Results of the clone: assignments (root dir) contains the mega-repo.

Then, I copied the whole thing (contents of root dir) into a project directory for extracting. First up: blackjack_game.

Figure 3: Fresh copy of the clone of the mega-repo (phew) ready for extracting.

Then I “extracted” each project from the copy of the mega-repo – for example the blackjack_game[1]:

cd blackjack_game
git filter-branch --prune-empty --subdirectory-filter Programming2_Assignment2/ master

The magic worked and I was left with only the project I care about in its own directory.

Figure 4: The filter-branch command completes and leaves the project extracted.

Finally, upload as usual:

git remote set-url origin https://github.com/<USERNAME>/<NEW_REPO_NAME>.git
git push -u origin master

And repeat the process for each project.
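
If you have more than a handful of projects, the copy / filter / upload loop above can be scripted. What follows is only a minimal sketch, not the exact commands I ran: it builds a tiny throwaway mega-repo in a temp directory so it can be tried safely, and the project names, file names and the name:subdirectory pairs are all made up for the demo. Note that newer Git versions print a warning before filter-branch runs and recommend the separate git filter-repo tool instead; the sketch silences that warning and sticks with filter-branch to match this walk-through.

```shell
#!/usr/bin/env bash
# Sketch: split each top-level directory of a mega-repo into its own repo.
# Runs entirely inside a temp directory; all names below are hypothetical.
set -euo pipefail

export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning (Git 2.24+)

WORK="$(mktemp -d)"
MEGA="$WORK/ORIGINAL_CLONE/assignments"

# Build a tiny stand-in for the cloned mega-repo, with two projects.
mkdir -p "$MEGA"
cd "$MEGA"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir Programming2_Assignment2 Graphics_Assignment1
echo "print('hit me')" > Programming2_Assignment2/blackjack.py
git add . && git commit -q -m "Add blackjack"
echo "triangle" > Graphics_Assignment1/scene.txt
git add . && git commit -q -m "Add graphics scene"
echo "print('stand')" >> Programming2_Assignment2/blackjack.py
git add . && git commit -q -m "Blackjack: add stand"

# new_directory:subdirectory_in_mega_repo pairs (hypothetical)
PROJECTS="blackjack_game:Programming2_Assignment2 graphics_scene:Graphics_Assignment1"

for pair in $PROJECTS; do
  name="${pair%%:*}"
  subdir="${pair##*:}"

  # A. Copy the whole clone (including .git) into the project's directory.
  cp -R "$MEGA" "$WORK/$name"
  cd "$WORK/$name"

  # B. Keep only the commits touching $subdir and make it the new root.
  git filter-branch --prune-empty --subdirectory-filter "$subdir/" master >/dev/null

  # C. In a real run, this is where you would upload:
  #      git remote set-url origin https://github.com/<USERNAME>/<NEW_REPO_NAME>.git
  #      git push -u origin master
  echo "$name: $(git rev-list --count master) commit(s) kept"
done
```

Copying the clone once per project matters because filter-branch rewrites the repository in place; each project needs its own untouched copy to start from.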

Quirks of the job

Fortunately, I didn’t have to deal with restructuring: each project was contained in its own directory in the “assignments” repo, so the filter command was simple.

Unfortunately, I foolishly initialized my GitHub repository with a README.md generated by GitHub.com.

So, when it came to pushing, the Git client saw two unrelated histories and refused to push to the server. A tedious solution: I had to pull (with --allow-unrelated-histories), resolve any conflicts with the blank file, add, commit and THEN I could push.

cd blackjack_game
git pull --allow-unrelated-histories

Another option was to recreate the repository on GitHub.com without the blank README, but for me the website task seemed more tedious than the command-line task.
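
To see the pull-then-push fix in isolation, here is a small sketch that reproduces the situation locally. It is an illustration under assumptions, not a record of my session: a bare repo in a temp directory stands in for the GitHub repo with its generated README, and all paths and file names are invented. Recent Git versions can refuse a pull of diverged branches unless told how to reconcile them, so the sketch passes --no-rebase explicitly, plus --no-edit so the merge commit doesn't open an editor.

```shell
#!/usr/bin/env bash
# Sketch: push an extracted project to a remote that already has a README commit.
# Everything lives in a temp directory; names are hypothetical.
set -euo pipefail

WORK="$(mktemp -d)"

# Stand-in for the GitHub repo that was initialised with a README.md.
git init -q --bare "$WORK/remote.git"
mkdir "$WORK/seed" && cd "$WORK/seed"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo "# blackjack_game" > README.md
git add README.md && git commit -q -m "Initial commit"
git push -q "$WORK/remote.git" master

# Stand-in for the extracted project, whose history shares nothing with the remote.
mkdir "$WORK/blackjack_game" && cd "$WORK/blackjack_game"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo "print('hit me')" > blackjack.py
git add . && git commit -q -m "Add blackjack"
git remote add origin "$WORK/remote.git"

# A plain push is rejected: the remote has a commit the local repo lacks.
if git push -q -u origin master 2>/dev/null; then
  echo "unexpected: push succeeded without merging"
else
  echo "push rejected, as expected"
fi

# Merge the remote's unrelated history into the local branch, then push.
# No conflict here because the two histories touch different files; if the
# project had its own README.md you would resolve, add and commit first.
git pull -q --no-rebase --no-edit --allow-unrelated-histories origin master
git push -q -u origin master
echo "pushed $(git rev-list --count master) commits"
```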

Also, I ended up adding extra steps to remove large binary resources from the graphics projects[2]. This was needed to avoid slowly uploading 50 MB of unnecessary junk per project. It was a dangerous but useful process which involved manually rewriting the Git history; it may get a write-up later if you’re interested.
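
Until that write-up exists, here is a rough sketch of the general idea, assuming the binaries live in one directory. It is not necessarily the exact procedure I followed or the one in [2]: the textures directory and file names are hypothetical, and it uses the index-filter form of filter-branch on a throwaway repo. Try this sort of thing on a copy only, since it rewrites every commit.

```shell
#!/usr/bin/env bash
# Sketch: strip a directory of large binaries from a repo's entire history.
# Runs on a throwaway repo in a temp directory; names are hypothetical.
set -euo pipefail

export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning (Git 2.24+)

WORK="$(mktemp -d)"
cd "$WORK"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir textures
echo "int main() { return 0; }" > main.cpp
head -c 1024 /dev/zero > textures/big.bin   # stand-in for a large binary
git add . && git commit -q -m "Add project with texture"
echo "// tweak" >> main.cpp
git add . && git commit -q -m "Tweak source"

# Remove textures/ from the index of every commit on every branch.
git filter-branch --prune-empty \
  --index-filter 'git rm -r -q --cached --ignore-unmatch textures' \
  -- --all >/dev/null

# filter-branch keeps backups under refs/original; the old objects only
# disappear once those refs and the reflog entries are gone.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

echo "commits kept: $(git rev-list --count master)"
```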

*The obvious reasons: the commit history for the “assignments” mega-repo is barely comprehensible, the front page would become massive even just introducing each project, and viewing one project would require downloading everything. Inelegant.

References

[1] https://help.github.com/en/articles/splitting-a-subfolder-out-into-a-new-repository

[2] https://dalibornasevic.com/posts/2-permanently-remove-files-and-folders-from-a-git-repository

Thanks for reading!
