It’s a dark abyss to face: One repo, multiple projects.
In my case, it was just a “sensible” (read: horrible) way to store my university assignments.
For my assignments’ code, I didn’t get many benefits from cloud-based version control because I just needed quick backups. Plus, I wanted to save my private repo allowance for other projects. (Note: I think this might be less of a problem with GitHub.com now?)
So, all my source code was added to a mega-repo called “assignments”.
But then one day: my requirements changed!
Nowadays, I need the source code to be available as part of my portfolio. For obvious reasons, this means keeping each project in its own repo*.
So, how best to split the mountain?
- Create new directories, one for each project
- Clone the mega-repo (just once to save time)
- Then, for each project:
  A. Copy the mega-repo into the directory
  B. Cull unrelated projects
  C. Upload the remaining individual project
As far as I could tell from a brief glance, the commands used also leave each project’s commit history intact.
Let me know if that isn’t clear. Otherwise, please enjoy this technical walk-through:
I created a folder per project, with awesome snake_case names.
Then, using Git Bash, I downloaded a clean copy of the mega-repo into the ORIGINAL_CLONE directory.
cd ORIGINAL_CLONE
git clone https://github.com/<USERNAME>/<MEGA_REPO_NAME>.git
Then, I copied the whole thing (contents of root dir) into a project directory for extracting. First up: blackjack_game.
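Then I “extracted” each project from its copy of the mega-repo. The --subdirectory-filter option keeps only the named folder and makes it the new repo root, while --prune-empty drops any commits left empty by the rewrite. For example, blackjack_game lived in Programming2_Assignment2/: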
cd blackjack_game
git filter-branch --prune-empty --subdirectory-filter Programming2_Assignment2/ master
The magic worked and I was left with only the project I care about in its own directory.
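If you want to watch the filter work before pointing it at a real repo, here is a minimal sketch that builds a throwaway mega-repo in a temp directory and extracts one folder from a copy of it. Everything except Programming2_Assignment2 and blackjack_game is made up for the demo (the Other_Assignment folder, the file names, the "Demo" identity), and it assumes a reasonably recent Git is installed.

```shell
#!/usr/bin/env bash
# Throwaway demo: everything lives in a temp dir and is safe to delete.
set -eu

# Newer Git versions print a long warning (and pause) before filter-branch runs.
export FILTER_BRANCH_SQUELCH_WARNING=1
# Fixed identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# 1. Build a miniature mega-repo: two project folders, interleaved commits.
git init -q mega
cd mega
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
mkdir Programming2_Assignment2 Other_Assignment   # Other_Assignment is hypothetical
echo "deal cards" > Programming2_Assignment2/blackjack.txt
git add .
git commit -q -m "Add blackjack"
echo "unrelated" > Other_Assignment/notes.txt
git add .
git commit -q -m "Add other assignment"
echo "shuffle deck" >> Programming2_Assignment2/blackjack.txt
git add .
git commit -q -m "Improve blackjack"
commits_before=$(git rev-list --count master)
cd ..

# 2. Copy the whole repo, then filter the copy (the original stays untouched).
cp -R mega blackjack_game
cd blackjack_game
git filter-branch --prune-empty --subdirectory-filter Programming2_Assignment2 master > /dev/null

# 3. Inspect the result: remaining commits and the files now at the repo root.
commits_after=$(git rev-list --count master)
echo "commits before: $commits_before, after: $commits_after"
git --no-pager log --format=%s master
ls
```

In this toy setup the copy keeps only the two commits that touched the blackjack folder, and blackjack.txt moves up to the root of the repo.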
Finally, upload as usual:
git remote set-url origin https://github.com/<USERNAME>/<NEW_REPO_NAME>.git
git push -u origin master
And repeat the process for each project.
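I ran those steps by hand, but the repetition could be scripted. The sketch below is one way to do it, not what I actually used: split_project is a hypothetical helper, and the ray_tracer / Graphics_Assignment1 pair is invented to give the loop a second project. It runs against a throwaway repo in a temp directory and stops before the upload step, since that needs a real remote.

```shell
#!/usr/bin/env bash
set -eu

# Skip the filter-branch warning pause in newer Git versions.
export FILTER_BRANCH_SQUELCH_WARNING=1
# Fixed identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# split_project CLONE_DIR SUBDIR TARGET_DIR   (hypothetical helper)
# Copies the clean clone to TARGET_DIR (which must not exist yet), then
# rewrites the copy so only SUBDIR's files and history remain.
split_project() {
  local clone_dir="$1"
  local subdir="$2"
  local target_dir="$3"
  cp -R "$clone_dir" "$target_dir"
  (
    cd "$target_dir"
    git filter-branch --prune-empty --subdirectory-filter "$subdir" master > /dev/null
  )
}

# --- Demo on a throwaway mega-repo ---
work=$(mktemp -d)
git init -q "$work/ORIGINAL_CLONE"
(
  cd "$work/ORIGINAL_CLONE"
  git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
  for subdir in Programming2_Assignment2 Graphics_Assignment1; do
    mkdir "$subdir"
    echo "code for $subdir" > "$subdir/main.txt"
    git add "$subdir"
    git commit -q -m "Add $subdir"
  done
)

# Each entry is "new_project_dir:folder_in_mega_repo".
projects="blackjack_game:Programming2_Assignment2 ray_tracer:Graphics_Assignment1"
for pair in $projects; do
  split_project "$work/ORIGINAL_CLONE" "${pair#*:}" "$work/${pair%%:*}"
done

blackjack_commits=$(git -C "$work/blackjack_game" rev-list --count master)
ray_tracer_commits=$(git -C "$work/ray_tracer" rev-list --count master)
echo "blackjack_game: $blackjack_commits commit(s), ray_tracer: $ray_tracer_commits commit(s)"
```

Copying from one clean clone each time is the point of the "clone just once" step: every project starts from the same untouched history, and a botched filter only costs you one copy.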
Quirks of the job
Fortunately, I didn’t have to deal with restructuring because each project was contained in its own directory in the “assignments” repo, so the filter command was simple.
Unfortunately, I foolishly initialized my GitHub repository with a README.md generated by GitHub.com.
So, when it came to pushing, the Git client saw two unrelated histories and refused to push to the server. A tedious solution: I had to pull (with --allow-unrelated-histories), resolve any conflicts with the blank file, add, commit and THEN I could push.
cd blackjack_game
git pull --allow-unrelated-histories
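To reproduce the situation without touching GitHub, here is a small sketch that fakes it locally: a bare repo stands in for the server-side repo with its generated README, and a second repo stands in for the extracted project. The paths, file contents and "Demo" identity are all assumptions for the demo. I also pass --no-rebase and --no-edit, which I didn’t need at the time but which keep newer Git versions from stopping to ask questions.

```shell
#!/usr/bin/env bash
set -eu

# Fixed identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the GitHub repo that was created with an auto-generated README.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "# blackjack_game" > README.md
git add README.md
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed remote.git

# Stand-in for the freshly extracted project, with its own separate history.
git init -q blackjack_game
cd blackjack_game
git symbolic-ref HEAD refs/heads/master
echo "deal cards" > blackjack.txt
git add blackjack.txt
git commit -q -m "Add blackjack"
git remote add origin "$work/remote.git"

# A plain push is rejected because the remote has a commit we do not have.
if git push -q origin master 2> /dev/null; then
  first_push="accepted"
else
  first_push="rejected"
fi

# Merge the two histories. --no-rebase and --no-edit stop newer Git versions
# from asking how to reconcile the branches or opening an editor.
git pull -q --no-rebase --no-edit --allow-unrelated-histories origin master
git push -q -u origin master

total_commits=$(git rev-list --count master)
echo "first push: $first_push, commits after merge: $total_commits"
```

There is no conflict in this toy version because the two histories share no files; the result is simply both commits joined by a merge commit.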
Another option was to recreate the repository on GitHub.com without the blank README, but for me the website task seemed more tedious than the command-line task.
Also, I ended up adding extra steps to remove large binary resources from the graphics projects. This was needed to avoid slowly uploading 50 MB of unnecessary junk per project. It was a dangerous but useful process which involved manually overwriting the Git history; it may get a write-up later if you’re interested.
*The obvious reasons: the commit history for the “assignments” mega-repo is barely comprehensible, the front page would become massive just introducing each project, and viewing even one project would require downloading everything. Inelegant.