How to recover a Google Code SVN Project and migrate to GitHub
Google Code shut down a year ago, and its “Migrate to GitHub” feature was recently disabled as well.
Because I still had an old project hosted there (and some people were asking about it recently), I had to find another way to migrate the project to GitHub.
These are the steps I took:
- Download the SVN dump of the repo:
wget https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/spring-security-facelets-taglib/repo.svndump.gz
- Unzip:
gunzip repo.svndump.gz
- Create the repo:
svnadmin create /tmp/testgc
- Restore it:
svnadmin load /tmp/testgc/ < repo.svndump
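- Launch a local svn daemon (without the -r option it serves paths relative to the filesystem root, which is why the URLs below contain /tmp/testgc):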
svnserve --foreground -d
- In another terminal, check out the repo from localhost in order to generate authors.txt:
svn checkout svn://localhost/tmp/testgc/ testgc-svn
- Enter the svn repo:
cd testgc-svn/
- Dump the list of authors to authors.txt:
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > ../authors.txt
- Edit the file so that each name on the left stays unchanged, and the right side holds that person's Git hosting account name followed by their email address in angle brackets. If one person used several svn credentials, map all of them to the same Git identity:
author1@mail.com = author1 <author1@mail.com>
author1 = author1 <author1@mail.com>
author2 = author2 <author2@mail.com>
- Exit the svn repo:
cd ..
- Clone your repo now using git svn [1]:
git svn --stdlayout -A authors.txt clone svn://localhost/tmp/testgc/
- Go into the cloned git repo:
cd testgc/
- Add the upstream GitHub repo:
git remote add origin https://github.com/domdorn/spring-security-facelets-taglib.git
- Push it:
git push --set-upstream origin master
- So far, only the trunk / master branch has been pushed
- Get Atlassian's svn-migration-scripts.jar from https://bitbucket.org/atlassian/svn-migration-scripts/downloads:
wget https://bitbucket.org/atlassian/svn-migration-scripts/downloads/svn-migration-scripts.jar
- Run the scripts and inspect the suggested actions:
java -Dfile.encoding=utf-8 -jar svn-migration-scripts.jar clean-git
- If you like what you see (usually you do), perform the actions:
java -Dfile.encoding=utf-8 -jar svn-migration-scripts.jar clean-git --force
- After this, I had a branch structure like this:
git branch -a
* master
remotes/origin/0.2_nate
remotes/origin/0.4_gblaszczyk
remotes/origin/jsf-1.2-spring-2
remotes/origin/jsf-1.2-spring-3
remotes/origin/jsf-2.0-spring-2
remotes/origin/jsf-2.0-spring-3
remotes/origin/master
remotes/origin/site
remotes/origin/site@17
remotes/origin/tags/0.1
remotes/origin/tags/0.3_jsf-1.2-spring-2
remotes/origin/tags/0.3_jsf-1.2-spring-3
remotes/origin/tags/0.3_jsf-2.0_spring-2
remotes/origin/tags/0.3_jsf-2.0_spring-3
remotes/origin/tags/0.5
remotes/origin/trunk
- Check out each branch (except tags and trunk) and push it:
for i in $(git branch -r | grep -v 'tags\|trunk'); do git checkout "${i#origin/}"; git push; done
- Push the branches:
git push --all origin
- Check out each tag and create a tag with the same name:
for i in $(git branch -r | grep 'tags'); do git checkout "$i"; git tag "${i#origin/tags/}"; done
- Push the tags:
git push --tags origin
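For the author-mapping step earlier in the list, most of the manual editing can be avoided if all committers share one mail domain. The following is only a sketch under that assumption: the function name svn_authors and the default domain example.com are placeholders that are not part of the original procedure. It reads `svn log -q` output on stdin and prints one authors.txt line per distinct committer, reusing the svn name as the email address when it already contains an "@".

```shell
# Sketch: build git-svn authors-file lines from `svn log -q` output on stdin.
# svn_authors and the example.com default are hypothetical names; adjust them.
svn_authors() {
    domain="${1:-example.com}"
    awk -F '|' -v domain="$domain" '
        /^r[0-9]+ / {
            name = $2
            gsub(/^ +| +$/, "", name)            # trim padding around the author field
            if (name == "" || seen[name]++) next # skip empty and repeated names
            # svn names that already look like an address are used as the email
            email = (name ~ /@/) ? name : name "@" domain
            print name " = " name " <" email ">"
        }' | sort
}

# Typical use inside the checked-out working copy:
#   svn log -q | svn_authors mail.com > ../authors.txt
```

The result still needs a manual pass to merge several svn credentials of one person into a single Git identity, since that cannot be guessed from the log.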
Thanks to @chrsmith for responding so quickly to my Google Code email (4 minutes, wow!) and telling me how to download the repo in the svn dump file format.
Thanks to Atlassian for their svn migration scripts.
You can see the result of this work at my Spring Security JSF Taglib GitHub Project.
[1]: Use the --no-metadata switch if you don't want every commit message to contain "git-svn-id ...", or clean it up afterwards, as described at http://stackoverflow.com/questions/16092509/how-to-remove-svn-url-from-commit-messages
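As an illustration of the "clean it up afterwards" route from the footnote, a message filter along these lines should do the job. This is a sketch, not a tested part of the migration described here: the function name strip_git_svn_id and the sample commit message are made up for the example, while git filter-branch and its --msg-filter option are standard Git.

```shell
# Sketch: message filter that drops the "git-svn-id: ..." line git svn appends.
# strip_git_svn_id is a hypothetical helper name.
strip_git_svn_id() {
    sed -e '/^git-svn-id:/d'
}

# Try it on a sample commit message (URL, revision and UUID are invented):
printf 'Fix taglib\n\ngit-svn-id: svn://localhost/tmp/testgc/trunk@42 0000-uuid\n' \
    | strip_git_svn_id

# To apply the same sed expression to every commit of the cloned repo:
#   git filter-branch --msg-filter 'sed -e "/^git-svn-id:/d"' -- --all
```

Note that this rewrites history and removes the metadata git svn relies on, so it only fits a one-way migration and is best done before the first push.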