We recently started preparing to open source trackr, our Java-based application that we’ve written about before. It wasn’t built with that in mind, so there are some files in the commit history that shouldn’t be exposed to the public.
Until we introduced OAuth, the frontend static files were delivered from within the WAR file. We changed that with OAuth, but of course the files are still in the old commits. Since we are using a commercial template we cannot keep these files. So one objective was to remove the whole frontend from the commit history.
Some files contained passwords or URLs to our testing system which we don’t want to be published, either. That means editing these files in the commit history.
Important: What I am about to describe will alter your history and should only be used under very specific circumstances.
You will not be able to push without --force after altering history, and you really should not do that. The Git book describes git filter-branch as the nuclear option.
In our case I created a new clone of our repository, edited it and then we pushed the edited content to a new repository.
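The workflow above can be sketched roughly as follows. This is only an illustration under assumptions: the directories original, rewrite and public.git are hypothetical stand-ins created in a temporary folder, not our real repositories.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the existing private repository (hypothetical content).
git init -q "$work/original"
cd "$work/original"
git config user.email "dev@example.com"
git config user.name "Dev"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Stand-in for the new, empty repository that will become public.
git init -q --bare "$work/public.git"

# 1. Work in a fresh clone so the original repository stays untouched.
#    Note: a plain clone only creates a local branch for the checked-out
#    branch; other branches have to be checked out locally before pushing.
git clone -q "$work/original" "$work/rewrite"
cd "$work/rewrite"

# 2. ... rewrite the history here ...

# 3. Point the clone at the new repository and push all local branches
#    (tags would need a separate "git push origin --tags").
git remote set-url origin "$work/public.git"
git push -q origin --all

pushed=$(git --git-dir="$work/public.git" rev-list --all --count)
echo "commits in new repository: $pushed"
```

Because the rewritten history only ever goes to a new repository, nobody has to force-push over commits that other people already pulled.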
I want to edit a file or remove a folder in my entire git history in all branches.
Since the only two tools I knew to alter history in git were git commit --amend, which certainly would not help here, and git rebase --interactive, I started with rebasing. I searched for the commits I needed to edit with git log -Gword and marked them as edit in an interactive rebase.
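To make the search step concrete, here is a small sketch in a throwaway repository. The file app.properties and its contents are made up for the example; only git log -G itself is what I used.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

printf 'db.user=app\n' > app.properties
git add app.properties
git commit -q -m "Add config"

printf 'db.user=app\ndb.password=secret\n' > app.properties
git commit -q -am "Add credentials"

echo "docs" > README
git add README
git commit -q -m "Add README"

# -G<regex> lists the commits whose diff adds or removes a line matching
# the regex, here across all refs.
git log --all --oneline -Gpassword
hits=$(git log --all --oneline -Gpassword | wc -l | tr -d ' ')
echo "commits touching 'password': $hits"

# Each commit found this way can then be marked as "edit" in
#   git rebase --interactive <commit>^
```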
This worked, but the approach has two problems:
- It’s cumbersome and takes a long time. I had to resolve multiple conflicts when continuing the rebase.
- It doesn’t work very well when you have multiple branches that were merged at some point.
We have a development branch and I only rebased the development branch. But that destroyed our merge history. Just rebasing master would not work either.
So I started googling around.
You will find git filter-branch fairly quickly, and there are actually a lot of good descriptions on how to use it.
What git filter-branch did for me was the following: You can execute a shell command on the working directory for every commit in every branch.
If the command changes something, the commit will be altered.
There are other options to change the message, the author, or files in the index.
How does it look?
git filter-branch --tree-filter 'some command' -- --all
Another great thing about git filter-branch is that you won’t get merge conflicts.
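As a sketch of one of the other filters mentioned above: --msg-filter receives each commit message on standard input and uses whatever the command prints as the new message. The host name test.internal.example and the repository contents are invented for this illustration.

```shell
set -e
# Skip the warning and delay that newer git versions show for filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "trackr" > README
git add README
git commit -q -m "Deploy to test.internal.example"

# Rewrite every commit message in every branch, replacing the host name.
git filter-branch --msg-filter \
  'sed "s/test\.internal\.example/[redacted]/"' -- --all >/dev/null

subject=$(git log -1 --format=%s)
echo "new subject: $subject"
```

The same pattern applies to --env-filter (author and committer data) and --index-filter (files in the index); only the kind of command you pass changes.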
To remove passwords from a specific file, e.g. src/main/resources/application.properties, I tried to use
git filter-branch --tree-filter 'sed -i "" s/password//g src/main/resources/application.properties' -- --all
Problem: what if the file does not exist in a commit (which happened in our example)?
So we got a sed: src/main/resources/application.properties: No such file or directory error and the rewrite was aborted, because git filter-branch stops as soon as the filter command fails.
But since we’re executing a shell script, why not include a check for the file?
git filter-branch --tree-filter 'if [ -f src/main/resources/application.properties ]; then sed -i "" s/password//g src/main/resources/application.properties; fi' -- --all
And that will replace password with nothing in application.properties in all commits.
If the file wasn’t present in a commit, that commit won’t be altered.
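If you want to try the guarded filter on a scratch repository first, something like the following should do. Two things are assumptions of mine and differ from the command above: the db.password line is a made-up example, and instead of sed -i "" (the BSD/macOS form, which GNU sed on Linux does not accept) the sketch writes to a temporary file and moves it back so it runs with either sed.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# First commit: the properties file does not exist yet.
echo "trackr" > README
git add README
git commit -q -m "First commit without properties file"

# Second commit: the file shows up, including a secret.
mkdir -p src/main/resources
printf 'db.user=app\ndb.password=secret\n' > src/main/resources/application.properties
git add src/main/resources/application.properties
git commit -q -m "Add properties"

# Guarded tree-filter: only touch the file in commits that contain it.
git filter-branch --tree-filter '
  f=src/main/resources/application.properties
  if [ -f "$f" ]; then
    sed "s/^db\.password=.*/db.password=/" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  fi
' -- --all >/dev/null

# Check the rewritten branches. --all would also include the backup refs
# that filter-branch keeps under refs/original, which still hold the old commits.
leaks=$(git log --branches --oneline -Gsecret | wc -l | tr -d ' ')
commits=$(git rev-list --count HEAD)
echo "commits still containing the secret: $leaks of $commits"
```

The backup refs are the reason the check uses --branches: the old commits stay in the local repository until refs/original is removed, but they are not part of the branches you push.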
Removing a Folder
Additionally I wanted to remove the folder that contained the frontend files.
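This again is very easy with
git filter-branch --tree-filter 'rm -rf src/main/webapp/WEB-INF/app' -- --all
The -f flag keeps rm from failing in commits where the folder does not exist, which would otherwise abort the rewrite just like the missing file did above.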
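A variant I did not use, but which the git documentation describes for this kind of job, is --index-filter: it removes the paths from the index without checking out every commit, which tends to be faster. The sketch below runs on a scratch repository with invented file contents; --ignore-unmatch covers commits where the folder is missing, and --prune-empty drops commits that end up with no changes.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "trackr" > README
git add README
git commit -q -m "Initial commit"

# This commit only adds frontend files.
mkdir -p src/main/webapp/WEB-INF/app
echo "<html></html>" > src/main/webapp/WEB-INF/app/index.html
git add src
git commit -q -m "Add frontend"

echo "more docs" >> README
git commit -q -am "Update README"

# Remove the folder from the index of every commit in every branch.
git filter-branch --prune-empty --index-filter \
  'git rm -r -q --cached --ignore-unmatch src/main/webapp/WEB-INF/app' \
  -- --all >/dev/null

touching=$(git log --branches --oneline -- src/main/webapp/WEB-INF/app | wc -l | tr -d ' ')
remaining=$(git rev-list --count HEAD)
echo "commits touching the folder: $touching, commits left: $remaining"
```

In this scratch history the "Add frontend" commit becomes empty after the rewrite, so --prune-empty removes it and two commits remain.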
While I consider myself pretty well versed in git, this was the first time I really used git filter-branch. So, found an error? Leave a comment. Did I do something stupid? Leave a comment! Not sure if you want to use it on your own repository? Copy the repository and try it out first!