Revise hg Repo

From JoBaPedia
Revision as of 13:12, 17 October 2011 by Joachim (talk | contribs)

Revise a HG Repository

I have a Mercurial repository that I have wanted to publish for a long time. But its history contains some information (files, author names, commit messages) that is not intended for a public audience.

How do I get rid of this in the repository's entire history and not just at tip?

Ok, this literally took months to figure out, but in the end it was so easy.

Prerequisites are

  • You have full control over all clones: they have to be deleted and replaced by the revised repository, or the unrevised information will come back during syncs.
  • Merges need to be checked manually (this worked for me, but may not work in every case).
  • No branching (branches probably need manual intervention).

If you know the changes are confined to one revision and one file (or at least a small number of them), it may be better to use the quilt (mq) extension.

If it is only about author changes, it may be better to use the convert extension.

Method that works without hg extensions

This works with many changes in files and metadata.

start=5    # no changes needed up to and including this revision
end=10     # last repository revision
hg clone originalrepo revisedrepo --rev $start
cd originalrepo
# dump the complete original history for the final checks
hg log -gp >/tmp/toxic-repo-history-unfolded.txt
# export every later changeset as a git-style patch and rewrite it
i=$((start + 1))
while [ $i -le $end ]; do
   hg export --rev $i -g >/tmp/changelist-$i.export
   sed -i 's/toxic-pattern/good-pattern/g' /tmp/changelist-$i.export
   i=$((i+1))
done
cd -
cd revisedrepo
# replay the rewritten patches on top of the clean clone
i=$((start + 1))
while [ $i -le $end ]; do
   # stop at the first patch that does not apply
   hg import /tmp/changelist-$i.export || break
   i=$((i+1))
done
hg log -gp >/tmp/good-repo-history-unfolded.txt
cd -
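The export files carry the author, date and commit message in a header ahead of the diff, which is why a single sed pass rewrites metadata and file content alike. The following sketch shows this on a hand-written stand-in for an export file, so it needs no repository. The author name, the file content and the node IDs are made up for illustration.

```shell
# Hand-written stand-in for an `hg export -g` file; all values are made up.
patch=$(mktemp)
cat > "$patch" <<'EOF'
# HG changeset patch
# User toxic-pattern <someone@example.com>
# Date 1318857120 -7200
# Node ID 0000000000000000000000000000000000000000
# Parent  0000000000000000000000000000000000000000
Mention toxic-pattern in the commit message

diff --git a/test.txt b/test.txt
--- a/test.txt
+++ b/test.txt
@@ -1,1 +1,1 @@
-hello
+hello toxic-pattern
EOF

# Same substitution as in the loop above, written to a new file
# instead of editing in place.
sed 's/toxic-pattern/good-pattern/g' "$patch" > "$patch.revised"

# Count the lines that still contain the old pattern and the lines
# that now contain the new one.
remaining=$(grep -c 'toxic-pattern' "$patch.revised" || true)
replaced=$(grep -c 'good-pattern' "$patch.revised" || true)
echo "remaining=$remaining replaced=$replaced"
```

The author line, the commit message and the added file line are all rewritten in one pass. A pattern that is too broad will therefore also change text you wanted to keep, so it is worth reading a few rewritten export files before importing them.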

Method using quilt extension

This works for a few changes in a few files, but not for metadata changes.

This example edits test.txt, which was wrong in hg rev 1, and another.txt, which was wrong in rev 3.

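hg clone orig revised
cd revised/
vi .hg/hgrc            # add the following lines:
  [diff]
  git = True
  [extensions]
  mq=
hg qimport -r 1:tip    # put revisions 1..tip under mq control (1.diff, 2.diff, ...)
hg qpop -a             # unapply all patches
hg qpush 1.diff        # apply up to the first patch that needs fixing
vi test.txt
hg qrefresh            # fold the fix into 1.diff
hg qpush 3.diff
vi another.txt
hg qrefresh
hg qpush -a            # reapply the remaining patches
hg qfinish -a          # turn the applied patches back into regular changesets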

Method using convert extension
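
For author-only changes, the convert extension can rewrite author names through an author map. This is a minimal sketch, not a tested recipe for this repository: the author names are hypothetical placeholders, and orig and revised are assumed repository names. The convert extension ships with Mercurial but has to be enabled, which the --config switch does here for a single call.

```shell
# Author map: one "old author = new author" line per name to rewrite.
# Both names below are hypothetical placeholders.
authormap=$(mktemp)
cat > "$authormap" <<'EOF'
Private Name <private@example.com> = Public Name <public@example.org>
EOF

# convert_authors SRC DEST: write a converted copy of SRC to DEST,
# replacing author names according to the map. SRC is left untouched.
convert_authors() {
    hg --config extensions.convert= convert --authormap "$authormap" "$1" "$2"
}

# Usage (not run here): convert_authors orig revised
```

Like the other methods, this changes the changeset hashes, so the prerequisite about replacing all clones still applies.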


Final Checks

That's it. Check thoroughly whether it worked:

Check the changes between the toxic and the good repo. This should only output lines that were expected to change from toxic to good, plus the changeset hashes of the rewritten revisions.

diff -U0 /tmp/toxic-repo-history-unfolded.txt /tmp/good-repo-history-unfolded.txt

Check for toxic patterns in the good repo's history. This should not output anything.

grep toxic-pattern /tmp/good-repo-history-unfolded.txt
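
With several patterns to check, the grep can be wrapped in a small helper that also fails loudly when the history file is missing. This is only a sketch: check_clean is a hypothetical helper name, and the demonstration runs on throwaway files rather than a real hg log.

```shell
# check_clean PATTERN FILE: succeed only if PATTERN does not occur in FILE.
check_clean() {
    # A missing file must not be reported as clean.
    [ -r "$2" ] || { echo "FAIL: cannot read $2" >&2; return 2; }
    # grep prints any offending lines with their line numbers.
    if grep -n -e "$1" "$2"; then
        echo "FAIL: '$1' still present in $2" >&2
        return 1
    fi
    echo "OK: '$1' not found in $2"
}

# Demonstration on throwaway files (hypothetical content).
good=$(mktemp); bad=$(mktemp)
printf 'user: good-pattern\n' > "$good"
printf 'user: toxic-pattern\n' > "$bad"

good_status=0; check_clean toxic-pattern "$good" || good_status=$?
bad_status=0;  check_clean toxic-pattern "$bad" 2>/dev/null || bad_status=$?
echo "good_status=$good_status bad_status=$bad_status"
```

Against the real history the call would be check_clean toxic-pattern /tmp/good-repo-history-unfolded.txt, repeated once per pattern.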

If the checks are OK, remove the old repo(s) and all intermediate files in /tmp.