Revise hg Repo

From JoBaPedia

Latest revision as of 18:51, 17 October 2011

Revise a HG Repository

I have a Mercurial repository that I have wanted to publish for a long time. But its history contains some information (files, authors, commit messages) that is not intended for a public audience.

How do I get rid of this in the repository's entire history and not just at tip?

Ok, this literally took months to figure out, but in the end it was so easy.

Preparations

  • make sure all existing repos are at the same level and there are no pending commits (hg status)
  • make sure the repo directories contain no files you want to keep (or save them now)
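
The two preparation checks can be scripted. What follows is a minimal sketch, not part of the original procedure: the function names repo_is_clean and same_level are made up for illustration, and "same level" is interpreted here as "the working copies sit on the same changeset".

```shell
# repo_is_clean DIR: succeeds when 'hg status' prints nothing for DIR.
# Untracked files ('?' lines) also count as "not clean" here.
repo_is_clean() {
    [ -z "$(hg -R "$1" status)" ]
}

# same_level DIR1 DIR2: succeeds when both working copies are on the same changeset.
# 'hg id -i' prints the short hash of the working directory's parent,
# with a trailing '+' if there are uncommitted changes.
same_level() {
    [ "$(hg -R "$1" id -i)" = "$(hg -R "$2" id -i)" ]
}
```

A call such as "repo_is_clean repodir && same_level repodir otherclone" (both directory names are placeholders) would then have to succeed for every clone before you start.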

Prerequisites

  • you have full control over all clones, because they have to be deleted and replaced by the revised repository; otherwise the old, unrevised history will come back during syncs.
  • the history between the start and end changesets is linear: basically only one branch, with no merges

If you know the changes are only in one revision and one file (or at least a small number of them), it may be better to use the mq extension.

If it is only about author changes, it may be better to use the convert extension.

Method that works without hg extensions

This works with many changes in files and metadata.

start=5    # no changes needed up to and including this revision
end=10     # last repository revision
hg clone originalrepo revisedrepo --rev $start
cd originalrepo
hg log -gp >/tmp/toxic-repo-history-unfolded.txt
i=$((start + 1))
while [ $i -le $end ]; do 
   hg export --rev $i -g >/tmp/changelist-$i.export
   sed -i 's/toxic-pattern/good-pattern/g' /tmp/changelist-$i.export
   i=$((i+1))
done
cd -
cd revisedrepo
i=$((start + 1))
while [ $i -le $end ]; do 
   hg import /tmp/changelist-$i.export
   i=$((i+1))
done
hg log -gp >/tmp/good-repo-history-unfolded.txt
cd -
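
To see why this method also covers metadata, it helps to look at what the sed step does to a single export file. The sketch below uses a hand-written, abridged stand-in for the output of hg export -g (file name, author and content are invented), and writes the result to a second file instead of using sed -i so that both versions can be compared.

```shell
# Hand-written, abridged stand-in for the output of 'hg export -g'
# (a real export also carries "# Node ID" and "# Parent" header lines).
workdir=$(mktemp -d)
cat > "$workdir/changelist-6.export" <<'EOF'
# HG changeset patch
# User toxic-pattern <someone@example.com>
# Date 1318870260 -7200
add toxic-pattern to the settings

diff --git a/settings.txt b/settings.txt
--- a/settings.txt
+++ b/settings.txt
@@ -1,1 +1,1 @@
-name = placeholder
+name = toxic-pattern
EOF

# Same substitution as in the loop above, written to a new file for comparison.
sed 's/toxic-pattern/good-pattern/g' "$workdir/changelist-6.export" \
    > "$workdir/changelist-6.revised"

# Show which lines were touched: author, commit message and file content.
# diff exits with status 1 when the files differ, hence the '|| true'.
diff "$workdir/changelist-6.export" "$workdir/changelist-6.revised" || true
```

Because the hunk headers carry line counts, a replacement of this kind should keep every line a single line; adding or removing lines inside the diff part would need the "@@" headers adjusted as well.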

Method using mq extension

This works with some changes in some files, not for metadata changes.

This example edits the file test.txt, which was wrong in hg rev 1, and another.txt, which was wrong in rev 3.

hg clone orig revised
cd orig
hg log -gp >/tmp/toxic-repo-history-unfolded.txt
cd -
cd revised
vi .hg/hgrc
  [diff]
  git = True
  [extensions]
  mq=
hg qimport -r 1:tip
hg qpop -a
hg qpush 1.diff
vi test.txt
hg qrefresh
hg qpush 3.diff
vi another.txt
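hg qrefresh
hg qpush -a      # re-apply the remaining patches on top of the edited ones
hg qfinish -a    # turn all applied patches back into regular changesets
hg log -gp >/tmp/good-repo-history-unfolded.txt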
cd -

Method using convert extension

This works for changing author names.

cd repodir
hg log -gp >/tmp/toxic-repo-history-unfolded.txt
cd ..
vi ~/.hgrc
  [extensions]
  hgext.convert=
mv repodir repodir.old
vi author-replace.lst 
  # one line per author, like oldauthor=newauthor (usual author format "name <email>")
hg convert --authors author-replace.lst repodir.old repodir
cp -av repodir.old/.hg/hgrc repodir/.hg/
cd repodir
vi .hg/hgrc
  # could look similar to this:
  [paths]
  default = ssh://user@host:port//srv/hg/sourcerepo
  [ui]
  username = me <me@mail.com>
  [web]
  contact = me <me@mail.com>
  description = The Repo
  allowgz = true
hg log -gp >/tmp/good-repo-history-unfolded.txt
cd ..

Do extra checks after the next commit (I sometimes had old author names reappearing, and I don't know why).
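
One way to do such an extra check is to list the distinct authors in the converted repository and compare them with the left-hand sides of the author map. This is only a sketch under assumptions: the names and addresses in the map are invented, and list_authors and old_authors_left are hypothetical helper names.

```shell
workdir=$(mktemp -d)

# Hypothetical author map: one "old=new" pair per line,
# in the format passed to 'hg convert --authors'.
cat > "$workdir/author-replace.lst" <<'EOF'
Old Name <old.name@private.example>=New Name <new.name@public.example>
oldnick=New Name <new.name@public.example>
EOF

# list_authors DIR: print every distinct author string in DIR's history.
list_authors() {
    hg -R "$1" log --template '{author}\n' | sort -u
}

# old_authors_left DIR MAP: print and succeed if any left-hand side of MAP
# still appears as an author in DIR (assumes author strings contain no '=').
old_authors_left() {
    cut -d= -f1 "$2" > "$workdir/old-authors.txt"
    list_authors "$1" | grep -F -x -f "$workdir/old-authors.txt"
}
```

Running "old_authors_left repodir author-replace.lst" after each of the next few commits should print nothing; any output is an old author name that came back.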

Final Checks

That's it. Check thoroughly if it worked:

Check the changes between the toxic and the good repo. The output should contain only the lines you expected to change from toxic to good.

diff -U0 /tmp/toxic-repo-history-unfolded.txt /tmp/good-repo-history-unfolded.txt

Check for toxic patterns in the good repo history. This should not output anything.

grep toxic-pattern /tmp/good-repo-history-unfolded.txt
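
If there is more than one toxic pattern, the grep check can be wrapped in a small helper. This is a sketch with a made-up function name (check_history); unlike the plain grep above it treats each pattern as a fixed string (grep -F), which is an assumption you may want to change if your patterns are regular expressions.

```shell
# check_history FILE PATTERN...: print every match and fail if any pattern is still present.
check_history() {
    history_file=$1
    shift
    found=0
    for pattern in "$@"; do
        # -F: fixed string instead of regex; -n: show the line number of each match
        if grep -F -n -- "$pattern" "$history_file"; then
            found=1
        fi
    done
    [ "$found" -eq 0 ]
}
```

For example, "check_history /tmp/good-repo-history-unfolded.txt toxic-pattern another-toxic-pattern" prints nothing and succeeds only when neither string occurs in the unfolded history.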

If the checks are OK, remove the old repo(s) and all intermediate files in /tmp, and replace all cloned repos now.

Replace clones with the new version

If Eclipse is in use: save a copy of .hg/hgrc, remove the Eclipse project including its files, then create a new Mercurial project by cloning, then copy hgrc back.

If Eclipse is not used: cp -av repodir/.hg/hgrc hgrc.bak; rm -r repodir; hg clone sourcerepo repodir; cp -av hgrc.bak repodir/.hg/hgrc

Make sure all repodir/.hg/hgrc files have valid authors and all clones are replaced. If not, you will have to repeat the whole procedure after the next commit/push/pull ;)
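
With several clones to replace, the non-Eclipse steps can be put into a function. The sketch below is a variation, not the exact sequence given above: replace_clone is a made-up name, and it clones into a temporary "DIR.new" first so that the old clone is only removed once the new one exists.

```shell
# replace_clone SOURCE DIR: replace the clone in DIR with a fresh clone of SOURCE,
# carrying DIR/.hg/hgrc over. Stops at the first failing step.
replace_clone() {
    source_repo=$1
    clone_dir=$2
    hg clone "$source_repo" "$clone_dir.new" || return 1
    # Overwrite the hgrc written by 'hg clone' with the one from the old clone.
    cp -p "$clone_dir/.hg/hgrc" "$clone_dir.new/.hg/hgrc" || return 1
    rm -r "$clone_dir" || return 1
    mv "$clone_dir.new" "$clone_dir"
}
```

It would be called once per clone, for example "replace_clone sourcerepo repodir". The copied hgrc still has to be checked by hand for old author names.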