CVS merge algorithm

What algorithm does CVS use when merging two branches (using -j)?

  • Are cvs tags, branch, or date?
  • Does it just do a simple text diff (using the unix diff tool for example)?
  • Does it use a two-way or three-way interface?
  • If it uses 3 different diffs, which baseline does it use?



source to share

3 answers

I remember a while ago that the CVS merge actually uses the diff3 algorithm to perform the merge.

This PDF article Official Investigation into Diff3, Sanjeev Hannah, Keshav Kunal, and Benjamin S. Pierce of the University of Pennsylvania describe the diff3 algorithm.

If the focus is on the properties of the merge algorithm itself, rather than how it integrates with CVS.

Answering your questions:

Tag, event awareness

From the CVS man page:

-j tag [: date] Merge changes with the changes indicated by the tag, or when a date is specified and the tag is a branch tag, the version from the branch tag is as it was on the date

2 or 3 ways and text / binary inform

diff3 uses plain text diff by default. It compares (distinguishes) 3 versions of a file.

From diff3 man page:

If diff3 thinks that any of the files it is comparing are binary (a non-text file), it usually reports errors, since such comparisons are usually not useful. As with diff, you can force diff3 to look at all files for text files and compare them line by line using either -a or --text.

Basic version when compared

The base version, according to the linked article, is the last common version (O) between the two current versions of the file (A and B). It first uses a two-way algorithm to find the longest common subsequences between O and A and O and B.

Then (quoted from the article):

... takes areas where O is different from A or B and concatenates those that overlap, resulting in alternating sequences of stable (all replicas are equal) and unstable (one or both copies changed) chunks shown in Figure 1 (c). 3 Finally, it looks at what changed in each chunk and decides what changes can be propagated as shown in Figure 1 (d) - there, the second block is only changed in (by inserting 4, 5), so this change can be extended to B, but the fourth piece has changes in both A and B, so nothing can be extended.



There are two different forms of the "merge" command that you can use that do subtly different things:

  • cvs up -j TAG

  • cvs up -j TAG1 -j TAG2

The difference between flavors is how the "base" revision is selected, but the basic algorithm for any flavors is that for each file, the difference between the two revisions selected by CVS is applied on top of your current working copy.

In the first form, the merge base is the common ancestor of the given TAG and the revision of the working copy. So let's say your local revision is 1.38 (rev # 38 on HEAD) and you merge at (rev 2 on branch 4 of rev # 34 on HEAD) - the common ancestor will be 1.34. I believe this variant is doing a three-way merge using the two differences 1.34..1.38 and 1.34.., creating conflicts where they are not the same.

In the second form, you specify the base revision yourself, so the end result is about the same as cvs diff -r TAG1 -r TAG2 | patch

, except for getting conflict markers from CVS.




uses three revisions of the file, merging the differences between them in the third. The actual merge uses no information other than the three extracted files. Details can be found in the source code CVS

. This function is in RCS_merge

in src/rcscmds.c

which uses call_diff3

from diff/diff3.c

(or something like that). You can, of course, satisfy your curiosity by looking at the sources themselves CVS




All Articles