I recently made progress on supporting real cherry-picking, and even more importantly supporting real local undo. This leaves a surprisingly subtle question: When doing a merge, how do you decide if a given line is included or not?
The simplest way of merging a single line is for its state to be a generation count. If it has never existed it gets a count of 0. When it’s first made it gets a count of 1. When it’s deleted it gets a count of 2, when recreated 3, and so on, so the line is present exactly when the count is odd. When merging two things together the greater generation count for each line wins. This unfortunately fails horribly when trying to do a local undo. As I mentioned in that other post, the UX for doing a local undo should be that you perform your backing-out change locally, then make a patch (with extra metadata) which reverts that backing out and apply that patch to the main branch. That way the main branch ignores the backing out when a merge eventually happens. Central to this is that the patch should do nothing to the main branch until the eventual merge, and at that point should just cause the backing out to get ignored. Generation counts most definitely do not do that, because while you can make an undo patch, it works by having even higher generation counts, which change the behavior of the main branch when it merges with other things.
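To make the failure concrete, here is a minimal sketch of generation counts for a single line. The function names and the particular scenario are my own illustration, assuming the odd-means-present convention described above.

```python
def is_present(generation: int) -> bool:
    # 0 = never existed, 1 = created, 2 = deleted, 3 = recreated, ...
    # so the line is present exactly when the count is odd.
    return generation % 2 == 1


def merge(a: int, b: int) -> int:
    # When merging, the greater generation count for the line wins.
    return max(a, b)


# Hypothetical scenario: main has the line (count 1). A local branch backs
# the line out (count 2), and the undo patch sent to main has to use an
# even higher count (3) to re-assert the line.
main_before_undo = 1
main_after_undo = 3

# Some unrelated branch legitimately deleted the line (count 2).
other_branch = 2

# Without the undo patch, the deletion wins, as it should.
without_undo = merge(main_before_undo, other_branch)

# With the undo patch, main now overrides that unrelated deletion, so the
# patch changed how main merges with everything else.
with_undo = merge(main_after_undo, other_branch)
```

Here `without_undo` is 2 (line gone) while `with_undo` is 3 (line back), even though the undo was only supposed to cancel the local backing out.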
I now have an alternative metadata format and semantics which appears to behave as desired. There are a lot of potential edge cases here, including cherry picking, undo cherry picking, applying changes across those, etc. It’s possible that there are edge cases which what I’m about to say gets wrong, or some kind of an impossibility proof that you can’t get all desired edge case behavior right, or maybe this does everything right and there’s some way to derive it from first principles. I don’t have any of those things and am presenting what I’ve come up with as what appears to be the best idea so far.
The core idea is that each commit is given its own id, and the actions of a commit on a single line are that it can obviate the actions of earlier commits. To make it possible to bring lines of code into existence in the first place, a bit of history is observed: At the beginning of time all lines of code existed, then they were all deleted in the great nil commit. There’s probably some deep spiritual insight which can be gleaned from this, but I’m just going to use it for implementation convenience. The first creation of a line obviates the nil commit.
In summary the state of a given line is {commit_id: [earlier_commit_id]}, with the earlier commit ids being the commits it overrides. In order to support cherry picking it should also have a bit giving whether it adds or deletes the line, because the path to nil might not be completely included locally. To figure out if the line of code of interest is currently included you find a commit id which none of the other commits are attempting to obviate, then remove both that commit and all the commits it obviates from the list (some of them may already be absent if there’s a cherry pick). Repeat this until there are no records left. If at any point along the way the nil commit was obviated then the line of code is present. If it wasn’t then it’s not.
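The resolution loop can be sketched roughly as follows. This is one reading of the description rather than a definitive implementation: the `NIL` constant, the function name, and the choice to break ties by sorting ids are all my own assumptions. The add/delete bit is left out because the text above doesn’t spell out how it feeds into the result.

```python
NIL = "nil"  # assumed id for the great nil commit


def line_is_present(records: dict[str, list[str]]) -> bool:
    """records maps commit_id -> ids of the earlier commits it obviates."""
    remaining = {cid: set(obviated) for cid, obviated in records.items()}
    present = False
    while remaining:
        # Every id that some remaining commit is attempting to obviate.
        targeted = set().union(*remaining.values())
        free = sorted(cid for cid in remaining if cid not in targeted)
        if not free:
            raise ValueError("every remaining commit is obviated by another")
        # Take one commit nobody is trying to obviate (ties broken by id).
        winner = free[0]
        obviated = remaining.pop(winner)
        if NIL in obviated:
            present = True
        # Remove the commits it obviates; some may already be missing,
        # for example after a cherry-pick.
        for cid in obviated:
            remaining.pop(cid, None)
    return present


added = {"A": [NIL]}
removed = {"A": [NIL], "B": ["A"]}
readded = {"A": [NIL], "B": ["A"], "C": ["B"]}
```

With these records, `added` and `readded` resolve to present and `removed` to absent: in `readded`, C knocks out B, which leaves A free to obviate nil.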
That was a bit of a mouthful without clear motivation. The strategy behind how to interpret the meaning of a change and turn it into metadata helps explain. The simplest ambiguous example is a line of code which was previously added and removed and is now being re-added. Because you want changes to generally be ‘local’ it’s a good idea to interpret this as obviating the commit which did the removal rather than obviating the original nil commit. That leaves the question of what to do when there are multiple options as to how to get to where you want. For example, two different branches might have separately added and removed the same line, then those were merged together, and now you locally add the line again. Guessing either of them could result in truly bizarre behavior when merging with other things, so the only sane thing to do is to obviate all of them. When doing an add, find all earlier removal commits which don’t already have some commit trying to obviate them and obviate all of those. Likewise, when doing a removal, find all earlier add commits which don’t already have something trying to obviate them and obviate all of those. This includes hitting commits which aren’t currently having any effect because they were cherry-picked, which seems to be necessary to avoid some weird edge cases. A nice feature of this approach is that if a line is locally added and removed it’s always down to that final commit being the only one which is currently relevant, so the complexity of historical changes is swept away quickly even for very complex cases.
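As a rough sketch of that rule for generating metadata, assuming the nil commit counts as a removal that an add may obviate: the `Change` record, `record_change`, and the commit ids below are hypothetical names for illustration, not an existing API.

```python
from dataclasses import dataclass

NIL = "nil"  # assumed id for the great nil commit


@dataclass(frozen=True)
class Change:
    is_add: bool               # the add/delete bit
    obviates: tuple[str, ...]  # earlier commits this one overrides


def record_change(state: dict[str, Change], commit_id: str, is_add: bool) -> None:
    # Ids which some existing commit is already trying to obviate.
    targeted = {cid for change in state.values() for cid in change.obviates}
    # An add looks at earlier removals, a removal looks at earlier adds.
    candidates = [cid for cid, change in state.items() if change.is_add != is_add]
    if is_add:
        candidates.append(NIL)  # the nil commit deleted every line
    # Obviate every candidate nobody is obviating yet, not a guessed one.
    obviates = tuple(sorted(cid for cid in candidates if cid not in targeted))
    state[commit_id] = Change(is_add, obviates)


# Two branches separately add and remove the same line.
branch1: dict[str, Change] = {}
record_change(branch1, "A1", True)
record_change(branch1, "B1", False)
branch2: dict[str, Change] = {}
record_change(branch2, "A2", True)
record_change(branch2, "B2", False)

# Merge the metadata, then add the line again locally.
merged = {**branch1, **branch2}
record_change(merged, "C", True)
```

In this run `C` ends up obviating both `B1` and `B2`. It skips the nil commit because `A1` and `A2` already obviate it.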
I continue to believe that good support for local undo is the feature which finally makes weave-based version control clearly superior to rebase-based hackery. (One could argue that taking all that developer time should have done it already, but people have been willing to put up with it.) This insight about how to merge individual lines is the final puzzle piece to make that possible. I don’t know why I didn’t manage to figure this all out twenty years ago, but better late than never.