I’ve written several previous posts about how to make a distributed version control system which has eventual consistency, meaning that no matter what order you merge different branches together they’ll always produce the same eventual result. The details are very technical and involved so I won’t rehash them here, but the important point is that the merge algorithm needs to be very history aware. Sometimes you need to make small sacrifices for great things. Sorry I’m terribly behind on making an implementation of these ideas. My excuse is that I have an important and demanding job which doesn’t leave much time for hobby coding, and I have lots of other hobby coding projects which seem important as well.
My less lame excuse is that I’ve been unsure of how to display merge conflicts. In some sense a version control system with eventual consistency never has real conflicts; it just has situations in which changes seemed close enough to stepping on each other’s toes that the system decided to flag them with conflict markers and alert the user. The user is always free to simply remove the conflict markers and keep going. This is a great feature. If you hit a conflict which somebody else already cleaned up you can simply remove the conflict markers, pull from the branch which it was fixed in, and presto, you’ve got the cleanup applied locally.[1]
So the question becomes: When and how should conflict markers be presented? I gave previous thoughts on this question over here but am not so sure of those answers any more. In particular there’s a central question which I’d like to hear people’s opinions on: Should line deletions by themselves ever be presented as conflicts? If there are two lines of code one after the other and one person deletes one and another person deletes the other, it seems not unreasonable to let it through without raising a flag. It’s not like a merge conflict between nothing on one side and nothing on the other side is very helpful. There are specific examples I can come up with where this is a real conflict, but then there are examples I can come up with where changes in code not in close proximity produce semantic conflicts as well. Ideally you should detect conflicts by having extensive tests which are run automatically all the time, and conflicts will cause those to fail. The version control system flagging conflicts is for it to highlight the exact location of particularly egregious examples.
It also seems reasonable that if one person deletes a line of code and somebody else inserts a line of code right next to it then that shouldn’t be a conflict. But this is getting shakier. The problem is that if someone deletes a line of code and somebody else ‘modifies’ it then arguably that should be a conflict, but the version control system thinks of that as being both sides having deleted the same line of code and one side inserting an unrelated line which happens to look similar. The version control system having a notion of individual lines being ‘modified’ and being able to merge those modifications together is a deep rabbit hole I’m not eager to dive into. Like in the case of deletions on both sides, a merge conflict between something on one side and ‘nothing’ on the other isn’t very helpful anyway. If you really care about this then you can leave a blank line when you delete code, or if you want to really make sure, replace it with a unique comment. On the other hand the version control system is supposed to flag things automatically and not make you engage in such shenanigans.
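To make the delete-versus-modify problem concrete, here is a minimal sketch using Python’s standard `difflib` as a stand-in for a line-based diff. It is not the merge algorithm discussed in this post, and the helper name `deleted_and_inserted` and the sample lines are made up for illustration. It only shows that, at the level of whole lines, a ‘modification’ looks like the same deletion the other side made plus an insertion.

```python
import difflib

def deleted_and_inserted(old, new):
    """Split a line-based diff into the lines removed and the lines added."""
    deleted, inserted = [], []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A 'replace' is just a deletion and an insertion at the same spot.
        if tag in ("delete", "replace"):
            deleted.extend(old[i1:i2])
        if tag in ("insert", "replace"):
            inserted.extend(new[j1:j2])
    return deleted, inserted

# Hypothetical example data.
base = ["x = 1", "y = compute(x)", "print(y)"]
alice = ["x = 1", "print(y)"]                               # deletes the middle line
bob = ["x = 1", "y = compute(x, fast=True)", "print(y)"]    # 'modifies' it

alice_deleted, alice_inserted = deleted_and_inserted(base, alice)
bob_deleted, bob_inserted = deleted_and_inserted(base, bob)

# Both sides deleted the same line; only Bob inserted anything.
print(alice_deleted == bob_deleted, alice_inserted, bob_inserted)
```

Seen this way, the only line left in the merged result is Bob’s insertion, which is why a system that ignores deletions would let this case through unflagged.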
At least one thing is clear: Decisions about where to put conflict markers should only be made on whether lines appear in the immediate parents and the child. Nobody wants the version control system to tell them ‘These two consecutive lines of code both come from the right but the ways they merged with the left are so wildly different that it makes me uncomfortable’. Even if there were an understandable way to present that history information to the user, which there isn’t, everybody would respond by simply deleting the pedantic CYA marker.
I’m honestly unsure whether deleted lines should be considered. There are reasonable arguments on both sides. But I would like to ignore deleted lines because it makes both UX and implementation much simpler. Instead of there being eight cases of the states of the parents and the child, there are only four, because only cases where the child line is present are relevant.[2] In all conflict cases there will be at least one line of code on either side. It even suggests an approach to how you can merge together many branches at once and see conflict markers once everything is taken into account.[3]
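The case count can be checked mechanically. The following is a small sketch, with names (`label`, `ALL_CASES`, `LIVE_CASES`) that are purely illustrative: each line is either present or absent in the left parent, the right parent, and the child, giving eight combinations, and dropping the ones where the child line is absent leaves four.

```python
from itertools import product

def label(in_left, in_right):
    """Name a surviving child line by which parents contain it."""
    if in_left and in_right:
        return "both"
    if in_left:
        return "left only"
    if in_right:
        return "right only"
    return "neither"

# Every combination of (in_left, in_right, in_child).
ALL_CASES = list(product([True, False], repeat=3))

# Ignoring deletions means only cases where the child line exists matter.
LIVE_CASES = [(l, r) for l, r, in_child in ALL_CASES if in_child]
LABELS = [label(l, r) for l, r in LIVE_CASES]

print(len(ALL_CASES), len(LIVE_CASES), LABELS)
```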
It may be inevitable that I’ll succumb to practicality on this point, but I at least want to reassure myself that there isn’t overwhelming feedback in the other direction before doing so. It may seem, given the number of other features I’m adding, that ‘show pure deletion conflicts’ is small potatoes, but version control systems are important and feature decisions shouldn’t be made in them without serious analysis.
[1] It can even cleanly merge together two different people applying the exact same conflict resolution independently of each other, at least most of the time. An exception is if one person goes AE → ABDE → ABCDE and someone else goes AE → ACE → ABCDE then the system will of necessity think the two C lines are different and make a result including both of them, probably as A>BCD>BCD|E. It isn’t possible to avoid this behavior without giving up eventual consistency, but it’s arguably the best option under the circumstances anyway. If both sides made their changes as part of a single patch this can be made to always merge cleanly.
[2] The cases are that a line which appears in the child appears in both parents, just the left, just the right, or neither. That last one happens in the uncommon but important criss-cross case. One thing I still feel good about is that conflict markers should be lines saying ‘This is part of a conflict section, the section below this came from X’ where X is either local, remote, or neither, and there’s another special annotation for ‘this is the end of a conflict section’.
[3] While it’s clear that a line which came from Alice but no other parent and a line which came from Bob but no other parent should be in conflict when immediately next to each other, it’s much less clear whether a line which came from both Alice and Carol but not Bob should conflict with a line which came from Bob and Carol but not Alice. If that should be presented as ‘not a conflict’, then if the next line came from David but nobody else it isn’t clear how far back the non-David side of that conflict should be marked as starting.
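As a rough illustration of the marker format described in the footnotes, here is a sketch for the two-parent case only. Everything specific in it is an assumption: the marker strings are invented, the function names are hypothetical, and the rule for when to flag a run (it must contain at least one local-only and one remote-only line) is a placeholder for the real, still-undecided policy. It does not attempt the multi-parent situation above.

```python
from itertools import groupby

# Hypothetical marker text; the post does not specify the actual wording.
SECTION_MARKER = "==== conflict: lines below came from {origin} ===="
END_MARKER = "==== end of conflict ===="

def origin_of(in_local, in_remote):
    """Classify a child line by which immediate parents contain it."""
    if in_local and in_remote:
        return "both"
    if in_local:
        return "local"
    if in_remote:
        return "remote"
    return "neither"

def render(child):
    """child: list of (text, in_local, in_remote) for lines in the merged result."""
    tagged = [(text, origin_of(loc, rem)) for text, loc, rem in child]
    out = []
    # Split into maximal runs of lines both parents have vs. everything else.
    for agreed, run in groupby(tagged, key=lambda t: t[1] == "both"):
        run = list(run)
        origins = {origin for _, origin in run}
        # Assumed rule: only flag runs with lines from each side.
        if agreed or not {"local", "remote"} <= origins:
            out.extend(text for text, _ in run)
            continue
        for origin, section in groupby(run, key=lambda t: t[1]):
            out.append(SECTION_MARKER.format(origin=origin))
            out.extend(text for text, _ in section)
        out.append(END_MARKER)
    return out

merged = [("A", True, True), ("B", True, False), ("C", False, True), ("E", True, True)]
print("\n".join(render(merged)))
```

Because deleted lines never appear in `child`, every flagged section in this sketch has at least one real line on each side, which matches the simplification argued for above.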