Discussion about this post

Justin D Kruger

First, Bram, I love that you are working on this. Git has been in the wild for over a decade, and version control could use some fresh thinking, especially some challenging ideas.

I plan to try to digest this for a bit; it's not 100% clear to me how this will work all of the time. I'd like to see more examples to fully understand it. Merges never failing does not yet make sense to me. I can only see your solution as a merge being a superposition of commits.

I love the idea of improving rebasing, and you seem to be on a good path. On my current personal project I've started to use feature branches, which I merge into an intermediate branch that mirrors main, so that merge conflict resolution occurs on the intermediate branch, and then I merge ff-only back to main. This preserves history, doesn't use a squash, and keeps main limited to clean, stable commits.

My 3 largest version control pain points are:

* agentic coding - WIP commits tend to have more changed lines of code, by a few orders of magnitude. The commit message rarely summarizes a change of that size well, so the commit itself is hard to understand. Code agents can refactor like a beast for hours at a time given the right prompt. I once switched from React to Svelte in a few prompts.

* left vs. right code ancestries always seem to confuse me. Rather than ordinal labels, I'd prefer logical assignment. Which branch is left, which branch is right? Whose (blame) branch is left, whose branch is right? Which commit message is left, and which is right?

* binaries - git's diffs and merges work on lines of text only, and only work well with 'pretty' code. If your code has syntactic sugar, is compressed, or is a binary, then git breaks down quickly.

Compiled JavaScript that is all on one line is almost pointless to store in git. One statement per line works best, but if you define multiple vars, or a longer anonymous object, or anything with any degree of complexity, then git treats the whole line as a change. Then you have to compare A and B between the two lines and visually spot what changed. Some IDEs will show you sub-token changes between the two lines; most won't.

In the era of agentic coding, imagine a pull request where two complex lines look almost identical. To the human it looks like only one variable name was added or removed, but to the machine or a hacker, one of the characters in one of the other variables was changed to an international character to go undetected. In this case you could sneak in a change on one of the other variables and switch code behavior unexpectedly.
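To make the look-alike character worry concrete, here is a minimal sketch of a check a review tool could run on each changed line. The helper name `suspicious_identifiers` and the sample lines are made up for illustration, and it simply assumes that any non-ASCII character inside an identifier deserves a second look, which would be too strict for codebases that use non-English identifiers on purpose.

```python
import re
import unicodedata

# Matches identifier-like words: a letter or underscore followed by word characters.
# In Python 3, \w is Unicode-aware, so Cyrillic or Greek letters are matched too.
IDENTIFIER = re.compile(r"[^\W\d]\w*")


def suspicious_identifiers(line):
    """Return (identifier, [(char, unicode_name), ...]) for every identifier
    in `line` that contains at least one non-ASCII character."""
    findings = []
    for match in IDENTIFIER.finditer(line):
        word = match.group()
        odd = [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in word if ord(ch) > 127]
        if odd:
            findings.append((word, odd))
    return findings


# Hypothetical before/after lines from a pull request.
old_line = "const total = price + tax;"
# The "i" in "price" is replaced with U+0456, a Cyrillic letter that renders like "i".
new_line = "const total = pr\u0456ce + tax;"

for word, chars in suspicious_identifiers(new_line):
    print(word, chars)
```

The two lines render the same in most fonts, but only the second one gets flagged, along with the Unicode name of the offending character.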

On the binary front, it would be cool if version control software could become 'token' aware, in much the same way that LLMs now tokenize their input and output first. Tokens could be full variable names, or parts of a variable name (camel case, snake case, etc.).
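A rough sketch of what a token-aware diff could look like, using Python's standard `difflib`. The token rule here is my own assumption (split identifiers at camel case and snake case boundaries, treat every other non-space character as its own token), and `tokenize` and `token_diff` are hypothetical names, not part of any existing version control tool.

```python
import difflib
import re

# Assumed token rule: acronym runs ("HTTP"), capitalized or lowercase word parts
# ("Name", "user"), digit runs, or any other single non-space character.
TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+|\S")


def tokenize(line):
    """Split a line of code into sub-identifier tokens."""
    return TOKEN.findall(line)


def token_diff(old, new):
    """Return only the changed token runs as (tag, old_tokens, new_tokens)."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old = "let userName = getUser(id);"
new = "let userEmail = getUser(id);"
print(token_diff(old, new))  # [('replace', ['Name'], ['Email'])]
```

A line-based diff would mark the whole line as changed; at the token level the only difference reported is `Name` becoming `Email`.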

Maybe the version control software supports tokenizer plugins, or maybe it supports file type plugins for binaries?

Maybe the version control has plugins for different languages, PDFs, bitmaps, JPGs, videos?
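One way the plugin idea could be wired up, as a sketch only: a registry that maps file extensions to tokenizer functions and falls back to plain lines. Everything here (`register`, `tokenize_file`, the individual tokenizers) is hypothetical, and the "image" plugin is just a stand-in that chops bytes into fixed-size chunks rather than understanding any real image format.

```python
import os
import re

# Hypothetical registry: file extension -> function turning raw bytes into tokens.
_REGISTRY = {}


def register(*extensions):
    """Decorator that registers a tokenizer for one or more file extensions."""
    def decorator(func):
        for ext in extensions:
            _REGISTRY[ext.lower()] = func
        return func
    return decorator


def line_tokenizer(data):
    """Fallback: one token per line of text."""
    return data.decode("utf-8", errors="replace").splitlines()


@register(".js", ".ts")
def code_tokenizer(data):
    """Words and single punctuation characters, so minified one-line code still splits."""
    return re.findall(r"\w+|\S", data.decode("utf-8", errors="replace"))


@register(".bmp", ".jpg")
def chunk_tokenizer(data, size=4):
    """Placeholder for a format-aware plugin: fixed-size byte chunks as hex strings."""
    return [data[i:i + size].hex() for i in range(0, len(data), size)]


def tokenize_file(path, data):
    """Pick a tokenizer by file extension, falling back to line tokens."""
    ext = os.path.splitext(path)[1].lower()
    return _REGISTRY.get(ext, line_tokenizer)(data)


print(tokenize_file("app.min.js", b"var a=1;var b=2;"))
```

With something like this, the diff and merge machinery would only ever see token sequences, and each file type decides for itself what a meaningful token is.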

I've also been thinking about what I love about Resilio Sync, and what feels missing from it. Version control seems like something that I miss in Resilio Sync, Dropbox, and other file sync solutions, and that also conflicts with them.

Nothing really works well at both file sync across teams and version control for most files. File sync solutions seem to conflict with git, and git doesn't work well with binaries.
