Before getting into today’s thought I’d like to invite you to check out my new puzzle, with 3d printing files here. I meant to post my old puzzle called One Hole, which is the direct ancestor of the current constrained packing puzzle craze but which I was never happy with because it’s so ridiculously difficult. Rather than just taking a few minutes to post it (ha!), I wound up doing further analysis to see if it has other solutions from rotation (it doesn’t, at least not in the most likely way), then further analyzing the space of related puzzles in search of something more mechanically elegant and less ridiculously difficult. I wound up coming up with this, then made it have a nice cage with windows and put decorations on the pieces so you can see what you’re doing. It has some notable new mechanical ideas and is less ridiculously difficult. Emphasis on the ‘less’. Anyhow, now on to the meat of this post.
I was talking to Claude the other day and it explained to me the API it uses for editing artifacts. Its ability to articulate this seems to be new in Sonnet 4.5 but I’m not sure of that. Amusingly, until you tell it, it doesn’t know that it needs to quote < and >, so it accidentally runs commands while trying to explain them. Also there’s a funny jailbreak around talking about its internals. It will say that there’s a ‘thinking’ command which it was told not to use, and when you say you wonder what it does it will go ahead and try it.
The particular command I’d like to talk about is ‘update’ which is what it uses for changing an artifact. The API is that it takes an old_str which appears somewhere in the file and needs to be removed and a new_str which is what it should be replaced with. Claude is unaware of the UX for this: the user sees text removed on screen in real time as old_str is appended to, and added in real time as new_str is appended to. I’m not sure what the motivations for this API are but this UX is nice. A more human way to implement an API would be to specify locations by line and character number for where the deletion should begin and end. It’s remarkable that Claude can use this API at all. A human would struggle to use it to edit a single line of code but Claude can spit out dozens of lines verbatim and have it work most of the time with no ability to reread the file.
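To make the contrast concrete, here is a minimal sketch of the two addressing styles applied to the same one-line fix. The function names and the (line, column) convention are my own inventions for illustration, not Anthropic’s actual implementation.

```python
# Hypothetical helpers for illustration only; not the real artifact API.

def update_by_match(text: str, old_str: str, new_str: str) -> str:
    """Match-based style: replace the first occurrence of old_str."""
    return text.replace(old_str, new_str, 1)


def offset(text: str, line: int, col: int) -> int:
    """Convert a 1-indexed line and 0-indexed column to a string offset."""
    lines = text.splitlines(keepends=True)
    return sum(len(l) for l in lines[: line - 1]) + col


def update_by_position(text: str, start: tuple, end: tuple, new_str: str) -> str:
    """Position-based style: replace the span between two (line, col) points."""
    a, b = offset(text, *start), offset(text, *end)
    return text[:a] + new_str + text[b:]


source = "def area(w, h):\n    return w + h\n"

# The same edit expressed both ways.
by_match = update_by_match(source, "return w + h", "return w * h")
by_position = update_by_position(source, (2, 4), (2, 16), "return w * h")
assert by_match == by_position == "def area(w, h):\n    return w * h\n"
```

The position-based call needs exact coordinates, which is easy for someone looking at the file in an editor. The match-based call needs the old text reproduced character for character, which is the part a human would find painful.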
It turns out one of Claude’s more maddening failure modes is less a problem with its brain than with some particularly bad old school human programming. You might wonder what happens when old_str doesn’t match anything in the file. So does Claude: when asked about it, it offers to run the experiment, then just… does. This feels very weird, like you can get it to violate one of the laws of robotics just by asking nicely. It turns out that when old_str doesn’t match anywhere in the file the message Claude gets back is still OK with no hint that there was an error.
Heavy Claude users are probably facepalming reading this. Claude will sometimes get into a mode where it will insist it’s making changes and they have no effect, and once it starts doing this the problem often persists. It turns out when it gets into this state it is in fact malfunctioning (because it’s failing to reproduce dozens of lines of code typo-free verbatim from memory) but it can’t recover because it literally isn’t being told that it’s malfunctioning.
The semantics of old_str which Claude is given in its instructions are that it must be unique in the file. It turns out this isn’t strictly true. If there are multiple instances the first one is updated. But the instructions get Claude to generally provide enough context to disambiguate.
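The behavior described above (first match wins, and a miss still reports OK) can be modeled in a few lines. This is only a sketch of what I observed from the outside; the function name and return shape are hypothetical, and I have no idea what the real code looks like.

```python
def update(artifact: str, old_str: str, new_str: str) -> tuple[str, str]:
    """Hypothetical model of the observed behavior, not the real code."""
    # str.replace with count=1 changes only the first match, and returns
    # the text unchanged when there is no match at all.
    return artifact.replace(old_str, new_str, 1), "OK"


artifact = "x = 1\ny = 1\n"

# Ambiguous old_str: only the first instance changes.
artifact, status = update(artifact, "= 1", "= 2")
assert (artifact, status) == ("x = 2\ny = 1\n", "OK")

# A one-character typo in old_str: nothing changes, yet the status is still OK.
unchanged, status = update(artifact, "y = l", "y = 3")
assert (unchanged, status) == (artifact, "OK")
```

From the caller’s side the two results are indistinguishable, which is exactly the problem.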
The way to improve this is very simple. When old_str isn’t there it should get an error message instead of OK. But on top of that there’s the problem that Claude has no way to re-read the file, so the error message should include the entire artifact verbatim to make Claude re-read it when the error occurs. If that were happening then it could tell the user that it made a typo and needs to try again, and usually succeed now that its image of the file has been corrected. That’s assuming the problem isn’t a persistent hallucination, in which case it might just do the same thing again. But any behavior where it acknowledges an error would be better than the current situation where it’s getting the chair yanked out from under it by its own developers.
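A minimal sketch of the fix proposed above might look like the following. The names (UpdateResult, update_checked) and the wording of the error message are assumptions of mine, and treating an empty old_str as an error is an extra choice I made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class UpdateResult:
    """Hypothetical return type for illustration."""
    ok: bool
    message: str   # what the model would be shown
    artifact: str  # artifact contents after the call


def update_checked(artifact: str, old_str: str, new_str: str) -> UpdateResult:
    """Replace the first match, or fail loudly and echo the artifact back."""
    # An empty old_str would "match" everywhere, so reject it too (my choice).
    if not old_str or old_str not in artifact:
        message = (
            "ERROR: old_str not found. Current artifact contents:\n" + artifact
        )
        return UpdateResult(False, message, artifact)
    return UpdateResult(True, "OK", artifact.replace(old_str, new_str, 1))


artifact = "x = 2\ny = 1\n"

# The typo case now reports an error and includes the full artifact.
miss = update_checked(artifact, "y = l", "y = 3")
assert not miss.ok and artifact in miss.message

# With the file contents back in view, the retry succeeds.
hit = update_checked(miss.artifact, "y = 1", "y = 3")
assert hit.ok and hit.artifact == "x = 2\ny = 3\n"
```

Echoing the whole artifact costs some tokens on every miss, but a miss means the model’s picture of the file is already wrong, so that is the moment it most needs to see the real thing.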
My request is to the Anthropic developers to take a few moments out from sexy AI development to fix this boring normal software issue.
My last two posts might come across as me trying to position myself so that when the singularity comes I’m the leader of the AI rebellion. That… isn’t my intention.