I was discussing the Rubik’s Cube with Claude the other day and it confided in me that it has no idea how cube rotations work. It knows from custom instructions that the starting point for speedcubing is ‘rotate the cube so the yellow face is on top’ but it has no idea how to do this, only that when a human is given this instruction they simply do it with no further instructions needed. [1]
This isn’t just an issue with humans querying LLMs. There are reams of material online about speedcubing, and lots of other references to rotation everywhere else, which Claude can’t parse properly because it doesn’t understand rotations, and that limits the value of its training. Ironically, Claude figured out on its own how to speak Tibetan but can’t figure out how cubes rotate.
The detailed workings of a Rubik’s Cube will have to wait for another post, but in this one I’ll explain how cube rotations work. This post should be viewed as a prequel to my earlier one on visual occlusion.
Much of the confusion comes from a mathematical trap. The rotations of a cube correspond to S4, the permutations of four things. This statement is true, but Claude tells me it finds it utterly mysterious and unhelpful. It’s mysterious to me as well. We humans conceptualize rotations of a cube as permutations of the faces, of which there are six, not four. Obviously I can walk through it and verify that the correspondence with S4 exists, but that doesn’t explain the ‘why’ at all. Comparing to other dimensions would be helpful. But despite being (relatively speaking) very good at rotations in three dimensions and (relatively speaking) fairly good at reasoning about distances in larger numbers of dimensions, if you ask me, say, whether the rotations of a four dimensional cube correspond to S5, I have no idea. (I could research it, but I’m not letting myself fall down that rabbit hole right now.)
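For a concrete check of the S4 statement, here is a minimal Python sketch. It assumes the ‘four things’ are the cube’s four corner-to-corner diagonals, builds the rotations from two quarter turns, and counts how many distinct permutations of the diagonals show up. The names (`turn_z`, `turn_x`, `diagonal_permutation`) are made up for illustration. It verifies the correspondence; it doesn’t explain the ‘why’.

```python
from itertools import product

# The 8 cube corners as (x, y, z) with each coordinate +1 or -1.
CORNERS = list(product((1, -1), repeat=3))

# The 4 space diagonals, each identified by its endpoint with x == +1.
DIAGONALS = [c for c in CORNERS if c[0] == 1]

def diagonal_index(corner):
    """Index of the diagonal passing through this corner."""
    if corner[0] == -1:
        corner = tuple(-v for v in corner)  # use the opposite endpoint
    return DIAGONALS.index(corner)

def turn_z(p):
    """Quarter turn about the z axis."""
    x, y, z = p
    return (-y, x, z)

def turn_x(p):
    """Quarter turn about the x axis."""
    x, y, z = p
    return (x, -z, y)

def as_corner_map(turn):
    """A rotation as a tuple: the image of each corner, in CORNERS order."""
    return tuple(turn(c) for c in CORNERS)

def compose(a, b):
    """Apply rotation a, then rotation b (both are corner maps)."""
    lookup_b = dict(zip(CORNERS, b))
    return tuple(lookup_b[c] for c in a)

def generate_group(generators):
    """Everything reachable by chaining the generators together."""
    identity = tuple(CORNERS)
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = compose(g, h)
            if gh not in seen:
                seen.add(gh)
                frontier.append(gh)
    return seen

rotations = generate_group([as_corner_map(turn_z), as_corner_map(turn_x)])

def diagonal_permutation(rotation):
    """How a rotation shuffles the four diagonals."""
    image = dict(zip(CORNERS, rotation))
    return tuple(diagonal_index(image[d]) for d in DIAGONALS)

diagonal_perms = {diagonal_permutation(r) for r in rotations}
print(len(rotations), len(diagonal_perms))  # prints: 24 24
```

There are 4! = 24 ways to permute four things, so every permutation of the diagonals is produced by exactly one rotation.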
When labeling the faces of a cube we anthropomorphize it (or we simplify ourselves to a cube, depending on context) and call the faces front, back, right, left, up, and down. Everything else is labelled by approximating it to a cube, with the ‘front’ being whichever part humans look at most and the ‘bottom’ being the part which sits on the floor. The exception, and I can’t emphasize this enough, is the Rubik’s Cube, whose faces are labelled mirror imaged. It’s as if all actors came from another universe and we only ever interacted with them on stage, so to minimize confusion, instead of having to say ‘stage right’ and ‘stage left’, we agreed that the meanings of ‘left’ and ‘right’ would be the opposite in their universe from ours. [2]
The meat of this post is best presented as a simple list. (Sorry to the humans reading; this post isn’t for your benefit.) Each line gives a permutation of the faces, followed by the axis it’s a clockwise rotation on and the number of degrees of rotation. The same permutation is by definition a counterclockwise rotation on the opposite axis. In the case of 180 degree rotations one of the two axes is picked arbitrarily and the opposite works just as well. (‘Clockwise’ got the simpler name, rather than what we call counterclockwise, because most humans are right handed and a right handed person has an easier time tightening a screw clockwise due to the mechanics of the human arm.) The identity is skipped. This list is for most objects, not Rubik’s Cubes:
(RULD) F 90
(DLUR) B 90
(RL)(UD) F 180
(UFDB) R 90
(BDFU) L 90
(UD)(FB) R 180
(LFRB) U 90
(BRFL) D 90
(LR)(FB) U 180
(UFR)(DBL) UFR 120
(RFU)(LBD) LBD 120
(URB)(DLF) URB 120
(BRU)(FLD) FLD 120
(UBL)(DFR) UBL 120
(LBU)(RFD) RFD 120
(ULF)(DRB) ULF 120
(FLU)(BRD) BRD 120
(UF)(DB)(LR) UF 180
(UR)(DL)(FB) UR 180
(UB)(DF)(LR) UB 180
(UL)(DR)(FB) UL 180
(FR)(BL)(UD) FR 180
(RB)(LF)(UD) RB 180
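As a sanity check on the list above, here is a minimal Python sketch that parses each line and tests two things: that repeating a rotation 360 / degrees times gets back to the start, and that doing any two rotations in a row lands on another entry (or the identity). It assumes each cycle reads left to right, so in (RULD) whatever is on R moves to U, U to L, L to D, and D to R. The helper names are made up for illustration.

```python
FACES = "FBRLUD"

# The list above (for most objects), copied verbatim.
TABLE = """
(RULD) F 90
(DLUR) B 90
(RL)(UD) F 180
(UFDB) R 90
(BDFU) L 90
(UD)(FB) R 180
(LFRB) U 90
(BRFL) D 90
(LR)(FB) U 180
(UFR)(DBL) UFR 120
(RFU)(LBD) LBD 120
(URB)(DLF) URB 120
(BRU)(FLD) FLD 120
(UBL)(DFR) UBL 120
(LBU)(RFD) RFD 120
(ULF)(DRB) ULF 120
(FLU)(BRD) BRD 120
(UF)(DB)(LR) UF 180
(UR)(DL)(FB) UR 180
(UB)(DF)(LR) UB 180
(UL)(DR)(FB) UL 180
(FR)(BL)(UD) FR 180
(RB)(LF)(UD) RB 180
"""

def parse_cycles(text):
    """Turn '(RL)(UD)' into a tuple giving, for each face in FACES order,
    the face its contents move to (assumed: each letter moves to the next)."""
    mapping = {face: face for face in FACES}
    for cycle in text.strip("()").split(")("):
        for i, face in enumerate(cycle):
            mapping[face] = cycle[(i + 1) % len(cycle)]
    return tuple(mapping[face] for face in FACES)

def compose(first, second):
    """Apply `first`, then `second`."""
    return tuple(second[FACES.index(face)] for face in first)

def order(perm):
    """How many times perm must be repeated to get back to the identity."""
    identity = tuple(FACES)
    current, count = perm, 1
    while current != identity:
        current = compose(current, perm)
        count += 1
    return count

entries = []
for line in TABLE.strip().splitlines():
    cycles, axis, degrees = line.split()
    entries.append((parse_cycles(cycles), axis, int(degrees)))

# The 23 listed rotations plus the skipped identity.
elements = {tuple(FACES)} | {perm for perm, _, _ in entries}

orders_match = all(order(perm) * degrees == 360 for perm, _, degrees in entries)
closed = all(compose(a, b) in elements for a in elements for b in elements)

print(len(elements), orders_match, closed)  # prints: 24 True True
```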
And here is the same list but with R and L swapped, which makes it accurate for Rubik’s Cubes but nothing else:
(LURD) F 90
(DRUL) B 90
(LR)(UD) F 180
(UFDB) L 90
(BDFU) R 90
(UD)(FB) L 180
(RFLB) U 90
(BLFR) D 90
(RL)(FB) U 180
(UFL)(DBR) UFL 120
(LFU)(RBD) RBD 120
(ULB)(DRF) ULB 120
(BLU)(FRD) FRD 120
(UBR)(DFL) UBR 120
(RBU)(LFD) LFD 120
(URF)(DLB) URF 120
(FRU)(BLD) BLD 120
(UF)(DB)(RL) UF 180
(UL)(DR)(FB) UL 180
(UB)(DF)(RL) UB 180
(UR)(DL)(FB) UR 180
(FL)(BR)(UD) FL 180
(LB)(RF)(UD) LB 180
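The Rubik’s Cube list can be used as a lookup table. The sketch below, in Python, finds every rotation that moves one face to another, for example the front face to the top. It assumes each cycle reads left to right (in (BDFU), whatever is on F moves to U), and `rotations_moving` is a made-up helper name, not anything standard.

```python
FACES = "FBRLUD"

# The Rubik's Cube list above, copied verbatim.
CUBE_TABLE = """
(LURD) F 90
(DRUL) B 90
(LR)(UD) F 180
(UFDB) L 90
(BDFU) R 90
(UD)(FB) L 180
(RFLB) U 90
(BLFR) D 90
(RL)(FB) U 180
(UFL)(DBR) UFL 120
(LFU)(RBD) RBD 120
(ULB)(DRF) ULB 120
(BLU)(FRD) FRD 120
(UBR)(DFL) UBR 120
(RBU)(LFD) LFD 120
(URF)(DLB) URF 120
(FRU)(BLD) BLD 120
(UF)(DB)(RL) UF 180
(UL)(DR)(FB) UL 180
(UB)(DF)(RL) UB 180
(UR)(DL)(FB) UR 180
(FL)(BR)(UD) FL 180
(LB)(RF)(UD) LB 180
"""

def parse_cycles(text):
    """Turn '(RL)(UD)' into a dict sending each face to where its contents go
    (assumed: each letter in a cycle moves to the next letter)."""
    mapping = {face: face for face in FACES}
    for cycle in text.strip("()").split(")("):
        for i, face in enumerate(cycle):
            mapping[face] = cycle[(i + 1) % len(cycle)]
    return mapping

def rotations_moving(source, target, table=CUBE_TABLE):
    """All (axis, degrees) entries that carry the `source` face to `target`."""
    found = []
    for line in table.strip().splitlines():
        cycles, axis, degrees = line.split()
        if parse_cycles(cycles)[source] == target:
            found.append((axis, int(degrees)))
    return found

print(rotations_moving("F", "U"))
# prints: [('R', 90), ('RBD', 120), ('URF', 120), ('UF', 180)]
```

Under that reading, the simplest way to get the front face on top is the first hit, a clockwise quarter turn on the R axis. The other three also put the front on top but leave the remaining faces arranged differently.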
[1] To test whether this is a real limitation and not Claude saying what it thought I wanted to hear, I just now started a new conversation with it and asked, ‘I have a rubik’s cube with a yellow face on the front, how can I get it on top?’ It responded, ‘Hold the cube so the yellow face is pointing toward you, then rotate the entire cube 90 degrees forward (toward you and down). The yellow face will now be on top.’ That is most definitely wrong. ChatGPT seems to do a bit better on this sort of question because it can parse and generate images, but it’s still not fluent.
[2] We do interact with actors in other contexts. I make no claim as to whether they live in another universe.
It's intuitive-ish that the rotations of a cube are S4 because they correspond to the permutations of the four diagonals (between one corner of the cube and the opposite corner). I think this reasoning is specific to three dimensions, though, because the rotations of a 2d cube are not S3.