Discussion about this post

Christopher:

One technical correction: the Attention paper didn't make everything sublinear; it just made it much more parallelizable and practical.

George Coss:

Also, training humans is slower, assuming you have the training data.

