I tried the same prompt a lot of times and saw “chain of thought” attempts complete with state modeling… they must be augmenting the training dataset with some sort of script-generated crap.
I have to say those are so far the absolute worst attempts.
Day 16 (Egg 3 on side A; Duck 1, Duck 2, Egg 1, Egg 2 on side B): Janet takes Egg 3 across the river.
“Now, all 2 ducks and 3 eggs are safely transported across the river in 16 trips.”
I kind of feel that this undermines the whole point of using a transformer architecture instead of a recurrent neural network. Machine learning sucks at recurrence.
Frigging exactly. It's a dumb-ass dead end that is fundamentally incapable of doing the vast majority of the things ascribed to it.
They keep imagining that it would actually learn some underlying logic from a lot of text. All it can do is store a bunch of applications of said logic, as in a giant table. Deducing the underlying rules instead of simply memorizing particular instances of them is a form of compression. There wasn’t much compression going on to begin with, and now that the models are so over-parametrized, there’s even less.
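To make the table-versus-rule point concrete, here is a toy sketch in Python. It is only an illustration of the distinction, not a claim about how any actual model works internally: the task (adding small integers) and the names `table_model` and `rule_model` are made up for this example.

```python
# Toy illustration (hypothetical names and task): memorizing instances
# of a rule versus having the rule itself.

# The "giant table": every instance seen in training is stored explicitly.
seen_pairs = [(a, b) for a in range(10) for b in range(10)]
table = {(a, b): a + b for a, b in seen_pairs}

def table_model(a, b):
    # Can only answer what it has memorized; returns None otherwise.
    return table.get((a, b))

def rule_model(a, b):
    # The underlying rule: same size no matter how large the inputs get.
    return a + b

print(len(table))           # how many entries the table needs
print(table_model(3, 4))    # an instance that was seen
print(table_model(12, 30))  # an instance that was never seen
print(rule_model(12, 30))   # the rule handles it anyway
```

The table grows with every new instance and still has nothing to say outside of what it stored, while the rule stays one line long. That gap in size is the compression being talked about.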