What the Heck is Tom Schaul's SOCRATIC LEARNING Even Talking About? (Gemini 2.5 Pro Translated Version)
Contains medical advice. Read with caution.
The title expresses two moods at once: the part with the question mark asks what the paper is actually saying, which this article tries to explain; the period marks my attitude towards it. There are already several interpretations of this paper floating around, including from the “top three Chinese journals” (prominent Chinese academic publications), but they left me completely bewildered. So I grabbed the paper and read it myself, and then I understood why I was bewildered: the paper itself was written in a kind of sleepwalking mode, to the point that I repeatedly had to use the prompt “plz translate the following English sentences to human understandable English” with GPT.
First, as for what the paper actually says: in short, it unfolds a series of speculations around one core question, “Do you believe you can fly by stepping on your left foot with your right foot, and then on your right foot with your left?” (the classic image of pulling yourself up by your own bootstraps). Tom’s answer is that he believes it, but only if it is done according to his instructions:
- Stepping left-foot-on-right-foot can only happen inside a closed system that obeys basic laws. Tom handpicked language as that feasible closed system, within which LLMs can naturally ascend to heaven. The concrete operation: the left foot builds an interaction-protocol platform, and the right foot turns into a scoring function to push off from (see the sketch after this list). And never mind what Wittgenstein actually meant; just call this left-foot-on-right-foot business a “language game.” Otherwise, any deviations in understanding later on are your own responsibility.
- Next, the language game must satisfy two conditions: first, it has to be battle-hardened, preferably with some elusive meta-game that keeps supplying “hot topics” (fresh material) for the system; second, it has to raise its own knowledge level and know how to evaluate whether a given game is useful (the meta-critic).
- Finally, we also need to watch the direction of this left-foot-on-right-foot progression: we cannot let the model quietly overfit and then only criticize it afterwards. Such open problems are, of course, left for everyone in the industry to solve; he (Tom) will write new articles to state his position when the time comes.
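To make the framework concrete, at least as I read it, here is a minimal sketch of the loop those bullets describe: an agent plays a language game through an interaction protocol, a scoring function grades each episode, and a meta-critic decides whether the game is still worth playing. Every class and function name below is hypothetical and invented purely for illustration; none of it is code from the paper.

```python
import random

# Toy stand-ins, invented purely for illustration; none of these names
# come from Schaul's paper.

class ToyModel:
    """The agent doing the left-foot-on-right-foot climbing."""
    def __init__(self):
        self.bias = 0.0

    def act(self, prompt: str) -> float:
        # Stand-in for producing an utterance in the language game.
        return self.bias + random.gauss(0.0, 1.0)

    def update(self, reward: float) -> None:
        # Crude "self-improvement" driven only by feedback from inside the system.
        self.bias += 0.1 * reward

class ToyLanguageGame:
    """A 'language game': an interaction protocol plus a scoring function."""
    def play(self, model: ToyModel) -> float:
        return model.act("some prompt")     # left foot: interaction protocol

    def score(self, output: float) -> float:
        return 1.0 if output > 0 else -1.0  # right foot: scoring function

def meta_critic_keep_playing(rewards: list) -> bool:
    # Meta-critic stand-in: retire a game once it stops providing signal,
    # here approximated as "the last 20 rewards are all identical".
    return len(rewards) < 20 or len(set(rewards[-20:])) > 1

def socratic_loop(model: ToyModel, game: ToyLanguageGame, steps: int = 500) -> ToyModel:
    rewards = []
    for _ in range(steps):
        out = game.play(model)
        r = game.score(out)
        model.update(r)
        rewards.append(r)
        if not meta_critic_keep_playing(rewards):
            break  # the game no longer teaches anything
    return model

if __name__ == "__main__":
    trained = socratic_loop(ToyModel(), ToyLanguageGame())
    print("final bias:", trained.bias)
```

Note that every signal the model learns from is computed inside the same closed system; that is exactly where my objections below begin.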
If you still feel bewildered after reading my summary of the paper, that is fine; you have already grasped its core. Tom’s intention was probably to provide a grander, framework-level summary for current ideas such as search trees, self-feedback, synthetic data, and open-endedness[1], so that when DeepMind makes big news someday in the not-too-distant future there will be an “-ism” ready to back it up (of course, if you can feel the call of ASI in it, please proceed along those lines).
Now for the part with the period (my critique).
Firstly, I will assert dogmatically that Wittgenstein’s language game is merely an explanatory device. The concept aims to reveal the diversity and complexity of language, and how its function and meaning shift with context; it absolutely cannot serve as a guiding principle for systems engineering. If the intention here is merely to express “language diversity” or “meaning in use,” I see nothing superior about borrowing the concept. Worse, in my view it smuggles in the disastrous premise of “relativity of rules.” If there are no clear rules, how is the scoring function supposed to help the right foot take off? And back in the setting of Socratic questioning, “relativity of rules” is more likely to lead to logical sophistry, which would negate the entire system.
Secondly, language, as a manifestation of thought, can hardly be called a closed system. Language keeps evolving along with thought, producing new things that violate the previous system’s logic yet are “language games” in the truest sense. Without injecting external information, GPT is unlikely to spontaneously coin sayings like “the Year of the Rooster lasts two and a half years” (a nonsensical meme) or some new “auto C++ auto” standard (a made-up, overly convoluted standard). Even if LLMs could genuinely evolve inside their own world, the resulting agent would most likely carry a heavy sense of history (though a surreal outcome like “a squirrel twerking on a fish” (absurd imagery) cannot be ruled out).
Thirdly, Socratic questioning rests on complete rational logic, which humans possess innately and which can be presented fully and structurally through symbols. The logic exhibited by LLMs, by contrast, is ultimately inductive logic formed from data, and logic beyond the scope of that data cannot be obtained by induction. Even if LLMs can carry out Socratic-style dialogues, they remain confined to the realm of their logic (their data), unable to keep acquiring new cognition or to judge accurately whether a given piece of cognition is correct. I do not know whether Tom acknowledged this while writing the paper, but he mainly emphasizes domains like code and mathematics, which can be verified by rational logic. So in the end it is the same old saying: LLMs are just another kind of machine learning algorithm.
Finally, those open problems are themselves the core of the left-foot-on-right-foot game, not this framework. If I already knew whether the model was evolving in the right direction, why would I need to weld a Socrates onto it to play the “straight man” (the supporting role in a comedy duo)? I could simply design a sequentially executing framework and keep iterating, the way everyone distills o1 (OpenAI’s o1 reasoning model). To achieve AGI through Socratic questioning, wouldn’t you first need a Socrates model that already has AGI-level capability to run it?
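For contrast, here is a toy sketch of the “sequentially executing framework” I have in mind: sample candidates, keep only what an external, trusted check accepts (feasible precisely in code or math, as in the previous point), and train on what survives. All function names below are hypothetical placeholders; this is just the generic generate-verify-distill recipe, not anything from the paper.

```python
import random

# Hypothetical toy placeholders so the shape of the loop is visible;
# none of this is real training code.

def generate(model_bias: float, prompt: int) -> int:
    # Toy "model": should answer prompt + 1, but sometimes adds its bias.
    return prompt + 1 + random.choice([0, round(model_bias)])

def verify(prompt: int, answer: int) -> bool:
    # External, trusted check (think unit tests or a math checker); this is
    # what fixes the direction of improvement, not the model criticizing itself.
    return answer == prompt + 1

def fine_tune(model_bias: float, accepted_fraction: float) -> float:
    # Toy "distillation": the more verified samples we keep, the smaller
    # the model's systematic error becomes.
    return model_bias * (1.0 - 0.5 * accepted_fraction)

def iterate_and_distill(rounds: int = 5, samples: int = 100) -> float:
    model_bias = 4.0
    for _ in range(rounds):
        passed = 0
        for _ in range(samples):
            prompt = random.randint(0, 9)
            if verify(prompt, generate(model_bias, prompt)):
                passed += 1
        model_bias = fine_tune(model_bias, passed / samples)
    return model_bias

if __name__ == "__main__":
    print("final bias:", iterate_and_distill())
```

The direction here comes entirely from `verify`, an oracle supplied from outside the model. If you already have such an oracle, the Socratic wrapper adds little; if you do not, the wrapper cannot conjure one.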
In closing, I actually still think he (Tom Schaul) wrote it well. At least he thought things through and expressed them. When our industry can also think and express itself fully, then I can peacefully go off and become an “edge-ball streamer” (a streamer who makes borderline/risqué content).
Reference
- [1] Open-Endedness is Essential for Artificial Superhuman Intelligence. https://arxiv.org/pdf/2406.04268