Large Models and Coin Minting, Continued (Gemini 2.5 Pro Translated Version)
Some viewpoints in this article are derived from the following paper, along with some paraphrased elaborations of it produced with GPT-4o. Readers with the ability and inclination may consult the original text directly.
Boisseau, Éloïse. “Imitation and Large Language Models.” Minds and Machines 34.4 (2024): 42.
On the question of whether large models understand language, a spectrum of views can be listed, from “complete understanding” to “complete non-understanding,” for example:
- (Complete Understanding) Also known as (pan-)computationalism, which holds that cognition and consciousness are essentially computational processes, and that mental states and processes can be explained by computational models. On this view, large models are an idealized model that can imitate the human nervous system and fully understand what they process.
- (Partial Understanding) The model’s output mixes logic it understands with logic it does not. This is similar to students who “seem” to understand the meaning of a formula in class and can use it for calculations, but who, if you dig deeper, do not truly grasp the formula’s full implications; they are merely using it in a seemingly competent manner.
- (Limited Understanding) The model only possesses linguistic knowledge but lacks real-world practical knowledge. For example, a model can clearly explain how to perform division but cannot actually perform division calculations. Although the model can talk like an expert, it cannot understand how to apply this knowledge in reality.
- (Complete Non-understanding) The “Stochastic Parrots” school, which believes that large models are merely imitating human language in a formal way, like parrots, without understanding any of its meaning.
We will not debate here which of these views is correct. Our focus is that the description of “understanding” itself is largely explained by the correlation and similarity between the model’s behavior and human (or parrot) behavior. To put it plainly, we are concerned with how far the judgment “large models are imitating humans, or behaving in ways similar to humans” is correct. This requires an accurate definition of “imitation.” Simplifying the various lengthy discussions in the aforementioned paper, we basically take “imitation” to be a term with the following attributes:
- The core of imitation lies in similarity, and this similarity cannot be accidental. For example, if I criticize ByteDance’s large model a few times, it doesn’t mean I am imitating “Length Unit-chan’s” (the author’s previous self-description) way of speaking; it’s just an accidental similarity based on shared consensus.
- Imitation should be between different subjects. I cannot imitate my own writing style, but I can use my own writing style to criticize a certain large model a hundred times.
- Imitation has two forms: one is imitative behavior itself, and the other is a “status of imitation” (物态 - a state or material manifestation resulting from imitation). For example, the act of me using Length Unit-chan’s speech pattern to be sarcastic about a certain company is a form of imitation. If I write an article mocking a large model in such a way that an uninformed third party might think it was written by Length Unit-chan, then that article can be considered a “status of imitation.”
- A more intuitive example: Counterfeit currency is a “status of imitation” produced by imitative behavior because it intentionally creates similarity to another thing.
- Of course, I personally hold a somewhat reserved attitude towards these two forms as presented in the original paper, since some boundary cases are hard to explain. If I use that pattern to sarcastically critique a company, but what I am actually writing is an advertorial (a puff piece), is this behavior still an imitation of Length Unit-chan, given that the core meaning of the original behavior lies in a genuinely negative attitude towards its target?
- Therefore, in my view, defining imitative behavior itself directly as a state in which similarity is intentionally created would fit certain lines of reasoning better. For example, if a company issues its own “O-coins” usable across all its products, then although O-coins are not counterfeit currency, the act of issuing them can be seen as being in an imitative state with respect to a bank.
- Imitation is different from duplication. A duplicate should have the same core effect as the original. Even if “imitative behavior” and “duplicative behavior” sometimes look the same, once ambiguous scenarios are excluded, the effects the two achieve should be clearly distinguishable. Taking coin minting as an example again, even if we mint with the same molds and the same process as real currency, we are still producing a “status of imitation,” i.e., counterfeit currency; by contrast, a legally authorized institution doing the same produces duplicates of the currency.
- Imitation is different from simulation. The relationship between a simulation and the simulated thing differs from that between imitation and the imitated. Simply put, the implementation mechanism of a simulation and even its results can be completely different from the simulated object, whereas imitation should be as close as possible.
- Clearly, if we consider large language models to be a simulation, then we are discussing whether we understand large language models, not whether large language models understand language. Although one could indeed argue this, based on the idea that model neurons are simulations of human brain neurons, this departs from the original scope of discussion.
Based on the above definitions, the author of the aforementioned paper believes that large models neither engage in imitative behavior themselves nor are a “status of imitation” produced by imitative behavior. To put it directly, large models are just unconscious counterfeit-coin minting machines, and their products are a “status of imitation” of human language. The original author makes a very strong assumption that imitative behavior must rest on a comparable “original behavior.” For example, parrots, besides imitating human speech, have their own set of survival behaviors and logic; parrots can communicate with each other in their own “bird language.” A large language model (once its training is complete) has only one behavior: outputting human language. Conversely, if a large language model is installed on a robot system, we can consider the entire system an imitation of human behavior, because a robot, as a mechanical system, has its own ways of performing tasks that need not replicate specific human functions, which gives the whole system a comparable “original behavior.”
To close out the article, here are my personal views:
- Firstly, from a framework perspective, the conclusion that large models are machines for minting counterfeit currency is not necessarily wrong, but it is overly broad. Using it to deny that large models have any understanding of language goes somewhat beyond what the framework can support, because the process of creating a “status of imitation” is very complex. In particular, with the recent popularity of test-time compute and long Chain-of-Thought (CoT) reasoning, these are output modes the model forms “spontaneously” after guidance, distinct from its ordinary inference mode, and they should be seen as an imitation of human thought processes (a minimal sketch of such guided prompting appears after this list).
- Secondly, the nature of language should not be overestimated. The vast majority of everyday language consists of fixed expressions (formulaic language); that is, language use does not rest entirely on creative grammatical rules but relies heavily on prefabricated, high-frequency, holistically stored “chunks.” Fluent language output results from rapid “retrieval” rather than “generation” (a toy sketch of chunk retrieval also appears after this list). One therefore cannot use this to deny that the behavior of large models is imitation, much less to claim that large models do not understand language.
- Finally, the author of the original paper treats the text-output ability of large models as almost oracular, implying that language output by large models is absolutely indistinguishable from human language. This is not the case. Humans can generally distinguish model-generated content quite easily, not to mention certain internal models that, despite being trained by teams of hundreds of people on tens of thousands of GPUs, still cannot speak human language properly and are despised by their entire company. This actually provides a kind of “original state” (a baseline) for the model’s behavior, and starting from that state the model can be instructed to imitate specific linguistic styles or patterns.
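As mentioned in the first bullet above, here is a minimal sketch of what an “output mode formed after guidance” looks like in practice. Everything in it is hypothetical and not from Boisseau’s paper or the original article: `generate` is a placeholder for whichever LLM client you actually use, and the prompts are toy examples. The point is only that the long chain-of-thought pattern is elicited by an instruction layered on top of the same model, rather than by a separate inference mechanism.

```python
# Hypothetical sketch: the same model prompted into two different output modes.
# `generate` is a stand-in for a real LLM client; nothing here is from the paper.

def generate(prompt: str) -> str:
    """Placeholder for a call to an actual model (e.g., an HTTP inference endpoint)."""
    raise NotImplementedError("plug in your own model client here")

question = "A train travels 120 km in 1.5 hours. What is its average speed?"

# Ordinary mode: ask for the answer directly.
direct_prompt = f"{question}\nReply with a single number in km/h."

# Guided mode: the instruction nudges the model into a long, step-by-step
# output pattern that reads like a written-out human reasoning process.
cot_prompt = (
    f"{question}\n"
    "Think step by step: restate the givens, write the formula, "
    "substitute the numbers, then state the final answer."
)

# Uncomment once `generate` is wired to a real model:
# print(generate(direct_prompt))
# print(generate(cot_prompt))
```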
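For the “retrieval rather than generation” point in the second bullet, here is a toy sketch of chunk retrieval. The chunk inventory and slot names are invented for illustration; the idea is only that a fluent-sounding reply can be assembled entirely from prefabricated, holistically stored sequences, with no word-level grammatical composition.

```python
# Toy illustration of formulaic language: fluent output assembled by retrieving
# prefabricated "chunks" rather than being generated word by word from grammar rules.
# The inventory below is invented for the example.

import random

CHUNKS = {
    "opener":  ["long time no see", "hope this message finds you well"],
    "hedge":   ["as far as we can tell", "to be honest"],
    "closing": ["looking forward to hearing from you", "take care and talk soon"],
}

def retrieve(slot: str) -> str:
    """Pull a stored chunk for a discourse slot; no composition happens inside the chunk."""
    return random.choice(CHUNKS[slot])

def formulaic_reply() -> str:
    # The reply is stitched together from whole retrieved chunks.
    return (
        f"{retrieve('opener').capitalize()}! "
        f"{retrieve('hedge').capitalize()}, everything is on track. "
        f"{retrieve('closing').capitalize()}."
    )

if __name__ == "__main__":
    print(formulaic_reply())
```

Swap the inventory and the “style” of the output changes with it, which is the sense in which rapid retrieval can pass for creative generation.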