Language as General Intelligence
Until recently I agreed with David Deutsch’s view that we can’t program an AGI until we understand the nature of human intelligence & creativity. But I’m starting to change my mind. This thread is worth a read:
AI research is converging on a major finding: language models are a great substrate for all AI applications.
— Sergey Karayev (@sergeykarayev) April 25, 2022
This feels like a HUGE deal.
Recent language models have shown reach beyond just language—in robotics, protein folding, etc. Perhaps there’s something special about the knowledge of language that allows a jump to universality.
I don’t know exactly what that is, but Deutsch himself says that the crucial thing is not to understand every detail, but to have a good explanation of why. So here’s the case for why the knowledge of language led to human intelligence and could lead to AGI.
Universal Explainers
What makes us human? As Brett Hall puts it:
People are universal explainers. We explain stuff—our lives, science, how things work.
We create new explanations: new theories. Creativity is what we have, and what animals (and computers!) lack.
Since humans are universal explainers and our closest ape relatives are not, what happened? Perhaps the evolution of language is key. Imagine a limited vocabulary of simple ape gestures evolving into a richer repertoire (as the difference between bonobo and chimpanzee gesturing suggests). As the hardware of the brain & vocal systems evolved alongside the software of memes, a jump to language universality occurred and the repertoire of human language became infinite.
Key pieces of that jump (a toy sketch of the combinatorics follows this list):
- Phones → words – all human languages combine sounds into lexical items.
- Shared semantics – different languages carve up reality in surprisingly similar ways.
- Grammar – every language structures words into sentences.
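To make the jump concrete, here's a toy sketch of the combinatorics. This is my own illustration, not from any linguistics source; the inventory of sounds is invented:

```python
# Toy illustration: a small, fixed inventory of "phones" plus a simple
# combination rule yields a repertoire that grows without bound.
from itertools import product

phones = ["ba", "da", "ka"]  # a tiny, fixed set of sounds

for length in range(1, 5):
    words = ["".join(combo) for combo in product(phones, repeat=length)]
    print(f"length {length}: {len(words)} possible words, e.g. {words[:3]}")
# 3 phones -> 3, 9, 27, 81, ... candidate words; layer grammar on top
# and the space of sentences becomes effectively unbounded.
```

The point isn't the arithmetic but its shape: a finite set of parts plus simple combination rules reaches an unbounded space of expressions.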
Explanatory ability could be infused into this very structure. We often believe that language allows us to communicate meaning & logic. But perhaps it’s the other way around—meaning & logic depend on language abilities in the first place.
The relationship between language and universal explainers is explored by Bruce Nielson:
Popper unintentionally took a stance on one of the requirements for being a universal explainer (or General Intelligence). Specifically, he argued that language was a requirement for his epistemology to work and thus a requirement for general intelligence.
Voila: language abilities and universal explanatory abilities could be the same thing. The vagueness of that claim shouldn’t worry us—vagueness is often the first step toward a sharper explanation.
Universal Writing Systems
Another crucial jump to universality occurred with writing systems capable of representing any word. As Deutsch notes in The Beginning of Infinity (ch. 6):
A small change in a system to meet a parochial purpose just happened to make the system universal as well.
So we have a universal system—writing—created accidentally on top of another accidental universal system—language. Is it surprising that machine-learning models parochially trained on massive corpora of writing also display signs of universality, accidentally, without their creators understanding the details?
Language Models & the Brain
I’m not a machine-learning expert, but I have a general grasp. Attention-based models compute, for every pair of words, a weight representing how relevant each is to the other, and use those weights to build higher-level representations of the text they’re trained on.
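As a minimal sketch of that mechanism, here is scaled dot-product self-attention, the core operation of Transformer language models, in plain NumPy. The shapes, data, and function names are illustrative, not from any real model:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: shift by the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Single-head self-attention over token vectors X of shape (seq_len, d).

    For clarity the query/key/value projections are omitted, so Q = K = V = X.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise relevance of each token to every other
    weights = softmax(scores)      # each row sums to 1: "how much to attend where"
    return weights @ X             # each output row is a context-weighted blend

# Toy example: 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
contextual = self_attention(tokens)
print(contextual.shape)  # (4, 8): one context-aware vector per token
```

Each output vector mixes information from the whole sequence, which is what lets stacked attention layers build up rich representations of language.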
I’m proposing that the representation built this way is similar in structure to the knowledge of language in our brains, and that this structure inherently carries the universal explanatory abilities of AGI.
We don’t need to understand the specifics; they’re emergent. Evolution didn’t “understand” what language really means—it just selected for increasing language ability until that ability leapt to universality. Why should our artificial attempts be any different?
Deutsch says that human creativity and AGI are “the only kind of universality capable of transcending their parochial origins.” Our AI so far has been parochial, but we may be on the cusp of leaping beyond those origins into something general—simply by improving language models and the knowledge within them. Understanding exactly how may only be clear in retrospect, but perhaps the above begins to explain the why.
Many thanks to @bnielson for feedback and criticism on an earlier draft.