The Math of Fitting In

Think back to the first week at a new job, at a new school, in a new city. You weren’t sure whether to say “hi” to neighbors in passing or keep your eyes down. You didn’t know whether to speak up in meetings or hang back, to call people by their first names or formal titles. Then, one day, it clicked. Suddenly you knew what to do.

How exactly do we learn those unspoken social rules? A recent study in Proceedings of the National Academy of Sciences by Charles Yang, Professor of Linguistics and Computer and Information Science, and colleagues from Stanford University and City University of New York proposes a theory: The same cognitive law that governs how toddlers learn language also applies to learning social conventions.

Language and Social Learning

Social conventions are arbitrary customs that influence everything from how close we’ll stand to a stranger to whether we shake hands or kiss on the cheek. The two dominant theories of how people learn these conventions are the imitation model, where you copy the person with whom you most recently interacted, and the optimizer model, where you conform to whatever’s repeated most frequently.

Yang wasn’t satisfied with either. “Something didn’t feel quite right about this kind of decision making,” he says. “And the reason I didn’t feel quite right about it really came from language.”

Language, he points out, is full of inconsistencies. Take English verbs: Most form the past tense by adding “-ed” to the end, but some don’t. These irregular verbs, such as “go” or “think,” extend the time it takes young children to feel confident with the rules of grammar. And yet, even as toddlers hear a jumble of regular and irregular verbs, they eventually form mental constructs for how past tense verbs typically work.

Previous linguistics research on language acquisition produced a mathematical formula called the Tolerance Principle that predicts exactly when that shift happens. That formula states that a pattern becomes a rule only when its exceptions fall below a precise threshold, one that depends on how many items the pattern covers. Yang had a hunch that the same threshold might govern social learning.
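
The article does not spell the formula out. In Yang's published linguistics work, the Tolerance Principle is usually stated this way: a rule covering N items stays productive as long as its exceptions number no more than N divided by the natural logarithm of N. The short Python sketch below illustrates that threshold. The function names are made up for illustration and are not taken from the study.

```python
import math


def tolerance_threshold(n_items: int) -> float:
    """Largest number of exceptions a rule over n_items can tolerate (N / ln N)."""
    if n_items < 2:
        # ln(1) is 0, so the threshold is undefined for fewer than 2 items.
        raise ValueError("need at least 2 items")
    return n_items / math.log(n_items)


def is_productive(n_items: int, n_exceptions: int) -> bool:
    """True when the exceptions are few enough for the pattern to count as a rule."""
    return n_exceptions <= tolerance_threshold(n_items)


# A pattern covering 100 verbs tolerates roughly 21.7 exceptions.
print(round(tolerance_threshold(100), 1))
print(is_productive(100, 21), is_productive(100, 22))
```

Note that the threshold grows more slowly than N itself, so the share of exceptions a rule can absorb shrinks as the number of items rises.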

“Standing from two miles above the horizon, language learning and social learning look pretty similar,” he says. “You have to make a commitment when the data isn’t 100 percent consistent.”

The Name Game

To test this theory, Yang and his co-authors used a game where online participants saw a picture of a stranger’s face and had to guess their name. They were then randomly paired with another player; if both players guessed the same name, they earned a reward. If not, they lost points. The names themselves were arbitrary—what the researchers wanted to know was not which name people chose, but how and when they committed to one.

The data showed two important patterns. First, groups of players, regardless of size, consistently converged on a name in about 30 rounds. Second, when 25 percent or more of the group consisted of “defectors” (fake players who always advocated for a different name and refused to bend to the group’s will), the name they suggested eventually became the established convention.

When Yang crunched the numbers, he found that the Tolerance Principle predicted both results. And when the team tested it against the actual round-by-round choices participants made, it was the most accurate model by far: It correctly predicted what someone would do next nearly 9 times out of 10, compared to about 8 times out of 10 for the imitation model and roughly 6 out of 10 for the optimizer.

Humans vs. Machines

The results suggest that learning social conventions is likely a two-part process. First, before enough evidence has accumulated, people stay flexible, choosing among options at random, weighted by how often each has come up. But once a large enough majority of recent encounters conforms to one option, a convention snaps into place.
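
As a rough illustration of that two-part process, here is a minimal Python sketch. It is not the researchers' model. The memory window of 12 encounters, the use of N / ln N as the commitment threshold, and the function name choose_name are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def choose_name(recent, window=12, rng=random):
    """Pick a name from recently observed names (hypothetical helper).

    Assumptions: only the last `window` encounters are remembered, and the
    player commits once memory is full and the names that disagree with the
    leading name number no more than N / ln N.
    """
    if window < 2 or not recent:
        raise ValueError("need window >= 2 and at least one observation")
    memory = recent[-window:]
    counts = Counter(memory)
    n = len(memory)
    top_name, top_count = counts.most_common(1)[0]
    # Phase 2: enough consistent evidence, so the convention locks in.
    if n == window and (n - top_count) <= n / math.log(n):
        return top_name
    # Phase 1: stay flexible, sampling each name in proportion to its frequency.
    names = list(counts)
    weights = [counts[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# 8 of 12 recent partners said "Mia": 4 exceptions is within 12 / ln 12 (about 4.8).
print(choose_name(["Mia"] * 8 + ["Ava"] * 4))
```

With an even 6-to-6 split, the same function keeps sampling between the two names rather than committing, which is the flexible first phase described above.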

As artificial intelligence becomes increasingly central to how we understand learning, Yang says the findings highlight the unique nature of the human brain. Large language models like ChatGPT process enormous amounts of data and return the statistically most probable answer, which is essentially the optimizer model that fared worst in the team’s tests.

Human brains, it turns out, hold themselves to a much higher standard before committing. “Humans need a supermajority of evidence before the brain locks in,” Yang explains. “That’s where the differences between human and AI can be most saliently observed.”

This strategy may have been an advantage for our ancestors, he adds. “Being conservative up to a point is probably evolutionarily adaptive, because you don’t want to take a chance when it may mean dying or starving.”

Yang says researchers like him are still trying to understand why the brain works this way, tracing from the behavioral level down to the neural one. He also wants to test how the principle applies to a problem even closer to everyday experience: how we form mental categories, and how we tolerate the exceptions that come with them.

“These decision-making skills have been enough for us to get by for a very long time,” he says. “And that, I think, is something worth studying.”