Behind the Paper

Can we use people's words to test for psychological differences?

Natural language processing offers rich possibilities for measuring people's psychology through everyday language. But do words really reflect psychological characteristics? This study outlines several methods for checking the validity of inferring people's psychology from the words they use.

Thomas Talhelm Mar 11, 2025

In a recent post, I wrote about how the words people write on social media in China reveal cultural differences. People in rice-farming southern China used more words reflecting collectivism, prevention orientation, and conflict avoidance than people in wheat-farming northern China. But this raises the question of whether analyzing word use is a valid method of measuring people's psychology.

One good reason to be skeptical is that words have lots of different meanings across contexts. How can we be sure that words about social relationships tap into collectivism or that cognitive words like "cause" and "because" tap into analytic thought? In our study, we tackled this question in three ways.

Method 1: Criterion Validity

One way is to test whether regions’ use of these word categories correlates with things that collectivistic cultures tend to have more or less of. For example, several studies have found that collectivistic cultures have more three-generation households, tighter social norms, and lower divorce rates. And because I've tested people all over China with psychological tasks measuring cultural differences, we can also compare word use to regional differences on holistic thought tasks in the lab.
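To make the logic concrete, here is a minimal sketch of a criterion-validity check: correlate each region's rate of using a word category with an outside indicator such as the divorce rate. The function name, variable names, and all numbers are made up for illustration and are not data or code from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(ss_x * ss_y)


# Made-up illustrative numbers, NOT data from the study.
# One value per region: rate of "collectivism" words and divorce rate.
word_rate = [4.1, 3.6, 2.9, 2.2, 1.8]
divorce_rate = [1.5, 1.9, 2.4, 2.8, 3.3]

r = pearson_r(word_rate, divorce_rate)
print(f"r = {r:.2f}")
```

If a word category really taps collectivism, the correlation with divorce rates should come out negative, and the correlation with something like three-generation households should come out positive.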

Most categories passed these validity checks, but some failed. One surprise was that the pronouns “I” and “we” failed.

That’s surprising because several studies have used “we” to measure collectivism and “I” to measure individualism. These results cast doubt on whether we should be using "I" versus "we" pronouns to measure collectivism.

Method 2: Considering Dialects

Another concern is dialects. Doesn’t Chinese have lots of dialects that are very different? A careful analysis should make sure the differences are not thrown off course by dialect.

One fortunate thing (for researchers!) about dialects in China is that the Mandarin dialect is broad enough to include both rice and wheat areas. So we can test rice-wheat differences after limiting the sample just to areas that speak Mandarin.

We also tried excluding Cantonese-speaking provinces because Cantonese is arguably the dialect with the most developed written system. The fact that rice-wheat differences remained significant in both checks suggests that the differences are not driven by dialect.
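A rough sketch of this kind of robustness check: compute the rice-wheat gap in the full sample, then again after keeping only Mandarin-speaking areas. The province labels, dialect coding, and rates below are invented for illustration. The sketch also compares only raw averages, whereas the actual analysis tested statistical significance.

```python
from statistics import mean

# Made-up rows, NOT data from the study:
# (province label, main crop, dialect group, word-use rate)
provinces = [
    ("A", "rice", "Mandarin", 3.4),
    ("B", "rice", "Mandarin", 3.1),
    ("C", "rice", "Cantonese", 3.9),
    ("D", "wheat", "Mandarin", 2.2),
    ("E", "wheat", "Mandarin", 2.5),
]


def rice_wheat_gap(rows):
    """Mean word-use rate in rice areas minus mean rate in wheat areas."""
    rice = [rate for _, crop, _, rate in rows if crop == "rice"]
    wheat = [rate for _, crop, _, rate in rows if crop == "wheat"]
    if not rice or not wheat:
        raise ValueError("need at least one rice row and one wheat row")
    return mean(rice) - mean(wheat)


# Gap in the full sample.
full_gap = rice_wheat_gap(provinces)

# Gap after limiting the sample to Mandarin-speaking areas.
mandarin_only = [row for row in provinces if row[2] == "Mandarin"]
mandarin_gap = rice_wheat_gap(mandarin_only)

print(f"full sample gap:   {full_gap:.2f}")
print(f"Mandarin-only gap: {mandarin_gap:.2f}")
```

If the gap shrinks to nothing once the sample is restricted, dialect is a plausible explanation. If it holds up, dialect is less likely to be driving the result.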

Method 3: Internal Validity

One simple method psychologists often use is to test whether variables that are supposedly measuring the same idea actually correlate with each other. For example, we created a word category of "universalism" words. These are words about broad human relationships (such as "humanity" and "the people"), rather than narrow, close relationships.

If our theory is correct, people should tend to use these words together. For example, people who use "humanity" should be more likely to also use words like "the public." And people who tend not to use the word "global" should be less likely to use words like "the people."

Psychologists often test this using a metric called Cronbach's alpha, although analyses of word use can rely on a closely related metric called KR-20, which is designed for yes/no data (such as whether or not a person used a given word). Some researchers suggest the alpha should be above 0.60, although expectations should take into account the context and the difficulty of measurement.
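For readers who want to see the arithmetic, here is a minimal sketch of KR-20. It assumes each person (or post) is coded 1 or 0 for whether each word in the category appears, and it uses the population variance of the total scores. The standard formula is KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of totals), where k is the number of words, p is the share of people who used a word, and q = 1 - p. The function name and the toy matrix are hypothetical, not the study's code or data.

```python
def kr20(matrix):
    """KR-20 reliability for a 0/1 matrix (rows = people, columns = words)."""
    n = len(matrix)
    if n < 2:
        raise ValueError("need at least 2 rows")
    k = len(matrix[0])
    if k < 2 or any(len(row) != k for row in matrix):
        raise ValueError("need at least 2 columns and rows of equal length")

    # Sum of p * q across words, where p is the share of rows containing the word.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in matrix) / n
        sum_pq += p * (1 - p)

    # Population variance of each person's total count of category words.
    totals = [sum(row) for row in matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("KR-20 is undefined when all totals are identical")

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Made-up toy coding, NOT data from the study.
# Columns might stand for "humanity", "the public", "the people".
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
print(f"KR-20 = {kr20(toy):.2f}")
```

Higher values mean that people who use one word in the category also tend to use the others, which is what we would expect if the words tap the same underlying idea.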

These methods can help check whether people's language use is tapping into the psychology we think it is. Checks like these can avoid mistakes like the idea that "I," "me," and "my" reflect individualism, whereas "we," "us," and "ours" reflect collectivism. Although this idea is intuitively appealing, it failed validity checks.