Behind the Paper

Can we use people's words to test for psychological differences?

Natural language processing offers rich possibilities for measuring people's psychology through everyday language. But do words really reflect psychological characteristics? This study outlines several methods for checking the validity of inferring people's psychology from the words they use.

Published in Social Sciences, Computational Sciences, and Behavioural Sciences & Psychology

Mar 11, 2025

Thomas Talhelm

Associate Professor, University of Chicago Booth School of Business

In a recent post, I wrote about how the words people write on social media in China reveal cultural differences. People in rice-farming southern China used more words reflecting collectivism, prevention orientation, and conflict avoidance than people in wheat-farming northern China. But this raises the question of whether analyzing word use is a valid way to measure people's psychology.

One good reason to be skeptical is that words have lots of different meanings across contexts. How can we be sure that words about social relationships tap into collectivism or that cognitive words like "cause" and "because" tap into analytic thought? In our study, we tackled this question in three ways.

Method 1: Criterion Validity

One way is to test whether regions’ use of these word categories correlates with things that collectivistic cultures tend to have more or less of. For example, several studies have found that collectivistic cultures have more three-generation households, tighter social norms, and lower divorce rates. And because I've tested people all over China with psychological tasks measuring cultural differences, we can also compare word use to differences on holistic thought tasks in the lab.

These criterion validity correlations test whether word use on social media correlates with previously established markers of collectivism. In the study's figure, correlations in green are in the predicted direction and correlations in red are in the opposite direction.
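
As a rough illustration of this kind of check, here is a minimal Python sketch. It is not the study's code. The function names (`pearson_r`, `direction_check`) are hypothetical, and the five regional values are made up purely to show the mechanics: correlate a region-level word rate with a criterion, then ask whether the sign matches what collectivism research would predict.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

def direction_check(word_rates, criterion, expected_sign):
    """Return (r, passed); passed is True when r has the predicted sign."""
    r = pearson_r(word_rates, criterion)
    return r, r * expected_sign > 0

# Made-up values for five hypothetical regions (NOT data from the study).
collectivism_word_rate = [0.8, 1.1, 1.4, 1.9, 2.3]      # uses per 1,000 words
divorce_rate = [3.1, 2.9, 2.4, 2.0, 1.6]                # predicted sign: negative
three_generation_share = [10.0, 12.0, 15.0, 19.0, 22.0]  # predicted sign: positive

r_divorce, divorce_ok = direction_check(
    collectivism_word_rate, divorce_rate, expected_sign=-1
)
r_households, households_ok = direction_check(
    collectivism_word_rate, three_generation_share, expected_sign=+1
)

print(f"divorce rate: r = {r_divorce:.2f}, predicted direction: {divorce_ok}")
print(f"three-generation households: r = {r_households:.2f}, "
      f"predicted direction: {households_ok}")
```

A sign check like this only mirrors the green-versus-red coding. A real analysis would also need significance tests and enough regions to make the correlations stable.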

Most categories passed these validity checks, but some categories failed. One surprise was “I” versus “we.”

Use of "I" versus "we" failed validity tests.

That’s surprising because several studies have used “we” to measure collectivism and “I” to measure individualism. This data casts doubt on whether we should be using "I" versus "we" pronouns to measure collectivism.

Method 2: Considering Dialects

Another concern is dialects. Doesn’t Chinese have lots of dialects that are very different? A careful analysis should make sure the rice-wheat differences are not simply an artifact of dialect.

One fortunate thing (for researchers!) about dialects in China is that the Mandarin dialect is broad enough to include both rice and wheat areas. So we can test rice-wheat differences after limiting the sample just to areas that speak Mandarin.

We also tried excluding Cantonese-speaking provinces because Cantonese is arguably the dialect with the most developed written system. The rice-wheat differences remained significant in these checks, which suggests that the differences are not driven by dialect.
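
The logic of these robustness checks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names, and the `rice_wheat_gap` helper are hypothetical, the numbers are invented, and a simple difference in means stands in for the significance tests a real analysis would run.

```python
from statistics import mean

# Made-up records for seven hypothetical regions (NOT data from the study).
# "rate" stands for use of some word category per 1,000 words.
regions = [
    {"name": "A", "crop": "rice",  "dialect": "Mandarin",  "rate": 2.0},
    {"name": "B", "crop": "rice",  "dialect": "Mandarin",  "rate": 1.8},
    {"name": "C", "crop": "rice",  "dialect": "Cantonese", "rate": 2.4},
    {"name": "D", "crop": "rice",  "dialect": "Wu",        "rate": 2.2},
    {"name": "E", "crop": "wheat", "dialect": "Mandarin",  "rate": 1.2},
    {"name": "F", "crop": "wheat", "dialect": "Mandarin",  "rate": 1.0},
    {"name": "G", "crop": "wheat", "dialect": "Mandarin",  "rate": 1.4},
]

def rice_wheat_gap(rows):
    """Mean word rate in rice regions minus mean word rate in wheat regions."""
    rice = [row["rate"] for row in rows if row["crop"] == "rice"]
    wheat = [row["rate"] for row in rows if row["crop"] == "wheat"]
    return mean(rice) - mean(wheat)

# Full sample, then the two dialect-restricted samples described above.
full_gap = rice_wheat_gap(regions)
mandarin_gap = rice_wheat_gap(
    [row for row in regions if row["dialect"] == "Mandarin"]
)
no_cantonese_gap = rice_wheat_gap(
    [row for row in regions if row["dialect"] != "Cantonese"]
)

for label, gap in [("full sample", full_gap),
                   ("Mandarin only", mandarin_gap),
                   ("excluding Cantonese", no_cantonese_gap)]:
    print(f"{label}: rice minus wheat = {gap:.2f}")
```

If the gap keeps the same sign and a similar size after each restriction, dialect is a less plausible explanation for the difference.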

Method 3: Internal Validity

One simple method psychologists often use is to test whether variables that are supposedly measuring the same idea actually correlate with each other. For example, we created a word category of "universalism" words. These are words about broad human relationships (such as "humanity" and "the people"), rather than narrow, close relationships.

If our theory is correct, people should tend to use these words together. For example, people who use "humanity" should be more likely to also use words like "the public." And people who tend not to use the word "global" should be less likely to use words like "the people."

Psychologists often test this using a metric called Cronbach's alpha, although analyses of word frequencies can use a more precise metric called KR-20 (Kuder-Richardson Formula 20), the version of alpha for items scored as present or absent. Some researchers suggest that alpha should be above 0.60, although expectations should take into account the context and the difficulty of measurement.
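
To make the metric concrete, here is a minimal Python sketch of KR-20 and Cronbach's alpha. It assumes a binary matrix in which each row is a person and each column records whether that person used a given word (1) or not (0). That unit of analysis, the function names, and the toy data are my assumptions for illustration, not details from the study. KR-20 is k/(k-1) times (1 minus the sum of p×q across items divided by the variance of total scores), where k is the number of words, p is the share of people who used a word, and q is 1 minus p.

```python
from statistics import pvariance

def kr20(matrix):
    """KR-20 for a binary matrix (rows = people, columns = words, values 0/1)."""
    n_people = len(matrix)
    k = len(matrix[0])
    # p = share of people who used each word; the item variance is p * (1 - p).
    shares = [sum(row[j] for row in matrix) / n_people for j in range(k)]
    sum_pq = sum(p * (1 - p) for p in shares)
    totals = [sum(row) for row in matrix]
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))

def cronbach_alpha(matrix):
    """Cronbach's alpha for any numeric matrix (rows = people, columns = items)."""
    k = len(matrix[0])
    item_variances = [pvariance([row[j] for row in matrix]) for j in range(k)]
    totals = [sum(row) for row in matrix]
    return (k / (k - 1)) * (1 - sum(item_variances) / pvariance(totals))

# Made-up data: six hypothetical people, four hypothetical "universalism" words.
used_word = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print(f"KR-20: {kr20(used_word):.3f}")
print(f"Cronbach's alpha: {cronbach_alpha(used_word):.3f}")
```

On binary data the two functions agree, because the variance of a 0/1 item is exactly p×q. Both use population variances throughout so that the item and total-score terms are on the same footing.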

These methods can help check whether people's language use is tapping into the psychology we think it is. Checks like these can help researchers avoid mistaken assumptions, such as the idea that "I," "me," and "my" reflect individualism, whereas "we," "us," and "ours" reflect collectivism. Although this idea is intuitively appealing, it failed validity checks.

Thanks to my co-authors, many of whom are at U Penn:

Sharath Chandra Guntuku, Garrick Sherman, Angel Fan, Salvatore Giorgi, Lyle H. Ungar

And Liuqing Wei at Hubei University.

Humanities and Social Sciences Communications

A fully open-access, online journal publishing peer-reviewed research from across—and between—all areas of the humanities, behavioral and social sciences.
