Why don’t some of us learn from punishment?

Individuals differ drastically in whether they will cease actions with negative consequences. Using a specially designed task, we explore the psychological roots of these differences, their stability across time, and their implications for interventions aimed at improving decision-making.

Humans have an evolutionarily-programmed tendency to learn about and avoid negative outcomes. However, some people seem to keep making choices that are bad for them. We see this play out in many ways, from speeding despite repeated fines, to heavy drinking despite the hangovers and longer-term risks. For some of us, punishment doesn’t seem to change behaviour, and we wanted to understand why.

In theory, there are many reasons why someone would persist with self-destructive actions. One explanation is motivational differences. A person might subjectively feel the benefits of a choice are so great, or the punishment so minor, that they tolerate the costs of the choice – like being extra determined to get somewhere on time and/or not minding the size of a speeding ticket. In other words, it's a matter of preference. Another explanation is differences in behavioural control. It has been shown that we don’t always act according to our motivations or goals – sometimes we’ll make regrettable decisions impulsively, or because we are behaving on autopilot or out of habit. Or in other words, we know better, but can't help ourselves. 

These are popular and intuitive theories for persistent suboptimal choices, but the degree to which they accounted for actual behaviour differences was unclear. In previous studies (Jean-Richard-dit-Bressel et al., 2021; 2023), we examined this by developing a gamified decision-making task, the Planets & Pirates task, to investigate how and why people differ in their punishment avoidance. These studies, conducted with Australian university students, revealed pronounced differences in whether individuals learned to avoid net-negative choices. Crucially, they consistently found that these differences were not attributable to differences in motivation, nor to other intuitive explanations like differential cue-related learning. Instead, differences in behaviour came down to whether participants figured out how their actions related to outcomes. Those who avoided effectively were quick to recognise which of their actions led to punishment, while those who failed to avoid lacked this recognition despite ample experience. This deficit in instrumental Action-Punisher association learning signified an overlooked source of persistent detrimental behaviour. In further support of this, simply revealing Action-Punisher relationships was enough to completely change behaviour in the majority of poor avoiders (Jean-Richard-dit-Bressel et al., 2023). However, a subset of individuals, termed “Compulsives”, seemed to disregard this information, particularly when punishing events were rare.

While insightful, these earlier studies left us with three big questions:

  1. Would we observe the same results in a more diverse population? Our previous participant pools were Australian first-year psychology students, who are predominantly 18-19 years old and female. Perhaps our findings stemmed from such a specific sample, and different behavioural profiles and psychological underpinnings would be observed if other populations were tested.

  2. Are these behavioural profiles stable over time? Our research team was split over whether the behavioural profiles we were observing with remarkable consistency were stable traits/tendencies of these individuals, or instead an emergent phenomenon of stochastic learning. That is, does an individual have particular cognitive-behavioural traits that drive them towards one profile of behaviour in the task? Or did that stable strategy emerge through chance factors? Maybe an individual did/didn’t figure the task out this time, but would likely behave differently if tested another time.

  3. Intentional or habitual? In previous studies, Compulsives reported knowing about the Action-Punisher relationship once the information was explicitly given to them, yet didn’t really change their behaviour. So, Compulsives continued to make detrimental choices despite “knowing better”. Given these individuals reported normal motivations and better-than-expected knowledge, we wondered whether they were behaving in ways that didn’t align with their intentions (i.e., a real loss of behavioural control), or whether there was something else going on.

To address these questions, our current study (Zeng et al., 2025) recruited a general population sample from 24 countries to play the Planets & Pirates task (including additional measures to examine habits), and retested participants with the task 6 months later. 

The answer to our first research question was a resounding “yes!”: auto-clustering found the same 3 behavioural phenotypes at similar rates within our general population sample as in previous studies. There were Sensitives who learned to avoid punishment via experience alone, Unawares who only avoided after receiving explicit information about Action-Punisher contingencies, and the aforementioned Compulsives. Crucially, the psychological roots of these profiles were the same; the predominant cause of poor avoidance was poor Action-Punisher awareness and not other intuitive explanations. So, these phenotypes appear to be general characteristics of human behaviour, rather than artefacts of a specific sample. One novel finding that came from this broader population sample was that Compulsives were over-represented among those over 50 years old, highlighting a potential challenge for promoting beneficial decision-making across the lifespan.

As for our second question, we found that the majority of participants had the same cognitive-behavioural profile when retested 6 months later. In fact, participants' behaviour at retest was best predicted by their behaviour 6 months earlier. This tells us that these profiles are trait-like decision-making tendencies that likely promote growing outcome disparities in the longer term. It also demonstrates strong test-retest reliability in this task, a benchmark of robustness that is regrettably uncommon in experimental behavioural paradigms.

Finally, we examined whether behaviour was intentional or habitual by measuring participants’ awareness of their own choices, as well as what choices they thought were optimal. This would indicate whether there were misalignments between what participants thought they were doing, what they intended, and what they actually did. Surprisingly, we found that all participants, including Compulsives, accurately tracked their own behaviour and reported their behaviour as optimal. That is, all participants were acting in accordance with their own self-reported intentions, which suggests participants’ behaviour was not automatic or misaligned with their strategy. Rather, their choices were deliberate and self-assured. Further analyses showed that Compulsives had an altered cognitive-behavioural trajectory compared to others. Specifically, their punishment knowledge and individual action valuations weren't translating into a corresponding behavioural intention or strategy. This subtle failure in cognitive-behavioural integration constitutes a specific barrier to behaviour change, even in the face of information designed to improve decision-making.

Together, these findings highlight new core drivers for stable differences in punishment avoidance. Much remains to be explored. In particular, we hope to leverage insights from this paradigm to explore why individuals differ in how they learn and make decisions around punishment, and to explore interventions that better address the obstacles to beneficial, self-aligned choices.

More personally [Liz Zeng, 1st author], working on this project while training as a clinician felt like sitting at the intersection of science and human struggle. It changed the way I think about some of our seemingly irrational or self-defeating behaviours. Often, people are doing the best they can with the cognitive tools they have. Persistent patterns aren’t always driven by a lack of motivation or capacity, but sometimes reflect deeper differences in how we learn from consequences. This shift in perspective is something I carry into clinical work: it’s not just about what someone learns, but how they learn. And when we understand that, we’re better equipped to support meaningful, lasting change.
