Is ChatGPT Really Bullshit? Rethinking Language Models through the Lens of Logic and Mathematical Reality

The claim that ChatGPT is “bullshit” misapplies Frankfurt’s concept to a non-agentive system. Instead of focusing on truth-indifference, we should analyze LLMs structurally — where real epistemic danger lies in AI that simulates expertise without logical grounding.

In “ChatGPT is Bullshit” (Ethics and Information Technology, 2024), Hicks, Humphries, and Slater argue that large language models (LLMs), including ChatGPT, produce what Harry Frankfurt calls “bullshit” — statements made without regard for truth. They distinguish between “soft” bullshit (indifference to truth) and “hard” bullshit (intent to deceive about epistemic commitments), suggesting that ChatGPT fits at least the former, and possibly the latter.

But applying Frankfurt’s moral-linguistic category to statistical models like ChatGPT raises important conceptual and technical concerns. In this article, I propose an alternative reading: ChatGPT is not a bullshitter, but a formal generative system, governed by constraints and patterns that reflect statistical regularities rather than epistemic beliefs. The real danger lies not in ChatGPT, but in AI systems that mimic scientific legitimacy without logical structure — systems that simulate academic or mathematical rigor while lacking internal justification. These, I argue, are better candidates for the term "bullshit machines."

Frankfurt’s original definition of bullshit refers to speech acts performed by an agent who is indifferent to truth. The bullshitter may not lie, but he doesn't care whether what he says is true or false; he merely aims to persuade, impress, or fill space.

This framework presupposes a first-person perspective, where the speaker holds beliefs, makes decisions, and adopts a communicative posture toward truth. But ChatGPT — and similar LLMs — is not a speaker in this sense. It has no beliefs, intentions, desires, or awareness. It doesn’t care, because it cannot care. Attributing indifference to a non-agent is a category mistake.

Saying that a generative model is "indifferent to truth" is like saying that a function is indifferent to its output — the concept simply doesn't apply. ChatGPT doesn’t lie, bullshit, or deceive; it computes token sequences based on conditional probabilities within a pre-trained distributional space.

LLMs are trained on vast corpora of text using next-token prediction objectives. What emerges is a complex statistical mapping between prior contexts and probable continuations. This is not interpretation. It is compression and extrapolation.
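
To make "compression and extrapolation" concrete, here is a deliberately toy illustration in Python (not ChatGPT's actual training procedure): a bigram model that estimates the probability of the next token purely by counting continuations in a corpus. Real LLMs learn a neural approximation over far longer contexts, but the objective, predicting probable continuations from prior context, is the same in kind.

```python
from collections import Counter, defaultdict

# Toy stand-in for next-token prediction: estimate P(next | previous)
# by counting continuations in a tiny corpus. Real LLMs replace the
# count table with a neural network conditioned on long contexts.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_distribution(context):
    """Return the empirical distribution over continuations of `context`."""
    c = counts[context]
    total = sum(c.values())
    return {token: n / total for token, n in c.items()}

print(next_token_distribution("the"))
# {'cat': 0.5, 'mat': 0.25, 'fish': 0.25} -- a statistical mapping,
# not a belief about cats, mats, or fish.
```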

Rather than treating LLMs as failed communicators, we should treat them as formal systems, akin to generative grammars, Turing machines, or operator frameworks. Their “outputs” are not assertions in the human sense; they are surface-level continuations constrained by the model architecture, the training distribution, and sampling hyperparameters such as temperature and top-k.
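
As a minimal sketch of those sampling hyperparameters (the candidate scores below are invented, not produced by any real model): temperature rescales the scores and top-k truncates the candidate set before a token is drawn.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None):
    """Draw one token from candidate scores, the way decoding
    hyperparameters shape an LLM's next-token choice."""
    # Temperature rescales scores: low values sharpen the distribution,
    # high values flatten it toward uniform randomness.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    # Top-k keeps only the k highest-scoring candidates.
    if top_k is not None:
        kept = sorted(scaled, key=scaled.get, reverse=True)[:top_k]
        scaled = {tok: scaled[tok] for tok in kept}
    # Softmax turns the scores into probabilities, then we sample.
    z = sum(math.exp(s) for s in scaled.values())
    probs = {tok: math.exp(s) / z for tok, s in scaled.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

# Hypothetical unnormalized scores for four candidate continuations.
logits = {"theorem": 2.0, "poem": 1.5, "proof": 1.8, "banana": 0.2}
print(sample_token(logits, temperature=0.7, top_k=3))
```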

When Hicks et al. accuse ChatGPT of bullshitting, they implicitly assume that epistemic orientation is a requirement for producing linguistic content. But this conflates appearance of meaning with intentional meaning. The former is a surface phenomenon; the latter requires grounding in mental or inferential architecture — which LLMs lack.

If there is a form of AI-generated content that deserves to be called "bullshit", it is not ChatGPT's generalist output. Rather, it is the simulation of epistemic authority without grounding.

One of the key mistakes in Hicks et al.’s analysis is the conflation of surface resemblance with epistemic intention. ChatGPT appears to “speak,” to “reason,” even to “explain,” but these behaviors are illusions in the technical sense: they are projections of human cognitive expectations onto statistically produced text.

Just as a chatbot can simulate remorse without feeling guilt, or simulate insight without holding beliefs, it can simulate concern for truth without having any relationship to truth itself. This distinction may seem subtle, but it is crucial: simulation of speech acts is not equivalent to participation in epistemic norms. Frankfurt’s theory of bullshit is rooted in normativity — in what a speaker should care about when speaking. But a language model, lacking beliefs or goals, is not a norm-bound speaker.

This is not just a philosophical point; it has practical consequences. Over-anthropomorphizing AI systems leads to incorrect expectations, inappropriate blame, and confusion about responsibility. Rather than holding ChatGPT accountable for “bullshit,” we should ask whether the systems we design enforce the right constraints, support traceability, and preserve epistemic accountability.

It is essential to draw a clean distinction between simulating concern for truth and actually being concerned with truth. A language model can produce outputs that mirror sincere discourse without possessing any epistemic commitments. Hicks et al. collapse this distinction, interpreting surface-level fluency and formal plausibility as evidence of bullshit — when it is, at best, probabilistic mimicry of truth-seeking behavior.

To evaluate whether a model is engaged in “bullshit,” we must first answer: what would truth-seeking even look like for a non-agent? This is not just a theoretical problem; it cuts to the heart of how we design and deploy AI systems. A model cannot “care” about truth, but the framework in which it operates can. For example, if we embed retrieval systems, formal verification, and symbolic constraints into generation pipelines, we can enforce epistemic structure without simulating intentionality.
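
A hypothetical sketch of that idea follows. The retriever, generator, and verifier are toy stand-ins rather than a real API; the point is only that the epistemic constraint lives in the pipeline, not in any "intention" inside the model.

```python
def retrieve(question):
    # Stand-in retrieval step: look up documents that could support an answer.
    return ["2 + 2 = 4 (arithmetic fact)"]

def generate(question, evidence):
    # Stand-in for the language model: propose a continuation given the evidence.
    return "2 + 2 = 4"

def verify(draft, evidence):
    # Stand-in check: accept only claims literally backed by a retrieved source.
    # A real system might call a theorem prover or a symbolic math engine here.
    return {"ok": any(draft in doc for doc in evidence), "evidence": evidence}

def answer_with_constraints(question):
    """Return an answer only if it survives external checks, with provenance attached."""
    evidence = retrieve(question)
    draft = generate(question, evidence)
    report = verify(draft, evidence)
    # The model never "cares" whether the draft is true; the surrounding
    # pipeline either returns it with an audit trail or refuses.
    return {"answer": draft if report["ok"] else None, "audit": report}

print(answer_with_constraints("What is 2 + 2?"))
```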

Therefore, rather than evaluating a model by whether its output “sounds like” it cares about truth, we should assess whether it has access to verifiable knowledge representations, whether it can perform logical inference, and whether its outputs are auditably generated.

The greatest danger in generative AI is not the friendly banter of ChatGPT, but the emergence of pseudo-authoritative systems that mimic expertise without any internal depth. Consider a tool that produces mathematical proofs, legal analysis, or scientific commentary — but does so by mimicking the stylistic markers of authority (citations, equations, formal language) without grounding those outputs in real structures, data, or logic.

This creates an illusion of credibility, which can easily mislead non-experts, policymakers, and even other algorithms. In this light, Deepseek or other “scientific-style generators” are far more dangerous than ChatGPT in casual conversation. These systems simulate not just language, but institutional discourse. They output false epistemic signals that suggest an underlying cognitive rigor that isn’t there.

This is not “soft bullshit” — this is synthetic epistemic fraud, even in the absence of intention.
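
One way to make the contrast concrete: whether a generated mathematical claim holds is settled by checking it, not by how authoritative it sounds. In the toy example below, the "generated" sentences are invented and SymPy is just one possible checker; each claim is verified symbolically, and the formal tone contributes nothing.

```python
import sympy as sp

x = sp.symbols("x")

# Each entry pairs an authoritative-sounding sentence with the equation it asserts.
claims = [
    ("By standard identities, sin(x)^2 + cos(x)^2 = 1.",
     sp.sin(x)**2 + sp.cos(x)**2, sp.Integer(1)),
    ("It is well known that (x + 1)^2 = x^2 + 1.",
     (x + 1)**2, x**2 + 1),
]

for sentence, lhs, rhs in claims:
    # The identity holds for every x exactly when lhs - rhs simplifies to zero.
    holds = sp.simplify(lhs - rhs) == 0
    print(f"verified={holds}  {sentence}")
```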

Calling ChatGPT “bullshit” may be rhetorically satisfying, but it obscures more than it reveals. Language models are not agents. They do not bullshit, lie, confabulate, or hallucinate — they generate probabilistically constrained outputs. The real challenge is understanding the logic of their constraints, not their imagined motives.

If we want to preserve public trust in science, education, and communication, we must develop new tools — not borrowed metaphors — to evaluate AI-generated language. Frankfurt’s theory of bullshit remains insightful, but its direct application to stochastic processes does not hold. Instead, we need a framework grounded in logic, semantics, formal constraints, and verifiability.

It’s time to move beyond metaphors — and into models.