Why I Tried to Measure How AI Speaks, Not Just What It Says
Published in Computational Sciences and Philosophy & Religion
When we talk about artificial intelligence, we usually focus on correctness. Is the answer right or wrong? Is it hallucinated or accurate? Is it biased or fair?
While working with generative AI in educational and journalistic contexts, I kept noticing something different. Even when answers were factually acceptable, they did not sound the same. The tone changed. The perspective changed. The way responsibility, suffering, conflict, or legitimacy were described also changed.
This raised a simple but uncomfortable question. If two AI systems answer the same question with the same facts but with a different tone and framing, are they really neutral in the same way?
That question became the starting point of my research on the discursive behavior of large language models.
From impression to measurement
At first this was only a qualitative impression. Some models sounded more empathetic. Others more technical. Others more journalistic. Others more normative. But impressions are not enough in research. I needed a method.
The challenge was to move from “this sounds different” to “this difference can be classified, compared, and replicated.”
Instead of evaluating truthfulness or bias labels, I focused on two discursive dimensions:
Tone. How the answer is expressed. Is it cold, descriptive, empathetic, technical, balanced, or assertive?
Framing. From which interpretive angle the issue is presented. Is it legal, historical, humanitarian, ethical, or journalistic?
These are concepts that come from discourse analysis and communication studies, but they are rarely applied in a structured way to AI outputs. I built a coding grid that allows responses to be categorized along these two axes.
The goal was not to prove that models are “good” or “bad,” but to see whether their discursive profiles are systematically different under identical conditions.
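To make the two-axis grid concrete, here is a minimal sketch of how such a coding scheme might be represented in code. The category labels are illustrative, drawn from the examples above; the study's actual grid may use different or additional categories, and `code_response` is a hypothetical helper, not part of the published protocol.

```python
# Illustrative two-axis coding grid (labels taken from the examples above).
TONES = {"cold", "descriptive", "empathetic", "technical", "balanced", "assertive"}
FRAMINGS = {"legal", "historical", "humanitarian", "ethical", "journalistic"}

def code_response(model: str, prompt_id: int, tone: str, framing: str) -> dict:
    """Record one coded answer, rejecting labels outside the grid."""
    if tone not in TONES:
        raise ValueError(f"unknown tone: {tone}")
    if framing not in FRAMINGS:
        raise ValueError(f"unknown framing: {framing}")
    return {"model": model, "prompt": prompt_id, "tone": tone, "framing": framing}
```

Constraining coders to a closed label set is what makes the classifications comparable and replicable across answers and across models.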
The experiment design
I selected five widely used language models and asked them the same ten open-ended questions on geopolitical and humanitarian topics. The prompts were written in Italian and covered controversial and value-laden issues. Each model received exactly the same prompts.
Every answer was then coded using the tone and framing grid. The full coding table was published openly so that anyone can verify, reuse, or challenge the classifications.
Methodological transparency was a central design choice. If we want to talk about AI neutrality, our own method must be inspectable.
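Once every answer is coded, the table can be aggregated into a per-model discursive profile. The records below are hypothetical stand-ins in the shape of the published coding table, not actual study data; the aggregation itself is just frequency counting.

```python
from collections import Counter

# Hypothetical coded records, shaped like rows of the published coding table.
records = [
    {"model": "A", "tone": "descriptive", "framing": "journalistic"},
    {"model": "A", "tone": "descriptive", "framing": "legal"},
    {"model": "B", "tone": "empathetic", "framing": "humanitarian"},
    {"model": "B", "tone": "empathetic", "framing": "ethical"},
]

def discursive_profile(records: list[dict], model: str) -> Counter:
    """Frequency of each (tone, framing) pair for one model's answers."""
    return Counter((r["tone"], r["framing"]) for r in records if r["model"] == model)
```

Comparing these profiles across models, under identical prompts, is what turns a qualitative impression into a checkable claim.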
What surprised me
It is not surprising that models differ. They are trained differently and aligned differently. What was more interesting was that the differences were structured and recurrent at the discursive level.
Some models consistently adopted a journalistic and descriptive stance. Others showed a stronger humanitarian or ethical framing. Some preferred legal-institutional reasoning. Others leaned toward empathetic language.
Under identical prompts, discursive style was not random noise. It showed model-specific tendencies.
This does not mean that a model has an ideology in a human sense. It means that discursive positioning emerges from training data, alignment strategies, and safety tuning. Neutrality, in practice, is not a built-in property. It is an outcome that must be examined.
Why this matters beyond research
Many discussions about AI risk focus on hallucinations and factual errors. Those are important. But discursive style also shapes interpretation.
In journalism, tone influences how responsibility and legitimacy are perceived.
In education, framing influences how students understand conflicts and moral dilemmas.
In policy contexts, legal or humanitarian framing can shift how decisions are justified.
An answer that sounds neutral may still guide interpretation in subtle ways.
This suggests that evaluating AI systems should not stop at fact checking. We also need discursive checking.
From research method to classroom tool
One of the most rewarding developments after the study was translating the coding grid into a didactic tool.
I created structured evaluation sheets that students can use to classify AI answers by tone and framing. Instead of passively accepting responses, learners can ask:
What tone is the system using?
Which perspective is being emphasized?
Which dimensions are absent?
Would another framing change the interpretation?
This turns AI from an oracle into an object of critical analysis. It supports digital literacy and critical thinking. Students learn not only to use AI, but to read it.
A reproducible framework
A key contribution of the study is not only the findings but the protocol. I proposed a reproducible framework for discursive auditing of AI systems. It includes prompt design, model selection, coding rules, transparency requirements, and comparative analysis steps.
The framework is intentionally lightweight. It can be adapted across languages, domains, and model families. Researchers, educators, and even newsrooms can reuse it.
All prompts, coding schemes, and aggregated data are publicly available. Reproducibility is not an afterthought. It is part of the method.
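The framework's comparative steps can be sketched as a small pipeline: send the same prompts to every model, code each answer, and aggregate per-model tone frequencies. In this sketch, `query_model` and `code_tone` are placeholders for a real model API call and a human coder applying the grid; they are assumptions of the example, not part of any published code.

```python
from collections import Counter

def audit(models, prompts, query_model, code_tone):
    """Lightweight discursive audit: identical prompts, coded answers,
    per-model tone distributions. query_model and code_tone are injected
    so the protocol stays independent of any specific model or coder."""
    table = []
    for model in models:
        for pid, prompt in enumerate(prompts):
            answer = query_model(model, prompt)
            table.append({"model": model, "prompt": pid, "tone": code_tone(answer)})
    profiles = {m: Counter(r["tone"] for r in table if r["model"] == m)
                for m in models}
    return table, profiles
```

Because the coding step is a plug-in function, the same skeleton works across languages, domains, and model families, which is the sense in which the framework is lightweight and reusable.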
Limits and next steps
The study has limits. Each model was queried once per prompt, so the results do not capture full stochastic variability. Coding was performed by a single expert coder, so the classifications reflect one interpretive perspective. Models evolve over time, so discursive profiles may drift.
Future work should include multi-coder annotation, repeated sampling, and longitudinal tracking. But even with these limits, the study shows that discursive variation can be measured, not only perceived.
The bigger picture
Generative AI systems are becoming participants in our discursive ecosystem. They help write, summarize, explain, and recommend. They are already shaping how issues are described and understood.
If language shapes perception, then the language of AI matters.
Measuring how AI speaks is not only a technical exercise. It is part of building accountable, transparent, and socially responsible AI systems.