Not All "Neutral" Is Objective: Our Joint Research with InfoTimes on How Large Language Models Analyze Sentiment in Arabic Media Headlines

Mar 19

On February 28, 2026, we published an academic paper titled "Sentiment Classification of Gaza War Headlines: A Comparative Analysis of Large Language Models and Arabic Fine-Tuned BERT Models."

This paper is the first academic output of the research partnership between Anmat and InfoTimes.

The paper is available on SSRN, and the dataset it was built on is published as open source on Harvard Dataverse.

This was not just a publication. It was a statement about identity: who we are at Anmat, and what we want to say to both the scientific and journalistic communities.

Why Academic Publishing?

The straightforward answer: because what we do sits at the intersection of science and journalism. We use data science and machine learning to analyze media discourse, and that same intersection is what brought Anmat and InfoTimes together in this partnership.

What unites us goes beyond technical complementarity. We share an interest in a question that is rarely addressed in Arabic with quantitative evidence: how do media institutions shape our understanding of reality, and how are narratives constructed around major events? This question carries weight at a moment when our region and large parts of the Global South are living through wars and other high-stakes events. We believe the real impact of this work multiplies when we operate as partners rather than competitors.

This work deserves to be tested against scientific standards, not only presented as a product or a report. Much of the research studying Arabic media comes from Western institutions, using methodologies not designed for the Arabic language or its contexts. We are here, living these contexts, and building tools specifically for them. Academic publishing is one way to anchor this work in the global scientific conversation with an Arab voice.

What Did We Find?

As part of this collaboration, we jointly analyzed how AI models differ in classifying the sentiment of 10,990 Arabic news headlines about the 2023 Gaza War. We compared six Arabic BERT models fine-tuned for sentiment analysis with three general-purpose large language models.

The findings were clear:

Arabic BERT models, particularly MARBERT, show a strong tendency toward "neutral" classification. This neutrality is not objectivity; it is evaluative caution built into the architecture of the model itself.

In contrast, large language models systematically amplify negativity. This reached its extreme in LLaMA-3.1-8B, which classified nearly everything as negative.

GPT-4.1 was a partial exception: it showed a stronger ability to align its sentiment judgment with the narrative frame of a headline, whether humanitarian, legal, security-focused, or political.

The paper's core conclusion: choosing a sentiment analysis model is not a neutral technical decision. It is a choice of interpretive lens that reshapes how we read media discourse about wars and conflicts.
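
To make the idea of an "interpretive lens" concrete, the comparison behind these findings can be sketched as two simple measurements: how each model distributes its labels, and how often two models agree on the same headline. The snippet below is a minimal illustration, not the paper's code. The function names, the three-label scheme, and the toy predictions are all assumptions made for the example, and the sample values are invented rather than taken from the dataset.

```python
from collections import Counter

# Assumed three-way label scheme; the paper's exact label set may differ.
LABELS = ("positive", "neutral", "negative")


def label_shares(predictions):
    """Return the share of each label in one model's predictions."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    total = len(predictions)
    return {label: counts.get(label, 0) / total for label in LABELS}


def agreement_rate(preds_a, preds_b):
    """Return the fraction of headlines on which two models give the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both models must label the same headlines")
    if not preds_a:
        raise ValueError("predictions must not be empty")
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


# Invented predictions for five hypothetical headlines (not real results).
cautious_model = ["neutral", "neutral", "negative", "neutral", "neutral"]
negative_leaning_model = ["negative", "negative", "negative", "neutral", "negative"]

shares_cautious = label_shares(cautious_model)
shares_negative = label_shares(negative_leaning_model)
agreement = agreement_rate(cautious_model, negative_leaning_model)

print(shares_cautious)
print(shares_negative)
print(agreement)
```

In this toy case the two models read the same five headlines and agree on only two of them, even though neither is obviously "wrong." That gap is the kind of difference a researcher inherits the moment they pick one model over another.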

Why Does This Matter for the Arab Region Specifically?

Because many media and research institutions in the region have begun adopting these tools without asking: what does this tool assume in the first place?

When a model trained predominantly on English-language data is used to analyze Arabic headlines about conflicts, the results are not "objective." They reflect what the model learned about the world, and that world is very different from ours.

This is not a rejection of these tools. It is a call to use them with critical awareness. And this is precisely what both Anmat and InfoTimes want to contribute: building a methodology for engaging with AI in the Arabic media context, rather than importing ready-made frameworks.

The Dataset Is Open to Everyone

The decision to publish the dataset openly and freely on Harvard Dataverse was not a minor detail. It is a position on how knowledge in our field should be built, and both teams made it jointly from the start.

Any researcher, journalist, or team that wants to work with 10,990 labeled Arabic news headlines about Gaza can do so now. The methodology can be tested, developed, or compared against other contexts. This is what we mean by research that serves the community rather than simply being published within it.

InfoTimes adds: The data underlying this research was collected using NewsScope, a journalism intelligence tool developed by InfoTimes that allows users to search any news topic and access live analytics, AI-written summaries, and sentiment scoring. Using NewsScope to collect 10,990 Arabic news headlines was not just a technical step; it was the foundation that made this large-scale comparison possible.

For the InfoTimes team, this research and the shared path with Anmat offer a clear signal of how the tools we use daily perceive our reality, and of how they handle events as sensitive and complex as the war on Gaza, along with the narratives Arabic media produces around it.

What Is Next?

This paper is part of the research partnership between Anmat and InfoTimes, and represents a first step in a longer effort to document computational Arabic media analysis against academic standards. We are working to develop this framework further, to pursue deeper questions about narrative framing, and to extend it to other Arab conflict contexts.

InfoTimes is a company specializing in bridging journalism and technology, building solutions and applications designed to interpret, adapt, and localize news. Anmat is an initiative that brings together data science, journalism, and scientific research to develop methodologies and tools for analyzing framing across different narratives, with a focus on media, to help individuals, journalists, and researchers in the Global South better understand the world around them.

If you are a researcher or journalist working on similar questions, we want to hear from you.

The paper is available on SSRN. The dataset is available on Harvard Dataverse.

Hager Abdeltawab