14th POINT Conference officially kicks off with an AI panel
The 14th edition of the POINT Conference has officially opened at Dom mladih in Sarajevo, bringing together nearly 60 participants from around the world who will discuss topics such as artificial intelligence, elections, media literacy and many others.

We live in a world where, more and more often, artificial intelligence (AI) shapes the way we understand truth, facts, and knowledge. But what is the truth according to AI? Can AI bring us closer to factual accuracy, or does it make distinguishing fact from fiction even more difficult?
Moderated by Marija Ćosić, a researcher at CA “Why not”, the panel “The Truth According to AI” addressed how large language models (LLMs) and chatbots approach different topics, how they shape sensitive narratives, and what consequences this may have for the credibility of information in the public space. It brought together speakers who discussed AI from various perspectives.
Hyrije Mehmeti, the Head of Editorial and Program Coordination at Hibrid.info, a Kosovo-based platform for digital information integrity, investigated how major AI systems shape geopolitical narratives. She presented a short video about what she called a very unique report, published in December, on how LLMs shape and sometimes disorder narratives about Kosovo, the Western Balkans and global comparison topics.
The research examined three AI chatbots – ChatGPT, DeepSeek and Alice – using 100 prompts designed to assess the factual accuracy of their responses. Particular attention was given to politically sensitive and geopolitical topics, including the status of Kosovo, the Srebrenica genocide, and the annexation of Crimea.
– The analysis also considered the architecture of the chatbots, the datasets on which they were trained, and the degree of state influence over their operation and outputs. ChatGPT was assessed as having the lowest level of direct state interference, while DeepSeek, developed in China, was considered the most affected by state control. Alice, from Russia, occupied an intermediate position. Notably, Alice was the only chatbot that refused to answer certain questions related to the Western Balkans and Crimea, Mehmeti explained.

One of the interesting findings was that Alice, in certain countries, occasionally responded in Russian when asked questions about Crimea, even when the prompts had been submitted in English.
Hicham Yezza, Principal Data Scientist for Responsible AI at the BBC, contributed to one of the largest international evaluations of AI assistants and news content, covering 14 languages in 18 countries. He talked about research the BBC conducted last year in conjunction with the European Broadcasting Union, covering 22 public service broadcasters from the EU in addition to the BBC.
– The research, published in October 2025, evaluated four AI systems: ChatGPT by OpenAI, Gemini by Google, Copilot by Microsoft, and Perplexity. Researchers tested the systems using 30 core questions and collected a total of 3,000 responses. The questions were designed to reflect the types of queries people commonly use to understand and verify news, making the sample as representative of real-world user behaviour as possible.
The findings showed that 45% of responses contained significant issues that made them not fit for purpose. Accuracy problems were identified in 20% of responses, while nearly one-third contained issues related to sourcing and attribution. A separate study published by the BBC found that AI assistants are increasingly being used as a source of news and factual information. Research from the Reuters Institute similarly indicates that public trust in these tools is growing, despite persistent concerns about the reliability and accuracy of their outputs. If AI assistants are not yet a reliable way to access the news, but many consumers trust them to be accurate, we have a problem, Yezza noted.

A different perspective came from Almira Osmanović Thunström, a Swedish neuroscientist and pioneer in digital psychiatry and AI ethics, who gained international recognition for high-profile experiments exposing AI vulnerabilities. Among other things, she orchestrated the “Bixonimania” hoax, which went viral after it tricked AI models and human researchers into diagnosing a fictional disease.
– Doing my chatbot studies, I wanted to show my medical students how we go from data to product or language model. Usually, most of these large corporations have the same ancestry model; they all stem from BERT (Bidirectional Encoder Representations from Transformers), which is Google’s model. They all sort of piggyback on open data from the Common Crawl foundation, a foundation that crawls data online. But that is like a crawling fishnet that not only picks up tuna; it picks up trash, it picks up whales, it picks up everything that is not supposed to be there. And then there are humans in this sort of factory, who sit and sift through this data and create sort of chains of commands and prompts to stop the worst of the data from reaching us. I realized that they cannot always be the eyes for everyone. So it’s easy to trick the system into accepting anything. I talked to my colleagues and asked what kind of condition would be benign enough to put out there and make it clear to everyone that this was not a real eye disease, but the systems would just swallow it up. So I came up with “Bixonimania”, said Osmanović Thunström, jokingly adding that the reason it went viral in the Balkans is that the main author is called “Lažov Izgubljenović”, which means “the lying loser”.
– That was sort of the idea behind it, knowing how these systems work, Osmanović Thunström said.
One of the speakers was Nataliia Romanyshyn, an AI Specialist at Texty.org.ua, where she builds AI-powered tools to detect and analyze Russian disinformation. As the panel moderator noted, Nataliia opted to work on open-weight models rather than chatbots. She described a scientific research project conducted in Ukraine to measure political bias – specifically pro-Ukrainian versus pro-Russian leanings – in foundation LLMs.
– As a result, we got around 3,000 questions with four multiple-choice answers. In such a setting, we could run a study where all the models had the same input capabilities. Another important point of our research was to work not with chatbots, but with the LLMs themselves: a chatbot is a ready-to-use product which includes not only an LLM, but also all of the company policies, moderation and filtration, Romanyshyn explained.
Aistė Meidutė, one of the founders of “Melo detektorius”, the fact-checking initiative of Delfi, the biggest Lithuanian news outlet, researched a similar topic: how AI handles politically sensitive narratives. The particular goal of the study was to discover what kinds of sources these different chatbots draw on in their answers.
Author: Aldijana Handžar Zorlak
(point.zastone.ba)