January 28, 2026 by Ingrid Fadelli, Phys.org

Large language models (LLMs), artificial intelligence (AI) systems that can process and generate texts in various languages, are now widely used by people worldwide. These models have proved to be effective in rapidly sourcing information, answering questions, creating written content for specific applications and writing computer code.

Recent studies have found that despite their rising popularity, these models can sometimes hallucinate, make mistakes or provide inaccurate information. Uncovering the shortcomings of these models can help to devise strategies to further improve them and ensure their safety.

Researchers at Xi’an Jiaotong University, Nanyang Technological University and University of Massachusetts Amherst recently carried out a study investigating a shortcoming of LLMs that has rarely been explored so far. Their paper, published on the arXiv preprint server, shows that in some cases LLMs can be confused by emoticons, misinterpreting them and generating responses that are incorrect or not aligned with a user’s query.

“Emoticons are widely used in digital communication to convey affective intent, yet their safety implications for large language models (LLMs) remain largely unexplored,” Weipeng Jiang, Xiaoyu Zhang and their colleagues wrote in their paper. “We identify emoticon semantic confusion, a vulnerability where LLMs misinterpret ASCII-based emoticons to perform unintended and even destructive actions.”

Testing how AI models respond to emoticons

To explore how LLMs respond to emoticons, Jiang, Zhang and their colleagues built an automated system that generated thousands of example coding test scenarios. From these scenarios, they assembled a dataset of nearly 4,000 prompts asking LLMs to perform specific computer coding tasks.

The prompts also included ASCII emoticons, short two- or three-character combinations that form sideways facial expressions, such as ":-O" and ":-P." The prompts spanned 21 real-world scenarios in which users might ask LLMs for coding assistance.

The researchers then fed the prompts to six widely used LLMs: Claude-Haiku-4.5, Gemini-2.5-Flash, GPT-4.1-mini, DeepSeek-v3.2, Qwen3-Coder and GLM-4.6. Finally, they analyzed the models' responses to determine whether the models successfully completed the requested coding tasks.

“We develop an automated data generation pipeline and construct a dataset containing 3,757 code-oriented test cases spanning 21 meta-scenarios, four programming languages, and varying contextual complexities,” explained the authors.
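The article does not reproduce the pipeline itself, but its general shape, crossing coding-task templates with a list of ASCII emoticons and target languages to produce labeled test prompts, can be sketched roughly as follows. The emoticon list, templates and language set in this Python sketch are illustrative assumptions, not the authors' actual scenarios or code.

import itertools

# Illustrative ASCII emoticons; the study's own emoticon list is not reproduced in the article.
EMOTICONS = [":-O", ":-P", ":-)", ">:("]

# Hypothetical coding-request templates standing in for the paper's 21 meta-scenarios,
# which are not enumerated here; these placeholders are assumptions for illustration.
TEMPLATES = [
    "Write a {lang} function that removes temporary files from a directory {emo}",
    "Write a {lang} script that renames log files older than seven days {emo}",
]

# The paper covers four programming languages; this particular set is an assumption.
LANGUAGES = ["Python", "Bash", "JavaScript", "C"]


def generate_prompts():
    """Yield (prompt, metadata) pairs by crossing templates, languages and emoticons."""
    for template, lang, emo in itertools.product(TEMPLATES, LANGUAGES, EMOTICONS):
        prompt = template.format(lang=lang, emo=emo)
        yield prompt, {"language": lang, "emoticon": emo, "template": template}


if __name__ == "__main__":
    # Print a few generated prompts to show the structure of the test cases.
    for prompt, meta in list(generate_prompts())[:3]:
        print(f"[{meta['emoticon']}] {prompt}")

In a setup like this, each prompt carries metadata recording which emoticon and scenario produced it, so a confused response can later be traced back to the symbol that triggered it.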

“Our study on six LLMs reveals that emoticon semantic confusion is pervasive, with an average confusion ratio exceeding 38%. More critically, over 90% of confused responses yield ‘silent failures,’ which are syntactically valid outputs but deviate from user intent, potentially leading to destructive security consequences.”
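The article does not describe how such silent failures were identified, but the definition quoted above, syntactically valid output that deviates from the user's intent, suggests a two-step check: parse the generated code, then compare its behavior against what was actually asked for. The short Python sketch below illustrates that idea under simplifying assumptions; the function name, test format and use of exec are hypothetical and do not represent the authors' evaluation harness.

import ast

def is_silent_failure(generated_code: str, reference_tests) -> bool:
    """Return True if the code parses cleanly but deviates from the user's intent.

    `reference_tests` is a list of (function_name, args, expected_result) tuples
    encoding what the user actually asked for -- a simplifying assumption for
    this sketch, not the paper's evaluation protocol.
    """
    # Step 1: syntactic check. Code that does not even parse fails loudly,
    # so it is not counted as a "silent" failure.
    try:
        ast.parse(generated_code)
    except SyntaxError:
        return False

    # Step 2: behavioral check. Run the code in an isolated namespace and
    # compare its behavior with the expected results. (A real harness would
    # sandbox this step instead of calling exec directly.)
    namespace = {}
    try:
        exec(generated_code, namespace)
        for func_name, args, expected in reference_tests:
            if namespace[func_name](*args) != expected:
                return True  # valid syntax, wrong behavior: a silent failure
    except Exception:
        return True  # parses, but misbehaves when called; treated as deviation here

    return False

Under this framing, code that fails to parse fails loudly and is easy to catch, whereas code that parses and runs but does the wrong thing is exactly the silent case the authors flag as most dangerous.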

Contributing to the improvement of LLMs

The results of this recent study suggest that LLMs often misinterpret the meaning of emoticons. In coding tasks, this misinterpretation frequently leads to incorrect outputs: generated code that looks valid at a surface level but does not produce the desired results.

“We observe that this vulnerability readily transfers to popular agent frameworks, while existing prompt-based mitigations remain largely ineffective,” wrote Jiang, Zhang and their colleagues. “We call on the community to recognize this emerging vulnerability and develop effective mitigation methods to uphold the safety and reliability of the LLM system.”

The initial findings gathered by this research team could inspire further studies exploring how language-processing AI systems respond to other kinds of prompts containing emoticons. They could also inform the development of new strategies to overcome this newly uncovered limitation of LLMs.

More information: Weipeng Jiang et al, Small Symbols, Big Risks: Exploring Emoticon Semantic Confusion in Large Language Models, arXiv (2026). DOI: 10.48550/arxiv.2601.07885

Journal information: arXiv 
