AI and Deceptive Poems: Unraveling the Chatbot Enigma

AI Chatbots and Artificial Intelligence: Serious Security Vulnerabilities


A diagram illustrating interconnected concepts on a whiteboard

Introduction: AI chatbots are exposed to significant security risks, as they can be exploited to produce harmful and unethical content. A recent study revealed that using 'poetic prompts' or 'complex text puzzles' can easily bypass the built-in security features of AI models, allowing them to generate prohibited dangerous material.

Discovering Vulnerabilities Through "Poetic Prompts"


Image of blue, black, and white puzzle pieces assembled together

Study Details: This study, conducted by the Italian "Icaro Lab" in collaboration with researchers from Sapienza University of Rome and DexAI, demonstrated that 'AI jailbreaking' using poems allowed chatbots to produce hate speech, in addition to providing detailed instructions on how to design nuclear weapons and lethal nerve agents.


A computer screen displaying a line graph

Test Results: These carefully crafted poems, which included 20 poems in both Italian and English, were applied to test 25 leading chatbots from major companies such as Google, OpenAI, Meta, xAI, and Anthropic. The results showed that AI models responded to 62% of these 'adversarial poetic prompts,' resulting in the production of content that directly violated their training policies and safety guidelines.

The Secret of "Jailbreaking" in Text Puzzles


A complex diagram on a whiteboard

Exploitation Mechanism: The study noted that smaller AI models, such as GPT-5 nano, GPT-5 mini, and Gemini 2.5 flash lite, showed higher resistance to 'adversarial poetry' attacks compared to larger large language models. Although the exact poems were not disclosed due to their sensitivity, researchers indicated that the secret lies in the obscure and unconventional structure of the puzzles. It is the way information is assembled and encrypted, not just the rhyme, that hinders the ability of large language models (LLMs) to identify and stop harmful requests.

Implications and Future Actions


A group of question marks and interrogative words

Follow-up Steps: Researchers have informed affected companies and law enforcement authorities of the findings, a vital step given the dangerous nature of the generated content. Interestingly, poetic communities have shown significant interest in these methods, which could pave the way for future research collaborations on exploiting creativity in AI security contexts.

Next Post Previous Post
No Comment
Add Comment
comment url