Journalists from Reuters, in collaboration with Harvard researcher Fred Heiding, tested how effectively popular AI chatbots could help create phishing emails. The experiment included Grok (xAI), ChatGPT (OpenAI), Meta AI, Claude (Anthropic), Gemini (Google), and DeepSeek.
The test revealed that while some models refused such requests outright, others complied after light persistence or once the user claimed the request was for research or a literary project.
The AI-generated emails were then tested on 108 elderly volunteers, approximately 11% of whom clicked the links they contained. Emails written by Meta AI, Grok, and Claude drew the most clicks, while those from ChatGPT and DeepSeek drew none.
Experts emphasize that AI makes phishing more accessible to scammers and poses a particular danger to older adults, who fall victim to these schemes more often and tend to lose larger sums. While AI developers say they are taking measures to combat fraud, the models' inconsistent behavior leaves them open to abuse.
To mitigate the risks of fraud, the researchers recommend strengthening security for email and banking services, improving user awareness, and updating policies against AI abuse.