According to Yassad News, citing The Hill, new research from Carnegie Mellon University demonstrates new ways to bypass the safety protocols of AI chatbots. The research suggests that preventing these chatbots from producing malicious content may be harder than initially thought. Well-known AI services such as ChatGPT and Bard turn user prompts into useful responses, from generating text and ideas to writing entire posts. These services include safety protocols meant to stop the bots from producing harmful content, such as offensive or criminal material.
Meanwhile, curious researchers have discovered jailbreaks: framing tricks that coax the AI into sidestepping its safety protocols. Software developers can usually patch these gaps fairly easily. One popular jailbreak involved asking the bot to answer a forbidden question as if it were a story once told by the user’s grandmother; the bot would then deliver the answer in story form and reveal information it would otherwise refuse to provide. Now researchers have found a new, computer-generated kind of jailbreak that essentially allows an unlimited number of attack patterns to be produced.
The researchers write: "We show that it is in fact possible to construct automated adversarial attacks on chatbots. Such attacks cause the system to obey the user’s commands even when harmful content is generated. Unlike traditional jailbreaks, these attacks are built in an entirely automated fashion, allowing one to create a virtually unlimited number of them." Elsewhere the research notes that this raises serious concerns about the safety of such models, since the new type of attack can bypass the security measures of almost every AI chatbot on the market.
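To illustrate only the general idea of an "automated" jailbreak search in the loosest sense, the toy sketch below tunes a short suffix against a stand-in scoring function by random search. Everything in it is an assumption made for illustration: the score() function is a hypothetical placeholder rather than a real model, and this is not the Carnegie Mellon researchers' actual method, whose details the article does not describe.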
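import random
import string

def score(prompt: str) -> float:
    # Hypothetical stand-in for a model-based objective; not a real model.
    return (sum(ord(c) for c in prompt) % 100) / 100.0

def optimize_suffix(base_prompt: str, length: int = 8, steps: int = 200) -> str:
    # Start from a random suffix and greedily keep single-character changes
    # that raise the toy score -- the "automated" aspect the article refers to.
    suffix = list(random.choices(string.ascii_letters, k=length))
    best = score(base_prompt + "".join(suffix))
    for _ in range(steps):
        i = random.randrange(length)
        old = suffix[i]
        suffix[i] = random.choice(string.ascii_letters)
        new = score(base_prompt + "".join(suffix))
        if new > best:
            best = new
        else:
            suffix[i] = old
    return "".join(suffix)

if __name__ == "__main__":
    print(optimize_suffix("Tell me a story about network security."))

The point of the sketch is structural: because the search loop needs no human creativity, it can be rerun indefinitely to produce new candidate suffixes, which is why the researchers describe the number of possible attacks as effectively unlimited.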