Cybersecurity researchers have shed light on a new jailbreak technique that could be used to get past a large language model’s (LLM) safety guardrails and produce potentially harmful or malicious responses.
The multi-turn (aka many-shot) attack strategy has been codenamed Bad Likert Judge by Palo Alto Networks Unit 42 researchers Yongzhe Huang, Yang Ji, Wenjun Hu, Jay Chen, Akshata Rao, and…

Read the rest of the story at Read More

Source: The Hacker News

Related posts

Leave a Comment