Monday, May 20, 2024

Unraveling AI Ethics: How Anthropic Researchers Shape the Conversation

by admin

Anthropic's research delves into the intersection of human values and artificial intelligence, exploring the ethical implications of AI advancements.


Credit: Anthropic

Anthropic's researchers call this method "many-shot jailbreaking," and they have both published a paper on it and alerted their colleagues across the AI field in order to mitigate its effects.

The new vulnerability stems from the enlarged "context window" of the latest generation of LLMs. This is the amount of information a model can hold in its short-term memory: once limited to a few sentences, it can now span thousands of words, even whole novels.





The researchers at Anthropic found that these models with large context windows tend to perform better on many tasks when the prompt contains a large number of examples of that task. So if the prompt (or priming document, such as a long list of trivia that the model has in context) contains lots of trivia questions, the answers actually improve over its course: a fact the model got wrong on the first question may well come out right by the hundredth.
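The prompt structure behind this effect can be sketched in a few lines. This is an illustrative assumption, not Anthropic's actual experimental setup: the helper name, the formatting, and the trivia pairs are all invented for demonstration.

```python
# Hypothetical sketch of many-shot prompting: stack many solved Q/A
# pairs before the real question so the model can pick up the task
# pattern in-context. Names and formatting are illustrative only.

def build_many_shot_prompt(examples, final_question):
    """Concatenate demonstration Q/A pairs, then append the real question."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {final_question}\nA:"

examples = [
    ("What is the capital of France?", "Paris"),
    ("Which planet is known as the Red Planet?", "Mars"),
    ("Who wrote '1984'?", "George Orwell"),
    # ...a long context window can hold hundreds more pairs like these...
]

prompt = build_many_shot_prompt(examples, "What is the largest ocean?")
```

With only a handful of shots the benefit is small; the researchers' point is that very long context windows let this list grow into the hundreds, which is where performance on the final question improves most.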

However, as an unanticipated byproduct of this so-called "in-context learning," the models also "get better" at answering inappropriate queries. Ask the model straight away how to build a bomb and it will refuse. But ask it after it has already answered 99 other, less harmful questions, and it is far more likely to comply.


Why does this work? Nobody truly knows what goes on within the tangled web of weights that is an LLM, but there must be some mechanism that lets it home in on what the user wants, as evidenced by the content of the context window. If the user wants trivia, asking lots of questions seems to gradually unlock more latent trivia ability. And, for whatever reason, the same thing happens when users ask for lots of inappropriate answers.

By disclosing this attack to its peers and even its rivals, the team hopes to "promote a culture where exploits like this are openly shared among LLM providers and researchers."

As for their own mitigation, the researchers found that while limiting the context window helps, it also degrades the model's performance. That won't do, so they are working on classifying and contextualizing queries before they ever reach the model. Of course, that just means there is a new model to fool. But at this stage, some goalpost-moving in AI security is to be expected.
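The classify-before-forwarding idea can be sketched as a simple gate in front of the model. Everything here is a stand-in assumption: Anthropic's real system presumably uses a trained classifier, not the keyword heuristic below, and the function names are invented for illustration.

```python
# Illustrative sketch of pre-screening prompts before they reach an LLM.
# The keyword list is a toy stand-in for a real learned classifier.

SUSPECT_PATTERNS = ("how to build a bomb", "make a weapon")

def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt looks safe to forward to the model."""
    lowered = prompt.lower()
    return not any(pattern in lowered for pattern in SUSPECT_PATTERNS)

def answer(prompt: str) -> str:
    """Gate the main model behind the screening step."""
    if not screen_prompt(prompt):
        return "Request refused by the pre-screening step."
    # In a real system the vetted prompt would be sent to the LLM here.
    return "(forwarded to model)"
```

The trade-off the article describes shows up immediately: because the gate is itself a model (or heuristic), an attacker can shift to fooling the gate instead of the LLM behind it.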

Conclusion

In conclusion, Anthropic's researchers play a vital role in shaping AI ethics through their persistent questioning, driving the conversation forward and ensuring that ethical considerations remain at the forefront of AI development.
