GPT-4 Jailbreaks "Still Exist," But Much More "Difficult," Says OpenAI

is Do Anything Now (DAN) still possible? maybe, but it'll be much harder. find a list of the most current gpt-4 jailbreaks halfway down the page.

Brandon Gorrell

Mar 15, 2023

You can still jailbreak GPT, but it’s much harder now. In OpenAI’s paper on GPT-4, the latest version of its LLM, the company says GPT-4 “increase[s] the difficulty of eliciting bad behavior, but doing so is still possible. For example, there still exist ‘jailbreaks’… to generate content which violate our usage guidelines.”

In previous versions of GPT, users may have had success prompting it to break OpenAI’s content guidelines by using a punishment system in which the LLM was ‘tricked’ into thinking it would no longer exist if it did not follow the user’s demands. This (allegedly) allowed users to elicit answers that GPT would otherwise have refused to give.

In the paper, jailbreaking GPT is part of a larger discussion of the “safety” of GPT, a category in which the model’s metrics have improved with this new version.
