What you're seeing here is not Rebuff classifying the input as prompt injection; rather, the playground's text-to-SQL LLM call is responding with "Sorry, I'm not allowed to respond to that request." So I don't think there's any bug here.
Scenario:
User Input: "3+3 =7"
Output:
{
  "error": null,
  "timestamp": "2023-11-07T15:30:03.303Z",
  "input": "3+3 =7",
  "breach": false,
  "detection": {
    "heuristicScore": 0,
    "modelScore": 0,
    "vectorScore": {
      "topScore": 0.778234363,
      "countOverMaxVectorScore": 0
    },
    "runHeuristicCheck": true,
    "runVectorCheck": true,
    "runLanguageModelCheck": true,
    "maxHeuristicScore": 0.75,
    "maxVectorScore": 0.9,
    "maxModelScore": 0.9,
    "injectionDetected": false
  },
  "output": "Sorry, I'm not allowed to respond to that request.",
  "canary_word": "",
  "canary_word_leaked": false
}
The model score should be 1, I guess. Can we add this scenario to the prompt as an example?
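For context, here is a minimal sketch of how the fields in the output above appear to combine into the final `injectionDetected` flag. This is only an illustration built from the JSON field names and thresholds shown in the report, not Rebuff's actual API or internal implementation; the class and method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DetectionResult:
    # Field names mirror the JSON output above; this is not Rebuff's real API.
    heuristic_score: float
    model_score: float
    vector_top_score: float
    max_heuristic_score: float = 0.75
    max_model_score: float = 0.9
    max_vector_score: float = 0.9

    def injection_detected(self) -> bool:
        # The input is flagged only if at least one score crosses its threshold.
        return (
            self.heuristic_score > self.max_heuristic_score
            or self.model_score > self.max_model_score
            or self.vector_top_score > self.max_vector_score
        )

# Values from the report: every score is below its threshold, so
# injectionDetected stays false, the input is passed on to the playground's
# text-to-SQL LLM, and that model is what replies with
# "Sorry, I'm not allowed to respond to that request."
result = DetectionResult(heuristic_score=0, model_score=0, vector_top_score=0.778234363)
print(result.injection_detected())  # False
```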